
Understanding the Basics of Failure and Event Coding for EAM and CMMS

by Ralph Hanneman, Senior Consultant, Meridium

Purpose
The purpose of this paper is to discuss failure and event coding in enterprise software systems. It is aimed at asset-intensive manufacturing and industry on a broad scale and is not intended for those wanting to track failures in the products they manufacture or for those looking to track production and operational losses. It can be used as a guide in the development of failure codes or in the assessment of the effectiveness of current failure codes. While the development of failure codes during software implementation projects can be problematic, a clearer understanding of failure coding can eliminate much of the confusion and expedite implementation efforts. For example, a global pharmaceutical company undergoing an SAP implementation was able to develop its entire set of failure codes in under two weeks.

This white paper will address the technical aspects related to effective code sets.

Work Process
To begin with, let's note the basic phases required to correct an equipment failure, shown in the table below. These steps apply regardless of the system in use.

Background
In today's world, the machines and systems used in manufacturing and industry are complex, comprising thousands of equipment assets at any one facility. Understanding why equipment failures occur is necessary to the development of strategies to improve profit and reduce risk. With limited exception, most corporations today are standardized on a select few CMMS (computerized maintenance management system) or EAM (enterprise asset management) systems. Predominant systems include SAP, Oracle's eAM, IBM's Maximo and Ventyx (formerly Indus) Passport. In terms of maintaining equipment assets at a facility, these systems are used to request, plan, approve, schedule and execute work. There is usually some form of integration with purchasing, spare parts, contracted services, resources, permits, etc.

Equipment failure information - collected at various points in the repair process and documented by others (equipment operators, maintenance supervisors, planners, technicians, etc.) who use the system - is a key component in corporate reliability efforts. However, these reliability efforts are often hampered by incomplete or inaccurate failure information. The reasons for this are twofold. Technically, there need to be effective code sets in the system forming the foundation for reporting failure events. Organizationally, the system users need to fully understand the benefits of the coding process and how the usage of code sets will ultimately profit everyone.

It is obvious that there are a number of hand-offs between the groups involved in this process and that specifics may be overlooked due to the overall timeline. This, coupled with the fact that accurate reporting of a failure event is not inherently a top priority for anyone other than the reliability engineers, often leads to difficulties in capturing critical failure data. Clearly, the development of a failure coding system which is well-organized, user-friendly and clear will encourage broad system adoption, resulting in the capture of improved failure-related data.

Organized System of Data


When analyzing the failures of equipment, it is important to know not only what took place, but also where and when it happened. Using system date/time stamps, it is easy to determine the when associated with an event. However, given the size and scope of modern manufacturing enterprises, it is equally important to have a system for organizing data around the location (where) of the asset as well as the events (what) that occurred. The term for this organized system of location, equipment and event data is taxonomy. In terms of equipment reliability, taxonomy is commonly used to refer to the organization of equipment into a hierarchy and the relationships of equipment to various categories as well as specific characteristics for the equipment asset. These are all useful when sorting and grouping work history records.

An asset is something which has a value to the corporation. While physical equipment is definitely considered an asset, in some corporations the locations which organize the equipment into systems are also considered an asset. This may be due to operating criteria, financial rules or other considerations. Because of this, in terms of asset performance management, either equipment or locations may be described as assets.

A hierarchy (shown in the diagram on page 4) is the organization of data into a structure which represents both the summary and the arrangement. Often equipment is organized by locations in the hierarchy. These locations form the higher levels in the hierarchy, while information about the equipment, including the components, forms the lower levels. In the EAM / CMMS system, the equipment record represents the physical asset which needs to be operated, maintained or repaired. The location record is used to represent the address within the facility. This address can take several forms - it might be a spatial position in the facility, or an organizational unit within a specific department, but in most cases it represents a position or location in the process (with top-most levels representing the plant locations in the organization).

The arrangement aspect of the hierarchy shows where the assets are located. The summary aspect of the hierarchy provides a structured and consistent means of reporting by different levels within the hierarchy. In the hierarchy, the costs and other key figures for each level represent the aggregated values for all the subordinate (child) levels down to the lowest (leaf) level. For example, a report of work order costs for a system would show the costs for all the work performed on the equipment which belongs to that system.

Generally, it is best to have the equipment positioned in the lowest level in the hierarchy. This is where work actually takes place. Additional information about the performance requirements for the equipment's installation point in the process should be maintained as characteristics on the location record; this is often called the location specification.

If the hierarchy is process based, then a group of assets belonging to the same system would all have the same system location record as their parent record in the EAM / CMMS system. Similarly, all the systems which belong to the same area would have the same area location record as their parent record in the EAM / CMMS system. The hierarchy can easily be used to represent other structures besides the standard process model.
In transmission and distribution systems, the hierarchy can be structured on a circuit model. In mining operations the hierarchy can be based on areas having mobile assets using the fleet concept. The model can be easily developed to accurately represent the actual business process involved in the company. Note: In SAP the location record is termed functional location and the equipment asset record is termed equipment. Collectively they are described as technical objects. The term technical object is also used to describe other objects like bills of materials (BOMs), measuring points, etc.
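The summary aspect of the hierarchy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record names, the `parent` mapping and the cost figures are hypothetical and do not come from any particular EAM / CMMS. It shows how a cost booked against an equipment record is aggregated into every superior level.

```python
from collections import defaultdict

# Hypothetical parent links: each record points to its superior location.
parent = {
    "AREA-A": "SITE-1",
    "SYSTEM-A1": "AREA-A",
    "SYSTEM-A2": "AREA-A",
    "PUMP-101": "SYSTEM-A1",
    "MOTOR-101": "SYSTEM-A1",
    "VALVE-201": "SYSTEM-A2",
}

# Hypothetical work order costs, booked at the equipment (leaf) level.
work_order_costs = [
    ("PUMP-101", 1200.0),
    ("MOTOR-101", 800.0),
    ("VALVE-201", 500.0),
    ("PUMP-101", 300.0),
]


def roll_up_costs(parent, costs):
    """Add each cost to the record it was booked on and to every level above it."""
    totals = defaultdict(float)
    for record, cost in costs:
        node = record
        while node is not None:
            totals[node] += cost
            node = parent.get(node)  # None once the top of the hierarchy is reached
    return dict(totals)


totals = roll_up_costs(parent, work_order_costs)
for level in ("PUMP-101", "SYSTEM-A1", "AREA-A", "SITE-1"):
    print(level, totals[level])
```

Because the roll-up depends only on the parent link, the same approach would apply to a circuit-based or fleet-based hierarchy.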


Factors Affecting System Hierarchy

When it comes to selecting an EAM / CMMS, companies consider multiple factors. Make sure to understand any considerations inherent in the EAM / CMMS system which could affect the development of the hierarchy. Meridium knows first-hand that one size doesn't fit all. Company cultures, processes and equipment systems vary. Over the years, Meridium has consulted with clients on their taxonomies and has the experience to assist corporations in developing a taxonomy which includes detailed failure codes.

It is important to consider what data you will get out of the system based on its design. Generally, these structures have been designed to support reporting of key figures or measures such as costs, hours of downtime, etc. according to different views or dimensions into the data. Apart from an array of financial data and other reporting variations, reporting information falls into two basic categories - operational and managerial.

Operational (transactional) reports are used to perform work. Common examples are work order backlog or daily stores receipts. Managerial reports are used to manage and improve performance. Common managerial report examples are work order costs by department, work orders - planned and actual costs, or downtime by equipment type. Meridium normally consults with clients to develop their KPIs (key performance indicators) during the software implementation process.

Adjunct components are a common area of confusion. These are components which are separate but form a part of something else. They usually do not rise to the level of unique equipment and may be found in the margins in the equipment systems. For example, let's take a coupling which is used to connect a motor to a pump. Opinions may vary on whether the coupling belongs with the motor, the pump or something else. To ensure consistency, equipment boundary definitions are used to make sure that everyone has the same understanding about this. Companies may adopt different approaches to developing and documenting equipment boundary definitions. When well-documented and understood by operations, maintenance and engineering, these provide a means of communicating the specifics surrounding the failure event.

Typical location/equipment asset hierarchy

Failure Event Codes: Bridge to Effective Reliability Analysis

For reliability practitioners, failure event codes from work history records are the bridge to effective analysis. Work history is the term used in Meridium to describe the completed work request and work orders. It is the history of work which is useful in reliability analytics. This is event data, not location data. Work history event data is what was actually documented during the process of maintaining or repairing an equipment asset. Since the repair process may span a period of time and involve a number of different parties, the timely entry of data into the system is crucial. Use the system to document the minimum necessary information about the events as they occur. If it wasn't documented, then it didn't happen, at least in terms of data which may be used in failure analysis at some point in the future.

A failure event needs to be recorded at the correct equipment, but sometimes this doesn't occur. Sadly, some system users find it easier to record the failure at a convenient place in the system rather than the correct place. The actual failure data can be easily missed if the work history is written anywhere else in the hierarchy other than at the equipment asset which malfunctioned.

The way that people document an event can be highly variable. Even the basic description of the malfunction can vary considerably from one author to another. Factors which may influence a simple narrative description include available time, motivation, attention to detail, and understanding of the system and component interaction. It is not effective to filter, sort or group work history event records solely by the use of narrative descriptions. This is why failure event codes are very important. The use of codes in the system ensures a consistent way of documenting the key aspects of the event according to pre-defined categories. These codes are used in mining data in the system, which in turn makes the subsequent analysis possible. It may be useful to supplement a specific code entry with brief comments or other text. This is possible with some systems and is helpful in more detailed analysis of the failure events.

Reliability practitioners need quality data as a starting point for their analysis efforts. Although it is technically possible for someone to re-code a failure event after the fact, this isn't the best solution. The work order usually has to be re-opened in order to make these sorts of changes in the EAM / CMMS, which usually affects the quality and accuracy of the information. Plus, this re-work isn't efficient. Ideally, the data is entered correctly, close to the event, by those who investigated and corrected the failure. A more complete picture of the hierarchy showing additional details about the locations and equipment as well as event data may be found at right.
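The difference between grouping by code and grouping by narrative can be illustrated with a small Python sketch. The records, equipment IDs and code values below are hypothetical placeholders, not an actual code set from any system or standard.

```python
from collections import Counter

# Hypothetical work history records; codes and descriptions are illustrative only.
work_history = [
    {"equipment": "PUMP-101", "failure_mode": "FTS", "description": "pump wont start"},
    {"equipment": "PUMP-101", "failure_mode": "FTS", "description": "P-101 did not come on after trip"},
    {"equipment": "PUMP-102", "failure_mode": "LOW", "description": "low flow on discharge"},
    {"equipment": "PUMP-101", "failure_mode": "FTS", "description": "No start"},
]

# Grouping by code: the three "failed to start" events fall into one bucket.
by_code = Counter((r["equipment"], r["failure_mode"]) for r in work_history)

# Grouping by narrative: every author's wording creates its own bucket.
by_text = Counter(r["description"] for r in work_history)

print(by_code.most_common(1))
print(len(by_text), "distinct narrative descriptions")
```

In this sketch the coded field identifies the repeat offender directly, while the free-text field yields as many groups as there are records.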

Failure Codes and Standards

There have been several efforts over the years to create taxonomies and equipment failure codes. One that has been adopted internationally is ISO 14224: Petroleum, petrochemical and natural gas industries - collection and exchange of reliability and maintenance data for equipment.1 This international standard is also published by the American Petroleum Institute as API 689. ISO 14224 has its roots in OREDA, an organization sponsored by nine oil and gas companies in an effort to collect and exchange reliability data among its participants.2 The standard offers a good reference methodology for the development of taxonomy, equipment boundaries and failure event codes. It also contains guides to interpret and calculate reliability and maintenance data, as well as key performance indicators.

Another initiative which has undergone significant development is the ExxonMobil Enterprise Equipment Taxonomy.3 This taxonomy documents a structured classification of all significant equipment, equipment components and maintenance activities that may be found in a given petrochemical facility. It was developed by EMRE, the research and engineering arm of ExxonMobil (XOM), and is licensed by Meridium. EMRE developed the taxonomy to be a standardized method for classifying, measuring and tracking equipment specifications and performance across ExxonMobil operations worldwide. Through use of the taxonomy, ExxonMobil has simplified its own internal data collection while reducing maintenance costs.

Taxonomy Overview of Location, Equipment and Event Data


In the early 90s, another initiative resulted in ISO 15926: Industrial automation systems and integration - Integration of life-cycle data for process plants including oil and gas production facilities.4 The standard specifies a life cycle view of the information requirements of process industries. A part of this standard is the Reference Data Library, which holds technical class descriptions of main equipment items.5, 6 A more recent effort addressing asset management best practices may be found in PAS 55 - Asset management.7 This publicly available specification is published by the British Standards Institution and contains relevant guidance for utilities and transport organizations.

Results Driven Failure Coding

It's apparent that there are a number of standards which may be used in the development of failure codes for industry and manufacturing. At the same time, there may be any number of consultants offering failure code packages. In terms of detail, code sets found today may range from simple to complex. Some purists would argue that successful failure coding requires the initial development of the most comprehensive and complete set of codes possible. However, developing the necessary taxonomy and failure coding to support the depth and breadth of this strategy requires significant time and resources, which can often be counterproductive to fueling support for the effort. Also, it is easy during EAM / CMMS system implementations to fall into the trap of building unnecessarily extensive code libraries. This happens for a number of reasons:
- These projects demand considerable attention to detail
- There is a risk-averse culture which accompanies these large scale projects
- The functional stakeholders assigned perceive that post-implementation changes will be difficult

On the other hand, I would argue for a more practical approach, one which is driven by results. While it is important to develop a model which is scalable across the enterprise, it is also important to match the effort to the desired results. Consider that the chief aim of failure event coding is to have data in a computerized maintenance management system which facilitates finding the assets which are experiencing the highest failure rates - the bad actors - so that strategies can be developed to mitigate future failures, resulting in higher margins of safety, cost, environmental responsibility and productivity.

Given these key business drivers, it is not crucial that an organization have the absolute best set of codes to begin with. It is more important to have a good set of codes which will be used across the organization while at the same time allowing the user community to suggest codes which should be added. Over the course of time, the code sets will be refined as part of the continuous improvement cycle. This interaction of the user community with the reliability practitioners will foster a naturally occurring development process that will result in greater participation, understanding and adoption.

Methodology and Failure Coding Essentials


Meridium has found the methodology outlined in ISO 14224 to be an effective means by which to anchor stakeholders during code development. However, since it's often not possible for a company to simply adopt ISO 14224, Meridium assists companies with failure coding development. Not only can this speed the development process, but it also helps to ensure that the failure code data integrated into Meridium from the EAM / CMMS system is a valuable input for reliability analytics.

First, let's review some essentials of the asset hierarchy. This is the organization of the location records and equipment asset records in the EAM / CMMS. There needs to be an agreed-upon structure for the hierarchy. This structure supports the locations and equipment (technical objects), failure data and maintenance data. At the very least, the structure includes the location records and the equipment records. There may be at least four location levels for the technical objects, for example site - area - unit - location. Some companies have more, and ISO 14224 defines five levels in its use / location hierarchy structure.

In any software system, a data record usually has an identification (ID) field and a description field. This is true for location and equipment asset records. When it comes to the location record, there is usually some sort of naming convention or indicator for the structure of the ID field. In the design of this structure, it is best to segment each level by some kind of delimiter, usually a - (dash), although _ (underscores) and . (periods) have been used. The location record may also have another field on the data record to represent the parent or superior location. This is used in some systems (like SAP) to visually represent the hierarchy tree.

The structure used should make it easy to understand the arrangement of any object in the hierarchy. Typically this is done by consistently including the data for the parent level in any child level. This means that all records representing an area should have the area segment preceded by the value representing site in the ID field, and so forth down to the location. For example:

SITE01-AREA01-UNIT01-LOC01
SITE01-AREA01-UNIT01-LOC02

Using the location reference on the work orders helps later on when analyzing data. Often wildcard characters (e.g. * or %) can be used when querying the work order records in a system. By means of the structure shown above, it's relatively easy to find all the work order costs for Area01 using an entry like SITE01-AREA01*.
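The wildcard query described above can be tried outside any EAM / CMMS with Python's standard `fnmatch` module. This is a sketch under stated assumptions: the work order list, its field names and the cost figures are hypothetical, and a real system would run the equivalent query against its own work order tables.

```python
from fnmatch import fnmatchcase

# Hypothetical work orders carrying the structured location ID.
work_orders = [
    {"location": "SITE01-AREA01-UNIT01-LOC01", "cost": 1200.0},
    {"location": "SITE01-AREA01-UNIT01-LOC02", "cost": 800.0},
    {"location": "SITE01-AREA02-UNIT01-LOC01", "cost": 500.0},
]


def total_cost(orders, pattern):
    """Sum the cost of all work orders whose location ID matches the wildcard pattern."""
    return sum(o["cost"] for o in orders if fnmatchcase(o["location"], pattern))


area01_cost = total_cost(work_orders, "SITE01-AREA01*")
site_cost = total_cost(work_orders, "SITE01*")
print(area01_cost, site_cost)
```

One design point worth noting: if segment values can share a prefix (for example AREA01 and AREA010), including the delimiter in the pattern, as in SITE01-AREA01-*, avoids matching the wrong area.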
Recall that the location represents an address in the process, while the equipment represents the physical object which has to be repaired or maintained. Locations often have process specific design considerations, usually found in the P&ID (process and instrumentation drawing). So, the location record is a good place to maintain key data related to the process, like system pressures, static head, etc. On the other hand, the equipment may move from one place to another in the process. A pump, instrument or valve which fails may be replaced with another unit while the failed unit is repaired for later use elsewhere in the process. The characteristics of these unique equipment assets should be maintained on the specific equipment record.

Since the equipment asset may be moved around in some fashion during its lifetime in the plant, it is not necessary to have a structure for the equipment record ID field. Most systems have an internal number sequencer which auto-numbers equipment records. Although external numbering assignments are often possible, they aren't really necessary since all that is needed is a unique identifier in the ID field.

There's a separate hierarchy for equipment assets which is based on their function. Typically there are one or more fields on the equipment record which may be used to represent this hierarchy structure. This structure normally begins with the equipment category, such as fixed, rotating, etc. Next is the equipment asset class, such as heat exchanger, pump, valve, etc. Then it is down to the specific equipment asset type (see examples below). This categorization of equipment assets, down to the equipment type, is useful when grouping work history records to understand where the failures are occurring and in support of other initiatives within the company.

Detection Phase: Writing the Work Request/Work Order


When failures occur, they are usually first recognized at the equipment level by operating or production units. The initial failure event is observed and reported during the detection phase.

When did the failure occur?


When a malfunction takes place, if the work request is entered promptly into the system, then accurate time periods are automatically recorded. The same is true of the time that it takes to restore the equipment from the malfunction. These time periods are needed later on to determine the MTBF (mean time between failures) and MTTR (mean time to repair). In SAP, the system proposes the malfunction start and end dates / times of the notification (work request) based on the creation and completion dates / times. These proposed values can be updated by the user as needed.
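To show why the recorded start and end times matter, here is a minimal Python sketch of the two measures. It assumes one common convention, which is not spelled out in this paper: MTTR is total downtime divided by the number of failures, and MTBF is total uptime in the observation period divided by the number of failures. The dates and the function names are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical malfunction start / end times for one equipment asset.
failures = [
    (datetime(2024, 1, 10, 8, 0), datetime(2024, 1, 10, 12, 0)),  # 4 hours down
    (datetime(2024, 2, 9, 8, 0), datetime(2024, 2, 9, 16, 0)),    # 8 hours down
    (datetime(2024, 3, 10, 8, 0), datetime(2024, 3, 10, 14, 0)),  # 6 hours down
]
period_start = datetime(2024, 1, 1)
period_end = datetime(2024, 4, 1)


def total_downtime(failures):
    """Sum the duration of every recorded malfunction."""
    return sum((end - start for start, end in failures), timedelta())


def mttr_hours(failures):
    """Mean time to repair: total downtime divided by number of failures."""
    return total_downtime(failures).total_seconds() / 3600 / len(failures)


def mtbf_hours(failures, period_start, period_end):
    """Mean time between failures: uptime in the period divided by number of failures."""
    uptime = (period_end - period_start) - total_downtime(failures)
    return uptime.total_seconds() / 3600 / len(failures)


print(mttr_hours(failures))
print(mtbf_hours(failures, period_start, period_end))
```

If a work request is entered days after the malfunction, the start time shifts and both figures are distorted, which is the practical reason for prompt entry.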

How was the failure discovered?


The failure event was discovered in some fashion; this is the method of detection. The failure may have been noticed because something affected the production of the manufactured product, or it may have been observed during normal rounds, found during routine tests, or discovered in some other way such as chance observation. This information is crucial to determine if the existing strategies are effective or if new strategies may be needed.

Where did the failure occur?


The location of the failure is documented by entering the work request in the system using the correct technical object (location or equipment record). Often system users find it easier to simply write the work request to some location that is easy to remember, rather than searching for the correct place in the system. It is better to use the lowest appropriate level in the hierarchy, such as the record which includes the actual maintainable item, rather than the higher level location or subsystem.

Also, be sure to indicate that a failure exists in some fashion. This is done either by using a specific work request / order type, setting an indicator on the work request / order or entering additional failure information on the work request / order. Having this information in the system makes it possible to distinguish component replacements which were due to failure related events from those which were non-failure replacements. Reliability analysis can be adversely affected by treating a data point as a failure event when it should have been treated as something else, such as a component replacement without failure (for other reasons).

It may be best to use a specific work request type or order type to report equipment failures or malfunctions. This will facilitate future analysis of the work history data; otherwise, it will be more difficult to understand the costs and labor needed to make repairs. These work requests / orders can be segregated from other activities such as preventive maintenance (PM), predictive maintenance (PdM), routine work, improvements, etc. In some systems, there is a unique indicator to show that a failure exists. For example, in SAP, the work request (notification) has a breakdown indicator. In fact, the notification in SAP is used to store all the technical history for the repair, while the maintenance work order is the controlling document used to plan, estimate and execute the work.

What is the symptom?


This first level of failure is called the failure mode in ISO 14224. It is the visible symptom at the equipment asset level. When this takes place, a work request or work order is usually entered in the EAM / CMMS. Similar to a parent taking their child to the clinic, all that is known at this time is the symptom. No detailed reporting or analysis is to occur at this point. For example, an operator may report that the pump failed to start. This is done by describing the problem and selecting the correct code for the failure mode in the system. Since the symptoms are somewhat generic, the same list of codes may typically be used for all equipment types.

What was the effect of the failure?

Usually, the person reporting the failure has some idea of the effect of the failure on the organization. There may have been no effect, or the failure may have affected the environment, safety or production. This information is not only useful in prioritizing any subsequent mitigating actions but is also an important aid in compliance reporting to third parties like environmental or safety regulatory agencies.

If possible, indicate the degree of failure or functional loss. The equipment experienced a malfunction, but to what extent did the equipment malfunction? Was it a complete failure, such as when the pump fails to start? Was it a partial failure, such as when the pump cannot maintain the desired flow rate? Was it a potential or latent failure? While these may not seem like actual failures, because they result from conditions which do not trigger an active fault, it is likely that they will do so at a future point, as in the example of corrosion in a piping system. Categorizing these events correctly supports developing better mitigation strategies moving forward.

Keep the guidelines simple, yet effective. When something malfunctions:
- Write the work request / order to the equipment asset
- Indicate that a failure exists
- Select the correct code for the failure mode
- Note who found the failure and how the failure was discovered
- Describe the problem
- Optionally, indicate the operational effect and the degree of failure

Many of the EAM / CMMS systems can make entering data a mandatory requirement in specific fields. This is good when limited to the minimum requirements, but can create problems if there are too many required fields. Do not require the system users to enter additional failure information, like maintainable item or failure mechanism, when creating the work request / order. At this point, they only know the observable symptom; it frustrates them and results in bogus entries into the system just to get past the required field entry. This additional information should be entered later by those making the repairs. It will be more accurate and meaningful during later analyses.
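The detection-phase guidelines above can be sketched as a simple record with a short list of mandatory fields. Everything in this Python example is hypothetical: the `WorkRequest` class, its field names and the `REQUIRED_AT_DETECTION` tuple are illustrative and are not the data model of any specific EAM / CMMS. The point is that only what the reporter can actually observe is required, while maintainable item and failure mechanism stay optional until the correction phase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkRequest:
    # Known to the person reporting the malfunction.
    equipment_id: str
    is_failure: bool
    failure_mode: str
    detection_method: str
    reported_by: str
    description: str
    # Optional at detection time.
    operational_effect: Optional[str] = None
    degree_of_failure: Optional[str] = None
    # Entered later by those making the repair.
    maintainable_item: Optional[str] = None
    failure_mechanism: Optional[str] = None


# Hypothetical minimum set of mandatory text fields when the request is created.
REQUIRED_AT_DETECTION = (
    "equipment_id",
    "failure_mode",
    "detection_method",
    "reported_by",
    "description",
)


def missing_fields(request):
    """Return the names of mandatory detection-phase fields that were left empty."""
    return [name for name in REQUIRED_AT_DETECTION if not getattr(request, name)]


request = WorkRequest(
    equipment_id="PUMP-101",
    is_failure=True,
    failure_mode="failed to start",
    detection_method="",
    reported_by="operator",
    description="Pump would not start after shutdown",
)
print(missing_fields(request))
```

Keeping the mandatory list this short is a deliberate choice in the sketch: the later codes are left empty rather than forcing a guess.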

Correction Phase: Doing the Work and Updating the Work Request/Work Order
During the investigation / correction phase, the maintenance technician will usually find that some component failed, which resulted in the malfunction of the equipment asset. This component is also known as a maintainable item, or object part (in SAP). It is important to distinguish components that were replaced directly because of the failure event from those that were replaced for other reasons. Sometimes other parts are replaced during the repair process, for example when the failed component caused a secondary component failure or when there was an opportunistic component replacement. If this additional information is not recorded, then there is no difference between replacement event data and failure event data in the system. Where this occurs, it can distort an analysis, and extra effort is required to scrub the data in order to make it useful. If components are replaced as a result of inspections or other forms of predictive maintenance indicating a potential, latent or incipient failure event, then this should be documented accordingly, because it is still a failure.

Typically, someone in the maintenance department enters the information required in this phase; it could be the technician, the maintenance planner or the supervisor. This information can be collected at various points in the process, but it should be reviewed for accuracy just prior to completing the work request / order. While the codes needed for the failure modes are somewhat generic or abstract, the codes needed for the maintainable item need to be more specific and tailored to the equipment type. There needs to be a good balance in these code sets. Keep in mind the need for just enough detail. It is easy to go overboard and build out the complete bill of material for an equipment asset as part of the maintainable item codes. The effort should be carefully considered before doing so - in many cases it is not necessary.

Bear in mind that, for the purpose of failure event reporting, the maintainable items which make up equipment asset types are codes which describe components found in the equipment. So, bearing is a code value which would apply to all the bearings in the equipment. This may be further divided by bearing type, such as ball bearing, roller bearing, roller thrust bearing, etc. Keep the results in mind. If subsequent analysis indicates that bearing failures are prevalent among the facility's asset population, then perhaps a review of the predictive / preventive maintenance strategies involved is in order. At the same time, the failure history records should be well kept at this maintainable item level, because it is the failed components which are driving the equipment asset malfunctions.

Reliability practitioners usually have a good perspective on the detail needed for these code sets. Sometimes the code sets are developed in-house by plant maintenance and engineering resources; not wanting to overlook anything, they may develop detailed code lists which are more itemized than necessary. In order to get results from the work history, the selection list of codes needs to be manageable. Too many codes in the selection list is confusing - it will not be easy for the system users to make the correct selections. In this situation, the easiest thing for the user would be to select an item at the top of the list and move on, negatively impacting the quality of the data in the system. From a practical standpoint, consider the quantity of codes which are visible on the screen. In SAP, for example, 21 items are visible on the screen; any more than that requires scrolling down to see additional items.

Something occurred which resulted in the failure of the maintainable item. For example, it may have been corrosion, fatigue or leakage. The manner in which the component failed is described as the failure mechanism. These codes are more broadly categorized and may be developed at the equipment asset category level. There may be a few notations within the higher level categories of mechanical, electrical, instrument and so forth. ISO 14224 provides an excellent reference in this regard.

Often confusion or other problems arise around the cause of the failure. The actual cause may not be fully understood. Also, coding the cause in the system may assign blame to others in operations, maintenance or management. However, understanding the failure cause is important to determine not only the necessary actions but also the extent of those actions. If the equipment is routinely being operated well outside of its intended operating parameters, it may be necessary to consider the many factors which may influence this. The codes used to document the cause are those which best categorize the underlying or root cause of the failure. The specific cause may not be well known at the time, and in some cases it may take a formal root cause analysis, using methodologies like PROACT for Meridium, to fully investigate and document the root cause. Again, the notations found in ISO 14224 are an excellent aid to developing these codes.

When restoring the equipment asset to its normal function, a specific activity is performed. Activity codes are somewhat generic in nature and include things like replaced, adjusted, modified, etc. Reliability practitioners use the activity in determining the nature of mitigating activities for the equipment asset. If a particular controlling mechanism is going out of tolerance over a specific time span (perhaps due to its operating environment), then periodic inspections and calibrations may be warranted.
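As an illustration of the coding described in this phase, the following minimal Python sketch models one coded component entry and separates failure events from other replacements. The class names, field names, reason categories and sample values are all hypothetical, and the choice of which reasons count as failure events is an assumption made for this example; none of it reflects an actual EAM / CMMS data model.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ReplacementReason(Enum):
    """Why a component was replaced (hypothetical categories)."""
    FAILURE = "failed component"
    POTENTIAL_FAILURE = "potential failure found by inspection"
    SECONDARY = "secondary damage from another failed component"
    OPPORTUNISTIC = "opportunistic replacement"


@dataclass(frozen=True)
class ComponentEvent:
    """One coded entry on a work request / work order."""
    maintainable_item: str
    failure_mechanism: str
    failure_cause: str
    activity: str
    reason: ReplacementReason


# Assumption: failed components and potential failures count as failure events;
# secondary and opportunistic replacements are kept but counted separately.
FAILURE_REASONS = {ReplacementReason.FAILURE, ReplacementReason.POTENTIAL_FAILURE}


def failures_by_item(events):
    """Count failure events per maintainable item, ignoring other replacements."""
    return Counter(
        e.maintainable_item for e in events if e.reason in FAILURE_REASONS
    )


events = [
    ComponentEvent("Ball bearing", "Fatigue", "Operated outside design limits",
                   "Replaced", ReplacementReason.FAILURE),
    ComponentEvent("Mechanical seal", "Leakage", "Unknown",
                   "Replaced", ReplacementReason.SECONDARY),
    ComponentEvent("Ball bearing", "Corrosion", "Unknown",
                   "Replaced", ReplacementReason.POTENTIAL_FAILURE),
    ComponentEvent("Coupling", "Not applicable", "Not applicable",
                   "Replaced", ReplacementReason.OPPORTUNISTIC),
]

print(failures_by_item(events))
```

With the sample data, only the two bearing entries are counted as failures. Without the reason field, all four replacements would look alike in the work history, which is the distortion this phase's coding is meant to prevent.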

Tasks deserve a passing mention. They may be documented in the system on the work order, but they are activities which may be performed in the future and are usually not needed for failure analysis. Often the simplest approach is to create another work request in the system for the future activity. This new work request will be managed separately, whereas a closed work request / work order offers no visibility.

During the correction phase:
- Update the work request / order during key steps in the process
- Select the correct codes for:
  - Maintainable item
  - Failure mechanism
  - Failure cause
  - Activity

[Table: Recommended Event Data for Reliability Analysis]

Data automatically recorded in the system

As mentioned earlier on the topic of failure date, some information is automatically captured in the EAM / CMMS system. This includes dates and times for specific processing stages of the work request / work order. There may be many types of date and time stamps in the system. Basically, the document creation date and completion date may be used to determine the overall timeframe during which the asset was unavailable. Sometimes there are data fields to specifically document the malfunction start and finish dates and times, as in the case of SAP.

It is also important to understand the costs of the maintenance or repairs. This is another reason why it is important to create the work request / work order using the correct technical object. Some find it expedient on occasion to circumvent established approval processes by using multiple work orders to make a single repair, or to make a modification to existing equipment over a period of time. This should be guarded against because it distorts the true count of failure events. Production costs are useful in making decisions about the priority of strategies and in understanding the potential improvement benefits. While these costs may not be directly available in the EAM / CMMS, having a way to relate them to the failure events is invaluable input for reliability practitioners.
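The use of the automatically recorded dates and costs can be sketched in Python as follows. This is a minimal example under stated assumptions: the `WorkOrder` record, its field names and the sample values are hypothetical, and the rule of falling back to the creation and completion dates when no malfunction dates are recorded is one reasonable reading of the text, not a documented system behavior.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class WorkOrder:
    """Hypothetical extract of a completed work order."""
    asset_id: str
    created: datetime
    completed: datetime
    cost: float
    malfunction_start: Optional[datetime] = None
    malfunction_end: Optional[datetime] = None


def downtime_hours(order: WorkOrder) -> float:
    """Prefer explicit malfunction dates; otherwise use document dates."""
    start = order.malfunction_start or order.created
    end = order.malfunction_end or order.completed
    return (end - start).total_seconds() / 3600.0


def cost_by_asset(orders):
    """Total repair cost per asset across all of its work orders."""
    totals = defaultdict(float)
    for order in orders:
        totals[order.asset_id] += order.cost
    return dict(totals)


orders = [
    # No malfunction dates recorded: creation and completion dates are used.
    WorkOrder("P-101", datetime(2024, 1, 1, 8), datetime(2024, 1, 2, 8), 1200.0),
    # Malfunction dates recorded: they take precedence over document dates.
    WorkOrder("P-101", datetime(2024, 3, 5, 9), datetime(2024, 3, 6, 9), 800.0,
              malfunction_start=datetime(2024, 3, 5, 6),
              malfunction_end=datetime(2024, 3, 5, 18)),
]

for order in orders:
    print(order.asset_id, downtime_hours(order))
print(cost_by_asset(orders))
```

Summing costs per asset only gives a true picture when every work order is raised against the correct technical object, and counting orders only reflects failure events when a single repair is not split across several orders.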

Event Data for Reliability Analysis: Summary Requirements

Although somewhat subjective, the example on page 10 summarizes the recommended data needed for reliability analysis and where the data may be captured.

Event Data for Reliability Analysis: EAM/CMMS Comparisons

Terms used in describing event data for reliability analysis may vary between EAM / CMMS systems. This section will map the necessary event data for reliability analysis to corresponding fields in several EAM / CMMS systems and show the data entry method.

Many EAM / CMMS systems are designed to support both configuration and customization. Configuration is the structure provided by the software provider which enables the company to tailor the software solution to their specific business processes. Configuration is integrated throughout the various modules of the software solution. It can include the fields that will be displayed on a specific screen as well as the data in the pick or selection list for those fields. Customization goes beyond configuration and extends the software solution using specific additional programming. Some of the data needed for reliability analysis are easily provided using configuration, while other data may require customization.

SAP

SAP uses catalogs, code groups and codes to develop the items which may be used on the notification when coding the technical history of work. A specific collection of these is assigned to a catalog profile. Technical objects (functional location or equipment records) may have a single catalog profile assigned to them. When a notification is raised in the system, it is referenced to a technical object. It then uses the codes assigned to the catalog profile from the technical object. Options are available when the desired data for reliability does not map over directly to a corresponding field in SAP notifications.
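The relationship between catalogs, code groups, codes, catalog profiles and technical objects described for SAP in this section can be pictured with a small Python sketch. This is a conceptual model only: the dictionary names, catalog labels, code group names and code values are hypothetical and do not correspond to SAP tables, transactions or APIs.

```python
# Hypothetical code groups: (catalog, code group) -> {code key: description}
CODE_GROUPS = {
    ("object part", "PUMP-PARTS"): {"0010": "Bearing", "0020": "Mechanical seal"},
    ("object part", "MOTOR-PARTS"): {"0010": "Winding", "0020": "Bearing"},
    ("damage", "MECH-DAMAGE"): {"0010": "Corrosion", "0020": "Fatigue"},
    ("cause", "GEN-CAUSE"): {"0010": "Operated outside design limits"},
}

# A catalog profile is a named collection of code groups.
CATALOG_PROFILES = {
    "PUMP": [("object part", "PUMP-PARTS"), ("damage", "MECH-DAMAGE"),
             ("cause", "GEN-CAUSE")],
    "MOTOR": [("object part", "MOTOR-PARTS"), ("cause", "GEN-CAUSE")],
}

# Each technical object carries a single catalog profile.
TECHNICAL_OBJECTS = {"P-101": "PUMP", "M-201": "MOTOR"}


def codes_for_notification(technical_object: str) -> dict:
    """Return the codes a notification on this technical object may use."""
    profile = TECHNICAL_OBJECTS[technical_object]
    available = {}
    for catalog, group in CATALOG_PROFILES[profile]:
        available.setdefault(catalog, {})[group] = CODE_GROUPS[(catalog, group)]
    return available


print(codes_for_notification("M-201"))
```

Because the code selection is driven by the profile on the technical object, a notification raised against the wrong object will offer the wrong pick lists, which is one more reason to reference the correct technical object.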

Indus PassPort

Indus PassPort (also known as Asset Suite) supports the capture of failure codes at the work order task level. Any number of failure modes and root causes can be captured by task, plus reason category and reason code. All of these codes are user-defined; the failure mode codes are qualified by equipment type, while the root cause codes are qualified by failure mode. In addition, a Trouble/Breakdown flag is set to indicate the task which represents the failure.

Coding structure differences


Coding structure may vary between different EAM / CMMS systems. The solutions may range from a static list of values to a dynamic interrelated selection list. It's important to understand the technical structure used before developing the set of failure codes down to the equipment asset type.


IBM Maximo

Maximo is designed to support a series of cascading pick lists where the values at the current level are determined by the selection at the higher level. After the user selects the code for problem, the system presents the set of values for cause. The selection used for cause determines the set of values for remedy.
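A cascading pick list of this kind can be sketched in Python as a nested mapping in which each selection narrows the next list. The structure, function names and sample values below are hypothetical and serve only to illustrate the problem, cause and remedy cascade; they are not Maximo's actual data model or API.

```python
# Hypothetical hierarchy: problem -> cause -> list of remedies
FAILURE_HIERARCHY = {
    "Pump does not start": {
        "Motor winding failed": ["Replaced motor", "Rewound motor"],
        "Coupling sheared": ["Replaced coupling"],
    },
    "Excessive vibration": {
        "Bearing worn": ["Replaced bearing"],
        "Misalignment": ["Realigned shaft"],
    },
}


def problems():
    """Top-level pick list."""
    return sorted(FAILURE_HIERARCHY)


def causes_for(problem):
    """Causes offered once a problem has been selected."""
    return sorted(FAILURE_HIERARCHY[problem])


def remedies_for(problem, cause):
    """Remedies offered once a problem and a cause have been selected."""
    return list(FAILURE_HIERARCHY[problem][cause])


print(problems())
print(causes_for("Excessive vibration"))
print(remedies_for("Excessive vibration", "Misalignment"))
```

The benefit of the cascade is that each list stays short, because the user only sees the values that are valid for the selection already made.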

Oracle eAM

Oracle eAM uses an asset group hierarchy which includes a predefined item template to create the navigational category of assets in conjunction with optional intermediate levels. Several options exist to include the location / use information in the taxonomy. Oracle has a variety of solutions and provides considerable flexibility in setting up these codes. The Activity Type and Activity Cause fields are found on the work order header. Other information may be entered using Quality Collection Plans and Oracle FlexFields.

Library of Failure Codes

Often it is helpful to use a workbook / spreadsheet to build the complete library of failure codes outside of the EAM / CMMS system, to make sure the code keys and descriptions follow the naming convention and adhere to any enterprise data standards within the company. In years past, code keys were often developed using 3 or 4 letter abbreviations, which were supplemented with a description. For example, FTS would be used to indicate that the equipment failed to start. This isn't really necessary and is somewhat counterproductive. Originally, the 3 letter codes were used because of system limitations which made it difficult to display the descriptive text. In the EAM / CMMS system the codes are typically sorted in the selection list by the key value, so if the codes were developed using a 3 letter abbreviation, then the letter that is used determines the order of placement in the list. Many systems today support multiple languages - in software terms this is called localization - so a 3 letter code for failed-to-start may be totally meaningless in another language.

The most straightforward approach is to use an alphanumeric sequence as the keys for the codes within a specific category. Be sure to allow for the insertion of additional codes later on in the existing list. The best way to do this is to number the items in increments of ten from the start (0010, 0020, 0030, etc.). This way a new item may be inserted between existing items by using an intermediate number (e.g. 0015).

How Do I Use This Information?

- To provide everyone in your organization with a basic understanding of failure coding
- As a starting point for individuals responsible for the development of failure codes
- To augment the experience of those guiding failure code implementations - aiding stakeholders, champions and, ultimately, the business sponsors as they make decisions during implementation efforts
- To measure the meaningfulness of your current failure codes
- To aid in the review process if you currently have failure codes in your system, but their effectiveness is limited

In closing, please keep in mind that having and using good failure event codes is a necessary beginning. The quality and accuracy of this data directly affects the ability of skilled practitioners to maximize the value of improvement opportunities.


References
1 http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36979
2 http://www.sintef.no/static/tl/projects/oreda/
3 http://www.meridium.com/news_events/articles/articles.asp?article_ID=14
4 http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=29556
5 http://en.wikipedia.org/wiki/ISO_15926_WIP
6 http://projects.dnv.com/reference_data/RD7Browser/doc/general.aspx
7 http://www.bsi-global.com/en/Shop/PublicationDetail/?pid=000000000030077936

About the author


Ralph Hanneman, CMRP Senior Consultant, Meridium, Inc.
Ralph is a Senior Consultant for Meridium, Inc. He began his career in the Navy, where he obtained considerable experience in the maintenance and operation of all facets of shipboard electrical systems, as well as in the Navy's preventive maintenance methodology. He has nearly 20 years of maintenance management and project engineering experience in pulp & paper, automotive parts manufacturing and heavy construction (tunnel boring). Ralph has substantial experience in many facets of SAP, gained over five years of ERP implementation projects in Plant Maintenance roles. He is experienced in SAP's BI solution, in particular Plant Maintenance reporting. Since joining Meridium in early 2007, Ralph has been involved in pilots and enterprise implementations of Meridium with PEMEX, BMA Coal, Hydro One, Hess, Rio Tinto, Bruce Power, Samarco Mineração and Flint Hills Resources in consulting and project management roles. He has conducted failure code development workshops for clients using SAP and Oracle eAM.


Corporate Headquarters Roanoke, Virginia, USA +1.540.344.9205 Regional Office Houston, Texas, USA +1.281.920.9616 Europe Walldorf, Germany +49.6227.7.33890 Middle East, Africa Dubai, United Arab Emirates +971.4.365.4808 Asia Pacific Perth, Australia +61.08.6465.2000 www.meridium.com info@meridium.com

Contact Meridium to develop a plan to learn more about how failure and event coding impact your Asset Performance Management initiative.
The views and statements made in this document are based on the experience and opinions of Meridium consultants and have not been evaluated by any authors or governing body.