
OSF Service Support

Problem Management Process


[Version 1.1]

Table of Contents
About this document
Chapter 1. Problem Process
1.1. Primary goal
1.2. Process Definition
1.3. Objectives
1.4. Definitions
1.4.1. Impact
1.4.2. Incident
1.4.3. Known Error Record
1.4.4. Knowledge Base
1.4.5. Problem
1.4.6. Problem Repository
1.4.7. Priority
1.4.8. Response
1.4.9. Resolution
1.4.10. Service Agreement
1.4.11. Service Level Agreement
1.4.12. Service Level Target
1.4.13. Severity
1.5. Problem Scope
1.5.1. Exclusions
1.6. Inputs and Outputs
1.7. Metrics
Chapter 2. Roles and Responsibilities
2.1. OSF ISD Service Desk
2.2. Quality Assurance
2.3. Service Provider Group
2.4. Problem Reporter
2.5. Problem Management Review Team
Chapter 3. Problem Categorization, Target Times, Prioritization, and Escalation
3.1. Categorization
3.2. Priority Determination
3.3. Workarounds
3.4. Known Error Record
3.5. Major Problem Review
Chapter 4. Process Flow
4.1. Problem Management Process Flow Steps
Chapter 5. RACI Chart
Chapter 6. Reports and Meetings
6.1. Reports
6.1.1. Service Interruptions
6.1.2. Metrics
6.1.3. Meetings
Chapter 7. Problem Policy

About this document


This document describes the Problem Process. The Process provides a consistent method for everyone to
follow when working to resolve severe or recurring issues regarding services from the Office of State
Finance Information Services Division (OSF ISD).

Who should use this document?


This document should be used by:
OSF ISD personnel responsible for the restoration of services and for the analysis and remediation of the root
cause of problems
OSF ISD personnel involved in the operation and management of the Problem Process

Summary of changes
This section records the history of significant changes to this document. Only the most significant changes
are described here.
Version   Date        Author         Description of change
1.0       1/14/2011   OW Thomasson   Initial version

Where significant changes are made to this document, the version number will be incremented by 1.0.
Where changes are made for clarity and reading ease only and no change is made to the meaning or
intention of this document, the version number will be increased by 0.1.

263666651.doc

Page 1 of 15

Chapter 1. Problem Process


1.1. Primary goal
Problem Management is the process responsible for managing the lifecycle of all problems. The primary
objectives of Problem Management are to:

prevent problems and resulting incidents from happening.

eliminate recurring incidents.

minimize the impact of incidents that cannot be prevented.

1.2. Process Definition


Problem Management includes the activities required to diagnose the root cause of incidents and to
determine the resolution to those problems. It is also responsible for ensuring that the resolution is
implemented through the appropriate control procedures.

1.3. Objectives
Provide a consistent process to track Problems that ensures:

Problems are properly logged

Problems are properly routed

Problem status is accurately reported

Queue of unresolved Problems is visible and reported

Problems are properly prioritized and handled in the appropriate sequence

Resolution provided meets the requirements of the SLA for the customer

1.4. Definitions
1.4.1. Impact
Impact is determined by how many personnel or functions are affected. There are three grades of impact:

3 - Low - One or two personnel. Service is degraded but still operating within SLA specifications.

2 - Medium - Multiple personnel in one physical location. Service is degraded and still functional but not operating
within SLA specifications. It appears the cause of the Problem falls across multiple service provider
groups.

1 - High - All users of a specific service. Personnel from multiple agencies are affected. Public facing
service is unavailable.

The impact of the incidents associated with a problem will be used in determining the priority for resolution.

1.4.2. Incident
An incident is an unplanned interruption to an IT Service or reduction in the Quality of an IT Service. Failure
of any Item, software or hardware, used in the support of a system that has not yet affected service is also
an Incident. For example, the failure of one component of a redundant high availability configuration is an
incident even though it does not interrupt service.
An incident occurs when the operational status of a production item changes from working to failing or about
to fail, resulting in a condition in which the item is not functioning as it was designed or implemented. The
resolution for an incident involves implementing a repair to restore the item to its original state.


A design flaw does not create an incident. If the product is working as designed, even though the design is
not correct, the correction needs to take the form of a service request to modify the design. The service
request may be expedited based upon the need, but it is still a modification, not a repair.

1.4.3. Known Error Record


An entry in a table in CRM which includes the symptoms related to open problems and the incidents the
problem is known to create. If available, the entry will also have a link to entries in the Knowledge Base
which show potential workarounds to the problem.

1.4.4. Knowledge Base


A database housed within CRM that contains information on how to fulfill requests and resolve incidents
using previously proven methods / scripts.

1.4.5. Problem
A problem is the underlying cause of an incident.

1.4.6. Problem Repository


The Problem Repository is a database containing relevant information about all problems whether they have
been resolved or not. General status information along with notes related to activity should also be
maintained in a format that supports standardized reporting. At OSF ISD, the Problem Repository is
contained within PeopleSoft CRM.

1.4.7. Priority
Priority is determined by a combination of the problem's impact and severity. For a full explanation
of how priority is determined, refer to the paragraph titled Priority Determination.

1.4.8. Response
Time elapsed between the time the problem is reported and the time it is assigned to an individual for
resolution.

1.4.9. Resolution
The root cause of incidents is corrected so that the related incidents do not continue to occur.

1.4.10. Service Agreement


A Service Agreement is a general agreement outlining services to be provided, as well as costs of services
and how they are to be billed. A service agreement may be initiated between OSF/ISD and another agency
or a non-state government entity. A service agreement is distinguished from a Service Level Agreement in
that there are no ongoing service level targets identified in a Service Agreement.

1.4.11. Service Level Agreement


Often referred to as the SLA, the Service Level Agreement is the agreement between OSF ISD and the
customer outlining services to be provided, and operational support levels as well as costs of services and
how they are to be billed.

1.4.12. Service Level Target


Service Level Target is a commitment that is documented in a Service Level Agreement. Service Level
Targets are based on Service Level Requirements, and are needed to ensure that the IT Service continues
to meet the original Service Level Requirements. Service Level Targets are relevant in that they are tied to
Incidents and Assistance Service Requests. There are no targets tied to Problem Management.


1.4.13. Severity
Severity is determined by how much the user is restricted from performing their work. There are three
grades of severity:
3 - Low - Issue prevents the user from performing a portion of their duties.
2 - Medium - Issue prevents the user from performing critical time sensitive functions
1 - High - Service or major portion of a service is unavailable
The severity of a problem will be used in determining the priority for resolution.

1.5. Problem Scope


Problem Management includes the activities required to diagnose the root cause of incidents and to
determine the resolution to those problems. It is also responsible for ensuring that the resolution is
implemented through the appropriate control procedures, especially Change Management and Release
Management.
Problem Management will also maintain information about problems and the appropriate workarounds and
resolutions, so that the organization is able to reduce the number and impact of incidents over time. In this
respect, Problem Management has a strong interface with Knowledge Management, and tools such as the
Known Error Database will be used for both.
Although Incident and Problem Management are separate processes, they are closely related and will
typically use the same tools, and use the same categorization, impact and priority coding systems. This will
ensure effective communication when dealing with related incidents and problems.

1.5.1. Exclusions
Request fulfillment (i.e., Service Requests and Service Catalog Requests) is not handled by this process.
Initial incident handling to restore service is not handled by this process. Refer to Incident Management.

1.6. Inputs and Outputs


Input                   From
Problem                 Service Desk, Problem Management Team, Service Provider Group
Categorization Tables   Functional Groups
Assignment Rules        Functional Groups

Output                                                 To
Standard notification to the problem reporter and QA   Problem Reporter, QA Manager
when case is closed

1.7. Metrics
Metric: Process tracking metrics. Number of Problems by type, status, and customer (see
detail under Reports and Meetings).

Purpose: To determine whether problems are being processed in a reasonable time frame, to track the
frequency of specific types of problems, and to determine where bottlenecks exist.

Chapter 2. Roles and Responsibilities


Responsibilities may be delegated, but escalation does not remove responsibility from the individual
accountable for a specific action.

2.1. OSF ISD Service Desk


Ensures that all problems received by the Service Desk are recorded in CRM
Delegates responsibility by assigning problems to the appropriate provider group for resolution based upon
the categorization rules
Performs post-resolution customer review to ensure that all work services are functioning properly

2.2. Quality Assurance


Owns all reported problems
Identifies the nature of problems based upon reported symptoms and categorization rules supplied by provider
groups
Prioritizes problems based upon impact to the users and SLA guidelines
Is responsible for problem closure
Prepares reports showing statistics of problems resolved / unresolved

2.3. Service Provider Group


Composed of technical and functional staff involved in supporting services
Performs root cause analysis of the problem and develops potential solutions
Tests potential solutions and develops an implementation plan

2.4. Problem Reporter


Anyone within OSF / ISD can request a problem case to be opened.
The typical sources for problems are the Service Desk, Service Provider Groups, and proactive problem
management through Quality Assurance.

2.5. Problem Management Review Team


This may be multiple teams depending upon the service supported
Composed of technical and functional staff involved in supporting services, Service Desk, and Quality
Assurance


Chapter 3. Problem Categorization, Target Times, Prioritization, and Escalation

In order to adequately determine if SLAs are met, it will be necessary to correctly categorize and prioritize
problems quickly.

3.1. Categorization
The goals of proper categorization are:

Identify Service impacted

Associate problems with related incidents

Indicate what support groups need to be involved

Provide meaningful metrics on system reliability

For each problem, the specific service (as listed in the published Service Catalog) will be identified. It is
critical to establish with the user the specific area of the service being provided. For example, if it's
PeopleSoft, is it Financials, Human Resources, or another area? If it's PeopleSoft Financials, is it for
General Ledger, Accounts Payable, etc.? Identifying the service properly establishes the appropriate Service
Level Agreement and relevant Service Level Targets.
In addition, the severity and impact of the problem also need to be established. All problems are important to
the user, but problems that affect large groups of personnel or mission critical functions need to be
addressed before those affecting one or two people.
Does the problem cause a work stoppage for the user, or do they have other means of performing their job?
For example, a broken link on a web page is an incident, but if there is another navigation path to the
desired page, the incident's severity would be low because the user can still perform the needed function.
Conversely, a problem may create a work stoppage for only one person, yet the impact is far greater because it
affects a critical function. An example of this scenario would be the person processing payroll having an issue
which prevents the payroll from processing. The impact affects many more personnel than just the user.

3.2. Priority Determination


The priority given to a problem, which determines how quickly it is scheduled for resolution, is set
by a combination of the related incidents' severity and impact.
Problem Priority

Severity (columns):
3 - Low - Issue prevents the user from performing a portion of their duties.
2 - Medium - Issue prevents the user from performing critical time sensitive functions.
1 - High - Service or major portion of a service is unavailable.

Impact (rows):
3 - Low - One or two personnel. Degraded Service Levels but still processing within SLA constraints.
2 - Medium - Multiple personnel in one physical location. Degraded Service Levels but not processing within SLA
constraints or able to perform only minimum level of service. It appears cause of incident falls across multiple
functional areas.
1 - High - All users of a specific service. Personnel from multiple agencies are affected. Public facing service is
unavailable. Any item listed in the Crisis Response tables.

Impact \ Severity   3 - Low      2 - Medium   1 - High
3 - Low             3 - Low      3 - Low      2 - Medium
2 - Medium          2 - Medium   2 - Medium   1 - High
1 - High            1 - High     1 - High     1 - High
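
The priority lookup can be sketched in Python. This is a minimal illustration only, not part of the CRM configuration: the names Level, PRIORITY_MATRIX, and problem_priority are hypothetical, and the cell values assume the Problem Priority table is read with impact as rows and severity as columns.

```python
from enum import IntEnum


class Level(IntEnum):
    """Grades shared by impact, severity, and priority (1 is the most urgent)."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Keys are (impact, severity); values are the resulting problem priority.
# Assumption: this mirrors the Problem Priority table, rows = impact, columns = severity.
PRIORITY_MATRIX = {
    (Level.LOW, Level.LOW): Level.LOW,
    (Level.LOW, Level.MEDIUM): Level.LOW,
    (Level.LOW, Level.HIGH): Level.MEDIUM,
    (Level.MEDIUM, Level.LOW): Level.MEDIUM,
    (Level.MEDIUM, Level.MEDIUM): Level.MEDIUM,
    (Level.MEDIUM, Level.HIGH): Level.HIGH,
    (Level.HIGH, Level.LOW): Level.HIGH,
    (Level.HIGH, Level.MEDIUM): Level.HIGH,
    (Level.HIGH, Level.HIGH): Level.HIGH,
}


def problem_priority(impact: Level, severity: Level) -> Level:
    """Look up the priority for a given impact and severity grade."""
    return PRIORITY_MATRIX[(impact, severity)]
```

For example, problem_priority(Level.LOW, Level.HIGH) returns Level.MEDIUM under this reading: a service outage that affects only one or two people is scheduled as a medium priority problem.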


3.3. Workarounds
In some cases it may be possible to find a workaround to the incidents caused by the problem, that is, a temporary
way of overcoming the difficulties. For example, an SQL statement may be run against a file to allow a program
to complete its run successfully and allow a billing process to complete satisfactorily.
In some cases, the workaround may be instructions provided to the customer on how to complete their work
using an alternate method. These workarounds need to be communicated to the Service Desk so they can
be added to the Knowledge Base and therefore be accessible by the Service Desk to facilitate resolution
during future recurrences of the incident.
In cases where a workaround is found, it is important that the problem record remains open and details of
the workaround are always documented within the Problem Record.

3.4. Known Error Record


As soon as the diagnosis is far enough along to clearly identify the problem and its symptoms, and
particularly where a workaround has been found (even though it may not yet be a permanent resolution), a
Known Error Record must be raised and placed in the Known Error tables within CRM so that if further
incidents or problems arise, they can be identified and the service restored more quickly.
However, in some cases it may be advantageous to raise a Known Error Record even earlier in the overall
process just for information purposes, for example even though the diagnosis may not be complete or a
workaround found.
The Known Error Record must contain all known symptoms so that when a new incident occurs, a search of
known errors can find the appropriate match.
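
The symptom search described above can be sketched as follows. This is a hypothetical illustration, not the CRM Known Error tables: the KnownErrorRecord fields, the normalize helper, and the ranking by number of shared symptoms are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownErrorRecord:
    """Hypothetical stand-in for an entry in the Known Error tables."""
    problem_id: str
    symptoms: set
    workaround: Optional[str] = None


def normalize(symptom: str) -> str:
    """Lower-case and collapse whitespace so near-identical wording matches."""
    return " ".join(symptom.lower().split())


def find_known_errors(records, incident_symptoms):
    """Return (record, shared_symptom_count) pairs, best match first.

    Only records that share at least one symptom with the incident are returned.
    """
    wanted = {normalize(s) for s in incident_symptoms}
    matches = []
    for record in records:
        shared = wanted & {normalize(s) for s in record.symptoms}
        if shared:
            matches.append((record, len(shared)))
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return matches
```

Exact matching on normalized text is the simplest choice for a sketch; a real search would likely need keyword or category matching, which is why complete symptom lists in the record matter.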

3.5. Major Problem Review


Each major (priority 1) problem will be reviewed on a weekly basis to determine progress made and what
assistance may be needed. The review will include:
Which configuration items failed
Specifics about the failure
What efforts toward root cause analysis are being made
What solutions are being considered
Time frame to implement the solution
What could be done better in the future to identify the issue for earlier correction
How to prevent recurrence
Whether there has been any third-party responsibility and whether follow-up actions are needed
Any lessons learned will be documented in appropriate procedures, work instructions, diagnostic scripts or
Known Error Records. The Problem Manager (Quality Assurance Manager) facilitates the session and
documents any agreed actions.


Chapter 4. Process Flow


The following is the standard problem management process flow outlined in ITIL Service Operation but represented as a swim lane chart with associated roles within OSF
ISD.


4.1. Problem Management Process Flow Steps


Role

Step

Description

Problem
Reporter

Problems can be reported by any group within OSF/ISD that has the
opportunity to recognize a situation that is likely to create incidents. The
Service Desk or the Service Provider Group may recognize there is a
problem because of multiple related incidents. Quality Assurance or other
groups may do trend analysis to identify potential recurring issues.

Problem
Management
Review Team

Problem detection
It is likely that multiple ways of detecting problems will exist in all
organizations. These will include:
- Suspicion or detection of an unknown cause of one or more incidents by
the Service Desk, resulting in a Problem Record being raised. The desk
may have resolved the incident but has not determined a definitive cause
and suspects that it is likely to recur, so will raise a Problem Record to
allow the underlying cause to be resolved. Alternatively, it may be
immediately obvious from the outset that an incident, or incidents, has
been caused by a major problem, so a Problem Record will be raised
without delay.
- Analysis of an incident by a technical support group which reveals that
an underlying problem exists, or is likely to exist.
- Automated detection of an infrastructure or application fault, using
event/alert tools automatically to raise an incident which may reveal the
need for a Problem Record.
- Analysis of incidents as part of proactive Problem Management,
resulting in the need to raise a Problem Record so that the underlying fault
can be investigated further.

Problem
Management
Review Team

Problem Logging
Regardless of the detection method, all the relevant details of the problem
must be recorded so that a full historic record exists. This must be date
and time stamped to allow suitable control and escalation.
A cross-reference must be made to the incident(s) which initiated the
Problem Record and all relevant details must be copied from the
Incident Record(s) to the Problem Record. It is difficult to be exact, as
cases may vary, but typically this will include details such as:
User details
Service details
Equipment details
Date/time initially logged
Priority and categorization details
Incident description
Details of all diagnostic or attempted recovery actions taken.

Problem Categorization
Problems must be categorized in the same way as incidents using the
same codes so that the true nature of the problem can be easily tied to the
supported service and related incidents.


Problem Prioritization
Problems must be prioritized in the same way and for the same reasons
as incidents but the frequency and impact of related incidents must also
be taken into account. Before a problem priority can be set, the severity
and impact need to be assessed. See paragraph 3.2 Priority
Determination. Once the severity and impact are set, the priority can be
derived using the prescriptive table.

Problem Investigation and Diagnosis


An investigation should be conducted to try to diagnose the root cause of
the problem. The speed and nature of this investigation will vary
depending upon the priority.

Workarounds
In some cases it may be possible to find a workaround to the incidents
caused by the problem, that is, a temporary way of overcoming the difficulties. In
cases where a workaround is found, it is important that the problem record
remains open, and details of the workaround are always documented
within the Problem Record.

Raising a Known Error Record


As soon as the diagnosis has progressed enough to know what the
problem is, even though the cause may not yet be identified, a Known
Error Record must be raised and placed in the Known Error Database so
that if further incidents arise, they can be identified and related to the
problem record.

Has the root cause been determined and a solution identified?

Problem resolution
As soon as a solution has been found and sufficiently tested, it should be
fully documented and prepared for implementation.

Problem
Management
Review Team /
Change
Management /
Solution
Provider Group

Changes to production to implement the solution need to be scheduled
and approved through the Change Management process.

Problem
Management
Review Team

Problem Closure
When any change has been completed (and successfully reviewed), and
the resolution has been applied, the Problem Record should be formally
closed as should any related Incident Records that are still open. A
check should be performed at this time to ensure that the record contains
a full historical description of all events and if not, the record should be
updated.
The status of any related Known Error Record should be updated to
show that the resolution has been applied.

Service Provider
Group Managers
& CTO

Weekly review of the status of open major (priority 1) problems (See
Paragraph 3.5 Major Problem Review)

Solution
Provider Group


Chapter 5. RACI Chart


Obligation                    Role Description
Responsible                   Responsible to perform the assigned task
Accountable (only 1 person)   Accountable to make certain work is assigned and performed
Consulted                     Consulted about how to perform the task appropriately
Informed                      Informed about key events regarding the task

Roles (chart columns): Service Provider Group, Service Desk Mgr, Service Desk, Service Provider Group Mgr,
QA Manager

Activity
Record Problem in CRM
Categorize problem according to service and priority
Perform Root Cause Analysis
Develop Solution
Document conditions for known problem record
Create known problem record
Document workaround solution
Enter workaround solutions into knowledge base
Update CRM with current status on problem analysis & resolution
Verify solution with customer


Chapter 6. Reports and Meetings


A critical component of success in meeting service level targets is for OSF / ISD to hold itself
accountable for deviations from acceptable performance. This will be accomplished by producing
meaningful reports that can be utilized to focus on areas that need improvement. The reports must then
be used in coordinated activities aimed at improving the support.

6.1. Reports
6.1.1. Service Interruptions
A report showing all problems related to service interruptions will be reviewed weekly during the
operational meeting. The purpose is to discover how serious the problem was, what steps are being
taken to prevent recurrence, and whether the root cause needs to be pursued.

6.1.2. Metrics
Metrics reports should generally be produced monthly with quarterly summaries. Metrics to be
reported are:

Total number of problems (as a control measure)

Breakdown of problems at each stage (e.g. logged, work in progress, closed, etc.)

Size of current problem backlog

Number and percentage of major problems
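
The four metrics above can be computed from a simple list of problem records, as in the sketch below. It is illustrative only: the Problem fields and stage names are hypothetical, the backlog is assumed to mean every problem not yet closed, and a major problem is taken to be priority 1, as in the Major Problem Review.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Problem:
    """Hypothetical minimal problem record for reporting purposes."""
    problem_id: str
    stage: str     # e.g. "logged", "work in progress", "closed"
    priority: int  # 1 = High (major), 2 = Medium, 3 = Low


def problem_metrics(problems):
    """Summarize total, per-stage breakdown, backlog, and major problems."""
    total = len(problems)
    by_stage = Counter(p.stage for p in problems)
    # Assumption: the backlog is everything that has not reached "closed".
    backlog = sum(1 for p in problems if p.stage != "closed")
    major = sum(1 for p in problems if p.priority == 1)
    major_pct = 100.0 * major / total if total else 0.0
    return {
        "total": total,
        "by_stage": dict(by_stage),
        "backlog": backlog,
        "major": major,
        "major_pct": major_pct,
    }
```

Running the same function over the problems logged in a month and again over a quarter would give the monthly reports and quarterly summaries described above.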

6.1.3. Meetings
The Quality Assurance Manager will conduct sessions with each service provider group to review
performance reports. The sessions will cover:
Status of previously identified problems
Workaround solutions that need to be developed until the root cause can be corrected
Newly identified problems


Chapter 7. Problem Policy


The Problem process should be followed to find and correct the root cause of significant or recurring
incidents.
Problems should be prioritized based upon impact to the customer and the availability of a
workaround.
Problem ownership remains with Quality Assurance. Regardless of where a problem is referred
during its life, ownership of the problem remains with Quality Assurance at all times. Quality
Assurance remains responsible for tracking progress, keeping users informed and ultimately for
Problem Closure.
Rules for re-opening problems - Despite all adequate care, there will be occasions when problems
recur even though they have been formally closed. If the related incidents continue to occur under the
same conditions, the problem case should be re-opened. If similar incidents occur but the conditions
are not the same, a new problem should be opened.
Workarounds should be in conformance with OSF ISD standards and policies.
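
The re-opening rule can be expressed as a small decision function. This is a sketch under a stated assumption: "the same conditions" is modeled here as an exact match between two sets of condition labels, which is a simplification, and the names Action and recurrence_action are hypothetical.

```python
from enum import Enum


class Action(Enum):
    REOPEN = "re-open the closed problem"
    OPEN_NEW = "open a new problem"


def recurrence_action(closed_conditions: frozenset, new_conditions: frozenset) -> Action:
    """Decide what to do when incidents recur after a problem was closed.

    Assumption: conditions are compared as sets of labels; identical sets
    count as "the same conditions", anything else as different conditions.
    """
    if new_conditions == closed_conditions:
        return Action.REOPEN
    return Action.OPEN_NEW
```

In practice the comparison of conditions is a judgment made by Quality Assurance; the function only makes the two outcomes of the policy explicit.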


In Problem Categorization and Prioritization, it has been made clearer that categorization and
prioritization should be harmonized with the approach used in Incident Management, to
facilitate matching between Incidents and Problems.
A new sub-process Major Problem Review was introduced in ITIL V3 to review the solution
history of major Problems in order to prevent a recurrence and learn lessons for the future.
The primary objectives of Problem Management are to prevent Incidents from happening, and
to minimize the impact of incidents that cannot be prevented. Proactive Problem
Management analyzes Incident Records, and uses data collected by other IT Service
Management processes to identify trends or significant Problems.
Part of: Service Operation
Process Owner: Problem Manager

http://wiki.en.itprocessmaps.com/index.php/Problem_Management#ITIL_Problem_Management_Resolution

The Incident Management process is oriented toward quick resolution of incidents and
reduction of any adverse impact to the service. The Problem Management process, on the other
hand, is primarily aimed at globally preventing and reducing the number of incidents by
organizing its tasks toward identification of the actual cause of a problem. As illustrated in
Figure 1, Problem Management uses information provided by Incident Management and
Change Management (which receives inputs from various sources).

1. Incident information is collected and sorted by tools that are used in the Service Desk
and Incident Management process.
2. After that, historical incident data are analyzed by a group of specialists to determine
single or multiple occurrence.
3. Before classification, when priority and solution significance are determined, it is
necessary to evaluate the business impact of the incidents that are analyzed. Root cause
analysis is the main point where proactive solutions are determined and provided.
4. After the root cause analysis, the proposal for proactive activities is communicated
to the Change Management process, which determines whether changes are necessary and
evaluates proactive activities for impact on the other parts of services.
5. After the evaluation, proactive change resolutions are embedded into the system by
Incident Management. In that way, we can increase the organization's flexibility and
speed up resolution and incident implementation time, especially in large
enterprises without enough flexibility in the process of change implementation.


PERFORMANCE MEASURING AND MANAGEMENT


Moving from reactive to proactive maintenance management requires time, money,
human resources, as well as initial and continued support from management. Before
improving a process, it is necessary to define the improvement. That definition will lead to
the identification of a measurement, or metric. Instead of intuitive expectations of
benefits, tangible and objective performance facts are needed. Therefore, the selection of
appropriate metrics is an essential starting point for process improvement.
Metrics are a system of parameters or ways of quantitative assessment of a process that
is to be measured, along with the processes to carry out such measurement. Metrics
define what is to be measured. Metrics are usually specialized by subject area, in which case
they are valid only within a certain domain and cannot be directly benchmarked or
interpreted outside it. Although attractive, implementation of metrics can be a two-edged
sword because questionable and inaccurate indicators can cause bad management
decisions.
Depending on the type of data that are collected, a given process may be measurable in
many different ways.
Based on current research, we organized metrics into some distinct and recognizable
operational and financial categories. For these we developed Key Performance Indicators
(KPIs), significant factors that directly and indirectly influence the effectiveness of a
product or process. A KPI is used on its own, or in combination with other key
performance indicators, to monitor how well a business is achieving its quantifiable
objectives. The basic idea of KPIs is to provide some mechanism for quantification of the
maintenance process. As targets, KPIs must be widely understandable and accepted concepts,
appropriate to be set within an SLA.
Listed are some recommended KPIs grouped by areas of management:
1. Availability is a measure of the time that a service unit or facility is capable of providing
service, whether or not it is actually in service. Typically this measure is expressed as a
percentage of the period under consideration. Uptime is calculated as total time
minus all known losses due to equipment failures, measured in time. Extended losses
could also include losses due to process set-up, start-ups, adjustments - breaks, lunch,
weekends etc.
-Availability = Uptime / Total time
-Number of hours (minutes) off - Total hours (or minutes) when some equipment
or system was unable to perform its normal functionality
-Lost hours rate = Number of hours (minutes) off / Number of hours operating
-Number of log incidents - Any incident requiring some delivery of maintenance
services
-Incident rate = Number of log incidents / Number of hours operating
2. Reliability is the probability of performing a specified function without failure under
given conditions for a specified period of time.
-MTBF (mean time between failures) is the average time a system will operate
without a failure. The MTBF is a commonly quoted reliability statistic, and is
usually expressed in hours (even intervals on the order of years are instead
typically expressed in terms of thousands of hours) [22],
-MTTR (mean time to repair) is the average amount of time required to resolve
most hardware or software problems with a given device or system and indicates
its maintainability,
-MTBR (mean time between repairs) = MTBF - MTTR,
-OEE (overall equipment effectiveness) is a combined formula that shows the
overall performance of a single piece of equipment, or even an entire system, by
multiplying Availability x Performance x Quality. OEE was initially
developed for production and not for services. For that reason we have redefined
availability as percent of scheduled service time available, performance rate as
percent of outputs (service units) delivered compared to standard, and quality as
percent of outputs delivered compared to outputs started.
3. Productivity is used to measure the efficiency of delivery of services, and is most often
expressed as a ratio of outputs (delivered services) over time and other resource inputs
used in accomplishing the assigned task. It is often considered as output per person-hour.
Inputs generally include all labor (hours worked, including overtime); outputs are counted
as equivalent service units (ESUs) delivered. ESUs are standardized service contents used to
aggregate delivered work when there is a service mix with different labor content.
-Labor Productivity = Outputs (service units) delivered / Labor Hours,
-Crew efficiency = Actual labor hours on scheduled work / Estimated labor hours,
-Value-added cycle time = the portion of the total cycle time where value is actually
added to the product or service,
-Maintenance Process Efficiency = Maintenance costs / Total revenue, or Maintenance
costs per delivered service unit.
4. Planning and Management Quality - the basic idea is to predict and plan as much as
possible, so these measures express the proportion of total maintenance time versus
corrective or unplanned actions.
-Involvement of preventive and predictive maintenance (PPM) = PPM labor
hours or work orders / Emergency labor hours or work orders,
-Work order discipline = Labor hours accounted on work orders / Total labor hours,
-Planned labor hours / Scheduled labor hours,
-Unplanned labor hours / Total labor hours.
5. Management Performance
-Delivery on-time % - percent of service deliveries made on or before the due date,
-Number of complaints - total number of warranty claims or "Things Gone Wrong"
(TGWs) reported in a given period; may be divided by the total number of work
orders or service hours,
-Customer satisfaction - may be measured directly by survey and expressed as a
percentage, such as Percent of satisfied customers.
Based on actual empirical research, the proposed metric gives promising results.
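
As an illustration only, the availability and reliability formulas listed above can be
sketched in Python. The class and field names (ServicePeriod, downtime_hours, and so on)
are hypothetical and do not come from this document. The sketch also assumes that MTBF is
estimated as uptime divided by the number of failures and MTTR as total repair hours
divided by the number of failures; other conventions exist.

```python
from dataclasses import dataclass


@dataclass
class ServicePeriod:
    """Raw figures for one reporting period (hypothetical field names)."""
    total_hours: float     # total time in the period under consideration
    downtime_hours: float  # hours the system could not perform its normal function
    failures: int          # number of failures recorded in the period
    repair_hours: float    # total hours spent repairing those failures


def uptime(p: ServicePeriod) -> float:
    # Uptime = total time minus all known losses
    return p.total_hours - p.downtime_hours


def availability(p: ServicePeriod) -> float:
    # Availability = Uptime / Total time
    return uptime(p) / p.total_hours


def mtbf(p: ServicePeriod) -> float:
    # Assumption: average operating time per failure
    if p.failures == 0:
        raise ValueError("MTBF is undefined when no failures were recorded")
    return uptime(p) / p.failures


def mttr(p: ServicePeriod) -> float:
    # Assumption: average repair time per failure
    if p.failures == 0:
        raise ValueError("MTTR is undefined when no failures were recorded")
    return p.repair_hours / p.failures


def mtbr(p: ServicePeriod) -> float:
    # MTBR = MTBF - MTTR
    return mtbf(p) - mttr(p)


def oee(availability_rate: float, performance_rate: float, quality_rate: float) -> float:
    # OEE = Availability x Performance x Quality, each given as a fraction of 1
    return availability_rate * performance_rate * quality_rate
```

For example, a 720-hour period with 18 hours of downtime, 3 failures, and 12 repair hours
gives an availability of 0.975, an MTBF of 234 hours, an MTTR of 4 hours, and an MTBR of
230 hours.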

Problem management relies upon historical data on changes, incidents, and users that
may be related to the problem [2].
Most enterprises that aim to stay competitive in the modern market have already
developed reactive incident resolution by using a single point of contact, tools, and
analytical methods for incident classification and monitoring.

At this point it is important to develop an incident root cause analysis approach that can
summarize and evaluate incidents. This would be the first step toward reaching the reactive
problem resolution level.
Currently, many organizations offer solutions for monitoring and analyzing incidents,
or patented algorithms for trend analysis [1, 4, 21, 22], yet it is important to distinguish
incidents that occur only once from those that have deeper and long-term
consequences for business benefit. Such software solutions can be useful; nevertheless, it
would be wise to match them precisely to the organization's needs and, if necessary, to
develop in-house solutions that adequately fit business needs.
As the next step toward the proactive incident and problem resolution level, we would
like to emphasize the possibility of improving maintenance performance by applying
solutions or, where possible, tools that can speed up root cause analysis. Some of the
software solutions already mentioned suggest how this can be done; where such a process
cannot be automated, the recommendation is to perform that job manually. Organizations
can assemble a team of highly specialized and experienced personnel with the task to
analyze, evaluate, classify, diagnose, and correct incidents and imperfections in the
system, with the ultimate goal of increasing system stability and reducing the overall
number of incidents. According to Figure 1, that team should be part of the Problem
Management process and should work closely with the personnel of the Incident
Management and Change Management processes.

Figure 3 describes our suggestion for proactive activities embedded in the
Problem Management process, with the goal of reducing the overall number of incidents
through the incident and problem resolution process, root cause analysis, and Change
Management process activities. Incident information is collected and sorted by the tools
used in the Service Desk and Incident Management process. After that, historical incident
data are analyzed by a group of specialists, who determine whether each incident is a single
or multiple occurrence. Before classification, when priority and solution significance are
determined, it is necessary to evaluate the business impact of the incidents being analyzed.
Root cause analysis is the main point where proactive solutions are determined and provided.
After the root cause analysis, the proposal for proactive activities is communicated to the
Change Management process, which determines whether changes are necessary and evaluates the
proactive activities for their impact on other parts of the services. After the evaluation,
proactive change resolutions are embedded into the system by Incident Management. In that
way, we can increase an organization's flexibility and speed up incident resolution and
implementation time, especially in large enterprises without enough flexibility in the
change implementation process.
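
The first analysis steps described above (sorting collected incidents, determining single
or multiple occurrence, and weighing business impact before classification) could be
sketched as follows. This is a minimal illustration under stated assumptions, not the
process's prescribed tooling: the Incident fields, the grouping key "signature", the
numeric impact score, and the threshold of two occurrences are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Incident:
    incident_id: str
    signature: str  # hypothetical grouping key, e.g. affected service plus symptom
    impact: int     # hypothetical business impact score, higher is worse


def problem_candidates(incidents, min_occurrences=2):
    """Group historical incidents and rank the recurring groups by total impact."""
    groups = defaultdict(list)
    for inc in incidents:
        groups[inc.signature].append(inc)

    candidates = []
    for signature, members in groups.items():
        # Keep only multiple-occurrence groups; single occurrences are set aside
        if len(members) >= min_occurrences:
            candidates.append({
                "signature": signature,
                "occurrences": len(members),
                "total_impact": sum(m.impact for m in members),
                "incident_ids": [m.incident_id for m in members],
            })

    # Highest business impact first, so root cause analysis starts there
    candidates.sort(key=lambda c: c["total_impact"], reverse=True)
    return candidates
```

The ranked list is only an input to root cause analysis; whether a group is actually
raised as a problem, and what change is proposed to Change Management, remains a decision
for the specialist team.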
