
ITIL – Problem Management

Problem Management is structured to address the causes of the incidents which pose
the greatest (negative) risk. It therefore focuses on the heavy-hitting, recurring,
service-affecting events; it doesn't seek the root cause or a permanent fix for every
incident. Success is measured in terms of what has been removed from the
environment:

• How many problems are identified and removed from our IT environment.
• Problems which have a status of resolved and closed.

So let's walk through the problem management process.

First an incident occurs. An incident is any unplanned outcome from the operation
of an information system. Incidents interrupt the IT service which the customer
receives. Incidents are normally reported to the service desk, and an incident
record is created.

Next, the incident is assessed. If the cause of the incident isn't known, then the
incident is escalated to a problem. A problem is the unknown underlying cause of
one or more incidents.

As the problem is reviewed, the cause of the problem and a workaround may be
determined. As soon as both have been established, the status of the problem is
changed to known error.

Finally, the known error is assessed to determine whether the symptoms of the incident
match an already existing problem record. If so, the new incident is
cross-referenced to that problem.

However, if the symptoms don't match any existing problem record, a net new
problem record is created.
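
The walk-through above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ProblemRecord`, `Status`, and `escalate_incident` are hypothetical, and matching incidents to problem records on a normalized symptom string is a simplification, since a real service management tool would match far more loosely.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    PROBLEM = "problem"
    KNOWN_ERROR = "known error"


@dataclass
class ProblemRecord:
    symptoms: str
    cause: Optional[str] = None
    workaround: Optional[str] = None
    incident_ids: List[str] = field(default_factory=list)

    @property
    def status(self) -> Status:
        # A problem becomes a known error once both the cause and a
        # workaround have been determined.
        if self.cause and self.workaround:
            return Status.KNOWN_ERROR
        return Status.PROBLEM


def escalate_incident(incident_id: str, symptoms: str,
                      problems: Dict[str, ProblemRecord]) -> ProblemRecord:
    """Cross-reference an incident to a matching problem record, or create one.

    Assumption: two incidents "match" when their symptom text is identical
    after trimming whitespace and lowercasing.
    """
    key = symptoms.strip().lower()
    record = problems.get(key)
    if record is None:
        # No existing record explains the symptoms: create a net new one.
        record = ProblemRecord(symptoms=symptoms)
        problems[key] = record
    record.incident_ids.append(incident_id)
    return record
```

Two incidents with the same symptoms end up cross-referenced to one record, and that record reports `Status.KNOWN_ERROR` only after both `cause` and `workaround` have been filled in.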

The terms incident, problem, and known error describe the effects and root
causes of unexpected events in an information system. Identifying the cause of
these events and minimizing their impact is the primary purpose of the problem
management process.

The goal of problem management activities is to ascertain the root causes of
incidents and to minimize their impact on the business operations of a company.
This is done through the following processes:

• Problem control - The purpose of problem control is to identify problems
within an IT environment and to record information about those problems.
Problem control identifies the configuration items at the root of a problem
and provides the service desk with information on workarounds.
• Error control - The purpose of error control is to keep track of known
errors and to determine the resource effort needed to resolve each known
error. Error control monitors and removes known errors when it's feasible
and worthwhile.
• Proactive problem management - The purpose of proactive problem
management is to find potential problems and errors in an IT infrastructure
before they cause incidents. Stopping incidents before they occur provides
improved service to users.

The primary measure of the success of the problem management process is how
many problems are identified and removed from an IT infrastructure. Therefore,
the primary output of this IT service management process is problems
that are resolved and closed.

The work of problem management produces the following outcomes:

• Records of known errors and available workarounds - These records
are kept in the configuration management database (CMDB), and they
provide information to the service desk and other ITSM processes.
• Requests for change (RFCs) - RFCs describe changes needed to remove
a known error. Problem management does not approve or perform the
change. RFCs are sent to another ITSM process, change management.
• Changed records in the CMDB - Information about a known error and
any affected CIs is forwarded to the configuration management process,
the IT service management process that maintains the CMDB.

When the problem management process is used to identify the root causes of
problems, it's far more likely that they will be diagnosed correctly and fixed
properly. As a result, problems are permanently eliminated.

Problem management includes the following two types of approaches to address
problems:

• Reactive problem management - Reactive problem management seeks
to cure the symptoms of problems. The reactive approach responds to
reports of incidents that have already occurred. Reactive problem
management can be viewed as two groups of activities:
o Problem control activities - The major problem control activities
are:
▪ Identification and recording - Problem management
receives information about reported incidents from the
incident management process and the service desk.
Members of the problem management team analyze this
information, looking for similarities in the symptoms of
reported incidents. They look for records of previously
identified problems that can explain the symptoms. If none
can be found, a record describing a new problem is created.
▪ Classification - This control activity identifies the
importance of new problems and designates resources for
addressing them.

Problems are classified by category, such as hardware, software, or
other types. Then they can be assigned to the corresponding
support personnel. Problems are also classified by priority ranking.
Problems with higher priority rankings are addressed before
problems with lower priority rankings.

▪ Investigation and diagnosis - Problem management teams look
for the root cause of problems. If the cause is determined, problem
management recommends a workaround or a temporary fix for the
problem.
▪ Identify cause of problem and devise a workaround -
In the automated service management system, the status of
the problem is changed to that of a known error.

When an IT department applies problem control activities, it
prioritizes the problems that present the biggest threat to the
information system or the company's ability to conduct business.
When the root cause of a problem has been found and a
workaround has been devised, problem control activities end. Then
the second group of activities in reactive problem management
begins.

o Error control activities - Now the problem becomes a known error in the
IT infrastructure, and error control activities begin. Error control activities
include:
▪ Error identification and recording - This means creating
a record that identifies a known error and all the
configuration items (CIs) that cause the error or are affected
by it.
▪ Error assessment - This activity prioritizes errors and
places them into groups according to their importance.
▪ Error resolution recording - The resolution to a known
error may include changes to hardware or software, user
training, or operational procedures. Error control creates a
request for change (RFC) and forwards it to change
management. The RFC is cross-referenced to the known
error in the automated service management system.
▪ Error resolution monitoring - Changes are planned and
implemented by other IT service management processes.
Problem management monitors the effect of problems on
service provided to users and the progress of requested
changes until they're complete.
▪ Error closure - The final error control activity is error
closure. When recommended changes to fix a known error
have been completed, the known error record in the service
management system can be closed. Records of incidents
and problems associated with that known error may also be
closed.

• Proactive problem management - Proactive problem management
seeks to inoculate IT systems against problems. The proactive approach
identifies potential problems before they emerge.
o Trend analysis - This is the process of examining problem and
incident reports to discover what types of problems are happening
more frequently. Trend analysis of existing problems and incidents
can reveal where similar problems may occur in other places within
the infrastructure. It can also show that repeated failures have not
been adequately resolved and are likely to continue to happen.
o Targeting preventative action - This process applies the same
techniques used in reactive problem management to a select few
potential problems with a high degree of business impact. Targeting
preventative action may include creating RFCs, training users and
service desk team members, or recommending procedural changes
within the IT department.

The three groups of problem management activities (problem control, error
control, and proactive problem management) identify and resolve the
problems which have the greatest potential impact on a company's
business.

The success of problem management depends on having the right people
performing the right actions. Responsibility for leading the problem management
process is assigned to one person designated as the problem manager. The roles
of the problem manager are:

1. To maintain and develop problem control activities - It's the problem
manager's job to make sure that information about incidents within the
system is being received and reviewed in a systematic way.
2. To monitor the effectiveness of error control activities and make
recommendations for improvement - She must also ensure that
relationships among configuration items are considered in proposed
solutions to problems.
3. To cascade information about workarounds or fixes to those who
need it - Communication with the service desk and incident management
is a key role performed by the problem manager.
4. To monitor the progress of problems and known errors toward a
final resolution - If solutions aren't implemented as quickly as necessary,
the problem manager may follow procedures to escalate the priority of the
problem.

Each of these four roles contributes to the ability of problem management to
identify and resolve problems and known errors quickly. The problem manager will
also perform typical supervisory roles to direct the activities of any other problem
management team members.

The problem manager's duties should never be combined with the duties of the
service desk supervisor. The priorities of the service desk and problem
management are often incompatible.

The success of problem management also relies on critical factors before, during,
and after the main activities in the problem management process. The critical
factors for success are:

• Performance targets - It's important to decide how the performance of
problem management will be measured before the process is
implemented. If possible, use statistics from the previous support activities
to set goals for problem management.
• Periodic audits - Perform periodic audits to determine whether problem
management procedures are being followed. Problems that aren't properly
reported or investigated are more likely to cause interruptions of service to
users or a major impact on the business.
• Problem reviews - Conduct major problem reviews after problems with
high urgency or impact have been resolved. Look for ways to improve the
way problems are identified and resolved. Problem management
procedures should be continually improved.

Problem management will succeed when an effective problem manager fills the
required roles, and critical factors for the success of the process are included in
everyday operating procedures.
Implementing problem management brings many benefits to a company and its IT
department. However, there are also some problems and costs that arise during
the implementation of problem management.

Among the most common problems companies experience is a difficulty
establishing adequate communication between problem management and
another IT service management process, incident management. Communication
between the two can be difficult because they pursue the following conflicting
goals:

• Problem management - The goal of problem management is to
investigate the root cause of a problem. The speed with which a solution is
found is an important, but secondary, consideration.
• Incident management - The goal of incident management is to recover
from incidents and restore service to users as quickly as possible.
Determining the cause of a problem is less important.

Companies also often have difficulty establishing lines of communication between
the software development process and problem management. Programmers and
developers are frequently aware of known errors in the software they create, but
they can be reluctant to identify them.

In many companies, employees resist new procedures. Many companies report
that employees cling to previous informal problem management methods. It takes
time for employees to accept the discipline of problem management.

Companies should expect to incur some costs with the implementation of problem
management. However, it isn't necessary to create a vast problem management
process that's capable of handling every single problem that arises. As a result,
the incremental costs of problem management are negligible. The hardware and
software tools needed are shared with other IT service management processes,
and the additional personnel costs are small.

Problems and costs arise frequently during the introduction of problem
management. However, they are manageable, and the process brings
worthwhile improvements in the performance of the IT infrastructure.

Problem management seeks to identify the underlying causes of incidents in an IT
infrastructure and to remove those causes. The problem management process
addresses the causes of incidents reactively and proactively.

ITIL: Service Continuity Management Process

Oh the disaster, the sudden unexpected disaster. Some of us have been through
the major ones: hurricanes, tornadoes, and ice storms. Others of us have been
through the smaller ones: a boiler exploding in a building, a building falling into the
Normanskil, and lightning hitting the building. The purpose of IT service continuity
management (ITSCM) is to proactively assure that IT services can be recovered and
provisioned within the timeframes established by business continuity management.
Basically, once you have rated your critical applications, it means assuring that the
critical "fabulous four" are up first within the agreed-upon timeframe. It helps to have
a predefined process in place to help the organization recover to normal operating
procedures after a disaster. On the reactive side of the equation, once the disaster
has occurred, IT service continuity management is the process responsible for
assessing the impact of the disruption on IT services.

ITSCM focuses on the IT services required to support the organization's critical
lines of business. For example, in a hospital, the ability to register patients
is a critical business process. In a scenario where a disaster has occurred, not
only is the patient registration application needed, but also any supporting
IT infrastructure and services such as Active Directory, networks,
telecommunications, technical support, and the service desk, in addition to any
census balancing needed to have the patients referred to the right rooms. Obviously,
this pursuit needs to be a joint collaboration with patient access and maybe even
nursing to assure that what is needed, and when, is well documented.

Commonly, the business continuity life cycle is used as a foundational building block to
help assure that the organization's IT service continuity management process is
successful. The business continuity life cycle consists of four stages:

• Initiation - The initiation stage is the starting point for the life cycle. The goal
of initiation is to define the ITSCM policy and charter the endeavor.
Chartering produces the project charter, which defines the high-level scope, the team
needed, and the critical success factors for the project. The ITSCM policy is the
agreed-upon, formalized plan that influences and determines decisions,
actions, and other matters regarding IT continuity. The outcome of the initiation
stage is the charter, project scope, project timeline, and main
ITSCM policy; all of these documents will be referred to in subsequent stages.
• Requirements and Strategy - During the requirements and strategy
stage, a business impact analysis (BIA) and risk assessment are
conducted. The business impact analysis evaluates what-if scenarios to
consider what might happen after a disaster. The BIA points out the critical
business processes and the potential damage which can result from a
service disruption. The BIA is the requirements part. A business continuity
strategy is then produced from the results, determining which risk reduction
measures are necessary and which recovery option supports the
organization's needs. This stage typically involves identifying services
critical to the business that require additional preventive measures.
• Implementation - During the implementation stage, the outputs of the previous
stages are reviewed so that recovery plans can be developed which
contain all the details an organization needs to survive a disaster and
restore normal services. This stage also defines the actions necessary to
prevent, detect, and mitigate the effects of potential disasters. One of the
activities conducted in this stage is developing implementation plans,
including the emergency response plan, the damage assessment plan, and
the salvage plan.
o Implementing standby arrangements - includes defining,
creating, and solidifying the underpinning contracts (UCs) with
standby providers. A UC is a contract with an external supplier that
supports the IT organization in its delivery of services. This contract
could be a support or maintenance agreement, and it should be
capable of supporting targets agreed to in service level agreements
(SLAs). Once completed, the UCs should be listed in the
configuration management database and linked to the recovery
plan and the associated SLAs. Necessary equipment also needs to
be purchased.
o Developing procedures – This involves developing procedures which detail
exactly what each member of the disaster recovery (DR) team must
do if the plan is invoked. One staff aid, for example, may explain the exact steps
for immediately transferring data to the backup site if the DR plan
is implemented.
o Undertaking initial tests - Undertaking initial tests typically
involves performing some initial testing of procedures before they
are finalized. Actual, final testing occurs in the fourth stage:
operational management.

By performing each of these activities, organizations can be sure that they have
successfully completed the third stage of the business continuity life cycle. After
implementation has been completed, the process needs to be maintained as part
of business as usual.

• Operational management – The ITSCM process needs to be
maintained, and the operational management stage helps ensure that
maintenance occurs. To help maintain the process, a commitment to
training, reviewing the process, and testing the process needs to occur. It
may be a good idea to have a yearly mock test budgeted. An effective
ITSCM plan cannot be developed without taking into consideration the
needs of the entire business. When you follow the stages of the business
continuity life cycle, a plan which fully supports the organization will be
established.

From the business continuity life cycle, one output is the recovery plan. The
recovery plan should detail the instructions and procedures to recover or continue
the operation of systems, infrastructure, services, or facilities. The ultimate goal
of the recovery plan is to maintain service continuity.

The elements of a recovery plan are as follows:

• Strategy – Strategy explains what systems, infrastructure, services, or
facilities will be recovered and how they will be recovered. It also specifies
the amount of time it will take for the recovery and when the recovery
should be completed.
• Invocation – Invocation details everyone who has the authority to invoke
the recovery plan.
• General guidelines - The general guidelines describe how to notify
personnel of a potential or actual disaster. They also list the defined
operational escalation procedures.
• Dependencies - This element lists the system,
infrastructure, service, facility, or interface dependencies in order of
importance. Identifying the interdependencies will bring to light other
procedures which may need to be enacted in conjunction with the recovery
plan. In our hospital example, census balancing is one such dependency.
• Team and checklist – The team and checklist is the list of the staff
members who are responsible for enacting the procedures and noting any
problems they encounter. It also includes a checklist of key tasks.
• Procedures – This element contains the procedures for installing and testing
hardware and network components, and for restoring applications, databases, and data.

By following this best practice, organizations can have a level of confidence in
their recovery plans. Success will be determined by having effective recovery plans
which recover the critical IT services within the agreed-to timeframe.

Recovery options need to be considered for IT systems and networks, and critical
services such as telecommunications and power. The various recovery options are
as follows:

• Do nothing - This is always an option. However, few organizations can afford
to forgo all business activities supported by IT services and simply wait until
services are restored.
• Manual system - For businesses without a large number of critical IT
services, manual workarounds may present a feasible option until IT
services can resume.
• Reciprocal arrangement - This option involves forming an arrangement
with another company that uses similar technology. For instance, a
company and its main supplier might discuss an arrangement where they
can share facilities in times of disaster.
• Gradual recovery - This option is often chosen by organizations that
don't need to use the business processes supported by IT services for 72
hours or longer. This often involves the use of a location that provides
power and telecommunications, where companies can use their own
equipment.
• Warm start - This is an option used by organizations that need to recover
IT services and facilities within a 24- to 72-hour period. To accomplish this,
organizations often use commercial facilities that include operations,
system management, and technical support.
• Hot start - This is also known as an immediate recovery. This option is
used for critical services that cannot be down for any length of time. A hot
start provides for immediate restoration of IT services. It is also one of the
most expensive options to implement.
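
As a rough illustration, the time thresholds quoted in the list above can be turned into a small lookup. This is a sketch only: the function name `suggest_recovery_option` is hypothetical, the treatment of anything under 24 hours as a hot start is an assumption (the list only says "immediate"), and the do-nothing, manual, and reciprocal options are left out because the list gives no timeframe for them.

```python
def suggest_recovery_option(max_outage_hours: float) -> str:
    """Map the longest tolerable outage (in hours) to a recovery option."""
    if max_outage_hours < 0:
        raise ValueError("max_outage_hours must not be negative")
    if max_outage_hours >= 72:
        # Business processes not needed for 72 hours or longer.
        return "gradual recovery"
    if max_outage_hours >= 24:
        # Services must be recovered within a 24- to 72-hour period.
        return "warm start"
    # Assumption: anything needed back in under 24 hours is treated as
    # requiring a hot start (immediate recovery).
    return "hot start"


for hours in (96, 48, 0):
    print(hours, "->", suggest_recovery_option(hours))
```

In practice the choice also depends on cost, since the hot start is one of the most expensive options, so the lookup is a starting point for discussion rather than a decision rule.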

One way to ensure that the IT service continuity management (ITSCM) process is
both efficient and effective is to assign an IT service continuity (ITSC) manager.
We all know that establishing accountability in a role is necessary for a successful
process.

The ITSC manager is responsible for:

• establishing plans to provide agreed-on levels of service within agreed
timelines following a disaster
• ensuring that IT service areas are able to respond to an invocation of the
continuity plans
• maintaining a comprehensive IT testing schedule and undertaking regular
reviews.

Through her primary responsibilities, the ITSC manager will ensure that the ITSCM
process is implemented and maintained in accordance with the organization's
requirements and business continuity management process. One way ITSC
managers can make sure that ITSCM is effective is through continued
communication with the other IT service management (ITSM) processes.

ITSCM should not work in isolation from an organization's business requirements,
nor should it work in isolation from the other ITSM processes. Each ITSM process
relates to ITSCM in the following ways:

• Service desk - ITSCM uses historical data, usually statistics, provided by
the service desk. This is a focal point for reporting incidents and making
service requests. Whenever possible, many companies use the service
desk as the communication center in the invocation of the disaster
recovery plan.
• Configuration management - Configuration management helps to
define the core infrastructure. Configuration management contains
current, accurate, and comprehensive information about all components of
the IT infrastructure.
• Availability management – Availability management delivers risk
reduction measures to maintain business as usual. This process is
concerned with designing, implementing, measuring, and managing IT
services to ensure that requirements for availability are consistently met.
• Change management - The impact of any change to the recovery plan
has to be analyzed. Change management works with ITSCM to make sure
that any changes made are reflected in the recovery plan and related
documents so that documentation is kept up-to-date.
• Capacity management - This ensures that business requirements are
fully supported by the appropriate IT hardware resources. Through these
resources, ITSCM has access to the capacity it needs to develop and test
plans.

Like any quality business process, IT service continuity management (ITSCM) has
expenses and common problems.

Common problems associated with ITSCM are any issues that prevent an
organization from committing to continuity management—in terms of both
implementing the process and maintaining it. One example is when firms seem
unable to move out of the planning stage and into actual implementation.

Other examples are being unable to find facilities or resources, having someone
unfamiliar with the business implement the process, not understanding ITSCM's
role in disaster recovery, or thinking IT has already handled continuity planning.

Common costs associated with ITSCM are the expenses incurred from risk
management and recovery arrangements. An example of a common cost is the
investment required by the introduction of risk management.

Additional examples of common costs are recurring operational costs, the
hardware needed to support the ITSCM process, and fees for the recovery facility.

There will always be problems and costs associated with implementing ITSCM. But
the resulting benefits, especially when a disaster is prevented or quickly
controlled, outweigh the associated difficulties and costs.

Posted by Elyse at 9:32 PM

June 23, 2007

Six Sigma: Defects per Unit / Defects per Opportunity


Quantifying what you are going to measure is important for setting a baseline. It
simply lets you know whether the changes have a positive or negative effect. As
one looks to remove defects from a business process, one must measure them.
Six Sigma has two tools for measuring defects: defects per unit (DPU) and defects
per opportunity (DPO).

Defects Per Unit (DPU) evaluates the average number of defective units which
occur in the total sample. A unit is the item being processed, such as an incident
ticket, the product being coded, or the service being rendered. DPU counts
each unit as either defective or not defective. Whether a defective unit has one
or several defects, it is counted as a single defective unit.

Calculating DPU
Calculating a process's DPU involves three steps:

1. Determine the total number of units you will sample.
The first step determines the size of your sample group. A sample group
should be small enough to be manageable, yet large enough to reflect
whatever problem is undermining the process.
2. Count the number of defective units that occur within that
sample.
The second step shows how many units in the sample group contain at
least one defect, or error.
3. Divide the number of defective units by the number of total units.
The third step gives you the DPU as a decimal number, which can be
converted to a percentage.
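
The three steps can be sketched in Python as follows. This follows the definition used in this post, where a unit with one or several defects counts once; the function name `defects_per_unit` and the sample of eight incident tickets are made up for illustration.

```python
from typing import Sequence


def defects_per_unit(defects_in_each_unit: Sequence[int]) -> float:
    """DPU as described here: defective units divided by total units."""
    # Step 1: the total number of units sampled.
    total_units = len(defects_in_each_unit)
    if total_units == 0:
        raise ValueError("the sample must contain at least one unit")
    # Step 2: count units with at least one defect; a unit with several
    # defects still counts as a single defective unit.
    defective_units = sum(1 for defects in defects_in_each_unit if defects > 0)
    # Step 3: divide defective units by total units.
    return defective_units / total_units


# Hypothetical sample: number of defects found in each of 8 incident tickets.
sample = [0, 2, 0, 1, 0, 0, 3, 0]
dpu = defects_per_unit(sample)
print(f"DPU = {dpu} ({dpu:.1%})")  # 3 defective tickets out of 8
```

With this sample, three of the eight tickets are defective, so the DPU is 0.375, or 37.5%.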

Defects per Opportunity (DPO) is the average number of defects which occur in the
total number of opportunities in a sample group. An opportunity is a chance for
a defect to occur within a unit. DPO counts each defective opportunity within a
unit as one defect. So in examining a service desk response, some opportunities for
defects are:

• Response does not occur within 4 hours.
• Response categorized within the wrong support team.
• Team working the response doesn't have the customer approval of a
working machine afterwards.
• Wrong customer contact information gathered.
• Wrong problem detail reported.
• Wrong problem resolved.

Within our service desk example, each of these is an opportunity for a defect. So if the
incident response is miscategorized to the wrong response team and the customer is not
responded to within 4 hours, we have two defects within one unit.

Calculating DPO
Calculating a process's DPO involves five steps:

1. Determine the total number of units to be sampled.
The first step determines the size of your sample group. The sample size
should be small enough to be manageable, yet large enough to reflect
whatever problem is undermining the process.
2. Determine the number of defect opportunities per unit.
In the second step, opportunities are determined by creating a list of
potential defects customers will care about; focusing on places where
something can go wrong, not on the ways it can go wrong; focusing on
routine, rather than rare defects; and grouping related defects into one
category.
3. Determine the total number of defect opportunities for the
sample.
In the third step, the total number of defect opportunities is determined by
multiplying the number of units in the sample group by the number of
defect opportunities per unit.
4. Count the defective opportunities within the sample group.
The fourth step shows how many opportunities within the sample group
contain defects, or errors.
5. Divide the total defects by the total opportunities.
The fifth step gives you the DPO as a decimal number, which can be
converted to a percentage.
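
The five steps can be sketched in Python as follows. The function name `defects_per_opportunity` and the sample data are made up for illustration; the six opportunities per unit correspond to the six service desk defect opportunities listed earlier.

```python
from typing import Sequence


def defects_per_opportunity(defects_in_each_unit: Sequence[int],
                            opportunities_per_unit: int) -> float:
    """DPO: total defects divided by total defect opportunities."""
    # Step 1: the total number of units sampled.
    total_units = len(defects_in_each_unit)
    if total_units == 0 or opportunities_per_unit <= 0:
        raise ValueError("need at least one unit and one opportunity per unit")
    # Step 2 is the caller's input: opportunities_per_unit.
    for defects in defects_in_each_unit:
        if not 0 <= defects <= opportunities_per_unit:
            raise ValueError("a unit cannot have more defects than opportunities")
    # Step 3: total opportunities = units x opportunities per unit.
    total_opportunities = total_units * opportunities_per_unit
    # Step 4: count every defective opportunity in the sample.
    total_defects = sum(defects_in_each_unit)
    # Step 5: divide total defects by total opportunities.
    return total_defects / total_opportunities


# Hypothetical sample: defects found in each of 8 service desk responses,
# each with 6 opportunities for a defect.
sample = [0, 2, 0, 1, 0, 0, 3, 0]
dpo = defects_per_opportunity(sample, opportunities_per_unit=6)
print(f"DPO = {dpo} ({dpo:.1%})")  # 6 defects across 48 opportunities
```

With this sample there are 6 defects across 48 opportunities, so the DPO is 0.125, or 12.5%.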

So when should you use DPU or DPO as a technique? This depends on how many
performance standards exist within a process. A performance standard is an
expectation that someone inside or outside the organization has for the process,
product, or service.

There are ways to choose which measurement technique to use in particular
circumstances:

• DPU measurements are easier to obtain and understand than DPO
measurements.
• The DPU technique is best used when there is a single performance
standard to measure because DPU measurements may hide the actual
number of defects in a unit.
• DPO measurements provide more information than DPU measurements
about how many actual defects exist in a unit.
• The DPO technique is best reserved for processes with multiple performance
standards to measure, because obtaining those numbers is more complicated.

When choosing between the DPU and DPO measurement techniques, it is
necessary to determine the number of performance standards that exist for that
process. The DPU technique is best used when there is only one performance
standard. The DPO technique is best used when there are multiple standards.

Posted by Elyse at 12:44 PM

Six Sigma: Data Stratification Model

We have all been there at one time or another: in a large meeting room, hearing
a problem from everyone's perspective. Needless to say, when you are in the mix,
it is good to start analyzing. To gain an understanding of what is going on with the
process, let's start with the basics: who, what, when, and where. Answering these
questions in detail is the data stratification model, and it clarifies the collected
information to help reveal the root cause of the problem.

Six Sigma teams use the data stratification model whenever data is collected,
because it quickly clarifies who is associated with the problem, what type of
problem is occurring, when the problem is happening, and where the problem is
occurring. The purpose of the model is to help the team clarify the problem and its
impact. The information is also very useful when analyzing the problem to find the
root cause.

Data stratification breaks the collected data down into layers (also known as strata). The
questions who, what, when, and where represent the layers.

• The who layer – The question who is intended to define who is associated
with the process problem. This information could be further subdivided by:
o Vendor
o Department
o Individual
o Customer Type
• The what layer – The question what defines the type of problem which is
occurring. This information can be further categorized by:
o defect category
o type of complaint
o reason for a complaint.
• The when layer – The question when clarifies the time frame in which the
problem occurs. This information can be further clarified by:
o time of day
o day of week
o month of year
o fiscal quarter
o shift
• The where layer – The where question clarifies where a problem occurs.
This can be clarified by:
o facility
o region
o location on product
o location in service.

By asking the key questions from the data stratification model, the who, what,
when, and where of a process problem are further brought into focus.
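
One way to picture the layers is as simple counts over the collected records. The sketch below is only an illustration: the record fields and sample values are hypothetical, and a real team would load its own data instead.

```python
from collections import Counter

# Hypothetical problem records; field names and values are illustrative only.
records = [
    {"who": "operating room", "what": "late charge", "when": "after surgery", "where": "main facility"},
    {"who": "materials management", "what": "late charge", "when": "after surgery", "where": "main facility"},
    {"who": "operating room", "what": "rebilling", "when": "after surgery", "where": "main facility"},
    {"who": "patient billing", "what": "rebilling", "where": "main facility"},  # "when" was not captured
]


def stratify(records, layer):
    """Count records per value of one layer (who, what, when, or where).

    Records that do not have the layer recorded are skipped, since not
    every question can always be answered from the available data.
    """
    return Counter(r[layer] for r in records if r.get(layer) is not None)


for layer in ("who", "what", "when", "where"):
    print(layer, stratify(records, layer).most_common())
```

Each layer is counted independently, so a record with a missing answer still contributes to the layers it does have.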

While process data may not be available to answer all the model's questions, the
team should nevertheless consider using the model whenever it collects data.
Answering only some of the questions can still greatly benefit an investigation.
Here is an example that quantifies the problem surrounding late operating room (OR) charges.

Data Stratification – Late OR Charges

Who: The question "Who?" is intended to define the person or organizational
entity that is associated with the process problem. Examples:
• patient billing
• operating room
• finance
• materials management
• Rate Book

What: The question "What?" is intended to define the type of problem that is
occurring in the process. Examples:
• Late Charges
• Item Master doesn’t relate to Charge Master
• Inability to report supplies used by physician
• Supply Charge generation is not timely
• Replacement parts (hip, knee, spine) are especially difficult to quantify
and charge
• Rebilling

When: The question "When?" is intended to define the time frame in which the
problem is occurring in the process. Examples:
• With the new vendor supplies
• After Surgery

Where: The question "Where?" is intended to define the location in which the
problem is occurring in the process. Examples:
• At Main Facility
• Within OR Department

Posted by Elyse at 9:45 AM | Comments (0)

June 22, 2007

Six Sigma: How to assure a valid measurement

In measuring, it is all about the quality of the data, and good data is the result of a good
measuring technique. So before you measure, let’s validate the method of
measurement.

Your technique to measure should be:

• Precise - A precise technique does not deviate in the result of what is
being measured. The way you measure and what you measure should be
the same item.
• Repeatable – A repeatable technique lets you measure the same object
multiple times and get the same result.
• Reproducible - A reproducible technique can be repeated and the same
outcomes are achieved.
• Stable - A stable technique is steady and constant and does not change
over time. If repeated in the future, the reading would remain the same.
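
A quick numeric check can make two of these properties concrete. The sketch below assumes the common gage-study reading, in which repeatability is the spread when one person measures the same object several times and reproducibility is the spread between different people; that split, the readings, and the function names are assumptions for illustration.

```python
from statistics import mean

# Hypothetical readings of the same object, three per operator.
readings = {
    "operator_a": [10.1, 10.0, 10.2],
    "operator_b": [10.4, 10.5, 10.3],
}


def repeatability(readings):
    """Largest within-operator spread (max minus min) across all operators."""
    return max(max(values) - min(values) for values in readings.values())


def reproducibility(readings):
    """Spread between the operators' average readings."""
    averages = [mean(values) for values in readings.values()]
    return max(averages) - min(averages)


print(f"repeatability spread:   {repeatability(readings):.2f}")
print(f"reproducibility spread: {reproducibility(readings):.2f}")
```

Smaller spreads suggest a more trustworthy technique; what counts as small enough depends on the process being measured.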

Posted by Elyse at 6:26 AM | Comments (0)

June 21, 2007

Six Sigma: What to Measure?

When examining a process, it is important to understand what parts of the process
can be measured. Measuring a process involves an investment of time, money,
and skin. All processes have three main discrete parts: the inputs,
transformation, and outputs. Each part should be measured individually.
The Six Sigma improvement team carefully considers which items to measure.
Commonly this decision depends on the type of information the team
is seeking to obtain, and the pros and cons of measuring each part.

Six Sigma divides all organizational processes into three discrete parts, each of
which should be measured individually:
• The process inputs - For example, the process inputs of change
management are the Request for Change and the information gathered about
it.
• The transforming processes – For change management, there are
several transforming processes: assessment, planning, approval, and
scheduling.
• The delivery of the outputs - The delivery of the outputs is the closure
of the RFC based upon permission from the customer, an updated RFC for
release management, some reports, and the decisions made by the
change manager.

Posted by Elyse at 10:10 PM | Comments (1)

ITIL Change Management

Change Management’s purpose is to control the process of implementing changes
to the environment. Several organizations have change review meetings or
change control. Commonly they are a stop-gap measure for final approval before
introducing a change. Let’s talk a little bit about how to improve or “ITILize” the
process.

First, let’s clearly distinguish between change requests and service
requests. Change requests typically affect many individuals. (If you don’t have a
change process, these may be unplanned and significant changes to your
environment’s configuration.) Service requests are routine procedures that affect
a small number of users. This type of request is normally a minor change to a
system, for example, helping a new employee gain access to the systems
needed.

Change Management consists of five steps.

• Filtering Requests for Change (RFCs) – To be frank, requests for
changes need to be reviewed by the management team. Some requests
simply are not practical or feasible. Some requests may duplicate other
efforts. So the first step of change management is to review the requested
changes and ensure only practical, realistic changes are set up to be given
focus by the IT resources.
• Assessing the impact of changes – After you have the list of requested
changes, they should be assessed and rated on the benefits to the
company, preferably based on the business strategy of the organization.
For example, in a hospital the popular pillars model is: service, people,
quality, financial, and growth. Rank the change against the strategy, and
then to give a full overview also assess impact, cost, and benefits of each
change. Together with the customer, prioritize all change requests.
• Authorizing changes – Since the business customers are engaged in the
process, use their expertise in evaluating the impact of the changes. This
Change Advisory Board (CAB) should offer insights to the change manager
before the changes are approved and scheduled. A forward schedule of
changes should now come into existence.
• Reviewing Changes – After the changes have been made and
implemented, an honest and open review will point out those things done
very well and not so well. These lessons can be brought into the change
management process for immediate improvement.
• Closing RFC – Finally the customer who has submitted the RFC should
have the opportunity to review and accept the results of the change. After
the user has accepted the change, the Request for Change (RFC) should be
closed.

According to the ITIL doctrine, performing changes is not included in a list of
change management steps because it isn't part of the change management
process. Change management team members don't typically execute the changes
they plan and review.

Change management produces certain results which enable members of other IT
service management teams to execute changes. The outputs from the change
management process are:

• Forward schedule of changes (FSC) – A forward schedule of changes is
a listing of upcoming changes which are scheduled and communicated
to other IT service management processes and to the rest of the company.
The forward schedule of changes shows the changes that have been
approved and given a date for implementation.
• Updated Requests for Change – Requests for Change are updated
with approvals and forwarded to release management. ITIL release
management is the process responsible for implementing the changes.
• Change advisory board decisions - The discussions of the change
advisory board provide the basis for the change manager's decisions to
approve or reject proposed changes. The members of the CAB help
determine if changes make sense from technical, financial, and business
viewpoints.
• Change management reports - Change management reports allow
managers to evaluate the effectiveness of the process in the past and its
direction in the future. These reports should be distributed to senior IT and
business managers.

Obviously proposed changes have far-reaching effects within an organization. This
is why a group representing a strategic, borderless cross section of the community should
help evaluate proposed changes. A change advisory board advises the change
manager before changes are approved and scheduled.

The major activities of the change management process include:

1. Logging and filtering - The first major activity is logging and filtering
change requests. Obviously change requests will come from anywhere
within a company that has an idea and a computer. Some changes result
from an incident in which IT services are interrupted or configuration items
fail to work properly. Other requests may simply ask for improved or
additional service. Some requests may be capital changes or departmental
system modifications. However, all proposed changes should be submitted
to change management in a standardized request for change (RFC)
document. When an RFC is received, change management logs the request
and allocates a unique identification number for the document. An RFC
document provides definitive information about the change that's being
requested.
2. Assessing and classifying - The process for assigning priorities to
changes is similar to the process for prioritizing problems. If a problem
receives a high priority, the change that corrects the problem will have a
high priority. However, not all changes address problems; enhancement
requests are one example. Changes therefore need to be classified based upon
priority. Commonly ratings such as emergency, high, medium, or low are
used.
3. Approving - The authority to approve or reject proposed changes is what
gives change management the ability to make the change process more
effective. The final decision and authority to act is with the manager in
charge of change management. Obviously, with something this significant,
the change manager doesn’t act in a silo; the change manager consults
with trusted advisors, the change advisory board. The CAB examines
each proposed change from three perspectives: financial, technical, and
business. The board evaluates changes from a financial perspective to
determine whether the benefits of each change justify the costs of
completing it. Evaluation from a technical perspective assures that each
change is feasible and will not have a negative effect on the infrastructure.
Business evaluation ensures that changes support the company's business
objectives.
4. Planning and coordination - Change management doesn't actually
implement the requested changes. However, change management team
members help plan the implementation and then monitor the progress of
changes to be sure the plans are being carried out. In an ideal world,
change management could allow only one change at a time. Because
configuration items in an infrastructure affect one another, that's not
possible. A hardware change may require an accompanying software
change, and both of those may require changes in documentation and
procedures. Change management and the change advisory board examine
how changes will affect the IT infrastructure. Changes are scheduled to
accommodate the dependencies among changes to different configuration
items. After a change has been scheduled, change management forwards
the approved RFC to the support team that will build and test the change.
The team chosen to actually build and test a change depends on the type
of change. The team must include people with the skills needed for specific
building and testing tasks.
5. Reviewing and closing - Completed changes may be reviewed in CAB
meetings. The examination of a completed change is called a post-
implementation review (PIR). The purpose of a post-implementation review
is to study recently completed changes and apply the experience gained to
subsequent changes. Post-implementation reviews allow change
management to learn from the past and continually improve the change
management process.
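
The activities above can be pictured as a small set of allowed status transitions for an RFC. This is only a sketch: the status names and the transition table are assumptions made for illustration, not statuses defined by ITIL or by any particular service management tool.

```python
from enum import Enum


class RfcStatus(Enum):
    LOGGED = "logged"
    ASSESSED = "assessed"
    APPROVED = "approved"
    REJECTED = "rejected"
    SCHEDULED = "scheduled"
    REVIEWED = "reviewed"
    CLOSED = "closed"


# Hypothetical transition table following the five major activities.
ALLOWED = {
    RfcStatus.LOGGED: {RfcStatus.ASSESSED, RfcStatus.REJECTED},    # filtering may reject
    RfcStatus.ASSESSED: {RfcStatus.APPROVED, RfcStatus.REJECTED},  # change manager decides
    RfcStatus.APPROVED: {RfcStatus.SCHEDULED},                     # planning and coordination
    RfcStatus.SCHEDULED: {RfcStatus.REVIEWED},                     # post-implementation review
    RfcStatus.REVIEWED: {RfcStatus.CLOSED},
    RfcStatus.REJECTED: {RfcStatus.CLOSED},
    RfcStatus.CLOSED: set(),
}


def advance(current: RfcStatus, target: RfcStatus) -> RfcStatus:
    """Return the new status, or raise if the step skips an activity."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move an RFC from {current.value} to {target.value}")
    return target


status = RfcStatus.LOGGED
for step in (RfcStatus.ASSESSED, RfcStatus.APPROVED, RfcStatus.SCHEDULED,
             RfcStatus.REVIEWED, RfcStatus.CLOSED):
    status = advance(status, step)
print(status.value)
```

Note that there is no "implemented" transition owned by change management here; building and testing the change happens in the support team between scheduling and review.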

As with all processes, an exception or emergency guideline needs to exist. When a
problem interrupts service to many users or will have an immediate and severe
business impact, an emergency change is needed. Change management uses a
slightly different process to handle emergency changes. The emergency change
process allows more flexibility in dealing with failed attempts to implement
changes. In the standard change process, failed changes are backed out, and the
previous status is restored until a new change can be planned. In an emergency,
however, change management may allow repeated attempts to make a successful
change if the initial attempts fail. In an emergency, restoring service is the main
and only priority.
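
The difference in how a failed change is handled can be written as a very small decision rule. This is a sketch under assumptions: the function name and the `max_attempts` cap are hypothetical, since the process description does not say how many repeated attempts an emergency allows.

```python
def handle_failed_change(is_emergency: bool, attempts_made: int, max_attempts: int = 3) -> str:
    """Decide what to do after a change attempt has failed.

    Standard changes are backed out so the previous status is restored.
    Emergency changes may be retried; max_attempts is an assumed safety cap.
    """
    if is_emergency and attempts_made < max_attempts:
        return "retry"
    return "back_out"


print(handle_failed_change(is_emergency=False, attempts_made=1))
print(handle_failed_change(is_emergency=True, attempts_made=1))
```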

The one person who is most responsible and accountable for change management
is the change manager. The change manager is more of a role than a single
position: depending on the size, complexity, and structure of your IT
organization, there may be more than one person playing the role of change
manager and assisting with several activities. The change manager plays a
leading role in many of the major activities of change management as follows:

• Receives and filters requests for change (RFCs) - There must be a
single point of collection for RFCs. The change manager reviews new RFCs
and filters out those that are impractical or duplicates of existing requests.
The change manager serves as the gatekeeper to the activities of the
change management process.
• Coordinates the activities of the change advisory board (CAB) -
The change manager serves as leader and facilitator for the change
advisory board and its emergency committee. The change manager
convenes the CAB and makes sure a cross section of departments within
the company is represented.
• Issues and maintains the forward schedule of changes (FSC) - The
change manager is the keeper of the FSC. When the change advisory
board approves changes, the change manager adds the change to the FSC
and distributes it within the company.
• Closes RFCs - The authority to declare a change complete and closed
should be limited to the change manager. A change should be removed
from open status only when it has been verified that the change has been
successfully implemented and accepted by affected users.

Having an effective change manager is one requirement for the success of change
management. There are several other critical factors that need to be present for
the implementation of change management to be successful.

Change management isn't an isolated process. Its success depends on how well it
meshes with the rest of the IT department and the company. Three factors that
are critical to the success of change management are:

• Appropriate tools - An integrated service management tool shares
information among all the IT service management (ITSM) processes.
Information sharing provides the ability to seamlessly handle change
requests and related incidents, problems, and errors.
• Supporting processes - Change management will be far more effective
if it's supported with the implementation of related processes. It should be
implemented along with configuration management and release
management.
• Management commitment - Without support from senior managers,
change management cannot succeed. Change management may introduce
significant changes in a company. Management commitment gives change
management authority to direct the change process.

The successful implementation of change management increases the value of a
company's information resources. Change management succeeds when the
change manager performs key duties and the environment for success exists.

Most companies that implement change management encounter some problems
and costs in the early going.

Some of the lessons learned that can be shared by others who have gone there
before are:

• One common problem encountered is a continuing reliance on the
previous methods of managing changes. Whether the previous system was
highly structured or very informal, many employees will cling to a familiar
but less effective system. As is always the case with change,
communication and seeing the vision help to bring everyone to the same
page as to the value of the new system.
• Another common problem is the unwillingness of some members of an
organization to relinquish control of changes to a centralized process.
Managers who were able to enact changes independently may be reluctant
to cooperate with change management at first. A common way to resolve
this problem is to show everyone the value, perhaps explaining that we just need
to look with a broader perspective. We don’t want to release the new flagship
product in the same week as the network switch OS upgrade and
Microsoft patch releases.
• Clinging to previous systems and a reluctance to give up independent
control may combine to create another common problem: attempts to
bypass change management. Some members of a company may resist the
structured process of change management and attempt to bypass it
entirely.

Companies implementing or improving change management should also expect
to incur some costs associated with change management. These costs may be
shared among several IT service management processes and not allocated
entirely to change management. The major categories of costs are:

• Personnel costs - A change manager is needed to develop and direct the
process. More members of the change management team may be added.
Members of the change advisory board must devote time to approving and
reviewing changes.
• Software costs - An automated IT service management tool should be
purchased or developed internally. This tool provides better control of
requested and pending changes and facilitates communication among
different IT service management process teams.
• Hardware costs - Some additional hardware may be needed to support
the software tool. Workstations may be purchased for change management
team members. In some cases, more data storage capacity may be
required as well. This is usually a minor cost.

The value of change management is well worth the original investment in
capital, resources, and headaches.

Posted by Elyse at 9:42 PM | Comments (3)

June 18, 2007

ITIL Problem Management

Problem Management is structured to address the causes of incidents which pose
the greatest (negative) risk. Therefore it focuses on the heavy-hitter, recurring,
service-affecting events; it doesn’t find the root cause or permanent fix for every
incident. Success is measured in terms of what has been removed from the
environment.

• How many problems are identified and removed from our IT environment.
• Problems which have a status of resolved and closed.

So let’s walk through the problem management process.

First an incident occurs. An incident is any unplanned outcome from the operation
of an information system. Incidents interrupt the IT service which the customer
receives. Incidents are normally reported to the service desk, and an incident
record is created.

Next, the incident is assessed. If the cause of the incident isn’t known, then the
incident is escalated to a problem. A problem is an incident whose cause is not
known.

As the problem is reviewed, the cause of the problem and a workaround may be
determined. As soon as these two aspects occur, the problem is changed to a
known error.

Finally the known error is assessed to determine if the symptoms of the incident
match an already existing problem record. If so, the new incident is
cross-referenced to that problem.

However, if the known error doesn’t match any existing symptoms, a net new
problem record is created.
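
The walk-through above can be sketched in a few lines of Python. Everything named here is hypothetical: the `Record` fields, the `PRB-n` identifiers, and the exact-match comparison of symptoms are simplifications chosen for the example, not part of ITIL.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Record:
    symptoms: str
    root_cause: Optional[str] = None
    workaround: Optional[str] = None


def classify(record: Record) -> str:
    """A problem becomes a known error once cause AND workaround are both known."""
    if record.root_cause and record.workaround:
        return "known error"
    return "problem"


def link_incident(symptoms: str, problem_records: Dict[str, str]) -> str:
    """Cross-reference to an existing problem record, or create a net new one."""
    for problem_id, known_symptoms in problem_records.items():
        if known_symptoms == symptoms:  # assumed matching rule: identical symptoms
            return problem_id
    new_id = f"PRB-{len(problem_records) + 1}"
    problem_records[new_id] = symptoms
    return new_id


problem_records = {"PRB-1": "email queue stalls"}
print(link_incident("email queue stalls", problem_records))  # existing record
print(link_incident("printer offline", problem_records))     # net new record
print(classify(Record("printer offline")))
```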

The terminology incident, problem, and known error portray the effect and root
causes of unexpected events in an information system. Identifying the cause of
these events and minimizing their impact is the primary purpose of the problem
management process.

The goal of problem management activities is to ascertain the root causes of
incidents and to minimize their impact on the business operations of a company.
This is done through the following processes:

• Problem control - The purpose of problem control is to identify problems
within an IT environment and to record information about those problems.
Problem control identifies the configuration items at the root of a problem
and provides the service desk with information on workarounds.
• Error control - The purpose of error control is to keep track of known
errors and to determine the resource effort needed to resolve the known
error. Error control monitors and removes known errors when it's feasible
and worthwhile.
• Proactive problem management - The purpose of proactive problem
management is to find potential problems and errors in an IT infrastructure
before they cause incidents. Stopping incidents before they occur provides
improved service to users.

The primary measure of the success of the problem management process is how
many problems are identified and removed from an IT infrastructure. Therefore,
the primary output from this IT service management process is problems
that are resolved and closed.

The work of problem management produces the following outcomes:

• Records of known errors and available workarounds - These records
are kept in the configuration management database (CMDB), and they
provide information to the service desk and other ITSM processes.
• Requests for change (RFCs) - RFCs describe changes needed to remove
a known error. Problem management does not approve or perform the
change. RFCs are sent to another ITSM process, change management.
• Changed records in the CMDB - Information about a known error and
any affected CIs is forwarded to the configuration management process,
the IT service management process that maintains the CMDB.

When the problem management process is used to identify the root causes of
problems, it's far more likely that they will be diagnosed correctly and fixed
properly. As a result, problems are permanently eliminated.

Problem management includes the following two types of approaches to address
problems:

• Reactive problem management - Reactive problem management seeks
to cure the symptoms of problems. The reactive approach responds to
reports of incidents that have already occurred. Reactive problem
management can be viewed as two groups of activities:
o Problem control activities - The major problem control activities
are:
▪ Identification and recording - Problem management
receives information about reported incidents from the
incident management process and the service desk.
Members of the problem management team analyze this
information, looking for similarities in the symptoms of
reported incidents. They look for records of previously
identified problems that can explain the symptoms. If none
can be found, a record describing a new problem is created.
▪ Classification - This control activity identifies the
importance of new problems and designates resources for
addressing them. Problems are classified by category, such as
hardware, software, or other types. Then they can be assigned
to the corresponding support personnel. Problems are also
classified by priority ranking. Problems with higher priority
rankings are addressed before problems with lower priority
rankings.
▪ Investigation and diagnosis - Problem management teams
look for the root cause of problems. If the cause is
determined, problem management recommends a workaround
or a temporary fix for the problem.
▪ Identify cause of problem and devise a workaround -
Once the cause is identified and a workaround is devised,
the status of the problem is changed to that of a known error
in the automated service management system.

When an IT department applies problem control activities, it prioritizes the
problems that present the biggest threat to the information system or the
company's ability to conduct business. When the root cause of a problem has
been found and a workaround has been devised, problem control activities end.
Then the second group of activities in reactive problem management begins.

o Error control activities - Now the problem becomes a known error
in the IT infrastructure, and error control activities begin. Error
control activities include:
▪ Error identification and recording - This means creating
a record that identifies a known error and all the
configuration items (CIs) that cause the error or are affected
by it.
▪ Error assessment - This activity prioritizes errors and
places them into groups according to their importance.
▪ Error resolution recording - The resolution to a known
error may include changes to hardware or software, user
training, or operational procedures. Error control creates a
request for change (RFC) and forwards it to change
management. The RFC is cross-referenced to the known
error in the automated service management system.
▪ Error resolution monitoring - Changes are planned and
implemented by other IT service management processes.
Problem management monitors the effect of problems on
service provided to users and the progress of requested
changes until they're complete.
▪ Error closure - The final error control activity is error
closure. When recommended changes to fix a known error
have been completed, the known error record in the service
management system can be closed. Records of incidents
and problems associated with that known error may also be
closed.
• Proactive problem management - Proactive problem management
seeks to inoculate IT systems against problems. The proactive approach
identifies potential problems before they emerge.
o Trend analysis - This is the process of examining problem and
incident reports to discover what types of problems are happening
more frequently. Trend analysis of existing problems and incidents
can reveal where similar problems may occur in other places within
the infrastructure. It can also show that repeated failures have not
been adequately resolved and are likely to continue to happen.
o Targeting preventative action - This process applies the same
techniques used in reactive problem management to a select few
potential problems with a high degree of business impact. Targeting
preventative action may include creating RFCs, training users and
service desk team members, or recommending procedural changes
within the IT department.
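
Trend analysis, as described above, often starts as nothing more than counting. The sketch below assumes incident reports are available as simple records with a configuration item and a category; the field names, sample data, and the `threshold` parameter are hypothetical.

```python
from collections import Counter

# Hypothetical incident reports; in practice these would come from the
# service management tool.
incidents = [
    {"ci": "mail-server", "category": "software"},
    {"ci": "switch-12", "category": "hardware"},
    {"ci": "mail-server", "category": "software"},
    {"ci": "printer-4", "category": "hardware"},
    {"ci": "mail-server", "category": "software"},
    {"ci": "switch-12", "category": "hardware"},
]


def recurring(incidents, key, threshold=2):
    """Return (value, count) pairs that occur at least `threshold` times,
    most frequent first. Repeated failures are candidates for preventative action."""
    counts = Counter(incident[key] for incident in incidents)
    return [(value, n) for value, n in counts.most_common() if n >= threshold]


print(recurring(incidents, "ci"))
print(recurring(incidents, "category"))
```

The items at the top of the list are the natural starting point for targeting preventative action, since they show failures that have not been adequately resolved.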

The groups of problem management activities (problem control, error control, and
proactive problem management) identify and resolve the problems which have the
greatest potential impact on a company's business.

The success of problem management depends on having the right people
performing the right actions. Responsibility for leading the problem management
process is assigned to one person designated as the problem manager. The roles
of the problem manager are:

1. To maintain and develop problem control activities - It's the problem
manager's job to make sure that information about incidents within the
system is being received and reviewed in a systematic way.
2. To monitor the effectiveness of error control activities and make
recommendations for improvement - The problem manager must also ensure that
relationships among configuration items are considered in proposed
solutions to problems.
3. To cascade information about workarounds or fixes to those who
need it - Communication with the service desk and incident management
is a key role performed by the problem manager.
4. To monitor the progress of problems and known errors toward a
final resolution - If solutions aren't implemented as quickly as necessary,
the problem manager may follow procedures to escalate the priority of the
problem.

Each of these four roles contributes to the ability of problem management to
identify and resolve problems and known errors quickly. The problem manager will
also perform typical supervisory roles to direct the activities of any other problem
management team members.

The problem manager's duties should never be combined with the duties of the
service desk supervisor. The priorities of the service desk and problem
management are often incompatible.

The success of problem management also relies on critical factors before, during,
and after the main activities in the problem management process. The critical
factors for success are:

• Performance targets - It's important to decide how the performance of
problem management will be measured before the process is
implemented. If possible, use statistics from the previous support activities
to set goals for problem management.
• Periodic audits - Perform periodic audits to determine whether problem
management procedures are being followed. Problems that aren't properly
reported or investigated are more likely to cause interruptions of service to
users or a major impact on the business.
• Problem reviews - Conduct major problem reviews after problems with
high urgency or impact have been resolved. Look for ways to improve the
way problems are identified and resolved. Problem management
procedures should be continually improved.

Problem management will succeed when an effective problem manager fills the
required roles, and critical factors for the success of the process are included in
everyday operating procedures.

Implementing problem management brings many benefits to a company and its IT
department. However, there are also some problems and costs that arise during
the implementation of problem management.

Among the most common problems companies experience is a difficulty
establishing adequate communication between problem management and
another IT service management process, incident management. Communication
between the two can be difficult because they pursue the following conflicting
goals:

• Problem management - The goal of problem management is to
investigate the root cause of a problem. The speed with which a solution is
found is an important, but secondary, consideration.
• Incident management - The goal of incident management is to recover
from incidents and restore service to users as quickly as possible.
Determining the cause of a problem is less important.

Companies also often have difficulty establishing lines of communication between
the software development process and problem management. Programmers and
developers are frequently aware of known errors in the software they create, but
they can be reluctant to identify them.

In many companies, employees resist new procedures. Many companies report
that employees cling to previous informal problem management methods. It takes
time for employees to accept the discipline of problem management.

Companies should expect to incur some costs with the implementation of problem
management. However, it isn't necessary to create a vast problem management
process that's capable of handling every single problem that arises. As a result,
the incremental costs of problem management are negligible. The hardware and
software tools needed are shared with other IT service management processes,
and the additional personnel costs are small.

Problems and costs arise frequently during the introduction of problem
management. However, the problems and costs are manageable, and the process
brings worthwhile improvements in the performance of the IT infrastructure.

Problem management seeks to identify the underlying causes of incidents in an IT
infrastructure and to remove those causes. The problem management process
addresses the causes of incidents reactively and proactively.

Posted by Elyse at 8:19 PM | Comments (7)

June 13, 2007

Ascertaining Business Need for that new Widget

You know that nifty new tool, just around the corner. Maybe it is Surface, a 30-inch
tabletop display which is operated through touch or gesture. The tabletop display
can recognize more than one touch at a time, enabling multiple users to
collaborate and interact. It also has the cool feature of recognizing physical objects
marked with a special ID tag. So now the mouse and keyboard aren’t a part of the
computer interaction experience.

This is revolutionary. Imagine a bed board application right on the table top. User
access is determined by the barcode on each user’s badge. Service triggers
automatically page housekeeping, which in this model is staffed to have a bed
cleared within 15 minutes.

Now obviously this is a little too new in the hype cycle to be implemented, but let’s
go through the exercise of assessing how this potential technology could be useful
to our healthcare facility.

Table Top Computing Bed Boards

Description            Application accessible via countertops in hospitals
                       and on the walls by patient rooms. Wireless service
                       triggers beepers for cleaning service.

Impact on Customers    Clear, concise picture of who is in which bed. Also
                       helps with the patient flow monitors.

Impact on Capital      Streamlining patient flow improves capacity.

Impact on Resources    Decrease overhead in bed maintenance in patient
                       access.

Impact on Cycle time   Accelerates the cycle time by improving the location
                       of each patient based on movement.

The goal of this exercise is to cut through the hype and understand the purpose,
business intent, and goals of the emerging technology. Obviously more detail is
needed, along with a larger pool of individuals to ascertain impact. However, the
point is to relate and describe the process.

Posted by Elyse at 7:58 AM

June 7, 2007

Select Seller Process


The select sellers process is used to evaluate the vendor proposals by applying
the evaluation criteria to ascertain who makes the cut and who doesn’t. So the
selection process consists of gathering all proposals, evaluating the proposals
(against criteria, not other proposals), and finally contracting with the seller.

So what do you need before you begin the process? Here are the inputs to the
select sellers process.

• Organizational Process Assets – The organizational process assets we are
specifically referring to are the formal policies which impact how a proposal
is evaluated.

• Procurement Management Plan – The procurement management plan
is used because it describes the complete procurement process from
development to contract closure.

• Evaluation Criteria – The evaluation criteria were developed for the
express purpose of rating proposals. So now is the ideal time to use this
document.

• Procurement Document Package – The procurement document
package is the guide which you provided to the vendor so the vendor could
appropriately respond and submit the proposal.

• Proposals – The proposal is the vendor’s written response to the request.

• Qualified Sellers List – The Qualified Sellers List contains the list of
sellers who were approved to participate in the bidding process.

• Project Management Plan – The Project Management plan has the risk
register and risk-related contractual agreements. These documents will be
used within the selection process.

These inputs will allow the project team to make a reasonable, informed decision.
The next step is deciding on the best candidate.
Thankfully once again we have some best practice guidelines to follow. The
recommended tools and techniques for the select sellers process are:

• Weighting Systems – Weighting systems are employed to quantify
personal perspectives. To use a weighting system, assign a ranked weight
to each of the evaluation criteria. Next, rate the vendor on each criterion,
multiply by the weight, and total the result for the overall score.

• Independent Estimates – Independent Estimates depict the intended
costs of the project. This is basically a check and balance technique. It can
be done in house or with an independent consultant.

• Screening Systems – Screening is a pass/fail method for certain criteria
deemed essential for project success. For example, if you are
implementing an Ambulatory EMR, is e-prescribing a vital component?

• Contract Negotiations – Contract Negotiation is the process of coming to
a mutual agreement regarding terms and conditions.

• Seller Rating Systems – Seller Rating systems rate sellers on past
performance, commonly reviewing quality, delivery performance, and
contractual compliance.

• Expert Judgment – Expert Judgment is used to have a multi-discipline
team evaluate the vendor’s proposal.

• Proposal Evaluation Techniques – Proposal Evaluation Techniques are a
combination of the above – most commonly expert judgment and
evaluation criteria.
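
To make the weighting and screening techniques a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the criteria names, the weights, the 1-5 rating scale, the vendors, and the single required feature (borrowed from the e-prescribing example above) are all hypothetical. It screens out proposals that miss a pass/fail requirement, then rates each remaining proposal against the criteria, multiplies by the weight, and totals the result.

```python
from dataclasses import dataclass

# Hypothetical criteria and ranked weights, for illustration only.
WEIGHTS = {"functionality": 5, "cost": 3, "support": 2}
# Hypothetical pass/fail screening requirement.
REQUIRED_FEATURES = {"e-prescribing"}


@dataclass
class Proposal:
    vendor: str
    features: set[str]
    ratings: dict[str, int]  # rating per criterion, assumed 1-5 scale


def passes_screening(proposal, required=REQUIRED_FEATURES):
    """Screening system: pass only if every required feature is offered."""
    return required <= proposal.features


def weighted_score(proposal, weights=WEIGHTS):
    """Weighting system: total of (rating * weight) over all criteria."""
    return sum(proposal.ratings[name] * w for name, w in weights.items())


def rank(proposals):
    """Screen first, then order the surviving proposals by overall score."""
    survivors = [p for p in proposals if passes_screening(p)]
    return sorted(survivors, key=weighted_score, reverse=True)


# Made-up proposals; each one is rated against the criteria on its own.
proposals = [
    Proposal("Vendor A", {"e-prescribing", "scheduling"},
             {"functionality": 4, "cost": 3, "support": 5}),
    Proposal("Vendor B", {"scheduling"},
             {"functionality": 5, "cost": 5, "support": 5}),
    Proposal("Vendor C", {"e-prescribing"},
             {"functionality": 3, "cost": 5, "support": 4}),
]

for p in rank(proposals):
    print(p.vendor, weighted_score(p))
```

In this made-up data, Vendor B has the highest ratings but never reaches scoring, because it fails the screening requirement. That is the point of screening: an essential criterion is not something a high score elsewhere can buy back.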

After using the above tools and techniques, you should be able to generate the
outputs of the select sellers process. Those outputs are:

• Selected Sellers – The selected sellers are the vendors who have been chosen.

• Contract – The contract is the agreement between the buyer and the
seller documenting the master service agreement, breakage terms,
addendums, statement of work, specifications and price.

• Contract Management Plan – The contract management plan details
how to manage a major purchase. It sometimes details payment
arrangements, performance requirements and documentation delivery.

• Resource Availability – Resource availability states the quantity and
availability of resources, along with documented dates of when the
resource is active or idle.

• Updates to the Procurement Management Plan – Updates of any
approved change affecting procurement management.

• Requested Changes – Certain changes to the plan should be processed
through integrated change control.
