Lean IT Kaizen

Official Publication

V 1.04
March 2016
Table of Contents
Scope and Purpose
Target audience

1 Introduction
1.1 Definitions
1.2 The Kaizen Mindset
1.3 Improvement Methods
1.4 DMAIC
1.5 Lean and Problems
1.6 Problems in IT

2 Organizing Kaizen
2.1 Daily Kaizen
2.2 Improvement Kaizen

3 A3 Method
3.1 A3
3.2 Contents of a Problem-solving A3
3.3 A3 Status Report and A3 Proposal
3.4 Skills for completing an A3
3.5 Building communication

4 Define Phase
4.1 Problem Statement
4.2 Validating the Problem
4.3 Types of Problems
4.4 Validating the Value of Solving the Problem
4.5 Ensuring Support for a Kaizen
4.6 Stakeholder Analysis
4.7 Define Phase and A3
4.8 Key Steps in the Define Phase
4.9 Case Study: Define Phase

5 Measure Phase
5.1 Data
5.2 Measurement Systems
5.3 Baseline and Benchmark
5.4 Value Stream Map
5.5 Measure Phase and A3
5.6 Key Steps in the Measure Phase
5.7 Case Study: Measure Phase

6 Analyze Phase
6.1 Seven Basic Tools of Quality
6.2 Finding the Root Cause
6.3 Analyzing a Value Stream Map
6.4 Analysis in IT
6.6 Key Steps for Analyze Phase
6.7 Case Study: Analyze Phase

7 Improve Phase
7.1 Idea Generation
7.2 Option selection and prioritization
7.3 Testing Solutions
7.4 Solutions used in IT
7.5 Improve Phase and A3
7.6 Key Steps for Improve Phase
7.7 Case Study: Improve Phase

8 Control Phase
8.1 Achieving Control
8.2 Control Plan
8.2.2 Monitoring
8.3 Communication Plan
8.4 Closure
8.5 Control Phase and A3
8.6 Key steps in the Control phase
8.7 Case Study: Control Phase

9 Appendix 1: References
9.1 Lean Six Sigma Pocket Toolbook (chapters 1-4, 9)
9.2 Understanding A3 Thinking
9.3 A Leader’s Framework for Decision Making

10 Appendix 2: Glossary

11 About the author
11.1 Niels Loader

Scope and Purpose
The purpose of this document is to support the Lean IT Kaizen qualification. All of the exam questions
can be answered based on the information in this document.

Target audience

The target audience for this document is:

•• Candidates for the Lean IT Kaizen Exam
•• Accredited Training Organizations

Copyright notes

ITIL® is a registered trade mark of AXELOS Limited.

COBIT® is a trademark of ISACA® registered in the United States and other countries.

PRINCE2® is a Registered Trade Mark of AXELOS Limited.

PMI® is a registered Trade Mark of the Project Management Institute, Inc.

Acknowledgements

The author would like to thank everyone who put their time and effort into improving this document.

The author would especially like to thank Troy DuMoulin (Pink Elephant) for the inspiring discussions
to get the right content into the Kaizen syllabus and the first reviews.

Many thanks to the following people for their critical reviews, which helped to improve this
publication:

•• Mike Orzen, Mike Orzen and Associates, Member of Lean IT Association Content Advisory
Board
•• Barry Fairingside, APMG
•• Gary Case, Pink Elephant
•• Rita Pilon, Exin, Member of Lean IT Association Content Team
•• Hans van den Bent, CLOUD-linguistics
•• Marianne Hubregtse, Exin
•• Natasja Soselisa, Quint Wellington Redwood
•• Ilona op de Weegh, Quint Wellington Redwood

1 Introduction
Ensuring that an IT organization is competent at continuous improvement in line with the interest
of the customer(s), one of the pillars of Lean and Lean IT, is absolutely vital to the success of
Lean within IT.

In the Lean IT Foundation, we looked at the basics of Kaizen (continuous improvement) and the
DMAIC problem-solving method. In this document, we will build on this material to help you on
your journey to mastering continuous improvement, and becoming a Lean IT Kaizen Lead. The Lean IT
Kaizen Lead is someone who is involved with Lean improvement and who could be at any level of the
IT organization, in any ‘department’.

We will be using terms defined in the Lean IT Foundation publication. If a term is central to this
document, we may repeat foundation course material. If you are not familiar with a particular
term, please refer to the foundation course publication.

We will take an in-depth look at the key aspects of organizing and running a Kaizen event. We will
also investigate the DMAIC problem-solving method in substantially more detail than we did in the
Foundation document. On top of this, we will use the A3 method to record and communicate the
findings of our Kaizen event.

1.1 Definitions

There are three main words used within the domain of improvement: Kaizen, Kaikaku and Kakushin.

Kaizen is the Japanese word for continuous improvement using small incremental changes. It
translates as “change for the better”. Kai means change, Zen means for the better. Kaizen is an
approach for solving problems and forms the basis of incremental continual improvement in
organizations. A problem is a difficulty that has to be resolved or dealt with. When applied to
the workplace, Kaizen means continuous improvement involving everyone, managers and workers alike,
every day and everywhere, providing structure to process improvement. Kaizen is about continuously
improving: every day, everyone and everywhere. Many small improvements implemented with Kaizen
produce faster results with less risk. In an IT context, we can equate this to a minor update to a
piece of software.

Lean also recognizes that there are moments when a more radical step change is necessary. This
type of change is known as Kaikaku. This refers to a revolutionary change to the existing
situation. Following the software example, Kaikaku would be the upgrade of an application
currently in use from a release level to a new release level. Software providers will often
substantially change both the technical basis of the software and its functionality. For both IT
and the user community, this means a large step change.

A third type of improvement known within Lean is Kakushin. The idea contained within Kakushin is
that some change will form a complete deviation from the current situation. It is about
innovation, transformation, reform and renewal. Again, in our software example, this may mean
replacing a complete application with a different application that supports the process in a
completely different way, for example a web-based application that fully
automates the registration of orders, the submission of invoices and the generation of a picking
order at the order fulfilment. This kind of change will entail the disappearance of many roles and
functions within a business, both from technological and business process perspectives. The
example represents a complete deviation from the current way of working. Another example of
Kakushin is where the organization standardizes a process and supporting software across the
entire organization where previously various groups had different processes and applications to
achieve similar goals.

In this document, we will focus on the use of Kaizen within IT organizations. The reason for this
is that purposeful improvement of IT services to customers is, generally, not done consistently
and continuously within IT organizations. The aim of this document is to describe how to embed
purposeful continuous improvement in any IT organization.

1.2 The Kaizen Mindset

As we already described in the Lean IT Foundation, Lean is a way of thinking and acting. We will
be discussing what needs to be done to successfully introduce and execute Kaizen within an IT
organization. At each step, we will discuss critical ‘thinking’ aspects.

Before we can start, we must investigate the starting point of Kaizen, and that is developing a
Kaizen mindset. What do we mean by this? We mean that there must be a belief throughout the IT
organization, both among managers and employees, that improving IT services and the way they are
delivered can and must be done on a daily basis.

So what are the core elements of a Kaizen mindset?

1. Seeing and prioritizing problems: Are both managers and employees truly prepared to uncover
problems, accept them as a part of daily life and initiate action to identify the problems that
most need solving?

2. Solving problems: Are both managers and employees prepared to invest time and other resources
to understand the root causes of problems and resolve problems completely?

3. Sharing lessons learned: Are both managers and employees driven to share the lessons learned as
a result of solving problems with others in the IT organization, so that they may benefit from the
lessons learned?

It is important to note at this point that problem solving is not about reactively waiting for
problems to appear and then resolving them as they occur. A problem-solving mindset is to first
establish a desired state of the service or process, understand the current baseline and gap and,
then, to incrementally close the gaps towards the desired state through Kaizen improvement steps.
The essence is that identifying problems and solving their root cause drives individual and
organizational learning.

There are, in fact, two types of Kaizen: Improvement Kaizen and Daily Kaizen. The one we will be
dealing with in detail in this document is the former and we will refer to this as Kaizen. It is
focused on carrying out Kaizen events to bring about incremental change. Daily Kaizen will be
further discussed in the Lean IT Leadership certification.

Daily Kaizen is more closely related to the Kaizen mindset as it entails continuously
looking at the environment in which we operate and changing things to make it easier for the
people in this environment to deliver a higher level of customer value, more quickly and more
consistently. Why is Daily Kaizen more closely related to the Kaizen mindset? Because Daily Kaizen
means being constantly alert to minor (and major) issues that need to be addressed directly.

A simple example of Daily Kaizen is as follows:

Imagine a printer on a table. The paper for the printer is stacked in boxes, each containing five
packs of paper, under the table. The result is that whenever the paper drawer is empty, someone
must bend under the table to get a pack of paper out of a box. A Daily Kaizen action would be to
mark out a rectangle of the size of a piece of paper with red tape on the table next to the
printer. A box of paper is placed on the rectangle. When the last pack of paper is used and the
box is discarded, the red tape will signal that a new box of paper needs to be put on the table.
This simple example means that most people will not need to bend anymore to get paper for the
printer’s paper drawer. Since this makes life more pleasant for everyone for a substantial length
of time, we can see this as a small improvement, and an example of Daily Kaizen.

Both types of Kaizen (daily and improvement) must be present in an IT organization for it to
continuously improve.

In Lean IT, our mindset is that we accept that our world is filled with problems and we act to
solve the problems on a continuous basis.

1.3 Improvement Methods

The most well-known continuous improvement method is the Shewhart Cycle, often referred to as the
Deming Circle. This is the Plan-Do-Check-Act (PDCA) sequence. This cycle is applicable in any
situation, and forms the basis for all improvement within Lean. Its premise is that by following
the plan, do, check and act steps in that order, we are able to purposefully take steps to improve
the capabilities of individuals, organizations, processes and technology. The Deming Circle
consists of the following steps:
•• PLAN: Establish a desired future state or reference design. Establish the current gap and
define the plan to revise the business process components to improve results
•• DO: Implement the plan and measure its performance
•• CHECK: Assess the results of the mitigation actions through measurements and report the results
to decision makers
•• ACT: Decide on changes needed to improve the process

The Deming Circle creates a feedback loop to ensure that improvements are identified and
implemented.

Within IT, the IT Infrastructure Library¹ framework identifies a Continual Service Improvement
cycle. This cycle uses the PDCA cycle as its basis. Other frameworks and reference models within
IT, such as COBIT² and ISO/IEC 20000, all contain improvement cycles based on the Deming PDCA
cycle.

¹ ITIL® is a Registered Trade Mark of Axelos Limited
² COBIT® is a Registered Trade Mark of ISACA

1.4 DMAIC

Within Lean IT, we also recognize the Plan-Do-Check-Act cycle as an integral part of continuous
improvement and recommend its use in all circumstances. We have, however,
chosen a more specific problem-solving method to support the actual execution of Kaizen within IT
organizations.

Kaizen events consist of five phases, starting with a problem statement and leading to an
embedded improvement. These steps are: define, measure, analyze, improve and control, also known
as DMAIC, the preferred method for problem solving. This method has a proven track record, is easy
to understand and adopt, and is suitable for the majority of problems encountered within IT.
•• In the Define step, we define the problem statement, describe the goal statements, analyze the
cost of poor quality, define the scope with a SIPOC (suppliers, inputs, process, outputs,
customers) diagram, establish the Kaizen project team, create the project charter and planning,
get stakeholders’ support and start the project.
•• In the Measure step, we build understanding of current KPIs and performance, develop the
Critical to Quality (CTQ) flowchart, write a data collection plan, try to understand process
behavior and variation, and relate current performance to the Voice of the Customer.
•• In the Analyze step, we collect data and verify the measurement system, study the process with
Value Stream Mapping, identify the types of waste, develop hypotheses about the root cause,
analyze and identify the data distribution and study correlation.
•• In the Improve step, we generate potential solutions by brainstorming, design assessment
criteria for impact and feasibility, decide which improvement can be implemented, implement or
pilot the improvement, and measure the impact on the CTQs.
•• In the Control step, we implement ongoing measurement, we anchor the change in the organization
through effective controls, and we quantify the improvement, capture the learning, and replicate
it across the board. We write the project report and close the actions of our project.

Our objective is to improve the delivery of value to the customer. For this we apply Lean
principles and techniques in the realm of IT. We use five dimensions to support the effectiveness
of our improvement activities. We use the DMAIC steps in a disciplined way to solve problems and
learn from them. This is how we continuously improve business performance, through continuously
improving IT.

Taking a brief look at Kaikaku and Kakushin, DMEDI (Define, Measure, Explore, Develop, and
Implement), an approach similar to DMAIC, is strongly based on data and statistical analysis, and
can be used for more radical improvements. It requires the application of creativity in using data
to design new processes, products, and services. DMEDI aims at taking a step-change leap over
existing processes, products, or services, and seeks to generate a competitive advantage.

DMEDI is principally used in situations where existing processes, products or services work so
poorly that they need to be designed from scratch, or where the gap between the desired state and
the current performance remains huge. DMEDI can be used if an IT service or process continues to
fail to meet customer expectations even after DMAIC has been used. The application of DMEDI
requires a longer lead-time and considerable resources compared to DMAIC. DMEDI is unsuited for
Kaizen.
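
For readers who like to see a method written down as a structure, the fixed order of the five DMAIC phases can be sketched in a few lines of Python. This is only an illustration, not part of the Lean IT method itself: the names `Phase`, `ACTIVITIES`, `next_phase` and `run_order` are assumptions made for this sketch, and the activities listed are an abridged selection from the DMAIC steps in this section.

```python
from enum import Enum
from typing import List, Optional


class Phase(Enum):
    """The five DMAIC phases, in the order a Kaizen event runs them."""
    DEFINE = 1
    MEASURE = 2
    ANALYZE = 3
    IMPROVE = 4
    CONTROL = 5


# Hypothetical, abridged mapping of each phase to a few of its activities.
ACTIVITIES = {
    Phase.DEFINE: ["problem statement", "SIPOC scope", "project charter"],
    Phase.MEASURE: ["CTQ flowchart", "data collection plan"],
    Phase.ANALYZE: ["Value Stream Mapping", "root cause hypotheses"],
    Phase.IMPROVE: ["brainstorm solutions", "pilot the improvement"],
    Phase.CONTROL: ["ongoing measurement", "project report"],
}


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows `current`, or None after CONTROL."""
    if current is Phase.CONTROL:
        return None
    return Phase(current.value + 1)


def run_order() -> List[str]:
    """List the phase names in execution order."""
    return [phase.name for phase in Phase]
```

Calling `run_order()` yields the phases from DEFINE through CONTROL, and `next_phase` makes explicit that a phase is only followed by its direct successor, which mirrors the disciplined, step-by-step use of DMAIC described in this section.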

1.5 Lean and Problems

Lean has a relatively relaxed relationship with problems. Particularly in western organizations,
problems do not always appear to be particularly welcome. There is an inclination to believe that
higher levels of management prefer to hear a positive story and are not open to problems. Whether
this is true or not is irrelevant; the inference drawn is more important, and the consequence of
that inference is that problems are not proactively identified and are often ignored or masked.

Lean takes a completely different approach. Problems are fully accepted as a part of people
working together. In fact, the way Lean looks at problems can be summarized by:
•• Most problems are solvable (or partially solvable, or at least their impact can be minimized).
•• Problems are opportunities in disguise
•• Problems are challenges that encourage people to overcome them

Lean sees problem solving as a leadership activity regarding the identification of a future or
desired state and the relentless pursuit of closing the current gaps between the desired state
and the current baseline. Problem solving is about establishing the way forward, making the
difference between the status quo and a better situation.

When we solve a problem within an IT organization, in essence, we are removing muri, mura and/or
muda from people, process and/or technology.

1.6 Problems in IT

IT is one of the fastest developing areas of the economy; the phenomenal increase in processing
power, transmission of data across networks, and amount of data generated is unprecedented. The
result is that within IT we are continuously confronted with new situations. Inevitably, new
situations generate new problems that need to be solved.

However, not only new situations require problem solving. Within IT, we are confronted on a daily
basis with disruptions to existing services due to unplanned outages or failures. These service
outages impact the users of IT services and require support. Support means restoration of the
service via Incident Management but also the application of Problem Management practices to
identify the root cause and establish solutions to ensure the disruption does not occur again in
the future.

The necessity to solve problems permanently has long been recognized within IT. The term
‘Problem’³ is defined as ‘the cause of one or more incidents’. In this document, we will refer to
this term with a capital P, in order to distinguish it from the more generic ‘problem’. The
Problem is one of the key units of work within an IT organization.

³ ITIL definition of a Problem

Problem Management is one of the core operational IT processes, as defined in ITIL. Its aim is to
prevent problems and incidents, eliminate repeating incidents, and minimize the impact of
incidents that cannot be prevented.

Problem Management is made up of two parts:
•• The first part is aimed at uncovering the root cause of incidents. A Problem for which the root
cause and a workaround are known is called a Known Error. A workaround is a way of reducing the
impact of the problem and associated incidents when the full resolution is not yet known.
•• The second part focuses on removing the
Problem from the IT service infrastructure. In many cases, this is done by carrying out a change.

The DMAIC methodology is completely compatible with the Problem Management process. In fact,
using DMAIC to solve technical problems provides additional structure and discipline. It also
helps to broaden the scope, identification and resolution of problems in other Lean IT areas such
as Process, Performance, Organization and Behavior & Attitude.
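
The distinction between a Problem and a Known Error made in this section can be captured in a small record type. The following is a minimal sketch only, assuming hypothetical field names (`summary`, `root_cause`, `workaround`); it is not how ITIL or any particular service management tool models Problems.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    """A Problem (capital P): the cause of one or more incidents."""
    summary: str
    root_cause: Optional[str] = None  # filled in once analysis uncovers it
    workaround: Optional[str] = None  # reduces impact until full resolution

    @property
    def is_known_error(self) -> bool:
        """A Known Error has both a known root cause and a known workaround."""
        return self.root_cause is not None and self.workaround is not None


# Illustrative usage with made-up values.
problem = Problem(summary="Nightly batch job fails intermittently")
problem.root_cause = "Temporary volume fills up"
problem.workaround = "Clear the temporary volume before the job starts"
```

In this sketch a Problem becomes a Known Error as soon as both fields are filled in; removing the Problem from the IT service infrastructure, the second part of Problem Management, is deliberately left out.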

2 Organizing Kaizen
As we saw earlier, the aim of Kaizen is to ensure that problems are identified and solved, and
that lessons learned are shared within the IT organization. It is vital that leaders within the
IT organization emphasize the need for the Kaizen mindset.

2.1 Daily Kaizen

Daily Kaizen is the act of responding to everyday occurrences such as incidents, mistakes and
other quality issues and addressing quality issues at the source rather than being satisfied with
quick fixes. It is highly dependent on the fact that management and technical leaders adopt the
Kaizen mindset, to ensure that they provide their staff with the authority and time necessary to
address the quality issue. The focus of leaders on uncovering and dealing with problems is vital
in encouraging employees to see and tackle problems they meet in their daily work.

Daily Kaizen is about ‘stopping the line’ when a problem is uncovered. This is principally a
Kaizen mindset issue: do we continue programming and let the testers find the errors in the code,
or do we create an environment in which quality is built in at the source, even if this means
stopping or delaying delivery? What is the ‘Andon Cord’ in your IT organization? Daily Kaizen may
well lead to quick fixes of everyday problems. However, it tends to focus on solving smaller and
simpler problems. The analysis required is less intense than in Improvement Kaizen. This
automatically means there is less deep learning achieved through Daily Kaizen.

Within IT, we see that Daily Kaizen is an integral part of the daily and weekly meeting
structures we discussed in the Lean IT Foundation. Both the day start and the week start give
ample opportunity to discuss problems. These problems may give rise to short term, quick
solutions, but may also trigger an improvement Kaizen.

2.2 Improvement Kaizen

This is the most popular and visible form of Kaizen within IT. Simply said: improvement Kaizen is
about bringing together a group of people who have an interest in having a particular problem
solved, and getting them to solve this problem. It is sometimes referred to as a Kaizen event.
Improvement Kaizen does have the drawback of requiring a substantial time investment and the
results may not always be as successful as desired.

This sounds simple but this generally requires some organization and management to ensure that
the right people are involved and the right things happen. In this chapter, we will look into the
governance and organizational aspects of Kaizen events.

2.2.1 Sources of Kaizen Initiatives

Most organizations have no shortage of known errors or opportunities for improvement. Deciding
which Kaizen initiatives deserve the resources involves deciding which is most important to the
customer and the organization. Having made this primary decision, we must check the feasibility
of the initiative.

As we saw in the Lean IT Foundation, Kaizen initiatives may arise from one or more sources. These
sources are known as the ‘Voices’.
•• The most important voice is the Voice of the Customer (VoC) which gives the IT organization
feedback on how the customer, the user of the IT service, actually experiences the IT service.
The only person who can truly give us this feedback is the person who uses the IT service.
There are, of course, other voices that help us to uncover problems:
•• Voice of the Business (VoB): For IT, this concerns the ‘business’ of the IT organization
itself; not to be confused with the fact that the customer of IT is regularly referred to as
“the business”. Even if the VoC does not identify any problems, the VoB may well find problems
to be solved. An example could be that the customer is very happy with the quality of the IT
services, but the Voice of the Business tells us that cost levels are too high and that budgets
will be exceeded before the end of the year. The VoB would indicate that the IT organization
needs to carry out a Kaizen to understand where cost is excessive and how it can be reduced.
•• Voice of the Process (VoP): This is about processes not working correctly. Again, the VoC
may indicate that the results of the process are satisfactory and the VoB may not have any
issues with the costs or quality. However, the process may indicate that, for example, even
though changes are delivered on time and with few incidents, the variability of the process
gives cause for concern.
•• Voice of the Regulator (VoR): It may seem that regulators primarily have their sights set on
particular business sectors. IT is also directly affected by regulators. The Sarbanes-Oxley Act
specifically stipulates how IT must create an audit trail of changes. As IT becomes more
entwined with the primary processes of business, or even replaces these primary processes with
systems that only require humans to see the exceptions, IT will find itself more directly
affected by the regulator.

2.2.2 Kaizen Team

In order to solve a problem using Kaizen, we must accept that the problem is not solvable by an
individual; that it is only with the power of a diversity of points of view that the problem
will be adequately addressed.

This brings us to two major questions:

1. How many people do I need in the Kaizen team?

2. Which roles are there in the team?

To start with the first question, practice has shown that 5 to 8 people is the optimum range.
With fewer than 5 participants, the diversity of points of view can be compromised and the work
that needs to be done is spread over a small group. It has also been found that where larger
teams are required, the scope of the problem is probably too large.

Within the team, we find three basic roles:

•• Kaizen Sponsor: He or she is the owner of the problem, the person who has a direct interest
in having the problem solved. In some cases, we may find that the manager of the problem owner
is identified as the Kaizen sponsor. Generally, this will happen for budgetary or visibility
reasons. This person must want to see the Kaizen event through to its conclusion, i.e.
resolution of the problem. Without this person, there is no point carrying out a Kaizen event,
especially when time (and maybe some money) will be spent understanding and solving the
problem. The Kaizen sponsor must have an affinity with the problem and must also be prepared to
do what is necessary to get the problem solved. This does not mean that the resolution can be
at any price.
•• Kaizen Lead: This person manages the Kaizen process on behalf of the sponsor and the team.
This role ensures that the correct steps are followed as efficiently as

possible, so that the right actions can be taken as quickly as possible to remove the problem.
This person must be experienced in managing the Kaizen process and ensuring that the team stays
on track. A Kaizen lead must have facilitation and team-building skills in order to turn the
group into an effective team in a short time.
•• Kaizen Team Member: The people executing this role will do the required work. They must be
involved with the problem as it occurs on the work floor. They must have intimate knowledge of
the process in which the problem occurs, i.e. they must work in the process on a daily basis.
It is useful to have people who are ‘upstream’ and ‘downstream’ of where the problem occurs.
Also, having someone who is involved with the problem but can look at it from a dispassionate
point of view can be useful to avoid tunnel vision.
Selecting the correct team members for a Kaizen team is the next challenge. It is clear that we
need diversity. This means the team must include people who work in the process, but also a
manager who is close to the process, but not necessarily the manager of the process (who may be
the sponsor). The team will need technical skills, e.g. understanding the technology involved
in supporting the process, or business and regulatory rules governing the process.

2.2.3 Preparing a Kaizen Event

The DMAIC cycle exists within an organizational context. As we saw in the Lean IT Foundation,
Lean IT organizations work with visual management as part of the Jidoka principle. Jidoka is
all about creating an environment in which disturbances to the flow of work through the value
streams are made visible, that is, problems are not left covered up.

In day-to-day working within a Lean IT organization, we see that problems are brought up on a
daily and weekly basis. The problems are posted on the improvement board. There is thus always
a ready inventory of problems to be solved. This inventory will contain both Daily Kaizen
initiatives that need to be picked up and problems that need some more attention in the form of
a Kaizen event.

It is from this inventory of problems that the problem with the highest priority must be picked
and investigated through a Kaizen event. How priority is defined will be discussed later.

The fact that a problem has found its way onto an improvement board means that there is someone
who thinks it is important to be resolved. The question is: does this person have the support
of others, especially those in the position to allocate resources for the resolution of the
problem?

Assuming there is sufficient need to solve a particular problem, usually the Kaizen sponsor or
a small team of people including the sponsor will create a short Kaizen charter in which the
problem is described and an indication is given of the resources (people, time, and money)
required for the resolution of the problem. The time within which a solution should be found
will also be indicated. This means that an initial stakeholder analysis must have been done.

Based on the Kaizen charter, the Kaizen event can be planned and prepared. This means
organizing basic things, such as:

•• A location where the Kaizen team can meet
•• Whiteboards, flip-overs, marker pens, and post-it notes
•• Access to data sources
•• Invitations to all participants, including the sponsor and Kaizen lead

On top of this, there must be an agreement
on how and when the team will communicate
progress. The minimum communication must
be a daily update of the improvement
board containing the relevant problem. This
should be supplemented by regular submission
of the current state of the A3.

Planning a Kaizen is often described as a
relatively straightforward affair in which
activities are planned in a week. Ideally, a
Kaizen is planned within a short time, wherein
the Kaizen team dedicates their time to solving
the problem.

In practice, within an IT organization, this
kind of planning is quite difficult. Especially
at the start of a Lean IT transformation, the
organization is not attuned to the fact that
people are out of the ‘production’ process for
a full week. Even after a transformation has
taken hold of the IT organization, it remains
difficult to clear agendas completely to focus on
a Kaizen.

A more realistic way of planning the Kaizen is
to set up five or six meetings of three hours
per meeting at regular intervals over a period
of two weeks. This gives engineers the time to
carry out operational work in the meantime.
The agreement must also be made that work
related to the Kaizen be carried out in between
meetings, e.g. data collection, processing of
data or preliminary analysis.
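
The arithmetic behind this alternative can be sketched as follows. This is only an
illustration: the 40-hour working week used for comparison and the function name are
assumptions, not figures from this publication.

```python
# Meeting time needed for a Kaizen spread over two weeks,
# using the figures from the text: five or six meetings of three hours each.
HOURS_PER_MEETING = 3

# Assumption for comparison only: a dedicated Kaizen week of 40 working hours.
FULL_WEEK_HOURS = 40


def kaizen_meeting_hours(meetings, hours_per_meeting=HOURS_PER_MEETING):
    """Return the total hours spent in Kaizen meetings."""
    return meetings * hours_per_meeting


low = kaizen_meeting_hours(5)
high = kaizen_meeting_hours(6)
print(f"{low} to {high} meeting hours, versus {FULL_WEEK_HOURS} hours for a dedicated week")
```

The remaining time is not free: work such as data collection between meetings still has
to be agreed and planned.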

These preparatory activities bring the Kaizen
event to its point of initiation. In the Define
phase, we will see how this input is validated,
enriched, and brought to a point where the
problem can be fully investigated.

3 A3 Method
One of the powerful tools that Toyota has institutionalized within Lean is working with A3
reports. It supports and promotes continuous improvement, and is based on the PDCA cycle.

3.1 A3

A3 is not a clever acronym; it simply refers to the size of a piece of paper. A3 is 29.7 cm by
42 cm (11.7 in by 16.5 in). It is twice the size of A4 and half the size of A2. The beauty of
the A3 sheet is that it provides enough space to explain a relatively complicated story, but
limits the writers in their verbosity. The aim of the A3 is to encourage conciseness in the
communication of a message. It also works as a checklist to ensure strict adherence to the
chosen problem-solving methodology, in our case DMAIC.

It is important to understand that there is no hard and fast way to complete an A3
problem-solving sheet. Most A3 sheets tend to have 7 or 8 sections, as we will see below.
However, if you wish to have 5 sections focusing on DMAIC, then this is acceptable. It is
important that the problem-solving A3 covers the complete PDCA cycle. The key determinants for
a good A3 sheet are:

1. Does it help the team compiling it to follow a structured problem-solving method?

2. Does it help people who need to take action on the outcome, to understand the logic that
led to the outcome?

3.2 Contents of a Problem-solving A3

Let us start with a basic much-used version of the A3 problem-solving sheet. It includes the
following elements:

•• Background: In this section, the context in which the problem exists is described. This may
include a brief history of the IT organization or department in which the problem exists. The
background section will include a description of the problem.
•• Current Condition: Here we describe the current condition surrounding the problem. This may
include complications that cause the problem to remain in place.
•• Future State Goals: This is a description of the way the situation should be if the problem
did not occur. Preferably, we should be able to define in concrete terms what would happen if
the problem no longer existed. ‘Concrete’ may even mean setting a numerical target that should
be achieved as a result of the resolution of the problem.
•• Analysis: This section includes a short description of the analysis that was done to
discover the root cause of the problem.
•• Proposed Options: Here we find the list of possible solution candidates to the problem.
•• Plan/Improvement: This is where the improvements to be implemented are described and a brief
plan is created for their implementation.
•• Follow-Up: After the chosen solutions have been implemented, there must be one or more
follow-up actions to ensure that the adopted solution remains in place. There must at least be
one action to inform others of the lessons learned from the problem-solving action and/or to
communicate the solution to other parts of the organization where they may be suffering from
the same issue.

The associated A3 may look like this:

Alternatives may include the following models. The first model includes a flow in which the position of
the Analysis step is clearly seen as an intermediate step.

And the second model is based on the DMAIC method.

Each of the three models presented is valid as it helps the team carrying out the Kaizen to both
follow a process and to communicate a result.

As we stated earlier, within IT, we not only recognize problems to be solved. We also recognize
Problems. These tend to be issues of a technical nature that are the root cause of incidents. An A3
model for the resolution of these Problems could be the one below.

3.3 A3 Status Report and A3 Proposal

In their book ‘Understanding A3 Thinking’, Sobek and Smalley describe two other forms of A3
report: the A3 Status report and the A3 Proposal report. These are again variations on the
above themes, but with different purposes.

The A3 status report is aimed at informing all stakeholders of the progress of the execution of
a long-running project or action. This type of A3 is not so much focused on analysis; rather it
aims to continually check whether the assumptions made continue to be correct and ensure that
it is clear which actions need to be taken. An A3 status report will tend to focus on the Check
and Act aspects of the PDCA cycle.

The key components of the A3 status report are:
•• Background: In this section, the context is described. This may be a concise version of the
problem-solving A3 for which the A3 is a status report.
•• Current Conditions: Here, the progress of the project is described. The changes that have
already been made are described.
•• Results: This is the key section of an A3 status report. The current conditions are the
consequence of actions taken. These actions have led to results. It is the results on which the
decisions are taken whether to continue and, if so, which course of action to take.
•• Remaining Issues/Action Items: The A3 status report ends with the upcoming actions. These
may be based on issues encountered during the process of getting into the current condition or
they may be actions based on the original plan.

The A3 status report is an important document to support the learning process within the
organization. Each status report must lead to some kind of reflection, with lessons learned
that lead to action. In this way, the A3 status report is embedded into the Daily Kaizen, and
enhances the continuous improvement mindset.

The A3 proposal is used for creating a recommendation for action. Generally, the A3 proposal
will be aimed at implementing new policy or at carrying out a project that entails substantial
investment of time and/or money. This A3 report focuses principally on the Plan phase of the
PDCA. It will also describe how the Check and Act phases need to be carried out, i.e. it should
indicate how the proposal will be monitored as it is being executed and in the
post-implementation phase.

The A3 proposal report is more similar to the A3 problem-solving report. Its sections are:

•• Background: As with the A3 problem-solving report, this section includes the context within
which the proposal is being written.
•• Current Condition: This is the key section of this A3. It should be clear from this section
why the proposal needs to be made and why it is important to seriously consider its execution.
The main issues must be clear to the reader.
•• Proposal: This is a description of the proposed course of action.
•• Analysis/Alternatives: This section is all about the business case for the proposal.
•• Plan Details: In this section, the reader is given the details of what will be involved with
carrying out the proposed change. It is vital that stakeholders, necessary resources and
consequences are made clear.
•• Unresolved Issues: This section deals with issues that are not (sufficiently) addressed and
that may have an impact on the execution of the proposal. In essence, these are risks that may
affect the proposal.
•• Implementation Schedule: This is a high-level plan of how the proposal would be implemented.
In all cases, the text in an A3 must be created in such a way that the audience clearly

understands what problem has been solved, what the status is of a particular project, or what
the proposal is. The A3 must be written from the perspective of the reader!

3.4 Skills for completing an A3

Using A3 reports requires practice. There are skills that need to be acquired and honed to
ensure that an A3 becomes a powerful tool.

In order to ensure the information in the A3 is accurate, there are four skills that must be
practiced:
•• Summarize: The first key skill is the ability to express thoughts, facts, and other
information concisely. Although an A3 sheet looks quite large when it is blank, the act of
filling it with the relevant information can be quite a challenge. It is vital, therefore, to
stick to the information that has a direct bearing on the issue at hand, be it a problem, a
proposal or a status. In order to summarize, we need the next two skills.
•• Analyze: Analysis is part of most A3 reports in some form or other. What does it mean? The
aim of analyzing is literally to separate something into its constituent parts or elements. It
is vital when writing an A3 report to understand the parts of the problem so that only the
right information is given. If we are able to discern the parts of a problem, we can also
determine which of these parts are relevant to the reader.
•• Synthesize: The opposite is also true. One of the best ways of summarizing is by combining
parts or elements. The ability to synthesize can be defined as combining a number of disparate
elements to make a coherent whole. This is important when the parts do not immediately appear
to have individual relevance to the issue.
•• Visualize: Once we have analyzed, synthesized, and summarized, we need to tell a story
succinctly. In line with the ‘a picture tells a thousand words’ adage, it is strongly
recommended (by all authors on the subject of A3) to turn your story into a visual experience
using pictures and graphics to explain what has been investigated and what is proposed as a
solution.

These skills will be further specified through the examples in this document.

3.5 Building communication

Using the aforementioned skills will help to determine the parts of the story about your
Kaizen. You will then need to construct the story in a way that is easy for the stakeholders to
understand. This will help stakeholders to accept the solution you are proposing.

There are many ways to construct a story. The one we will deal with in this publication is
Barbara Minto’s Pyramid Principle. This is a method that is fully compatible with A3 thinking.
In fact, it helps to structure the information and insights gained during the Kaizen event.

The problem is framed using the following framework:

Situation: The current situation and ambition of what the situation will look like when the
problem is solved.

Complication: A description of the things that are keeping the current situation the way it is
or preventing the problem from being solved.

Key Question: This is the question to be answered; the problem to be solved (in question form).

Answer: This is where the elements of the analysis are structured in order to present a
coherent set of motivations supported by arguments, completed by the proposed course of action.

The Situation-Complication-Key Question trilogy will be recognized in the next chapter as a problem
statement. The answer includes the structuring process required to bring the Measure, Analyze, and
Improve steps together.

Using the Pyramid Principle means using a bottom-up approach for grouping arguments (the A’s in
the above figure) in a logical way such that they support a motivation for the answer you give. The
Answer should be supported by three clear motivations as to why this answer is the best answer to
the Key Question. The arguments and motivations will come from the Analyze phase of your Kaizen
event. The Answer will be the result of the Improve phase.

A useful technique in constructing an argumentation pyramid is MECE. This stands for Mutually
Exclusive, Collectively Exhaustive. Mutually Exclusive means that all items in a particular category
only belong to that category, and no other category. Collectively Exhaustive means that all
possibilities have been covered.

In an IT context, we may encounter a situation where there is a lack of satisfaction with two services.
Based on a data set including a variety of calls, we would need to have each call put into a single
category, e.g. the call may be an incident, a service request, a request for information or a complaint.
These categories would need to be defined in such a way that all calls in the data set fall into one of
the four categories, and only one of the four categories. In this way, the data set and the
analysis based on it would be assured to be MECE.
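
As an illustration, a MECE check on such a data set can be sketched in a few lines of Python.
The call records, the `labels` field and the function name are hypothetical; only the four
category names come from the text above.

```python
# The four call categories named in the text.
CATEGORIES = {"incident", "service request", "request for information", "complaint"}


def mece_violations(calls):
    """Return (uncovered, overlapping) call ids.

    uncovered   -> the call fits none of the categories (not Collectively Exhaustive)
    overlapping -> the call fits more than one category (not Mutually Exclusive)
    """
    uncovered, overlapping = [], []
    for call in calls:
        matched = set(call["labels"]) & CATEGORIES
        if len(matched) == 0:
            uncovered.append(call["id"])
        elif len(matched) > 1:
            overlapping.append(call["id"])
    return uncovered, overlapping


# Hypothetical sample data, for illustration only.
calls = [
    {"id": 1, "labels": ["incident"]},
    {"id": 2, "labels": ["service request"]},
    {"id": 3, "labels": ["incident", "complaint"]},  # breaks Mutually Exclusive
    {"id": 4, "labels": ["password reset"]},         # breaks Collectively Exhaustive
]

uncovered, overlapping = mece_violations(calls)
print("Not covered by any category:", uncovered)
print("In more than one category:", overlapping)
```

The categorization is MECE for the data set only when both lists are empty.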

Subsequently, the conclusions drawn and proposals suggested would also be relevant to the correct
calls. Analysis may show that the calls for a particular application are distributed 80% incidents and
20% service requests, whereas a second application may have 20% incidents, 40% requests for
information and 40% service requests. Assuming for one moment that the absolute volumes of calls
are the same, the analysis may conclude that application 1 is technically unsound since it has many
technical disruptions. Further analysis may identify the causes of these disruptions. Secondly, the
analysis may show that there has been insufficient training of users regarding application 2 because
there are many calls for support.

The result is two motivations (resolve the technical problems and train the users) and a series of
arguments leading to these motivations. The answer to the key question may then be: we need to
invest differently in applications 1 and 2, to increase the user satisfaction of the two services.
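
The call distributions used in this example can be derived from categorized calls with a short
sketch such as the one below. The volume of 100 calls per application is an assumption made so
that the percentages match the text; the function name is hypothetical.

```python
from collections import Counter


def call_mix(categories):
    """Return the percentage share of each category in a list of categorized calls."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: round(100 * n / total) for category, n in counts.items()}


# Assumed volumes: 100 calls per application, distributed as in the text.
app1_calls = ["incident"] * 80 + ["service request"] * 20
app2_calls = (
    ["incident"] * 20
    + ["request for information"] * 40
    + ["service request"] * 40
)

print("Application 1:", call_mix(app1_calls))
print("Application 2:", call_mix(app2_calls))
```

The shares are only arguments; the motivations (resolve the technical problems, train the
users) still have to be drawn from them by the Kaizen team.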

4 Define Phase
“The beginning of wisdom is the definition of terms” is a quote attributed to Socrates. He
might just as well have said “The beginning of solutions is the definition of problems”. And,
in practice, it turns out to be true. Once a problem has been defined, the problem can appear
to diminish in size or importance. As our understanding increases, so does our feeling of our
ability to solve the problem.

This issue of the perceived diminishing size of a problem gets more significant the more we
understand about the problem. We will return to this issue as we go through the DMAIC cycle.

Unsurprisingly, ‘Define’ is the starting point for DMAIC, namely with the definition of the
problem to be solved.

Before we can start, we need to identify which problem we are going to solve. This may appear
to be simple, especially since one of the most prevalent starts to a sentence within IT is “The
problem is …” followed by a problem statement of dubious quality. On top of this, the ‘problem
statement’ usually includes the preferred solution somewhere in an adjoining sentence, e.g.
“The problem is that [x] is not possible with Windows/Linux, but is possible with
Linux/Windows”, or “The problem is that development doesn’t provide us in Operations with a
decent handover document.”

A good way to start developing the Kaizen mindset within the IT organization is to have a
standard response to these ‘The problem is …’ statements. The standard response could be
something like “But is that the real problem?” Experience has shown that this simple question
gets IT people thinking about problems in a constructive manner.

Possibly more important than defining a problem is the fact that someone believes that it IS in
fact a problem and is prepared to invest time and, possibly, money to get the problem solved.
In short, we need a sponsor for the problem.

Identifying the sponsor of the Kaizen is an absolutely indispensable step that must be
confirmed regularly throughout the DMAIC process. As soon as no one feels a need to solve the
problem, stop the process instantly. Any further action is waste, since when it comes to
actually taking action, no one will feel inclined to make the effort.

As we said in 2.2.2, there are three roles that must be identified and fulfilled before a
Kaizen event can be organized. The first, we have just seen, is the sponsor. Mostly, this will
be someone who has a vested interest in getting the problem solved. This person must then
ensure that other people directly involved in the situation where the problem exists are
brought together as a team to work on solving the problem. Lastly, the Kaizen lead must be
added to the team to ensure that the Kaizen process is followed. The Kaizen lead must also keep
an eye on the way the team works together.

4.1 Problem Statement

Problems are mostly visible difficulties which confront the IT organization or individuals.
However, problems never exist in isolation. There is always a cause. The cause and factors that
keep the cause in place are the entities that we are trying to understand when we seek to solve
a problem. One of the most

difficult parts of defining a problem is that every problem has symptoms; phenomena that
accompany the problem, or serve as evidence that the problem exists. They are, however, not the
problem itself.

Before we can start to investigate our problem, we must have a statement that helps the team
investigating the problem to focus its attention. We call this a problem statement. The problem
statement may be in the form of a question or in the form of a statement. The former is
preferable because it is then clear when you have found the answer to the question.

A complete problem statement should include a description of the current situation, the reason
why this is not acceptable and an indication of what the ideal situation looks like. Next, the
problem itself is described, followed by the question to be answered. It is vital that the
problem statement is Specific, Measurable, Achievable, Realistic, Time-bound (SMART) to ensure
that the team solving the problem knows when it has been successful.

The other reason for using a question as the form to describe the problem statement is that we
can then discern the problem statement from any hypotheses we may have. A hypothesis is ‘a
proposition, or set of propositions, set forth as an explanation for the occurrence of some
specified group of phenomena, either asserted merely as a provisional conjecture to guide
investigation (working hypothesis) or accepted as highly probable in the light of established
facts.’ (Dictionary.com)

A hypothesis is a statement that will start with the words “I/We think/believe that …”. The
hypothesis is as yet not supported by any factual basis. The hypothesis is based on people’s
beliefs as a result of their observations. These are by definition selective and biased, and
very much in need of testing through thorough analysis of the data and facts that can be found.

Example of a problem statement and hypotheses:

We need all of our software changes to go into production seamlessly, without defects, where
everyone is aware of and informed about the outcomes and status of the change. Right now, we
have too many release failures, requiring rollbacks. If we do not address this problem in the
short term, we will need to increase the resources needed to handle the ensuing incidents and
rework. Consequently, we may miss customer deadlines potentially resulting in lost revenue, SLA
penalties, lost business, and further damage to our quality reputation.

How can we halve the number of release failures in the next two months?

Three associated hypotheses that could be tested while investigating this problem could be:

1. We think that our ability to test changes is not good enough.

2. We believe that the adherence to the change and release process is inconsistent across the
IT organization.

3. We think that the technology supporting certain software development and release processes
is unstable.

4.2 Validating the Problem

As we indicated earlier, people have an automatic ability to state that there are flaws.

In fact, if we take a look around an average IT organization, we will be able to find a
seemingly unending list of problems. This may seem somewhat disheartening and there are
certainly examples of IT organizations where there are so many (perceived) problems that people
give up making the effort to solve them, and resort to a fire-fighting mentality. This is not
the Lean way.

As we said earlier, in Lean IT, our mindset is that we accept that IT is filled with problems
and we must act to solve the problems on a continuous basis. More than anything this is a
behavioral aspect. Traditionally, management has a tendency to want to hear the good news and a
smattering of problems is OK as long as they are on the way to being solved. Lean IT is about
seeking out problems and understanding what impact they have on the IT organization’s ability
to deliver high-value products and services to its customers.

Earlier in this document, we looked at where problems can be found. The four ‘voices’ tell us
where things are going wrong and when we need to actually solve a problem.

For now, let’s look at the two voices that are most likely to provide us with problems:

Voice of the Customer

Customers of IT have three basic requirements:
•• Make sure my IT services work
•• Give me new IT capabilities as and when I need them
•• Give me advice on the new or better usage of IT
Each of these statements is the potential source of a problem statement. The fact that the
customers of IT are confronted with incidents is a problem to be solved. The fact that IT
organizations can take an excessive amount of time to deliver new functionality to their
customers is also an ongoing problem within the vast majority of IT organizations. And
providing high quality advice in a timely manner is not always possible. So we cannot deny
that, in a generic sense, IT organizations have problems to solve.

We are, however, particularly interested in the specific problems within our own IT
organization. And this means taking a detailed look at the Voice of the Customer. We will go
into more detail on the Voice of the Customer later in this chapter.

Voice of the Process

The second most obvious place to look for problems is in the processes flowing from the
customer into the IT organization, or looking at the internal processes of an IT organization.

This is where we find the link to the well-known IT frameworks. ITIL, ISO/IEC 20000 and COBIT,
as prime examples of IT frameworks and standards, define the future states of IT organizations,
the ideal situations. By matching these ideal situations to your current situation, you can
undoubtedly find discrepancies. These frameworks take a variety of views on the IT
organization. However, in essence they aim to ensure that the processes work reliably and
effectively. We can therefore use these frameworks to understand how far an IT process is from
its ideal state. Thanks to the fact that these frameworks describe the ideal state very
specifically, it is quite easy to create a problem statement.

The key aspect is, therefore, not so much finding a problem as determining which problem to
tackle.

4.3 Types of Problems

There are many problems and in order to solve them, it’s essential to know the characteristics
of the problem. In 2002, Dave Snowden published the Cynefin (Welsh for ‘habitat’) model, in
which he categorized decision making into one of five types. Decision making is directly
related to the underlying problem about which a decision must be made. We can, therefore, use
the same categorization to identify the type of problem we are dealing with:

The Cynefin model contains the following five types: simple, complicated, complex, chaotic, and
disorder.

The first type of problem is the simple (or obvious) type. This is a problem that is caused by
the fact that the rules have not been followed. The relationship between cause and effect is
obvious to (almost) everyone; it is reproducible, repeatable, and predictable. These problems
can be solved with a Standard Operating Procedure (SOP). As long as the SOP is used, the
problem should not reoccur. Unfortunately, simple problems can be underestimated by experts. We
regularly see this within IT organizations when IT engineers find they need more time to solve
a particular incident than they had previously thought, even though the incident was relatively
innocent. The engineer may oversimplify the solution, leading to an incomplete solution for the
user of the IT service. These problems are often seen as operational problems. Obvious problems
are particularly suited to daily Kaizen.

The second type of problem is the complicated problem. The relationship between cause and
effect requires analysis, which implies that expert knowledge is necessary. Having said this,
the problem does follow rules. However, the rules may be more difficult than expected. This
type of problem can be solved by using good practices, scenario-planning, and system thinking.
Once understood, the rules for resolution can be defined and followed. We find this kind of
problem within the technology of IT. Although we sometimes do not understand what happened, by
investigating trends and carrying out analysis, we can understand what went wrong and how to
solve the problem (often by rolling back a change). There is a right answer that can be found.
These are the types of problems where technical experts may disagree because they focus on the
elements of the problem they recognize. Generally, these problems are seen as tactical
problems.

Simple and complicated problems always require analysis, that is, breaking the problem down
into a sequence of technical events. This is one of the reasons why it is important to record
what activities have been carried out within an IT organization as it makes understanding the
causes of simple and complicated problems easier and quicker.

Complex problems are problems for which the cause and effect are explainable in

26
retrospect. The issue has not been seen before and there are no known solutions or best practices
available. The team needs to look for completely new ways to solve the problem. This kind of
problem does not repeat in exactly the same way; outcomes may be unforeseen and patterns emerge
over time. To understand these problems, these patterns must be investigated. This means learning
while solving the problem. This type of problem is particularly related to the human factors within
IT organizations. We may need to carry out several different experiments to understand the dynamics
of the problem and find a solution. There is not necessarily a single right answer and we can use
guidelines to solve the problem. Generally seen as more strategic decisions and problems, they tend
to affect social systems, rather than technical systems. The Define session described at the end of
this chapter is an example of a complex problem.

Lastly, there are chaotic problems. With these problems, no cause and effect relationship is
directly perceivable. These problems are not detectable before the fact, there are no clear answers,
and there are elements of the problem that we cannot know while it is happening. These problems
require crisis management that focuses on relieving symptoms to create some kind of stability.
Typically, a leader will have to act quickly, based on the information available, to stabilize the
situation in order to buy time for experimentation. In order to tackle chaotic problems, we need to
use principles. The crisis team must act to cause change to the existing situation. All action must
be aimed at trying to create order. When taking action, it is important to do a risk analysis of
the action (use Failure Mode Effect Analysis) to understand what its consequences could be.

As opposed to simple and complicated problems, complex and chaotic problems require synthesis. The
team solving these kinds of problems needs to investigate how factors and symptoms interact to
create the problem.

The final area to address is disorder. This is the situation in which we do not know into which of
the four other categories the problem falls. Causality is not understood at all. From disorder, we
can use the Cynefin model to determine which state the problem is in and then act accordingly. The
danger with disorder is that experts see the problem’s symptoms as being part of a simple type of
problem. This may cause the problem to be underestimated or incorrectly diagnosed.

Generally, within IT, we use Kaizen to investigate complicated and complex problems. If a problem
turns out to be simple, the solution will probably be found during the Define phase. We generally
refer to the solutions found in the Define phase as quick wins.

4.4 Validating the Value of Solving the Problem

The best Kaizen selection is based on identifying the problems that best match the current needs,
capabilities, and objectives of the IT organization, related to the Voices.

Each problem needs to be investigated at a high level from three perspectives:
•• Results for the customer or business benefits
•• Feasibility
•• Organizational impact
Remember: It is important to not get lost in a ‘mini-Kaizen’ when investigating which problem needs
to be solved. It is about matching the aforementioned criteria in order to create a broadly
prioritized list of possible Kaizen
initiatives. Each time a Kaizen initiative must be selected, the previous prioritization needs to
be reviewed to ensure that it is still valid. Kaizen candidates on the list may in the meantime
have a lower priority due to a series of daily Kaizen actions, or as a result of changes in the
customer’s environment.

Results or Business Benefits Criteria

First, the sponsor (often together with the Kaizen team) will carry out an assessment of the
benefits that will be achieved if the problem is solved. Aspects to be considered and questions to
be asked may be:

•• Impact on External Customers and Requirements: How beneficial is solving the problem for our
customers?
•• Impact on Business Strategy: What value will this potential Kaizen have in helping us to realize
our business vision or improve our competitive position?
•• Financial Impact: What is the expected cost reduction, improved efficiency, increased sales, or
market share gain?
•• Urgency: What kind of lead time do we have to address this issue or capitalize on this
opportunity?
•• Trend: Is the problem or opportunity getting bigger or smaller over time and what will happen if
we do nothing?
•• Sequence or Dependency: Are other possible initiatives or opportunities dependent on dealing
with this issue first?

Feasibility Criteria

The sponsor and Kaizen team members must try to understand what effort must be expended to solve
the problem. The following aspects may be investigated:

•• Resources Needed: How many people, how much time, and how much money is this Kaizen event likely
to need?
•• Expertise Available: What knowledge or technical skills will be needed for this event and are
they available?
•• Complexity: How complicated or difficult do we anticipate it will be to develop the improvement
solution and implement it?
•• Likelihood of Success: Based on what we know, what is the likelihood that this Kaizen event will
be successful in a reasonable timeframe?
•• Support or Buy-in: How much support for this Kaizen can we anticipate from key stakeholders
within the value stream and will we be able to make a good case for doing this Kaizen event?

Organizational Impact Criteria

Lastly, the team must look at whether solving a particular problem will provide the organization
with the following additional benefits:

•• Learning Benefits: What new knowledge can be acquired from conducting the Kaizen?
•• Cross-functional Benefits: To what extent will this event help to break down barriers between
groups in the organization and create better collaboration in the entire value stream?
•• Core Competencies: How will this possible Kaizen event affect our mix and capabilities in core
competencies?

A useful question to ask is: what will happen if we do not solve this problem, but solve a
different problem instead?

Problems in IT

Within IT, we find many different types of problems to solve. Some may seem trivial, whereas others
are clearly quite significant. In the table below, we present a number of typical IT problems. This
list is merely a selection and by no means complete.

Problem: Technical performance problems
Explanation: This problem may come in a multitude of forms. Every piece of technology (hardware or
software) may be a source of problems. Often, a piece of technology does not so much fail as just
not perform well for any number of reasons.

Problem: ‘Fire-fighting’, focus on solving incidents rather than structural resolution
Explanation: IT organizations seem to have the time to repeatedly solve incidents, but do not make
the time to remove the sources of these incidents. This leads to a highly ad hoc way of working, in
which the number of incidents (both per unit of time and open at any one moment) continuously
creeps higher.

Problem: Balance operational and change work
Explanation: The classic statement here is: “I couldn’t complete the change on time because I had
to solve an incident”. The key issue is that IT people are involved in all sorts of work, not just
a single type.

Problem: Releases or ‘technical weekends’ that cause problems the following work day
Explanation: A key question within IT organizations is: Why do changes need to lead to more
incidents? It should be possible to implement changes without causing further disruptions.

Problem: Planning and execution of work
Explanation: This causes a huge amount of stress within IT organizations, as poor planning has a
correlation with switching priorities.

Problem: Collaboration between development and operations, or applications and infrastructure
Explanation: This is probably one of the most classic problems within IT organizations, departments
that throw work ‘over the wall’ to each other.

Problem: Changes applied without informing users
Explanation: IT organizations and people are not renowned for their ability to communicate.
However, this is a skill that must be mastered especially in a world where IT services are
pervasive.

Problem: Constantly changing priorities
Explanation: This causes context switching and work to be left incomplete (particularly the
documenting).

Problem: Focus on achieving SLA KPIs
Explanation: Engineers no longer focus on providing a great service but only look at whether they
are achieving the numbers.

Problem: Shared resources, dependency on specific individuals
Explanation: IT people, especially the experts, are often required to be in multiple places at one
time. They are allocated to multiple projects and continue to have a role in the operations. This
often causes huge delays and highly stressful situations, leading to errors.

Problem: Lack of availability and capacity planning
Explanation: Every IT organization knows that it should plan for the future, and understand how its
services will perform given the projected developments in their customer’s organization. Very few
actually do, leading to network capacity problems, disk space incidents, insufficient processing
power, or poor human resource planning.

Undoubtedly, while reading the list, you will have recognized some problems; others may not be
relevant in your IT organization. There will also be problems for which there are ‘standard’
solutions (“if you use [standard IT solution], you can solve the problem”). Although the above
problems may be recognizable, their specific causes are diverse and differ per IT organization.
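
The three perspectives used in this section (results, feasibility, organizational impact) can be
combined into a rough prioritization aid. The following is a minimal sketch, not a prescribed
method: the 1 to 5 scoring scale, the equal weights, and the example scores are all assumptions
made for illustration.

```python
from dataclasses import dataclass

# Assumed weights: the text names the three perspectives but does not say
# how to weigh them against each other, so equal weights are used here.
WEIGHTS = {"results": 1.0, "feasibility": 1.0, "organizational_impact": 1.0}


@dataclass
class KaizenCandidate:
    name: str
    results: int                # customer or business benefit, 1 (low) to 5 (high)
    feasibility: int            # 1 (hard to do) to 5 (easy to do)
    organizational_impact: int  # learning and cross-functional benefit, 1 to 5

    def score(self) -> float:
        """Weighted sum of the three high-level perspectives."""
        return (
            WEIGHTS["results"] * self.results
            + WEIGHTS["feasibility"] * self.feasibility
            + WEIGHTS["organizational_impact"] * self.organizational_impact
        )


def prioritize(candidates):
    """Return the candidates ordered from highest to lowest score."""
    return sorted(candidates, key=lambda c: c.score(), reverse=True)


# Illustrative candidates; the scores are invented for the example.
candidates = [
    KaizenCandidate("Fire-fighting instead of structural resolution", 5, 3, 4),
    KaizenCandidate("Changes applied without informing users", 3, 5, 2),
    KaizenCandidate("Lack of availability and capacity planning", 4, 2, 3),
]

for candidate in prioritize(candidates):
    print(f"{candidate.score():4.1f}  {candidate.name}")
```

The resulting list is deliberately coarse. It is meant to be re-scored each time a new Kaizen is
selected and to support, not replace, the discussion between the sponsor and the Kaizen team.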

4.5 Ensuring Support for a Kaizen

As indicated earlier, a Kaizen without support must not be attempted. We saw earlier that the Kaizen
sponsor is an absolutely indispensable role. Where the sponsor and the problem owner are separate
people, the problem owner must also be identified. These two parties play a critical role in the
acceptance of the solutions.

Much attention is paid to sponsorship of the Kaizen. However, in the end, if the people on the work
floor, the primary stakeholders of any problem to be solved, do not see the point in solving the
problem, then choose a different problem. If the work floor is not convinced that the problem needs
to be solved, then the acceptance of any solution will be very low.

Other stakeholders obviously include people upstream and downstream of the place where the problem
is identified. In pretty much every case, the customer will have an interest in the resolution of
the problem, be it from a qualitative perspective or a cost perspective. Looking at IT
organizations, the vast majority of them are internal to a business, governmental organization, or
an NGO (Non-Governmental Organization). These parent organizations have a vested interest in cost
reduction and/or quality improvement. The key question is whether they need to be directly involved
in the execution of the Kaizen.

4.6 Stakeholder Analysis

Carrying out a stakeholder analysis is all about understanding where the various people involved
stand on a particular issue, and what impact their view has on the success of addressing the issue.
Not all stakeholders should necessarily be involved in the actual Kaizen event. Some stakeholders
provide input or data into the Measure phase, others need to be kept informed, and others need to
be actively involved in the actual meetings.

In order to understand stakeholder positions, we must first define the issue on which we wish to
understand the positions of the various stakeholders (individuals or groups). In the case of a
Kaizen, the issue would be the problem at hand. Stakeholders are either directly or indirectly
involved with the issue, a distinction often referred to as the “chicken or pig dilemma”. When it
comes to having a cooked breakfast, the chicken is indirectly involved by having to provide an egg;
the pig is fully and directly involved as it needs to deliver the bacon. In the stakeholder
analysis, the sponsor and Kaizen team will need to understand which stakeholders fall into which
category.

On top of that, they will need to identify whether a stakeholder is positive, negative or neutral
regarding the problem, and whether the stakeholder has a strong and explicit opinion on the subject
or is not outspoken. The stakeholder’s influence must also be investigated: do they have formal or
informal power regarding the problem? Adjust the analysis regularly throughout the Kaizen to
understand how the stance of stakeholders changes.

Based on the influence (power or impact) and involvement (or interest), we can readily identify types
of stakeholders.

For each of the types of stakeholders, we see that there is a communication strategy involved in
keeping the stakeholders engaged with the problem.
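
As a minimal sketch of how such a classification could be recorded: the four type labels below
follow the commonly used power/interest grid and are an assumption, since the text does not name
the stakeholder types. The 1 to 10 scale, the threshold, and the example stakeholders are likewise
hypothetical.

```python
from dataclasses import dataclass

THRESHOLD = 5  # assumed cut-off between "low" and "high" on a 1 to 10 scale


@dataclass
class Stakeholder:
    name: str
    influence: int    # power or impact, 1 (low) to 10 (high)
    involvement: int  # interest in the problem, 1 (low) to 10 (high)
    stance: str       # "positive", "negative" or "neutral"


def communication_strategy(stakeholder: Stakeholder) -> str:
    """Map influence and involvement to a strategy (assumed grid labels)."""
    high_influence = stakeholder.influence > THRESHOLD
    high_involvement = stakeholder.involvement > THRESHOLD
    if high_influence and high_involvement:
        return "manage closely"
    if high_influence:
        return "keep satisfied"
    if high_involvement:
        return "keep informed"
    return "monitor"


# Invented example entries; re-assess them regularly during the Kaizen.
stakeholders = [
    Stakeholder("Problem owner", influence=9, involvement=9, stance="positive"),
    Stakeholder("Upstream team leader", influence=8, involvement=3, stance="neutral"),
    Stakeholder("Engineer on the work floor", influence=4, involvement=9, stance="negative"),
    Stakeholder("Downstream department", influence=2, involvement=2, stance="neutral"),
]

for s in stakeholders:
    print(f"{s.name}: {communication_strategy(s)} (stance: {s.stance})")
```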

4.7 Define Phase and A3

The result of the Define phase is that the “Background” section of the A3 can be completed, by
answering the following questions:

1. What is the problem?
Who has the problem?
2. What is the scope of the problem?

The questions may seem simple but can take considerable time and effort to answer accurately.

4.8 Key Steps in the Define Phase

We have looked at a number of aspects of the Define phase. Bringing these together, we find that
there are a number of steps that can be taken to complete the Define phase. These steps are as
follows:

1. Problem Selection and Owner Identification

Use the criteria to determine whether a problem is significant enough to warrant solving in the
short term. Always ensure that there is a person who owns the problem and sponsors the Kaizen event
(the Kaizen sponsor). It is vital that the sponsor is serious about solving the problem. The
owner/sponsor must be able to maintain the drive to solve the problem. This is an important reason
why the Kaizen event must be kept as short as possible: there will always be another problem around
the corner that clamors for attention. This step may already have been sufficiently addressed in
preparing the Kaizen.

2. Problem Statement and Kaizen Team Selection

Create a problem statement, complete the background section of the A3, and select the right team
members. Select the Kaizen team members using the stakeholder analysis. All team members must agree
on the problem to be solved. If there is no agreement, then it is possible that there are two
different problems that need to be solved, or stakeholders have been incorrectly identified. The
Kaizen lead may use an Ishikawa diagram as a visualization tool for collating symptoms of the
problem, even though it is usually used for analysis of factors causing the problem. The Kaizen
lead can also use the 5 Why technique to understand and scope the problem.

3. Validate Scope of the Problem

Once we have defined the problem, we must validate its scope. The team must understand whether it
is reasonable to expect the problem to be solved. To do this, we draw a SIPOC for the process in
which the problem occurs. The team may need to adjust the scope based on insights gained from the
SIPOC. During this step, it is useful to understand what type of problem needs to be solved. This
will be an indicative typology to guide the team’s resolution efforts. It is very important not to
jump into a problem too quickly. It takes time (sometimes up to 3 hours) to define the problem and,
particularly, to gain agreement among the team members. The time is often spent learning to listen
to one another. Team members have the tendency to repeat themselves with different words. Here, the
role of the Kaizen lead is extremely important. He or she must ensure that team members take the
time to listen; often the managers in the team have the most difficulty with this aspect. A
technique that the Kaizen lead can use is to ask the person wanting to say something to repeat what
the previous speaker has said, in his or her own words. This technique ensures that the previous
speaker feels heard and the following speaker must address what has been said.

4. Collect Voice of the Customer Information

Having understood the scope of the problem, we need to bring together the Voice of the Customer
information that is relevant to this specific problem. Use the Critical to Quality (CTQ) to collate
and structure the information. Note that you may well have selected your problem based on feedback
from the customer. Having validated, and possibly adjusted, the scope of the problem, you may need
to go back to the customer for more specific requirements and wishes regarding the problem at hand.
The team must formulate specific questions for the customer. Likewise, if the problem is based on a
signal from the Voice of the Business or the Voice of the Regulator, the team must formulate
clarification questions for the business or regulator.

5. Create High Level Plan

You will need to agree on a plan for the execution of the Kaizen event. This will include
practicalities, such as the availability of team members and the sponsor, availability of meeting
facilities, and agreement on the main deadlines. This needs to be done during the Define meeting.
All participants must clear their agenda to ensure that the Kaizen can be completed in a short
period of time. Within IT organizations, this can be a considerable issue since IT people can
rarely step away from their work to do a Kaizen full time. This is certainly one of the key
challenges of doing Kaizen in an IT organization. A suitable strategy is to plan five meetings (one
for each phase) and ensure that there are two days between the meetings, so that actions can be
carried out. The meetings should be planned as 3-hour meetings. The duration can always be adjusted
if the goals for the meeting have been met. The plan should also include how the (interim) results
should be communicated.

It is critical that all aspects of the Define phase are completed prior to moving to the Measure
phase. If the problem statement is not fully defined and agreed by the Kaizen team, there will be
no basis to complete the following phases. This invariably leads to going back to the Define phase.
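
The meeting strategy in step 5 (five meetings, one per DMAIC phase, two days apart, three hours
each) can be sketched as a simple schedule. This is only an illustration: the start date is
invented, and the gap is counted in calendar days, so weekends and team availability are
deliberately ignored.

```python
from datetime import date, timedelta

PHASES = ["Define", "Measure", "Analyze", "Improve", "Control"]
MEETING_HOURS = 3  # from the text: plan 3-hour meetings
GAP_DAYS = 2       # from the text: two days between meetings


def plan_meetings(start: date, gap_days: int = GAP_DAYS):
    """Return (phase, date) pairs with `gap_days` free calendar days in between."""
    step = timedelta(days=gap_days + 1)
    return [(phase, start + i * step) for i, phase in enumerate(PHASES)]


# Hypothetical start date for the Define meeting.
for phase, day in plan_meetings(date(2016, 3, 7)):
    print(f"{day.isoformat()}  {phase:<8} {MEETING_HOURS}h")
```

Counting in working days instead would stretch the plan, which is worth checking against the
sponsor's expectation that the Kaizen is completed in a short period of time.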

4.9 Case Study: Define Phase

Within an IT organization of about 100 people, the teams were structured in such a way that the
technicians managing the core of the IT infrastructure (networks, servers, databases, operating
systems, etc.) were allocated to customer-oriented support teams. This allocation was based on the
number of hours required. For example, the three network engineers were allocated to four teams for
an average of 10 hours per team. This led to a situation of shared resources, which faced unmanaged
and unbridled demand. The result was a highly tense situation in which the management team was
dissatisfied with the performance of the shared resources, the team leaders felt they did not get
the service they had been promised and the engineers felt as though they were being pulled in all
directions by 5 team leaders and 3 management team members.

This situation had existed for about a year and had become explosive because neither the support
teams nor the infrastructure technicians believed they were getting or delivering high quality
services. The primary stakeholders were the infrastructure engineers, the IT management team
(IT MT), and the team leaders of the customer-oriented teams.

The situation was, in fact, so explosive that the mere mention of the problem to any one of the
stakeholders led to emotionally charged discussions about the quality and capabilities of the other
stakeholders.

The first step was to understand whether there was a desire within the IT management team to
actually solve the problem. The direct response was ‘of course’. To the question, ‘Are you prepared
to accept a solution defined in a Kaizen?’, there was a long silence. And here we find one of the
key issues within IT organizations starting on their Lean journey: the acceptance of the results of
a Kaizen. It is very challenging for IT managers to accept that the solution to a non-technical
problem (as in this and many other cases) may be provided by the work floor. Eventually, one of the
members of the IT management team said he would act as sponsor for the Kaizen, even though all MT
members could be defined as problem owners in their own right. The other three agreed they would
accept the result, as well.

The next step was to define the problem. Due to the emotionally charged nature of the discussion,
we started with a pre-Kaizen session with each of the three key stakeholder groups. In each of the
sessions, the stakeholders were challenged to define their perception of the problem. They were
also challenged to define the problem from the point of view of the other stakeholders.

After the exploratory sessions, a first Kaizen session was organized, in which delegates from the
three stakeholder groups came together. A total of nine people made up the Kaizen team for the
Define session. The Kaizen lead started the session by explaining the DMAIC procedure to be used
throughout the Kaizen. He also explained the goal for this session: to agree on the problem
statement to be solved. He stated the initial problem statement as defined by the problem
owner/sponsor, and also asked all the participants whether they were prepared to do what was
necessary to solve the problem, independent of personal preferences and opinions.

As a result of the pre-Kaizen information sessions, it was quite easy for the participants to
define the problem statement. It took a further two hours to finally gain full agreement among all
parties regarding the problem to be solved. The result was particularly interesting because the
problem statement was closest to the problem statement as stated by the infrastructure engineers.
It was not the case that management and team leaders used their hierarchical power to push through
their idea of the problem to be solved. Other cases have proved that a 2-hour session to define the
problem statement for a complicated or complex problem is no luxury. In fact, it is often very
necessary. This is because the team had to ensure that symptoms did not end up being defined as the
problem. The Kaizen lead had to continually keep the team focused on the goal and had to question
the team members when he felt the symptoms were being turned into the problem.

Having agreed on the problem statement, the team needed to agree on how much time they would take
to solve the problem. The urgency and impact dictated that the team needed to work as fast as
possible.

5 Measure Phase
The second phase in the DMAIC cycle is the Measure phase. In this phase, we refine the problem
statement based on measurement. The goal is to ensure that there is a detailed understanding of the
current situation surrounding the problem area. This is done by collecting reliable data on the
variables related to the problem. The aim is to provide information to help identify the underlying
causes of the problem.

5.1 Data
The first step is to define the data to be collected. The data obviously needs to be related to the
problem statement.

5.1.1 Variables
During this phase, we need to fully understand the role of variables in the resolution of problems.
There are essentially three types of variables:

•• Independent variable: This is an input. In the case of problem solving, the independent variable
can be seen as something that may or may not contribute to the problem. The aim is obviously to
find the independent variables that have the greatest effect on the problem.
•• Dependent variable: This is the output; in effect, this is the problem.
•• Control variable: This kind of variable is particularly useful in experiments. This variable is
kept constant while others are changed so that they can be investigated.

The mathematical notation for the relationship between independent and dependent variables is:

y = f(x)

Where y is the dependent variable and x is the independent variable. The f means that the problem
(dependent variable) is a function of independent variables. In fact, a clearer notation would be:

y = f(x1, x2, x3, ..., xn)

In this equation, we see that the problem may in fact be caused by any number of independent
variables.

Our aim in the Measure and Analyze phases is to find the independent variables (the x’s) and
understand their impact on the problem (the y).

An example: There is a problem with the ability of a desktop support department to deliver laptops
to customers within the agreed time. This is the y. The equation could look like this:

y = f(laptop, knowledge of employees, process, software, holidays, sickness, …)

We would need to collect data regarding each of the independent variables to understand their
effect on the problem.

It is, therefore, our task in the Measure phase to determine the independent variables of the
problem.

5.1.2 IT Units of Work
One of the key characteristics that differentiates Lean IT from Lean in other areas of business is
the units of work. These units of work are the inputs for processes.

The standard units of work are derived to a certain extent from ITIL.

Unit of work: Description

Incident: A technical malfunction of the IT service affecting the customer

Service Request: A request from the customer, not being a technical malfunction

Problem: The root cause of incident(s)

Standard Change: Change that is carried out according to a checklist or Standard Operating
Procedure

Operational activity: Any activity necessary to keep the current IT service running, not being an
incident or service request. This category includes events, monitoring and other daily/weekly
activities that ensure the health of an IT service.

Non-standard Change: Any change not being a standard change

Advice: A document detailing options for a solution, based on a customer request

Plan: A document covering a course of action in the future (Availability, Capacity, Continuity,
Security)

Within Lean, three categories of units of work are identified. Within IT, there are two basic
criteria on which the categories can be defined: the number of hours worked and the process
dynamics.

•• Runners: These are units of work that occur on a daily basis and tend to require up to one hour
of work to complete. Within IT, incidents, service requests, standard changes, and operational
activities fall into this category. The dynamics of these processes are such that the work is
statistically predictable (per week), but its exact occurrence is not known. This work cannot be
planned as such, but time can be reserved for these units of work.
•• Repeaters: These units of work occur regularly; the indicative frequency is weekly. Within IT,
we find high impact incidents, small to medium sized non-standard changes and the smaller advisory
services. This category is partly plannable (advice and changes). However, the high impact
incidents require direct response, and therefore have a dynamic that more closely resembles that of
runners. Unfortunately, their impact means that solving the incident can require a different effort
than regular incidents.
•• Strangers: These are units of work that have an irregular occurrence. IT ‘strangers’ are large
non-standard changes, large requests for advice and plans, all of which tend to occur, or need
updating, on a monthly or quarterly basis.
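
As a rough illustration of the three categories, the sketch below classifies a unit of work purely
by how often it occurs. The thresholds are assumptions derived from the indicative frequencies
above (daily, weekly, monthly or quarterly), the weekly volumes are invented, and the second
criterion (hours of work per unit) is left out for brevity.

```python
def classify_unit_of_work(occurrences_per_week: float) -> str:
    """Classify a unit of work as runner, repeater or stranger by frequency.

    The cut-off values are assumptions, not values given in the text.
    """
    if occurrences_per_week >= 5:   # roughly daily or more often
        return "runner"
    if occurrences_per_week >= 1:   # roughly weekly
        return "repeater"
    return "stranger"               # irregular: monthly, quarterly


# Hypothetical weekly volumes, invented for illustration.
weekly_volumes = {
    "Incident": 120,
    "Standard change": 25,
    "Small non-standard change": 2,
    "Capacity plan update": 0.25,   # roughly once a month
}

for unit, volume in weekly_volumes.items():
    print(f"{unit}: {classify_unit_of_work(volume)}")
```

A frequency-only rule would not capture the point made about high impact incidents, which count as
repeaters but behave like runners, so any real classification needs the process dynamics as well.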
5.1.3 Technical Data
The other type of data (alongside the units of work) is technical data. This is the data that helps us to
understand the ‘behavior’ of the technology. This type of data includes entities, such as:

•• Log Files: Files in which activities (normal and abnormal) of individual IT hardware and,
particularly, software components are recorded
•• Monitoring Data: Data and information from monitoring tools, including alerts based on
thresholds
•• Technical Performance Data: Data about CPU, memory, network speed/capacity, and storage usage

5.1.4 People Data
The final type of data required to manage an IT organization is time, for the following reasons:

•• People Factor: Time represents the people factor, especially within IT where time usage is
absolutely critical in the sense that we have limited time and more work. Making the best use of
available time must be a top priority within the steering mechanisms within an IT organization.
•• Skills and Knowledge Capabilities: In order to fully understand the capabilities of the IT
organization, it is insufficient to know how much time is available. We must also know what
knowledge is available and in what quantities.

5.1.5 Go to the Gemba and Find Data
One of the most important aspects of collecting data is to understand its context. In Lean, we
understand the context by going to the Gemba, the place where the work is done. The key question
is: what do you look for when you go to the Gemba?

The key is to look at how specific data is used within the organization. This means going to the
place where the data is used, understanding the usage and purpose. Observe the person using the
data and ask questions to clarify how the data is used.

Going to the Gemba, observing the work being done, and understanding how data is used can help to
determine what other data should be recorded. It may also reveal data that is unused or not
required.

A key aspect of Lean is the definition of standard work, or the desired state of any activity or
process. Once standard work is defined for processes which have a repeatable nature, the activity
of going to Gemba can focus on looking for variances from the pre-defined standard.

However, standard work does not imply that it will remain static. All value systems evolve over
time and require improvement using both the daily Kaizen and Kaizen event models. Another key
aspect of going to Gemba is to observe whether improvement is occurring against standard work. In
short, it is difficult if not impossible to improve something that has not been stabilized based on
a previously defined desired state.

5.2 Measurement Systems

Essentially two types of data can be collected: quantitative data and qualitative data. These two
types of data require different measurement systems in order to collect the data meaningfully.

Quantitative data is numerical data that is always expressed as a number. Qualitative data is
expressed in non-numerical terms. Often qualitative data is converted to numerical data by giving
it a scale on which answers are scored as numbers. The data, however, remains qualitative, since
the distance between the numbers on the scale is not something we can measure.

The key aspect of any measurement system is that it must be accompanied by one or more visits to
the Gemba to check whether the data is being interpreted correctly, and to understand the context
within which the data is generated and its variance from standard work.

5.2.1 Quantitative Measurement Systems
Quantitative measurement systems are used to gain objective insight into the performance of a
particular entity. Within IT, we have one source of quantitative data.

•• Automated Data Collection: Most systems register data in the course of their operation. In some
cases, the amounts of data are quite substantial. This data is generally held in log files that can be
consulted to find out when something happens.
When using quantitative measurement systems, it may seem as though the data is objective.
However, there are questions that need to be asked regarding the quality of the data.

Collection method   Reliability issues                   Remedy

Automated Data      Have we correctly understood what    Ensure that you have skilled technicians
Collection          the automated data is telling us?    who can explain what the system says

                    A lot of data does not necessarily   Use automated data as a source for the
                    mean the information                 confirmation of hypotheses

Setting up a Measurement Procedure (automated data collection)

The first step is to ensure that the way the measure is calculated is unambiguous:

•• Definitions of units measured are clear and not open to misinterpretation
•• The way the calculation is carried out is understandable
•• Exceptions are documented in detail

Secondly, the fields within the database from which the data is taken must be clearly defined:

•• Name of the field in the named database
•• If the field is filled in an application, ensure that it is clear which field is used
•• Ensure the field is suitably restricted to ensure input is always valid

Lastly, create an auditable automated routine to ensure that the correct data is selected and
transformed to the outcome of the measure.

Lean IT has many quantitative measures, many related to value stream mapping. Examples include
measuring the lead time of the units of work captured in a service management tool, the number of
units of work registered, and the technical data on system performance. In all cases, the definition of
each piece of data must be identified and validated.

5.2.2 Qualitative Measurement Systems


Qualitative measurement systems principally measure capability or maturity from the perspective
of the people involved. Through the data collection method, qualitative measurement systems do
attempt to create objectivity in the subjective data. This can be done by using a framework of criteria;
most maturity models work on this basis. However, the maturity model is also based on human
perception. This means that qualitative measurement systems are always open to bias, be it based on
the questioning or on the answers. Three forms of qualitative measurement system are as follows:

•• Annotated Observation: This basically means watching what happens and noting the number
of times something happens, the amount of time spent on a task, the number of errors made
in finished products and other such observable occurrences. The tool often used here is the
check sheet.
•• Interview: One of the preferred methods of gathering information is through interviewing
people involved or people associated with the aspect that is being investigated. An interview
can involve one or more people involved with the subject matter. Generally, multiple points of
view are sought when gathering information through interviews.
•• Registration: During the course of work in an IT organization, data is recorded on work units
(incidents, changes, problems, and service requests). This data provides valuable input for
understanding how the organization performs regarding these units of work. In registering units
of work, the system records time stamps and other data either automatically or as a result of the
action of a user. As a result of the dependency on human action, registration is seen as qualitative
rather than quantitative.
These methods may be combined, and may take the form of asking the person or people being
observed questions about their activities. However, interrupting the work to ask questions does
affect concentration and the overall performance.

Collection method   Reliability issues                         Remedy

Annotated           Are we watching a representative set       Observe at several different moments
observation         of actions?

                    How does the fact that we are              Observe multiple subjects
                    observing impact the performance?

Interview           House: “Everybody lies” … except the       Check statements with evidence,
                    interviewee, of course                     preferably data

                    Information from an interview is           Use multiple interviews
                    always biased

                    Involved means having an interest in
                    the outcome of the interview

Registration        Is everything registered and               Check the quality of the data by a
                    classified correctly?                      visual check of the raw data

                    Does the registration tool allow           Do not rely on the tool to provide
                    access to the desired information?         information; get a database
                                                               administrator to extract data

Setting up a Qualitative Measurement Procedure (annotated observation and interview)

In order to create a valid qualitative measurement system, ensure the goal of the observation or
interview is clearly formulated. Define the framework against which answers will be checked.

Determine the questions to be answered and ensure the answers can be unambiguous. With
observation, this can be relatively simple, using “Yes” or “No” type answers, counts or time
measurements. During interviews, answers will often be narrative. Answers must be clearly noted.
Ensure that answers can be, and are, recorded in a way to ensure that processing the answers to
a suitable result of the measurement is possible. Both the processing and the raw data must be
auditable.

Lean IT examples of qualitative measurements are collecting data for Voice of the Customer or Skills
& Knowledge analysis. In both cases, the information is based on the opinions of the people
involved.

Collecting VoC information can be a complicated process, if you are aiming for a detailed view of
the VoC. The easiest way to start collecting VoC data is for the team to ask three simple questions
concerning the problem:

•• How does [problem area] create value for you?
•• What goes well in [problem area]?
•• Where can the situation be improved?

Setting up a Qualitative Measurement Procedure (registration)

Follow the same procedure as with automated data collection.

5.3 Baseline and Benchmark

Baselines and benchmarks are necessary to understand the relative value of the performance.

•• A baseline is the measurement of a situation in order to understand whether a change occurs
based on an intervention after the baseline has been set. This is particularly useful in Kaizen
because we are very interested in the effect of changes that have been implemented in the IT
organization. It is vital that during the Measure phase, a baseline is set which can be used to
measure progress.
•• A benchmark is a standard or set of standards used in evaluating the performance or
level of quality of an organization. Benchmarking is a measurement used to compare the
organization’s position in relation to other organizations. Benchmarking can also be done
between teams within a single IT organization. Benchmarking may be used during a Kaizen
to understand how well others perform a particular activity. This may help to identify what
improvements are possible.

5.4 Value Stream Map

A number of metrics and calculations are required to help measure a Value Stream. Many of these
were introduced in the Lean IT Foundation publication.

VSM Metrics

The following metrics help the Kaizen to prepare data such that it can be used in the Analyze phase.

Metric Explanation

Lead time The time between the moment the customer submits their request
to the time they receive the requested item or service

Takt rate Volume of customer demand per time period (takt time is the
inverse of this number)

Changeover time Time needed to change from processing one unit of work to
processing a different one. Within IT, this is the time lost due to
context-switching. This is a type of waiting time

Queue time The time a unit of work is in a queue. This is a type of waiting time.

Machine time The time a unit of work is being processed by a machine. This is a
type of waiting time.

Work-in-process The number of uncompleted units of work that are still in the
process. This number is directly related to the lead time (Little’s
Law)

Capacity The maximum amount of output that the process can deliver over a
period of time

Throughput The actual amount of output over a period of time. This is invariably
lower than the capacity as a result of waste.

VA / NNVA / NVA time Time spent on Value Add (often referred to as cycle time), Necessary
Non-Value Add and Non Value Add activities

VSM Calculations

The most essential calculations in a Value Stream Map are Process Cycle Efficiency (PCE) and
Little’s Law.

Process Cycle Efficiency = VA time / Process lead time

Little’s Law helps us to understand the relationship between lead time and work-in-progress.

Little’s Law: Lead time = The number of units of work in the process (WIP) / average completion rate

These calculations can be done over the entire process, but also for each process step. This helps to
create a richer picture of the dynamics of the process, and identify where the issues exist.
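
Both calculations can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the example figures (3 hours of VA time against a 40-hour lead time, 30 units of work in process, 5 units completed per day) are assumptions made for the example and do not come from this publication.

```python
def process_cycle_efficiency(va_time, lead_time):
    """Return PCE: the fraction of the process lead time that adds value."""
    if lead_time <= 0:
        raise ValueError("lead time must be positive")
    return va_time / lead_time


def littles_law_lead_time(wip, completion_rate):
    """Return the average lead time implied by Little's Law.

    wip: number of units of work in the process
    completion_rate: average number of units completed per time period
    """
    if completion_rate <= 0:
        raise ValueError("completion rate must be positive")
    return wip / completion_rate


# Hypothetical figures, for illustration only.
pce = process_cycle_efficiency(va_time=3.0, lead_time=40.0)
lead_time_days = littles_law_lead_time(wip=30, completion_rate=5.0)
print(f"PCE: {pce:.1%}")
print(f"Average lead time: {lead_time_days} days")
```

The same two functions can be applied per process step by passing the VA time, lead time, WIP, and completion rate of that step only.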

5.5 Measure Phase and A3

At the end of the Measure phase, the Kaizen team should be able to complete the Current Condition
and Future State goals sections of the A3.

5.6 Key Steps in the Measure Phase

Bringing together the important aspects of the Measure phase into a series of steps, we find the
following points to take into account when carrying out the Measure phase of the Kaizen.

1. Identify the Outputs and Inputs of the Process in Which the Problem Occurs

Problems invariably have an effect on the output of one or more processes, since it is often the
recipient of the output who indicates that it does not meet expectations. Having defined
the output relevant to the problem, we need to define the input. This leads to an understanding of
which value stream causes the problem. Much of this work will have been done while making the
SIPOC in the Define phase. This step entails collecting the data concerning the inputs and outputs
of the process. Within IT organizations, it is not always clear that there is a process associated with
the problem. Take the case in the Define chapter. This was identified as a people problem, a time-
constraint problem, an attitude problem and many other types, but not as a process problem. In
the end, the key issues were identified by treating the issue as a resourcing process problem. This
allowed the emotion to be removed from the discussion.

2. Create a Value Stream Map of the Process

As we stated above, the value stream map (VSM) describes the current situation of the process at
this phase of the Kaizen. Describing the current situation may seem easier than it actually is.

Often, people working in the same process have different perceptions as to how the process actually
works. It is vital to go and look at the Gemba to see how the process is actually executed.

Within IT organizations, there is a tendency to say “we already have a process picture” when the
Kaizen lead recommends creating a VSM. The process document is usually quite old and is likely to
have been based on a process-oriented implementation. These documents are essentially useless for
three reasons: they describe a desired situation that has never been achieved, nobody knows the
document except the people who wrote it, and it distracts the team from creating a description of
how the process currently works.

Always take a clean sheet of paper when making an initial VSM.

3. Create and Execute the Data Collection Plan

Once we know what the process looks like, we can identify the independent variables that may affect
the problem. When we know the independent variables, we can define the data that needs to be
collected in order to investigate the problem.

There is a strong tendency to believe that IT organizations are difficult to measure. As a result,
Kaizen teams within IT may try to take a shortcut in the Measure phase. Often, the involvement of a
powerful sponsor will lull the team into a false sense of security.

Powerful sponsorship does not mean we do not need to collect the right data to support the
resolution of the problem. The data is a continuous reminder of how important it is to completely
solve the problem. However difficult collecting data may be, it still must be done. In fact, measuring
the various aspects of IT is not so difficult. The team just needs to be prepared to extract data from
databases. And if there’s one business that has people who know how to do this, it is IT.

4. Validate the Measurement System

We have already discussed possible inaccuracies within measurement systems. This is why the
Kaizen team must validate the measurement system(s) it uses. The idea here is to show that the data
collected can be reproduced and repeated, in exactly the same way.

Any assumptions made must be stated explicitly. Any manipulation of data must be documented and
explained so that it is clear on which premises the analysis is being done in the Analyze phase.

5. Assess the Capability and Performance of the Process

For each measurement we make, we must set a baseline. In the case of the IT units of work, we tend
to create time series charts and determine average performance over a defined period. Setting a
single data point as a baseline tends to create an arbitrary and highly contentious baseline.
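
A baseline built from a time series rather than a single point can be sketched as follows. The weekly lead-time figures and the function name are assumptions made for illustration; the sketch simply summarizes a defined period by its mean and spread, and refuses to treat one data point as a baseline.

```python
from statistics import mean, stdev


def baseline(series):
    """Summarize a time series as a baseline: mean and sample standard deviation."""
    if len(series) < 2:
        raise ValueError("a baseline needs more than a single data point")
    return {"mean": mean(series), "stdev": stdev(series), "n": len(series)}


# Hypothetical average lead time (in days) of incidents, one value per week.
weekly_lead_time = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0]
before = baseline(weekly_lead_time)
print(before)
```

After an improvement has been implemented, the same function can be run over a comparable period so that the two summaries can be compared.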

Although there are organizations that sell benchmark data and reports for considerable amounts,
there is also doubt as to whether benchmarking truly helps IT organizations. The problem is that IT
organizations are service organizations in which the factors influencing performance and cost may be
similar but may have very different effects from one organization to another. It is therefore vital to
baseline, whereas benchmarking is optional.

6. Identify Quick Wins

During the execution of measurements, it may become clear that there is a course of action that
everyone involved agrees on; a solution that can be implemented straight away. This so-called
quick win should be implemented as soon as possible. On rare occasions, a problem thought to be
complicated or complex may turn out to be simple.

As with the Define phase, it is vital to ensure that the key deliverables from this phase are completed
before moving on to the Analyze phase. In practice, it is almost impossible to think of everything that
needs to be measured. Often, further necessary measurements will emerge as a result of gaps in the
analysis. This should, however, not be a reason for rushing the Measure phase or too easily accepting
that something is not measurable.

5.7 Case Study: Measure Phase

Customers of an IT organization were highly dissatisfied with the service. A part of the IT organization
was responsible for carrying out network installations and changes. Delivery times were completely
unpredictable and were consequently experienced as too long. Expectations were not managed. In
essence, the team needed to process three types of requests, and also requests that included two or
three of the individual types of request.

Quite often, there would be a workload peak as a result of sales activities or management pressure.
This led to stress, because extra hours were put in. New requests balanced fulfilled requests, and the
backlog remained the same, causing intense frustration within the team. Unfortunately for the team,
the expectation was that the number of requests they needed to process would increase by 100% in
the next 2 years. These needed to be processed by the same people. Inevitably, this led to
despondency in the team since they were not able to keep up with the current workload.

The organization’s hypothesis was that processes were not implemented such that they would help
customers.

Having defined the problem, data needed to be collected. Most of it was available in the systems used
by the team. The data that was collected was a data dump of the previous 12 months of requests.
The data required was the date of receipt of the request, the starting date, the completion date and
the closure date. Also, the department responsible for the execution of the request was included
in the dump. Preliminary data processing provided new insights including the average lead time of
requests, WIP inventories, numbers of opened and closed requests per time period, and how long
requests spent on each status.

The data was validated and three key conclusions were drawn from this part of the Kaizen:

•• Data was incomplete: the start date of the work was not always available
•• Data was unreliable: the time stamp of status 2 was sometimes earlier than that of status 1
•• Data was not used to manage the process
This meant that the Kaizen team needed to be careful when drawing conclusions. It was also a trigger
for the operational team to improve their data registration.
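
Checks like those behind the first two conclusions are straightforward to automate. The sketch below is illustrative only: the field names (`received`, `started`, `completed`) and the sample records are assumptions, not the data from this case.

```python
from datetime import date

DATE_FIELDS = ("received", "started", "completed")


def validate_requests(requests):
    """Return (request id, issue) pairs for incomplete or inconsistent records."""
    issues = []
    for req in requests:
        # Incomplete: a required date is missing.
        for field in DATE_FIELDS:
            if req.get(field) is None:
                issues.append((req["id"], f"missing {field} date"))
        # Unreliable: the dates that are present must be in chronological order.
        present = [req[f] for f in DATE_FIELDS if req.get(f) is not None]
        if present != sorted(present):
            issues.append((req["id"], "dates out of order"))
    return issues


# Hypothetical records, for illustration only.
requests = [
    {"id": "R1", "received": date(2015, 3, 2), "started": date(2015, 3, 4),
     "completed": date(2015, 3, 9)},
    {"id": "R2", "received": date(2015, 3, 3), "started": None,
     "completed": date(2015, 3, 10)},
    {"id": "R3", "received": date(2015, 3, 5), "started": date(2015, 3, 4),
     "completed": date(2015, 3, 12)},
]
for request_id, issue in validate_requests(requests):
    print(request_id, issue)
```

Running such a check before any lead-time calculation makes explicit which records the conclusions rest on.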

Based on this data, the Kaizen team was able to construct a VSM. The VSM provided insight into
where the data was missing, particularly details about waiting times. These gaps were filled in using
a bespoke time registration sheet (in Excel) adapted to the specific situation. The sheet allowed
registration of NVA, NNVA, and VA activities. This time registration lasted for two weeks to ensure
representative data. The VSM further uncovered that there was no standard process for each of the
three basic request types.

During the Measure phase, the following quick wins were identified.

•• Daily (manual) measurement was instituted straight away. It was carried out by the team and
communicated twice daily in short stand-up meetings
•• Specified knowledge-sharing sessions were organized every day based on identified needs
•• Resource planning was initiated to reduce context switching, such that people were allocated
to a single task per day, thereby increasing their effectiveness. Rotation schemes ensured that
all team members became proficient at processing all types of requests

The data was enriched and processed into graphs and other graphics so that the Kaizen team could
close the Measure phase and proceed to analyzing the data in the Analyze phase.

6 Analyze Phase
The Analyze phase is aimed at getting to the root cause of the problem (finding the key x’s). From the
Define phase, we have a clear problem definition. This has been refined during the Measure phase
and data has been collected. The data will be processed to a certain extent during the Measure phase.

In the Analyze phase, the goal is to translate the data into information that will provide insight into
the key variables that have the greatest impact on the problem. By determining these key variables,
we will be able to provide input for the Improve phase, in which we will try to find the possible
actions that will reduce the negative impact of the variables.

In short, the Analyze phase is about the identification, quantification, interrogation, and prioritization
of the root causes of the problem we are investigating. We do this using a number of tools. There are
tools that help us make sense of the data we have uncovered during the Measure phase and there
are tools that help us to further decompose the problem into its constituent parts.

6.1 Seven Basic Tools of Quality

Since as early as the 1950s, there has been a list of the seven basic tools of quality. It is speculated
that Kaoru Ishikawa created the list as a result of exposure to the teachings of W. Edwards Deming.
Whatever the source, the list of the seven basic tools of quality has been standardized and is used
universally. The seven tools are: histogram, Pareto chart, scatter diagram, flow chart, control chart,
fishbone (Ishikawa) diagram, and check sheet.

We will investigate each tool and explain how these are constructed. For each, an IT example will be
provided.

6.1.1 Histogram
According to Webster’s online dictionary, a histogram is "a representation of a frequency distribution
by means of rectangles whose widths represent class intervals and whose areas are proportional
to the corresponding frequencies." In short, this means that we create a graph in which groups of
numbers are plotted based on how often they appear.

The power of histograms is that they allow us to analyze extremely large datasets by reducing them
to a single graph that can show one or more peaks in the data. The histogram also visualizes the
significance of the peaks.

Step Description

Step 1 Select a data set to be plotted. A classic example is the distribution of incidents
according to resolution time (that is, lead time).

Step 2 Collate the data into groups. In the case of incidents, we bundle them into time-
related groups, for example, incidents solved within 1 day, 2 days, etc. or less than
10 days, between 10 and 20 days, etc.

Step 3 Count the number of data points per group. Plot the groups onto a graph with the
number of data points on the vertical axis and the names of the groups on the
horizontal axis. The incident graph will have the number of incidents along the
vertical axis and the time intervals along the horizontal axis.

Step 4 Investigate the pattern that is depicted by the graph. Determine the cause of the
pattern. In the case of the incidents, we can easily see how long incidents have been
open and the distribution pattern according to age.

The above diagram shows an IT example of a histogram. This one shows the number of open
incidents with a certain age (time that they have been open).
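
Steps 2 and 3 of the histogram procedure can be sketched in Python. The incident ages below are invented for illustration, and the function name and the 10-day bin width are assumptions; a charting tool would normally do the plotting, so the sketch prints a simple text histogram instead.

```python
from collections import Counter


def histogram(values, bin_width):
    """Group values into equal-width bins and count the values per bin.

    Returns a list of (bin_start, bin_end, count) ordered by bin_start;
    a value v falls in the bin where bin_start <= v < bin_end.
    """
    if not values:
        return []
    counts = Counter(int(v // bin_width) for v in values)
    return [(i * bin_width, (i + 1) * bin_width, counts[i])
            for i in range(min(counts), max(counts) + 1)]


# Hypothetical ages (in days) of open incidents.
ages = [1, 3, 4, 8, 12, 15, 15, 18, 22, 35]
bins = histogram(ages, bin_width=10)
for start, end, count in bins:
    print(f"{start:>3}-{end:<3} {'#' * count} ({count})")
```

Empty bins between the lowest and highest value are kept with a count of zero, so gaps in the distribution remain visible.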

6.1.2 Pareto Chart


The Pareto chart is a way to visualize the relative importance of the root causes of problems. It is
based on the principle (the Pareto principle) that a limited number of factors account for most of the
impact on the problem. The Pareto principle is sometimes referred to as the 80-20 rule, that is, 80%
of the impact is caused by 20% of the factors.

Using the following steps, you can create a Pareto chart. In this case, it may be best to create a pencil-
and-paper version first before entering the data into a tool, such as Excel.

Step Description

Step 1 Develop a list of causes to be compared and determine a standard measure for
comparing the causes. Frequency of occurrence, time (lead time, time usage)
and cost are the most used standard measures for comparison. Also, choose the
timeframe in which data needs to be collected.

Step 2 Count the frequency (cost or time) for each item. Add these amounts together to
create the grand total for all items. Calculate the percent of each item in relation
to the grand total, by taking the sum of the item, dividing it by the grand total and
multiplying by 100.

Step 3 List the causes in decreasing order of the measure of comparison, from most
frequent to least frequent. On top of the individual percentages, a cumulative
percent is calculated by adding the cause’s percent of the total to that of all the
other items that come before it in the ranking.

Step 4 List the items on the horizontal axis of a graph from highest to lowest. Label the left
vertical axis with the numbers (frequency, time or cost), then label the right vertical
axis with the cumulative percentages.

Step 5 Draw the bars for each cause. Draw a line graph of the cumulative percentages. The
first point on the line graph should line up with the top of the first bar.

Step 6 Analyze the diagram by identifying those causes that appear to account for most
of the problem. Identify those causes that account for around 80 % of the effect.
In most cases, two or three causes will generate 80% of the effect. There is usually
an inflection point where the graph levels off. If there appears to be no pattern (the
bars are essentially all of the same height), you may need to subdivide the data and
draw separate Pareto charts for each subgroup to see if a pattern emerges.

Comparing Pareto charts of a single problem made at intervals over a period of time
will indicate whether mitigation actions have had a positive effect on the problem.

The above Pareto diagram shows the prevalence of particular causes of an incident, both absolute (in
numbers) and cumulative (in percentage).
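
The calculations in Steps 2 and 3 of the Pareto procedure can be sketched as follows. The causes and frequencies are invented for illustration and the function name is an assumption; the sketch produces the ranked table from which the bars and the cumulative line would be drawn.

```python
def pareto_table(counts):
    """Return rows of (cause, count, percent, cumulative percent), largest first."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no occurrences to analyze")
    rows = []
    cumulative = 0
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for cause, count in ranked:
        cumulative += count
        rows.append((cause, count, 100 * count / total, 100 * cumulative / total))
    return rows


# Hypothetical incident causes and their frequencies.
incident_causes = {"Password reset": 50, "Disk full": 25, "Network outage": 15,
                   "Hardware failure": 6, "Other": 4}
table = pareto_table(incident_causes)
for cause, count, percent, cumulative_percent in table:
    print(f"{cause:<18}{count:>4}{percent:>7.1f}%{cumulative_percent:>7.1f}%")
```

In this invented data set, the first three causes account for 90% of the incidents, which is the kind of pattern Step 6 asks the team to look for.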

6.1.3 Scatter Diagram


A Scatter diagram is a graph that aims to demonstrate the relationship between two sets of
data. We try to understand whether there is a correlation between two sets of data and whether
this correlation is positive or negative. This type of diagram can be used to both interpolate and
extrapolate.

To create a Scatter diagram, follow the steps given in the following table:

Step Description

Step 1 Select the two datasets that need to be plotted against one another. A simple
example is the investigation of the lead time (in days) of changes in relation to the
size (in hours) of the change. Note that it is not necessary for the units of the two
datasets to be the same.

Step 2 Create a graph whereby one of the datasets is plotted on the vertical axis and the
other on the horizontal axis.

Step 3 Determine whether there is a correlation between the data sets. This is done by
plotting a straight line known as the line of best fit or trend line through the data
points. The trend line is drawn by ensuring that the line is as close as possible to all
of the data points and has the same number of points above it as below it.

Step 4 The scatter diagram can be used in two ways. It can be used to find the value
of a particular data point within or outside the existing data set (interpolating
and extrapolating). The second analysis is the most important. This is to find the
correlation. The correlation is positive if the values increase together; it is negative
if one decreases as the other increases. Based on this analysis, we can determine
which variables have a positive or negative (or no) correlation, and we can take this
into account when determining the solutions during the Improve phase.

The above scatter diagram shows the relationship between the average time to repair of incidents
in days and the days of inventory within an IT organization. In effect, this is a chart depicting Little’s
Law.
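
The correlation and trend line can also be calculated rather than drawn by eye. The sketch below uses the Pearson correlation coefficient and an ordinary least-squares line, which is a common way of computing a line of best fit but is not the by-eye method described in Step 3. The data (change size in hours against lead time in days) and the function name are assumptions made for illustration.

```python
from math import sqrt


def correlation_and_trend(xs, ys):
    """Return (r, slope, intercept): Pearson correlation and least-squares line."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two datasets of equal length with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a dataset does not vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sxy / sqrt(sxx * syy), slope, intercept


# Hypothetical changes: size in hours and lead time in days.
size_hours = [2, 4, 6, 8, 10]
lead_time_days = [3, 5, 6, 9, 12]
r, slope, intercept = correlation_and_trend(size_hours, lead_time_days)
print(f"r = {r:.2f}, lead time = {slope:.2f} * size + {intercept:.2f}")
```

A value of r close to +1 indicates a strong positive correlation, close to -1 a strong negative correlation, and close to 0 little or no linear correlation. A correlation on its own does not show that one variable causes the other.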

6.1.4 Flowchart
A flowchart is one of the simplest of the seven quality tools. The flowchart is the visual
representation of a series of steps in a process, and helps to break down a complicated process into
a simple series of steps. This simplification ensures that the process becomes understandable to
anyone.

A flowchart shows actions and decisions at points where variations occur in the process. These
decision points are always marked by a question that can be answered with ‘yes’ or ‘no’. The basic
forms are blocks (actions) and diamonds (decisions). There are many other symbols used for drawing
flowcharts. A further elaboration is the use of so-called ‘swimming lanes’. These are horizontal or
vertical lines that separate the activities of different roles or groups responsible for completing a
particular task in the process.

In Value Stream Mapping, a very simple version of the flowchart is used. The goal of the VSM is to
understand waste and time usage within the process. The flowchart discussed here is generally used
for a more detailed look at the process.

Step Description

Step 1 Select the process you wish to analyze. You may already have a SIPOC or VSM of
the process. Use this as a starting point.

Step 2 Create a sequential list of the activities and decisions in the process.

Step 3 Create blocks for the activities and use diamonds for recording the decision points.
Build the process step-by-step. Use an arrow between two symbols to denote the
flow of the process.

Step 4 The analysis of the flowchart centers around the logic of the steps. Drawing the
flow of steps can indicate where there are (unnecessary) feedback loops or parallel
activities that influence one another and should therefore be sequential. The use of
swimming lanes indicates whether there are many transfer moments between roles. In
general, transfer moments cause delays within the process.

Below is an IT example of a flowchart. In this case, the flowchart of the Problem Management
process is shown.

6.1.5 Control Chart
Control charts were defined by Walter Shewhart (the inventor of the PDCA-cycle). The control chart
is essentially a time-series chart. A time-series chart is one in which data is plotted on a chart where
the horizontal axis is a time sequence. The vertical axis can be numbers or another variable whose
value can be different over time.

The difference between a time-series chart and a control chart is that the control chart is used to
identify variation in a repeating process. This is done using control limits. Control limits are sometimes
also called action limits (control limits are calculated; action limits may be assigned).

A control chart helps to understand variation. There are two important types of variation: common
cause variation and special cause variation.

•• Common Cause Variation: The variation is due to random shifts in the X’s that are always
present in the process. As a result, the pattern shows the variation with ‘noise’, the collective
effect of many minor influences. A process affected only by common cause variation is called
stable or in control. It makes no sense to try to figure out what the individual causes are. The
only way to improve the performance is to redesign (parts of) the process to reduce common
cause variation. An example of this is when a process, for example, the ability to deliver a new
piece of standard software to the customer, performs consistently at a level that does not meet
the requirement of the customer. We would need to completely redesign the process to improve
the performance.

•• Special Cause Variation: In these cases, the effect of variation can be assigned to a specific
cause which can usually be discovered. Special causes generate patterns in the data. They
provide signals about the problems in the process and how they can be resolved. You cannot
predict if and when the special cause variation will occur and what the impact will be.
Therefore, the process is unstable and unpredictable. Continuing with the example above, if
the ability to deliver the software shows a spike in lead times, we can investigate and possibly
remove the reasons for the spike.

The control chart can thus be used to identify whether a process is under control (statistically),
that is, subject only to common cause variation, or suffers from special cause variation. It can also
be used to detect statistically significant trends in measurements, for example, to identify whether
improvements have had an effect on performance.

[Figure: two control charts, titled “Process is in control” and “Process is out of control”]

Control charts are best suited to processes where regular measurements can be made. Typically this
is in processes that repeat within a reasonably short span of time. Within IT, we look at incidents,
service requests, and standard change processes. Control charts are also very suited to monitoring
technical processes, such as the ability to load a data warehouse.

Create a control chart using the following steps:

Step Description

Step 1 Identify the objective of using the control chart. Typically this will be either to detect
defects or to monitor/investigate a process.

Step 2 Identify the actual measurement to be made, including what to measure, and where
in the process to measure it. Select the measurements based on their ability to
identify problems or defects.

Step 3 Identify the type of control charts to use. This will depend on the type of
measurement being made.

Step 4 Choose the measurements that will make up each plotted point on the control chart.
Measure more frequently when significant variation can occur over a short period.
Use consecutive measurements, rather than a random sample, as this will result in
less variation within the subgroup, with tighter, more sensitive control limits.

Step 5 Measure the data. If possible, automate the measurement process. If measurements
are to be collected by hand, design a data collection method that eases both the
collection and the subsequent calculations.

Step 6 Calculate mean, upper, and lower control limits. Note that the control limits are
usually straight lines.

Step 7 Draw the chart. This should include plotted points, with a line drawn between
successive points, horizontal lines for each of the central line, upper control limit and
lower control limit, and labeling and other information to uniquely identify the chart.

Step 8 Analyze the chart, looking for significant patterns and points, and find the cause
of any identified significant set of points. In the Improve phase, we will look for a
method of correcting the problem. To be clear, the control chart shows us when the
problem occurs, but not the location.
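
The calculation in Step 6 can be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes an individuals (XmR) chart, in which each plotted point is a single measurement and the limits are derived from the average moving range. The function names and the lead-time figures are hypothetical.

```python
from statistics import mean


def xmr_limits(values):
    """Return (center line, lower limit, upper limit) for an individuals chart."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    center = mean(values)
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 is 3 / 1.128, the usual constant for moving ranges of size 2.
    spread = 2.66 * mean(moving_ranges)
    return center, center - spread, center + spread


def out_of_control_points(values):
    """Return the positions of measurements outside the control limits."""
    _, lcl, ucl = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical delivery lead times in days, in the order they were measured.
lead_times = [5, 6, 5, 7, 6, 5, 6, 20, 6, 5]
center, lcl, ucl = xmr_limits(lead_times)
signals = out_of_control_points(lead_times)
print(center, lcl, ucl, signals)
```

With these figures the spike of 20 days falls above the upper control limit and is flagged as a possible special cause. Other chart types (Step 3) use different formulas for the limits.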

6.1.6 Fishbone (Ishikawa) Diagram


Ishikawa diagrams (also called fishbone diagrams) are causal diagrams that show the causes of a
specific event. They were designed by Kaoru Ishikawa in the 1960s. The Ishikawa diagram is generally
used to identify potential factors causing an overall problem. Each cause or reason for imperfection is
a source of variation, that is, an ‘x’ or an independent variable.

Causes are usually grouped into major categories to identify these sources of variation. Depending
on the industry, there may be up to seven categories. Within IT, we commonly use four categories:
people, process, technology, and policy.

•• People: This category deals with all aspects to do with people, particularly behavior and
attitude, but also personnel and knowledge issues
•• Process: This category deals with all issues that relate to processes
•• Technology: This category deals with issues or causes related to the technical part of IT
services
•• Policy: This category deals with all the factors that determine the environment in which the
people, process, and technology exist.
In practice, these categories are Collectively Exhaustive. The factors affecting a problem can,
however, often be placed in more than one category, that is, the set of categories is not Mutually
Exclusive. An example: Is the fact that people do not follow a process a people factor or a process
factor? The answer is that it does not matter as long as the factor is posted on the Ishikawa diagram
and its impact on the problem is analyzed accordingly.

Create an Ishikawa diagram using the following steps:

Step Description

Step 1 An Ishikawa diagram should be made by the Kaizen team. Use a whiteboard and
sticky notes to create the first version of the diagram.

Step 2 Draw a horizontal arrow pointing to the right. At the end of the arrow, write the
problem to be solved.

Step 3 Draw 4 diagonal arrows (2 from below, 2 from above) pointing towards the
horizontal arrow; each arrow should be labeled with one of the categories: people,
process, technology and policy.

Step 4 The team collects as many causes as can be identified. The team can use the 5 Why
method to find and detail root causes. The value of the Ishikawa rises with the
quality and detail of the root causes.

Step 5 Use the other six basic tools to quantify the impact of each cause. This will create a
list of the causes with the greatest impact.

Example of Ishikawa diagram

6.1.7 Check (or tally) Sheet


The check sheet is a simple and highly effective tool for collecting quality-related data in a structured
way. It is a way to assess a process and can function as input for other analyses.

The check sheet helps to quantify the causes from the Ishikawa diagram, for which there is limited or
no numerical data to be analyzed.

Set up your check sheet using the following steps:

Step Description

Step 1 State the problem for which data is being collected. Identify and record the location
where and time when data will be collected.

Step 2 Create a table with the symptoms or occurrences to be observed and counted in the
left-hand column. Depending on how you wish to measure, you may have a single
column in which to mark the number of occurrences or you may have a column per
day of the week.

Step 3 Collecting data may be done by the people actually doing the work or may be done
by an observer. The person collecting the data puts a mark in the recording column
for each time a particular symptom or occurrence takes place.

Step 4 Per registration period (usually a day), the data can be processed into a Pareto
chart for further analysis. Alternatively, a histogram can be used to understand the
relative number of times a particular symptom is observed.
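
The step from check sheet to Pareto chart can be sketched as follows. This is an illustrative example only: the symptoms and counts are made up, and the helper name `pareto_table` is hypothetical. Each mark on the check sheet is represented as one entry in a list.

```python
from collections import Counter


def pareto_table(observations):
    """Return rows of (symptom, count, cumulative percentage), largest first."""
    counts = Counter(observations)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for symptom, count in counts.most_common():
        cumulative += count
        rows.append((symptom, count, round(100 * cumulative / total, 1)))
    return rows


# Hypothetical marks recorded on a check sheet during one day.
marks = ["password reset"] * 5 + ["printer offline"] * 3 + ["VPN drop"] * 2
table = pareto_table(marks)
for row in table:
    print(row)
```

The cumulative percentage column shows how few symptoms account for most of the occurrences, which is the information a Pareto chart visualizes.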

6.2 Finding the Root Cause

The seven basic quality tools help us to process the data we have collected and visualize the data in a
way that facilitates getting to the root cause of the problem we are investigating in our Kaizen. There
are also tools to help us take a step further and actually get to the root cause.

6.2.1 5 Whys
The 5-Why analysis is a simple root cause analysis that requires the Kaizen team to question a failure
through sequential causes. ‘Why’ is asked to find each preceding trigger until we supposedly arrive at
the root cause of a problem.

A why question can often be answered with multiple answers. Each answer should be supported
by evidence that proves the answer is right. Failure to do this may send the team on a wrong failure
path.

Step 1 Make a table with two columns and five rows and write the question from the
problem statement at the top of the table

Step 2 Ask the question: “why did this happen?” Find the answer, supported by evidence,
and write the answer in the left-hand column of the top row.

Step 3 Repeat this question and answer cycle four more times. List the answers in the
left-hand column of the table.

Step 4 Determine a solution for each of the answers and record these in the right-hand
column.

6.2.2 Cause and Effects Matrix


A cause-and-effect matrix helps determine which factors affect the outcomes of the process being
investigated. It maps the value connection between inputs (the Xs) and outputs (the Ys). With these
relationships visible and quantified, you can determine the most influential factors contributing to
value.

Follow the steps given below to create a cause and effect matrix:

Step 1 Start by listing all the possible input factors (the Xs) as individual rows of the matrix.
The inputs should come from a previously completed VSM or Ishikawa diagram.

Step 2 List the multiple outputs (the Ys) of the process across the columns of the matrix.
There may be only a single physical output; however, there will also be other outputs, such
as a performance level, a cost target, or a maximum lead time.

Step 3 The key to the cause and effect matrix is building the relationships. Analyze and
quantify the relationships between each listed input and each output by placing a
relationship score (on a scale of 0 to 9) at the matrix intersection of each row and
column. Strong cause-effect relationships are scored as 9s; moderate cause-effect
relationships get 3s; weak relationships are 1s; and having no relationship means a
score of 0.

Step 4 For each matrix row-column intersection, ask yourself whether the associated input
affects the level of variation in the associated output. Then place the appropriate
score in the matrix cell.

Step 5 Summarize the results by calculating the weighted score for each row. Each output
column carries a weight that reflects its relative importance. For each row,
multiply the first matrix cell score by the first column weight; do this for the second
matrix cell, and so on. Finally, add up the weighted scores for the entire row. Place
this weighted row sum in the far right column of the C&E matrix.

Step 6 Apply Pareto analysis to the scores for each row. Those rows with high scores are
the ones that indicate important, high-leverage input factors. The factors with low
scores can be ignored.
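
The weighted row sums from Step 5 and the ranking from Step 6 can be sketched as follows. This is a minimal sketch under stated assumptions: the output weights, the input names and the relationship scores are invented for illustration, and `weighted_scores` is a hypothetical helper.

```python
def weighted_scores(output_weights, relationship_scores):
    """Return (input, weighted row sum) pairs, highest score first."""
    totals = {}
    for input_name, row in relationship_scores.items():
        if len(row) != len(output_weights):
            raise ValueError(f"row for {input_name!r} needs one score per output")
        # Multiply each cell by its column weight and add up the row.
        totals[input_name] = sum(s * w for s, w in zip(row, output_weights))
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical outputs (Ys): lead time, cost, quality, with assumed weights.
weights = [10, 5, 8]
# Hypothetical inputs (Xs) scored 0, 1, 3 or 9 against each output.
scores = {
    "unclear requirements": [9, 3, 9],
    "manual deployment": [9, 1, 3],
    "test data availability": [3, 0, 1],
}
ranking = weighted_scores(weights, scores)
print(ranking)
```

The inputs at the top of the ranking are the candidates for further investigation.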

6.2.3 Failure Mode Effects Analysis (FMEA)


Failure Modes and Effects Analysis (FMEA) is an analysis for identifying all possible failures in a
design, process, product or service. The failure modes are the ways in which something might
fail. Failures are any errors or defects and can be potential or actual. The effects analysis is about
understanding the consequences of those failures.

Failures are prioritized according to their consequences. The aim of the FMEA is to take actions to
remove the sources of failure, that is, the root causes, starting with those with the greatest impact.
FMEA can be used throughout the lifecycle of an IT service, from design to operation and retirement
of the service.

Step 1 List the key process steps in the first column. These may come from the highest
ranked items of a C&E (Cause and Effect) matrix or VSM made previously.

Step 2 In the second column, list the (potential) failures for each process step, i.e. state how
this process step or input could go wrong.

Step 3 Per failure, describe what the effect would mean to the IT organization and the
customer, in the third column.

Step 4 We then complete three columns with ranks from 1 to 10. Before you start, ensure
the team agrees on what each number in the scale means. The three columns are:

1. The severity of the effect - 1 (not severe) to 10 (extremely severe)

2. The frequency of occurrence of the effect - 1 (almost never) to 10 (very frequently)

3. Our ability to detect based on controls in place - 1 (predictable) to 10 (undetectable)

Step 5 Multiply the severity, occurrence, and detection numbers and record this value in
the Risk Priority Number (RPN) column. This is the key number that will be used
to identify the principal causes to address first. An RPN of 1000 (10 x 10 x 10) is
obviously the most critical.

Step 6 Sort the causes by RPN and identify the most critical causes.
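
Steps 4 to 6 can be sketched as follows. This is an illustration only: the process steps, failures and rankings are hypothetical, and the `FailureMode` class is not part of any standard library.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    step: str
    failure: str
    severity: int    # 1 (not severe) to 10 (extremely severe)
    occurrence: int  # 1 (almost never) to 10 (very frequently)
    detection: int   # 1 (predictable) to 10 (undetectable)

    def __post_init__(self):
        for name in ("severity", "occurrence", "detection"):
            if not 1 <= getattr(self, name) <= 10:
                raise ValueError(f"{name} must be between 1 and 10")

    @property
    def rpn(self):
        """Risk Priority Number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


# Hypothetical FMEA rows.
rows = [
    FailureMode("Backup", "Media not rotated", 5, 2, 2),
    FailureMode("Nightly load", "Job times out", 8, 6, 4),
    FailureMode("Deploy change", "Wrong configuration applied", 9, 3, 7),
]
by_priority = sorted(rows, key=lambda row: row.rpn, reverse=True)
for row in by_priority:
    print(row.rpn, row.step, row.failure)
```

Note that two failures with very different profiles can end up with similar RPNs, so the team should still look at the individual rankings before deciding what to address first.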

6.3 Analyzing a Value Stream Map

In the Lean IT Foundation, we looked at the mechanics behind creating the Value Stream Map. In this
publication, we will principally look at how to analyze the VSM.

The VSM is a mine of useful information. This information is used to identify the places in a process
where a solution is most needed.

VSM Analysis

Having carried out the chosen calculations, there is a series of aspects that we must analyze in more
depth:

•• Time Trap: This is a process step that introduces a delay into the process. A classic example
is the need for an approval. This is not a capacity constraint. A time trap is based on a policy
decision (muri). The waiting times in the VSM must be analyzed carefully to determine the
reason for waiting times. Removing time traps will improve the efficiency of the process.
•• Capacity Constraints: This is a process step that does not have sufficient capacity to process
all of the work it must process within a particular timeframe. These steps are also referred
to as ‘bottlenecks’. Removing capacity constraints allows the process to deliver the value
in the quantities required to meet customer demand. Takt time is an important metric for
understanding bottlenecks. Each process step for which the cycle time is higher than the
takt time for the entire process is a bottleneck. Capacity constraints can cause variability
(mura) throughout the process. A capacity constraint may be caused by lack of resources or
knowledge.
•• Waste: Time traps and capacity constraints are important causes of waiting time. Obviously
we need to analyze the other types of waste (TIMWOOD) in the VSM to ensure that these are
not causing delays or quality issues (muda). We do this by analyzing each step in the process

and determining whether a particular type of waste is present in the step. This waste is
identified with a symbol on the VSM (See Lean IT Foundation).
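
The bottleneck check described under Capacity Constraints can be sketched as follows. This is a minimal sketch, assuming that takt time is calculated as available working time divided by customer demand, and that each step has a measured time per unit of work. The step names and figures are hypothetical.

```python
def takt_time(available_minutes, units_demanded):
    """Time available per unit of work to keep up with customer demand."""
    if units_demanded <= 0:
        raise ValueError("demand must be positive")
    return available_minutes / units_demanded


def bottlenecks(step_minutes_per_unit, takt):
    """Return the steps that need more time per unit than the takt time allows."""
    return [step for step, minutes in step_minutes_per_unit.items() if minutes > takt]


# Hypothetical process: 480 working minutes per day, 24 requests per day.
takt = takt_time(480, 24)
steps = {"intake": 10, "approval": 15, "build": 30, "test": 18}
constrained = bottlenecks(steps, takt)
print(takt, constrained)
```

In this example only the build step exceeds the 20 minutes available per request, so it is the step to investigate first.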

6.4 Analysis in IT

Much of what has been dealt with in this chapter is not specific to IT. Does this mean that the analysis
within IT Kaizen is the same as in non-IT Kaizen? From a tool perspective, maybe. However, within IT, we
find that there are specific challenges. Looking at people, process, and technology, we find a number
of characteristic analyses.

Technology

There is a massive amount of data available within IT organizations regarding the ‘behavior’ of the
technology. Technology delivers data that is very suitable for the creation of control charts and
histograms.

Having said this, Murphy’s Law dictates that the bit of technology that you need to investigate does
not have the right set of monitors in place to ensure that the technology can be researched. It is vital
that monitoring can be put in place quickly to understand the technological aspects of a problem.

The analyses mostly related to technology are control charts, Pareto charts and scatter diagrams.
Control charts help to understand the behavior of technology over time, Pareto charts help to rank
the importance of the causes and scatter diagrams are used to understand whether there is a
relationship between symptoms.

Process

The analyses described above in relation to VSM are all used in relation to IT processes. It is
important, as already stated in the Measure phase, that people within IT organizations (especially
those involved in Kaizen) know the difference between the IT units of work and the associated
processes. The dynamics of each unit of work must be understood so that the effects can be
understood.

People

People-related analysis is particularly related to the availability of skills and knowledge, and the usage
of time.

Skills and knowledge can be analyzed using a skills and knowledge matrix. The aim is, particularly, to
understand in which ways muri and mura are caused by choices made regarding people.

One of the IT-specific analyses is time-related. Where traditional Lean analyses look at the time
aspects of processes, within Lean IT, we also manage time on an organizational level. In essence, we
look at the VA time versus NNVA and NVA time at an organizational level, i.e. per team, department
or the whole IT organization. This helps us to understand whether teams are faced with substantial

amounts of ad hoc work or whether they have a high diversity of units of work, leading to muda and
mura.

6.5 Analyze Phase and A3
Complete the Analysis section of the A3. The Analysis phase tends to be the phase
in which most time is spent. The team will spend much effort trying to bring together the various
symptoms, causes, effects and indications of how these may be mitigated.

There are two major pitfalls to avoid in this phase:

•• Don’t Fall in Love with Your Analysis


All the hard work that goes into the Analysis phase is in fact only a step to determine the correct
course of action. Most of the charts, graphs, matrices, etc. will not find their way onto the A3; only
the most significant. This does not mean you should throw away the analysis once made. All analyses
should be stored as a baseline for checking the effect of improvements made and for starting future
improvement initiatives.

•• Don’t Jump to Conclusions


Based on the insights gained in the Analysis phase and the relief that the solution of the problem
is getting close, the team may have a tendency to think that they have found the solution as they
uncover root causes. Jumping to conclusions will mean that other, possibly more significant, root
causes may be overlooked. This is where a Kaizen lead proves their value, by keeping the team on
track.

Completing the Analysis part of the A3 ensures that the team focuses on the essence of the
analysis.

6.6 Key Steps for Analyze Phase

To wrap up the Analyze phase, let us take a brief look at the main steps that need to be
accomplished before moving on to the Improve phase.

1. Determine the Critical Independent Variables

The first and probably most important step is to identify the key X’s, the independent
variables that most influence our problem. If we do not identify these correctly, we may spend
a lot of time and effort in analyzing aspects that really have little bearing on the problem at
hand.

2. Perform the Data Analysis

We must use at least the seven basic tools of quality to analyze the data collected in the
Measure phase, with the aim of determining which of the X’s have the greatest influence on
the problem.

3. Perform the Process Analysis

We also need to take a detailed look at the VSM we created in the Measure phase. In the
VSM, we will be trying to identify where there is waste, where the balance between Value
Add and Non-Value Add activities is clearly tipping the wrong way, i.e. too much NNVA and
NVA activity. We also need to do the necessary calculations (PCE and Little’s Law). Our higher
level goal is to understand the flow (or lack of it) in the process by analyzing the throughput
and constraints.

4. Determine the Root Causes

Based on the data and process analyses, we can generate theories to explain potential
causes. Use the 5 Why, C&E Matrix and FMEA to narrow the search for the most important
root causes. If the Kaizen team does not feel it has found the real cause or does not have
sufficient evidence to support the root causes found, do not hesitate to go back and collect
additional data to verify root causes or find new ones.

Remember: It is vital to not jump to conclusions, especially once one or two root causes are
known. Do not start formulating solutions before you have finished the analysis. It may be
a solution, but is it the best solution or the one that will actually solve the problem completely
rather than partially? Let the analysis run its course and see where the data, the facts,
the calculations, the visualizations and the dissecting of the problem take you.

5. Prioritize the Root Causes

Lastly, we must prioritize the root causes we have found. This priority will be passed on to
the Improve phase, so that the Kaizen team can focus on finding solutions for the most pressing
root causes.

Closing the Analyze phase is probably the most critical change of phase. The reason is
that prior to the closure of the Analyze phase, significant attention must be paid to the
solutions. As we have seen, quick wins may be found early on in the cycle. However, the
danger of jumping to conclusions is always present. The Kaizen lead must therefore
continuously ensure that the team stays focused on the current phase. This is absolutely
vital for the Analyze phase, where the temptation to go for the solutions is greatest.

The other danger that can raise its head at this point is distraction. This is a huge problem
within IT organizations, as there are more than enough problems that need to be tended to. As
the team works through the DMAIC cycle, the problem and its causes become clear. As the
team becomes more familiar with the problem, it appears to become less daunting. When we
do not understand a problem, it seems more threatening. As the threat decreases, people
have a tendency to downplay the problem. In the worst case, this can lead to the problem
sponsor being distracted towards a different problem that is demanding attention. With
this distraction, the Kaizen team may lose interest and the Kaizen itself may peter out
before the solutions have truly been found and implemented. It is absolutely vital that a Kaizen
is brought to its logical conclusion, with at least one solution of the problem being implemented.
This solution must have a visible impact on reducing the problem.
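
The two calculations named in step 3, Process Cycle Efficiency (PCE) and Little’s Law, can be sketched as follows. This is an illustration under the usual definitions: PCE as value-add time divided by total lead time, and Little’s Law as lead time equal to work in progress divided by the average completion rate. The figures are hypothetical.

```python
def process_cycle_efficiency(value_add_time, total_lead_time):
    """Share of the total lead time that is spent on value-adding work."""
    if total_lead_time <= 0:
        raise ValueError("total lead time must be positive")
    return value_add_time / total_lead_time


def lead_time_from_wip(work_in_progress, completions_per_day):
    """Little's Law: average lead time in days for the current amount of work."""
    if completions_per_day <= 0:
        raise ValueError("completion rate must be positive")
    return work_in_progress / completions_per_day


# Hypothetical figures taken from a VSM.
pce = process_cycle_efficiency(value_add_time=4, total_lead_time=40)
days = lead_time_from_wip(work_in_progress=30, completions_per_day=5)
print(pce, days)
```

Here only a tenth of the lead time adds value, and a new item entering the process can expect to wait about six days, which points towards reducing work in progress or raising the completion rate.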

6.7 Case Study: Analyze Phase

In a large organization, the use of Lean principles had been steadily and rapidly increasing within
operational departments. This resulted in a large increase in the number of reports created in
the Business Intelligence (BI) system. As a result of carrying out lots of changes for the customer
departments, the IT department supporting the BI system had not paid attention to how this was
affecting the BI service. The numbers of incidents and complaints were exploding. Worst of all, the
reports so critical to the business were only being made available by the end of the morning, due to
slow data loading and system crashes. Finally, the problem was escalated to board level and a Kaizen
was started. This case is interesting because a number of errors were made in carrying out the
Kaizen.

The problem definition was quite quickly developed: How can we ensure that all reports are available
at 07:00 every weekday morning? The Kaizen sponsor, lead and team came to an agreement on this
question to be answered within about an hour. This led to the start of the Measure phase. Here, a
mistake was made. The team convinced itself that they did not need to measure everything because
“we know it’s of strategic importance”. The assumption was that the board support to get the BI
systems up to scratch was pretty much unconditional because of the business criticality of the
system. Later on, during a meeting with customers and the Kaizen sponsor, the preliminary analysis
of the problem and the proposed solutions were shot to pieces because they were insufficiently
supported by facts and numbers. This led to a retake of the Measure phase.

In the first iteration, the Kaizen team focused on creating an Ishikawa of the problem. This formed
the basis for the ill-fated meeting. The team subsequently set about creating clear control charts
of the performance of the load processes, supplemented by those of the use of memory, network
bandwidth, processing power and disk space. These were accompanied by histograms of the
occurrence of incidents and the implementation of changes. For each of the branches of the Ishikawa,
a Pareto chart was produced.

Initially, no VSM was made, because the problem was deemed to be a technical problem and a behavior
and attitude problem. In the end, the VSM provided vital insight to solving the problem. It turned out that
a number of steps were sequential, when they could be carried out in parallel. This came to light by
challenging common-sense assumptions that had been made many years back and had not been
challenged since. The VSM, including both human and technical actions, showed where time
was being wasted.

During the Analyze phase, the Kaizen sponsor played an unsettling role for the team, pushing for
solutions. The Kaizen lead needed to have a discussion with the Kaizen sponsor to ensure the team
was not put under pressure to jump to conclusions. Here, it was found that no clear timelines had
been agreed between the team and the sponsor, and team members were unclear as to how much
time they could spend on the Kaizen, outside of the five meetings that had been planned. The five
meetings had been set based on the DMAIC phases, with an agreement that work would be done
in between the sessions. However, no one had any idea of how much work would be required in

between sessions.

In the end, based on the aforementioned analyses supplemented by the use of the 5-Why technique
at various points during the Analyze phase, the analysis was completed, with sufficient factual
support to warrant a complete overhaul of the Business Intelligence (BI) environment. The technical
analysis and VSM showed that the primary causes of the problem could be found in the way the
process was structured from the close of business on one day until 07:00 the following morning and
in the way the entire system made use of storage and memory.

In both areas, causes were addressed in the short term, alleviating the key problem. However, a structural
program was needed to ensure the improvements were sustainable.

7 Improve Phase
At the end of the Analyze phase, the Kaizen team basically has a list of the most important X’s, the
factors that cause the problem. The next goal is to identify improvement options. This is the aim of
the Improve phase.

The Improve phase is really the moment that we start thinking about solutions. Earlier in the cycle,
we may have come across a solution, especially if the problem turns out to be an obvious one.
Assuming that we are dealing with a complicated or complex problem, the Improve phase is the time
to start gathering solutions.

In this section, we will look at ways of generating solution ideas, techniques for selecting and
prioritizing the solutions and testing solutions. All of these methods are useful but the one that
stands head-and-shoulders above all of them is going to the Gemba.

Observing the Gemba and validating solutions at the Gemba are two of the most important ways to
ensure that the right improvements are implemented in the right way. Going to the Gemba facilitates
the generation of ideas for solutions, especially because ideas can be discussed with the people.
Gemba validation ensures that the implementation of improvements is carried out in a way that
garners support with the people doing the work.

7.1 Idea Generation

There are many options when choosing idea generation techniques. The techniques described below
are well-known, often-used, proven techniques that generate many ideas.

7.1.1 Brainstorming
Brainstorming is about generating as many ideas as possible. It is vital that ideas are not evaluated
during the brainstorm session as this limits the creativity. Typically, the brainstorm session will start
with a recap of the key factors causing the problem. These may be posted on a flipchart or on the
wall; a visual solution is recommended. Per factor, the team must generate as many solution ideas as
possible. As time passes, the solutions become more outlandish and strange. This is when you know
that the brainstorm session is reaching its goal. Often, in the absurdity of a proposed solution is a
core truth that helps to develop a more realistic solution. Once all the ideas have dried up, the team
can move on to selecting and prioritizing the ideas.

An alternative to brainstorming is brainwriting. In this technique, the principal factors causing the
problem are posted on flipcharts around a room. The team members walk around the room in silence
posting sticky notes with their ideas on them. Participants read each other’s posts and use them as
inspiration to generate new ideas. The fact that a particular post is not explained means that another
person can freely associate or interpret the post as they wish. This again leads to ideas that are
out of the ordinary.

7.1.2 Reverse Thinking
Reverse thinking is all about describing what you would like to have happen and then working out
how to make the opposite happen. This method helps to understand what the team should definitely
not do. Once this is clear, the step to understanding the possibilities becomes much easier. Usually
developing 10 to 15 reverse ideas provides sufficient input to look for desired solutions.

This method works because it is fun. Looking at the absolute opposite of what you are trying to
achieve means asking, from an IT perspective: how can I aggravate the current, problematic situation?
This leads to amazing definitions of how the IT service infrastructure and organization can be
comprehensively sabotaged. Many additional insights have been collected during reverse-thinking
sessions, as engineers in particular try to out-do each other with better ideas. The challenge is then
to identify the opposite solution. A single negative solution may lead to multiple positive solutions.

7.1.3 SCAMPER
A third idea generation technique uses action verbs as triggers to generate ideas. SCAMPER is an
acronym with each letter standing for an action verb which in turn stands for a prompt for creative
ideas.

S – Substitute

C – Combine

A – Adapt

M – Modify

P – Put to another use

E – Eliminate

R – Reverse

Again, the aim is to produce as many ideas as possible. Each cause of the problem is approached
using the seven action verbs, with the aim of understanding what could happen if an aspect of the
cause (or all of it) is substituted, combined, adapted, modified, and so on.

7.2 Option Selection and Prioritization

Having generated a large number of solutions, we need to make this number manageable. This can be
done through bundling and/or elimination. The question is: how can we select the best solution(s) for
solving the problem?

There are dozens of tools for this task as well. Those presented below are among the more
commonly used.

7.2.1 Affinity Mapping


Affinity mapping reduces the number of solutions by bundling solutions that are linked, similar or
overlapping. The benefit of bundling solutions is that we are able to identify the central themes of a
set of solutions. This in turn can provide further insight into the best solution. Affinity mapping is all
about sorting the large number of solutions into a manageable set of clusters.

As with brainwriting, the first part of affinity mapping is done in silence. Team members put sticky
notes with possible solutions together, if they believe the solutions should be clustered for any
reason. Then one by one the clusters are discussed, and a header describing the key theme is given to
each set of solutions.

The team then determines which solutions are most suitable from each of the clusters, or potentially
they may develop a different solution based on the insight gained from the bundling exercise.

7.2.2 Solution Matrix


The solution matrix is a simple tool made up of two axes: feasibility and impact. Feasibility represents
the ability of the IT organization to actually implement the solution. Feasibility is high if the costs,
effort and time involved are low. Impact is about judging the effect that the solution would have if it
were implemented. Impact is about the effect on the IT organization and its customers in financial,
performance and/or learning terms. In paragraph 4.4, we already dealt with a number of questions
that can help to determine feasibility and impact.

Example of solution matrix

All of the solutions are then plotted by the team on to the solution matrix. The important part of the
process is that team members discuss the reasons why they believe a given solution should have a

particular impact and feasibility. This helps to understand how the team members see the adoption
of solutions within the organization.

Once all the solutions have been plotted, there will be a group that is clustered in the high
impact, high feasibility quadrant. These will be the solutions that need to be considered first for
implementation.
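
The quadrant logic of the solution matrix can be sketched as follows. This is only an illustration: it assumes the team scores impact and feasibility on a scale of 1 to 10 and treats scores above 5 as high, neither of which is prescribed by the method. The solutions listed are hypothetical.

```python
def quadrant(impact, feasibility, threshold=5):
    """Classify a solution by its position in the solution matrix."""
    high_impact = impact > threshold
    high_feasibility = feasibility > threshold
    if high_impact and high_feasibility:
        return "consider first"
    if high_impact:
        return "high impact, low feasibility"
    if high_feasibility:
        return "low impact, high feasibility"
    return "low impact, low feasibility"


# Hypothetical solutions with (impact, feasibility) scores agreed by the team.
solutions = {
    "Run load jobs in parallel": (9, 8),
    "Replace storage platform": (9, 2),
    "Rename report folders": (2, 9),
}
first = [
    name
    for name, (impact, feasibility) in solutions.items()
    if quadrant(impact, feasibility) == "consider first"
]
print(first)
```

The scores matter less than the discussion that produces them; the code only records the outcome of that discussion.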

7.2.3 Multi-voting
Each of the above techniques helps to gain control over a large group of solutions. The solution matrix
helps to make a broad prioritization of the solutions, as well. Multi-voting focuses on prioritizing the
solutions. This is done by each team member allocating votes to a set of solutions.

Let’s assume that there are 30 solutions. Each team member is given 10 votes (a third of the total
number of solutions that can be voted for). After everyone has voted, the scores are tallied and the
top 10 solutions are selected. It is possible to conduct a second round in which everyone gets three
or four votes in order to select the best solutions from the previously determined top 10. In this way,
the Kaizen team can reduce the number of solutions to a manageable number.
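
A round of multi-voting can be tallied as sketched below. The sketch assumes that each team member may give at most one vote to any single solution, which the description above does not specify. The names, ballots and the helper `tally_votes` are hypothetical, and the example is scaled down to two votes per person.

```python
from collections import Counter


def tally_votes(ballots, votes_per_member, top_n):
    """Count the votes on all ballots and return the top_n solutions with scores."""
    counts = Counter()
    for member, choices in ballots.items():
        if len(choices) > votes_per_member:
            raise ValueError(f"{member} used more than {votes_per_member} votes")
        if len(set(choices)) != len(choices):
            # Assumption: one vote per solution per member.
            raise ValueError(f"{member} voted twice for the same solution")
        counts.update(choices)
    return counts.most_common(top_n)


# Hypothetical first round: three members, two votes each.
ballots = {
    "Ann": ["A", "B"],
    "Bob": ["A", "C"],
    "Cy": ["A", "B"],
}
shortlist = tally_votes(ballots, votes_per_member=2, top_n=2)
print(shortlist)
```

A second round can reuse the same function with fewer votes per member and only the shortlisted solutions on the ballot.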

7.2.4 Business Case Development


The last technique is one to use once the number of solutions has been reduced to less than a
handful. The aim is to build comparable business cases for each of the solutions. Each business case
will include both the costs and the returns for the same fixed period of time.

The solutions of a Kaizen should give a positive return within a maximum of six months. Anything
more probably means that the solution is too big and costly; the team should look for smaller
solutions, possibly only tackling part of the problem. It is advisable, where possible, to implement part
of the complete solution to a problem at a fraction of the cost, rather than spend huge amounts of
resources to completely solve a problem in one go. The key consideration here is the acceptance of
the change: smaller changes are more easily accepted and assimilated into the way of working than
large changes.
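To make the six-month rule concrete, the comparison of business cases over the same fixed period can be sketched as below. The cost structure (a one-off cost plus monthly costs and returns) and all figures are assumptions for illustration; a real business case may need more detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BusinessCase:
    name: str
    one_off_cost: float    # cost of implementing the solution
    monthly_cost: float    # recurring cost once implemented
    monthly_return: float  # recurring benefit once implemented

    def net_return(self, months: int) -> float:
        """Returns minus costs over a fixed period, so that cases are comparable."""
        return (self.monthly_return - self.monthly_cost) * months - self.one_off_cost

    def payback_month(self, horizon: int = 6) -> Optional[int]:
        """First month with a positive cumulative return, or None within the horizon."""
        for month in range(1, horizon + 1):
            if self.net_return(month) > 0:
                return month
        return None


# Hypothetical example figures
cases = [
    BusinessCase("Checklist for standard changes", 2000, 100, 900),
    BusinessCase("New orchestration platform", 60000, 1500, 4000),
]
for case in cases:
    print(case.name, case.net_return(6), case.payback_month())
```

A case for which `payback_month` returns `None` does not pay back within six months and is a candidate for being split into a smaller solution.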

7.3 Testing Solutions

Having selected one or more solutions to implement, the question that must then be answered is:
how will we try out the solution to see whether it works? This will depend very much on the type of
problem being solved.

Type of Problem (Cynefin)    Solution Test

Obvious or Simple            Implement a pilot using the best practices available in the market.

Complicated                  Create a small production pilot to understand how it behaves in
                             the live environment.

Complex                      Use experimentation techniques to understand how the solution
                             ‘behaves’ in practice.

Chaos                        Determine which actions to take and carry out a risk analysis for
                             each action.

7.4 Solutions used in IT

Based on the table above, we can deduce that there are situations within IT for which solutions have
already been devised. Let us look at some typical solutions used within IT organizations.

Best practices are an area at which the IT industry excels. There are many best practice frameworks
developed for use within IT. These best practice frameworks present sets of rules that have been
developed over many years with contributions from the IT community. The most prominent
examples are:

•• ITIL: The most widely accepted approach to IT Service Management. It describes best practices
within IT organizations covering the entire lifecycle of an IT service from concept to retirement.
It identifies and describes 26 processes, all of which contain solutions to common problems
within IT organizations. A more concise version can be found in part 2 of the related standard,
ISO/IEC 20000.
•• COBIT: Approaches IT from a strategic governance perspective. COBIT aims to link business
goals to IT objectives, including the definition of metrics, roles and responsibilities. COBIT
primarily provides answers to the governance issues of IT organizations.
•• Scrum: A best practice model for rapid application development. It describes the way
to ensure that software is developed rapidly and that the final product delivers value to the
customer. A number of the problems faced by IT organizations in delivering new software to
customers are solved in this best practice.
•• Prince2/PMI: These are best practices that help to solve problems in the area of project
management and project governance.

On top of the best practices, IT also has good practices. These are frameworks of principles and tools
that help to improve the ability of IT to deliver and improve its services to customers. In this category,
we find methods like:

•• Lean IT: On top of Kaizen problem-solving to continually improve services, Lean IT applies
lean principles to IT. As such, it helps to focus IT processes on single piece flow and delivery of
value to customers. Lean IT describes the principles and good practices that can be applied to
complex and complicated problems. An example of a lean solution is 5S, which guides teams
through a series of steps to take action on how work is organized.
•• Agile: This is a set of principles, originating from the development of software, that can be,
and is, applied to a variety of areas (e.g. Agile Project Management). The essence is about focusing
on individuals and interaction, working product (software), customer collaboration and
responding to change. Agile can be applied to problems in a similar way to Lean IT.
•• DevOps: A more recent addition to the list of methods, DevOps is a solution that derives
its effectiveness from the integration of a number of critical areas: process, organization,
performance, behavior & attitude and automation. This combination ensures that all aspects of
IT are included in the solution.

7.5 Improve Phase and A3

The Improve phase leads to the review of the Future State section to see whether the solutions meet
the requirements of the intended future state. The Proposed Options section of the A3 must be
completed. Finally, the team must also describe the plan for implementing the solutions in the Plan/
Improvement section. This last section will be finalized during the Control phase, when the relevant
details of the control plan are added.

For the FMEA, determine the key actions to be taken per cause. Once the actions have been completed,
re-score the occurrence and detection. In most cases, we will not change the severity score unless the
customer decides that this is not an important issue.
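The re-scoring can be illustrated with a short sketch. It assumes the conventional FMEA Risk Priority Number (severity × occurrence × detection) and 1 to 10 scales; the failure mode and the scores are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FailureMode:
    cause: str
    severity: int    # assumed scale 1-10
    occurrence: int  # assumed scale 1-10
    detection: int   # assumed scale 1-10 (10 = hardest to detect)

    @property
    def rpn(self) -> int:
        """Risk Priority Number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def rescore(mode: FailureMode, occurrence: int, detection: int) -> FailureMode:
    """Re-score after the actions are completed; severity is left unchanged."""
    return replace(mode, occurrence=occurrence, detection=detection)


# Hypothetical example
before = FailureMode("CMDB not updated after a change", severity=7, occurrence=6, detection=5)
after = rescore(before, occurrence=2, detection=3)
print(before.rpn, "->", after.rpn)
```

Keeping the severity fixed in `rescore` mirrors the rule that severity only changes when the customer decides the issue is not important.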

7.6 Key Steps for Improve Phase

As a recap for the Improve phase, let us review the main steps that need to be accomplished before
moving on to the Control phase:

1. Generate Potential Solutions

Having understood the cause and effect relationships in the Analyze phase, the Kaizen team
must now generate as many solution possibilities as they can, using one or more idea generation
techniques. It is important that maximum creativity is used in this step. The more solutions, the
better the chance that there is an easy-to-implement solution that solves the problem. Remember:
we are looking for small effective steps to resolve the problem, not large solutions that
require considerable effort to implement.

2. Select and Prioritize Solutions

From the large number of solutions defined, we must now reduce the collection to a small number
of solutions that have both high impact and high feasibility. We do this using the selection and
prioritization techniques. If necessary, the team may need to perform small experiments to check
whether a solution is suitable.

3. Apply Best and Good Practices

Within IT, we have a large number of best and good practices. It is very important to check the
best and good practices for the particular area where the problem exists. Since IT problems,
even complex problems, usually include parts that can be solved using best practices, it is a
waste not to apply what others have already learned.

4. Develop “Future State” VSM

Once the team understands which solutions it intends to implement, it can create the
future state VSM. Creating the future state VSM is important because it helps to focus the
improvement efforts and to communicate the intended changes to the other people working
in the process who were not part of the Kaizen.

5. Pilot the solution and confirm improvement outcomes

During the Improve phase, the Kaizen team must check whether the intended solutions
actually work. Use the pilot to create the documentation required to support the full-scale
implementation, for example SOPs, checklists, KPIs (Key Performance Indicators)
and metrics. The team may also use the FMEA (Failure Mode and Effects Analysis) to
prepare for the possible challenges during the implementation.

6. Create Implementation Plan for full-scale Rollout of Solution(s)

Plan the implementation. This is primarily for communication purposes. Hopefully,
the solutions to be implemented are small, requiring a minor amount of training to ensure
adoption. However, there may be aspects of the implementation that require substantial
communication with people affected by the problem.

The Kaizen team and particularly the sponsor must agree on the fact that the solutions will
help to alleviate the problem. It is only then that the Improve phase can be closed and the
implementation of the solution(s) can begin.

7.7 Case Study: Improve Phase

The IT management team had recently received a number of complaints from customers about
the intake of projects within the IT organization. Their complaints revolved around the number
of times they had to tell their ‘story’ before IT actually got down to executing the project.

The Measure and Analyze phases of the Kaizen showed that, depending on the sensitivity of
a topic, up to six different people might have a meeting with the customer to define the goals
for the project. First, an account manager would ask what the customer wanted. Second,
an information manager would request a meeting to gain some more insight into the
technological impact of the project. Then a project manager would turn up and pretty
much repeat everything the account manager and information manager had just done, but
then from a project execution perspective. If there were budgetary or governance issues, the
financial director or operations manager might require a similar explanation to the one already
given. And, finally, if there was any kind of problem, the CEO would get involved as a
referee. There seemed to be many projects with ‘any kind of problem’. In short, the process
of getting a project started was extremely time-consuming.

The Kaizen involved creating a Value Stream Map of the process, and the analysis focused
on the roles involved. Each of the roles had documented responsibilities, some of which
overlapped, conflicted or caused transfers. The analysis determined that, in fact,
no one was actually responsible for ensuring that a project was defined so that it could be
executed.

The Improve phase was novel in that the Kaizen team, which did not include the customer,
actually invited all of the known project owners (the principal project customers of IT) to listen
to the analysis and help to generate solutions to the problem.

The team, and the invited customers, used classic brainstorming to generate solutions.
The Kaizen lead introduced a 20-minute period of reverse thinking when she felt the options
were drying up. This had the added effect of causing hilarious exchanges between business
and IT people. Later, in the evaluation, the team realized that this period of fun actually
improved the acceptance of the changes that had previously been non-discussable. The
‘non-discussables’ were especially related to people having to relinquish part of their
responsibility or authority for the sake of a more efficient process.

The result was quite surprising: the customer recognized that they themselves had caused
part of the problem by insisting on a fairly bureaucratic governance on the business side; a
classic case of muri.

Policies were adjusted and the information manager was given overall responsibility for
ensuring that the project was fully defined. This was done in such a way that a project manager,
who was allocated to the project at a later date (sometimes weeks after the project had been
defined), could easily infer the requirements and start executing the project.

8 Control Phase
The Control phase is the last step in the DMAIC cycle. In this phase, the goal is to successfully
implement and, more importantly, maintain the gains achieved, i.e. it is all about ensuring
the sustainability of the improvement. The question that the Kaizen team is trying
to answer is, “How can we guarantee the improved performance?” Ensuring that the
successes from the Improve phase will continue means transferring the responsibilities for
performance to the process owner. One way to look at the concept of establishing controls is
to ask yourself what elements, activities, roles, policies, etc. you need to put in place to make
sure that, the next time you go to the Gemba, the improvement is still in place and preferably
better than you left it.

The Control phase has two main focus areas: guaranteeing the increased performance and
handing off the improved process to its process owner.

8.1 Achieving Control

A control, in essence, is a procedure or policy; a way to identify whether work is done in
the correct way. There are specific IT controls that provide assurance that the information
technology used by an organization operates as intended, for example, with the correct
authorizations, sufficient audit trails and processes that deliver the correct results. The
aim of this phase is to implement controls to ensure that work is done the correct way.

In the previous Kaizen phases, we successfully investigated and, eventually, made
improvements to alleviate the causes of the problem. To make structural improvements,
we have focused on and dealt with the underlying causes that prevented the performance needed
to meet the requirements from the four voices discussed earlier. Now that we have achieved
the performance improvements, we need to look at controlling the delivered quality.

As we all know, making changes is challenging and sustaining those changes even more so.
That is why we need control. It is often said that standing still means going backwards.
We need to continually put energy into the IT organization in order to maintain organizational
performance. The problem is that expending energy is tiring. The way to reduce the energy
needed in organizations is to create habits.

We need to be diligent and develop the habits and practices necessary to maintain our
current state and pursue improvement in the future. That is why we need a Kaizen mindset;
a true Kaizen mindset means enjoying the challenge of counteracting the descent to chaos
and continually seeking improvements.

8.2 Control Plan

As we said, the Control phase in our Kaizen is aimed at maintaining the changes that were
made in order to sustain the improvements. To help the process owner and the people doing
the actual work, we must develop a control plan. This plan consists of four basic parts:
•• Documentation: a record of the changes made
•• Monitoring: a way of checking that changes are maintained
•• Response: a way of reacting to deviations or incidents
•• Training: communication of the changes to stakeholders
Without implementing a control plan to ensure problems do not reoccur, the Kaizen cannot be
successful in the long run!

8.2.1 Documentation
Our improvements in the way of working need to be institutionalized as habits and routines.
One key ingredient in achieving this is documentation. Of course, we all know that
documenting alone is insufficient. It would be just a paper tiger (‘something that seems
threatening, but is ineffectual’). However, without documentation, it is difficult to create
a baseline and to establish the right routines and habits. Examples of new documentation include
new process steps, standards, procedures, policies and instructions for new or updated
systems or tools.

Policy

Creating policy often results in documents that resemble legal documents, not least
because they aim to be complete, covering all eventualities and exceptions. A policy should be
clear and concise, and should stay within the intended scope. The policy should clearly state
its intent and the spirit that should be followed when applying it. New or rewritten policy must
obviously not contradict any other policy.

Roles and Responsibilities

Key to gaining control is establishing clear ownership and accountability for results. This
needs to be documented so that the ownership and accountability are clear to all involved.
These roles and responsibilities can also be used in the process documentation.

Process Documentation

Process documentation is always a contentious issue: if it is not there, people complain that
they do not know what is expected; if it does exist, nobody reads it. Traditionally, process
documentation comes in one of two forms: a process flowchart accompanied by a RACI
(Responsible, Accountable, Consulted, Informed) chart or a process flowchart with ‘swimming
lanes’ (as described in the Analyze phase). Whichever method you use, the document
should be short and simple so that it is easy for people to understand. This style of process
documentation is often required for compliance purposes. An additional way to document a
process is to post the Value Stream Map on the wall, and regularly organize short meetings
to determine which improvements need to be implemented. Creating a Value Stream Map has
the effect of keeping the improvement of the process at the top of people’s minds.

Standard Operating Procedure (SOP)

An SOP is a written procedure that describes how a specific task should be carried out. The
idea is that by following the SOP, the desired outcome can be guaranteed and created in
a consistent and efficient manner. An SOP is sometimes referred to as the ‘best known way’
of doing something, simply because there will one day be a better way of doing the work.
This is an example of how the language in a lean organization can differ from that used
in a non-lean organization. Within IT, the key area where we use the SOP is in describing the
execution of Standard Changes.

A good SOP has a name for each step. It describes what needs to be done per step and
how this should be carried out. An excellent and highly effective SOP also includes why
a step needs to be carried out in the way described. If people do not understand the
‘why’, they will often ignore that step,
especially if it is an administrative step (see the case at the end of the chapter).

An alternative to the SOP is the checklist. Checklists are particularly helpful when the
process is non-standard or has limited aspects of repeatability. Although this document works
principally as a reminder to carry out particular activities (as does the SOP), it does not
guarantee a specific outcome. Rather, it ensures that things are not forgotten.

It is very important to determine the level of detail of the documentation. This should be based
on the risk of not having sufficient detail versus the value of a short and easy-to-understand
document. As with everything in Lean, start by focusing on the value.

8.2.2 Monitoring

To detect any irregularities, we need to know how the implemented changes are affecting
performance. To do this, we need monitoring. Our approach to monitoring should focus on:

•• Monitor the process using the updated metrics and measurements made during the
Measure phase
•• Evaluate the improvements made in the Improve phase
•• Assess the capability of the process over time and ensure that the solutions work for
the long term

Metrics

During the Define and Measure phases, we have identified and created measurement
systems for metrics related to the problem. Some of these can be re-used. The
improvement we have chosen to implement will undoubtedly be related to a Critical Success
Factor for which there is a Key Performance Indicator (see Lean IT Foundation for the
definition of a KPI).

The credibility of measurements is highly dependent on their consistency and coherence.
Consistency means that they are measured in a repeatable way that is the same across the
whole IT organization. Coherence means that the measures are self-consistent across any
number of assessments.

Ideally, we will identify leading and lagging indicators. The former will indicate whether
performance will decline or improve in the future; the latter will look at the performance
achieved. Within IT, an example of a leading indicator is the number of Problems solved. This is
a leading indicator for a decreasing number of incidents. Untested changes, on the other hand,
are a leading indicator for an increasing number of incidents. The decrease or increase in the
number of incidents is the lagging indicator.

Besides using metrics, we need to establish some form of dashboard. The input for
the dashboard is the information from the metrics. A dashboard is a visual tool to ensure
that both managers and engineers know how they are performing. The dashboard ensures
consistency in the use and interpretation of metrics and KPIs since everyone looks at the
same consistent and coherent measurements.

Visual Management

Aside from the metrics, an excellent way of spotting irregularities is to talk with the people
related to the problem area about performance and the issues they face. To do this, we can
set up Lean-style Visual Management and engage in meaningful performance dialogues
with the process owner, engineers and management.

As we saw in the Lean IT Foundation, Visual Management is about effective communication
and real-time updates regarding the work.

Performance and workload are shared for visibility and effective communication. Visual
management covers steering the work, planning and reviewing progress and, of course,
managing improvements on a daily and weekly basis. Including the Visual Management tools
as control mechanisms therefore makes sense.

Visual management helps to create consistent and effective communication. It removes the
need for a series of one-to-one communications that inherently carry the risk of an inconsistent
message. The communication is effective because the entire team hears the same
message at the same time. Lastly, frequent feedback loops are established. This is based on
common knowledge of the chosen solutions to problems.

Performance Dialogues

Measurement is vital to understand the dynamics of the processes. Measurements are
nothing but statistics, which per se do not contribute to the process; it is the inference
drawn from them that really matters. Measurement must lead to changes
in behavior and we must ensure that our behavior helps us to achieve our goals. Once
again, as we saw in the Lean IT Foundation, the performance dialogue is an instrument
that helps us better understand what good performance is and how to collaborate in order
to create value for customers.

To engage in meaningful communication to control our process, we set up performance
dialogues. Their aim is to ensure a structured and objective discussion of performance. These
discussions consist of three elements:
•• Share feedback on the performance and spot irregularities
•• Identify root causes and offer suggestions for improvement
•• Determine what needs to be done in order to correct any irregularities and whether
support is needed for completion.
The order of the steps may vary depending on whether new irregularities are being discussed,
ongoing performance is being evaluated or correcting measures are being reviewed.

Cascade

Quite often, support from other organizational units is required when implementing
improvements. Therefore, we must establish rapid and effective communication that
can easily cascade through all levels of the organization. This may require changes to
the structure of the meetings relevant to the problem area. The goal is to propose changes
to the structure so that ideas, suggestions and requests for help can flow readily through the
channels of the organization.

8.2.3 Response
Although we never know when and what kind of irregularities we will be facing, we can
prepare by setting up responses in advance. This means establishing checks that will signal
out-of-control conditions and defining actions to be taken.

During the Analyze phase, we looked at the use of the FMEA to understand what could
go wrong in a certain situation. To the FMEA, we can add OCAP (Out-of-Control-Action-Plan)
procedures.

As we saw before, a good FMEA defines:
•• The activities such as inspection, checks or measurements aimed at control
•• The frequency of the activities, who is responsible and the tools involved
•• The standard or norm that defines acceptance and rejection criteria.
Grouped together, the interventions are called an OCAP or Out-of-Control-Action-Plan. It prescribes
what to do in case a failure occurs. It is a living document, which stores knowledge about possible/
known issues and related solution strategies. The OCAP makes exception handling and firefighting
efficient and effective.

It is vital that engineers are involved in setting up the FMEA and the OCAP to ensure they know the
requirement. The OCAP was first used at Philips Semiconductors. At Philips, it had been a common
practice to establish normal operations, but no response was formalized to specific events. By adding
the OCAP to the FMEA, they created the possibility to prepare for disruptions. Also, the responsibility
for the activities was anchored in the work place. In effect, the OCAP is like a small-scale contingency
plan.

Guidelines for using the OCAP are:

•• OCAP procedures are usually documented as a flowchart
•• Knowledge and experiences with the process are documented
•• The OCAP is a living document and is updated regularly
•• Part of the OCAP is maintaining a log, in which uses of the OCAP are registered for both problem
and solution
•• This log offers insight into problems that occur frequently and guides new Kaizen initiatives.
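A minimal sketch of how checks, acceptance criteria, prescribed actions and the OCAP log fit together is given below. The class and field names, the example check and its limits are all hypothetical; in practice the OCAP is usually a flowchart maintained by the engineers rather than code.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Check:
    name: str
    lower: float  # acceptance criterion: lowest acceptable value
    upper: float  # acceptance criterion: highest acceptable value
    action: str   # prescribed response when the check is out of control


@dataclass
class Ocap:
    checks: dict[str, Check]
    log: list[dict] = field(default_factory=list)

    def evaluate(self, name: str, value: float) -> Optional[str]:
        """Return the prescribed action if the value is out of control, else None."""
        check = self.checks[name]
        if check.lower <= value <= check.upper:
            return None
        # Register every use of the OCAP: the problem and the response taken.
        self.log.append({"when": datetime.now(), "check": name,
                         "value": value, "action": check.action})
        return check.action

    def frequent_problems(self, min_count: int = 2) -> list[str]:
        """Checks that triggered at least `min_count` times: candidates for a new Kaizen."""
        counts: dict[str, int] = {}
        for entry in self.log:
            counts[entry["check"]] = counts.get(entry["check"], 0) + 1
        return [name for name, n in counts.items() if n >= min_count]


# Hypothetical example
ocap = Ocap({"failed_backups": Check("failed_backups", 0, 1,
                                     "Rerun the backup and inform the service owner")})
print(ocap.evaluate("failed_backups", 3))
```

The log is deliberately kept inside the OCAP object, reflecting the guideline that maintaining the log is part of the OCAP itself.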
8.2.4 Training
The training part in our Control plan is focused on:

•• Ensuring that each person executing a particular role knows his or her accountability
•• Providing guidance in understanding and using all the documentation
•• Providing instructions for how to use the monitoring tools
•• Providing knowledge on the response activities
To aid in setting up the necessary training, a knowledge and skills matrix can be used. With this tool,
we can readily identify the current and desired knowledge and skills for those involved in the problem
area.

If a knowledge and skills matrix already exists, it will need to be updated to include any improvements.
Lack of skills and knowledge is a kind of waste and may prevent the Kaizen improvements from
having the desired impact.
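As a hedged illustration, a knowledge and skills matrix can be held as two tables, current and desired levels per person, from which the training needs follow. The 0 to 3 level scale, the names and the skills are assumptions for this sketch.

```python
def skill_gaps(current: dict[str, dict[str, int]],
               desired: dict[str, dict[str, int]]) -> dict[str, dict[str, int]]:
    """Per person, the skills whose current level is below the desired level.

    The value is the size of the gap; a skill missing from `current` counts as level 0.
    """
    gaps: dict[str, dict[str, int]] = {}
    for person, targets in desired.items():
        have = current.get(person, {})
        missing = {skill: level - have.get(skill, 0)
                   for skill, level in targets.items()
                   if have.get(skill, 0) < level}
        if missing:
            gaps[person] = missing
    return gaps


# Hypothetical matrix, levels on an assumed 0-3 scale
current = {
    "Ann": {"CMDB update SOP": 1, "Dashboard": 3},
    "Bob": {"CMDB update SOP": 3},
}
desired = {
    "Ann": {"CMDB update SOP": 3, "Dashboard": 3},
    "Bob": {"CMDB update SOP": 3, "Dashboard": 2},
}
print(skill_gaps(current, desired))
```

When an improvement introduces a new skill, adding it to `desired` is enough for the gap to show up for everyone who has not yet been trained.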

8.3 Communication Plan

Building a communication plan is essentially about ensuring that the right people are given the right
information at the right time. When putting together a communication plan, we must include the
following variables:

•• Content: what are we communicating about?
•• Audience: for whom is the communication?
•• Purpose: why are we communicating about this content?
•• Timing: when will the communication take place? Is it a one-time event or will it recur?
•• Format: will the communication be presented in the form of a newsletter, email, interactive
meeting, presentation, training or any other form?
•• Input: from whom do we need input / consultation prior to the communication event?
•• Actions: who will ensure that the communication happens?
•• Capacity: how much time is needed to carry out the communication event?

Generic example of communication plan.
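The eight variables can also be captured as one record per communication event, which makes it easy to spot entries that are still incomplete. This is only a sketch: the field names are a hypothetical mapping of the variables listed above, and the example entry is invented.

```python
from dataclasses import dataclass, fields


@dataclass
class CommunicationItem:
    content: str           # what are we communicating about?
    audience: str          # for whom is the communication?
    purpose: str           # why are we communicating about this content?
    timing: str            # when, and is it one-time or recurring?
    format: str            # newsletter, email, meeting, presentation, training, ...
    input_from: str        # whose input or consultation is needed beforehand?
    owner: str             # "Actions": who ensures that the communication happens?
    capacity_hours: float  # time needed to carry out the communication event


def missing_fields(item: CommunicationItem) -> list[str]:
    """Names of the variables that have not been filled in yet."""
    return [f.name for f in fields(item) if getattr(item, f.name) in ("", None)]


# Hypothetical example entry with the owner still to be decided
item = CommunicationItem(
    content="New CMDB update step in Standard Changes",
    audience="Engineers executing Standard Changes",
    purpose="Explain why the step matters",
    timing="One-time, before go-live",
    format="Interactive meeting",
    input_from="Process owner",
    owner="",
    capacity_hours=1.0,
)
print(missing_fields(item))
```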

8.4 Closure

The final step in our Kaizen is its closure and the hand-off from the Kaizen team to the problem
owner (as we saw, this tends to be the Kaizen sponsor). We consider the Kaizen closed when the
problem owner has accepted the following deliverables:

•• Improved performance, including the before and after data on metrics, to be used as a baseline for
further improvements
•• A completed Kaizen A3, including lessons learned (both successes and failures) and recommendations
for further improvements
•• Documentation, including SOPs, policies and other documentation produced during the Kaizen,
such as Value Stream Maps and other tools
•• Operational training, including training on the Standard Operating Procedures and changes in
processes or policies
•• Transfer plan for sharing gained knowledge and new best practices
One of the powerful aspects of running Kaizens is to transfer successful implementations across the
entire organization, through replication and standardization. Replication means taking the solution
from the team and applying it to the same type or a similar type of problem. Standardization
means taking the lessons learned from the team and applying those good ideas and solutions to
other problems. The Kaizen team should consider standardization and replication opportunities to
significantly increase the impact on the business, so as to far exceed anticipated results.

The transfer of best practices demands great care and a well-devised implementation method.
Special care should be given to:

•• The people working in other processes: the background of the changes should be well
explained to them
•• Verification that the changes work well enough in practice and are transferable to the new
situation; this can be done by carrying out pilots of the improvement actions
•• Any feedback, especially on complications at the start
•• Acceptance of the changes
Never assume that your proposed improvements will work perfectly straight away somewhere else.
Usually, the improvements run into some complications. Fine-tuning the improvements may be
necessary, and it provides a great opportunity to involve the others. Ensure that any feedback given is
captured and used.

When the Kaizen event is officially over, a team evaluation may be done to assess how each individual
did as a team member, management may devise rewards to recognize the work of the team, and the
team may share the gained knowledge on how to run a successful Kaizen with others.

8.5 Control Phase and A3

In this final step of the Kaizen, we must complete the A3. This will entail reviewing the entire A3 to
ensure that the story that needs to be told is actually told. As the team moves through the DMAIC
cycle, each phase that is completed appears to be the most important. And it is until that point. The
key message of the A3 is: what are we doing to remove the problem we initially defined?

The analysis that the team spent so much time on, producing valuable insights through
measurements, may be reduced to a few sentences, results or graphs. The proposed options from
the Improve phase may be limited to the top three.

In finalizing the A3, we focus on creating a consistent story based on the prior documentation and,
principally, describe the solution to the problem defined in the background section.

In the Plan / Improvement section, the chosen solution to the problem is described. It is accompanied
by a plan defining how the chosen solution will be implemented in the IT organization. The Follow-up
section is where we describe the activities that we have devised to ensure that the solution remains
embedded in the IT organization, or if necessary how the solution will be disseminated throughout
the IT organization.

8.6 Key steps in the Control phase

To complete the Control phase, let us take a brief look at the main steps that need to be
accomplished.

1. Create a measurement system

Institute the metrics to control the improvement. Ensure that these are included in a dashboard for
use by all people involved. The basis for this measurement system will probably have been laid during
the Define and Measure phases.

2. Create Documentation

It is vital to record the changes made. At the same time, we must be careful about creating too much
documentation. Keep policies and process documentation concise. Ensure that the documentation is
written for the right audience. Make use of Standard Operating Procedures and checklists wherever
possible.

3. Create Control plan

Ensure that the Kaizen team makes a control plan, preferably with help from colleagues outside
the Kaizen team. This involvement helps to generate a plan that is supported by a greater number
of people and that contains acceptable controls. The control plan must include all of the four key
aspects: documentation, monitoring, response and training.

4. Communicate to stakeholders

Communicating the results and control measures to the stakeholders is vital. This is the only way to
ensure that everyone involved knows what to do. Part of the communication is achieved through
training; the rest will be achieved through information sessions. Sending an email to inform someone
of the change does not constitute communication to the stakeholders.

5. Present the Results as described on the A3

We need to finalize the Kaizen A3. The key reason is to ensure that all pertinent information is
collected in one place, and that the results of the Kaizen are explained simply. This document can
obviously be used to support the communication of the solution and the control activities to the
stakeholders.

6. Transition ownership

The last step is for the Kaizen sponsor to take ownership of the results of the Kaizen. In effect, we
move the responsibility from a ‘project team’ back to the hierarchical line. The Kaizen sponsor should
be pleased with the result, since a (part of the) problem has been solved.

Therefore, do not forget to celebrate the success with all involved.

8.7 Case Study: Control Phase

A classic problem for many IT organizations is the use and, particularly, the maintenance of the
Configuration Management Database (CMDB). Mostly, considerable effort is put into ensuring that the
Configuration Items (CIs) are recorded in the CMDB. The problem is that within months (sometimes
weeks), the CMDB is no longer up to date: CIs are missing, details of new CIs have not been entered
and people start complaining about the quality of the CMDB. This leads to general apathy towards
the CMDB, and it spirals into disuse.

One IT organization decided to take this problem seriously. The Kaizen resulted in the conclusion
that many of the aspects necessary for the CMDB to be used were in place. What was missing was
a comprehensive set of controls to ensure that everyone was focused on keeping the CMDB up to
date. The result of the Kaizen was a complete control plan, which was written by both managers and
engineers, thereby stimulating the adoption of the agreed actions.

Starting with the documentation, it was found that process documentation existed. Its quality was
fine and it turned out to still be relevant. Two pieces of documentation were missing. The first was a
policy. This document was created and consisted of nine points covering definitions, authorizations,
the allocation of responsibility for particular CIs to teams, the basic set of data to be collected and the
way quality should be monitored. The second piece of documentation was not so much missing
as in need of improvement. This concerned all Standard Change procedures. It was decided to include
the ‘CMDB update’ step three-quarters of the way through the set of steps to ensure that everyone would carry
out the update before the end of the procedure. In the SOP, it was explained why the step was so
important.

The next step was to define the monitoring activities. First, a set of simple metrics was defined:
the number of CIs with no relationships, the total number of CIs under management per team
and the number of CIs not containing the basic set of data. The metrics were used during the
Visual Management meetings, and the results were both used in performance dialogues and
communicated through the cascade. The management levels agreed that if the metrics did not
show a steadily improving result, management would ensure that corrective action was taken. The
responses were pre-determined and communicated to the teams.
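
The three metrics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the case study does not name the CMDB tool, its data model or the contents of the basic set of data, so the ConfigurationItem class, the BASIC_FIELDS tuple and the sample records below are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical basic set of data; the case study does not list the actual fields.
BASIC_FIELDS = ("name", "owner_team", "ci_type")


@dataclass
class ConfigurationItem:
    """Hypothetical, simplified CI record used only for this illustration."""
    ci_id: str
    team: str
    attributes: dict = field(default_factory=dict)
    related_ids: list = field(default_factory=list)


def cmdb_metrics(cis):
    """Compute the three control metrics for a list of CIs."""
    # Metric 1: CIs that have no relationship to any other CI.
    without_relationships = sum(1 for ci in cis if not ci.related_ids)
    # Metric 2: total number of CIs under management per team.
    per_team = Counter(ci.team for ci in cis)
    # Metric 3: CIs with at least one basic field missing or empty.
    missing_basic_data = sum(
        1 for ci in cis
        if any(not ci.attributes.get(name) for name in BASIC_FIELDS)
    )
    return {
        "cis_without_relationships": without_relationships,
        "cis_per_team": dict(per_team),
        "cis_missing_basic_data": missing_basic_data,
    }


# Made-up sample data, not taken from the case study.
cis = [
    ConfigurationItem("ci-1", "network",
                      {"name": "sw-01", "owner_team": "network", "ci_type": "switch"},
                      ["ci-2"]),
    ConfigurationItem("ci-2", "network",
                      {"name": "fw-01", "owner_team": "network"},
                      ["ci-1"]),
    ConfigurationItem("ci-3", "hosting",
                      {"name": "app-01", "owner_team": "hosting", "ci_type": "server"},
                      []),
]

metrics = cmdb_metrics(cis)
print(metrics)
```

Counts like these are simple enough to be refreshed before each Visual Management meeting, which keeps the discussion focused on the trend rather than on how the numbers were produced.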

Lastly, everyone in the whole IT organization was trained in the new way of working. The training
was primarily done by team members (not by management). In general, team members were very
persuasive in their communication to their colleagues as to the why, how and what of the way of
working surrounding the CMDB.

The result was a much better acceptance of the need to maintain the CMDB. The quality of the CMDB
improved over the ensuing months. The quality improvement did not spike and fall back; rather, it
showed a steady improvement trend.
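
The difference between a steady trend and a spike that falls back can be checked with a small sketch such as the one below. The case study does not define "steadily improving", so the rule used here (the count of problem CIs never rises from one period to the next and ends lower than it started) is an assumption, as are the function name and the sample figures.

```python
def is_steadily_improving(counts):
    """Return True if a 'lower is better' metric improves steadily.

    Assumed rule (not from the case study): the value never rises from
    one period to the next, and the last value is below the first.
    """
    if len(counts) < 2:
        return False  # a single measurement cannot show a trend
    never_rises = all(
        later <= earlier for earlier, later in zip(counts, counts[1:])
    )
    return never_rises and counts[-1] < counts[0]


# Made-up monthly counts of CIs missing the basic set of data.
steady = [120, 104, 97, 97, 81]
spike_and_fall_back = [120, 60, 95, 110]

print(is_steadily_improving(steady))               # True
print(is_steadily_improving(spike_and_fall_back))  # False
```

A check of this kind could serve as the trigger for the pre-determined responses: when it returns False, the agreed corrective action is started.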

9 Appendix 1: References
9.1 Lean Six Sigma Pocket Toolbook (chapters 1-4, 9)

Authors: Michael L. George et al.
ISBN: 0-07-144119-0
Publisher: McGraw-Hill

9.2 Understanding A3 Thinking

Authors: Durward K. Sobek II, Art Smalley
ISBN: 978-1-56327-360-5
Publisher: CRC Press

9.3 A Leader’s Framework for Decision Making

Authors: David J. Snowden, Mary E. Boone
Publisher: Harvard Business Review
Date: November 2007, pp. 69-76

10 Appendix 2: Glossary
A3 Refers to the size of a piece of paper that provides enough space to
explain a relatively complicated story, but encourages conciseness in
the communication of a message.

A3 Proposal Is used for creating a recommendation for action

A3 Status Report The A3 status report is aimed at informing all stakeholders of the
progress of the execution of a long-running project or action.

Affinity Mapping Bundling solutions that are linked, similar or overlapping in order to
reduce the number of solutions.

Agile A set of principles, originating from the development of software, that
can be and are applied to a variety of areas (e.g. Agile Project Management).

Andon Refers to a system to notify management, maintenance, and other
workers of a quality or process problem.

Analysis An A3 skill where the aim is to separate something into its constituent
parts or elements. It is vital when writing an A3 report to understand
the parts of the problem so that only the right information is given. If
we are able to discern the parts of a problem, we can also determine
which of these parts are relevant to the reader.

What was done to identify the root cause of the problem. (vb. Analyze).

Analyze (Phase) Third phase of the DMAIC cycle in which the analysis of the problem is
done.

Annotated Observation Watching what happens and noting the number of times something
happens, the amount of time spent on a task, the number of errors
made in finished products and other such observable occurrences.

Baseline Baselines and benchmarks are necessary to understand the relative
value of the performance. A baseline is the measurement of a
situation in order to understand whether a change occurs based on an
intervention after the baseline has been set. This is particularly useful
in Kaizen because we are very interested in the effect of changes that
have been implemented in the IT organization. It is vital that during the
Measure phase a baseline is set that can be used to measure progress.

Benchmark A benchmark is a standard or set of standards used in evaluating the
performance or level of quality of an organization. Benchmarking may
be used during a Kaizen to understand how well others perform a
particular activity. This may help to identify what improvements are
possible.

Capacity The maximum amount of output that the process can deliver over a
period of time.

Cause and Effect Diagram See Fishbone diagram.

Cause and Effect Matrix A cause-and-effect matrix helps to determine which factors affect the
outcomes of the process being investigated.

Change Over Time Time needed to change from processing one unit of work to processing
a different one. Within IT, this is the time lost due to context-switching.

Check sheet The check sheet is a simple and highly effective tool for collecting
quality-related data in a structured way. It is a way to assess a process
and can function as input for other analyses when there is limited or no
numerical data to be analyzed.

Common cause variation Sources of variation in a process that are inherent to the process, also
referred to as noise.

Continuous Improvement Ongoing process in an organization with the objective to find, resolve
and share solutions to problems. The objective is to achieve perfection,
in other words to improve value streams, product and customer
value. A philosophy of frequently reviewing processes, identifying
opportunities for improvement, and implementing changes to get closer
to perfection.

Control Chart The control chart is essentially a time-series chart. A time-series chart
is one in which data is plotted on a chart where the horizontal axis is
a time sequence. The vertical axis can be numbers or another variable
whose value can be different over time.

A control chart helps to understand variation.

Control (Phase) The fifth and final phase of the DMAIC cycle. This phase ensures that
improvements are implemented and anchored into the way of working

Control Plan A plan aimed at maintaining the changes that were made in order
to sustain the improvements. This plan consists of four basic parts:
documentation, monitoring, response and training.

Control Variable This kind of variable is particularly useful in experiments. This variable
is kept constant while others are changed so that they can be
investigated.

Customer The person or group of people who use your product or service OR the
person next in line in the value stream. A person who buys, uses or derives
value from a product/service. Only the ultimate customer defines value.
The person ‘next in line’ is referred to as a ‘partner in the value stream’,
or an ‘internal’ customer.

Customer Value A capability provided to a customer at the right time at an appropriate
price, as defined by the customer. The more a product or service meets
a customer’s needs in terms of affordability, availability and utility,
the greater value it has. Thus, a product with true value will enable, or
provide the capability for, the customer to accomplish his objective.

Cynefin (Model) A model in which decision-making contexts are categorized into one of five
types: simple, complicated, complex, chaotic and disorder.

Daily Kaizen Act of responding to everyday occurrences such as incidents, mistakes
and other quality issues and addressing quality issues at the source
rather than being satisfied with quick fixes.

Define (Phase) The first phase of the DMAIC cycle, in which the problem to be solved is
defined and agreed

Dependent Variable This is the output; in effect, this is the problem that is captured as part
of the Measure phase.

DevOps DevOps is a solution that derives its effectiveness from the integration
of a number of critical areas: process, organization, performance,
behavior & attitude and automation.

DMAIC Acronym for the five steps in problem solving with Kaizen, i.e.: Define,
Measure, Analyze, Improve and Control.

DMEDI Acronym for the five steps in problem solving with Kaikaku, i.e.: Define,
Measure, Explore, Decide and Implement.

Fishbone diagram The fishbone diagram identifies many possible causes for an effect or
problem. It can be used to structure a brainstorming session.

Five “Whys” A root-cause analysis tool used to identify the true root cause of a
problem. The question “why” is asked a sufficient number of times
to find the fundamental reason for the problem. Once that cause
is identified, an appropriate countermeasure can be designed and
implemented in order to prevent recurrence.

Flow The smooth, uninterrupted movement of a product or service through
a series of process steps. In true flow, the work product (information,
paperwork, material, etc.) passing through the series of steps never
stops.

Flowchart A flowchart is one of the simplest of the seven quality tools. The
flowchart is the visual representation of a series of steps in a process,
and helps to break down a complicated process into a simple series
of steps. This simplification ensures that the process becomes
understandable to anyone.

Failure Mode and Effect Analysis (FMEA) Failure modes and effects analysis (FMEA) is an analysis for identifying
all possible failures in a design, process, product or service. The failure
modes are the ways in which something might fail. Failures are any
errors or defects and can be potential or actual. The effects analysis is
about understanding the consequences of those failures.

The aim of the FMEA is to take actions to remove the sources of failure,
i.e. the root causes, starting with those with the greatest impact. FMEA
can be used throughout the lifecycle of an IT service, from design to
operation and retirement of the service

Gemba The place where the work is done. Within a lean context, Gemba simply
refers to the location where value is created

Histogram A histogram is "a representation of a frequency distribution by means
of rectangles whose widths represent class intervals and whose areas
are proportional to the corresponding frequencies." In short, this means
that we create a graph in which groups of numbers are plotted based
on how often they appear.

The power of histograms is that they allow us to analyze extremely
large datasets by reducing them to a single graph that can show one or
more peaks in data. The histogram also visualizes the significance of the
peaks.

Hypothesis A hypothesis is a statement that will start with the words “I/We think/
believe that …”. The hypothesis is as yet not supported by any factual
basis. The hypothesis is based on people’s beliefs as a result of their
observations. These are by definition selective and biased, and very
much in need of testing through thorough analysis of the data and facts
that can be found.

Improve (Phase) Fourth phase of the DMAIC cycle. The Kaizen team thinks up possible
solutions to the problem based on the analysis done.

Improvement Board Board that presents current problems and the follow-up to resolving
or addressing those problems (also Kaizen Board); an element of Visual
Management.

Incident An unplanned interruption to an IT service or reduction in the quality
of an IT service. Failure of a configuration item that has not yet affected
service is also an incident.

Independent Variable In the case of problem-solving, the independent variable can be seen
as something that may or may not contribute to the problem. The aim
is obviously to find the independent variables that have the greatest
effect on the problem.

Ishikawa diagram See Fishbone diagram.

Jidoka Creating an environment in which disturbances to the flow of work
through the value streams are made visible, i.e. problems are not left
covered up.

Kaikaku Japanese for "radical change"; a business concept concerned with
making fundamental, transformational and radical changes to a
production system, unlike Kaizen, which is focused on incremental minor
changes.

Kaizen An improvement philosophy in which continuous incremental
improvement occurs over a sustained period of time, creating more
value and less waste, resulting in increased speed, lower costs and
improved quality. When applied to a business enterprise, it refers to
ongoing improvement involving the entire workforce including senior
leadership, middle management and frontline workers. Kaizen is also
a philosophy that assumes that our way of life (working, social or
personal) deserves to be constantly improved.

Kaizen board See Improvement board

Kaizen charter The document in which the problem is described and an indication is
given of what resources (people, time, money) are to be allocated to the
resolution of the problem.

Kaizen Event See DMAIC

Kaizen lead This person manages the Kaizen process on behalf of the sponsor and
the team

Kaizen Mindset There must be a belief throughout the IT organization, both among
managers and employees, that improving IT services and the way they
are delivered can and must be done on a daily basis

Kaizen sponsor This person is the owner of the problem, and has a direct interest in
having the problem solved.

Kaizen team member The people executing this role will do the required work. They must be
involved with the problem as it occurs on the work floor

Kakushin This is the third form of improvement. Kakushin focuses on innovation,
reform and renewal. It differs from Kaikaku in that Kaikaku deals with
transformational change of existing structures, systems, etc. Kakushin
deals with the introduction of completely new structures, systems, etc.

Known Error A Problem for which the root cause and a workaround have been
documented

Lead Time The time between the moment the customer submits their request and
the moment they receive the requested item or service.

Little’s Law Little’s Law = the number of units of work in the process (WIP) /
average completion rate. Helps us understand the relationship between
lead time and work-in-progress.
Machine Time The time a unit of work is worked on by a machine. This is a type of
waiting time.
Measure (Phase) Second Phase of the DMAIC cycle. In this phase, facts and figures are
collected to understand the problem we are trying to resolve.

MECE Acronym for Mutually Exclusive, Collectively Exhaustive. Mutually
Exclusive means that all items in a particular category only belong
to that category, and no other. Collectively Exhaustive means that all
possibilities have been covered.

Muda Japanese word for waste. See Non-value-added and Waste.

Multi-Voting Multi-voting focuses on prioritizing the solutions by allowing each team
member to allocate votes to a set of solutions.

Mura Japanese word meaning unevenness; irregularity; lack of uniformity;
variation.

Muri Japanese word meaning overburden; unreasonableness;
excessiveness. Often related to policy-based waste.

Pareto chart or diagram Bar chart showing the causes of a problem or condition, ordered from
largest to smallest contribution. An effective tool to show what the big contributors
to the problem are.

PDCA Cycle Plan, Do, Check, Act is a well-known continuous improvement method
often referred to as the Deming Circle. The PDCA cycle is applicable in
any situation, and forms the basis for all improvement within Lean.

Performance Dialogue The aim of performance dialogues is to ensure a structured and objective
discussion of performance. These discussions consist of three elements.

Poka Yoke Literally, to prevent an unintentional error. This is a concept aimed at
ensuring that activities can only be done in one way, the right way;
fool-proofing an activity.

Problem An undesired situation that stands in the way of providing the
necessary customer value; an opportunity to improve. Also, the root
cause of incidents (ITIL Definition of Problem, denoted with a capital P).

Problem Board See Improvement board.

Problem Management A core ITIL operational process whose aim is to prevent problems
and incidents, eliminate repeating incidents and minimize the impact of
incidents that cannot be prevented.

Problem Statement A statement that helps the team investigating the problem to focus its
attention. The problem statement may be in the form of a question or
in the form of a statement. The former is preferable because it is then
clear when you have found the answer to the question.

Process Cycle Efficiency (PCE) Process Cycle Efficiency refers to the degree of efficiency of a process
(or set of processes) whether it relates to the level of success of
processing within an organization, the cost-effectiveness of a market,
or the erosion of income by expense.

Pyramid Principle Developed by Barbara Minto. The Pyramid Principle is a method that
is fully compatible with A3 thinking. In fact, it helps to structure the
information and insights gained during the Kaizen event.

The problem is framed using the following framework:
Situation-Complication-Key Question-Answer.

Queue Time The time a unit of work is in a queue. This is a type of waiting time

Repeater These units of work occur regularly; indicative frequency is weekly. As
an example within IT, we find high impact incidents, small to medium
sized non-standard changes and the smaller advisory services.

Root cause The underlying or original cause of an incident or problem.

Root cause analysis Studying the fundamental causes of a problem, as opposed to analyzing
symptoms.

Runner Units of work that occur on a daily basis and tend to require up to one
hour of work for them to be completed. Within IT, we can say that
incidents, service requests, standard changes and operational activities
fall in this category.

SCAMPER An idea generation technique that uses action verbs as triggers to
generate ideas. SCAMPER is an acronym with each letter standing for
an action verb which in turn stands for a prompt for creative ideas. S –
Substitute, C – Combine, A – Adapt, M – Modify, P – Put to another use,
E – Eliminate, R – Reverse.

Scatter diagram A graph that aims to demonstrate the relationship between two sets of
data. We try to understand whether there is a correlation between two
sets of data and whether this correlation is positive or negative.

Shewhart Cycle Often referred to as the Deming Circle: the Plan-Do-Check-Act (PDCA)
sequence.

SIPOC Supplier, Input, Process, Output, Customer. Diagram used to establish
the Kaizen project team, create the project charter and planning, get
stakeholders’ support and start the project.

SMART Specific, Measurable, Achievable, Realistic, Time-bound

Solution Matrix Matrix in which solutions can be plotted according to two axes:
feasibility and anticipated cost.

Special Cause Variation Source of variation that can be assigned to a specific cause which can
usually be discovered. Special causes generate patterns in the data and
provide signals about the problems in the process and how they can be
resolved.

Standard Operating Procedure (SOP) An SOP is a written procedure that describes how a specific task should
be carried out.

Stranger Units of work that have an irregular occurrence. IT ‘strangers’ are
large non-standard changes, large requests for advice and plans, which
all tend to occur or be updated on a monthly or quarterly basis.

Summarize An A3 skill that is the ability to express thoughts, facts, and other
information concisely

Synthesis Refers to a combination of two or more entities that together form
something new; it is required to address complex and
chaotic problems (vb. Synthesize).

System Thinking Systems thinking has been defined as an approach to problem solving,
by viewing "problems" as parts of an overall system, rather than
reacting to specific parts, outcomes or events, and thereby potentially
contributing to further development of unintended consequences.

It is the process of understanding how those things which may be
regarded as systems influence one another within a complete entity, or
a larger system.

Takt Time The available working time divided by the volume of customer demand in
that period; in other words, the inverse of the customer demand rate.

Tally sheet See Check Sheet

Throughput The actual amount of output over a period of time. This is invariably
lower than the capacity as a result of waste.

Value Stream Map (VSM) A technique used to analyze the flow of materials and information
currently required to bring a product or service to a consumer. A
visual representation of all of the process steps (both value-added and
non-value-added) required to transform a customer requirement into
a delivered good or service. A VSM shows the connection between
information flow and product flow, as well as the major process
blocks and barriers to flow. VSMs are used to document current state
conditions as well as design a future state. One of the key objectives
of Value Stream Mapping is to identify non-value adding activities
for elimination. Value Stream Maps, along with the Value Stream
Implementation Plan are strategic tools used to help identify, prioritize
and communicate continuous improvement activities.

Variable, Control This kind of variable is particularly useful in experiments. This variable is
kept constant while others are changed so that they can be investigated

Variable, Dependent This is the output; in effect, this is the problem

Variable, Independent This is an input. In the case of problem-solving, the independent
variable can be seen as something that may or may not contribute to
the problem. The aim is obviously to find the independent variables that
have the greatest effect on the problem.

Visual Management Visual Management is about effective communication and real-time
updates regarding the work.

Visualize An A3 skill used to turn your story into a visual experience using
pictures and graphics to explain what has been investigated and what is
proposed as a solution.

Voice of the Business Concerns the ‘business’ of the IT organization itself; not to be confused
with the fact that the customer of IT is regularly referred to as “the
business”.

Voice of the Customer Gives the IT organization feedback on how the customer, the user of the
IT service, actually experiences the IT service

Voice of the Process Provides information about processes not working correctly.

Voice of the Regulator Those representing regulatory requirements

VSM See Value Stream Map

Work in Progress The number of uncompleted units of work that are still in the process.
This number is directly related to the lead time (Little’s Law)
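
As a worked illustration of how work in progress relates to lead time, Little’s Law can be applied as in the sketch below. The function name, the unit of time (days) and the sample figures are assumptions made for this example only.

```python
def lead_time(wip_units, completion_rate_per_day):
    """Little's Law: average lead time = WIP / average completion rate.

    wip_units: number of uncompleted units of work in the process.
    completion_rate_per_day: average number of units completed per day.
    Returns the average lead time in days.
    """
    if completion_rate_per_day <= 0:
        raise ValueError("completion rate must be greater than zero")
    return wip_units / completion_rate_per_day


# Made-up figures: 30 open changes, 5 changes completed per day on average.
print(lead_time(30, 5))  # 6.0 days

# Halving the work in progress at the same completion rate halves the lead time.
print(lead_time(15, 5))  # 3.0 days
```

The second call shows why limiting work in progress is a common improvement: with an unchanged completion rate, less work in the process means the customer waits less.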

11 About the author
11.1 Niels Loader

As an advisor to dozens of IT organizations, Niels has extensive knowledge and experience in implementing
IT Service Management, IT Performance Management, Lean IT and DevOps within IT organizations.
In 2010 and 2011, he was one of the initiators of the Lean IT Foundation certification and spent four
years as the Chief Examiner for the APMG Lean IT certification. He is the lead of the Content team of
the Lean IT Association.

The author would like to thank everyone who put their time and effort into improving this document.

The author would especially like to thank Troy DuMoulin for the inspiring discussions to get the right
content into the Lean IT Kaizen syllabus and the first reviews. And many thanks to Gary Case, Rita
Pilon, Hans van den Bent, Marianne Hubregtse and Mike Orzen for their critical reviews, which helped
to improve this publication.

Copyright © 2015 Lean IT Association.

For all your inquiries, please contact info@leanitassociation.com

or visit us at www.leanitassociation.com
