Foreword
As an education and training organization within the IT Service Management (ITSM) industry, we have been impressed by the positive changes introduced by the Version 3 refresh of the ITIL framework. The evolution of the framework's core principles and practices provides the more holistic guidance needed for an industry that continues to mature and develop at a rapid pace. We recognize, however, that many organizations and individuals who previously struggled to adopt the framework will continue to find it challenging to implement ITIL as part of their approach to governing IT Service Management practices. In light of this, one of our primary goals is to provide the quality education and support materials needed to enable the understanding and application of the ITIL framework in a wide range of contexts. This workbook's primary purpose is to complement the accredited ITIL Planning, Protection & Optimization program provided by The Art of Service or one of our accredited partners. We hope you find this book to be a useful tool in your educational library and wish you well in your IT Service Management career!
The Art of Service Pty Ltd All of the information in this document is subject to copyright. No part of this document may in any form or by any means (whether electronic or mechanical or otherwise) be copied, reproduced, stored in a retrieval system, transmitted or provided to any other person without the prior written permission of The Art of Service Pty Ltd, who owns the copyright. ITIL is a Registered Community Trade Mark of OGC (Office of Government Commerce, London, UK), and is Registered in the U.S. Patent and Trademark Office. The Art of Service
Contents
ITIL V3 : Planning, Protection & Optimization Best Practices
FOREWORD
1 INTRODUCTION
2 IT SERVICE MANAGEMENT
  2.1 THE FOUR PERSPECTIVES (ATTRIBUTES) OF ITSM
  2.2 BENEFITS OF ITSM
  2.3 BUSINESS AND IT ALIGNMENT
3 WHAT IS ITIL?
  3.1 THE SERVICE LIFECYCLE
  3.2 MAPPING THE CONCEPTS OF ITIL TO THE SERVICE LIFECYCLE
  3.3 HOW DOES THE SERVICE LIFECYCLE WORK?
4 COMMON TERMINOLOGY
  4.1 WHAT ARE SERVICES?
  4.2 PROCESSES & FUNCTIONS
  4.3 OTHER COMMON TERMINOLOGY
5 RELATIONSHIP BETWEEN THE PLANNING, PROTECTION AND OPTIMIZATION PROCESSES AND THE SERVICE LIFECYCLE
  5.1 SERVICE STRATEGY
  5.2 OBJECTIVES OF SERVICE STRATEGY
  5.3 BENEFITS OF SERVICE STRATEGY
  5.4 SERVICE STRATEGY INTERFACES WITH OTHER SERVICE LIFECYCLE PHASES
  5.5 SERVICE DESIGN
  5.6 OBJECTIVES OF SERVICE DESIGN
  5.7 BENEFITS OF SERVICE DESIGN
  5.8 SERVICE DESIGN INTERFACES WITH OTHER SERVICE LIFECYCLE PHASES
6
  6.1 DEMAND MANAGEMENT
  6.2 AVAILABILITY MANAGEMENT
  6.3 CAPACITY MANAGEMENT
  6.4 IT SERVICE CONTINUITY MANAGEMENT
  6.5 INFORMATION SECURITY MANAGEMENT
7
  7.1 GENERIC ROLES
  7.2 ROLES WITHIN PLANNING PROTECTION AND OPTIMIZATION
8 TECHNOLOGY CONSIDERATIONS
  8.1 KNOWLEDGE MANAGEMENT TOOLS
9 IMPLEMENTING PLANNING, PROTECTION AND OPTIMIZATION PROCESSES
  9.1 THE CONTINUAL SERVICE IMPROVEMENT MODEL
  9.2 MANAGING CULTURAL CHANGE
10
  10.1 REVIEW QUESTIONS
11
  11.1 THE PRACTICE OF SERVICE MANAGEMENT
  11.2 SERVICE STRATEGY PRINCIPLES
  11.3 STRATEGY AND ORGANIZATION
  11.4 TECHNOLOGY AND STRATEGY
  11.5 SERVICE DESIGN PRINCIPLES
  11.6 AVAILABILITY MANAGEMENT
  11.7 CAPACITY MANAGEMENT
  11.8 IT SERVICE CONTINUITY MANAGEMENT
  11.9 INFORMATION SECURITY MANAGEMENT
12 GLOSSARY
13 CERTIFICATION
  13.1 ITIL CERTIFICATION PATHWAYS
  13.2 ISO/IEC 20000 PATHWAYS
14
15
1 Introduction
More than a decade of building knowledge and experience around IT Service Management, together with many worthwhile discussions with our clients, made us realize that most companies still focus on the technology and technical capabilities of an IT organization, rather than looking from the outside into IT. Technology and applications were often designed with the IT specialist and not the user in mind. But not everybody worked like this: there were many companies who aimed for better quality and for more alignment between the business needs and the IT offerings. However, in most of these cases the processes that were implemented focused more on operations than on design. The design aspect was a separate activity, performed by architects and the strategic development teams.
Now with ITIL Version 3 we have the opportunity to look at this situation as if we were entirely new to the industry. What is it that the business needs from the IT group? How will IT Service Management help us in designing and delivering IT services that support those needs? How are we going to make sure we deliver services that are designed in a planned manner, and deliver these services in the most difficult circumstances? The business requires our support to achieve its business goals, or to put it more simply: to get money in through the doors.
With this in mind, this workbook aims to develop the reader's knowledge and appreciation of the practices for IT Service Management, with particular focus on those capabilities required for Planning, Protection & Optimization in the modern IT environment. This workbook is designed to be used alongside an accredited ITIL training program and practical experience gained in the field. The authors assume that readers already have some familiarity with IT and ITIL terminology or have already completed an ITIL Foundation program.
2 IT Service Management
The term IT Service Management (ITSM) is used in many ways by different management frameworks and organizations seeking governance and increased maturity of their IT organization. Standard elements for most definitions of ITSM include:
- Description of the processes required to deliver and support IT Services for customers;
- The purpose primarily being to deliver and support the technology or products needed by the business to meet key organizational objectives or goals;
- Definition of roles and responsibilities for the people involved, including IT staff, customers and other stakeholders; and
- The management of external suppliers (partners) involved in the delivery and support of the technology and products being delivered and supported by IT.
The combination of these elements provides the capabilities required for an IT organization to deliver and support quality IT Services that meet specific business needs and requirements.
The official ITIL definition of IT Service Management is found within the Service Design volume (page 11), describing ITSM as "A set of specialized organizational capabilities for providing value to customers in the form of services". These organizational capabilities are influenced by the needs and requirements of customers, the culture that exists within the service organization and the intangible nature of the output and intermediate products of IT services.
However, IT Service Management comprises more than just these capabilities alone, being complemented by an industry of professional practice and a wealth of knowledge, experience and skills. The ITIL framework has developed as a major source of good practice in Service Management and is used by organizations worldwide to establish and improve their ITSM practices.
There are four perspectives (the "4 Ps") or attributes that explain the concept of ITSM:
- Partners/Suppliers Perspective: Takes into account the importance of Partner and External Supplier relationships and how they contribute to Service Delivery.
- People Perspective: Concerned with the "soft" side of ITSM. This includes IT staff, customers and other stakeholders. E.g. do staff have the correct skills and knowledge to perform their roles?
- Products/Technology Perspective: Takes into account IT services, hardware and software, budgets and tools.
- Process Perspective: Relates to the end-to-end delivery of service based on process flows.
Quality IT Service Management ensures that each of these four perspectives is taken into account as part of the continual improvement of the IT organization. The same perspectives need to be considered and catered for when designing new or modified services, so that those services succeed in their design, transition and eventual adoption by customers.
Example to illustrate business and IT alignment:
Business: A fashion store.
What are some of your organization's objectives or strategic goals?
- We want to make a lot of money $$$!
- We want to have a good image and reputation.
What Business Processes aid in achieving those objectives?
- Retail, marketing, buying, procurement, HR etc.
What IT Services are these business processes dependent on?
- Web site, email, automatic procurement system for buying products, Point of Sale Services.
We have ITSM in order to make sure the IT Services are:
- What we need (Service Level Management, Capacity Management etc.);
- Available when we need them (Availability Management, Incident Management etc.); and
- Provisioned cost-effectively (Financial Management, Service Level Management).
If we don't manage the IT Services appropriately, we cannot rely on these services to be available when we need them. If this occurs, we cannot support our business processes effectively and efficiently, and therefore we cannot meet or support our overall organization's objectives!
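The chain of dependencies in the fashion store example (objectives rely on business processes, which rely on IT Services) can be sketched as a simple lookup. This is only an illustration: the pairings between individual objectives, processes and services below are assumptions made for the sketch, since the example lists them without linking them one to one, and the function name is hypothetical.

```python
# Hypothetical pairings based on the fashion store example; which process
# supports which objective, and which service supports which process,
# are assumptions made for illustration.
OBJECTIVE_TO_PROCESSES = {
    "Make money": ["Retail", "Buying", "Procurement"],
    "Good image and reputation": ["Marketing", "Retail"],
}

PROCESS_TO_SERVICES = {
    "Retail": ["Point of Sale", "Web site"],
    "Marketing": ["Web site", "Email"],
    "Buying": ["Automatic procurement system"],
    "Procurement": ["Automatic procurement system", "Email"],
}


def objectives_at_risk(unavailable_service):
    """Return the objectives that depend, via a business process, on the given IT Service."""
    affected = []
    for objective, processes in OBJECTIVE_TO_PROCESSES.items():
        # An objective is at risk if any of its processes uses the service.
        if any(unavailable_service in PROCESS_TO_SERVICES.get(p, []) for p in processes):
            affected.append(objective)
    return affected


print(objectives_at_risk("Point of Sale"))
print(objectives_at_risk("Automatic procurement system"))
```

Tracing an outage upwards in this way is one simple means of showing the business which objectives are exposed when a given IT Service is not managed appropriately.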
3 What is ITIL?
ITIL stands for the Information Technology Infrastructure Library. ITIL is the international de facto management framework describing good practices for IT Service Management. The ITIL framework evolved from the UK government's efforts during the 1980s to document how successful organizations approached service management. By the early 1990s it had produced a large collection of books documenting the best practices for IT Service Management. This collection was eventually entitled the IT Infrastructure Library. The Office of Government Commerce in the UK continues to operate as the trademark owner of ITIL.
ITIL has gone through several evolutions and was most recently refreshed with the release of Version 3 in 2007. Through these evolutions the scope of practices documented has increased in order to stay current with the continued maturity of the IT industry and meet the needs and requirements of the ITSM professional community.
ITIL is only one of many sources for best practices, including those documented by:
- Public frameworks (ITIL, COBIT, CMMI etc.);
- Standards (ISO 20000, BS 15000); and
- Proprietary knowledge of organizations and individuals.
Generally, best practices are those formalized as a result of being successful in wide industry use.
Five volumes make up the IT Infrastructure Library (Version 3):
- Service Strategy;
- Service Design;
- Service Transition;
- Service Operation; and
- Continual Service Improvement.
3.1 The Service Lifecycle
Figure 3.2 ITIL Service Lifecycle Model Crown Copyright 2007 Reproduced under license from OGC
Lifecycle: The natural process of stages that an organism or inanimate object goes through as it matures. For example, human stages are birth, infant, toddler, child, pre-teen, teenager, young adult, adult, elderly adult and death.
The concept of the Service Lifecycle is fundamental to the refresh of ITIL for Version 3. Previously, much of the focus of ITIL was on the processes required to design, deliver and support services for customers. As a result of this focus on processes, Version 2 of the ITIL Framework provided best practices for ITSM based around the "how" questions. These included:
- How should we design for availability, capacity and continuity of services?
- How can we respond to and manage incidents, problems and known errors?
Version 3 now maintains a holistic view covering the entire lifecycle of a service. No longer does ITIL just answer the "how" questions, but also "why?":
- Why does a customer need this service?
- Why should the customer purchase services from us?
- Why should we provide (x) levels of availability, capacity and continuity?
Asking these questions first enables a service provider to set overall strategic objectives for the IT organization, which are then used to direct how services are designed, transitioned, supported and improved in order to deliver optimum value to customers and stakeholders.
The ultimate success of service management is indicated by the strength of the relationship between customers and service providers. The 5 phases of the Service Lifecycle provide the necessary guidance to achieve this success. Together they provide a body of knowledge and set of good practices for successful service management. This end-to-end view of how IT should be integrated with business strategy is at the heart of ITIL's five core volumes (books).
NOTES:
- The Service Lifecycle phases (and ITIL books) are shown through the arrows at the bottom.
- The concepts in dark shading are the V2 ITIL concepts.
- The concepts not shaded are the new ITIL V3 concepts.
- The concepts in light shading are Functions.
- Although Service Level Management officially sits in the Service Design book, it plays a very important role in the Continual Service Improvement phase, and therefore could also fit in the CSI book as a process.
[Figure: information passed between the Service Lifecycle phases (Service Strategy, Service Design, Service Transition, Service Operation, Continual Service Improvement). Labels shown: Patterns of Business Activity; Service Portfolio information; New and changed service assets; Service Catalogue, SLAs, OLAs, UCs; Testing and Validation Criteria; Known Errors from Development; Testing and validation results; Change Authorization; Incidents & Problems, Events, Service Requests; Request for Changes; Information collected from infrastructure monitoring.]
It is important to note that most of the processes defined are not executed within only one lifecycle phase. As an example, we will look at the process of Availability Management and where some of its activities are executed throughout the Service Lifecycle.
- Service Strategy Phase: Determines the needs, priorities, demands and relative importance for desired services. Identifies the value being created through services and the predicted financial resources required to design, deliver and support them.
- Service Design Phase: Designs the infrastructure, processes and support mechanisms needed to meet the Availability requirements of the customer.
- Service Transition Phase: Validates that the Service meets the functional and technical fitness criteria to justify release to the customer.
- Service Operation Phase: Monitors the ongoing Availability being provided. During this phase we also manage and resolve incidents that affect Service Availability.
- Continual Service Improvement Phase: Coordinates the collection of data, information and knowledge regarding the quality and performance of services supplied and Service Management activities performed. Service Improvement Plans are developed and coordinated to improve any aspect involved in the management of IT services.
4 Common Terminology
Critical to our ability to participate with and apply the concepts from the ITIL framework is the need to be able to speak a common language with other IT staff, customers, end-users and other involved stakeholders. This next section documents the important common terminology that is used throughout the ITIL framework.
Figure 4.1 The importance of terminology Crown Copyright 2007 Reproduced under license from OGC
4.1 What are Services?
The concept of IT Services as opposed to IT components is central to understanding the Service Lifecycle and IT Service Management principles in general. It requires not just a learned set of skills but also a way of thinking that often challenges the traditional instincts of IT workers to focus on the individual components (typically the applications or hardware under their care) that make up the IT infrastructure. The mindset requires instead an alternative outlook to be maintained, with the focus being the Service oriented or end-to-end view of what their organization actually provides to its customers.
The official definition of a Service is "a means of delivering value to Customers by facilitating outcomes customers want to achieve without the ownership of specific costs or risks". Well, what does this actually mean? To explain some of the key concepts I will use an analogy that most (food lovers) will understand.
While I do enjoy cooking, there are often times when I wish to enjoy quality food without the time and effort required to prepare a meal. If I were to cook, I would need to go to a grocery store, buy the ingredients, take these ingredients home, prepare and cook the meal, set the table and, of course, clean up the kitchen afterwards. Alternatively, I can go to a restaurant that delivers a service that provides me with the same outcome (a nice meal) without the time, effort and general fuss of cooking it myself.
Now consider how I would identify the quality and value of the service being provided. It isn't just the quality of the food itself that will influence my perceptions, but also:
- The cleanliness of the restaurant;
- The friendliness and customer service skills of the waiters and other staff;
- The ambience of the restaurant (lighting, music, decorations etc.);
- The time taken to receive my meal (and was it what I asked for?); and
- Whether they offered water as well as normal drinks and beverages.
If just one of these factors doesn't meet my expectations, then ultimately the perceived quality and value being delivered to me as a customer are negatively impacted.
Now relate this to our role in providing an IT Service. If we as IT staff focus on the application or hardware elements being provided and forget or ignore the importance of the surrounding elements that make up the end-to-end service, then, just like in the example of the restaurant, the customer experience and perceived quality and value will be negatively impacted. But if we take a Service oriented perspective, we also ensure that:
- Communication with customers and end users is effectively maintained;
- Appropriate resolution times are maintained for end user and customer enquiries;
- Transparency and visibility of the IT organization and where money is being spent is maintained; and
- The IT organization works proactively to identify potential problems that should be rectified or improvements that could be made.
To help clarify the relationship between IT Services and the business we need to look at the concepts of Business Units and Service Units.
4.1.1
Figure 4.2 Business Units Crown Copyright 2007 Reproduced under license from OGC
A business unit is a bundle of assets with the purpose of creating value for customers in the form of goods and services. Its customers pay for the value they receive, which ensures that the business unit maintains an adequate return on investment. The relationship is good as long as the customer receives value and the business unit recovers cost and receives some form of compensation or profit (depending on the nature of the organization). A business unit's capabilities coordinate, control and deploy its resources to create value in the form of business services. Some services simply increase the resources available to the customer; others may increase the performance of the customer's management, organization, people and/or processes. The relationship with the customer becomes strong when there is a balance between value created and returns generated.
Figure 4.3 Balancing service potential against demand for services Crown Copyright 2007 Reproduced under license from OGC
Service units are similar structures to business units, being a bundle of service assets that specializes in creating value in the form of IT services. Business units (customers) and Service units (providers) can be part of the same organization or come from multiple independent organizations. This relationship enables Business Units to focus on the outcomes provided by services, while the Service Unit(s) focus on managing the costs and risks of providing the service, possibly by spreading these costs and risks across more than one customer or Business Unit.
In the case of Service Offerings & Agreements, the nature of the relationship between Business Units and Service Units will affect the approach taken, potentially requiring greater focus on one or more of the following areas:
- Legal requirements and contractual obligations (particularly when the business units are from a different organization to that of the service unit);
- Technical documentation and clarity (particularly when the business units/customers are IT organizations themselves); and
- Risk Management, depending on the number and size of business units/customers being served.
[Figure: UTILITY and WARRANTY combine to create Value. Questions shown in the figure: Available enough? Capacity enough? Continuous enough? Secure enough?]
Figure 4.4 Creating Service Value Crown Copyright 2007 Reproduced under license from OGC
Formula: Service Warranty + Service Utility = Service Value
Service Utility describes the positive effect on business processes, activities, objects and tasks. This could be the removal of constraints that improves performance or some other positive effect that improves the outcomes managed and focused on by the customer and business. This is generally summarized as being "fit for purpose".
Service Warranty, on the other hand, describes how well these benefits are delivered to the customer. It describes the Service's attributes, such as the availability, capacity, performance, security and continuity levels to be delivered by the provider. Importantly, the Service Utility potential is only realized when the Service is available with sufficient capacity and performance.
Describing both Service Utility and Service Warranty enables the provider to clearly establish the value of the Service, differentiate themselves from the competition and, where necessary, attach a meaningful price tag that has relevance to the customer and associated market space.
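The point that Utility is only realized when the Warranty levels are sufficient can be illustrated with a small check. This is a minimal sketch rather than an ITIL-defined calculation: the attribute names, the numbers and both function names are assumptions chosen for the example.

```python
def warranty_gaps(required, delivered):
    """Return the warranty attributes whose delivered level is below the required level."""
    return sorted(
        name for name, level in required.items()
        if delivered.get(name, 0) < level
    )


def delivers_value(fit_for_purpose, required, delivered):
    """Value needs both Utility (fit for purpose) and Warranty (no gaps)."""
    return fit_for_purpose and not warranty_gaps(required, delivered)


# Hypothetical figures for illustration only.
required = {"availability_pct": 98.0, "concurrent_users": 500}
delivered = {"availability_pct": 99.1, "concurrent_users": 400}

print(warranty_gaps(required, delivered))         # attributes falling short
print(delivers_value(True, required, delivered))  # useful service, but under capacity
```

In this sketch a service that does exactly what the customer wants still fails to deliver value while it lacks sufficient capacity, which mirrors the relationship between Utility and Warranty described above.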
4.1.3
To discuss Service Packages, Service Level Packages and how they are used to offer choice and value to customers, we're going to use the example of the packages made available by typical Internet Service Providers (ISPs). As customers, we have a wide range of choice when looking for an ISP to provide broadband internet. As a result, ISPs need to work hard to attract customers by communicating the value that they provide through their offerings. They also need to offer a wide range of choice for customers, who have varying requirements and needs for their broadband internet service.
A Service Package provides a detailed description of a package of bundled services available to be delivered to Customers. The contents of a Service Package include:
- The core services provided;
- Any supporting services provided (often the excitement factors); and
- The Service Level Package (described below).
Service Level Packages are effective in developing service packages with levels of utility and warranty appropriate to the customer's needs and in a cost-effective way. Service Level Packages include:
- Availability & Capacity Levels;
- Continuity Measures; and
- Security Levels.
So for our ISP example, we can define a Service Package in the following way:
Service Package: Broadband SuperUser ($69.95 per month)
Core Service Package:
- Internet Connection
- Email Addresses
Supporting Services Package:
- Static IP Address
- Spam filtering
- 100MB Web-Space
- VOIP
Service Level Package:
- Download Speeds: 8000 kbps to 24 000 kbps (max)
- Download Quota: 30 GB (Shaped to 512 kbps after)
- Backup dial-up account
- 98% Availability Guarantee (otherwise rebate offered)
- 24 x 7 Service Desk for Support
Figure 4.7 Service Package Example (ISP)
Most of the components of Service Packages and Service Level Packages are reusable components of the IT organization (many of which are services). Other components include software, hardware and other infrastructure elements. Providing Service Level Packages in this way reduces the cost and complexity of providing services while maintaining high levels of customer satisfaction. In our example above, the ISP can easily create multiple Service Packages with varying levels of Utility and Warranty in order to offer a wide range of choice to customers, and to distinguish themselves from their competition. The use of Service Packages and Service Level Packages enables Service Providers to avoid a one-size-fits-all approach to IT Services.
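The ISP example can be sketched as reusable data structures, which also makes it easy to see what an availability guarantee means in practice. This is a minimal illustration: the class and field names are assumptions, the "Broadband Basic" package is hypothetical, and the downtime calculation assumes the guarantee is measured over a 30-day month of 24 x 7 service hours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevelPackage:
    """Warranty levels attached to a Service Package."""
    availability_pct: float
    download_quota_gb: int
    support_hours: str


@dataclass(frozen=True)
class ServicePackage:
    name: str
    monthly_price: float
    core_services: tuple
    supporting_services: tuple
    level: ServiceLevelPackage


def allowed_downtime_hours(availability_pct, days=30):
    """Hours of downtime that still meet the guarantee (assumes 24 x 7 service hours)."""
    return (100 - availability_pct) / 100 * days * 24


# The core services are defined once and reused across packages.
CORE = ("Internet Connection", "Email Addresses")

super_user = ServicePackage(
    name="Broadband SuperUser",
    monthly_price=69.95,
    core_services=CORE,
    supporting_services=("Static IP Address", "Spam filtering", "100MB Web-Space", "VOIP"),
    level=ServiceLevelPackage(availability_pct=98.0, download_quota_gb=30, support_hours="24 x 7"),
)

# Hypothetical second offering: same reusable core, lighter warranty.
basic = ServicePackage(
    name="Broadband Basic",
    monthly_price=39.95,
    core_services=CORE,
    supporting_services=("Spam filtering",),
    level=ServiceLevelPackage(availability_pct=95.0, download_quota_gb=10, support_hours="business hours"),
)

for package in (super_user, basic):
    hours = allowed_downtime_hours(package.level.availability_pct)
    print(f"{package.name}: up to {hours:.1f} hours of downtime per 30-day month")
```

Under these assumptions, the 98% guarantee in the example still permits roughly 14.4 hours of downtime in a 30-day month (2% of 720 hours), which is the kind of detail a customer may want spelled out when comparing packages.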
Figure 4.8 Generic Process Elements Crown Copyright 2007 Reproduced under license from OGC
The figure above describes the physical components of processes, which are tangible and therefore typically get the most attention. In addition to the physical components, there are behavioral components, which are for the most part intangible and are part of an underlying pattern so deeply embedded and recurrent that it is displayed by most members of the organization. This pattern includes decision making, communication and learning processes. Behavioral components have no independent existence apart from the work processes in which they appear, but at the same time they greatly affect the form, substance and character of activities and subsequent outputs by shaping how they are carried out.
So when defining and designing processes, it is important to consider both the physical and behavioral aspects that exist. This may be addressed by ensuring that all required stakeholders (e.g. staff members, customers and users) are appropriately involved in the design of processes so that:
- They can communicate their own ideas, concerns and opinions that might influence the way in which processes are designed, implemented and improved. Of particular importance may be current behaviors that have not been previously identified and which may affect the process design and implementation.
- Stakeholder groups are provided adequate training and education regarding how to perform their role within the process and what value the process provides.
- Stakeholders generally feel empowered in the change being developed, and are therefore more likely to respond positively rather than actively or passively resisting the organizational changes occurring.
[Figure: the ITIL functions. Service Desk; Technical Management (Mainframe, Server, Network, Storage, Databases, Directory Services, Desktop, Middleware, Internet/Web); IT Operations Control (Console Management, Job Scheduling, Backup & Restore, Print & Output); Facilities Management (Data Centers, Recovery Sites, Consolidation, Contracts); Business Apps.]
NOTE: These are logical functions and do not necessarily have to be performed by an equivalent organizational structure. This means that Technical and Application Management can be organized in any combination and into any number of departments. The lower groupings (e.g. Mainframe, Server) are examples of the roles performed by Technical Management and are not necessarily a suggested organizational structure.
4.2.3
It is often said that processes are perfect... until people get involved. This saying comes from misunderstandings among the people involved and a lack of clarity regarding the roles and responsibilities that exist when executing processes. A useful tool to assist the definition of the roles and responsibilities when designing processes is the RACI Model. RACI stands for:
R: Responsibility (actually does the work for that activity but reports to the function or position that has an A against it).
A: Accountability (is made accountable for ensuring that the action takes place, even if they might not do it themselves). This role implies ownership.
C: Consult (advice/guidance/information can be gained from this function or position prior to the action taking place).
I: Inform (the function or position that is told about the event after it has happened).
RACI Model (functions: Service Desk, Desktop, Applications, Operations Manager; activities: Logging, Classification, Investigation)
Service Desk: RACI, RACI, ACI. Desktop: RCI, RCI. Applications: RCI. Operations Manager: CI, CI, CI.
A RACI Model is used to define the roles and responsibilities of various Functions in relation to the activities of Incident Management. General Rules that exist:
Only 1 A per Row can be defined (this ensures accountability; more than one A would confuse it).
At least 1 R per Row must be allocated (this shows that actions are taking place), with more than one being appropriate where there is shared responsibility.
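The two general rules above lend themselves to an automated check. The following is a minimal sketch, assuming a RACI matrix stored as a nested dictionary; the cell assignments are illustrative only and are not taken from the workbook's table, and treating a row with no A at all as a violation is also an assumption.

```python
# Hypothetical RACI matrix: the activities and functions echo the Incident
# Management example, but the cell assignments are illustrative only.
raci = {
    "Logging": {"Service Desk": "RACI", "Operations Manager": "CI"},
    "Classification": {
        "Service Desk": "RACI",
        "Desktop": "RCI",
        "Operations Manager": "CI",
    },
    "Investigation": {
        "Service Desk": "ACI",
        "Desktop": "RCI",
        "Applications": "RCI",
        "Operations Manager": "CI",
    },
}


def check_raci(matrix):
    """Return a list of rule violations; an empty list means the matrix passes."""
    problems = []
    for activity, assignments in matrix.items():
        accountable = [f for f, codes in assignments.items() if "A" in codes]
        responsible = [f for f, codes in assignments.items() if "R" in codes]
        # Rule 1: only one A per row (assumption: a row with no A also fails).
        if len(accountable) != 1:
            problems.append(
                f"{activity}: expected exactly 1 A, found {len(accountable)}"
            )
        # Rule 2: at least one R per row.
        if not responsible:
            problems.append(f"{activity}: no R allocated")
    return problems


print(check_raci(raci))  # prints [] for the matrix above
```

A check like this could be run whenever the matrix is revised, so that a row with two accountable functions is caught before the process design is published.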
Resources:
Service Owner:
A decision support and planning tool that projects the likely consequences of a business action. It provides justification for a significant item of expenditure, including costs, benefits, options, issues, risks and possible problems.
5 Relationship between the Planning, Protection and Optimization Processes and the Service Lifecycle
The focus of this workbook is on the ITIL Planning, Protection and Optimization processes, activities and methods used when providing IT services. These processes are mainly contained within two Service Lifecycle phases, Service Strategy and Service Design. While both of these phases have their own objectives and responsibilities, together they provide a coordinated approach to the high-level planning, protection and optimization of the architectures, systems and components required to deliver IT services according to business need. To describe the role of these two phases in brief:
The Service Strategy lifecycle phase focuses on developing and refining the high-level objectives, policies and plans that document how the IT organization can/will provide value and support to the business and associated customers.
The Service Design lifecycle phase focuses on transforming these objectives and plans into actual Services by designing the architectures, technology, processes and operational support capabilities that are required.
[Figure: information exchanged between the Service Lifecycle phases (Service Strategy, Service Design, Service Transition, Service Operation and Continual Service Improvement). Labels shown in the figure: Service Validation Criteria, Cost Units, Priorities & Risks of IT Services, Requirements Portfolio; Service Models for support, Service Portfolios, Demand Management Strategies, IT Budgets; Nominated budgets for delivering and supporting services; Process metrics and KPIs, Service Portfolios.]
Figure 5.4: Service Design involves a balancing act affected by multiple constraints Crown Copyright 2007 Reproduced under license from OGC
CSI should provide momentum to the Service Lifecycle, including Service Design, so that the performance, quality and overall customer satisfaction of the IT organization continually improves.
[Figure: information exchanged between the Service Lifecycle phases (Service Strategy, Service Design, Service Transition, Service Operation and Continual Service Improvement). Labels shown in the figure: Service Assets, Service Components for budgeting and IT Accounting, Service Level Requirements; Service Acceptance Criteria, Test Plans, Service Transition Plans, Service Design Packages, Configuration Item information, SLAs, OLAs, UCs; Support Procedures, User Documentation, Service Catalogues, SLAs, OLAs, UCs, Security & Access Policies; Nominated budgets for delivering and supporting services; Process metrics and KPIs, Service Portfolios.]
However, the level to which the PPO processes are required to be implemented will depend on many factors, including:
The complexity and culture of the organization;
The relative size, complexity and maturity of the IT infrastructure;
The type of business and associated customers being served by IT;
The number of services, customers and end users involved;
Regulations and compliance factors affecting the business or IT; and
The use of outsourcing and external suppliers for small or large portions of the overall IT Service Delivery.
Based on these influencing factors, the actual PPO team may be a single person in a small IT department or involve a worldwide network of business and customer oriented groups in an international organization. Regardless of the size of the actual teams responsible for the PPO processes, it is important to always ensure:
Clear definition of roles and responsibilities (using a RACI model) of the people, groups and stakeholders involved;
Clear definition of the scope, objectives, Critical Success Factors (CSFs) and associated Key Performance Indicators (KPIs) for all processes involved; and
Training and awareness is developed for all stakeholders including IT staff, customers and end-users so that the processes operate successfully and deliver upon their required objectives.
This chapter describes the PPO processes, defining the key elements that exist and how they should be used as part of the successful development and management of Planning, Protection and Optimization.
6.1 Demand Management
Demand Management was previously an activity found within Capacity Management; within Version 3 of ITIL it has been made a separate process found within the Service Strategy phase. This is because before we decide how to design for availability and capacity, decisions must be made regarding why demand should be managed in a particular way. Questions asked here include:
When and why does the business need this capacity?
Does the benefit of providing the required capacity outweigh the costs?
What level of capacity and performance should we support?
How can we influence demand to reduce our excess capacity needs?
How do our IT strategic objectives affect our approach? Are we focused on cost-effectiveness?
Poorly managed demand is a particular source of risk for service providers, with potential negative impacts being felt by both the IT organization and its customers. If demand is not accurately predicted and managed, idle (excess) capacity will generate cost without creating associated value that can be appropriately recovered. From the customer perspective, most would be highly reluctant to pay for idle capacity unless it provides some value for them. On the other hand, insufficient capacity can impact the quality of services delivered, potentially limiting the growth desired for services and for the organization as a whole.
Accordingly, Demand Management must seek to achieve a balance between the prediction and management of demand for services and the supply and production of capacity to meet those demands. By doing so, both the customers and IT can reduce excess capacity needs while still supporting required levels of quality and warranty in agreed services. Keep in mind that Demand Management plays an integral part in supporting the objectives of an organization and maximizing the value of the IT Service Provider. This means that the way in which Demand Management is utilized will vary greatly between organizations.
Two examples showing these differences are:
Health Organizations: When providing IT Services that support critical services being offered to the public, it would be unlikely that there would be many (if any) demand management restrictions that would be utilized, as the impact of these restrictions could lead to tragic implications for patients being treated.
Commercial Confectionery Organizations: Typically a confectionery company will have extremely busy periods around traditional holidays (e.g. Christmas). Demand Management techniques would be utilized to promote more cost-effective use of IT during the non-peak periods; however, leading up to these holidays the service provider would seek to provide all capacity to meet demand and support higher revenue streams for the business units involved.
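The balance between idle capacity and insufficient capacity can be made concrete with a small calculation. The sketch below is illustrative only: the quarterly demand figures, the capacity levels and the function name `capacity_gaps` are all assumptions, chosen to resemble a seasonal peak like the one in the confectionery example.

```python
def capacity_gaps(demand, capacity):
    """Return one (idle, shortfall) pair per period for a fixed capacity level."""
    return [(max(capacity - d, 0), max(d - capacity, 0)) for d in demand]


# Hypothetical quarterly demand in arbitrary units, peaking in the final quarter.
quarterly_demand = [40, 45, 50, 95]

for capacity in (50, 100):
    gaps = capacity_gaps(quarterly_demand, capacity)
    idle = sum(i for i, _ in gaps)
    shortfall = sum(s for _, s in gaps)
    print(f"capacity={capacity}: idle={idle}, shortfall={shortfall}")
# prints:
# capacity=50: idle=15, shortfall=45
# capacity=100: idle=170, shortfall=0
```

Sizing for the peak removes the shortfall but leaves a large amount of idle capacity in the quiet quarters, which is the cost that techniques for influencing demand aim to reduce.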
Figure 6.1: Activity-based Demand Management Crown Copyright 2007 Reproduced under license from OGC
Over time, Demand Management should be able to build a profile of business processes and the patterns of business activity in such a way that seasonal variations as well as specific events (e.g. adding new employees) can be anticipated in terms of associated demand. Using this information will help various elements of the Service Lifecycle, including the following:
Service Design: Particularly Capacity and Availability Management, who can optimize designs to suit demand patterns;
Service Transition: Change Management and Service Validation and Testing can ensure that appropriate levels of warranty can be provided;
Service Operation: Can optimize the availability of staff based on patterns of demand; and
Continual Service Improvement: Can identify opportunities to consolidate demand or introduce improved incentives or techniques to be utilized in influencing demand.
Critical to the effective application of Demand Management is a forward-looking Capacity Plan, which should identify how capacity will be produced to meet the predicted demand patterns, including the level of excess capacity deemed appropriate in accordance with the business requirements for service value.
[Table: user profiles matched to PBA codes. Surviving entries: user profile "Senior Executive (UP 1)"; PBA codes 33B, 17D, 21A, 33D, 17B, 21A, 33A, 17E, 21C.]
In the above table, the PBA code references previously defined patterns of business activity, which helps clarify when each type of user will typically generate demand for IT services and what level of demand there will be. This is valuable information which can then be used to predict the potential impact that adding or removing staff members (users) may have on the demand for IT services and on the ability of the IT service provider to meet those demands.
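The prediction described above can be sketched as a simple lookup and sum. Everything in this example is hypothetical: the demand value attached to each PBA code, the pairing of codes with the profiles "UP 1" and "UP 2", and the headcounts are assumptions made for illustration, not figures from the workbook.

```python
# Hypothetical demand per PBA code, in arbitrary "demand units" per week.
PBA_DEMAND = {"33B": 5.0, "17D": 2.0, "21A": 1.5, "33D": 3.0, "17B": 6.0}

# Hypothetical user profiles, each referencing previously defined PBA codes.
USER_PROFILES = {
    "UP 1": ["33B", "17D", "21A"],
    "UP 2": ["33D", "17B", "21A"],
}


def profile_demand(profile):
    """Weekly demand generated by one user of the given profile."""
    return sum(PBA_DEMAND[code] for code in USER_PROFILES[profile])


def total_demand(headcount):
    """Weekly demand for a headcount given as {profile: number of users}."""
    return sum(profile_demand(p) * n for p, n in headcount.items())


current = {"UP 1": 10, "UP 2": 40}
planned = {"UP 1": 12, "UP 2": 55}

change = total_demand(planned) - total_demand(current)
print(f"Predicted change in weekly demand: {change:+.1f} units")
# prints: Predicted change in weekly demand: +174.5 units
```

In practice each PBA would carry a time profile (when the demand occurs) as well as a volume, so a fuller model would sum demand per time window rather than per week.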
In a basic approach, this may be as simple as Gold, Silver and Bronze offerings to influence the adoption and use of IT services. To clarify how the different processes work together, the following is a summary of the various responsibilities:
Service Portfolio Management: responsible for assessing, managing and prioritizing investments into IT, identifying underserved, well served and over served demand. Manages the Service Portfolio, including the definition of services in terms of business value.
Demand Management: responsible for identifying, developing and analyzing PBA and user profiles. Builds capabilities for predicting seasonal variations and specific events in terms of the associated demand generated. Strategically packages services to reduce excess capacity needs while still meeting business requirements. Designs and applies techniques where necessary to influence demand.
Financial Management: responsible for working with Demand Management to determine the value of services (and to understand the effect on value of varying levels of capacity and performance), and for developing appropriate chargeback models to be used in influencing demand.
Service Level Management: responsible for maintaining regular communication with customers and business units, identifying any potential issues, promoting the service catalogue, negotiating and agreeing relevant SLAs (including the charging mechanisms used to influence demand), and ensuring correct alignment of Service Packages and Service Level Packages. Generally measures the success of IT and the quality of service delivered from the customer perspective, providing feedback to the other processes on issues and potential improvements.
6.2.2 Scope
While primarily a Service Design process, there are many elements involved in Availability Management that interact throughout the Service Lifecycle. Using a combination of both proactive and reactive activities, the scope of the process covers design, implementation, measurement, management and improvement of IT service and component availability. In order to achieve a balance between cost-effectiveness and appropriate quality, Availability Management is involved in the determination and fulfillment of the requirements for availability, including an in-depth understanding of:
The current business processes, their operation and requirements;
Future business plans, objectives and requirements;
Service targets and the current IT service operation and delivery;
IT infrastructure, data, applications and environments and their performance (in terms of stability, redundancy, useful life span); and
Business impacts and priorities in relation to the services and their usage.
Understanding all of this will enable Availability Management to ensure that all services and associated supporting components and processes are designed and delivered to meet their targets in terms of agreed business needs. While implementations will vary depending on the organizational requirements for Availability Management as well as a number of both internal and external influencing factors, as a general guide the process should include some adoption of the following activities:
Figure 6.3: The proactive and reactive elements of Availability Management Crown Copyright 2007 Reproduced under license from OGC
Proactive Activities (primarily executed in Service Design and Service Transition):
The development and maintenance of an Availability Plan, which documents the current and future requirements for service availability, and the methods used to meet these requirements;
Development of a defined set of methods, techniques and calculations for the assessment and reporting of availability;
Liaison with IT Service Continuity Management and other aligned processes to assist with risk assessment and management activities; and
Ensuring consistency in the design of services and components to align with the business requirements for availability.
Reactive Activities (primarily executed in Service Operation and Continual Service Improvement):
Regular monitoring of all aspects of availability, reliability and maintainability, including supporting processes such as Event Management for timely disruption detection and escalation;
Regular and event-based reporting of service and component availability;
Ensuring regular maintenance is performed according to the levels of risk across the IT infrastructure; and
Assessing the performance of and data gathered by various Service Operation processes such as Incident and Problem Management to determine what improvement actions might be made to improve availability levels or the way in which they are met.
Terminology
Explanations
1. Availability
The ability of a service, component or CI to perform its agreed function when required. It is typically measured and reported as a percentage using the following formula:
Availability (%) = (Agreed Service Time - Downtime) / Agreed Service Time x 100 %
This means that if a service is only partly functional, or the performance is degraded to a point outside of normal service operation, then the service should be classed as unavailable.
2. Service Availability
Involves all aspects of service availability and unavailability and the impact of component availability, or the potential impact of component unavailability, on service availability.
3. Component Availability
Involves all aspects of component availability and unavailability.
4. Reliability
A measure of how long a service, component or CI can perform its agreed function without interruption. This metric provides an understanding of the frequency of disruption and is often reported as Mean Time Between Service Incidents (MTBSI) or Mean Time Between Failures (MTBF). It is typically calculated with the formulas:
Reliability (MTBSI) = Available time in hours / Number of service disruptions
Reliability (MTBF) = (Available time in hours - Total downtime in hours) / Number of service disruptions
5. Maintainability
A measure of how quickly and effectively a service, component or CI can be restored to normal operation after a failure. This metric is typically measured and reported as the Mean Time to Restore Service (MTRS), which includes the entire time from the start of the disruption until the full recovery. The following formula is normally used:
Maintainability (MTRS) = Total downtime in hours / Number of service disruptions
EXAMPLE: For a service that is provided 24 x 7 and running for a reporting period of 5020 hours with only two disruptions (one of 6 hours and one of 14 hours), the following metrics would result:
Availability (%) = (5020 - 20) / 5020 x 100 % = 99.60%
Reliability (MTBSI) = 5020 / 2 = 2510 hours
Reliability (MTBF) = (5020 - 20) / 2 = 2500 hours
Maintainability (MTRS) = 20 / 2 = 10 hours
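The availability, reliability and maintainability formulas can be checked against the worked example with a few lines of code. This is a minimal sketch: the function name `availability_metrics` and the dictionary keys are assumptions, and the reporting period is treated as both the agreed service time and the "available time in hours", as in the 24 x 7 example.

```python
def availability_metrics(agreed_hours, outages):
    """Compute availability, MTBSI, MTBF and MTRS from a list of outage durations.

    agreed_hours: agreed service time in the reporting period (hours).
    outages: duration in hours of each service disruption (must be non-empty,
             otherwise the per-disruption averages are undefined).
    """
    downtime = sum(outages)
    disruptions = len(outages)
    return {
        "availability_pct": (agreed_hours - downtime) / agreed_hours * 100,
        "mtbsi_hours": agreed_hours / disruptions,
        "mtbf_hours": (agreed_hours - downtime) / disruptions,
        "mtrs_hours": downtime / disruptions,
    }


# Figures from the example: 5020 hours, two disruptions of 6 and 14 hours.
metrics = availability_metrics(5020, [6, 14])
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
# prints:
# availability_pct: 99.60
# mtbsi_hours: 2510.00
# mtbf_hours: 2500.00
# mtrs_hours: 10.00
```

Note that MTBSI always equals MTBF plus MTRS (2510 = 2500 + 10), which is a quick consistency check when the three figures are reported together.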
6. Serviceability
The ability of an external (third-party) supplier to meet the terms of their contract. Often this contract will include agreed levels of availability, reliability and/or maintainability for a supporting service or component.
7. Vital Business Function (VBF)
Defined business-critical elements of a business process that are supported by an IT service. While many functions are supported by IT, we typically prioritize our efforts and resources around supporting the critical elements, including the use of redundant and highly resilient components. Certain VBFs may need special designs, which are now commonly used in key infrastructure components (such as servers) and which include the following four concepts.
8. High Availability
A characteristic of the IT service that minimizes or masks the effects of component failure to the users of a service.
9. Fault Tolerance
The ability of a service, component or CI to continue to operate correctly after failure of a component part.
10. Continuous Operation
An approach or design to eliminate planned downtime of an IT service. This may mean that individual components are disrupted during maintenance, but the IT service as a whole remains available.
11. Continuous Availability
An approach or design to achieve a theoretical 100% level of service availability. Multiple design factors will support this, but more stringent requirements will also need to be assessed first, including environmental and other factors.
Specific security requirements; and Capabilities for service backups and recovery of service following disruptions.
As this will be an iterative process which includes elements of Service Strategy, there will be multiple assessments as to whether an effective balance has been proposed between availability and cost. The steps to assist in defining this balance include:
Determining the business impact caused by a loss or degradation of service (user productivity losses and other financial impacts felt);
Determining the requirements (both internal and external) for supporting the proposed service availability levels;
Through assessment with the associated business or customers, determining whether the costs identified in meeting the proposed availability levels are justified; and
Where these are seen as cost-justified, beginning to define the availability, reliability, maintainability and serviceability requirements for documentation in Service Level Agreements, Operational Level Agreements and Underpinning Contracts.
2. Designing for availability
Once an agreed level of availability is defined by balancing the business requirements for availability against the resources required to sustain it, Availability Management will utilize a number of components, techniques, processes and supporting systems to ensure the service is designed appropriately and can be supported in Service Operation. Some of the elements required in order to meet increasing service availability levels include:
Quality controlled components and products;
Systems Management (including monitoring, diagnostic and recovery capabilities);
Service Management processes (including Event, Incident and Problem Management);
High-availability designs (including the elimination of SPOFs and the implementation of redundant components and systems to minimize or avoid disruption); and
Special Solutions with full redundancy, through the use of multiple redundant components, systems and sites with strict testing and quality assurance measures being used for each of these elements.
Figure 6.4: The various elements required to provide Service Availability Crown Copyright 2007 Reproduced under license from OGC
3. Designing for recovery
These activities are concerned with ensuring that in the event of an IT service failure or disruption, the service and its various supporting components can be restored as quickly as possible so that business operations can resume. In many cases there won't be any business justification to build a highly available service; however, suitable availability levels may still be provided by swift and effective resolution of any disruptions that do occur. Some of the elements involved in designing for recovery include:
Implementing Systems Management for monitoring and escalating any Events that may lead to a service disruption;
Developing internal and external processes and procedures to be used to maintain availability and resolve disruptions; and
Improving the capability of the people involved in Service Operation with ongoing training and awareness sessions.
To assist in this goal, there should be appropriate communication and shared involvement in these activities between the staff involved in design and those involved in the operation and support of services. The actual capabilities and mechanisms used for the recovery of service disruptions are covered in section 6.3.5.2, The reactive activities of Availability Management.
4. Component Failure Impact Analysis (CFIA)
Component Failure Impact Analysis (CFIA) can be used to predict and evaluate the impact on IT service arising from component failures within the technology. The output from CFIA can be used to identify where additional resilience and redundancy should be considered to prevent or minimize the impact of component failure on the business operation and users. This is particularly important during the Service Design stage, where it is necessary to predict and evaluate the impact on IT service availability arising from component failures within the proposed IT Service Design. CFIA is a relatively simple technique that can be used to provide this valuable information. In addition, it can also be applied to identify impact and dependencies on IT support organization skills and competencies amongst staff supporting the new service. This activity is often completed in conjunction with IT Service Continuity Management and Capacity Management.
The outputs of CFIA are used in a number of ways in both planning for availability and planning for recovery. These include:
The impact that component failure can have on users and business operations;
Component and people dependencies;
Relative component recovery sequences and timeframes;
Identified areas where specific recovery plans and procedures may be required; and
Identified components that may need some risk reduction measures implemented.
One of the main methods for performing a CFIA is to develop a matrix that illustrates which IT services depend on which component Configuration Items (CIs). This information may come from the Configuration Management System or from other sources of infrastructure information. The matrix is populated using the following rules:
Leave a blank when a failure of the CI does not impact a service in any way;
Insert an X when the failure of the CI causes the service to be inoperative;
Insert an A when there is an alternative (e.g. redundant) CI to provide the service; and
Insert an M when there is an alternative CI, but it requires manual intervention for the service to be recovered.
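[Table: example CFIA matrix mapping CIs (rows) to Service 1 and Service 2 (columns). The CI names that survive are Server 1, Disk 1, Disk 2, Application 1 and Application 2. The cell values are identical in both service columns: M, M, X, X, A, A, X, followed by X, A, A, X.]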
Once completed, the following aspects should be considered:
If a CI has a large number of Xs, this indicates it is critical to a number of services and its failure can result in a high impact to the business. It may also indicate that this particular CI is a potential Single Point of Failure; and
If an IT service has a large number of Xs, it indicates a potentially complex configuration that has a higher level of vulnerability to failure.
The same exercise can be completed from a different perspective, instead mapping how the components of a single IT service relate to Vital Business Functions and various user groups. While the above example is valuable for understanding particular risks in the infrastructure and IT services, an advanced CFIA can provide a more detailed analysis including such fields as:
Component availability weighting: a weighting factor based on the impact of failure on the service as a whole. As an example, this might be based on the percentage of users from the entire user community being affected (e.g. 500 out of 2000 users would be 0.25 or 25%);
Probability of failure: based on the reliability of the component or service, typically measured in MTBF;
Recovery time: the predicted recovery time for the service or CI;
Recovery procedures: to verify that an effective recovery procedure exists;
Device independence: used to verify that CIs, where necessary, have been implemented independently; and
Dependency: to show dependency between CIs.
5. Single Point of Failure Analysis
A Single Point of Failure (SPOF) Analysis can be performed to identify the components that have no backup or fail-over capabilities and that have the potential to cause disruption to the business if they fail. For any SPOFs detected, the information gained from a CFIA can be used to evaluate whether the cost of remediation can be justified.
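The CFIA matrix rules and the counting of Xs can be sketched in code. The matrix below is hypothetical (the CI named "Switch 1" and all cell values are assumptions for illustration), and the threshold of two Xs for flagging a CI is also an assumption, since what counts as a "large number" depends on the organization.

```python
# Hypothetical CFIA matrix: rows are CIs, columns are services.
# "X" = failure makes the service inoperative, "A" = alternative CI available,
# "M" = alternative CI needing manual intervention, "" = no impact.
cfia = {
    "Switch 1": {"Service 1": "X", "Service 2": "X"},
    "Server 1": {"Service 1": "X", "Service 2": ""},
    "Disk 1": {"Service 1": "A", "Service 2": "A"},
    "Disk 2": {"Service 1": "A", "Service 2": "A"},
    "Application 1": {"Service 1": "X", "Service 2": "M"},
}


def x_count_per_ci(matrix):
    """Number of services each CI would make inoperative if it failed."""
    return {
        ci: sum(1 for value in row.values() if value == "X")
        for ci, row in matrix.items()
    }


def x_count_per_service(matrix):
    """Number of CIs whose failure would make each service inoperative."""
    counts = {}
    for row in matrix.values():
        for service, value in row.items():
            counts[service] = counts.get(service, 0) + (value == "X")
    return counts


def potential_spofs(matrix, threshold=2):
    """CIs with at least `threshold` Xs, i.e. candidates for SPOF review."""
    return [ci for ci, n in x_count_per_ci(matrix).items() if n >= threshold]


print(potential_spofs(cfia))      # prints ['Switch 1']
print(x_count_per_service(cfia))  # prints {'Service 1': 3, 'Service 2': 1}
```

A CI flagged this way is only a candidate: whether remediation is worthwhile still depends on the cost justification described for SPOF Analysis.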
6. Fault Tree Analysis
Performing a Fault Tree Analysis (FTA), either in the design of services or as part of a review of a Major Incident or Problem, provides an understanding of the chain of events that causes a disruption to IT services. The analysis uses Boolean notation for the events that occur in the fault tree, which are combined using a number of logic operators to understand the potential result.
Figure 6.5: Example Fault Tree Analysis Crown Copyright 2007 Reproduced under license from OGC
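The Boolean combination of events in a fault tree can be sketched as ordinary functions. This is an illustrative sketch only: the gate semantics follow the operator definitions given in this section, while the example tree (a primary server, a standby server and a network link) and all function names are assumptions rather than the contents of Figure 6.5.

```python
def and_gate(*events):
    """Result occurs only when all input events occur."""
    return all(events)


def or_gate(*events):
    """Result occurs when one or more input events occur."""
    return any(events)


def xor_gate(*events):
    """Result occurs when one and only one input event occurs."""
    return sum(bool(e) for e in events) == 1


def inhibit_gate(condition_met):
    """Result occurs only when the input condition is not met."""
    return not condition_met


# Hypothetical tree: the service is down if both servers fail, or if the
# network link fails.
def service_outage(primary_failed, standby_failed, network_down):
    servers_down = and_gate(primary_failed, standby_failed)
    return or_gate(servers_down, network_down)


print(service_outage(True, False, False))  # prints False (standby takes over)
print(service_outage(True, True, False))   # prints True
print(service_outage(False, False, True))  # prints True
```

Evaluating the tree for every combination of inputs shows which single events are sufficient to cause an outage, which ties FTA back to Single Point of Failure Analysis.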
The typical logic operators used for understanding the combination of events include:
AND-gate: where the result occurs only when all input events occur simultaneously;
OR-gate: where the result occurs when one or more of the input events occur;
Exclusive OR-gate: where the result occurs only when one and only one of the input events occurs; and
Inhibit gate: where the result occurs only when the input condition is not met.
7. Risk Analysis and Management
As part of the coordinated approach to Risk Analysis and Management taken by the various Service Lifecycle processes, Availability Management should contribute by identifying potential risks to and inherent in the infrastructure, analyzing the vulnerability to these risks and the impacts associated with them, and identifying what countermeasures might be used to mitigate these risks in a cost-effective manner.
All organizations should define their own formal approach to Risk Analysis and Management to ensure complete coverage and sufficient confidence in their strategy. A generic framework that can be applied across all areas of an organization to assist in this is Management of Risk (M_o_R), another best practice framework published by the Office of Government Commerce (OGC). This framework adopts a systematic approach for the identification and assessment of risk and the implementation of associated countermeasures. Some of the key elements of the M_o_R framework include providing direction for organizations to:
Develop a transparent, repeatable and adaptable framework;
Communicate the risk policy and its benefits clearly to all staff and stakeholders;
Assign accountability and responsibility to key individuals in senior management;
Ensure the culture of the organization is motivated to embed risk management initiatives;
Ensure that risk management supports the organization's objectives; and
Adopt a no-blame approach to monitoring and reviewing risk assessment activities.
Figure 6.6: Risk Analysis and Management Crown Copyright 2007 Reproduced under license from OGC
More information regarding risk management will also be covered in 6.5: IT Service Continuity Management.
8. Planned and preventative maintenance
All IT components should be subject to a planned maintenance strategy. The frequency and levels of maintenance required vary from component to component, taking into account the technologies involved, criticality and the potential business benefits that may be introduced. Planned maintenance activities enable the IT support organization to provide:
Preventative maintenance to avoid failures;
Planned software or hardware upgrades to provide new functionality or additional capacity;
Business requested changes to the business applications; and
Implementation of new technology and functionality for exploration by the business.
Where the business hours required for services are not 24/7, the scheduled maintenance can be performed outside of business hours without impacting IT service availability. Even in the case of services that are designed for continuous availability, there are typically some less critical time periods that might need to be used for maintenance, with degraded performance of the service potentially being an accepted result. Once the schedule for planned and preventative maintenance has been defined and agreed, it will normally be documented in:
SLAs, OLAs and Underpinning Contracts;
Change Management Schedules;
Release and Deployment Management Schedules; and
Intranet/Internet pages that communicate scheduled and unscheduled outages to the end user community.
9. Production of the Projected Service Outage
Availability Management is also responsible for the documentation and communication of the Projected Service Outage (PSO) document, which consists of any variations to agreed levels of service availability in SLAs based on input from:
The change and release schedules;
Planned and preventative maintenance schedules;
Testing schedules; and
IT Service Continuity Management and Business Continuity Management testing schedules.
The PSO is used to ensure that all variations to agreed availability levels are understood and agreed by all relevant stakeholders. During its use, any additional maintenance windows that might be required should be communicated appropriately by the Service Desk to all relevant parties.
10. Continual review and improvement
As part of the Continual Service Improvement phase, Availability Management should seek to ensure that any previous issues and missed targets are considered for appropriate correction. However, even if a service provider has met all its agreed targets for service availability in the previous reporting period, there will still be improvement actions that can be developed in some way. Why? Because the business requirements for availability always change! As a result, it is important that Availability Management continually seeks to identify ways in which to optimize the availability of the IT infrastructure, by either increasing availability or reducing the costs and resources required to provide it. Other potential improvement actions may result in more qualitative benefits, such as enhanced user and customer satisfaction during disruptions that occur. To ensure that critical business developments and changes are also considered, other input should be evaluated on a regular basis from ITSCM, particularly from the updated Business Impact Analysis and Risk Analysis exercises.
However, many organizations and businesses have realized that these traditional methods for reporting availability are no longer adequate in communicating the user and business experience of availability and unavailability. When the objectives for monitoring and reporting are instead defined from the business and user perspective, a more representative view of the overall quality of IT services is achieved. Some methods that might be employed with this approach include defining: Impact by user minutes lost, calculated by multiplying the length of disruption by the number of users affected; and Impact by business transaction, calculated by assessing the number of business transactions that could not be processed during the period of disruption. In any case, the methods utilized by Availability Management should be appropriate to the organization's business processes and operational models. If there is a wide range of automated and electronic processing actions performed without actual user involvement, then simply measuring based on user impact will not be sufficient. To enable efficiency of the process, Availability Management should seek to implement and integrate all useful (but cost-effective) monitoring systems, information sources and reporting mechanisms to reduce the complexity and resources required for information gathering, analysis and reporting. When done effectively, this will also streamline the production of the agreed regular reports, including normal monthly availability reporting, the Availability Plan, Service Failure Analysis (SFA) and CFIA reports.
Unavailability analysis
As part of justifying the costs of implementing and executing the Availability Management process, analysis should be made as to the actual costs incurred during periods of disruption or unavailability.
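The two business-oriented measures described above (impact by user minutes lost and impact by business transaction) lend themselves to a simple calculation. The following Python sketch is illustrative only: the `Disruption` record, its field names and the assumption of a steady transaction rate are hypothetical and are not prescribed by the framework.

```python
from dataclasses import dataclass


@dataclass
class Disruption:
    """One period of unavailability (all field names are illustrative)."""
    duration_minutes: float          # length of the disruption
    users_affected: int              # number of users unable to work
    transactions_per_minute: float   # assumed normal business transaction rate


def user_minutes_lost(d: Disruption) -> float:
    # Impact by user minutes lost = length of disruption x users affected.
    return d.duration_minutes * d.users_affected


def transactions_lost(d: Disruption) -> float:
    # Impact by business transaction = transactions that could not be
    # processed, estimated here from an assumed steady transaction rate.
    return d.duration_minutes * d.transactions_per_minute


outage = Disruption(duration_minutes=45, users_affected=200,
                    transactions_per_minute=12)
print(user_minutes_lost(outage))   # 9000
print(transactions_lost(outage))   # 540
```

For a service dominated by automated processing, the second measure would usually be the more representative of the two, since few users may be directly affected.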
The tangible costs are typically well defined and understood, such as: Lost user productivity; Lost IT staff productivity; Overtime payments; Lost revenue; Wasted goods and materials; Imposed fines or penalty payments; and Injuries or potential safety incidents caused. While these are of course important, there are also a number of intangible costs that can also have potentially long-lasting negative effects on the organization and the service provider. Such intangible costs of unavailability include: Loss of customers; Loss of customer satisfaction; Loss of business opportunity;
Damage to business reputation; Loss of confidence in the IT Service Provider; and Damage to staff morale.
While they may be difficult to measure, they shouldn't simply be dismissed, and should form part of the focus of customer and user surveys performed by Service Level Management.
The Expanded Incident Lifecycle
As discussed earlier in the principles of Availability Management, even when disruptions occur there are still methods by which user and customer satisfaction can be maintained. One way to help achieve this goal is to ensure that the duration of any disruption is minimized, so that normal business operations are resumed as quickly as possible and the impact on the associated business processes and users is reduced. Analyzing the various time elements making up those disruptions helps Availability Management target improvement actions, be it through technology enhancements, additional documentation and procedures or staff training and education. This can be performed by analyzing the expanded incident lifecycle, which breaks down the major stages through which all incidents progress.
Figure 6.7: The expanded incident lifecycle Crown Copyright 2007 Reproduced under license from OGC
1. Incident detection
2. Incident diagnosis
3. Incident repair
4. Incident recovery
5. Incident restoration
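As a rough illustration of how the five stages add up to the downtime experienced by the user, the Python sketch below derives the time spent in each stage from a set of incident timestamps. The timestamp names and the sample values are hypothetical; a real implementation would draw them from the organization's Incident Management records.

```python
from datetime import datetime

# Hypothetical timestamps for a single incident; the key names are illustrative.
incident = {
    "occurred": datetime(2024, 1, 10, 9, 0),
    "detected": datetime(2024, 1, 10, 9, 12),
    "diagnosed": datetime(2024, 1, 10, 9, 40),
    "repaired": datetime(2024, 1, 10, 10, 5),
    "recovered": datetime(2024, 1, 10, 10, 25),
    "restored": datetime(2024, 1, 10, 10, 30),
}

# Each stage of the expanded incident lifecycle runs between two timestamps.
STAGES = [
    ("detection", "occurred", "detected"),
    ("diagnosis", "detected", "diagnosed"),
    ("repair", "diagnosed", "repaired"),
    ("recovery", "repaired", "recovered"),
    ("restoration", "recovered", "restored"),
]


def stage_minutes(timestamps):
    """Return the minutes spent in each lifecycle stage."""
    return {
        name: (timestamps[end] - timestamps[start]).total_seconds() / 60
        for name, start, end in STAGES
    }


durations = stage_minutes(incident)
total_downtime = sum(durations.values())
longest_stage = max(durations, key=durations.get)

print(durations)
print(total_downtime)   # 90.0
print(longest_stage)    # diagnosis
```

Aggregating these figures over many incidents would show which stage contributes most to downtime, and therefore where improvement actions are likely to have the greatest effect.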
Each stage, and the associated time taken, influences the total downtime perceived by the user and business. Evaluating each of the stages enables Availability Management to identify potential areas of inefficiency that result in a longer disruption to the user. These improvements may be focused on the detection mechanisms, the documentation supporting diagnosis, repair and recovery, the escalation procedures utilized or any other activity or task involved during the incident lifecycle. As a consequence, Availability Management needs to work closely with Incident and Problem Management to ensure any improvement actions optimize the use of resources and prevent the recurrence of incidents where possible. Event Management may also be coordinated to ensure that there is appropriate coverage of CIs with the ability to detect events that may lead to service disruptions.
1. Incident detection
This stage encompasses the time between the actual failure or disruption to a CI or IT service occurring and the moment at which the service provider is made aware of that incident. A variety of tools and systems should be employed that have capabilities to detect events and incidents and, consequently, reduce the detection timeframe. For critical CIs, the same systems and tools should also be utilized to trigger automated recovery with scripted responses. When implemented effectively, many incidents will be detected and resolved before the users have been impacted in any noticeable way.
2. Incident diagnosis
This stage encompasses the timeframe from when the IT service provider has been made aware of the incident to the time when the underlying cause of the incident has been determined. Particularly at this stage, an appropriate balance needs to be struck between the need to capture diagnostic data and the need to have the service restored.
While capturing a range of diagnostic data will often extend the restoration time, it will also enhance the capabilities for preventing the reoccurrence of those incidents. Other than information potentially gathered from users, input for the successful diagnosis of the incident may come from: Data captured by the failing CI(s);
Incident, Problem and Known Error databases, along with general knowledge management systems that may have captured the criteria by which the incident has occurred before; and Manual observation performed by the various technical staff responsible for the failing CI(s).
3. Incident repair
This stage encompasses the repair time required for the incident, including both the automated and manual techniques that might be needed. The repair time is shown independently from the recovery time, with this stage focusing on the actual techniques that initiate the resolution of the incident, including: Actions to repair or restart a failed CI; Activities performed by external suppliers for the repair of CIs under their control; and Documentation of the procedures used to assist in the future diagnosis and repair of similar incidents. The agreed timeframes for the response and repair of incidents should be documented in SLAs, OLAs and Underpinning Contracts, and continuously monitored for compliance.
4. Incident recovery
This stage encompasses the activities to recover service availability (as distinct from the repair activities previously mentioned). This can include activities such as: Initiating the restoration of backups; The activities required to recover lost or corrupt data; Implementing spare equipment (including that held within the Definitive Spares) in the production environment to facilitate recovery; and The actions and time required to restore an image to a desktop machine following a failure. When designing new IT services, the recovery requirements for each supporting CI should be identified as early as possible to facilitate the development of the associated recovery plans and procedures to be used when failures occur.
5. Incident restoration
This stage encompasses the time from when the recovery steps are completed through to the time that the service provider has confirmed that normal business and IT service operation has resumed. In some situations, this will be performed by Service Desk staff, who simply call the affected users for confirmation that service has been restored to them. In other cases, particularly those which support automated business processes, a synthetic
transaction or user-simulation test script may need to be executed to ensure that the restored IT service is working as expected. Service Failure Analysis Service Failure Analysis (SFA) is a technique designed to provide a structured approach to identifying the underlying causes of service interruptions to the user and business operations. SFA utilizes a variety of sources to assess where and why shortfalls in availability are occurring. This technique enables a holistic view to be taken to drive improvements to the IT support organization, processes, procedures and tools. The staff utilized in an SFA would typically be those also involved in Incident and Problem Management, so the activities here will be jointly performed by these processes. Some of the key high-level objectives of SFA are to: Improve the overall availability of IT services by producing a set of improvements for implementation or input into the plan; Identify underlying causes of service interruption to users; Enable enhanced levels of service availability without incurring major costs; Assess the effectiveness of the IT support organization and key processes; Develop cross-functional teams to reduce silos that may exist; Produce detailed reports of major findings and recommendations; and Ensure that Availability improvements derived from SFA-driven activities are measured.
Figure 6.8: The high-level structure for a Service Failure Analysis (SFA) Crown Copyright 2007 Reproduced under license from OGC
1. Select opportunity
There should be an agreed number of assignments scheduled per year within the Availability Plan as part of the proactive approach to improving availability. Typically SFAs will focus on high priority IT services; however, as part of a long-term approach there should be a wide coverage of services analyzed for improvement. When scheduling the SFA there needs to be clarification as to which IT service or CI is going to be reviewed (based on the types of failures that have previously occurred), with consultation of any key stakeholders (from IT and business) so that there is management commitment to the recommendations made.
2. Scope assignment
This should explicitly describe what areas are and are not covered within the assignment.
3. Plan assignment
The SFA should be conducted using an appropriate project framework, including an agreed project plan and a defined set of resources. Where necessary the staff involved may form a virtual team, the size of which should reflect the scope and complexity of the SFA.
4. Build hypothesis
For analysis of key data to occur, some hypotheses should first be built regarding the disruption and the provider's response to the disruption. This provides focus to the analysis that will be performed, as well as early detection and resolution of issues that may affect the SFA (such as a lack of data).
5. Analyze key data
During this period, the SFA team will analyze the data collected and develop some early conclusions. The way in which this analysis is actually performed will depend on the size of the SFA team and the actual roles being played by those members.
6. Interview key personnel
While the data gathered from the various systems management tools has already been analyzed, this time is now used to interview business and user representatives to understand the business perspective regarding the service failure and response
actions taken. This will help capture issues that aren't recorded through other means and provide some meaning and focus to the data already gathered.
7. Findings and conclusions
Once the previously mentioned activities have concluded, the SFA team should begin formulating initial findings and conclusions. This may require some further analysis to validate findings if previously gathered data is not deemed sufficient.
8. Recommendations
When all findings and conclusions have been validated, the SFA team along with the Availability Manager should formulate recommendations to be reported. These recommendations should also take into consideration the practicality of implementing and sustaining the improvement actions in the future.
9. Report
A final report should be presented to the SFA sponsor with an executive summary. If recommendations are made, a business case to justify the allocation of the estimated resources may be required.
10. Validation
This essentially provides a point of comparison to the "before" state that existed prior to the SFA. Where the desired benefits haven't been achieved, a review should be conducted.
Percentage reduction in the cost of unavailability; Percentage reduction in the costs involved in Service Delivery; Percentage reduction in the overtime hours worked as a result of unavailability; Reduced time to document Availability Plan; and Reduced time required to complete SFA.
6.3 Capacity Management
6.3.2 Scope
While the focal point of Capacity Management is to ensure adequate performance and capacity of IT services are being developed and already delivered, there are many supporting elements including IT components, product and software licenses, physical sites, human resources and third party products that will all need to be managed appropriately for this goal to be achieved. As a result, while many activities will be the responsibility of other IT Service Management or general organizational management processes (such as managing human resources), Capacity Management will be involved in the high level planning of each of these elements. As a general guide, Capacity Management seeks to understand and support: The current business operation and its requirements, through the patterns of business activity (provided by Demand Management); The future business plans and requirements (provided by Service Portfolio Management); The agreed service targets for performance and capacity (provided by Service Level Management); and All areas of IT technology in regards to the requirements for capacity and performance. When implemented effectively, Capacity Management can help to ensure that there are no surprises with regard to service and component design and performance.
This optimum balance is only achieved, both now and in the future, by ensuring that Capacity Management is involved in all aspects of the Service Lifecycle. When this doesn't occur, Capacity Management operates only as a reactive process, with only limited benefits being delivered as a result.
In the above figure, capacity is only implemented when disruptions begin to occur as demand has exceeded supply. While the implemented capacity does work to resolve the disruptions, there are some consequences to this type of reactive behavior including:
IT infrastructure components being purchased that don't optimally fit the requirements or architecture; Budget overruns for the unforeseen and unanticipated purchases; Periods of time where there are potentially large amounts of excess capacity; Reduced customer and user satisfaction with the affected IT services; and A generally negative perception of the IT organization as a whole.
aligned with business requirements and provisioned in an optimal and cost-effective manner.
As new technologies emerge, the IT service provider should seek to evaluate whether they might be able to deliver enhanced capacity and performance levels in a more cost-effective manner than those already used. Recent examples of technologies that have been used successfully in this manner include virtualization, cloud computing and blade server implementations.
2. Designing resilience
In conjunction with Availability Management, there should be analysis as to where it is cost-effective to build resilience into the infrastructure, by assisting in techniques such as Component Failure Impact Analysis (CFIA) and other risk assessment and management activities. Depending on the availability levels that have been agreed, Capacity Management will evaluate what level of spare capacity of infrastructure components is required to meet these targets, and strive to ensure these requirements are considered early in the design stage of new or modified services.
3. Threshold management and control
Within Service Operation, there should be an ongoing set of monitoring and control activities that assist in providing assurance that agreed service levels are being delivered and protected. In the context of Capacity Management, there should be thresholds set for various components and services that raise warnings and alarms when approached or breached. Event Management will be primarily involved to support this capability and ensure that an appropriate level of capacity events is monitored and escalated, to avoid staff being flooded with alerts.
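The idea of thresholds that raise warnings when approached and alarms when breached can be sketched as follows. This is a minimal Python illustration, assuming percentage utilization figures; the component names and threshold values are hypothetical and would in practice be derived from the agreed service levels.

```python
def classify(utilization_pct, warning_pct, breach_pct):
    """Classify a utilization reading against a pair of thresholds."""
    if utilization_pct >= breach_pct:
        return "alarm"      # threshold breached
    if utilization_pct >= warning_pct:
        return "warning"    # threshold being approached
    return "ok"


# Hypothetical (warning, breach) thresholds per monitored component.
THRESHOLDS = {
    "cpu": (75.0, 90.0),
    "disk": (80.0, 95.0),
    "memory": (70.0, 85.0),
}

# Hypothetical readings collected by a monitoring tool.
samples = {"cpu": 82.5, "disk": 96.0, "memory": 40.0}

events = {ci: classify(value, *THRESHOLDS[ci]) for ci, value in samples.items()}

# Only readings that are not "ok" are escalated, which helps to avoid
# flooding staff with alerts.
escalated = {ci: level for ci, level in events.items() if level != "ok"}
print(escalated)   # {'cpu': 'warning', 'disk': 'alarm'}
```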
The main techniques utilized for modeling include: Baselining, where a baseline of current performance and capacity levels is identified and documented; Trend analysis, where services and components are monitored over time for their utilization to assist in the identification of trends and the potential forecasting of future utilization and performance levels; Analytical modeling, where mathematical techniques are used to predict the performance levels that might be achieved under certain conditions or after making modifications to the infrastructure. Analytical modeling is typically quicker and cheaper to perform than simulation modeling, but also typically provides less accurate results; and Simulation modeling, where a set of discrete events are modeled and compared against a defined hardware configuration. This will often involve simulating transactions across the service and infrastructure, and as a result will typically yield more accurate results.
2. Application Sizing
Application sizing is an activity that begins during the early design of a new or modified service and ends when the service has been accepted into the production environment. The sizing activities relate to all elements and components required for the service, including estimation of the required capacity levels of hardware, data, environments and applications that are involved. The main objective is to accurately estimate the resource requirements to support a proposed change and ensure that it meets its required service levels. This includes consideration as to the resilience measures that might be required to deliver a set level of capacity, performance and availability. This will be an iterative process, including constant negotiation with Service Level Management to define a cost-effective approach that satisfies the business objectives.
While some aspects of quality may be improved after implementation (including adding additional hardware and other components), in most cases quality must be built in from the start, otherwise much higher costs are incurred trying to fix issues once the service is in production.
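Of the modeling techniques listed earlier, trend analysis is the simplest to illustrate. The Python sketch below fits a straight line to a series of utilization readings using ordinary least squares and estimates how many further periods remain before a chosen threshold is reached. A linear trend, the monthly sampling interval and the sample figures are all assumptions made for illustration; real workloads are often seasonal or non-linear.

```python
def linear_trend(values):
    """Least-squares slope and intercept for equally spaced readings."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def periods_until(values, threshold):
    """Periods after the last reading until the trend line hits threshold.

    Returns None when utilization is flat or falling.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope   # period index of crossing
    return max(0.0, crossing - (len(values) - 1))


# Hypothetical monthly disk utilization readings (percent).
monthly_utilization = [50, 52, 54, 56, 58, 60]

print(linear_trend(monthly_utilization))        # (2.0, 50.0)
print(periods_until(monthly_utilization, 80))   # 10.0
```

An estimate of this kind could feed into sizing decisions by indicating how much lead time remains before additional capacity is needed.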
to the data being collected and the perspective from which it is analyzed. For example, Component Capacity Management is concerned with the performance of individual components, whereas Service Capacity Management is concerned with the performance of the entire service, monitoring transaction throughput rates and response times.
Figure 6.10: The operational activities of Capacity Management Crown Copyright 2007 Reproduced under license from OGC
1. Utilization monitoring The monitoring applied should be specific to a particular CI, whether it is an IT service, an operating system, a hardware configuration or application. It is important that the monitors can collect all the data required by Capacity Management for each of the three subprocesses. Some of the typical monitored data collected include: Processor utilization; Memory utilization; % processor per transaction type; Input/output rates; Queue lengths; Disk utilization; Transaction rates; Response times; Database usage; Index usage; Hit rates; Concurrent user numbers; and Network traffic rates.
When collecting data intended for use by the Service Capacity Management subprocesses, the transaction response time for services may be monitored and measured by: Incorporating specific code within client and server applications software; Using robotic scripted systems with terminal emulation software; Using distributed agent monitoring software; and Using specific passive monitoring systems.
2. Analysis
The data collected by the various monitoring activities and mechanisms will then be used to identify trends, baselines, issues and conformance to or breaches of agreed service levels. There may be other issues identified such as: Bottlenecks within the infrastructure; Inappropriate distribution of workload across the implemented resources; Inefficiencies in application design; Unexpected increases in workloads and input transactions; and Scheduled services that need to be reallocated.
3. Tuning
After analysis of collected data has occurred, there may be some corrective action that is required in order to better utilize the infrastructure and resources to improve the performance of a particular service. Examples of the types of tuning techniques that might be used include: Balancing workloads: transactions may arrive at the host or server at a particular gateway, depending on where the transaction was initiated; balancing the ratio of initiation points to gateways can provide tuning benefits; Balancing disk traffic: storing data on disk efficiently and strategically, e.g. striping data across many spindles may reduce data contention; Definition of an acceptable locking strategy that specifies when locks are necessary and the appropriate level, e.g. database, page, file, record and row; delaying the lock until an update is necessary may provide benefits; and Efficient use of memory: this may include looking to utilize more or less memory depending upon the circumstances.
Before implementing any of the recommendations arising from the tuning techniques, it may be appropriate to consider using one of the on-going activities to test the validity of the recommendation. 4. Implementation
The objective of implementation is to control the introduction of any changes identified into the production environment. Depending on the changes required, this may be implemented via a normal change model (using all the normal steps of Change Management) or a standard change where there is already change approval and an established procedure for the work required.
6.4.2 Scope
The scope of ITSCM is focused on planning for, managing and recovering from IT disasters. These disasters are severe enough to have a critical impact on business operations and as a result will typically require a separate set of infrastructure and facilities to recover. Less significant events are dealt with as part of the Incident Management process in association with Availability Management. The disaster does not necessarily need to be a fire, flood, pestilence or plague; it may be any disruption that causes a severe impact to one or more business processes. Accordingly, the scope of ITSCM should be carefully defined according to the organization's needs, which may result in continuity planning and recovery mechanisms for some or all of the IT services being provided to the business. There are longer-term business risks that are out of the scope of ITSCM, including those arising from changes in business direction, organizational restructures or the emergence of new competitors in the marketplace. These are more the focus of processes such as Service Portfolio Management and Change Management.
So for general guidance, the recommended activities for any ITSCM implementation include: The agreement of the scope of the process and the policies adopted; Business Impact Analysis (BIA) to quantify the impact a loss of IT service would have on the business; Risk Analysis; Production of an overall ITSCM strategy that must be integrated into the BCM strategy; Production of ITSCM plans; Testing of plans; and Ongoing education and awareness, operation and maintenance of plans.
Figure 6.11: Lifecycle of IT Service Continuity Management Crown Copyright 2007 Reproduced under license from OGC
If there has already been extensive work performed in the context of Business Continuity Management (BCM, which focuses on the capabilities and resources required to continue business operations during and following a disaster), then this will provide excellent input into the initiation of IT Service Continuity Management. However, in many cases the work of BCM has not yet been performed or is still ongoing, in which case these two aligned processes should work together so that an appropriate strategy can be developed and cost-effective decisions can be made. In other scenarios where BCM is entirely absent,
ITSCM is required to fulfill many of the requirements and activities of BCM. This chapter, however, will assume that the BCM process has been established and that appropriate plans and documents are in place.
5. Agreed project and quality plans
One of the controlling elements in the project will be quality plans that ensure deliverables are achieved to an acceptable level of quality. The extent to which these activities need to be considered during the initiation process depends on the contingency facilities that have been applied within the organization. This activity also provides a vehicle by which to communicate the resource requirements for the project, thus working towards gaining buy-in from management and all necessary stakeholders.
The Business Impact Analysis (BIA) seeks to quantify the range of impact that a loss of service will have. Some of the forms of damage that a loss of service may cause include: Lost income and incurred costs through overtime payments or fines paid; Damaged reputation; Decreased competitive advantage; Decreased customer satisfaction and perception of the IT service provider; Potential threat of injury or loss of life; Immediate and long-term; and Breach of law, regulations and compliance requirements. Each form of impact is measured against particular scenarios for each business process, such as an inability to invoice for a period of 3 days leading up to Christmas.
Figure 6.12: Graphical representation of business impact in relation to time Crown Copyright 2007 Reproduced under license from OGC
The level of impact felt by business operations will also change depending on the length of disruption. Figure 6.12 shows how some disruptions will immediately cause a significant impact on business operations, whereas other disruptions won't have an immediate impact, but their impact will grow over the length of time that the disruption endures. This analysis will influence the approach and measures taken, primarily either being focused on risk reduction (being able to withstand failures) or recovery (to bring back the affected IT services over a period of time). Some other aspects identified by the BIA include: Staffing and skills necessary to continue operating at acceptable levels; Time within which minimum staffing, facilities and services should be recovered; The time within which all required business processes and operations should be partially and fully recovered; and The relative priority of each business process being supported by IT services. The views represented by the BIA should encompass all levels of the organization as well as any other stakeholders that might be affected.
2. Requirements Risk Analysis
Another activity performed in order to determine the requirements of IT Service Continuity Management is that of Risk Analysis. This involves the assessment of the existing threats that might cause disruption, as well as how vulnerable the organization is to each threat. As a result, this activity is a joint responsibility of ITSCM, Availability Management and Information Security Management. A standard and defined methodology should govern the use of Risk Analysis and Risk Management activities within the organization. One particular methodology that might be used is the Management of Risk (M_o_R) framework, which is shown in Figure 6.13.
Figure 6.13: The M_o_R framework Crown Copyright 2007 Reproduced under license from OGC
The M_o_R approach adopts the following principles when applied:
M_o_R principles, which are derived from corporate governance principles and are essential for developing good practices for risk management.
M_o_R approach, which documents the agreed approach for the organization, including dynamic documents such as: Risk Management policy; Process guides; Plans; Risk registers; and Issue logs.
M_o_R processes, which consist of four main steps: Identifying threats and opportunities; Assessing the effect of threats and opportunities; Planning to reduce the threats and maximize opportunities; and Implementing the corrective action and reviewing where the results do not meet expectations.
Embedding and reviewing M_o_R, to continually review and improve the practices for Risk Management.
Communication, which ensures that appropriate communication occurs, with plans documented to ensure staff members and stakeholders know their responsibilities and who the audience for communication should be.
Figure 6.14: Developing a risk profile Crown Copyright 2007 Reproduced under license from OGC
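As a rough illustration of what a risk profile captures, the Python sketch below scores each risk by severity and likelihood and flags those that exceed an acceptance threshold. The three-point scales, the multiplication of the two ratings, the threshold value and the example risks are all assumptions made for illustration; an organization would substitute the scales defined by its chosen methodology.

```python
# Assumed three-point scales for both severity and likelihood.
SCALE = {"low": 1, "medium": 2, "high": 3}

# Hypothetical threshold: scores above this are deemed unacceptable.
ACCEPTABLE_SCORE = 3


def assess(severity, likelihood):
    """Return (score, acceptable) for one risk."""
    score = SCALE[severity] * SCALE[likelihood]
    return score, score <= ACCEPTABLE_SCORE


# Hypothetical risks: name -> (severity, likelihood).
risks = {
    "Loss of WAN link": ("high", "medium"),
    "Data centre flood": ("high", "low"),
    "Desktop failure": ("low", "high"),
}

profile = {name: assess(*ratings) for name, ratings in risks.items()}

# Risks deemed unacceptable require risk reduction or recovery measures.
unacceptable = sorted(name for name, (_, ok) in profile.items() if not ok)
print(unacceptable)   # ['Loss of WAN link']
```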
Using their chosen methodology, the organization should develop and maintain a risk profile, which classifies risks on scales of severity and likelihood to occur. This profile will also show which risks have been determined to be acceptable, and for those deemed unacceptable there are some risk reduction or recovery measures required. 3. IT Service Continuity Strategy
The results of the BIA and Risk Analysis will be used by BCM and ITSCM to begin developing appropriate strategies in response. Overall, the strategy should represent a balance between risk reduction and recovery options, as well as a balance between the cost of developing and maintaining these options and the impact felt if the risks do eventuate. Typical measures used for risk reduction include: UPS and backup power systems for computers, servers and other infrastructure; Systems designed with fault tolerance when any downtime is unacceptable (involves multiple redundancy with load sharing and/or automated failovers); RAID arrays for disk storage; Spare equipment such as routers, switches, desktops and laptops to be used in the case of component failure; Off-site storage for backups and for failover systems; and Multiple suppliers for critical sub-services (e.g. WAN and internet connections). Typical measures for recovery include: Manual workarounds, such as using paper-based systems for a limited timeframe;
Reciprocal arrangements, where two or more organizations share the costs associated with developing and operating shared facilities that can be used in the case of a disaster occurring; Gradual recovery (also known as cold standby), where the recovery facilities provide empty accommodation equipped with power, network cabling and telecommunications connections. Over the course of the disruption the provider moves in and configures any infrastructure required to recover service; Intermediate recovery (also known as warm standby), where the recovery facilities (often provided by third parties) provide the accommodation for necessary staff and house preinstalled infrastructure to be used for recovery. The actual recovery, however, will take some time, as the infrastructure will need to be reconfigured and applications and data restored from backups; Fast recovery (also known as hot standby), where the recovery facilities house dedicated infrastructure for the organization to utilize in the case of disruption. In the event of a failure the organization can then initiate failover to the recovery site, restore any backups and recover service within a 24-hour period; and Immediate recovery (also known as hot standby), which provides recovery facilities that support the immediate restoration of services, with potentially no visible impact on the business operations themselves. This is often implemented in such a way that the organization houses dedicated equipment at an alternative site (often far enough away to not be affected by the same risks, such as blackouts or weather events). In some cases the IT services actually being protected by this recovery option will only be those that support a vital business function.
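The trade-off between recovery options can be sketched as a simple selection: choose the least costly option whose recovery time still meets the requirement identified by the BIA. In the Python sketch below only the 24-hour figure for fast recovery comes from the text above; the other recovery times and the relative cost rankings are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryOption:
    name: str
    max_recovery_hours: float   # assumed upper bound on time to recover
    relative_cost: int          # assumed ordinal cost (1 = cheapest)


OPTIONS = [
    # Recovery times other than the 24-hour figure are illustrative guesses.
    RecoveryOption("gradual recovery (cold standby)", 240, 1),
    RecoveryOption("intermediate recovery (warm standby)", 72, 2),
    RecoveryOption("fast recovery (hot standby)", 24, 3),
    RecoveryOption("immediate recovery", 0, 4),
]


def cheapest_option(required_hours):
    """Cheapest option that recovers within the required time, if any."""
    candidates = [o for o in OPTIONS if o.max_recovery_hours <= required_hours]
    if not candidates:
        return None
    return min(candidates, key=lambda o: o.relative_cost)


print(cheapest_option(96).name)   # intermediate recovery (warm standby)
print(cheapest_option(48).name)   # fast recovery (hot standby)
print(cheapest_option(0).name)    # immediate recovery
```

In practice the choice is rarely this mechanical, since different services within the same organization will often warrant different options.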
It is important that the strategy includes a combination of measures, so that the balance between cost and risk as well as prevention and recovery is obtained. The plan should document where staff will be located, as well as how other critical services are managed such as power, water, telecommunications, couriers and information management.
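The balance between the cost of a recovery option and the impact felt when a risk eventuates can be illustrated with a small calculation. The sketch below is only one possible way to frame it: the cost figures, the incident frequency and the outage durations for gradual and intermediate recovery are hypothetical values chosen for illustration (only the 24-hour figure for fast recovery echoes the text), and `RecoveryOption`, `expected_annual_total` and `best_balance` are names invented for this example.

```python
from typing import NamedTuple


class RecoveryOption(NamedTuple):
    name: str
    annual_cost: float    # assumed yearly cost to develop and maintain the option
    outage_hours: float   # assumed time to restore service when the option is used


def expected_annual_total(option, incidents_per_year, impact_per_hour):
    """Yearly cost of the option plus the expected business impact of outages."""
    expected_impact = incidents_per_year * option.outage_hours * impact_per_hour
    return option.annual_cost + expected_impact


def best_balance(options, incidents_per_year, impact_per_hour):
    """Pick the option with the lowest combined cost and expected impact."""
    return min(
        options,
        key=lambda o: expected_annual_total(o, incidents_per_year, impact_per_hour),
    )


# Illustrative figures only; they are assumptions, not values from the workbook.
OPTIONS = [
    RecoveryOption("Gradual recovery (cold standby)", 10_000, 96),
    RecoveryOption("Intermediate recovery (warm standby)", 40_000, 48),
    RecoveryOption("Fast recovery (hot standby)", 120_000, 24),
    RecoveryOption("Immediate recovery", 300_000, 0),
]

if __name__ == "__main__":
    for impact in (5_000, 50_000, 200_000):
        choice = best_balance(OPTIONS, incidents_per_year=0.1, impact_per_hour=impact)
        print(f"Impact of {impact} per hour -> {choice.name}")
```

With these assumed numbers, a low hourly impact favours the cheaper, slower options, while a high hourly impact justifies the more expensive ones. In practice the inputs would come from the BIA and Risk Analysis rather than from fixed constants.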
The plan should be under the control of Change Management to ensure integrity, but should also be widely available to key staff at all times. This requires multiple electronic and physical copies to be maintained, as well as some off-site storage of these documents. To facilitate its use in the case of disaster, there should be a checklist that covers the specific actions required during all stages of the recovery, including the actions to evaluate whether normal service operation has resumed.
Other documents that will also be integrated with the BCP and IT Service Continuity Plans are the:
• Emergency Response Plan;
• Damage Assessment Plan;
• Salvage Plan;
• Vital Records Plan;
• Crisis Management and Public Relations Plan;
• Accommodation Plan;
• Security Plan;
• Personnel Plan;
• Communication Plan; and
• Finance and Administration Plan.
Testing
As any recovery plans are being implemented, sufficient testing must be undertaken to ensure the plans' effectiveness, including walk-through tests, full and partial tests, and scenario tests. These tests will also need to be conducted as part of Stage 4 (Ongoing operation) as required.
dependencies such as new systems or networks or a change in service providers, as well as when there is a change in business direction and strategy or IT strategy.
Testing: following the initial testing, it is necessary to establish a program of regular testing to ensure that the critical components of the strategy are tested at least annually, or as directed by senior management or audit. It is important that any changes to the IT Infrastructure are included in the strategy, implemented appropriately and tested to ensure they function correctly.
Change Management: following tests, reviews and day-to-day changes, the ITSCM plan needs to be updated. ITSCM must be included as part of the existing Change Management process to ensure all changes are reflected in the contingency arrangements provided by IT or external suppliers.
Invocation
Invocation is the key component of the BCP and IT Service Continuity Plan, so guidance should be provided to support the decision on whether to invoke the recovery plans. This decision should take into account the extent and scope of damage, the likely length of disruption and unavailability, and the time at which the disruption occurred (e.g. whether it occurred during a non-business-critical time of the year).
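The rule that critical components are tested at least annually lends itself to a simple automated check. The following is a minimal sketch, assuming the last test date of each component is recorded somewhere; the function name `overdue_for_testing`, the component names and the dates are all hypothetical.

```python
from datetime import date, timedelta

# "At least annually", as stated in the text; adjust if senior management
# or audit directs a shorter interval.
MAX_TEST_AGE = timedelta(days=365)


def overdue_for_testing(last_tested, today):
    """Return the sorted names of components never tested or tested over a year ago."""
    overdue = []
    for component, tested_on in last_tested.items():
        if tested_on is None or today - tested_on > MAX_TEST_AGE:
            overdue.append(component)
    return sorted(overdue)


if __name__ == "__main__":
    # Hypothetical test register for illustration.
    register = {
        "Failover to recovery site": date(2023, 1, 10),
        "Backup restore": date(2023, 11, 1),
        "Emergency contact list": None,  # never tested
    }
    for name in overdue_for_testing(register, today=date(2024, 3, 1)):
        print("Overdue:", name)
```

A check like this only covers the calendar-based trigger; tests prompted by infrastructure or business changes would still need to be raised through Change Management.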
Typical responsibilities for ITSCM in planning and dealing with disaster are similar to how First Aid Officers and Fire Wardens act in planning and operational roles (they may not be full-time roles, but are instead a hat they wear when required). See the following table for an example of how responsibilities for ITSCM are typically assigned.
Role | Responsibilities
Board | Crisis Management; Corporate/Business decisions; External affairs
Senior Mgmt | Co-ordination; Direction and arbitration; Resource authorization
Management | Invocation of continuity or recovery; Team Leadership; Site Management; Liaison & Reporting
Supervisors and staff | Task execution; Team membership; Team and Site liaison
6.5.2 Scope
The process should be the focal point for all IT security issues, and must ensure that an Information Security Policy is produced, maintained and enforced that covers the use and misuse of all IT systems and services. This will include understanding:
• Business Security Plans;
• Current business operation and its security requirements;
• Future business plans and requirements;
• Legislative requirements;
• Obligations and responsibilities; and
• Business and IT risks and their management.
As a guide, the Information Security Management process should include activities to:
• Produce, maintain, distribute and enforce the ISM policy and supporting security policies;
• Understand the agreed current and future security requirements of the business and the existing Business Security Plans;
• Implement a set of security controls that support the ISM policy and manage associated risks;
• Document all security controls, together with the operation and maintenance of the controls and their associated risks;
• Manage all suppliers and contracts regarding access to systems and services, in conjunction with Supplier Management;
• Manage all security breaches and incidents associated with all systems and services;
• Proactively improve security controls and security risk management, and reduce security risks; and
• Integrate security aspects with all other IT Service Management processes.
framework for managing security will help to ensure that the Four Ps of People, Process, Products, and Partners are considered as to the requirements for security and control.
Figure 6.15: Framework for managing IT security Crown Copyright 2007 Reproduced under license from OGC
As a guide, standards such as ISO 27001 provide a formal standard against which organizations can compare or certify their own ISMS, covering the five main elements of:
1. Plan
Planning is used to identify and recommend the appropriate security measures that will support the requirements and objectives of the organization. SLAs and OLAs, business and organizational plans and strategies, regulation and compliance requirements (such as Privacy Acts), as well as the legal, moral and ethical responsibilities for information security, will be considered in the development of these measures.
2. Implement
The objective of this element is to ensure that the appropriate measures, procedures, tools and controls are in place to support the Information Security Policy.
3. Control
The objectives of the control element of the ISMS are to:
• Ensure the framework is developed to support Information Security Management;
• Develop an organizational structure appropriate to support the Information Security Policy;
4. Evaluate
The evaluate element of the ISMS is focused on ensuring:
• Regular audits and reviews are performed;
• Policy and process compliance is evaluated; and
• Information and audit reports are provided to management and external regulators if required.
5. Maintain
As part of Continual Service Improvement, the maintain element seeks to:
• Improve security agreements as documented in SLAs and OLAs; and
• Improve the implementation and use of security measures and controls.
Figure 6.16: Security Control Crown Copyright 2007 Reproduced under license from OGC
There are various security threats to our infrastructure, and we want to prevent or reduce the damage from these as much as possible. Prevention/risk reduction measures (e.g. antivirus systems and firewalls) help us to do this. If threats do pass our prevention mechanisms, we need detection techniques to identify when and where they occurred. Once a security incident has occurred, we want to repress or minimize the damage associated with it. We then want to correct any damage caused and recover our infrastructure to normal levels (e.g. antivirus systems quarantining an affected file). After this process we need to review how and why the breach occurred and how successful we were in responding to it.
To assist in identifying what controls are missing or ineffective, a matrix can be developed that analyzes each of the control measures used for the different perspectives of security that need to be protected and controlled.
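Such a matrix can be kept as a simple data structure and checked for empty cells. The sketch below is illustrative only: it assumes the columns are the control stages described above (prevention, detection, repression, correction, review) and that the rows are security perspectives such as confidentiality and integrity. The function name `find_gaps` and the example controls are hypothetical.

```python
# Control stages taken from the description of Figure 6.16; the labels are assumptions.
CONTROL_TYPES = ["prevention", "detection", "repression", "correction", "review"]


def find_gaps(matrix, control_types=CONTROL_TYPES):
    """Return (perspective, control_type) pairs for which no control is recorded."""
    gaps = []
    for perspective, controls in matrix.items():
        for control_type in control_types:
            if not controls.get(control_type):
                gaps.append((perspective, control_type))
    return gaps


if __name__ == "__main__":
    # Hypothetical matrix: each cell lists the controls currently in place.
    matrix = {
        "confidentiality": {
            "prevention": ["firewall"],
            "detection": ["intrusion detection"],
            "repression": [],
            "correction": ["credential reset"],
            "review": ["post-incident review"],
        },
        "integrity": {
            "prevention": ["antivirus"],
            "detection": ["antivirus"],
            "repression": ["file quarantine"],
            "correction": ["restore from backup"],
        },
    }
    for perspective, control_type in find_gaps(matrix):
        print(f"No {control_type} control recorded for {perspective}")
```

An empty cell only shows that a control is missing; judging whether an existing control is ineffective still requires review by the people responsible for it.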
7.2.2 IT Planner
The IT Planner makes (or at least coordinates the creation of) the IT Plans. These include strategic plans, IT standards, policy and strategy implementation plans.
Description
1. Recommend policy for the effective use of IT throughout the organization.
2. Obtain and evaluate proposals from suppliers to ensure that all business and IT requirements are satisfied.
3. Take ultimate responsibility for prioritizing and scheduling the implementation of new or changed services within IT.
4. Develop the initial plans for the implementation of authorized new IT services, clearly listing costs and expected benefits.
5. Conduct Post Implementation Reviews (PIRs) in conjunction with Change Management.
6. Make sure that all IT Planning processes, roles, responsibilities and documentation are regularly reviewed and audited for efficiency, effectiveness and compliance.
7.2.3 IT Designer/Architect
Where the IT Planner coordinates the overall production and coordination of IT plans, and the Service Design Manager coordinates the deployment of quality solution designs, we need another role to focus completely on the technology behind the service.
Description
1. Produce a detailed process map that documents all processes and their high-level interfaces.
2. Design secure and resilient technology architectures that meet all current and anticipated business needs.
3. Design an appropriate and suitable Service Portfolio.
4. Create and maintain IT design policies and criteria, including (but not limited to) connectivity, capacity, security and recovery.
5. Ensure that all new services meet their service levels and targets.
6. Provide advice to management on the planning phases of IT systems, to ensure that requirements are reflected in the overall specifications.
Key Skills
1. Good knowledge of Design Philosophies.
2. Good knowledge and practical experience with Programme and Project Management.
3. Understand how architectures, strategies, designs and plans fit together.
4. Good knowledge and understanding of Service Management Frameworks.
Raise incidents and problems when breaches of capacity or performance thresholds are detected, and assist with the investigation and diagnosis of capacity-related incidents and problems. Identify and implement initiatives to improve resource usage, for example demand management techniques. Assess new technology and its relevance to the organization in terms of performance and cost against relative Service Value and underserved demand. Be familiar with potential future demand for IT services and assess its effect on performance service levels. Ensure that all changes are assessed for their potential effects on demand for IT services. Assess new techniques and hardware and software products for use by Capacity Management that might improve efficiency in managing and serving demand. Report on service quality and performance against targets contained in SLAs. Maintain a knowledge of future demand for IT services and predict the effects of demand on performance service levels. Determine performance service levels that are maintainable and cost-justified.
Key Skills
Demonstrate awareness of the business priorities, objectives and business drivers. Demonstrate awareness of the role IT plays in enabling the business objectives to be met. Advanced customer service skills. Awareness of what IT can deliver to the business, including the latest capabilities. Demonstrate the competence, knowledge and information necessary to complete the role effectively. Use, understand and interpret best practice, policies and procedures to ensure adherence. Demonstrate management skills, both from a personnel management perspective and from the overall control of the process. Exceptional organizational and communication skills. Ability to articulate all information regarding Demand and Capacity Management in both written and verbal forms.
Assist with the investigation and diagnosis of all incidents and problems that cause availability issues or unavailability of services or components. Specify the requirements for new or enhanced event management systems for automatic monitoring of availability of IT components. Specify the reliability, maintainability and serviceability requirements for components supplied by internal and external suppliers. Monitor actual IT availability achieved against SLA targets, and provide a range of IT availability reporting to ensure that agreed levels of availability, reliability and maintainability are measured and monitored on an ongoing basis. Create, maintain and regularly review an AMIS and a forward-looking Availability Plan, aimed at improving the overall availability of IT services and infrastructure components, to ensure that existing and future business availability requirements can be met. Ensure that the Availability Management process and its associated techniques and methods are regularly reviewed and audited, and that all of these are subject to continual improvement and remain fit for purpose. Work with Financial Management, ensuring the levels of IT availability required are cost-justified. Maintain and complete an availability testing schedule for all availability mechanisms. Assist Security and IT Service Continuity Management with the assessment and management of risk. Assess changes for their impact on all aspects of availability, including overall service availability and the Availability Plan.
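Monitoring achieved availability against SLA targets usually starts from a simple percentage calculation. The sketch below uses the commonly quoted ITIL formula, (agreed service time minus downtime) divided by agreed service time, expressed as a percentage. The formula is not stated in this section, and the function names and figures are hypothetical, so treat it as an illustration rather than a prescribed method.

```python
def availability_percent(agreed_service_hours, downtime_hours):
    """Availability as a percentage of the agreed service time."""
    if agreed_service_hours <= 0:
        raise ValueError("agreed_service_hours must be positive")
    if not 0 <= downtime_hours <= agreed_service_hours:
        raise ValueError("downtime_hours must lie between 0 and the agreed service time")
    return (agreed_service_hours - downtime_hours) / agreed_service_hours * 100


def meets_target(agreed_service_hours, downtime_hours, target_percent):
    """True when the achieved availability is at or above the SLA target."""
    return availability_percent(agreed_service_hours, downtime_hours) >= target_percent


if __name__ == "__main__":
    # Hypothetical month: 720 agreed service hours, 7.2 hours of downtime,
    # measured against an assumed SLA target of 99.5%.
    achieved = availability_percent(720, 7.2)
    print(f"Achieved availability: {achieved:.2f}%")
    print("Target met:", meets_target(720, 7.2, 99.5))
```

Only downtime that falls within the agreed service time should be counted; how planned maintenance is treated depends on what the SLA itself says.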
organization's business planning cycle, identifying current usage and forecast requirements during the period covered by the plan. Ensure that appropriate levels of monitoring of resources and system performance are set. Analyse usage and performance data, and report on performance against targets contained in SLAs. Raise incidents and problems when breaches of capacity or performance thresholds are detected, and assist with the investigation and diagnosis of capacity-related incidents and problems. Identify and initiate any tuning needed to optimize and improve capacity or performance. Identify and implement initiatives to improve resource usage, for example demand management techniques. Assess new technology and its relevance to the organization in terms of performance and cost. Be familiar with potential future demand for IT services and assess its effect on performance service levels. Ensure that all changes are assessed for their impact on capacity and performance, and attend CAB meetings when appropriate. Produce regular management reports that include current usage of resources, trends and forecasts. Size all proposed new services and systems to determine the computer and network resources required, hardware utilization, performance service levels and cost implications. Assess new techniques and hardware and software products for use by Capacity Management that might improve the efficiency and effectiveness of the process. Carry out performance testing of new services and systems. Report on service and component performance against targets contained in SLAs. Maintain a knowledge of future demand for IT services and predict the effects of demand on performance service levels. Determine performance service levels that are maintainable and cost-justified. Recommend tuning of services and systems, and make recommendations to IT management on the design and use of systems to help ensure optimum use of all hardware and operating system software resources.
Act as a focal point for all capacity and performance issues.
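The responsibility to raise incidents when capacity or performance thresholds are breached can be supported by a basic comparison of monitored values against agreed limits. This is a minimal sketch under assumed inputs: the resource names, the percentage thresholds and the function name `threshold_breaches` are hypothetical, and a real monitoring tool would supply the utilization data.

```python
def threshold_breaches(utilization, thresholds):
    """Return {resource: (observed, threshold)} for each monitored resource
    whose observed utilization is at or above its threshold."""
    breaches = {}
    for resource, limit in thresholds.items():
        observed = utilization.get(resource)
        if observed is not None and observed >= limit:
            breaches[resource] = (observed, limit)
    return breaches


if __name__ == "__main__":
    # Hypothetical thresholds and readings, in percent utilization.
    thresholds = {"cpu": 80, "disk": 90, "network": 70}
    readings = {"cpu": 85, "disk": 60}
    for resource, (observed, limit) in threshold_breaches(readings, thresholds).items():
        print(f"{resource}: {observed}% is at or above the {limit}% threshold")
```

Resources with no reading are skipped here rather than reported; whether missing monitoring data should itself raise an event is a design choice for the organization.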
requirements of the organization's Business Continuity Management process, and represent the IT services function within the Business Continuity Management process. Ensure that all ITSCM plans, risks and activities underpin and align with all BCM plans, risks and activities, and are capable of meeting the agreed and documented targets under any circumstances. Perform risk assessment and risk management to prevent disasters where cost-justifiable and practical. Develop and maintain the organization's continuity strategy. Assess potential service continuity issues and invoke the Service Continuity Plan if necessary. Manage the Service Continuity Plan while it is in operation, including fail-over to a secondary location and restoration to the primary location. Perform post mortem reviews of service continuity tests and invocations, and instigate corrective actions where required. Develop and manage the ITSCM plans to ensure that, at all times, the recovery objectives of the business can be achieved. Ensure that all IT service areas are prepared and able to respond to an invocation of the continuity plans. Maintain a comprehensive IT testing schedule, including testing all continuity plans in line with business requirements and after every major business change. Undertake quality reviews of all procedures and ensure that these are incorporated into the testing schedule. Communicate and maintain awareness of ITSCM objectives within the business areas supported and IT service areas. Undertake regular reviews, at least annually, of the Continuity Plans with the business areas to ensure that they accurately reflect the business needs. Negotiate and manage contracts with providers of third-party recovery services. Assess changes for their impact on Service Continuity and Continuity Plans. Attend CAB meetings when appropriate.
parties. Ensure that the Information Security Policy is enforced and adhered to. Identify and classify IT and information assets (Configuration Items) and the level of control and protection required. Assist with Business Impact Analysis. Perform Security Risk Analysis and risk management in conjunction with Availability and IT Service Continuity Management. Design security controls and develop security plans. Develop and document procedures for operating and maintaining security controls. Monitor and manage all security breaches and handle security incidents, taking remedial action to prevent recurrence wherever possible. Report, analyse and reduce the impact and volumes of all security incidents in conjunction with Problem Management. Promote education and awareness of security. Maintain a set of security controls and documentation, and regularly review and audit all security controls and procedures. Ensure all changes are assessed for impact on all security aspects, including the Information Security Policy and security controls, and attend CAB meetings when appropriate. Perform security tests. Participate in any security reviews arising from security breaches and instigate remedial actions. Ensure that the confidentiality, integrity and availability of the services are maintained at the levels agreed in the SLAs and that they conform to all relevant statutory requirements. Ensure that all access to services by external partners and suppliers is subject to contractual agreements and responsibilities. Act as a focal point for all security issues.
7.2.9 Service Level Manager (not covered by PPO but an aligned role required)
Description
Design, maintain and review a structure for the process that covers the interactions of the people involved and the expected content of Service Level Management related documents (involving IT and Customers), and coordinate any required Service Improvement Plans/Programmes to eradicate falling Service Delivery performance. Coordinate process reviews utilizing independent parties to provide an objective view on the simplicity of the process and areas for improvement. Take responsibility for implementing any design improvements identified. Establish, maintain and review: Service Level Agreements with the business Customer (including a decision on SLA Structure); Operational Level Agreements with the IT provider; and Underpinning Contracts with third party providers. Create, maintain, market and distribute the Service Catalogue (which documents the IT Services offered by the organization). Control and review: any outstanding process related actions; current targets for service performance; and performance against SLAs, OLAs and UCs. Make available relevant, concise reports that are both timely and readable for Customers and IT providers.
Key Skills
Display a communication style based around listening and demonstrable genuine interest. Ability to use and apply valuable information gained from customers. High degree of people/relationship management focus and an ability to deal with an administrative workload. Will also tend to be balanced in negotiations, almost to the point of neutrality, during discussions between the customer and the IT Service Provider. Take an active interest in learning about services offered by external and internal providers. The manager will be interested in understanding how services are provided, rather than just accepting a marketing statement. Good oral and presentation skills. The manager is a champion for this process and must display an air of confidence, without arrogance. Communicate with people at all levels of the organization; this is one contributing factor that will also require a high degree of understanding of human emotion and resistance. Demonstrate ways to do things differently that will improve the process.
Be very risk conscious, but not risk averse. Although not a highly numeric role, the selected person must be able to understand the basics of supply and demand, with a commonsense attitude to service charging and a grip on basic statistical analysis. Engage in technical discussions with technical people (to ensure credibility) and engage in business discussions with business people, about those
8 Technology Considerations
Technology is a significant factor in the quality and success of Service Offerings & Agreements for the modern service provider. There are two main ways in which delivery and support of services is supported by technology:
• Enterprise-wide tools that support the broader systems and processes within which offerings and agreements are developed and managed; and
• Tools targeted more specifically at supporting Service Offerings & Agreements processes.
The following systems support the wider scope for enterprise requirements, providing automated support for some elements of Service Management:
• IT Service Management systems:
  o Enterprise frameworks.
  o System, network and applications management tools.
  o Service dashboards and reporting tools.
• Specific ITSM technology and tools that cover:
  o SKMS (Service Knowledge Management Systems).
  o Collaborative, content management, workflow tools.
  o Data mining tools.
  o Extract, load and transform data tools.
  o Measurement and reporting systems.
  o Test Management and testing tools.
  o Database and test data management tools.
  o Copying and publishing tools.
  o Release and deployment technology.
  o Deployment and logistics systems and tools.
With particular focus on the Service Offerings & Agreement processes, tools and systems that can be utilized include: Financial Management tools and systems, utilized for charging, accounting and budgeting; Web publishing systems, utilized for easy communication of the Service Catalogues; Monitoring tools for measuring utilization of services, used by Demand Management for predicting and analyzing PBAs and their effect on demand; and Document Management Systems, used to manage SLAs, OLAs, Underpinning Contracts and other controlled documents.
While the needs for supporting technology will be influenced by a large number of factors, an integrated suite of ITSM tools and systems should generally include the following functionality: Self-Help; Workflow or process engine; Integrated CMS; Discovery/Deployment/Licensing technology; Remote control; Diagnostic utilities; Reporting; Dashboards; and Integration with Business Service Management.
Figure 7.1: Typical Activities for selecting ITSM tools and systems Crown Copyright 2007 Reproduced under license from OGC
[Figure not reproduced. Activities shown: What requirements?; Identify products; Selection criteria; Evaluate products; Short listing; Scoring; Select product.]
Typical items to consider when evaluating various products for the most appropriate selection include: Data structure; Integration; Conformity; Flexibility; Usability; Support for monitoring service levels; Conversion requirements; Support options; Scalability; Tool and Vendor credibility; Training needs; Customization; and What level of adaptation is needed to implement the product successfully.
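The scoring activity in the selection process is often carried out as a weighted comparison of the short-listed products against the chosen criteria. The sketch below shows one possible way to do this: the weights, the 1 to 5 scoring scale, the product names and the functions `weighted_score` and `rank_products` are all hypothetical, and only the criterion names are taken from the list above.

```python
def weighted_score(scores, weights):
    """Weighted average of per-criterion scores for one product."""
    missing = [criterion for criterion in weights if criterion not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[c] * w for c, w in weights.items()) / total_weight


def rank_products(candidates, weights):
    """Return product names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical weights (importance) and scores (1 = poor, 5 = excellent).
    weights = {"Integration": 3, "Usability": 2, "Scalability": 1}
    candidates = {
        "Tool A": {"Integration": 4, "Usability": 3, "Scalability": 5},
        "Tool B": {"Integration": 5, "Usability": 2, "Scalability": 3},
    }
    for name in rank_products(candidates, weights):
        print(f"{name}: {weighted_score(candidates[name], weights):.2f}")
```

Agreeing the weights with stakeholders before any product is scored helps keep the result from being tuned towards a preferred vendor.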
8.1.1 Communities
Communities are rapidly becoming the method of choice for groups of people spread across time zones and country boundaries to communicate, collaborate and share knowledge. Examples of services and functions provided within the typical online community are: Community portals; E-mail alias management; Wikis and forum groups; Focus groups; Intellectual property, best practice, work examples and template repository; and Online events and net shows. Successful communities often implement reward schemes for their members to acknowledge and reward the contribution of valuable knowledge assets. It is also recommended that senior management actively participate in these communities to foster a culture and environment that rewards knowledge-sharing and collaboration.
8.1.2 Collaboration
Collaboration is the process of sharing tacit knowledge and working together to accomplish stated goals and objectives. Knowledge services, when properly implemented, can significantly improve the productivity of people by streamlining and improving the way they collaborate. Examples of knowledge services widely available today include: Shared calendars and tasks; Threaded discussions; Instant messaging; White-boarding; Video or teleconferencing; and E-mail.
Figure 9.1: Continual Service Improvement Model. Crown Copyright 2007 Reproduced under license from OGC
[Figure not reproduced. Labels recoverable from the figure: Baseline assessments; Measurable targets.]
The Continual Service Improvement Model summarizes the constant cycle for improvement. While there may be a focus on Service Operation, the questions require close interaction with all the other ITIL processes in order to achieve Continual Service Improvement.
Steps taken to improve Planning, Protection and Optimization:
• What is the Vision? Define what is to be achieved by improving Service Operation. Is the focus on Service Quality, compliance, security, costs or customer satisfaction? What is the broad approach that we should take?
• Where are we now? Take baselines by performing maturity assessments and by identifying what practices are currently being used (including informal and ad-hoc processes). What information can be provided by the Service Portfolio regarding strengths, weaknesses, risks and priorities of the Service Provider?
• Where do we want to be? Define the key goals and objectives to be achieved by the formalization of Service Operation processes, including both short-term and long-term targets.
• How do we get there? Perform a gap analysis between the current practices and defined targets to begin developing plans to overcome these gaps. Typically the process owners and Service Operation manager will oversee the design/improvement of the processes, making sure they are fit for purpose and interface as needed with other Service Management processes.
• Did we get there? At agreed time schedules, checks should be made as to how the improvement initiatives have progressed. Which objectives have been achieved? Which haven't? What went well and what went wrong?
• How do we keep the momentum going? Now that the targets and objectives have been met, what is the next course of improvements that can be made? This should feed back into re-examining the vision and following the CSI model steps again.
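The gap analysis in the "How do we get there?" step can be sketched as a comparison between baseline and target maturity levels. The example below is an illustration under assumptions: the 1 to 5 maturity scale, the process scores and the function name `gap_analysis` are hypothetical and not taken from the CSI model itself.

```python
def gap_analysis(current, target):
    """Return {area: gap} for areas below target, ordered from largest gap to smallest.

    Areas missing from `current` are treated as level 0 (no baseline recorded).
    """
    gaps = {
        area: goal - current.get(area, 0)
        for area, goal in target.items()
        if current.get(area, 0) < goal
    }
    return dict(sorted(gaps.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    # Hypothetical maturity levels on an assumed 1-5 scale.
    baseline = {"Availability Management": 2, "Capacity Management": 3, "ITSCM": 1}
    targets = {"Availability Management": 4, "Capacity Management": 3, "ITSCM": 4}
    for area, gap in gap_analysis(baseline, targets).items():
        print(f"{area}: {gap} level(s) below target")
```

Ordering by gap size is only a starting point for planning; business priority and cost would normally influence which gaps are addressed first.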
10 Summary
For any organization, the ability of the IT organization to identify changes in business requirements, market places and technology, and to respond in an efficient yet controlled manner, is critical, especially in markets where there is high competition for service provision. Even when this is achieved, success still requires more: the development and design of services that meet business requirements in a cost-effective and generally optimal manner, with the potential to scale for future business improvements and growth. In summary, the key benefits delivered as a result of improved Planning, Protection and Optimization are:
• Improved effectiveness and efficiency in the Service Catalogue meeting business demand for IT services;
• Increased return on investment in IT, with appropriate levels of availability, capacity, continuity and security being delivered;
• Improved customer satisfaction through constant communication of service quality, performance and improvement initiatives;
• Improved synchronization between demand cycles and the production of capacity;
• Reduction of risk inherent in the IT infrastructure and services supporting business operations; and
• Improved architectures that suit the diverse requirements of both business and IT stakeholders.
Figure 10.1: Some of the outputs from Service Strategy to the rest of the Service Lifecycle
[Figure not reproduced. Outputs shown from Service Strategy:
To Service Design: IT Budgets, Strategic Objectives, Service Portfolios, Patterns of Business Activity (PBA)
To Service Transition: Service Validation Criteria, Cost Units, Priorities & Risks of IT Services, Requirements Portfolio
To Service Operation: Service Models for support, Service Portfolios, Demand Management Strategies, IT Budgets
To Continual Service Improvement: Nominated budgets for delivering and supporting services, Process metrics and KPIs, Service Portfolios]
Figure 10.2: Some of the outputs from Service Design to the rest of the Service Lifecycle
[Figure not reproduced. Outputs shown from Service Design:
To Service Strategy: Service Assets, Service Components for budgeting and IT Accounting, Service Level Requirements
To Service Transition: Service Acceptance Criteria, Test Plans, Service Transition Plans, Service Design Packages, Configuration Item information, SLAs, OLAs, UCs
To Service Operation: Support Procedures, User Documentation, Service Catalogues, SLAs, OLAs, UCs, Security & Access Policies
To Continual Service Improvement: Nominated budgets for delivering and supporting services, Process metrics and KPIs, Service Portfolios]
10.1 Review Questions
The ITIL V3 Intermediate exams consist of 8 scenario-based multiple-choice questions (though not all exam questions contain a scenario followed by a question). The following practice exam contains 8 questions for you to complete. In the official exam, all scenarios are provided first, then the questions. To make it easier to follow given the number of questions, we have set it out with each scenario followed by its question.
Scenario One
The Martin Luther School District has a reputation for being innovative and experimental in enabling teachers, parents, and students to support each other in the academic and personal growth of their students. Though they are considered one of the top school systems in the United States, they are facing the pressures experienced by most other school districts; namely lack of funding, increased focus on standardized testing, and cultural, social, and legal protection of students and teachers. Recently developed neighbourhoods and changing boundaries have made massive changes to the student and family demographic. Many of the students who have the option to stay with Martin Luther or go to another school district are choosing to stay, while other students who didn't have Martin Luther as an option before are choosing to leave their previous district for the better education provided within Martin Luther. In addition, some schools that used to be administered by a different school district are now under the control of the Martin Luther SD. One of the growing programs provided by Martin Luther is their computer labs and kiosks. The program allows students to learn more about computers by completing class work online. This can be done from school or at home. Currently, every student is a registered user of the Martin Luther computer system and it has become a crowning achievement for the school district. The recent changes have introduced a couple of challenges to the program: a massive import of new students and school facilities to the system, some of whom have little computer experience or little computer equipment available; and limited funding to resolve the problem. The risk of losing the program has become a real issue for administrators, teachers, parents, and students alike. To assist in the changes, the school board has agreed to cut budgets of other programs within the schools to fund the technical needs of these new facilities and students.
Though not a popular decision, it is being supported by all parties impacted. In an effort to alleviate the funding problem, a local computer company has agreed to subsidize the effort with a small grant. In addition to the money, the company has agreed to provide computer equipment that it was planning to destroy. Since this gift does relieve a large portion of the financial concerns, the other programs are requesting to have their original budgets restored.
Question 1
You have been asked to define the capacity requirements for the expanded district. What is the best approach for identifying and defining requirements if you are acting in the role of Capacity Manager?
a) You recognize that in order to be successful, the additional users and facilities have to have the same access and response times that were available to the school district before the zoning change. Work with the IT developers to understand how the previous systems behaved and what is required to maintain the same level in an expanded environment. You recommend a number of requirements based on the worst case scenario that the new facilities have no existing capacity to ensure that there is sufficient capacity to maintain the current level of IT services.
b) Based on the inputs from the school board and faculty, you consult with the Service Level Manager to determine a number of possible approaches for meeting customer requirements. Once the Service Level Requirements have been determined, you work with the IT personnel to meet the requirements in the current production environment. You monitor the progress and create a list of minimum specifications required by the solution. As each facility is added to the district, perform a walk-through inspection of the facility to identify that the minimum specifications for the solution can be met. Where minimum specifications cannot be met, consult with the Service Level Manager to determine the best approach for handling an exception. The required specifications and the exceptions become the defined capacity requirements for the school district.
c) You obtain the Service Level Requirements for the school district from the Service Level Manager. You work with the IT personnel to meet the requirements in the current production environment. You monitor the progress and make adjustments to the IT architecture to produce a standard model for meeting the service requirements. The architecture will define the minimum capacity requirements for the district.
d) Work with IT personnel to identify the capacity concerns of an expanded district. Provide these concerns to the Service Level Manager to assist in negotiating reachable service level requirements. Once the service level requirements have been agreed to, review the current architecture to verify that the service levels are being met. When they are, identify the minimum capacity requirements to be used in each facility that is added to the district.
Question 2
As part of the process to fully understand the requirements on capacity, you decide to understand how the IT systems are being used in the schools. To do this, you've decided to adopt Patterns of Business Activities (PBAs) as the analytic tool. What is the MOST appropriate justification for using the PBA approach to understand how IT systems are being used in the schools?
a) Patterns of Business Activities are a structured description of activities which includes information on the assets required to perform the activity including people, processes, and applications. In addition, the description provides an understanding of frequency of the activity, the volume of activity at given times, specific locations that the activity is found and how long the activity takes place. For a school setting, a set of PBAs can distinguish between class work being performed by 35 students at a time or an individual project being performed by a single student. PBAs can be used to identify how much capacity is required and when based on the descriptions, as well as define security controls for user IDs and crucial availability targets for specific systems. PBAs can become configurable assets for use during the planning and design of the IT architecture.
b) Patterns of Business Activities are the structured description of activities which includes information on the assets required to perform the activity including people, processes, and applications. In addition, the description provides an understanding of frequency of the activity, the volume of activity at given times, specific locations that the activity is found and how long the activity takes place. They are used to understand the IT service requirements of the customer.
c) Patterns of Business Activities are a method for describing the components of an activity, specifically for the use of determining the requirements to support that activity. By using the method, a person can understand the frequency of the activity, the volume of activity at a given time, the location, and duration of the activity. The method can set expectations associated with the types of transactions and communications performed during the day.
d) Patterns of Business Activities provide a comprehensive view of a single activity: the frequency of that action, the volume at any given time, where the activity is most likely to happen, and the expected time to perform the activity. A PBA identifies the required assets needed to perform the activity, whether the activity is manual or automatic. The assets can include types of people, processes, applications, and systems. By having this view, a person can predict what is required to have the activity be successful based on when the activity will start and how many users will be performing the activity. In a school setting, PBA can be used to plan technical resources for classroom projects on the computer or to handle a large number of students logging into the system before dinner to post homework.
Scenario Two
A major hotel and restaurant management company called Highrise has gained the reputation of providing the best and most attractive holiday spots in the country. They currently have higher ratings for their hotels and restaurants than many of their competitors. Another hotel chain called Easy8 has recently announced they are willing to be purchased. Easy8 has catered to business travel and has become a preferred hotel chain because of its commitment to meet the IT needs of its guests. In recent years, they have been having difficulty keeping up with new technologies and have been focusing, with some success, on providing other accommodations for attracting guests.
Despite the difference in strategy, Highrise's management feel that the two companies share many of the basic requirements and the two different strategies could complement each other. After several weeks of discussion, the decision was made to purchase Easy8.
As mentioned, Easy8 has had great success with their commitment to their business guests by providing network connections, business centers and conference rooms on each floor, conference centers, and even workstations in private rooms. The idea allows a business person to simply connect their laptop to the hotel's IT infrastructure and have access to many of the IT capabilities available in their regular office. Unfortunately, the company has not been able to keep up with several emerging technologies, including wireless networking, in a manner that has been cost-effective.
Highrise managers have realized that though both markets require the same level of hospitality management, there are some critical differences between vacationing guests and business guests. These are:
Providing business accommodations requires a greater degree of detail and flexibility in serving individual guests' needs, especially in scheduling and availability of services.
Hotel service disruptions can be tolerated better than IT service disruptions.
Vacationing guests follow a general, predictable pattern, with schedules made far in advance, and require guidance to hotel or local services. Business guests, however, are very unpredictable, with last-minute requirements and rapidly changing needs.
Question 3
You are the Capacity Manager for Highrise and you have been asked to find opportunities to reduce costs. What is the best approach for accomplishing your task?
a) Each hotel has its own dedicated IT person to support its own guests and staff, and each hotel offers different levels of IT support. For example, some hotels offer video conferencing or business centers, others don't. To understand the specific requirements for each hotel, you meet with hotel management to understand the types of IT services used in the hotel and the volumes of use. Use this information with resource utilization reports to understand the impact of guest activity on IT. Using modeling techniques, you create a viable server consolidation strategy by determining the capacity requirements for anticipated activities at each hotel. Hotels with extremely low activities will share resources with a neighboring hotel.
b) You decide that understanding the benefits of IT support in a hotel for a guest would assist in understanding the demand on IT. You meet with hotel management to learn the patterns of guest activities that have been seen at each hotel and any future trends that the hotel may experience. You identify that the IT services offered in the hotel are not consistent from one hotel to the next. Using modeling techniques, you make two recommendations: the first is to provide a consistent level of IT services across every hotel which would lower the cost by establishing a standard configuration while providing a single service package which will improve guest satisfaction. The second recommendation is a server consolidation strategy based on normal peak volumes. You recommend that on-demand computing can be used to handle any unexpected peaks as well as provide additional support options to the guests without having to maintain excess capacity during low volume seasons.
c) Consolidating services seems the most likely option for reducing costs and gaining efficiencies. To justify your position you analyze the volume of activities against resource utilization. With the assistance of hotel IT support, you establish the optimal utilization levels for each hotel based on how the IT resources are used at different levels of demand. From this information, you recommend a server consolidation.
d) You suspect that costs can be greatly reduced by consolidating servers that are being underutilized. You start to look at the server utilization reports for each hotel. From the reports, you identify that nearly half of the servers have utilization of less than 25 per cent with no one server ever reaching 65 per cent. You make a decision to set a utilization objective of 62 per cent for every server in the company and start removing servers from the system, moving their work to other servers. This continues until every existing server meets the objective.
Question 4
You are the Director of IT for Highrise and you have been asked to speak at the next stockholders' meeting about the current state of IT services in the hotel chain. What is the best approach for communicating the current state of IT services?
a) Provide the stockholders a report on various aspects of IT services ranging from resource utilization to availability levels. Reference how the IT staff interact with the hotel staff to meet the guest needs. Provide a summary of what improvements are being made now and in the future.
b) Provide the stockholders with a presentation that starts with the IT vision and its alignment with the current vision of the overall business. Discuss a few of the most visible goals and objectives of IT. Explain to the stockholders where IT services is at in fulfilling their goals and the specific targets that are the focus for the next 3-6 months. Explain some of the steps for meeting those targets and how the stockholders will know that those targets have been met. End the communication with an opportunity for questions.
c) Focus on Capacity Management as the subject of your presentation. Explain what capacity management is and why it is important to the experience of the guest. Describe what the current goals are for this service and where the company is at in meeting those goals. Inform the stockholders of the current service levels on capacity and performance and why they are important from a guest perspective. Describe some of the activities that are currently planned to ensure that the company will meet its goals and what monitoring methods are in place to stay on track. End the communication with an opportunity for questions.
d) Provide the stockholders with a presentation that starts with the business value of IT within Highrise, focusing on how IT provides guests with the capability that they would find in their office. Review the current status of IT management and the organization. Review some of the services that IT support provides to the business and to guests. Discuss some of the knowledge and experience that can be found in the IT department. End the presentation with a recent story describing a guest's satisfaction with IT.
Scenario Three
A graphics design company called BitFil provides numerous services in Seattle. One market that they have experienced recent success in is the film industry, where they have been providing any number of products from movie posters to background settings. The work required is increasing and the company has agreed to open a new facility in Los Angeles.
The Design Team consists of fifteen talented designers. Their responsibilities include identifying customers' requirements and providing at least three designs that meet those requirements. One of the designs is approved and the designers are then responsible for working with their own project team to deliver the final product on time. Four of the Design Team have agreed to move to the new Los Angeles office.
While the new facility has state-of-the-art equipment, the primary network and centralized design database is still located at the Seattle site. The Designers are expected to utilize the central database to store all design documents and update progress on any projects running outside of the facility. From the central database, projects are reviewed and designs approved by management and consultants located in Seattle.
A severe ice storm in Washington State recently disrupted network connectivity for 3 days. Though the Los Angeles facility was able to work locally, staff there were unable to utilize the centralized database. This contributed to delays in the design process and ultimately a number of projects were at risk of missing deadlines.
After an exercise to analyze the cost benefits, a localized network was created with a central database that replicates the Seattle office. Manual procedures have been created to handle delivery of project information and designs through other transportation methods. The solution has resolved delays in maintaining the database, though it is recognized that significant difficulties exist if prolonged disruptions occur in the future.
Question 5
An IT Service Continuity Plan is now in place and requires testing. The CIO has concerns about the impact of a full test on the business but has asked for assurances that the plan will be effective.
What is the MOST appropriate recommendation for testing the IT Service Continuity Plan?
a) To minimize the impact on the business, an initial walk-through simulation of the plan should be conducted with representatives from across the business and the IT organization. This will also identify any potential issues with the plan. The results of this simulation should be reviewed. A full test should be scheduled to verify a full recovery of the business processes and IT services. This full test should be carefully planned with the CIO and announced to minimize the risk to the business. Regular testing of the plan should be agreed and established.
b) To minimize the impact on the business, a separate walk-through should be conducted at each site to contain the overall risk to the business. Upon successful completion of a walk-through, a full test of the plan should be conducted to verify a full recovery of the business. The test should be announced to minimize the potential risk to the business. Future tests should involve all sites.
c) A series of separate walk-through tests should be conducted against a number of scenarios. This will minimize the impact to the business and will sufficiently identify any improvements required to the plan. To further minimize the impact of testing on the business, the test should involve only IT personnel who are responsible for the recovery of IT services within the plan. Regular testing of scenarios should be conducted to ensure the readiness of the IT staff.
d) A full test of the plan should be announced to the staff and performed. Clear objectives and critical success factors should be established with the CIO for the test beforehand. Issues identified by the test should be reviewed to identify any improvement opportunities. A regular schedule for testing should be established.
Question 6
In order to have the Continuity Plan be successful, people need to understand their responsibilities, specifically key people inside the process. What is the MOST appropriate approach to communicating responsibilities?
a) Create an organizational perspective of the Continuity Plan. Start at the executive level to explain the authority, control, and responsibility of executive members during crisis management. Have the executive team assign capable individuals who can be responsible for the coordination of activities at each site including one person who has responsibility to coordinate activities between the sites. Sit down with these individuals and explain their roles. Have them take the lead during tests of the continuity solution so that confidence and trust is built. Bring these coordinators along when working with business and service teams to describe the recovery activities in detail. Perform walkthroughs of the recovery activities before testing to ensure that individuals and groups understand what's expected.
b) Create a document of roles and responsibilities for recovery activities at each site. Provide this document to the site manager for distribution at each location. Follow up with the assigned coordinators to ensure that everyone has received the document and signed to confirm that they have read and understand their roles in a crisis. Keep copies of the signed statements for audit purposes to show that every person in the company is aware of their responsibilities to business and IT Continuity. Create an organizational tree of recovery teams for the coordinators, to be used as a tool for knowing and understanding the recovery teams.
c) Create an organizational chart for the continuity plan. Work with the recovery teams to identify who will be on what team and identify their specific responsibility. Work with the teams during testing to ensure they understand and can perform their duties during a crisis. Perform walkthroughs of the recovery activities after testing to elicit lessons learned and identify improvements for the plan and process. Update the chart whenever changes to personnel are made. Introduce the chart to the executive team along with testing results, so that the members of the recovery team are recognizable to the executive team during a crisis.
d) Work with the executive team to have them understand the continuity plan, what to expect during its execution, and the results of testing. Ensure they are part of the testing to build confidence and trust in the plan and their own authority in crisis management. Work with the recovery teams to have them clearly understand the activities required to recover, the priority of activities from situation to situation, and how to quickly respond to breakdowns in the plan. Perform walkthroughs before testing to ensure that individuals and groups understand what's expected and review the test results to identify improvements to the plan and its execution.
Scenario Four
INB is a subsidiary of a major financial company and specializes in creating and managing tax portfolios for upper-middle class individuals and small businesses. The majority of business is from lawyers and accountants looking to provide tax shelters and manage tax payments for their customers, many of whom use INB's services because of their reputation for securing their clients' personal and financial data. A month ago, INB launched a new program providing tax preparation services and support. Because of this new program, they have experienced a 9 per cent increase in their client base with nearly 60 per cent of their existing clients using the new program.
INB uses the IT services provided by their parent company and the appropriate SLAs are in place for these services. In addition to securing client data, the IT department also provides support in capacity, availability, as well as managed desktop services to INB.
The primary business application used by INB is the Customer/Client Management System (CCMS). The system is basically a relational database that matches the INB clients (lawyers, accountants, etc.) with their customers' portfolios. A single client could have a dozen customers using INB services. Each client has a profile created in the CCMS that can also be a portal to their customers' individual portfolios. CCMS maintains client profiles and customer portfolios within the same application. The design provides great flexibility and stability for all parties involved and is the primary reason for INB's success. The CCMS has high service levels for performance and security in place.
A couple of weeks ago, the capacity of the network reached the performance threshold. Performance of the system has degraded slowly since that time and some incidents of data loss have been reported. Though no indication has been found of a security breach, the situation has placed pressure on INB to resolve the situation quickly. Unfortunately, the launch of the new program has used most of the budget allotted to them by their parent company and the IT department is saying no funding is available to resolve the problem. A service review meeting is scheduled for this week.
Question 7
As the Capacity Manager, you are concerned about the problems that have occurred. What is the best approach to resolve the issue?
a) First of all, perform an analysis of the problem focused on identifying any trends in business activities that may have changed between now and a month ago. Work with INB to describe any new business activities or changes to existing business activities that may have been the result of the new tax preparation service. Document the business activities and determine the capacity requirements for each. Review the current Capacity Plan to determine if the solution will meet the new requirements found in the analysis. Make the appropriate adjustments to the plan and solution using change control and monitor the results with the aim of tuning the solution. Make a further recommendation to resize the CCMS and impacted applications before INB offers a new service to ensure that any new capacity requirements are discovered early on.
b) Review the baseline model of the current capacity solution against current trends in performance to determine if the baseline is accurate. Create a new baseline based on current information. Create an analytic model using the new baseline and paying particular attention to the response times for the impacted systems. Fine tune the model until an optimum performance rate is achieved above the performance targets. Using change management to communicate and control, make the appropriate changes to the infrastructure based on the results of the model.
c) Perform a sizing of the CCMS application to estimate the resources required to support the application in its current state. Work with the CCMS support team to identify gaps between the current support and what is required, paying particular attention to volume and frequency of application use. Create an analytic model to substantiate the findings and make the appropriate changes to the Capacity Plan and infrastructure using change control. Monitor the results with the intention of tuning the solution for better performance.
d) Perform a root cause analysis of the problem with a focus on any changes to business activities related to the new service provided by INB. Work with INB to understand these changes in business activities and identify any changes in requirements to IT capacity they may have. Document the business activities and any new capacity requirements. Review the current Capacity Plan to determine if the solution will meet the new requirements found in the analysis. Make the appropriate adjustments to the plan and solution using change control and monitor the results with the intent of tuning the solution.
Question 8
As a financial institution, INB has to be audit-ready at all times. As IT Security Lead, you have been asked to ensure that this is indeed true in preparation for an external audit in two weeks. This specific audit has been conducted twice before by the same auditor. What is the best approach to prepare for this audit?
a) Review the results of the last two audits against the current Security Policy and Plans to identify any compliance issues that may be present. Perform an internal audit against the Security Policy. Review the results of the internal audit with IT personnel to identify any issues and determine the proper actions to resolve those issues.
b) With appropriate personnel, review the regulatory standards that serve as the foundation for the audit and the results of the last two audits. This activity will provide the framework for understanding overall expectations as well as any issues that were found beforehand. Ensure that any issues found in the last audits have been resolved. Review the Security Policy and Plans against the regulatory standards and raise any questions of non-compliance. Distribute the Security Policy to business and IT personnel with a request to ensure they are compliant with the policy. If possible, perform an internal audit to verify compliance with the Security Policy and Plan. Resolve any non-compliances immediately. Note any activities that currently have a high risk factor that may cause non-compliance and the controls currently in place to mitigate that risk.
c) Review the results of the last two audits to identify what issues were raised. Ensure those issues were resolved and how they were resolved. If an issue has not been resolved, identify the problem and the controls in place to mitigate the risks involved. Perform a self-assessment using the past audits as the basis to determine compliance. Resolve any non-compliances found immediately. Distribute the Security Policy to business personnel to ensure they are in compliance.
d) Review the Security Policy and Plans against the results of the last two audits to determine where possible threats to compliance may be present. Work with the IT teams to isolate any security threats as definable by the audit. Distribute the Security Policy to IT and business personnel with a directive to ensure that they comply with the policy. Perform a self-assessment on audit readiness. Review the results of the self-assessment with IT teams and resolve any issues that appeared.
Our positioning guides the organization in making decisions between competing resources and capability investments.
Our positioning helps managers test the appropriateness of a particular course of action.
Our service providers have the capabilities to support business activities.
We know the recurring patterns of activity in the customer's business.
We know if our customer's activity varies based on the time of the year, location, or around specific events.
There are enough resources to fulfil the demand from the customer's business activity as it occurs.
We are aware of potential scheduling conflicts that may lead to situations with inadequate capacity.
We know if the customer's business is subject to regulations.
Our Service Providers have knowledge and experience with regulatory compliance.
If services come in direct contact with the customers of customers, we have additional policies and guidelines required to handle user interactions and user information.
We know who our customers are.
We know who our customer's customers are.
We know how we create value for our customers and how they create value.
We know what assets we deploy to provide value, and which of our clients' assets receive value.
We know which assets we should invest in and which of our assets our clients value most.
How should we deploy our assets? How do they deploy their assets?
We know what services we provide, and what outcomes we support.
We know what constraints our customers face.
We know which customer assets we support and what assets we deploy to provide value.
We know who the users of our services are.
We know what type of activity we support and how we create value for them.
We know how we track performance and what assurances we provide.
We know our market space.
We know what our market space wants.
We are offering unique products/services in our market space.
Our market space is not already saturated with good solutions.
We have the right portfolio of services developed for a given market space.
We have the right catalogue of services offered to a given customer.
Every service is designed to support the required outcomes.
Every service is operated to support the required outcomes.
We have the right models and structures to be a service provider.
We know which of our services or service varieties are the most distinctive.
There are services that the business or customer cannot easily substitute.
We know which of our services or service varieties are the most profitable.
We know which customers, channels or purchase occasions are the most profitable.
We know what makes us special to our business or customers.
We have measurements that tell us when we are successful and know when that must be achieved.
We are not vulnerable to substitution.
There are means to outperform competing alternatives.
We know what task or activity the service needs to carry out and what job the customer is seeking to execute.
We know what outcomes the customer is attempting to obtain and what the desired outcome is.
We know what constraints may prevent the customer from achieving the desired outcome, and how we can remove these constraints.
We know our strengths and weaknesses, priorities and risks.
We know how our resources and capabilities are to be allocated.
We know what the long-term goals of the service organization are.
We know which services are required to meet our long-term goals.
We know what capabilities and resources are required for the organization to achieve those services.
We know how we will get to offer the services that are required to meet our long-term goals.
When deciding to outsource we know if the candidate services require extensive interactions between the service providers and the business's competitive and strategic resources and capabilities.
When deciding to outsource we know if the customer or market space expects us to do this activity.
The customer or market space will give us credit for performing an outsourcing activity exceptionally well.
We have defined Availability Management's process activities, methods and techniques.
The Reactive activities of Availability Management are defined.
The Proactive activities of Availability Management are defined.
We have defined Availability Management's triggers, inputs, outputs and interfaces.
We have defined Availability Management's KPIs.
We have defined Availability Management's information management reporting.
We have defined Availability Management's challenges, Critical Success Factors (CSFs) and risks.
Service Continuity Management's Stage 1: Initiation is defined.
Service Continuity Management's Stage 2: Requirements and strategy is defined.
Service Continuity Management's Stage 3: Implementation is defined.
Service Continuity Management's Stage 4: On-going operation is defined.
We have defined Service Continuity Management's triggers, inputs, outputs and interfaces.
We have defined Service Continuity Management's KPIs.
We have defined Service Continuity Management's Information Management reporting.
We have defined Service Continuity Management's challenges, CSFs and risks.
12 Glossary
Alert: A warning that a threshold has been reached, something has changed, or a failure has occurred.
Asset: Any resource or capability.
Application Sizing: Determines the hardware or network capacity to support new or modified applications and the predicted workload.
Baselines: A benchmark used as a reference point for later comparison.
CMDB: Configuration Management Database.
CMS: Configuration Management System.
Configuration Item (CI): Any component that needs to be managed in order to deliver an IT Service.
DML: Definitive Media Library.
Function: A team or group of people and the tools they use to carry out one or more processes or activities.
Incident: An unplanned interruption to, or reduction in the quality of, an IT service.
Known Error: A problem that has a documented Root Cause and a Workaround.
KEDB: Known Error Database.
Maintainability: A measure of how quickly and effectively a CI or IT service can be restored to normal after a failure.
Modeling: A technique used to predict the future behavior of a system, process, CI etc.
MTBF: Mean Time Between Failures (Uptime).
MTBSI: Mean Time Between Service Incidents.
OLA: Operational Level Agreement.
Process: A structured set of activities designed to accomplish a specific objective.
Process Owner: Role responsible for ensuring that a process is fit for purpose.
Remediation: Recovery to a known state after a failed Change or Release.
RFC: Request for Change.
Service: A means of delivering value to Customers by facilitating Outcomes Customers want to achieve without the ownership of specific Costs and risks.
Service Owner: Role that is accountable for the delivery of a specific IT service.
SCD: Supplier and Contracts Database.
Service Assets: Any capability or resource of a service provider.
Serviceability: Measures Availability, Reliability, Maintainability of IT services/CIs under control of external suppliers.
SIP: Service Improvement Plan.
SKMS: Service Knowledge Management System.
SLA: Service Level Agreement.
SLM: Service Level Manager.
SLR: Service Level Requirements.
SSIP: Supplier Service Improvement Plan.
Status Accounting: Reporting of all current and historical data about each CI throughout its lifecycle.
Trigger: An indication that some action or response to an event may be needed.
Tuning: Used to identify areas of the IT infrastructure that could be better utilized.
UC: Underpinning Contract.
Utility: Functionality offered by a product or service to meet a particular need. Often summarized as what it does.
VBF: Vital Business Function.
Warranty: A promise or guarantee that a product or service will meet its agreed requirements.
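The availability-related glossary terms (MTBF, MTBSI, Maintainability) can be related to one another with a small calculation. The sketch below is illustrative only and is not taken from this workbook: the `Incident` record, the function name and the sample figures are hypothetical, incidents are assumed not to overlap, and the averaging convention (total uptime or downtime divided by the number of incidents) is a common simplification. MTRS (Mean Time to Restore Service) is used here as the usual numeric measure of Maintainability.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """One service incident; times are hours from the start of the period (hypothetical shape)."""
    start: float     # hour at which the service failed
    restored: float  # hour at which normal service resumed


def availability_metrics(incidents, agreed_service_hours):
    """Return MTBF, MTRS, MTBSI and availability % for one reporting period.

    Assumes incidents do not overlap and all fall inside the agreed service time.
    """
    if not incidents:
        raise ValueError("at least one incident is needed to compute the means")
    downtime = sum(i.restored - i.start for i in incidents)
    uptime = agreed_service_hours - downtime
    count = len(incidents)
    return {
        "MTBF": uptime / count,                   # average uptime between failures
        "MTRS": downtime / count,                 # average time to restore service
        "MTBSI": agreed_service_hours / count,    # equals MTBF + MTRS
        "availability_pct": 100.0 * uptime / agreed_service_hours,
    }


if __name__ == "__main__":
    sample = [Incident(100, 102), Incident(300, 301), Incident(500, 503)]
    for name, value in availability_metrics(sample, 600).items():
        print(f"{name}: {value:.1f}")
```

With the sample data (three incidents totalling 6 hours of downtime in a 600-hour period) this gives an MTBF of 198 hours, an MTRS of 2 hours, an MTBSI of 200 hours and 99.0% availability, which shows how MTBSI spans both the uptime and the restoration time of an average incident.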
13 Certification
13.1 ITIL Certification Pathways
Many pathway options are available to you once you have acquired your ITIL Foundation Certification. The following illustrates the possible pathways that are available to you. Currently it is intended that the highest certification is the ITIL V3 Expert, considered to be equal to that of Diploma Status.
For more information on certification and available programs please visit our website http://www.artofservice.com.au
D C A
Question 2
Question Rationale: Focuses on the reasons for using Patterns of Business Activities.
Most Correct (A): This answer covers most of the fundamental components of a PBA, and relates these to the school setting. It goes beyond its use in capacity management to also touch on security and availability uses. It also speaks of PBAs becoming a distinct asset.
Second Best (D): Some of the same benefits as A, but restricts its uses to Capacity Management and does not mention that PBAs can become assets to IT services.
Third Best (B): Provides a good explanation of what a PBA is, but is short on explaining the possible uses of a PBA.
Distracter (C): The answer doesn't have any noticeable reference to IT, so it is not clear how the method can be used specifically.
Question 3
Question Rationale: The question concentrates on Capacity Management, specifically on processes for identifying cost reduction opportunities.
Most Correct (B): This is the best answer because it focuses on and stays inside the guest experience of IT, providing a recommendation that should increase guest satisfaction. It uses modelling techniques to justify the recommendations.
Second Best (A): This answer focuses on the business's ability to support IT services more than the guest experience. It uses a modelling technique to understand and justify the strategy for server consolidation.
Third Best (C): Like answer A, this lacks any focus on guest experience. It performs a small level of analysis to substantiate its direction, but not enough to be the best solution.
Distracter (D): This approach becomes an arbitrary consolidation based on reducing cost only, without regard to business or guest impact.
Question 4
Question Rationale: The question focuses on Describing a Service within the context of a stockholder's meeting.
Most Correct (B): It answers the task at hand directly by speaking about the entire IT Service Management solution. It answers all the relevant questions used to implement a service: what, where, where, how?
Second Best (C): This answer uses the same framework as answer B, but focuses on a single service. Though it may not provide all the information about IT services in general, it does provide a creative and structured approach for communicating the value of IT to the guests in a very specific way.
Third Best (D): The framework used here is used to define a service. It has all the components, but the content is dry and not relevant because it lacks guest perspective.
Distracter (A): This approach is highly technical and statistical. Though it may have a lot of good information, it will not communicate.
Question 5
Question Rationale: The question focuses on the testing of the IT Service Continuity Plan and the proper approach when dealing with business concerns.
Most Correct (A): The best answer because of its steady approach of simulation testing before a full test to build confidence in the plan. The effort involves the business and IT organization from the very beginning and creates a structure for regular testing in the future.
Second Best (D): This recommendation involves the CIO if not the entire business and establishes future testing of the plan. It is not as good as answer A because it doesn't take time to build confidence in the plan before performing a test.
Third Best (B): This recommendation does not involve the business or business representation in the planning of the test. Conducting separate walkthroughs at the sites is a good step, but does not build confidence or identify issues in the effectiveness of a full recovery of the business.
Distracter (C): The least effective answer because a full test is never performed, nor does it involve the business. This answer will never effectively build confidence or trust in the full recovery of the business.
Question 6
Question Rationale: Focuses on IT Continuity Planning and ensuring that people understand their roles and responsibilities during a crisis.
Most Correct (A): The best answer works each level of the organization downward to ensure that the entire plan is understood. This is the preferred method because IT continuity requires executive support. In addition, the senior level is used to establish and provide input to the lower levels of the organization, with the intent to build relationships and confidence in the plan.
Second Best: Like A, this can be an effective approach. Unfortunately, though the activities may be present, nothing hands-on is available for the participants, like a presentation or an organizational chart, which runs the risk of participants forgetting the plan.
Third Best: Takes the opposite approach to answer A but is still effective. At the centre of this effort is the creation of an organizational chart that can be used by members of the team at all levels to identify and relate to other persons responsible for business and IT continuity.
Distracter: Wrong answer. Creating and distributing a document for roles and responsibilities does not ensure that people understand the roles and responsibilities given to them, even if they sign a statement that they do. All this answer does is ensure that an audit trail is available to show that work has been done to meet the objective.
Question 7
Question Rationale: Demonstrate an understanding of Capacity Management and how it works with Demand Management, particularly in determining changes to requirements in service.
Most Correct (A): The best answer because it starts with understanding the patterns of business activities, which will more accurately determine the requirements on capacity and other services. The approach also ends with a recommendation for application sizing whenever the business changes their services, not just for IT.
Second Best (D): The approach is similar to A in that it resolves the current problem at hand, but it does not address any future changes in business requirements.
Third Best (B): Not the best answer, but workable. A new service from the business probably will change the baseline and it needs to be re-evaluated. However, the baseline does not take into consideration the customer perspective of their requirements.
Distracter (C): This is the least effective solution because it is performing the sizing after the application or new service it supports has been introduced. Application sizing should always be done before a new service or major change in existing services.
Question 8
Question Rationale: The question focuses on Information Security Management, specifically in the preparation of an audit.
Most Correct (B): This approach ensures that the Security Policy and regulatory standards are aligned, and that the issues from the last two audits are reviewed and resolved. It performs an internal audit against the Security Policy, not the regulatory standard. It also specifies what action to take if a noncompliance is evident and not resolvable.
Second Best (A): Not as comprehensive as answer B, but the internal audit against the Security Policy gives a more likely success rate.
Third Best (D): This is a very reasonable approach and the simplest to implement. However, a self-assessment is usually not as strong as an internal audit, though it can build a great deal of confidence.
Distracter: Wrong answer. Firstly, the Security Policy and Plans are never reviewed. Secondly, the self-assessment is against the last two audits.
15 References
ITIL. Continual Service Improvement (2007) OGC. London. TSO.
ITIL. Passing your ITIL Foundation Exam (2007) OGC. London. TSO.
ITIL. Service Design (2007) OGC. London. TSO.
ITIL. Service Operation (2007) OGC. London. TSO.
ITIL. Service Strategy (2007) OGC. London. TSO.
ITIL. Service Transition (2007) OGC. London. TSO.
ITSMF International (2007). Foundations of IT Service Management Based on ITIL V3. Zaltbommel, Van Haren Publishing.
ITSMF International (2008). ISO/IEC 20000: An Introduction. Zaltbommel, Van Haren Publishing.
ITSMF International (2006). Metrics for IT Service Management. Zaltbommel, Van Haren Publishing.
The Art of Service (2008) CMDB and Configuration Management Creation and Maintenance Guide, Brisbane, The Art of Service.
The Art of Service (2008) How to Develop, Implement and Enforce ITIL v3 Best Practices, Brisbane, The Art of Service.
The Art of Service (2008) IT Governance, Metrics and Measurements and Benchmarking Workbook, Brisbane, The Art of Service.
The Art of Service (2007) ITIL Factsheets, Brisbane, The Art of Service.
The Art of Service (2008) Risk Management Guide, Brisbane, The Art of Service.
Websites
www.artofservice.com.au
www.theartofservice.org
www.theartofservice.com
B backups 62, 70, 91-2 balance 23, 46, 53, 59, 69, 91-2 Balancing service 24 baselines 44, 81, 83, 119, 129, 139, 147 Basic Concepts for Availability Management 55 BCM 86-7, 91 BCM process 87 BCP (Business Continuity Plans) 92-4 benefits 9-10, 25, 33, 36, 40, 46, 64, 80, 83, 119, 125, 145 potential business 65 Benefits of Service Design 40 BENEFITS of SERVICE DESIGN 3 Benefits of Service Strategy 36 BENEFITS of SERVICE STRATEGY 3 BIA (Business Impact Analysis) 85-9, 108, 110 Brisbane 148 budgets 8, 18, 37-8, 47, 120, 122, 129, 135 bundle 23-4 business 5, 7, 9-11, 33-4, 40-1, 44-7, 52-3, 62, 65-7, 74, 78-9, 84-6, 95-8, 126-8, 130-7, 145-7 [15] anticipated 105 associated 59 changing 39 customer s 132 interview 72 normal 70 business accommodations 125 business action 33 business activities 38, 47, 49, 78, 105, 123-4, 129-30, 145, 147 customer's 132 customer s 47 patterns of 37, 44, 48, 76 business application, primary 129 business applications 65 Business Apps 30 business areas 95, 109 business availability requirements 107 Business Capacity Management 78 Business Capacity Management sub-process 136 business case 33, 74 business centers 124-5 business changes 109, 147 business commitment 74 Business Continuity Management 86 Business Continuity Management practices 85 Business Continuity Management process 109 Business Continuity Plans (BCP) 92-4 business Customer 111 business/customers 36 business developments 66 business direction 85, 94 business drivers 106 business experience 67 business goals 5, 135 business growth 9, 51 business guests 124-5 business hours 65 business impact 58-9, 89, 134 Business Impact Analysis see BIA initial 135 updated 66
capacity management process 84 capacity management works 44 Capacity Manager 107, 123, 129 Capacity Plan 76, 107, 129-30 Capacity plans 84 capacity-related incidents 106, 108 capacity requirements 105, 107, 123, 125, 129-30, 145 defined 123 minimum 123 CCMS 129 certification 4, 143-4 Certification Pathway 4, 143-4 CFIA (Component Failure Impact Analysis) 60-2, 67, 80 challenges 22, 51, 74, 84, 101, 122, 145 defined Availability Management's 136 defined Capacity Management's 136 defined Service Continuity Management's 137 change control 129-30 Change Management and Service Validation 48 change management process 94 changes services 104 chart 128 organizational 128, 146 choice, wide range of 26-7 CI (Configuration Item) 31, 33, 42, 56-7, 61-2, 69-70, 72, 82, 110, 115, 121, 139-40 CIO 127, 146 CIs (Configuration Items) 61-2, 69-70, 110 classification 31, 99, 115 clients 5, 83, 128-9 collaboration 116 combination 5, 7, 30-1, 53, 63, 92 Commercial Confectionery Organizations 46 commitment 109, 124 Common Capacity Management Activities 80 Common Terminology 3, 21, 33 communities 115-16 company 5, 122, 124-6, 128 parent 129 Complete regular Business Impact Analysis 85 complexity 27, 44, 67, 72, 80 compliance 40, 70, 87, 97, 103, 105, 119, 130 Component Capacity Management 78-9, 82, 84 Component Capacity Management sub-process 79, 136 Component Failure Impact Analysis, see CFIA component failures 57, 61, 91 component unavailability 56 components 9, 22, 27, 34, 53-7, 59-62, 64, 74, 76, 79-82, 84, 94, 107, 124, 135, 139 [1] behavioral 29 computer equipment 122 concepts 3, 8, 10, 14, 16-17, 21-3, 25, 55, 57 basic 135-7 confidence 64, 68, 111, 128, 146-7 confidentiality 96, 110 Configuration Item, see CI Configuration Items, see CIs Configuration Management 37, 41 Configuration Management System 61, 139 conjunction 58, 61, 80, 97, 104, 110 Consolidating services 125 constraints 25, 132-3 context 49, 52, 67, 78, 80, 86, 146 Continual Service Improvement 13, 17-18, 37-8, 41-2, 
58, 66, 99, 103, 118-21, 148 Continual Service Improvement Model 3, 118-19
continuity 14, 26, 85, 88, 95, 120, 128, 146 continuity planning 85, 146 continuity plans 95, 109, 128 Continuous Service Improvement, see CSI contracts 30, 57, 85, 97, 109, 113 cook 22 coordinates 18, 23, 104, 111 coordination 37, 41, 77, 104-5, 128 coordinators 128 Core Service Package 26-7, 50 cost-effectiveness 46, 52-3 Cost-justifiable service quality 9 costs 22-4, 27-8, 33, 35-7, 40, 46, 51, 55, 59, 62, 66-7, 71, 75, 77, 91-2, 125 [7] reducing 125, 133, 145 Creating Service Value 25 creation 104, 111, 145-6 crisis 128, 146 Critical Success Factors, see CSFs Crown Copyright 14, 21, 23-5, 29, 40, 48, 54, 60, 63-4, 68, 71, 82, 86, 89-91, 98, 100 [2] CSFs (Critical Success Factors) 45, 136-7 CSI (Continuous Service Improvement) 17-18, 37, 41-2 culture, organizational 35, 133 Customer and User demands 9 customer assets 24, 37, 132 Customer/Client Management System 129 customer contact, primary 103 customer employees 134 customer enquiries 22 customer expectations 134 customer experience 22, 55 customer perceptions 131 customer perspective 46-7, 50, 52, 147 customer portfolios 129 customer values 131 customers 7-8, 14-15, 18, 21-9, 33, 36, 40, 44-7, 49-50, 52, 55, 58-9, 67-8, 110-11, 128-9, 131-5 [6] associated 34 common 134 customer's 132 external 33 organization s 36 outcomes 22, 140 customer's requirements, identifying 126 customers 58 customer s 26 D damage 68, 88, 94, 100 data diagnostic 69 key 72 Combining Component Capacity Management 84 database 83, 113, 127 decisions 29, 46, 92, 94, 111, 124-5, 132 defined Availability Management's purpose 135 Defined business 57 defined Capacity Management's purpose 136 defined process 30 defined Service Continuity Management's purpose 136 Defining Processes 28 delivery service 25 demand 18, 23-4, 36-7, 44, 46-52, 77, 105-8, 111, 113, 125, 131-2 Demand and Capacity management 106 Demand Management 3, 44, 46-52, 76, 78, 147
Demand Management and Service Level Management 44 Demand Management Process Manager 107 Demand Management Process Owner 105 Demand Management Strategies 38, 120 demand management techniques 46, 106, 108 department 31, 45, 126, 129 dependencies 58, 61-2, 94 deploy 23, 41, 132 deploy service assets 37 description 26, 104-10, 123-4 structured 123-4 design 5, 8, 14, 18, 29, 35, 39-41, 44, 46, 48, 50, 57, 78, 104-5, 126-7, 135 [7] component 76 design of services 3, 37, 40-1, 54, 58, 63, 120, 135 design plans 135 design policies 105, 135 design process 127 design services 135 design stages 80-1 Design Team 126 designers 126-7 Desktop 30-1 diagnosis 52, 69-70, 76, 107 diagnosis of capacity-related incidents 106, 108 diagram 86-7 differentiation 25-6, 50 direction 37, 39, 64, 145 Directory Services 30 disaster 44, 85-6, 92-4, 109 disk 62, 83 display 111 disruption detection, timely 54 disruptions 52, 55-6, 58-60, 62-3, 66-9, 72, 77, 79, 85, 89-90, 92, 94 frequency of 56, 58 service-affecting 52 Distracter 145-7 Distribute 130 district 122-3 expanded 123 school 122-3 document Business Requirements 135 documentation 41, 59, 65, 68-9, 99, 103, 105, 110 documents 10, 13, 21, 34, 54, 65, 87, 90, 92-3, 97, 105, 110-11, 115, 128-30, 135, 147 downtime, total 56, 69 E effectiveness 71, 104-5, 108, 146 efforts 10, 22, 28, 37, 57-8, 122, 146 employees 47-8 end-to-end business, integrated 39 Enforce ITIL 148 Error 14, 18, 70, 139 evaluation 37, 41, 78 event management systems, enhanced 107 events 18, 31, 48, 50, 52, 59-60, 63, 69, 79, 85, 92, 132, 140 evolutions 13 execution 10, 47, 84, 99, 128 experience 5, 7, 9, 37, 105, 125-6, 132 guest 145 External Service Providers 33 F facilities 85, 89, 92, 122-3, 126-7, 145
individuals 13, 128 industry 5, 7, 13 information 18, 23, 41, 48-9, 61-2, 64, 66, 69, 84, 92, 96, 106, 115, 123-5, 143-4, 146 [3] customer-generated 115 processing 115 sensitive 49 information management defined Availability Management's 136 defined Capacity Management's 136 defined Service Continuity Management's 137 information security 96, 98 Information Security (ISM) 3-4, 44, 58, 90, 96-9, 101, 109-10, 137, 147 Information Security Management 3-4, 96-7, 99, 101, 137, 147 Information Security Management Activities 99 Information Security Management and Capacity Management 58 Information Security Management process 96 Information Security Management System, see ISMS Information Security Policy 96-9, 109-10, 137 Information Security Policy Management System 137 Information Technology Infrastructure Library 13 infrastructure 9, 18, 22-3, 33, 44, 51, 53-5, 62-3, 66, 80-1, 83, 85, 91-4, 100, 104, 129 [5] infrastructure components 52, 80, 107 infrastructure information 61 Infrastructure Library 13 initiation 70, 86-7, 103, 137 initiation process 88 input events 63 instigating 109-10 intangible costs 67 Integrated centralized processes 9 interfaces 36-7, 39, 41, 104, 119, 136-7 Internal Service Providers 33 international travel 49 Internet Service Providers (ISP) 26-7 invocations 94, 109 ISM, see Information Security ISM policy 96-7 ISMS (Information Security Management System) 97-9 ISO/IEC 4, 144, 148 ISP (Internet Service Providers) 26-7 IT Service Continuity Management, see ITSCM IT Service Management (ITSM) 3, 5, 7-11, 13-14, 22, 33, 36, 40, 44, 76, 97, 113, 144, 146, 148 iterative process 59, 81 ITIL accredited 5 official 7 ITIL V3 Intermediate exams 122 ITIL s 15 ITSCM (IT Service Continuity Management) 3-4, 44, 54, 61, 64-6, 85-7, 90-1, 94-5, 107, 110, 136 ITSCM, scope of 85 ITSCM plans 85-6, 94-5, 109 ITSCM process 86, 93, 108 ITSM, see IT Service Management ITSM processes 101 ITSM set of organizational capabilities 33 ITSMF 148 K key infrastructure components 
57 Key Performance Indicators, see KPIs Key Skills 105-6, 111 know 96, 126, 131-4 knowledge 7-8, 15, 18, 23, 103, 105-6, 108, 115, 126, 132
L lawyers 128-9 layers 10 Level Management 76 levels 26-7, 41, 44, 48-9, 54, 59, 64-5, 79-80, 83, 88-9, 97, 107-8, 110-11, 123, 125, 145-6 [3] license 14, 21, 23-5, 29, 40, 48, 54, 60, 63-4, 68, 71, 82, 86, 89-91, 98, 100 [2] lifecycle 14, 16, 18, 40, 55, 86, 140 locations 123-4, 128, 132 locks 83 logic operators 63 London 148 loss 36, 58-9, 86, 88 loss of service 88 Luther, Martin 122 M Mainframe 30-1 maintainability 37, 54, 56-7, 59, 107, 139-40 maintenance 54, 57, 64-5, 86, 97, 99, 111, 115 preventative 64-5, 79 Major Concepts of ITIL 16 management 7, 18, 23, 33, 44-6, 53, 59, 63-4, 76, 78-9, 85, 95-7, 99, 107-8, 115, 126-7 [4] crisis 128 customer s 23 Implementing Systems 60 Service Portfolio 78 Systems 59 managers 28, 111 Business unit 9 market space 33, 36-7, 132, 134 matrix 61, 100 meal 22 Mean Time 56 Mean Time Between Failures, see MTBF Mean Time Between Service Incidents, see MTBSI Mean Time to Restore Service, see MTRS meeting, service review 129 meeting business requirements 50 memory 83 methodology 90-1 standard project planning 87 metrics 40-1, 56, 74, 95, 101, 118, 135, 148 misuse 96-7 mitigate 63, 130 model 129 analytic 129 modified service offerings 52 momentum 42, 118-19 monitor 18, 66, 79, 82, 103, 107, 110, 123, 129-30 monitoring 59-60, 64, 67, 79-80, 82, 108 monitoring service levels 114
MOST 123, 127-8 Most Correct 145-7 MTBF (Mean Time Between Failures) 56-7, 62, 74, 139 MTBSI (Mean Time Between Service Incidents) 56-7, 139 MTRS (Mean Time to Restore Service) 56-7, 74, 140 N natural process 14 network capacity 139 network resources 108 new services 61, 99, 105, 107-8, 129-30, 135, 147 sizing of 105, 107 new technologies 65, 79-80, 84, 106, 108, 124 Nominated budgets 38, 42, 120-1 non-business 94 non-compliance 130, 147 O objectives 10-11, 34-5, 39, 45-7, 52-3, 67, 76, 85, 87, 96, 98, 106, 116, 118-19, 126 strategic 14, 35, 38-9, 46, 120 Objectives of Service Design 39 OBJECTIVES of SERVICE DESIGN 3 Objectives of Service Strategy 35 OBJECTIVES of SERVICE STRATEGY 3 Office of Government Commerce, see OGC OGC (Office of Government Commerce) 13, 21, 23-4, 29, 40, 48, 54, 60, 63-4, 68, 71, 82, 86, 89-91, 98, 148 [3] OLAs 18, 42, 74, 98-9, 111, 113, 121, 140 operating system software resources 108 Operational Activities of Capacity Management 81 optimization processes 3, 34, 104, 118 organization 7-8, 10, 13-14, 22-4, 27-9, 33, 35-7, 40, 46-7, 64, 66-7, 87-93, 97-8, 111, 132-3, 146 [14] international 45 multiple independent 24 project 87 umbrella 33 organization can/will 34 organization houses 92 organization works 23 organizational capabilities 7, 33 specialized 7, 33 organizational changes 30, 133 organizational goals 10 organizational management processes 76 organizational objectives 7 organizational requirements 53 organizational structure 31, 98 suggested 31 organizational tree 128 organization's objectives 11 organization s 9, 85, 103, 115 organization s approach 97 organization s Business Continuity Management process 109 organization s continuity strategy 109 organization s control 99 organization s culture 133 organization s distinctiveness 131 organization s objectives 11, 37, 64 outcomes 22, 24-5, 28, 131-3 required 132 outputs 7, 28-9, 61, 121, 136-7 service design 42
P packages 26 developing service 26 single service 125 parents 122 participants 131, 146 Particularly Capacity and Availability Management 48 partners 7, 96, 98 pathways 4, 143-4 patterns 29, 36-7, 47-9, 52, 125, 131, 147 Patterns of Business Activities 123-4, 145 Patterns of Business Activity, see PBAs PBA code 49 PBAs (Patterns of Business Activity) 18, 37-8, 47, 49-51, 105, 113, 120, 123-4, 145 perceptions, customers value 36 performance 23-6, 28, 40, 42, 46, 50-1, 53-4, 56, 76, 78-9, 81-4, 95, 101, 105-6, 108, 129-30 [3] component 108 system 108 performance levels 58, 80-1 performance service levels 106, 108 performance thresholds 106, 108, 129 person 28, 33, 124-5, 128, 146 person/role 103 personnel 123, 127-8, 130, 145 perspectives 3, 8, 62, 82, 100 organizational 128 phases 15, 18, 34, 118 physical components 29 PIRs (Post Implementation Reviews) 104 Planner 104 planning 3, 58, 61, 78-9, 85, 94, 122, 124, 146 effective 52 high level 34, 76 planning phases 105 Planning Protection 3 Planning Protection and Optimization 119 planning tool 33 plans 34, 71, 86-8, 90, 92-4, 98, 101, 104-5, 108, 124, 127-30, 146-7 organizational 98 service management 47 Service transition 135 Point of Sale Services 11 policies 28, 34, 37, 86-7, 97-9, 101, 103-4, 106, 109, 130, 132 defined Availability Management's 135 defined Capacity Management's 136 defined Service Continuity Management's 136 portfolios, customer's 129 Post Implementation Reviews (PIRs) 104 potential risks 40, 127 power 92, 119 PPO processes 44-5, 104, 118 Practice of Service Management 131 PRACTICE of SERVICE MANAGEMENT 3 practices best 13-14, 106, 115, 148 good 7, 13, 15, 90 presentation 126, 146 price 25 principles 28, 55, 86, 90, 135-7 Principles of Capacity Management 77
Priorities & Risks 38, 120 Proactive activities of Availability Management 136 Problem Management 44, 54, 69, 71, 79, 110, 119 problem management processes 66 process compliance 99 process design 30 process effectiveness 101 process engine 114 process guides 90 Process Improvements 18, 118 process integration 79 process manager 28 process map 105 Process metrics and KPIs 38, 42, 120-1 process owner 28, 103, 119, 140 Process Perspective 8 process procurement orders 134 process strategy 103 processes 7-8, 10, 14, 16-18, 28-31, 33-4, 39-41, 44-7, 49-53, 77-9, 103-6, 110-11, 118-19, 123-4, 135, 139-40 [17] Processes & Functions 28 PROCESSES & FUNCTIONS 3 procurement process 135 production environment 40-1, 70, 81, 84, 123 products 7, 11, 59, 76, 98, 114, 126, 133, 141, 148 products/services, unique 132 programs 94, 122, 143-4 new 128-9 project management 104-5 Projected Service Outage (PSO) 65 projects 33, 88, 127 Proprietary knowledge of organizations 13 protection 3, 34, 45, 86, 99, 104, 110, 115, 120 Protection & Optimization 5, 44, 52 Protection & Optimization Best Practices actions 73 Protection & Optimization Best Practices Figure 89, 91, 100 Protection & Optimization Best Practices Question 123, 146 Protection & Optimization Best Practices Server 62 Protection & Optimization Best Practices transaction 71 Protection & Optimization Best Practices 78 Protection & Optimization Processes 3 Protection and Optimization processes 34, 104, 118 Protection and Optimization Processes 34 providers 24-6, 92, 109, 111 modern service 113 modern Service 25 multiple service 134 PSO (Projected Service Outage) 65 Q quality 5, 8, 18, 22, 36, 39-40, 42, 44, 46, 53, 67, 79, 81, 88, 113, 115 [2] perceived 22 quality solution designs, deployment of 104-5 Quantitative availability requirements 58 R RACI 31 RACI Model 31 Rationale 145-7 RCI 31 Reactive activities of Availability Management 136 reactive process 77 recommendations 71-4, 83, 108, 125, 127, 129, 145-7
recovery 59-61, 69-70, 89, 91-3, 105, 127, 140 full 56, 127, 146 recovery activities 128 recovery facilities 92 recovery mechanisms 85 recovery plans 61, 85, 93-4 recovery requirements 70, 95 recovery teams 128 Reduced customer 78 Reduced time 75 reduction 74-5, 95, 97, 99, 139 redundancy 37, 53, 59, 61 regulations 37, 40, 45, 88, 98, 132 relationship 3, 15, 23-4, 34, 49, 51, 58, 146 Relative component recovery sequences timeframes 61 reliability 37, 54, 56-7, 59, 62, 74, 107, 140 repair 69-70 requirements 7, 13, 26, 37, 52-4, 58-9, 76, 78, 87-8, 90, 98, 104-5, 123-6, 129-30, 145, 147 [9] basic 40, 124, 144 businesses 58 compliance 88, 98 serviceability 59, 107 Requirements Portfolio 38, 120 resistance, Customer 51 resolution 52, 70, 72, 76 resource requirements 9, 41, 81, 88 resource usage 106, 108 resource utilization 125-6 resources 23-4, 28, 33, 37, 40, 57-9, 66-7, 69, 72, 76-7, 83, 86-7, 108, 125, 131-3, 139-40 [3] business s 133 strategic 133-4 responsibilities 3, 7, 9, 28, 30-1, 34, 45, 49-50, 64, 66, 76, 87, 94-6, 103-5, 128, 146-7 [4] restaurant 22, 124 restrictions, demand management 46, 51 results 13-14, 36, 44, 56, 62-3, 65-6, 69, 75-7, 79, 81, 85, 90-1, 115, 119-20, 127-30 retirement 115 review 63, 74, 90, 94, 99-100, 107, 110-11, 123, 126, 128-30, 135 regular 93, 95, 107, 109 rewards 115-16 Risk Analysis 90 Risk Analysis and Management 63-4 risk management 64, 90, 107, 109-10 risk profile 91 risk reduction 89, 91 risks 22, 24, 33, 35, 37, 41, 46, 62-4, 90-2, 95-7, 109, 111, 119-20, 127, 130, 135-7 [6] associated 58, 97 longer-term business 85 roles 3, 7-9, 17, 22, 28, 30-1, 34, 44-5, 49, 52, 93, 95, 103-7, 128, 140, 146-7 [3] full-time 94 Roles and Responsibilities for PPO 103 row 32, 83 S Sale Services 11 scenarios 86, 88, 122, 124, 126-8 Scheduled services 83 schedules 65, 125, 135 testing 65, 107, 109 school board 122-3 school setting 123-4, 145 schools 122-3 scope 9, 13, 45, 53, 72, 76, 85-7, 94, 96, 113 defined Availability Management's 135
defined Capacity Management's 136 defined Service Continuity Management's 136 Seattle 126-7 Second Best 145-7 security 26, 44, 52, 96-100, 105, 107, 110, 119, 129, 137, 145 security controls 97, 99-100, 110, 124 security incidents 99-100, 110 security measures 44, 98-9 security policies 96-7, 99 Security Policy 130, 147 Security Policy and Plans 130, 147 security requirements 59, 96 self-assessment 130, 147 server consolidation strategy 125 servers 30-1, 55, 57, 83, 91, 125 Service Acceptance Criteria 42, 121 service analytics 134 Service and Component Capacity Management 78 Service and Process Improvements 18 service areas 109 Service Asset and Configuration Management 37 service assets 24, 37, 41-2, 121, 134, 140 changed 18 service assets interact 37 service availability 18, 52, 54, 56, 60-1, 65-6, 71, 74, 107, 125 level of 52, 55, 57 recover 70 service availability levels 59 service backups 59 service-based on-process flows 8 service breaks 74 service capacity 79 Service Capacity Management 79, 82-4 Service Capacity Management sub-process 79, 136 Service Catalog 111 service catalogue meeting business demand 120 Service Catalogues 18, 37, 41-2, 47, 50, 113, 121 Service Components 42, 121 service configuration 78 Service Continuity 14, 93-4 Service Continuity and Continuity Plans 109 service continuity issues, potential 109 Service Continuity Management 4, 54, 85-6, 107, 136 defined 137 Service Continuity Management's Stage 137 Service Continuity Manager 108 Service Continuity Plans 85, 92-3, 109, 127, 146 Service Continuity Strategy 91 service dashboards 113 Service Delivery 8, 45, 75 Service Delivery and Support 30 Service Delivery performance 111 service demand 47 Service Design 13, 17-18, 34, 37-42, 52, 58, 61, 80, 104, 120-1, 148 Service Design and Service Transition 54, 58 SERVICE DESIGN INTERFACES 3 Service Design Manager 104-5 Service Design Packages 42, 104, 121 Service Design Principles 4, 135 service design processes 53, 58, 104 service design stage 61 
Service Desk 30-1, 47, 65, 70, 135 Service Desk for Support 27
service disruptions 56, 60, 69, 125 Hotel 125 service disruptions Reliability 56 service downtime 58 Service Economics 35 service failure 60, 72 Service Failure Analysis, see SFA service focus 55 service group 134 Service Improvement Plans 18, 140 Service Improvement Plans/Programmes, required 111 Service Incidents 139 service interruptions 71 Service Knowledge Management Systems 113, 140 Service Level Agreements 59, 115, 140 Service Level Management 11, 17, 40-1, 44, 49, 58, 68, 79, 81, 110 Service Level Manager 41, 105, 107, 110, 123, 140, 145 Service Level Packages 26-7, 50, 78 Service Level Requirements 42, 78, 121, 123, 140, 145 service levels 78, 80, 83, 86, 105, 123, 126, 134 high 129 normal 79 required 81 Service Lifecycle 3, 14-18, 22, 34-7, 41-2, 48, 53, 77, 99, 104, 121 Service Lifecycle approach 18 Service Lifecycle Model 14 Service Lifecycle Phases 3, 16, 18 Service Lifecycle processes 63 Service Lifecycle Service Strategy 121 Service Lifecycle work 18 Service Lifecycle Work 3, 18 Service Management 5, 7, 10, 13, 15, 35, 113 service management activities 18, 96 Service Management Frameworks 105 service management practices 9 service management processes 40, 97, 119 Service Models 38, 120 Service Offerings & Agreement processes 113 Service Offerings & Agreements 9, 24, 51, 113, 131 Service Offerings & Agreements processes 113 Service Operation 13, 18, 30, 36-8, 41-2, 53, 58-60, 66, 70, 79-80, 119-21, 148 implementing 119 normal 56, 93 surrounding 119 Service Operation and Continual Service Improvement 54 service operation processes 54, 119 Service Operations Lifecycle 44 service organization 7, 10, 133 Service Owner 33, 103, 140 Service Package Example 26-7 Service Package Example Fi 26 Service Packages 25-7, 49-50 multiple 27 Service Packages and Service Level Packages 26-7 service performance 18, 76, 111 service portfolio 36, 39, 50, 76, 105, 119, 131, 135 service Portfolio 135 service portfolio information 18 Service Portfolio Management and Change 
Management 85 Service Portfolios 35, 38-9, 42, 120-1 service providers 14-15, 24-5, 27, 33, 46-7, 49, 66-70, 80, 88, 94, 111, 119, 132, 134, 140, 144 service provider's ability 78 service provision 9, 51, 85, 120
service quality 36, 46, 50, 103, 106, 119-20 defined 131 improving 131 service-related enquiries 103 service requests 18, 47, 49, 134 service requirements 123-4, 135 service solutions 39, 135 Service Strategy 3, 13, 18, 25, 34-9, 42, 46, 58-9, 104, 120-1, 148 developing 36 effective 36 Developing 35 Service Strategy and Service Design 34 Service Strategy in Service Design 118 SERVICE STRATEGY INTERFACES 3 Service Strategy outputs 38 Service Strategy Principles 3, 35, 131 Service Strategy processes 49 service targets 41, 76 service teams 128 Service Time 56 Service Transition 13, 18, 37-8, 41-2, 58, 120-1, 148 Service Transition and Service Operation 115 Service Transition Plans 42, 121 Service units 24 Service Units 23-4 Service Utility 25-6 Service Utility and Service Warranty 26 Service Validation Criteria 38, 120 service value 25, 36, 48, 50 relative 106 service varieties 132-3 Service Warranty 25 Serviceability 57, 140 services broadband internet 26 bundled 26 business 23 candidate 133-4 changed 39 communication 49 core 26, 50 designing 41 disrupt 55 end-to-end 22 highly-available 60 improving 119 key 36 know 133 local 125 managed desktop 129 managing 40 modified 8, 41, 80-1 modifying 37 package 50 recover 92 report 66 single 146 supporting 26, 38, 42, 47, 49, 57, 118, 120-1 tax preparation 128-9 third-party recovery 109 tuning 108 workflow 116 Event 116
timescales 40, 92 tools 8, 28, 31, 39, 69, 71, 98, 113, 128, 139 project-planning 87 systems management 72, 74 transactions 74, 83, 124, 131 trends 79, 81, 83, 108, 125, 129 trust 128, 146 TSO 148 tuning 83, 129-30, 136, 140 tuning techniques 83 U UCs 18, 42, 111, 121 un-availability requirements 44 unavailability 52, 56, 66-7, 75, 94 unavailability of services 74, 107 understanding 22, 41, 47, 53, 55-6, 58, 62-3, 72, 96, 105, 111, 125, 128, 130, 147 usage 47, 53, 84, 105, 107-8 user profiles 47, 49-50, 105 users 5, 9, 22, 29, 44, 49, 52, 55, 57, 61-2, 67-71, 74, 80, 97, 115, 123-4 utility 25-7, 37, 141 level of 26, 50 utilization 47, 80-1, 125, 136 Processor 82 V Validation, Service 41 value 7, 22-6, 28, 30, 33-4, 46, 50, 131-3, 140, 146 defined Availability Management's 135 defined Capacity Management's 136 defined Information Security Management's 137 defined Service Continuity Management's 136 value creation 25 Van Haren Publishing 148 variations 47, 65 VBFs (Vital Business Function) 57-8, 62, 92, 141 vision 118-19, 126 Vital Business Function, see VBFs Vital Business Functions 57-8, 62, 92, 141 volumes, service design 7 W walkthroughs 128, 146 warranty 25-7, 37, 46, 48, 50, 141 website 115, 143-4, 148 work 10, 18, 26, 31, 44, 49-50, 58, 69, 77, 79, 84, 86, 107, 123, 125-30, 147 Workflow Management 116 workflow process 116 workloads 47, 83 www.artofservice.com.au 143-4, 148 www.theartofservice.com 148 www.theartofservice.org 148