PRINCIPLES AND BASIC CONCEPTS

Problem models
Incidents vs problems
Reactive and proactive problem
management:
Reactive: process activities are triggered in
reaction to an incident that has taken place
Proactive: process activities are triggered by
activities seeking to improve services
REQUEST FULFILMENT
Request fulfilment is the process responsible for
managing the lifecycle of all service requests from the
users.
A service request is a formal request from a user for something to be provided
E.g. password changes, access to printers, PC moves
Define &
Explain Event
Purpose
Scope
Objectives
Principles and Basic Concepts of Incident Management
Known-Error Database
Reactive Problem Management
Proactive Problem Management
Service Desk Services
Handles incidents, resolving as many as possible, where the resolution is
straightforward
Owns incidents that are escalated to other support groups for resolution
Reports problems to the problem management staff members
Handles service requests
Provides information to users
Communicates with the business about major incidents, upcoming
changes, and so on
Manages requests for change on the user's behalf if required
Manages the performance of third-party maintenance providers
Monitors incidents and service requests against the targets in the SLA
Updates the CMS as required
Gathers availability figures, based on incident data
Objectives
Help plan, implement and maintain stable technical
infrastructure to support business processes
Well designed, resilient and cost effective topology
Keep infrastructure in optimum condition
Diagnose and resolve technical failures
Roles
Technical manager/team leader
leadership, control and decision making for the team
providing technical knowledge and leadership
ensuring training, awareness and experience levels
maintained
performing line management
reporting to senior management on technical issues as
required
Technical analyst/architect
Determine evolving needs of users, sponsors, stakeholders
Establish system requirements
defining and maintaining knowledge about systems dependencies
performing cost benefit analyses
developing operational models that will optimize resource utilization and
maximize performance
configuring the infrastructure to deliver consistent and reliable performance
defining all the tasks required to manage the infrastructure
A service desk is a functional unit made up of a
dedicated number of staff responsible for
dealing with a variety of service activities,
usually made via Telephone calls, web interface
or automatically reported infrastructure events.
Objectives:
Improved customer service, perception and satisfaction
Increased accessibility through a single point of contact
Better quality and faster turnaround
Improved teamwork and communication
Enhanced focus and proactive approach to service
provision
Reduced negative business impact
Better managed infrastructure and control
Improved usage of IT support resources
More meaningful management information
Role:
Logging all relevant incidents
Providing first line investigation and diagnosis
Resolving incidents at first contact
Escalating incidents that cannot be resolved within
agreed timescale
Keeping users informed of progress
Closing all resolved incidents
Conducting customer surveys
Communicating with users
Organizational Structure/Types of service desks.
Higher volume of calls
Higher skill levels
Different time zones
specialised groups of users
VIP status of users
Use technology and tools
to give impression of single
service desk.
24-hour coverage
Low costs
Technical
Management
ITIL SERVICE OPERATION
Benedito, Christian, Kara, Abrahams, Peters, Smith, Nombewu
Purpose
Undertake activities and
processes to manage and
deliver services at the levels
agreed with business users
and customers.
The ongoing management
of the technology that is
used to deliver and support
services
Objectives
Deliver the service as
agreed on in the SLA
Reduce both the number
and impact of outages
Controlling access to IT
services
Scope
Service Operation covers all
areas of service
delivery, including:
services (internal, external
and customer/user)
service management
processes (see next slide)
technology
Value
Effective Service Operation
processes and
functions help organizations
to:
reduce the impact and
frequency of outages
provide access to
standard service
PROBLEM MANAGEMENT
Problem: The unknown cause of one or more incidents
Purpose:
Objectives
Scope:
EVENT
MANAGEMENT
Any change of state that has
significance for the management
of a configuration item (CI) or IT
service
Events are recognized by
notifications through IT service,
CI or monitoring tool
Manage events through their
lifecycle
Event management is the basis
for operational monitoring and
control
Configuration Items (CI)
Some are included because they
need to stay in a constant state
Some are included because their
status needs to change frequently
Environmental conditions
Software license monitoring
Security
Normal Activity
Detect changes of state for the
management of a CI and IT service
Determine control action for events and
ensure these are communicated to the
appropriate functions
Provide the trigger to execute many
service operation processes and operation
management activities
Comparing actual operating performance
and behavior against design standards and
SLAs
Provide a basis for service assurance,
reporting and improvement
APPLICATION
MANAGEMENT
Role:
Custodian of technical knowledge and expertise
Provides the actual resources to support the service
lifecycle. Providing guidance to IT operations on how to
carry out the ongoing operational management of
applications. The integration of the application
management lifecycle.
Objectives:
To support the
organization's business processes.
These objectives are achieved through:
Applications that are well designed, resilient and cost-
effective
The required functionality is available to achieve the
required business outcome
Organization of adequate technical skills
App Development vs Management
A single interface to the business for all stages of the business lifecycle, with a common
requirements and specification-setting process
Development teams should be held partially accountable for design flaws that create
operational outages
Management staff should be held partially accountable for contribution to the technical
architecture and manageability design of applications
A single change management process for both groups
A clear mapping of development and management activities throughout the lifecycle
Focus on integrating functionality and manageability requirements
INCIDENT
MANAGEMENT
Incident:
An unplanned interruption
to an IT service or reduction
in the quality of an IT
service
Scope:
Incident Management includes
any event which disrupts, or
could disrupt a service. This
includes events which are
communicated directly by users,
either through the service desk or
through an interface from event
management to incident
management tools.
Priority: To agree and allocate
an appropriate prioritization code
to an incident; this will determine
how the incident is handled both
by support tools and support staff
Urgency: Refers to how quickly
the business needs a resolution to
an incident
Impact: Indication of impact is
often the number of users being
affected
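As a minimal sketch of how impact and urgency might be combined into a prioritization code, the example below uses a simple additive matrix. The three rating levels, the user-count thresholds and the 1 to 5 code range are assumptions made for illustration; each organization would agree its own scheme.

```python
from enum import IntEnum


class Level(IntEnum):
    """Impact or urgency rating; a lower number means more severe."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


def impact_from_users(users_affected: int) -> Level:
    """Rate impact by number of users affected (thresholds are assumed)."""
    if users_affected >= 100:
        return Level.HIGH
    if users_affected >= 10:
        return Level.MEDIUM
    return Level.LOW


def priority_code(impact: Level, urgency: Level) -> int:
    """Combine impact and urgency into a code from 1 (highest) to 5 (lowest)."""
    return int(impact) + int(urgency) - 1


if __name__ == "__main__":
    # 250 users affected (high impact) with medium urgency
    print(priority_code(impact_from_users(250), Level.MEDIUM))  # prints 2
```

With this hypothetical matrix, a high-impact, high-urgency incident receives code 1 and a low-impact, low-urgency incident receives code 5.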
Purpose:
Purpose of Incident
Management is to restore
normal service operation as
quickly as possible and
minimize the adverse impact
on business operations,
thus ensuring agreed levels
of service quality are
maintained.
Objectives:
Ensure that standardized methods and
procedures are used
Increase visibility and communication
of incidents to business and IT support
staff
Enhance business perception of IT
through use of professional approach in
resolving and communicating incidents
Align incident management activities
and priorities with those of the
business
Maintain user satisfaction with quality
of IT services
Interfaces:
Service Design
Service level management: input for SLA
Information security management: security-related incidents
Capacity management: trigger for performance
monitoring
Availability management: availability of IT
services
Service Transition
Service Asset and Configuration Management: identify
faulty equipment
Change Management: workarounds need an RFC
Service Operation
Problem Management: investigate and resolve
underlying cause
Access Management: unauthorized access
attempts
Activities of incident
management
Methods of Incident Management
Incident Identification: Work can only
begin when it is known that an incident has
occurred
Incident Logging: All relevant information
about the incident must be logged and date/time
stamped
Incident Categorization: Must be
allocated an incident categorization
code so the exact type of incident is recorded
Incident Prioritization: Allocate an
appropriate prioritization code to
determine how the incident is handled
Incident Closure: Service desk to check that the
incident is resolved and that users are
satisfied
Techniques of Incident Management
Functional Escalation
Management Escalation
Hierarchic Escalation
Incident Models:
An incident model is a way of
predefining the steps that should
be taken to handle a process in
an agreed way.
Steps that should be taken to
handle incident
Chronological order these
steps should be taken in
Responsibilities
Precautions to be taken
Incident Tracking:
Incidents should be tracked throughout their lifecycle to support proper handling and
reporting on the status of incidents.
Open: Incident recognized but not yet assigned to a support resource
In Progress: Incident is in the process of being investigated
Resolved: A resolution has been put in place for the incident but normal service
operation has not yet been validated
Closed: User or business has agreed that the incident has been resolved
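The four statuses above can be sketched as a small state machine that records each status change for later reporting. The class and method names are hypothetical, and the transition from Resolved back to In Progress (for a resolution that fails validation) is an assumption not stated in the source.

```python
from enum import Enum


class Status(Enum):
    OPEN = "Open"
    IN_PROGRESS = "In Progress"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# Allowed next statuses for each status (assumed, not prescribed by the source).
ALLOWED = {
    Status.OPEN: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.RESOLVED},
    Status.RESOLVED: {Status.CLOSED, Status.IN_PROGRESS},
    Status.CLOSED: set(),
}


class Incident:
    """Tracks one incident through its lifecycle."""

    def __init__(self, summary: str) -> None:
        self.summary = summary
        self.status = Status.OPEN
        self.history = [Status.OPEN]  # every status held, in order

    def move_to(self, new_status: Status) -> None:
        """Change status, rejecting transitions that skip a lifecycle stage."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot move from {self.status.value} to {new_status.value}"
            )
        self.status = new_status
        self.history.append(new_status)


if __name__ == "__main__":
    incident = Incident("Printer offline")
    incident.move_to(Status.IN_PROGRESS)
    incident.move_to(Status.RESOLVED)
    incident.move_to(Status.CLOSED)
    print([s.value for s in incident.history])
```

Keeping the history list alongside the current status is one way to support reporting on how long incidents spend in each stage.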
Major incidents follow a separate procedure, with shorter timescales and greater urgency.
The definition of what constitutes a major incident must be agreed and ideally mapped onto
the overall incident prioritization scheme.
PROBLEM
MANAGEMENT
PROCESS FLOW
STEP 1 Detect Problem: reactive or proactive detection (triggers in Notes)
STEP 2 Log Problem: raise record with details of problem
STEP 3 Categorise Problem: record service/component affected
STEP 4 Prioritize Problem: identify importance of incident based on impact and urgency
STEP 5 Problem Investigation and Diagnosis: diagnose root cause
STEP 6 Workarounds: temporary way of overcoming difficulty
STEP 7 Raise known error record: problem with a documented cause and workaround stored in KEDB
STEP 8 Resolution and Recovery: cause removed and service restored
STEP 9 Problem Closure: check that all events are recorded
STEP 10 Major Problem Review: reflect on major problems as part of training for support staff or proactive problem management
Problem detection
Suspicion or detection of a
cause of one or more incidents
by the service desk.
Analysis of incident
Notification of supplier or
controller
Problem prioritization.
Can the system be recovered?
How much will it cost?
How long will it take to fix the
problem?
Raising a known error record.
Known error is defined as a problem
with a documented root cause and
workaround.
Known error record should identify
the problem record it relates to and
document the status of actions being
taken to resolve the problem.
Workarounds
In some cases it may be possible to find a
workaround to the incidents.
When a workaround is found, it is important
that the problem record remains open.
In some cases there may be multiple
workarounds.
Problem Logging
User details
Service details
Equipment details
Date/time initially logged
Incident description
Problem
Categorization
Problems should be
categorized same way as
incidents.
True nature of the
problem must be easily
traced
ACCESS MANAGEMENT
Purpose
The purpose of access management is
to provide the right for users to be able to
use a service or group of services.
Objectives:
Manage access to services based on policies
and actions.
Efficiently respond to requests for granting
access to services.
Oversee access to services and ensure rights
being provided are not improperly used.
Scope:
Access management is effectively the
execution of the policies in information
security.
AM gives rights to use a service but does
not ensure it is available at agreed times;
that is provided by availability management.
AM is a process that is executed by all
technical and application management
functions and is usually not a separate function.
Purpose is to allow storage of previous knowledge
of incidents and problems.
Known error record should hold exact details.
Essential that any data put into the database can be
quickly and accurately recovered.
Care should be taken to avoid duplication of
records.
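A minimal sketch of a Known Error Database that reflects the points above: records are keyed by the problem record they relate to so duplicates are rejected, and lookups are exact or by keyword for quick retrieval. The class names, fields and in-memory storage are assumptions for illustration, not a description of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownError:
    problem_id: str      # the problem record this known error relates to
    root_cause: str      # documented root cause
    workaround: str      # documented workaround
    status: str = "open" # status of actions taken to resolve the problem


class KnownErrorDatabase:
    """In-memory store of known error records (hypothetical design)."""

    def __init__(self) -> None:
        self._records: dict[str, KnownError] = {}

    def add(self, record: KnownError) -> None:
        """Store a record, refusing a second record for the same problem."""
        if record.problem_id in self._records:
            raise ValueError(f"duplicate known error for {record.problem_id}")
        self._records[record.problem_id] = record

    def get(self, problem_id: str) -> KnownError:
        """Retrieve the exact record for a problem; raises KeyError if absent."""
        return self._records[problem_id]

    def search(self, keyword: str) -> list[KnownError]:
        """Return records whose cause or workaround mentions the keyword."""
        needle = keyword.lower()
        return [
            r for r in self._records.values()
            if needle in r.root_cause.lower() or needle in r.workaround.lower()
        ]


if __name__ == "__main__":
    kedb = KnownErrorDatabase()
    kedb.add(KnownError("PRB-1", "Print spooler memory leak", "Restart spooler"))
    print(kedb.search("spooler")[0].workaround)  # prints Restart spooler
```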
Event Types
Super user
Recruited from business to take on some IT responsibilities
Facilitate communication between IT and business
Reinforce user expectations about agreed service levels
Training for users in their area
Support for minor incidents
Involved with new releases and roll outs
Management Positions
Shift Leader
IT Operations Manager
IT Operations Analyst
IT Operator
a notification that a threshold
has been reached, something
has changed, or a failure has
occurred
a means of acquiring human
intervention
often created and managed by
system management tools
ALERT
take place as a result of an incident
report
help prevent the incident from
recurring or provide a
workaround if avoidance is impossible
analyzes incident records to identify
underlying causes of
incidents
analysis of previous incidents reveals
a trend or pattern that
was not apparent when each incident
occurred
Technical Management is treated in ITIL as a "function".
It plays an important role in the management of the IT
infrastructure.
Many Technical Management activities are embedded in various ITIL processes -
but not all Technical Management activities. For this reason, at IT Process Maps
we decided to introduce a Technical Management process as part of the ITIL
Process Map which contains the Technical Management activities not covered in
any other ITIL process.
Technical Management activities embedded in other
processes are shown there, with responsibility assigned to
the Technical Analyst role.
Follow the sun Virtual Service Desk Local Service desk Centralised Service Desk
IT OPERATIONS
MANAGEMENT
Regular scrutiny and
improvements to achieve
improved service at
reduced costs
Maintenance of the
status of day to day
processes and activities
Swift application of
operational skills to
diagnose and resolve
any IT failures that
occur.
Information Technology Operations Control consists of:
Maintenance
Performance
Console Management
Backup & Restore
Print & Output
Job Scheduling
Maintain user and customer satisfaction
Source and deliver the components of requested
standard services
Assist with general information, complaints or
comments
Objectives
The process needed to fulfil a request will vary depending upon exactly
what is being requested.
Note that ultimately it will be up to each organization to
decide and document which service requests it will
handle through the request fulfilment process
Scope
To manage the lifecycle of all
problems from first identification.
Seeks to minimize the adverse
impact of incidents
Prevent problems and resulting
incidents from happening
Eliminate recurring incidents
Minimize the impact of incidents
that cannot be prevented
Includes the activities required to
diagnose the root cause of incidents.
Will also maintain information about
problems and the appropriate
workarounds and resolutions
Role of Communication in Service Operation
Role:
All communication must have an intended purpose or a resultant action.
Any means of communication can be used as long as stakeholders understand when
and where communication will take place.
Types of communication:
Routine operational communication
Between shifts, Projects
Performance reporting
Communication related to change, exception & emergency
Training on new or customized processes and service designs
Communication of strategy, design, and transition to service operation teams
Process activities, methods
and techniques.
Informational: signifies that something expected and
normal
has happened, and does not require any action
E.g. scheduled backup has completed normally
Warning: A notification that a pre-defined threshold
has been reached. Action may or may not be required
E.g. 5% hard disk capacity available
Exception: A notification that a service or component is
operating abnormally. Action is usually required
E.g. a router failing
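The three event types can be illustrated with a small classifier for a disk-monitoring notification. This is only a sketch: the function name, the health flag and the default 5% threshold are assumptions chosen to mirror the examples above.

```python
from enum import Enum


class EventType(Enum):
    INFORMATIONAL = "informational"  # expected and normal, no action needed
    WARNING = "warning"              # pre-defined threshold reached
    EXCEPTION = "exception"          # operating abnormally, action usually needed


def classify_disk_event(
    free_percent: float,
    disk_healthy: bool = True,
    warning_threshold: float = 5.0,
) -> EventType:
    """Classify a disk notification into one of the three event types."""
    if not disk_healthy:
        # The component is operating abnormally.
        return EventType.EXCEPTION
    if free_percent <= warning_threshold:
        # Free capacity has fallen to the pre-defined threshold.
        return EventType.WARNING
    return EventType.INFORMATIONAL


if __name__ == "__main__":
    print(classify_disk_event(5.0).value)  # prints warning
```

Checking for abnormal operation before the threshold test means a failing component is reported as an exception even when its capacity is also low.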