The primary goal of the Incident Management process is to restore normal service operation as quickly as possible and minimise the adverse impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained. 'Normal service operation' is defined here as service operation within Service Level Agreement (SLA) limits.
INCIDENT MANAGEMENT PROCESS
Inputs: Incident details sourced from (for example) the Service Desk, networks or computer operations; configuration details from the Configuration Management Database (CMDB); response from Incident matching against Problems and Known Errors; resolution details; response on RFC to effect resolution for Incident(s).
Outputs: RFC for Incident resolution; updated Incident record (including resolution and/or Work-arounds); resolved and closed Incidents; communication to Customers; management information (reports).
Incident Management activities: Incident detection and recording; classification and initial support; investigation and diagnosis; resolution and recovery; Incident closure; Incident ownership, monitoring, tracking and communication.
The roles and functions related to the Incident Management process are: first-, second- and third-line support groups, including specialist support groups and external suppliers (roles); Incident Manager (role); Service Desk manager (function).
Incident Handling
Most IT departments and specialist groups contribute to handling Incidents at some time. The process is mostly reactive. To react efficiently and effectively therefore demands a formal method of working that can be supported by software tools. The Service Desk is responsible for monitoring the resolution process of all registered Incidents; in effect, the Service Desk is the owner of all Incidents. Incidents that cannot be resolved immediately by the Service Desk may be assigned to specialist groups. A resolution or Work-around should be established as quickly as possible in order to restore the service to Users with minimum disruption to their work. After resolution of the cause of the Incident and restoration of the agreed service, the Incident is closed.
[Figure: Incident Handling (Incident life cycle)]
The Service Desk plays an important role in the Incident Management process, as follows: all Incidents are reported to and registered by the Service Desk (where Incidents are generated automatically, the process should still include registration by the Service Desk); the majority of Incidents (perhaps up to 85% in a highly skilled environment) will be resolved at the Service Desk; the Service Desk is the 'independent' function monitoring Incident resolution progress of all registered Incidents.
On receipt of an Incident notification, the main actions to be carried out by the Service Desk are: record basic details, including timing data and details of symptoms obtained; if a service request has been made, handle the request in conformance with the organisation's standard procedures; select from the CMDB the Configuration Item (CI) reported as the cause of the Incident, to complete the Incident record; assign the appropriate priority and give the User the unique system-generated Incident number (to be quoted at the beginning of all further communication); assess the Incident and, if possible, give resolution advice (this will frequently be possible for routine Incidents or when a match to a known Problem/error is achieved); following successful resolution, close the Incident record, adding details of the resolution action and the appropriate category code; following unsuccessful resolution or recognition that a further level of support is needed, assign the Incident to second-line support (i.e. a specialist group).
Priority
The priority of an Incident is primarily determined by the impact on the business and the urgency with which a resolution or Work-around is needed. Targets for resolving Incidents or handling requests are generally embodied in an SLA.
Relationship between Incidents, Problems, Known Errors and RFCs
Incidents, the result of failures or errors within the IT infrastructure, result in actual or potential variations from the planned operation of the IT services. The cause of Incidents may be apparent and that cause can be addressed without the need for further investigation, resulting in a repair, a Work-around or an RFC to remove the error. In some cases the Incident itself, i.e. the effect or potential effect upon the Customer, can be dealt with quickly, perhaps by rebooting a PC or resetting a communications line, without directly addressing the underlying cause of the Incident.
Where the underlying cause of the Incident is not identifiable, then it may be appropriate to raise a Problem record. This may be appropriate even where the actual result of the Incident has been addressed. A Problem is thus, in effect, indicative of an unknown error within the infrastructure. Normally a Problem record is raised only if investigation is warranted. Whether it is warranted will often be assessed via the impact (both actual and potential) upon the business services, and the number of similar Incidents apparently sharing a common underlying cause that have been reported.
Successful processing of a Problem record will result in the identification of the underlying error, and the record can then be converted into a Known Error once a Work-around has been developed, and/or an RFC. This logical flow, from an initial report to the resolution of an underlying Problem, is shown in the accompanying figure. We thus have the following definitions:
Problem: The unknown underlying cause of one or more Incidents.
Known Error: A Problem that is successfully diagnosed and for which a Work-around is known.
RFC: A Request For Change to any component of an IT Infrastructure or to any aspect of IT services.
A Problem can result in multiple Incidents, and it is possible that the Problem will not be diagnosed until several Incidents have occurred, over a period of time. It can be seen therefore that a Problem record is independent of associated Incident records, and both the Problem record and the investigation into its cause can persist even after the initial Incident has been successfully closed. Handling Problems is quite different from handling Incidents and is therefore covered by the Problem Management process.
During the Incident-resolution process the Incident is matched against the Problem and Known Error database. It should also be matched against the Incident database to see whether there is a similar Incident outstanding, or whether there has been resolution action taken for any previous similar Incident. If a Work-around or resolution is available, the Incident can be resolved immediately.
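The matching step described above can be illustrated with a small sketch. It is only one possible approach, assuming a simple keyword overlap within the same category; the class names, fields and scoring rule are hypothetical and are not prescribed by the process description.

```python
from dataclasses import dataclass


@dataclass
class KnownError:
    """Hypothetical Known Error entry: a diagnosed Problem with a Work-around."""
    error_id: str
    category: str
    symptom_keywords: frozenset
    workaround: str


@dataclass
class Incident:
    """Hypothetical, minimal Incident record used only for this sketch."""
    incident_id: str
    category: str
    symptoms: str
    status: str = "active"


def match_known_error(incident, known_errors):
    """Return the Known Error in the same category that shares the most
    symptom keywords with the Incident, or None if nothing overlaps."""
    words = set(incident.symptoms.lower().split())
    best, best_score = None, 0
    for known_error in known_errors:
        if known_error.category != incident.category:
            continue
        score = len(words & known_error.symptom_keywords)
        if score > best_score:
            best, best_score = known_error, score
    return best


def similar_open_incidents(incident, incidents):
    """Return other Incidents in the same category that are not yet closed."""
    return [
        other for other in incidents
        if other.incident_id != incident.incident_id
        and other.category == incident.category
        and other.status != "closed"
    ]


# Illustrative data (invented for the example).
known_errors = [
    KnownError("KE-1", "network", frozenset({"vpn", "timeout"}),
               "Restart the VPN client"),
]
open_incidents = [
    Incident("INC-1", "network", "vpn drops every hour"),
    Incident("INC-0", "network", "old vpn issue", status="closed"),
]
new_incident = Incident("INC-2", "network", "VPN connection timeout after login")

match = match_known_error(new_incident, known_errors)
if match is not None:
    print(f"Apply Work-around from {match.error_id}: {match.workaround}")
else:
    print("No match: continue with investigation and diagnosis")
```

A real Service Desk tool would normally match on classification codes and Configuration Items rather than free-text keywords; the keyword overlap here only keeps the example short.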
If no Work-around or resolution is available, Incident Management is responsible for finding a resolution or Work-around with minimum disruption to the business process. When Incident Management finds a Work-around it will be analysed by the Problem Management team, who will update the associated Problem record (see Figure 5.5). Note that an associated Problem record may not exist at this time. For example, the Work-around may be to send a report by fax due to a communication line failure, but at this point there may not be a Problem record for the communication line failure, which the Problem Management team would have to create. The process is then that the Service Desk will link Incidents that are clearly the result of an existing Problem record.
It is also possible that the Problem Management team, while investigating the Problem associated with the Incident, finds a Work-around or a resolution for a Problem and/or some related Incidents. In this case, the Problem Management team should inform the Incident Management process in order that open Incidents have their status changed to 'Known Error' or 'closed' as appropriate.
Where it is felt at Incident logging that an Incident should be treated as a Problem, then it should be referred immediately to the Problem Management process, where, if appropriate, a new Problem record will be raised. Incident Management will, as always, remain responsible for pursuing a resolution to the Incident with minimal possible disruption to the business processes.
Benefits of Incident Management
The major benefits to be gained by implementing an Incident Management process are as follows:
For the business as a whole: reduced business impact of Incidents by timely resolution, thereby increasing effectiveness; the proactive identification of beneficial system enhancements and amendments; the availability of business-focused management information related to the SLA.
For the IT organisation in particular: improved monitoring, allowing performance against SLAs to be accurately measured; improved management information on aspects of service quality; better staff utilisation, leading to greater efficiency; elimination of lost or incorrect Incidents and service requests; more accurate CMDB information (giving an ongoing audit while registering Incidents); improved User and Customer satisfaction.
In contrast, failing to implement Incident Management may result in: no one to manage and escalate Incidents, hence Incidents may become more severe than necessary and adversely affect IT service quality; specialist support staff being subject to constant interruptions, making them less effective; business staff being disrupted as people ask their colleagues for advice; frequent reassessment of Incidents from first principles rather than reference to existing solutions; lack of coordinated management information; lost, or incorrectly or badly managed Incidents.
Planning and implementation: Critical success factors
Successful Incident Management requires a sound basis, as highlighted by the following points:
An up-to-date CMDB is a prerequisite for an efficiently working Incident Management process. If a CMDB is not available, information about Configuration Items (CIs) related to Incidents has to be obtained manually, and determining impact and urgency will be much more difficult and time-consuming.
A 'knowledge base' in the form of an up-to-date Problem/error database should be developed to provide for resolutions and Work-arounds. This will greatly speed up the process of resolving Incidents. Third-party Known Error databases should also be available to assist in this process.
An effectively automated system for Incident Management is fundamental to the success of a Service Desk. Paper-based systems are not really practical or necessary, now that good and cheap support tools are available.
Forge a close link with the Service-Level Management (SLM) process to obtain necessary Incident response targets. Timely Incident resolution will satisfy Customers and Users.
Possible problem areas
Be prepared to overcome: no visible management or staff commitment, resulting in non-availability of resources for implementation; lack of clarity about business needs; working practices not being reviewed or changed; poorly defined service objectives, goals and responsibilities; no provision of agreed Customer service levels; lack of knowledge for resolving Incidents; inadequate training for staff; lack of integration with other processes; lack of, or expense of, tools to automate the process; resistance to change.
Incident Management activities
This section discusses in more detail the six activities encapsulated… Incident detection and recording; classification and initial support; investigation and diagnosis; resolution and recovery; Incident closure; ownership, monitoring, tracking and communication.
Incident detection and recording
Incident details from the Service Desk or event management systems are the inputs for Incident Management. Resultant actions are to: record basic details of the Incident; alert specialist support group(s) as necessary; start procedures for handling the service request. Outputs will be: updated details of Incidents; the recognition of any errors on the CMDB; notice to Customers when an Incident has been resolved.
Classification and initial support
Incident records raised in the previous activity are now analysed to discover the reason for the Incident. The Incident should also be classified, the process on which further resolution actions are based. Annex 5A provides some examples of classification codes.
Inputs: recorded Incident details; configuration details from the CMDB; response from Incident matching against Problems and Known Errors.
Actions: classifying Incidents; matching against Known Errors and Problems; informing Problem Management of the existence of new Problems and of unmatched or multiple Incidents; assigning impact and urgency, and thereby defining priority; assessing related configuration details; providing initial support (assess Incident details, find quick resolution); closing the Incident or routing to a specialist support group, and informing the User(s).
Outputs: RFC for Incident resolution; updated Incident details; Work-arounds for Incidents; or Incident routed to second- or third-line support.
Classification is the process of identifying the reason for the Incident and hence the corresponding resolution action. Many Incidents are regularly experienced and the appropriate resolution actions are well known. This is not always the case, however, and a procedure for matching Incident classification data against that for Problems and Known Errors is necessary. Successful matching gives access to proven resolution actions, which should require no further investigation effort.
Investigation and diagnosis
Inputs: updated Incident details; configuration details from the CMDB.
Actions: assessment of the Incident details; collection and analysis of all related information, and resolution (including any Work-around) or a route to n-line support.
Outputs: Incident details yet further updated, and a specification of the selection or required Work-around.
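The classification activity described above assigns impact and urgency and thereby defines priority. The sketch below shows one way such a lookup could work. The three-level scales and the matrix values are assumptions made for illustration; in practice they would be agreed with the business, typically in the SLA.

```python
# Assumed three-level scales; real organisations may use different ones.
LEVELS = ("high", "medium", "low")

# Assumed priority matrix: (impact, urgency) -> priority, 1 being the
# most pressing. The values are illustrative, not taken from the text.
PRIORITY_MATRIX = {
    ("high", "high"): 1,
    ("high", "medium"): 2,
    ("high", "low"): 3,
    ("medium", "high"): 2,
    ("medium", "medium"): 3,
    ("medium", "low"): 4,
    ("low", "high"): 3,
    ("low", "medium"): 4,
    ("low", "low"): 5,
}


def assign_priority(impact, urgency):
    """Look up the priority for an impact/urgency pair (case-insensitive)."""
    key = (impact.lower(), urgency.lower())
    if key not in PRIORITY_MATRIX:
        raise ValueError(
            f"impact and urgency must each be one of {LEVELS}, got {key}"
        )
    return PRIORITY_MATRIX[key]


# Example: a widespread outage that still has a usable Work-around.
print(assign_priority("High", "low"))
```

Keeping the matrix as data rather than as nested conditionals makes it easy to adjust when the agreed service levels change.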
Resolution and recovery
Inputs: updated Incident details; any response on an RFC to effect resolution for the Incident(s); any derived Work-around or solution.
Actions: resolve the Incident using the solution/Work-around or, alternatively, raise an RFC (including a check for resolution); take recovery actions.
Outputs: RFC for future Incident resolution; resolved Incident, including recovery details; updated Incident details.
Incident closure
Inputs: updated Incident details; resolved Incident.
Actions: the confirmation of the resolution with the Customer or originator; 'close' category Incident.
Outputs: updated Incident detail; closed Incident record.
When the Incident has been resolved, the Service Desk should ensure that: details of the action taken to resolve the Incident are concise and readable; classification is complete and accurate according to root cause; resolution/action is agreed with the Customer, verbally or, preferably, by email or in writing; all details applicable to this phase of the Incident control are recorded, such that:
• the Customer is satisfied
• cost-centre project codes are allocated
• the time spent on the Incident is recorded
• the person, date and time of closure are recorded.
Ownership, monitoring, tracking and communication
Inputs: Incident records.
Actions: monitor Incidents; escalate Incidents; inform User.
Outputs: management reports about Incident progress; escalated Incident details; Customer reports and communication.
Roles of the Incident Management process
Processes span the organisation's hierarchy. Therefore it is important to define the responsibilities associated with the activities that have to be performed in the process. To remain flexible, it is advisable to use the concept of roles. A role embraces a set of responsibilities, tasks and levels of authorisation. In many organisations roles may be combined because of the small size of the organisation or because of cost. For example, many organisations combine the roles of Change Management and Configuration Management.
5.8.1 Incident Manager
An Incident Manager has the responsibility for: driving the efficiency and effectiveness of the Incident Management process; producing management information; managing the work of Incident support staff (first- and second-line); monitoring the effectiveness of Incident Management and making recommendations for improvement; developing and maintaining the Incident Management systems.
In many organisations, the role of Incident Manager is assigned to the Service Desk Supervisor (a function).
5.8.2 Incident-handling support staff
First-line support (Service Desk) responsibilities include: Incident registration; routing service requests to support groups when Incidents are not closed; initial support and classification; ownership, monitoring, tracking and communication; resolution and recovery of Incidents not assigned to second-line support; closure of Incidents.
Ownership, monitoring, tracking and communication tasks cover: monitoring the status and progress towards resolution of all open Incidents; keeping affected Users informed about progress; escalating the process if necessary.
Second-line support (specialist groups that may be part of the Service Desk) will be involved in tasks such as: handling service requests; monitoring Incident details, including the Configuration Items affected; Incident investigation and diagnosis (including resolution where possible); detection of possible Problems and the assignment of them to the Problem Management team for them to raise Problem records; the resolution and recovery of assigned Incidents.
Key Performance Indicators
To judge process performance, clearly defined objectives with measurable targets, often referred to as Key Performance Indicators (KPIs), should be set. The following metrics are examples for the effectiveness and efficiency of the Incident Management process: total numbers of Incidents; mean elapsed time to achieve Incident resolution or circumvention, broken down by impact code; percentage of Incidents handled within agreed response time (Incident response-time targets may be specified in SLAs, for example, by impact code); average cost per Incident; percentage of Incidents closed by the Service Desk without reference to other levels of support; Incidents processed per Service Desk workstation; number and percentage of Incidents resolved remotely, without the need for a visit.
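Several of the metrics listed above are simple aggregates over closed Incident records. The following sketch computes three of them from sample data. The record fields and the sample values are hypothetical and invented for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class ClosedIncident:
    """Hypothetical subset of Incident record fields needed for the KPIs."""
    impact_code: str
    recorded_at: datetime
    resolved_at: datetime
    closed_by_service_desk: bool  # closed without other levels of support
    resolved_remotely: bool       # resolved without the need for a visit


def mean_resolution_hours_by_impact(incidents):
    """Mean elapsed time to resolution, in hours, broken down by impact code."""
    grouped = defaultdict(list)
    for incident in incidents:
        elapsed = incident.resolved_at - incident.recorded_at
        grouped[incident.impact_code].append(elapsed.total_seconds() / 3600)
    return {code: mean(hours) for code, hours in grouped.items()}


def percentage(incidents, predicate):
    """Percentage of Incidents for which predicate(incident) is true."""
    if not incidents:
        return 0.0
    hits = sum(1 for incident in incidents if predicate(incident))
    return 100.0 * hits / len(incidents)


# Sample data, invented for the example.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    ClosedIncident("high", start, start + timedelta(hours=2), True, True),
    ClosedIncident("high", start, start + timedelta(hours=4), False, True),
    ClosedIncident("low", start, start + timedelta(hours=8), True, False),
    ClosedIncident("low", start, start + timedelta(hours=16), True, True),
]

print("Total Incidents:", len(sample))
print("Mean hours by impact:", mean_resolution_hours_by_impact(sample))
print("Closed by Service Desk (%):",
      percentage(sample, lambda i: i.closed_by_service_desk))
print("Resolved remotely (%):",
      percentage(sample, lambda i: i.resolved_remotely))
```

Metrics such as average cost per Incident need cost data that the sketch does not model, so they are left out here.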
Reports should be produced under the authority of the Incident Manager, who should draw up a schedule and distribution list, in collaboration with the Service Desk and support groups handling Incidents. Distribution lists should at least include IT services management and specialist support groups. Consider also making the data available to Users and Customers, for example via SLA reports.
INCIDENT MANAGEMENT
Data requirements for service Incident records
The following data should be recorded during the Incident life cycle: unique reference number; Incident classification; date/time recorded; name/id of the person and/or group recording the Incident; name/department/phone/location of User calling; call-back method (telephone, mail etc.); description of symptoms; category (often a main category and a subcategory); impact/urgency/priority; Incident status (active, waiting, closed etc.); related Configuration Item; support group/person to which the Incident is allocated; related Problem/Known Error; resolution date and time; closure category; closure date and time.
To maintain control during the complete Incident life cycle, the following is recorded for every action: name/id of the support group or person recording the action; type of action (routing, diagnosis, recovery, resolving, closing etc.); date/time of action; description and outcome of action.
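The data requirements above map naturally onto a record type with an attached action log. The sketch below is a minimal illustration; the field names, types and the close helper are assumptions chosen for the example rather than a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ActionEntry:
    """One logged action in the Incident life cycle."""
    recorded_by: str      # name/id of the support group or person
    action_type: str      # routing, diagnosis, recovery, resolving, closing etc.
    timestamp: datetime   # date/time of action
    description: str      # description and outcome of action


@dataclass
class IncidentRecord:
    """Hypothetical Incident record covering the data items listed above."""
    reference: str
    classification: str
    recorded_at: datetime
    recorded_by: str
    user_contact: str     # name/department/phone/location of User calling
    callback_method: str  # telephone, mail etc.
    symptoms: str
    category: str
    impact: str
    urgency: str
    priority: int
    status: str = "active"  # active, waiting, closed etc.
    related_ci: Optional[str] = None
    assigned_to: Optional[str] = None
    related_problem: Optional[str] = None
    resolved_at: Optional[datetime] = None
    closure_category: Optional[str] = None
    closed_at: Optional[datetime] = None
    actions: List[ActionEntry] = field(default_factory=list)

    def log_action(self, recorded_by, action_type, description, when=None):
        """Append an action entry, defaulting the timestamp to now."""
        entry = ActionEntry(recorded_by, action_type,
                            when or datetime.now(), description)
        self.actions.append(entry)
        return entry

    def close(self, recorded_by, closure_category, when=None):
        """Mark the Incident closed and log the closing action."""
        when = when or datetime.now()
        self.status = "closed"
        self.closure_category = closure_category
        self.closed_at = when
        self.log_action(recorded_by, "closing",
                        f"Closed as {closure_category}", when)


# Illustrative usage with invented values.
record = IncidentRecord(
    reference="INC-1001", classification="hardware",
    recorded_at=datetime(2024, 1, 1, 9, 0), recorded_by="Service Desk",
    user_contact="A. User / Finance / ext. 1234 / Building 2",
    callback_method="telephone", symptoms="PC does not boot",
    category="desktop/hardware", impact="low", urgency="medium", priority=4,
)
record.log_action("Service Desk", "routing", "Assigned to second-line support",
                  datetime(2024, 1, 1, 9, 15))
record.close("Service Desk", "resolved", datetime(2024, 1, 1, 11, 0))
print(record.status, len(record.actions))
```

Logging the closing step through the same action log keeps a single audit trail for the whole life cycle.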