Drafted Incident Management SOP2

2023
INCIDENT MANAGEMENT STANDARD OPERATING PROCEDURE
Purpose:
This document provides step-by-step processes and procedures for incident management and
Service Level Agreements (SLAs).
Related Topics:
- Incident response plan and Status feedback
- Incident Intake and Escalation
- Incident Workflow
- Service Level Agreement
- Reporting
1. Incident response plan and Status feedback
[Figure: incident response cycle with the stages Preparation, Analysis, Recovery and Post Incident, plus Feedback]
Preparation
To prepare for incidents, compile a list of necessary information about the incident. Set up
monitoring so you have a baseline of normal activity. Determine which types of security events
should be investigated and create detailed response steps for common types of incidents.
Analysis
Analysis involves identifying a baseline of normal activity for the affected systems, correlating
related events, and determining whether and how they deviate from normal behavior.
Recovery
Recovery restores services to normal. Your recovery strategy will depend on the level of damage
the incident can cause, the need to keep critical services available to customers, and the
duration of the solution: a temporary solution for a few hours, days or weeks, or a permanent
solution.
Post Incident
A key part of incident response methodology is learning from previous incidents to improve the
process. You should ask, investigate and document the answers to the following questions:
- What happened, and at what times?
- How well did the incident response team deal with the incident? Were processes
  followed, and were they sufficient?
- What information was needed sooner?
- Were any wrong actions taken that caused damage or inhibited recovery?
- What could staff do differently next time if the same incident occurred?
- Could staff have shared information better with other organizations or other
  departments?
- Have we learned ways to prevent similar incidents in the future?
- Have we discovered new precursors or indicators of similar incidents to watch for in the
  future?
- What additional tools or resources are needed to help prevent or mitigate similar
  incidents?
2 Service Level Agreement
Incident Priority is set by Severity and Impact:
| Category | Description | Resolution | Expected Response Time | Expected Resolution Time | Reporting Frequency |
|---|---|---|---|---|---|
| Feature Request | Feature does not affect normal operations | Resolution may not be required | Within 1 hour of confirmation | 4 hours | 4 hours |
| Low | Individual work hindrance acceptable | Resolution may not be required | Within hours of confirmation | 2 hours | 2 hours |
| Normal | Individual work hindrance not acceptable | Immediate resolution may not be required | Within 1 hour of confirmation | 1 hour | 1 hour |
| High | Interruption to critical processes | Immediate resolution is required | Within 20 minutes of confirmation | 45 minutes | 30 minutes |
| Urgent | Interruption to critical processes affecting many users or departments | Immediate resolution is required | Within 15 minutes of confirmation | 30 minutes | 15 minutes |
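As a minimal sketch, the SLA figures in the table can be turned into deadlines for a given ticket. The dictionary keys, the `SlaTarget` type and the `sla_deadlines` function are illustrative names, not part of this SOP. The sketch also assumes that the resolution time and the first status report are measured from confirmation, as the response time is; the table does not state this. The Low category has no response figure in the table, so it is left as `None`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class SlaTarget:
    """SLA targets for one priority category, copied from the SLA table."""
    response: Optional[timedelta]  # None where the table gives no figure
    resolution: timedelta
    reporting: timedelta


# Keys are hypothetical identifiers; the durations come from the table.
SLA_TARGETS = {
    "feature_request": SlaTarget(timedelta(hours=1), timedelta(hours=4), timedelta(hours=4)),
    "low": SlaTarget(None, timedelta(hours=2), timedelta(hours=2)),
    "normal": SlaTarget(timedelta(hours=1), timedelta(hours=1), timedelta(hours=1)),
    "high": SlaTarget(timedelta(minutes=20), timedelta(minutes=45), timedelta(minutes=30)),
    "urgent": SlaTarget(timedelta(minutes=15), timedelta(minutes=30), timedelta(minutes=15)),
}


def sla_deadlines(priority: str, confirmed_at: datetime) -> dict:
    """Return response, resolution and first-report deadlines for a ticket.

    Assumption: all three clocks start at the confirmation time.
    """
    target = SLA_TARGETS[priority]
    respond_by = None
    if target.response is not None:
        respond_by = confirmed_at + target.response
    return {
        "respond_by": respond_by,
        "resolve_by": confirmed_at + target.resolution,
        "first_report_by": confirmed_at + target.reporting,
    }
```

For example, under these assumptions a High incident confirmed at 09:00 would need a response by 09:20, a first status report by 09:30 and a resolution by 09:45.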
Incident Escalation Matrix
| Incident Intake & Escalation | Priority assignment & incident risk assessment | Incident response and Status Feedback | Root Cause Analysis Report |
|---|---|---|---|
| Service Desk | Service Desk | Incident Coordinator | S.D Manager |
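The escalation matrix can be read as an ordered list of stages, each with one owner. The sketch below encodes that reading so a tool could look up who owns a stage and which stage follows it. The left-to-right ordering is an interpretation of the matrix layout, and the function names `owner_of` and `next_stage` are hypothetical.

```python
from typing import Optional

# Stages in matrix order, each paired with its owner.
# Assumption: the matrix columns are sequential stages of one workflow.
ESCALATION_MATRIX = [
    ("Incident Intake & Escalation", "Service Desk"),
    ("Priority assignment & incident risk assessment", "Service Desk"),
    ("Incident response and Status Feedback", "Incident Coordinator"),
    ("Root Cause Analysis Report", "S.D Manager"),
]


def owner_of(stage: str) -> str:
    """Return the owner of a stage; raise KeyError for an unknown stage."""
    for name, owner in ESCALATION_MATRIX:
        if name == stage:
            return owner
    raise KeyError(stage)


def next_stage(stage: str) -> Optional[str]:
    """Return the stage that follows `stage`, or None after the last one."""
    names = [name for name, _ in ESCALATION_MATRIX]
    index = names.index(stage)  # raises ValueError for an unknown stage
    if index + 1 < len(names):
        return names[index + 1]
    return None
```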
2.1 Incident Reception

Upon receiving an incident notice from the customer, the recipient must gather as much
information as possible about the incident, open a ticket as the master Incident ticket, and
hand it to the incident coordinator.
The information gathered will help the appropriate leadership and technical resources to:
a) Assess the seriousness of the incident.
b) Assess the extent of the damage.
c) Identify the vulnerability created.
d) Estimate the additional resources required to mitigate the incident.
Below is the critical information to gather during incident reception:
1. Practice name and location
   a. Phone number
2. Point of contact
   a. Name
   b. Title
3. Incident background
   a. Problem faced and when it started
   b. Device(s) affected [device name(s)]
   c. Business processes impacted
      i. Is business impacted?
      ii. Is there a financial impact?
      iii. What else cannot be completed?
All incidents must be posted immediately in the #Incident-management Slack channel, and the
Incident Process must be started with initial customer communications. The priority of an
existing or new ticket is set based on the SLA matrix.
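The intake checklist above could be captured as a structured record so that missing items are easy to spot before the master Incident ticket is handed over. This is only a sketch: the class name, the field names and the `missing_fields` helper are assumptions made for illustration, mapped one-to-one onto the checklist items.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class IncidentIntake:
    """One record per incident notice; fields mirror the intake checklist."""
    practice_name: str = ""
    practice_location: str = ""
    phone_number: str = ""
    contact_name: str = ""
    contact_title: str = ""
    problem_description: str = ""
    started_at: str = ""
    devices_affected: List[str] = field(default_factory=list)
    business_impacted: Optional[bool] = None   # None means "not yet asked"
    financial_impact: Optional[bool] = None    # None means "not yet asked"
    blocked_work: str = ""                     # what else cannot be completed

    def missing_fields(self) -> List[str]:
        """Return the names of checklist items that are still empty.

        An answer of False counts as gathered; only "", None and [] are gaps.
        """
        empty_values = ("", None, [])
        return [f.name for f in fields(self) if getattr(self, f.name) in empty_values]
```

A recipient could call `missing_fields()` before handing the ticket to the incident coordinator and ask the customer for whatever is still listed.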
3.0 Incident Response Plan and Status feedback
3.1 Incident Response Plan
3.1.1 Resource Gathering

The following resources should be set up prior to contacting the Point of contact:
1. Incident coordinator
   a. Gathers people and resources to address the issue, and gathers updates to
      send to stakeholders based on the priority schedule.
2. Incident repair manager
   a. Approves needed emergency repairs and assists in tracking change
      tickets.
3. Incident repair technician
   a. Lead agent for repairing the issue.
4. Documentation Support
   a. Knowledge team member who assists with documentation retrieval,
      notes needed updates to existing documentation, and assists with ticket
      documentation for compliance.
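A simple pre-check can confirm that all four roles are filled before the Point of contact is called. The sketch below assumes role assignments are kept as a mapping from role name to person; that storage format and the function name `unassigned_roles` are hypothetical.

```python
# Roles listed under Resource Gathering, in the order given there.
REQUIRED_ROLES = (
    "Incident coordinator",
    "Incident repair manager",
    "Incident repair technician",
    "Documentation Support",
)


def unassigned_roles(assignments: dict) -> list:
    """Return the required roles with no person assigned yet.

    A role counts as unassigned if it is absent or mapped to an empty value.
    """
    return [role for role in REQUIRED_ROLES if not assignments.get(role)]
```

An empty result means every role has an owner and the Point of contact can be contacted.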
3.1.2 Incident Handling
Step 1: ANYONE
- Announce the incident and ensure that an incident coordinator or manager is
  aware and starts the next step; otherwise, call a manager immediately.
Step 2: INCIDENT COORDINATOR
- Send the initial communication to stakeholders within 15 minutes.
- The initial notice requires a ticket number but does not need to wait for full
  details; incident team members can be identified in the next update once they
  have been gathered.
- Gather the team and schedule a Teams-bridge with a dial-in option for 3rd parties.
- Share the Teams-bridge in the Slack chat thread for this one incident.
- Updates should include changes in roles if people change shifts or come in to
  help.
- Anything touched must be documented in a timeline format, e.g. "Tech 1 made
  change 2" or "3rd-party agent made changes to systems".
Step 3: REPAIR AGENT or REPAIR MANAGER
- Post updates to the Topic Thread for awareness and documentation.
- Test with the user and from the technical side, and state who the fix was tested
  with to make sure the issue is resolved.
Step 4: INCIDENT COORDINATOR: continue updates and comms until resolved.
a) Send out a final incident notice.
b) Update the weekly incident resolution report.
c) Downgrade the incident to a ticket for 24 hours of post-incident resolution monitoring.
d) Close the incident ticket and open an RCA ticket for RCA report generation,
   with a due date of 3 business days.
Step 5: Create the RCA and set its ticket number.
Documentation Support shall capture the work performed and the action(s)
implemented for future reference and Job-aid drafting.
Step 6: Follow-up. Someone from Support, the Dev Ops Manager, etc. gathers
feedback from the user.
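The two time limits in Step 4 (24 hours of post-incident monitoring and an RCA due date of 3 business days) can be computed as in the sketch below. It assumes business days are Monday to Friday with no holiday calendar and that the 3 days are counted from the day the RCA ticket is opened; neither is defined in this SOP. The function names are illustrative.

```python
from datetime import date, datetime, timedelta


def monitoring_end(resolved_at: datetime) -> datetime:
    """End of the 24-hour post-incident resolution monitoring window."""
    return resolved_at + timedelta(hours=24)


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start`.

    Assumption: Monday to Friday are business days; holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


def rca_due_date(rca_ticket_opened_on: date) -> date:
    """RCA report due date: 3 business days after the RCA ticket is opened."""
    return add_business_days(rca_ticket_opened_on, 3)
```

For instance, an RCA ticket opened on Thursday 5 January 2023 would be due on Tuesday 10 January 2023 under these assumptions, because the weekend is skipped.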
4.0 Reporting
- An Incident report shall be created and approved by the Incident Repair Manager and the
  Incident Manager prior to being presented to Upper Management.
- A Root Cause Analysis report shall be created and signed off by the Incident Repair
  Manager before being uploaded to the repository.
- The Ticket Management team shall create a weekly Incident Summary report detailing the
  status and details of all incidents worked on during that week, the projected resolution date
  if the resolution date is not specified, and the action taken for each incident worked on.
- Root Cause Analysis will be delivered within 24.
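The weekly Incident Summary report described above could be assembled along the following lines. The record fields follow the items the report must contain (status, details, resolution or projected resolution date, action taken); the class name, field names and the one-line-per-incident output format are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class IncidentRecord:
    """Hypothetical per-incident record used to build the weekly summary."""
    ticket: str
    status: str
    details: str
    action_taken: str
    worked_on: date
    resolution_date: Optional[date] = None
    projected_resolution_date: Optional[date] = None


def weekly_summary(incidents: List[IncidentRecord], week_start: date) -> List[str]:
    """Return one summary line per incident worked on in the 7 days from week_start."""
    week_end = week_start + timedelta(days=7)
    lines = []
    for inc in incidents:
        if not (week_start <= inc.worked_on < week_end):
            continue  # not worked on during this week
        if inc.resolution_date is not None:
            when = f"resolved {inc.resolution_date.isoformat()}"
        elif inc.projected_resolution_date is not None:
            # The projected date is reported only when no resolution date is set.
            when = f"projected {inc.projected_resolution_date.isoformat()}"
        else:
            when = "no resolution date"
        lines.append(
            f"{inc.ticket} | {inc.status} | {inc.details} | {when} | {inc.action_taken}"
        )
    return lines
```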

Glossary
Escalation matrix: A document or system that defines when escalation should happen and who
should handle incidents at each escalation level.
Impact: The effect or influence that an incident has on the organization’s daily business
processes.
Incident: An unplanned interruption to or quality reduction of an IT service.
Incident intake: Incident reception and information gathering.
Job-aid: Clear instructions on how to perform a certain task.
Point of contact: A person that can be approached for information concerning the incident.
Process: A series of steps involved in the way work is completed.
RCA: Root Cause Analysis report.
Severity: The extent to which the incident affects the organization.
SLA: Service Level Agreement, the expectations between the service provider and
the customer.
Teams-bridge: Audio conferencing in Microsoft Teams.
