
Strengthening the Business Continuity Process with Methodical Drills


Written by Asif Khan, Farooq Khan, & Ahmad H. Al-Sharidah
An organization’s resilience is only as strong as the weakest link in its chain. Business
continuity drills are the key to detecting, addressing, and strengthening that weakest link.

Why drills? Without a solid drill plan in place, the business continuity team can never
provide the needed assurance that the organization’s critical services will be available at
all times. With periodic drills you can ascertain how effective each component of the
business continuity plan is and identify gaps that need to be addressed. With today’s
growing system dependencies, it becomes increasingly difficult to verify that business
continuity drills are effective. Effective drills are conducted methodically to touch each
service and its dependencies, and the gaps identified in these drills are not only
addressed but also re-tested, in a drill, to confirm that the fixes work.
A well-planned business continuity drill regime accounts for a major share of your
organization’s business continuity management (BCM) program. In fact, this regime
truly reflects the popular axiom “what gets measured, gets done.”
Professionals strive to provide a much-needed sense of reassurance to BCM program
sponsors and proponents to gain their confidence. Nothing could be more persuasive than
a well-managed calendar of business continuity drills, augmented by fact-finding reports,
when giving a thoughtful yet proven response to executives’ bottom-line concerns and
maintaining stakeholder confidence in the reliability of critical systems.
We will provide a framework to establish a solid business continuity drill regime which
would not only provide a systematic checklist of each component in the BCM program but
also give you the ability to quickly adapt to changes as they come along.
What is a Business Continuity Drill?
In his book “Disaster Recovery Testing: Exercising Your Contingency Plan,” Philip Jan
Rothstein notes, “The goal of testing and exercising your plan is not to find out if it works,
but to determine how it does not.”
The business continuity drill is a simulation of an outage scenario for which an accepted
level of resilience has been put in place. The goal of this simulated scenario is to gauge
the actual resilience of an organization compared to expectations. Think of these
simulations as “what if” scenarios. A business continuity drill should never be deemed
unsuccessful. It should always identify gaps and opportunities for improvement and
optimization.
What Do You Need For a Drill?
The first step is to define the outage scenarios, or the failure of each platform/component
within the environment. Once a business continuity scenario is identified, all of its
dependencies must be charted out, such as system requirements, systems support
personnel, applications support staff, testing area, end-user testing, etc.
Figure 1: Example of charting system requirements
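One way to keep such a chart in a checkable form is sketched below. It is a minimal illustration and not a prescribed format: the dependency categories come from the list above, while the `DrillScenario` class, the scenario name, and every entry under each category are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class DrillScenario:
    """One outage scenario and what it depends on (all names are illustrative)."""
    name: str
    # Maps a dependency category to the items charted under it.
    dependencies: dict[str, list[str]] = field(default_factory=dict)

    def missing_categories(self, required: list[str]) -> list[str]:
        """Return required categories that have no entries charted yet."""
        return [c for c in required if not self.dependencies.get(c)]


# Categories taken from the text; the entries below are made-up placeholders.
REQUIRED = [
    "system requirements",
    "systems support personnel",
    "applications support staff",
    "testing area",
    "end-user testing",
]

scenario = DrillScenario(
    name="Database platform outage",
    dependencies={
        "system requirements": ["standby database host"],
        "systems support personnel": ["on-call DBA"],
        "applications support staff": ["billing application owner"],
        "testing area": [],
    },
)

# Categories still to be charted before the drill can be planned.
print(scenario.missing_categories(REQUIRED))
```

Running the check before each drill makes it harder to schedule a scenario whose dependencies have only been partly charted.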

Types of Business Continuity Drills


Business continuity has several kinds of drills, such as single component, entire service,
table-top, and disaster simulation. The single component/platform drill is the most common
and a good place to start. In this type of drill you choose a single component/platform,
such as network-attached storage, a database, or a middleware system, and test all
applications dependent on it. In a service drill, an entire service is selected for a
particular outage scenario. These drills tend to be difficult to manage and require much
more planning and support staff. However, the results are worth the effort, as you come to
know the application dependency maps clearly and test their atomicity while they are
provided as one service.
Viable drill types include scheduled, surprise, plan review, tabletop, walk through,
modular/component, functional/line of business, simulation/mock, and
comprehensive/full scale.
How To Plan For a Drill?
Planning for a drill is half the job. Execution is the other half. When it comes to drill
planning, only three things are important: plan, plan, and plan. A calendar of business
continuity drills is a very handy tool. You can put all the drills on the calendar with
tentative dates before the start of the year and share it with all stakeholders. This way
you give a reasonable heads-up about the workload and your expectations of each
stakeholder.
Once you decide to conduct a drill you need to give a heads-up to the support team and
all stakeholders at least two weeks ahead of time. It is important to check whether any
changes are in the pipeline that may require re-testing the scenario. If that is the case,
it is advisable to postpone the drill until the system changes are made.
Sample of a drill calendar (month of March, with Scenario 1, Scenario 2, and Scenario 3
marked on the calendar)
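A calendar like this, together with the two-week heads-up described earlier, could be produced with a few lines of Python. This is only a sketch under stated assumptions: the year, the drill dates, and the `notice_deadline` helper are hypothetical, and only the two-week notice period comes from the text.

```python
import calendar
from datetime import date, timedelta

NOTICE = timedelta(weeks=2)  # minimum heads-up stated in the text

# Hypothetical tentative dates; the year and days are placeholders.
drills = {
    "Scenario 1": date(2026, 3, 10),
    "Scenario 2": date(2026, 3, 18),
    "Scenario 3": date(2026, 3, 26),
}


def notice_deadline(drill_date: date) -> date:
    """Latest date on which stakeholders should be notified of a drill."""
    return drill_date - NOTICE


# Print the month grid, then the schedule with notification deadlines.
print(calendar.month(2026, 3))
for name, day in sorted(drills.items(), key=lambda item: item[1]):
    print(f"{day:%d %b}  {name}  (notify by {notice_deadline(day):%d %b})")
```

Keeping the notification deadline next to each drill date makes a slipped heads-up visible when the calendar is shared.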

Testing plans will arguably differ from one organization to another. However, following a
standard methodology keeps the effort focused and aligned with expectations. We
recommend a testing methodology in which:

• The plans are tested to the fullest extent possible.
• The costs are not prohibitive.
• Service disruptions are minimal.
• The results provide a high degree of assurance in recovery ability.
• Evaluation provides quality input to plan review and updates.

Follow-Up and Lessons Learned


As stated earlier, every business continuity drill is a success, as each drill would surely
identify gaps or shortcomings whose resolution ultimately inches the resilience of your
critical systems toward perfection. After each drill, a follow-up meeting with all
stakeholders should be arranged, where the objective is strictly never to “point fingers”
but to learn from issues and ensure proper solutions. A proper drill report with tangible,
manageable action items, if there are any, is to be shared with all stakeholders. These
drill reports become an important document trail when your company is audited
for compliance certification or by internal auditors.

• CONTINUOUS DRILL CYCLE

In the Department of Energy’s “The Public Inquiry into the Piper Alpha Disaster,” W.D. Cullen
reports, “The policy and procedures were in place: the practice was deficient.” A continuous
drill cycle is a loop of continuous improvement: the drill plan is followed by drill
execution, then by feedback, then by addressing the gaps/issues, after which the cycle
returns to the drill plan.
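The loop described above can be written down as a small state cycle. This is an illustrative sketch only: the `Phase` names and the `run_cycle` helper are assumptions introduced here, and the order of the phases is the only part taken from the text.

```python
from enum import Enum


class Phase(Enum):
    PLAN = "drill plan"
    EXECUTE = "drill execution"
    FEEDBACK = "feedback"
    ADDRESS_GAPS = "address gaps/issues"


# Each phase hands over to the next; the last one loops back to planning.
NEXT_PHASE = {
    Phase.PLAN: Phase.EXECUTE,
    Phase.EXECUTE: Phase.FEEDBACK,
    Phase.FEEDBACK: Phase.ADDRESS_GAPS,
    Phase.ADDRESS_GAPS: Phase.PLAN,
}


def run_cycle(start: Phase, steps: int) -> list[Phase]:
    """Walk the loop for a number of steps and return the phases visited."""
    visited = [start]
    for _ in range(steps):
        visited.append(NEXT_PHASE[visited[-1]])
    return visited


# Four steps from the plan phase complete one full turn of the loop.
for phase in run_cycle(Phase.PLAN, 4):
    print(phase.value)
```

Because addressing the gaps leads straight back to planning, a fix that has been made is always queued for re-testing in a later drill.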

• MATURITY LEVEL

As you progress through the continuous drill cycle, the system would slowly but surely
attain a level of maturity at which fewer gaps are identified during each business
continuity drill and the critical services covered are well aligned with the scope and
expectations of the organization’s BCM program. This is a commendable milestone.
However, it is just the beginning of a new phase (the un-announced drill).

• UN-ANNOUNCED DRILL

Once you have conducted all the drills in your calendar a few times, it’s time for the real
test (the un-announced drill). You can think of the un-announced drill as the maturity test
of your end-to-end business continuity setup, where the entire process, from calling the
support staff to the testing site through to end-user testing, would reveal your actual
maturity. No matter what the results are, it is important for the entire organization
(including executives) to understand that the un-announced drill is a continuous journey,
not a destination.
Data center disasters may be caused by nature, equipment failure, or human factors. All
of these factors must be considered in disaster planning. One must have adequate
recovery plans, but it is only a plan if it is never tested.
Testing your recovery plan, unannounced, would help your organization emulate a real
outage scenario and identify problem areas so they can be corrected in preparation for
a real disaster. Recovery plans are complex, so it’s critical that thorough
preparations are made before conducting a drill or simulation test for all of the critical
components.
Information sessions/workshops with application and testing support staff should be
planned to explain the purpose, execution, and measurement details of the un-announced
drills.
A tabletop (mock) disaster recovery drill would provide an organization’s support staff a
practical checklist of procedures to follow during a recovery phase. The business
continuity team should go over each critical procedure and planning document needed
to ensure each step is covered in restoring critical services at the recovery site.
The unannounced drill is a method of practicing for the actual scenario, which would
generally be activated via an organization-wide announcement, followed by a grace
period (if available) to shut down gracefully. Operations would then be alerted to call in
the on-call back-up support staff.
The first and foremost priority is production operations, which must be secured and
guaranteed to be operational during the entire drill period. If any interruptions are
observed, the drill will be aborted at once.
A well-thought-out and flawless “abort” plan is a must, and it should be ready to be
exercised in case there are any unforeseen issues.
• PROCESS FLOW: UN-ANNOUNCED DRILL

An email message would be sent to all concerned stakeholders of the drill announcing
the brief interruption of services during the failover of the services to the recovery site.
A grace period of 15 to 30 minutes would be given to secure the in-process application data
and communicate the brief interruption to the end-users. Data center operations would be
alerted to the un-announced drill and would contact the on-call/back-up support staff to
bring them on-site.
All participating support, application, and user staff would be asked to report any problems
encountered during this recovery drill, for example, difficulties reporting to the worksite,
communication difficulties, surprise technical issues, application issues, performance
problems, etc. One may follow a similar communication approach and prepare the messages
in advance.
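The process flow above can be laid out as a simple timeline. The sketch below is an assumption-laden illustration rather than a prescribed procedure: the function names, the step wording, and the choice to alert operations at the moment of announcement are hypothetical, while the 15 to 30 minute grace period and the abort-on-interruption rule come from the text.

```python
from datetime import datetime, timedelta


def drill_timeline(announced_at: datetime, grace_minutes: int = 15) -> list[tuple[datetime, str]]:
    """Build the ordered steps of an un-announced drill from the announcement time."""
    if not 15 <= grace_minutes <= 30:
        raise ValueError("the text suggests a grace period of 15 to 30 minutes")
    failover_at = announced_at + timedelta(minutes=grace_minutes)
    return [
        (announced_at, "email stakeholders about the brief interruption"),
        # Assumption: operations are alerted at announcement time.
        (announced_at, "alert data center operations to call in on-call/back-up support"),
        (failover_at, "grace period ends; fail services over to the recovery site"),
    ]


def should_abort(production_interrupted: bool) -> bool:
    """Production comes first: any interruption aborts the drill at once."""
    return production_interrupted


# Hypothetical announcement time with a 30-minute grace period.
for when, step in drill_timeline(datetime(2026, 3, 10, 9, 0), grace_minutes=30):
    print(f"{when:%H:%M}  {step}")
```

Writing the timeline down in advance also gives the follow-up meeting a baseline against which the actual call-in and failover times can be compared.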

• DOCUMENT CONTROL

Once a business continuity drill is executed and a proper follow-up has been conducted,
an official report on the concluded activity, its findings, and recommendations is
essential:

• To keep the management engaged and seek support and attention from the support
organizations.
• To ensure the drill goals, results, and execution are fully documented for establishing
a baseline and for audit purposes.

Conclusion
Based on the way organizations operate, it is imperative that business continuity drill
strategies are embedded in the operational routine. For this approach, many of the critical
success factors focus on building and utilizing support within the organization:

• Build the confidence of users/customers through periodic drills/testing to validate that
the business continuity plans remain effective and the organization has the proven
capability to sustain continuity of its critical operations in the event of an incident.
• Communicate the upcoming business continuity drills and their execution procedures to
support and end-user entities. Publish the annual business continuity drill calendar and
share the progress, revisions, and reminders with all stakeholders on a quarterly basis.
• Collaborate with representatives of support and end-users to ensure active participation
during the execution of business continuity drills and in testing the validity of the
business continuity services.
• Treat production operations as the foremost priority, and make every effort to ensure
continued production operations during the business continuity drills as per expectations.
• Capitalize on the lessons learned and adhere to the Plan-Do-Check-Act cycle.
• Align drill objectives with organization goals, and secure management acknowledgement
of reports, test results, follow-up on lessons learned, and recognition.
