Why drills? Without a solid drill plan in place, the business continuity team can never
provide the needed assurance that the organization’s critical services will be available at
all times. Periodic drills let you ascertain how effective each component of the
business continuity plan is and identify the gaps that need to be addressed. With today’s
growing system dependencies, it is increasingly difficult to verify that business
continuity drills are effective. To be effective, drills must be conducted methodically so
that they touch each service and its dependencies, and the gaps identified in them must be
not only addressed but also re-tested in a later drill to confirm that the fixes work.
A well-planned business continuity drill regime is a major share of your
organization’s business continuity management (BCM) program. In fact, this regime
is a true reflection of the popular axiom “what gets measured, gets done.”
Professionals strive to reassure BCM program sponsors and proponents and to gain
their confidence. Nothing is more persuasive than a well-managed calendar of business
continuity drills, backed by fact-finding reports, when responding to executives’
concerns about the bottom line and maintaining stakeholder confidence in the
reliability of critical systems.
We will provide a framework for establishing a solid business continuity drill regime that
not only provides a systematic checklist of each component in the BCM program but
also gives you the ability to adapt quickly to changes as they come along.
What is a Business Continuity Drill?
In his book “Disaster Recovery Testing: Exercising Your Contingency Plan,” Philip Jan
Rothstein notes, “The goal of testing and exercising your plan is not to find out if it works,
but to determine how it does not.”
The business continuity drill is a simulation of an outage scenario for which an accepted
level of resilience has been put in place. The goal of this simulated scenario is to gauge
the actual resilience of an organization compared to expectations. Think of these
simulations as “what if” scenarios. A business continuity drill should never be deemed
unsuccessful: it should always identify gaps and opportunities for improvement and
optimization.
What Do You Need For a Drill?
The first step is to define the outage scenarios, or the failure of each platform/component
within the environment. Once a business continuity scenario is identified, all of its
dependencies must be charted out, such as system requirements, systems support
personnel, application support staff, the testing area, and end-user testing.
Figure 1: Example of charting system requirements
[Figure: drill calendar for a 31-day month, with entries for Scenario 1, Scenario 2, and Scenario 3]
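The dependency charting described above can be sketched as a small data structure. This is only an illustrative sketch: the class name `DrillScenario`, its field names, and the sample values are assumptions made for the example, not part of any BCM standard or tool.

```python
from dataclasses import dataclass, field


@dataclass
class DrillScenario:
    """One outage scenario plus the dependencies charted for it (hypothetical model)."""
    name: str
    failed_component: str
    system_requirements: list = field(default_factory=list)
    systems_support_personnel: list = field(default_factory=list)
    application_support_staff: list = field(default_factory=list)
    testing_area: str = ""
    end_user_testers: list = field(default_factory=list)

    def uncharted(self):
        """Return the dependency categories that have not been filled in yet."""
        categories = {
            "system requirements": self.system_requirements,
            "systems support personnel": self.systems_support_personnel,
            "application support staff": self.application_support_staff,
            "testing area": self.testing_area,
            "end-user testers": self.end_user_testers,
        }
        return [label for label, value in categories.items() if not value]


# Sample values below are invented for illustration.
scenario = DrillScenario(
    name="Scenario 1",
    failed_component="primary database server",
    system_requirements=["standby database host", "replicated storage"],
    systems_support_personnel=["on-call DBA"],
    testing_area="recovery site lab",
)
print(scenario.uncharted())
```

Listing the uncharted categories before a drill is scheduled is one simple way to catch a scenario whose dependencies are only partly mapped.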
Testing plans will differ from one organization to another. However, following a standard
methodology keeps the effort focused and aligned with expectations. We recommend the
following testing methodology:
In the Department of Energy’s “The Public Inquiry into the Piper Alpha Disaster,” W.D. Cullen
reports, “The policy and procedures were in place: the practice was deficient.” A continuous drill
cycle is a loop for continuous improvement: the drill plan is followed by drill execution, which
is followed by feedback, which is in turn followed by addressing the gaps and issues before
returning to the drill plan.
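The loop described above can be written down as a four-phase cycle. The following is a minimal sketch; the names `DrillPhase` and `next_phase` are hypothetical and chosen only for this example.

```python
from enum import Enum


class DrillPhase(Enum):
    """The four phases of the continuous drill cycle, in order."""
    PLAN = "drill plan"
    EXECUTE = "drill execution"
    FEEDBACK = "feedback"
    ADDRESS_GAPS = "address gaps/issues"


def next_phase(phase):
    """Return the phase that follows `phase`, wrapping back to PLAN."""
    phases = list(DrillPhase)
    return phases[(phases.index(phase) + 1) % len(phases)]


# Walk one full cycle and show that it returns to the drill plan.
phase = DrillPhase.PLAN
for _ in range(5):
    print(phase.value)
    phase = next_phase(phase)
```

The wrap-around in `next_phase` is the point of the sketch: addressing gaps is not the end of the process but the input to the next drill plan.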
MATURITY LEVEL
As you progress through the continuous drill cycle, the system will slowly but surely attain
maturity: fewer gaps are identified during each business continuity drill, and the critical
services covered are well aligned with the scope and expectations of the organization’s
BCM program. This is a commendable milestone. However, it is just the beginning of a
new phase: the un-announced drill.
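One possible way to watch for this maturity signal is to track the number of gaps found in successive drills of the same scenario. The sketch below is an assumption about how that could be measured, not a rule from the BCM program: the function name `gap_trend`, its labels, and the "never rises, ends lower" criterion are all illustrative choices.

```python
def gap_trend(gap_counts):
    """Classify the gap counts from successive drills of one scenario.

    Returns "maturing" when no drill found more gaps than the one before it
    and the latest drill found fewer gaps than the first. Returns
    "insufficient data" when fewer than two drills have been run.
    The criterion itself is an illustrative assumption.
    """
    if len(gap_counts) < 2:
        return "insufficient data"
    never_rises = all(
        later <= earlier for earlier, later in zip(gap_counts, gap_counts[1:])
    )
    if never_rises and gap_counts[-1] < gap_counts[0]:
        return "maturing"
    return "not yet maturing"


# Invented sample counts: gaps found in four successive drills.
print(gap_trend([9, 6, 4, 2]))
```

A count that rises again after a change to the environment is not a failure; it indicates that the scenario should stay on the announced calendar a little longer.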
UN-ANNOUNCED DRILL
Once you have conducted all the drills in your calendar a few times, it’s time for the real
test: the un-announced drill. Think of the un-announced drill as the maturity test of
your end-to-end business continuity setup, where the entire process, from calling the
support staff to the testing site through end-user testing, reveals your actual maturity.
No matter what the results are, it is important for the entire organization (including
executives) to understand that the un-announced drill is a continuous journey, not a
destination.
Data center disasters may be caused by nature, equipment failure, or human factors. All
of these factors must be considered in disaster planning. One must have adequate
recovery plans, but a plan that is never tested remains only a plan.
Testing your recovery plan unannounced helps your organization emulate a real
outage scenario and identify problem areas so they can be corrected before a real
disaster. Recovery plans are complex, so it’s critical that thorough preparations are
made for all of the critical components before conducting a drill or simulation test.
Information sessions/workshops with application and testing support staff should be
planned to explain the purpose, execution, and measurement details of the un-announced
drills.
A tabletop (mock) disaster recovery drill gives an organization’s support staff a
practical checklist of procedures to follow during a recovery phase. The business
continuity team should go over each critical procedure and planning document needed
to ensure that every step in restoring critical services at the recovery site is covered.
The unannounced drill is a way of practicing for the actual scenario. It would
generally be activated via an organization-wide announcement, followed by a grace
period (if available) to shut down gracefully. Operations would then be alerted to call in
the on-call back-up support staff.
The first and foremost priority is production operations, which must be secured and
guaranteed to remain operational during the entire drill period. If any interruption is
observed, the drill will be aborted at once.
A well-thought-out, flawless “abort” plan is a must. It should be ready to be exercised
in case of any unforeseen issues.
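The abort rule above can be sketched as a guard around the drill steps. This is a minimal illustration under stated assumptions: `run_drill`, the step names, and the simulated health check are hypothetical, and a real drill would rely on the organization's own monitoring to decide whether production is affected.

```python
def run_drill(steps, production_is_healthy, abort):
    """Run drill steps in order, aborting at once if production is interrupted.

    steps: list of (name, callable) pairs.
    production_is_healthy: callable returning True while production is unaffected.
    abort: callable that executes the abort plan.
    Returns (names of completed steps, whether the drill was aborted).
    """
    completed = []
    for name, action in steps:
        # Check production before every step; stop immediately on interruption.
        if not production_is_healthy():
            abort()
            return completed, True
        action()
        completed.append(name)
    # Final check after the last step has run.
    if not production_is_healthy():
        abort()
        return completed, True
    return completed, False


# Simulated health checks: production is interrupted before the third step.
health_checks = iter([True, True, False])
log = []
completed, aborted = run_drill(
    steps=[
        ("notify stakeholders", lambda: log.append("notified")),
        ("fail over to recovery site", lambda: log.append("failed over")),
        ("end-user testing", lambda: log.append("tested")),
    ],
    production_is_healthy=lambda: next(health_checks),
    abort=lambda: log.append("abort plan executed"),
)
print(completed, aborted, log)
```

Passing the abort plan in as a callable keeps it a prepared, separately testable procedure rather than something improvised during the drill.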
PROCESS FLOW: UN-ANNOUNCED DRILL
An email message would be sent to all concerned stakeholders of the drill announcing
the brief interruption of services during their failover to the recovery site. A grace
period of 15 to 30 minutes would be given to secure the in-process application data and
communicate the brief interruption to the end users. Data center operations would be
alerted to the un-announced drill and would contact the on-call/back-up support staff to
bring them on-site.
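The flow above can be laid out as a simple timeline. This sketch is illustrative only: the function name `drill_timeline` is hypothetical, and placing the operations alert at the moment of the announcement is an assumption, since the text does not fix its exact timing. The 15 to 30 minute grace period comes from the text.

```python
from datetime import datetime, timedelta


def drill_timeline(announced_at, grace_minutes):
    """Return (time, step) pairs for an un-announced drill.

    grace_minutes must fall within the 15 to 30 minute window given in the text.
    Alerting operations at announcement time is an assumption of this sketch.
    """
    if not 15 <= grace_minutes <= 30:
        raise ValueError("grace period should be 15 to 30 minutes")
    failover_at = announced_at + timedelta(minutes=grace_minutes)
    return [
        (announced_at, "email stakeholders about the brief interruption"),
        (announced_at, "alert data center operations to call in on-call/back-up support"),
        (announced_at, "grace period starts: secure in-process data, inform end users"),
        (failover_at, "grace period ends: fail services over to the recovery site"),
    ]


# Invented sample start time with a 20-minute grace period.
timeline = drill_timeline(datetime(2024, 1, 10, 9, 0), grace_minutes=20)
for when, step in timeline:
    print(when.strftime("%H:%M"), step)
```

Rejecting grace periods outside the stated window makes a mistyped value visible before the announcement goes out.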
All participating support, application, and end-user staff would be asked to report any
problems encountered during the recovery drill, for example difficulties reporting to the
worksite, communication difficulties, surprise technical issues, application issues, and
performance problems. Communications along these lines can be prepared in advance.
DOCUMENT CONTROL
Once a business continuity drill has been executed and a proper follow-up conducted,
an official report concluding the activity, with its findings and recommendations, is
essential:
To keep management engaged and to secure the support and attention of the support
organizations.
To ensure the drill goals, execution, and results are fully documented for baseline and
audit purposes.
Conclusion
Based on the way organizations operate, it is imperative that business continuity drill
strategies are embedded in the operational routine. For this approach, many of the critical
success factors focus on building and utilizing support within the organization.