Syntel CQA Forum

Disaster Recovery Plan

CQA Doc No 32

Definitions

Disaster is any sudden or unplanned calamitous event that causes a significant disruption in information systems and/or telecommunications systems and affects the operation of an organization.

Disaster Recovery/Business Contingency are commonly used terms that refer to the recovery of service following either a disaster or other actions that would disrupt business activity.

Recovery Plan is a document used to define the actions to be taken in the event of a disaster and to reduce the number of decisions required during a stressful situation.

Business resumption is the process of restoring business activity to an acceptable level, and then to a normal level, after an emergency event has disrupted normal operations.

Purpose and Scope

Most organizations now have installed very complicated and diverse on-line network systems. Although organizations may have similar equipment and operating systems, they generally do not have the capacity to add a large number of users from another on-line environment to their systems, even if the technical problems could be solved. A trend is evolving to provide alternate sites near the central local systems, where any additional equipment needed can be shipped in rapidly and critical on-line operations for the organization can be resumed in a reasonable time. Redundancy in the communications network and a tie-in to the alternate site, or the ability to tie in rapidly, is an important part of the disaster plan. This type of site is called a cold backup site, as opposed to a hot backup site, which contains all equipment necessary to start immediate operations.

DISASTER PLANNING PROCESS

In the event of a disaster, a business should have a back-up for the following:

• Data file storage and retrieval
• Customer services
• Communications and user operations
• Hardware
• Software
• Facilities for MIS and for users

An effective disaster recovery plan clearly identifies even the obvious details of how you will respond to a disaster, so that none of those details escape attention. Spelling out those details also establishes that the plan is comprehensive and well organized. It is important to keep in mind that the aim of the planning process is to:

• Assess existing vulnerabilities
• Implement disaster avoidance and prevention procedures
• Develop a comprehensive plan that will enable the organization to react appropriately and in a timely manner if disaster strikes
• Create an awareness program to educate management and senior individuals who will be required to participate in the project

Preparing for a Disaster

This section contains the minimum steps necessary to prepare for a possible disaster and for implementing the recovery procedures. An important part of these procedures is ensuring that the off-site storage facility contains adequate and timely computer backup tapes and documentation for applications systems, operating systems, support packages, and operating procedures.

General Procedures

Responsibilities have been assigned for ensuring that each of the following actions has been taken and that any needed updating continues:

• Maintaining and updating the disaster recovery plan.
• Ensuring that the periodic scheduled rotation of backup media is being followed for the off-site storage facilities.
• Maintaining and periodically updating disaster recovery materials, specifically documentation and systems information, stored in the off-site areas.
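The rotation of backup media can be made concrete as a small retention rule. The sketch below is illustrative only and uses the schedule stated under Software Safeguards: weekly full backups, differential backups each weekday evening, differentials kept until the next full backup, and the first backup of each month kept for one year. The plan does not say which day the full backup runs, so Sunday is an assumption, as is reading "first backup of each month" as the first full backup. The function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: the weekly full backup runs on Sunday (Monday is 0, Sunday is 6).
FULL_BACKUP_WEEKDAY = 6


def backup_type(day: date) -> Optional[str]:
    """Return 'full', 'differential', or None if no backup runs that day."""
    if day.weekday() == FULL_BACKUP_WEEKDAY:
        return "full"
    if day.weekday() < 5:  # Monday through Friday evenings
        return "differential"
    return None


def keep_until(day: date) -> Optional[date]:
    """Return the date until which the backup taken on `day` is retained.

    None means either no backup ran or the plan states no retention period.
    """
    kind = backup_type(day)
    if kind == "differential":
        # Differentials are kept until the next full backup.
        days_ahead = (FULL_BACKUP_WEEKDAY - day.weekday()) % 7
        return day + timedelta(days=days_ahead)
    if kind == "full" and day.day <= 7:
        # Assumption: the first full backup of the month is the one kept a year.
        return day + timedelta(days=365)
    return None


if __name__ == "__main__":
    for d in (date(2024, 1, 7), date(2024, 1, 8), date(2024, 1, 13)):
        print(d, backup_type(d), keep_until(d))
```

A rule like this could be checked against the off-site inventory to confirm that the scheduled rotation is actually being followed.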

Software Safeguards

Administrative software and data are secured by full backups each week and differential backups each weekday evening. The first backup of each month is retained for one year. Nightly differential backups are retained in Systems & Operations until the next full backup. A copy of the full backups is also stored in a safe deposit box. Backups are stored on 4mm DAT tapes and other compact media.

Recovery Procedures

Systems & Operations

In case of either a move to an alternate site or a plan to continue operations at the main site, the following general steps must be taken:

• Determine the extent of the damage and whether additional equipment and supplies are needed.
• Obtain approval for expenditure of funds to bring in any needed equipment and supplies.
• Notify local vendor marketing and/or service representatives if components need immediate delivery to bring the computer systems to an operational level, even in a degraded mode.
• If it is judged advisable, check with third-party vendors to see if a faster delivery schedule can be obtained.
• Notify vendor hardware support personnel that priority should be placed on assistance to add and/or replace components.
• Notify vendor systems support personnel that help is needed immediately to begin procedures to restore systems software.
• Order any additional electrical cables needed from suppliers.
• Rush order any supplies, forms, or media that may be needed.

In addition to the general steps listed at the beginning of this section, the following major tasks must be carried out when the alternate site is used:

• Notify officials that an alternate site will be needed.
• Coordinate moving of equipment and support personnel into the alternate site with appropriate personnel.
• Bring the recovery materials from the off-site storage to the alternate site.

• As soon as the hardware is up to specifications to run the operating system, load software and run necessary tests.
• Determine the priorities of the client software that needs to be available and load these packages in order. These priorities often depend on the time of the month and semester when the disaster occurs.
• Prepare backup materials and return these to the off-site storage area.
• Set up operations in the alternate site.
• Coordinate client activities to ensure the most critical jobs are being supported as needed.
• As production begins, ensure that periodic backup procedures are being followed and materials are being placed in off-site storage periodically.
• Work out plans to ensure all critical support will be phased in.
• Keep administration and clients informed of the status, progress, and problems.
• Coordinate the longer-range plans with the administration, the alternate site officials, and staff for the period of continuing support and, ultimately, for restoring the Systems & Operations section.
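The recovery steps say that client software should be loaded in priority order and that those priorities often depend on when in the month the disaster occurs. One way this could be sketched is shown below. The package names, priority numbers, and peak days are hypothetical and are not taken from the plan. Only the day of the month is modeled; a semester calendar could be handled the same way.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, List


@dataclass(frozen=True)
class Package:
    name: str
    base_priority: int                      # lower number = restore sooner
    peak_days: FrozenSet[int] = frozenset() # days of the month when the package is critical


def restore_order(packages: List[Package], today: date) -> List[str]:
    """Return package names in the order they should be loaded on `today`."""
    def key(pkg: Package):
        in_peak = today.day in pkg.peak_days
        # Packages in their peak period come first, then ordinary priority, then name.
        return (0 if in_peak else 1, pkg.base_priority, pkg.name)

    return [pkg.name for pkg in sorted(packages, key=key)]


# Hypothetical inventory for illustration only.
PACKAGES = [
    Package("payroll", 3, frozenset(range(25, 32))),  # critical at month end
    Package("registration", 2),
    Package("general_ledger", 1),
]

if __name__ == "__main__":
    print(restore_order(PACKAGES, date(2024, 3, 10)))
    print(restore_order(PACKAGES, date(2024, 3, 28)))
```

Keeping the priorities in a maintained table like this, rather than deciding them during the emergency, is in line with the plan's aim of reducing the number of decisions required during a stressful situation.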

Managing Recovery Procedures

A disaster recovery plan must do more than identify alternatives. It should identify how all members of a business organization should respond in the event of a disaster. This includes the following:

• Develop a clear chain of command. It should be clear whom employees need to go to if they see a problem.
• Establish a clear sense of individual responsibility. All employees should be trained to look for signs of disaster and to respond immediately when they see them.

Documenting Procedures and Training Employees

Documentation of the disaster recovery plan needs to be readily available to all employees. It should be

• written out and frequently revised as the business and threats change;
• available on the business network;
• referred to frequently in the training and ongoing education of all employees.

Training should help build employee awareness of what disasters might look like as they begin. Employees also need to know how they might respond immediately, in addition to notifying someone else. Training should include

• practice exercises;
• clearly written documents, together with frequent instruction in identifying who is to be notified and how to respond to a disaster.

Testing the Disaster Recovery Plan

A disaster recovery plan is only as good as its ability to be put into action. You may want to consider the following kinds of tests of your plan to maintain its effectiveness:

• An initial comprehensive test of responses to several kinds of disasters
• Tests that simulate specific disasters affecting one part of the network
• Tests that simulate worst-case scenarios, in which the entire network is affected

10718264.doc
