
EMC NAS Celerra Network Server

Replication Manager 5.0.3

iSCSI Clustered Disaster Recovery Solution Implementation Guide


P/N 300-006-067 REV A01

EMC Corporation Corporate Headquarters: Hopkinton, MA 01748-9103


1-508-435-1000 www.EMC.com

Copyright © 2007 EMC Corporation. All rights reserved. Published November 2007.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners. For the most up-to-date regulatory document for your product line, go to the Document/Whitepaper Library on EMC Powerlink.


Contents

Warnings and Cautions ..... 11
Preface ..... 15

Chapter 1    Getting Started
    Overview ..... 20
        Disaster recovery phases ..... 21
    Limitations and minimum system requirements ..... 22
        Limitations ..... 22
    Using this guide ..... 25
        Concepts and terms ..... 25

Chapter 2    Setting up Disaster Recovery
    Overview ..... 28
    Step 1: Setting up the production cluster ..... 31
        Step 1a: Install Windows Server 2003 on each node and create a cluster ..... 31
        Step 1b: Install all recommended updates from Microsoft ..... 32
        Step 1c: Install Microsoft iSCSI Initiator ..... 32
        Step 1d: Install EMC Solutions Enabler ..... 32
        Step 1e: Create source LUNs ..... 32
        Step 1f: Perform Replication Manager pre-installation steps for Microsoft Clusters ..... 33
        Step 1g: Install Replication Manager ..... 34
        Step 1h: Set up your user application ..... 39
        Step 1i: Optionally set up the production site mount host ..... 40
    Step 2: Setting up the disaster recovery cluster ..... 41
        Step 2a: Install Windows Server 2003 on each node and create a cluster ..... 41
        Step 2b: Install all recommended updates from Microsoft ..... 41
        Step 2c: Install Microsoft iSCSI Initiator ..... 42
        Step 2d: Install EMC Solutions Enabler ..... 42
        Step 2e: Create destination LUNs ..... 42
        Step 2f: Perform Replication Manager pre-installation steps for Microsoft Clusters ..... 44
        Step 2g: Install Replication Manager ..... 44
        Step 2h: Install user applications ..... 51
        Step 2i: Optionally set up the DR site mount host ..... 51
    Step 3: Setting up replication ..... 53
        Step 3a: Set up trust relationship ..... 53
        Step 3b: Create replication jobs ..... 54
        Step 3c: Verify replication ..... 58

Chapter 3    Failing Over
    Failover overview ..... 64
    Step 1: Preparing for application data failover ..... 66
        Step 1a: Determine necessity of failover ..... 66
        Step 1b: Install user applications ..... 67
        Step 1c: Fill out sdccluster and sdc-celerra worksheets ..... 67
        Step 1d: Fill out failover support worksheet ..... 67
    Step 2: Failing over the RM server ..... 70
    Step 3: Promoting clone replica to production and failing over ..... 71
        Step 3a: Fail over appropriate replication sessions ..... 71
        Step 3b: Mask new LUNs to disaster recovery server ..... 71
        Step 3c: Promoting the iSCSI clone replica to Production ..... 72
        Step 3d: Recover replicas ..... 74
        Step 3e: Complete application setup ..... 75
        Step 3f: Recover applications ..... 76
        Step 3g: Archive primary data center worksheets ..... 77

Chapter 4    Preparing for Failback
    Overview ..... 80
    Step 1: Setting up the new production cluster ..... 83
        Step 1a: Install Windows Server 2003 and create a cluster ..... 83
        Step 1b: Install all recommended updates ..... 83
        Step 1c: Install Microsoft iSCSI Software Initiator ..... 84
        Step 1d: Install EMC Solutions Enabler ..... 84
        Step 1e: Create new destination LUNs ..... 84
        Step 1f: Perform RM pre-installation steps for Microsoft Clusters ..... 85
        Step 1g: Install Replication Manager ..... 86
        Step 1h: Install user applications ..... 90
        Step 1i: Optionally set up the production site mount host ..... 90
    Step 2: Setting up replication ..... 92
        Step 2a: Set up trust relationship ..... 92
        Step 2b: Create replication jobs ..... 93
        Step 2c: Verify replication ..... 97
        Step 2d: Verify worksheet information ..... 98

Chapter 5    Failing Back and Final Recovery
    Overview ..... 102
    Before you begin ..... 104
    Step 1: Preparing to fail back replication sessions ..... 105
        Step 1a: Prepare data for failback ..... 105
        Step 1b: Prepare applications for failback ..... 105
        Step 1c: Fill out new pdccluster and new pdc-celerra worksheets ..... 105
        Step 1d: Fill out failover support worksheet ..... 106
    Step 2: Failing back the RM server ..... 109
    Step 3: Promoting clone replica to production and failing back ..... 110
        Step 3a: Fail over appropriate replication sessions ..... 110
        Step 3b: Mask new LUNs to new Production cluster ..... 110
        Step 3c: Promoting the iSCSI clone replica ..... 111
    Step 4: Serving live data from the new production system ..... 114
        Step 4a: Install user applications in DR Mode ..... 114
        Step 4b: Recover replicas ..... 114
        Step 4c: Complete application setup ..... 115
        Step 4d: Recover applications ..... 116
    Step 5: Setting up final recovery ..... 117
        Step 5a: Verify and archive worksheets ..... 117
        Step 5b: Set up protection for the new Production Server ..... 117

Appendix A    Celerra Worksheet
    Filling out the worksheet ..... 120
        Worksheet definitions ..... 120
    Celerra worksheet ..... 122

Appendix B    Cluster Worksheet
    Filling out the worksheet ..... 126
        Worksheet definitions ..... 126
    Cluster worksheet ..... 127

Appendix C    Failover Support Worksheet
    Filling out the worksheet ..... 130
        Worksheet definitions ..... 130
    Failover support worksheet ..... 131

Appendix D    Mounting Remote Replicas
    Mounting remote replicas ..... 134


Figures

1    Phases of disaster recovery solution ..... 21
2    Setup phase of disaster recovery solution ..... 28
3    Setup locations and activities ..... 30
4    Installshield Wizard - Cluster pre-installation steps ..... 34
5    InstallShield Wizard - Designate primary server type ..... 36
6    InstallShield Wizard - Enter communication port for servers ..... 37
7    InstallShield Wizard - Secondary RM server is not currently installed ..... 38
8    Installshield Wizard - Cluster pre-installation steps ..... 45
9    InstallShield Wizard - Designate secondary server type ..... 46
10   InstallShield Wizard - Enter communication port for servers ..... 47
11   InstallShield Wizard - Primary RM server is currently installed ..... 48
12   Replication Manager secondary server name and port ..... 49
13   Job Wizard - Job Name and Settings ..... 55
14   Job Wizard - Celerra replication storage ..... 56
15   Job Wizard - Mount Options ..... 57
16   Viewing replicas in the content panel ..... 58
17   Failover phase of disaster recovery solution ..... 64
18   Failover locations and activities ..... 65
19   EMC Replication Manager - Promoting the iSCSI clone replica to Production ..... 72
20   Confirmation for mounting the replica ..... 73
21   Promote Replica dialog box ..... 73
22   Promote Replica progress panel ..... 74
23   Failback preparation phase of disaster recovery solution ..... 80
24   Failback preparation locations and activities ..... 82
25   Replication Manager - Designate secondary RM server type ..... 86
26   InstallShield Wizard - Enter communication port for servers ..... 87
27   Replication Manager - Primary RM server is currently installed ..... 88
28   Replication Manager secondary server name and port ..... 89
29   Job Wizard - Select Replication technology ..... 94
30   Job Wizard - Celerra replication storage ..... 95
31   Job Wizard - Mount Options ..... 96
32   Failback and final recovery phase of disaster recovery solution ..... 102
33   Failback locations and activities ..... 103
34   EMC Replication Manager - Promote the iSCSI clone replica ..... 111
35   Confirmation before promoting the replica ..... 112
36   Promote Replica dialog box ..... 112
37   Promote Replica progress panel ..... 113
38   Mount options example ..... 135


Tables

1    Destination examples for failover worksheet ..... 68
2    Destination examples for sdc-celerra worksheet ..... 98
3    Destination examples for failover worksheet ..... 107
4    Celerra general information ..... 122
5    Celerra iSCSI targets ..... 122
6    Celerra trust relationships ..... 122
7    Celerra LUN configuration ..... 123
8    General cluster information ..... 127
9    Application databases ..... 127
10   Host information ..... 131
11   Drive and LUN information ..... 131


Warnings and Cautions

The following warnings and cautions pertain throughout this guide.

WARNING
Trained service personnel only. This EMC product has more than one power supply cord. To reduce the risk of electric shock, disconnect all power supply cords before servicing. Ground-circuit continuity is vital for safe operation of the machine. Never operate the machine with grounding conductors disconnected. Remember to reconnect any grounding conductors removed for or during any installation procedure.

Additional warnings and cautions
Before attempting to service EMC hardware described in this document, observe the following additional warnings and cautions:

WARNING
The hardware enclosure contains no user-serviceable parts, so it should not be moved or opened for any reason by untrained persons. If the hardware needs to be relocated or repaired, only qualified personnel familiar with safety procedures for electrical equipment and EMC hardware should access components inside the unit or move the unit.

WARNING
This product operates at high voltages. To protect against physical harm, power off the system whenever possible while servicing.


WARNING
In case of fire or other emergency involving the EMC product, isolate the product's power and alert appropriate personnel.

CAUTION
Trained personnel are advised to exercise great care at all times when working on the EMC hardware. Remember to:

- Remove rings, watches, or other jewelry and neckties before you begin any procedures.
- Use caution near any moving part and any part that may start unexpectedly, such as fans, motors, and solenoids.
- Always use the correct tools for the job.
- Always use the correct replacement parts.
- Keep all paperwork, including incident reports, up to date, complete, and accurate.


Static precautions

EMC incorporates state-of-the-art technology in its designs, including the use of LSI and VLSI components. These chips are very susceptible to damage caused by static discharge and must be handled accordingly.

CAUTION
Before handling printed circuit boards or other parts containing LSI and/or VLSI components, observe the following precautions:

- Store all printed circuit boards in antistatic bags.
- Use a ground strap whenever you handle a printed circuit board.
- Unless a board is specifically designed for nondisruptive replacement, never plug or unplug printed circuit boards with the power on. Severe component damage may result.


Preface

As part of an effort to improve and enhance the performance and capabilities of its product line, EMC from time to time releases revisions of its hardware and software. Therefore, some functions described in this document may not be supported by all revisions of the software or hardware currently in use. For the most up-to-date information on product features, refer to your product release notes.

If a product does not function properly or does not function as described in this document, please contact your EMC representative.

Audience
This guide is part of the Celerra Network Server documentation set and is intended for use by system administrators during installation and setup of the iSCSI Disaster Recovery solution for iSCSI replication.

Readers of this guide are expected to be familiar with the following topics:

- Celerra Manager and CLI interface
- Microsoft Windows server installation and administration
- Microsoft iSCSI Software Initiator and iSCSI technology
- Replication Manager 5.0.3
- Installation and administration of applications such as NTFS, Microsoft SQL Server, and Microsoft Exchange Server

Organization

Here is a list of where information is located in this guide.

Chapter 1, Getting Started, provides information necessary before beginning the disaster recovery procedures. It includes minimum system requirements and a high-level overview of the entire process.


Chapter 2, Setting up Disaster Recovery, explains the procedures necessary to install and prepare systems for disaster recovery, including tips for Windows servers, Celerra Network Servers, and software applications. The setup procedures here may differ from setup procedures described in Replication Manager or Celerra Network Server documentation; they are tailored specifically to provide a disaster recovery configuration.

Chapter 3, Failing Over, explains how to start serving live data from the disaster recovery system at the secondary data center in the event of a catastrophic failure of the production system in the primary data center.

Chapter 4, Preparing for Failback, explains how to set up a new production system that can serve live data when you fail back from the disaster recovery system.

Chapter 5, Failing Back and Final Recovery, explains how to serve live data from the new production system at the primary data center and then set up disaster recovery again at the secondary data center.

Appendix A, Celerra Worksheet, contains a worksheet that you should copy for each Celerra used in the DR environment. The worksheet captures essential information used in the failover and failback phases for both primary and secondary data centers.

Appendix B, Cluster Worksheet, contains a worksheet that you should copy for each Windows host used in the DR environment. The worksheet captures essential information used in the failover and failback phases for both primary and secondary data centers.

Appendix C, Failover Support Worksheet, contains a worksheet that you should copy for each Windows host that you want to fail over during a disaster recovery procedure. The worksheet organizes drive, LUN, and source/destination information to ensure an effective failover of your host applications.

Appendix D, Mounting Remote Replicas, contains instructions for mounting replicas on remote mount hosts. Remote replicas can be used for further processing outside the context of disaster recovery; mounting them requires specific steps outlined in this appendix, in support of procedures in the two setup chapters.

Related documentation
Related documents required for this solution are listed under Limitations and minimum system requirements on page 22.


Conventions used in this guide

EMC uses the following conventions for notes, cautions, warnings, and danger notices.
Note: A note presents information that is important, but not hazard-related.

CAUTION
A caution contains information essential to avoid data loss or damage to the system or equipment. The caution may apply to hardware or software.

WARNING
A warning contains information essential to avoid a hazard that can cause severe personal injury, death, or substantial property damage if you ignore the warning.

DANGER
A danger notice contains information essential to avoid a hazard that will cause severe personal injury, death, or substantial property damage if you ignore the message.

Typographical conventions
EMC uses the following type style conventions in this document:
Normal — Used in running (nonprocedural) text for:
- Names of interface elements (such as names of windows, dialog boxes, buttons, fields, and menus)
- Names of resources, attributes, pools, Boolean expressions, buttons, DQL statements, keywords, clauses, environment variables, filenames, functions, utilities
- URLs, pathnames, filenames, directory names, computer names, links, groups, service keys, file systems, notifications

Bold — Used in running (nonprocedural) text for:
- Names of commands, daemons, options, programs, processes, services, applications, utilities, kernels, notifications, system calls, man pages
Used in procedures for:
- Names of interface elements (such as names of windows, dialog boxes, buttons, fields, and menus)
- What the user specifically selects, clicks, presses, or types

Italic — Used in all text (including procedures) for:
- Full titles of publications referenced in text
- Emphasis (for example, a new term)
- Variables

Courier — Used for:
- System output, such as an error message or script
- URLs, complete paths, filenames, prompts, and syntax when shown outside of running text

Courier bold — Used for:
- Specific user input (such as commands)

Courier italic — Used in procedures for:
- Variables on the command line
- User input variables

< >  Angle brackets enclose parameter or variable values supplied by the user
[ ]  Square brackets enclose optional values
|    Vertical bar indicates alternate selections; the bar means "or"
{ }  Braces indicate content that you must specify (that is, x or y or z)
...  Ellipses indicate nonessential information omitted from the example

Where to get help

EMC support, product, and licensing information can be obtained as follows.

Product information
For documentation, release notes, software updates, or for information about EMC products, licensing, and service, go to the EMC Powerlink website (registration required) at:

http://Powerlink.EMC.com

Technical support
For technical support, go to EMC Customer Service on Powerlink. To open a service request through Powerlink, you must have a valid support agreement. Please contact your EMC sales representative for details about obtaining a valid support agreement or to answer any questions about your account.

Your comments
Your suggestions will help us continue to improve the accuracy, organization, and overall quality of the user publications. Please send your opinion of this guide to:

techpub_comments@EMC.com



Getting Started

This chapter outlines phases of the iSCSI Disaster Recovery (DR) solution for a clustered Replication Manager Celerra environment from setup through failover and failback. Key terms and concepts are explained.

Overview ..... 20
Limitations and minimum system requirements ..... 22
Using this guide ..... 25


Overview
The EMC NAS Celerra Network Server Replication Manager 5.0.3 iSCSI Clustered Disaster Recovery Solution Implementation Guide is designed to guide you through the steps necessary to implement a disaster recovery solution in a clustered environment. A clustered environment is one that uses Microsoft Cluster Services. A separate DR solution exists for environments that do not use Microsoft Cluster Services.

This guide also contains specific tips for environments running Microsoft SQL Server 2000 or 2005 and Microsoft Exchange Server 2003. Tips specific to these applications are called out under subheadings such as the following:

Microsoft SQL Server
If you are protecting a SQL Server 2000 or 2005 application in DR, make sure you read the information under this heading as it occurs throughout the document.

Microsoft Exchange Server
If you are protecting a Microsoft Exchange Server 2003 application in DR, make sure you read the information under this heading as it occurs throughout the document.


Disaster recovery phases

Chapters 2 through 5 in this guide correspond to the phases in the disaster recovery solution as shown in Figure 1 on page 21.
[Figure: flowchart of the four disaster recovery phases. SETTING UP DISASTER RECOVERY: install hardware and software, establish trust, create replication jobs. FAILING OVER: prepare DR system, fail over live data to DR system. PREPARING FOR FAILBACK: install new production system, establish trust, create replication jobs. FAILING BACK AND FINAL RECOVERY: fail back from DR system to new production system, initiate replication from new production system back to DR system.]

Figure 1    Phases of disaster recovery solution

Each DR phase consists of a sequence of steps and procedures summarized in Figure 1 and detailed in the following chapters. To successfully complete all phases of the DR solution, ensure that your environment meets the minimum requirements outlined under Limitations and minimum system requirements on page 22.


Limitations and minimum system requirements


Disaster recovery requires a defined set of hardware, software, documentation, and networking components. Before using the procedures in this guide, ensure that you understand the limitations and minimum system requirements:

Limitations

The Windows cluster in the secondary data center must be dedicated to DR. You cannot use it for serving any primary applications or data. This is because, after disaster recovery is complete, you must reformat the clustered nodes and reinstall Windows. If you use it to serve other data you will lose that data.

CAUTION
This guide does not describe a procedure to test the disaster recovery setup. The failover process described in Chapter 3, Failing Over, assumes that the primary site is unavailable. In addition, the failback procedure described in Chapter 4, Preparing for Failback, assumes that all of the relevant data must be replicated back to the primary site before application services can be restored at that site.

This procedure refers to two clustered systems, each with multiple network names and IP address resources associated with the cluster. To successfully complete this procedure, ensure the following condition is met:

The failover and recovery process requires that the DNS system only return one IP address per hostname, and one hostname per IP address. During this procedure you will be creating a new IP address and network name resources in each cluster. Make sure that the DNS system is updated to reflect the appropriate mappings.
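For example, you can confirm the one-to-one mapping from a command prompt with nslookup (an illustrative check only; the cluster name and address shown are hypothetical):

C:\> nslookup sdccluster
C:\> nslookup 10.0.2.10

The forward lookup should return exactly one address, and the reverse lookup exactly one name; if either returns multiple entries, correct the DNS records before proceeding.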

Note: Windows 2000 is not supported in clustered implementations with iSCSI.

CAUTION
Make sure that your primary and secondary RM servers have a method (for example, DNS) to resolve the name of their peer server into a network address. This procedure assumes, and requires, that the primary and secondary RM servers can address each other by name; it will not work properly if this assumption is not met. If this setup is not possible in your environment, please contact your EMC support representative.

Hardware
The following hardware should be installed in both the primary and secondary data centers. The DR system configuration should match the production system. Hardware differences between systems are not supported.

- A Windows cluster, configured for the applications you are running and, in the case of the secondary data center, dedicated only to the DR function
- An EMC Celerra Network Server with at least one Data Mover
- An IP network between the two data centers

Software

The following software is required in both the primary and secondary data centers. The DR system software should exactly match the production system software. Software revision differences between the two systems are not supported. For example, if you run Windows Server 2003 on the production server, you cannot run Windows 2000 on the DR server.

- Celerra Network Server 5.5
- Microsoft Windows Server 2003 SP1 or SP2
- Latest Microsoft patches on servers, including hotfixes KB891957 and KB898790 for Windows Server 2003
- Microsoft iSCSI Software Initiator 2.0 or later
- EMC Solutions Enabler 6.3.2.0 or later
- EMC Replication Manager 5.0.3

Documentation

This guide refers to the following documents. If you are not familiar with the referenced products and processes, ensure availability of these documents before attempting these procedures:

- Best Practices for Celerra iSCSI: Considerations to Understand When Deploying Celerra iSCSI within Your Environment
- Configuring iSCSI Targets on Celerra
- DART 5.5 New and Enhanced Features - A Detailed Review
- EMC Solutions Enabler Version 6.3 Installation Guide


- EMC Replication Manager Version 5.0.3 Product Guide
- EMC Replication Manager Version 5.0.3 Administrator's Guide
- EMC Replication Manager Version 5.0.3 Release Notes
- Using Celerra Replicator for iSCSI Version 5.5

Microsoft Exchange Server only
- EMC Solutions for Exchange 2003 NS Series iSCSI - Best Practices Planning white paper
- EMC Solutions for Exchange 2003 NS Series iSCSI - Reference Architecture Guide

Microsoft SQL Server only
- EMC Solutions for Microsoft SQL Server 2005 EMC Celerra NS Series iSCSI - Applied Best Practices Guide
- EMC Solutions for Microsoft SQL Server 2005 NS Series iSCSI - Reference Architecture

EMC documentation is available from http://Powerlink.EMC.com. Other documents are the property of their respective vendors. Additional Microsoft and application documentation may be necessary for the installation and setup of your applications.


Using this guide


This guide assumes that you are comfortable in a system administrator role, administering and installing networks and software applications. It refers to various component documentation for details of some procedures, for example, on installing Windows operating systems or server applications. If you need additional information on procedures relating to the various hardware and software components involved in disaster recovery, refer to the related documentation. The guide details the exact sequence of steps and events that must take place for DR processes to operate successfully in the Replication Manager Celerra environment. It is important that each step be completed in its entirety as described.

Concepts and terms

This guide uses a predefined set of terms to describe disaster recovery concepts. Before using this guide, ensure that you understand the following concepts:

clustered environment - A data processing environment that uses Microsoft Cluster Services.

disaster - An event involving the total failure of the production system. This guide assumes that you must fail over live data because of a system failure at the primary data center.

disaster recovery (DR) - Includes the setup, failover, and failback procedures required to move live data from a failed production system to a disaster recovery system, and then back to a new or restored production system.

disaster recovery (DR) system - The components, consisting of a Windows server, Celerra Network Server, and associated software, located at a secondary data center, that replicate the data from the production system at the primary data center.

failback - The process by which live data on the disaster recovery system is failed over to a restored or new production system in a primary data center. Failback as used in this document does not imply a return to any preexisting state on the production cluster in the primary data center. It involves failing over from the disaster recovery cluster to the new production cluster once it is online.

failover - The process by which the replicated data set on any given server transitions to live data, due to failure of the system from which the replication was made. Initially, failover occurs from the primary data center to the secondary data center. In recovery, failover occurs from the secondary data center to the newly functional system at the primary data center.

live data - Data used in the live user environment, whether served by the production system at the primary data center or the disaster recovery system at the secondary data center.

primary data center - The physical location of the production system.

primary RM server - Host that is running Replication Manager server software and that controls the replication.

production host or cluster - Host or cluster that is running your database/application and contains all of the database production data.

production system - The components, consisting of a Windows cluster, Celerra Network Server, and associated software, at a primary data center. During a failure the production system becomes unavailable.

replica - The snapshot of live data that the disaster recovery system uses to assume the role of serving live data.

replication - The process by which Replication Manager creates snapshots of live data from the production system and copies that information to the disaster recovery system at the secondary data center.

secondary data center - The physical location of the disaster recovery system.

secondary RM server - Host that is running Replication Manager server software with a read-only configuration. The RM database is automatically kept synchronized with the primary RM server host's RM database.


Setting up Disaster Recovery

This chapter details the steps to set up a primary data center and a secondary data center for disaster recovery in a clustered environment.

Overview ..... 28
Step 1: Setting up the production cluster ..... 31
Step 2: Setting up the disaster recovery cluster ..... 41
Step 3: Setting up replication ..... 53


Overview
Successful disaster recovery begins with proper setup. At the primary data center that hosts the production system, you need to provide certain parameters to ensure a successful failover to the disaster recovery system in the event of failure at the primary data center. In the overall disaster recovery flowchart as shown in Figure 2, setup is the phase upon which all others depend:

[Figure: the four-phase flowchart from Figure 1, with the SETTING UP DISASTER RECOVERY phase (install hardware and software, establish trust, create replication jobs) highlighted as the current phase.]

Figure 2    Setup phase of disaster recovery solution


To successfully complete the failover phase, please ensure that your environment meets the minimum requirements as outlined under Limitations and minimum system requirements on page 22.

CAUTION
Failure to execute the setup steps as described in this chapter may cause problems during failover, including loss of data.

The setup phase consists of activities that take place at two different sites for two different systems, as shown in Figure 3. The primary data center (PDC) is the location where the production system resides, serving live data. The secondary data center (SDC) is the location where the disaster recovery system resides; it stores the replicas of the live data. Refer to Concepts and terms on page 25 for more information.


[Figure: the primary data center (PDC) hosts the production system: pdc-celerra (create iSCSI LUNs, create trust) and pdccluster (install RM). The secondary data center (SDC) hosts the disaster recovery system: sdc-celerra (create iSCSI LUNs, create trust) and sdccluster (install RM). An IP network connects the two sites.]

Figure 3    Setup locations and activities

Note: The example used in this document assumes that the RM server and application server are the same host. This is an acceptable configuration if you are protecting only one host. For larger configurations your RM server and application servers should be on separate hosts. Please see the EMC Replication Manager Version 5.0.3 Administrator's Guide for more information about deploying Replication Manager in your environment.


Step 1: Setting up the production cluster


The production cluster at the primary data center (PDC) needs to be configured according to the appropriate recommendations for the applications you are running.

Step 1a: Install Windows Server 2003 on each node and create a cluster
It is assumed that this procedure is well understood, so it is not covered in any detail here.

1. Make copies of Appendix B, Cluster Worksheet, for every cluster you are protecting using this DR procedure.

2. For the production cluster, note the location, contact, and identifying information on the worksheet and in Table 8, General cluster information, in Appendix B, Cluster Worksheet. For this example, the production cluster installed at the primary data center is pdccluster.

3. Create a cluster using the nodes that you have installed, with the quorum drive on the production site Celerra. In this example we call the Celerra at the production site pdc-celerra. The quorum drive should be on an iSCSI target separate from all of the data LUNs. Make a copy of Appendix A, Celerra Worksheet, for pdc-celerra and fill in Table 4, Celerra general information; then fill in the target data in Table 5, Celerra iSCSI targets, for the quorum drive using this example.

Determining the iSCSI target number
If the IQN for the Celerra iSCSI target is iqn.1992-05.com.emc:apm000550050290000-15, then the target number is 15 (the number after the final hyphen). Note it in the Number column. This target number should be unique per Data Mover and will help you fill out the other tables. Fill out the Alias and Data Mover columns for the iSCSI target.
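As an illustration only (not part of the documented procedure), the trailing target number can be extracted from an IQN at the Celerra Control Station shell:

$> echo "iqn.1992-05.com.emc:apm000550050290000-15" | awk -F- '{print $NF}'
15

The -F- option splits the IQN on hyphens, and $NF prints the last field, which is the target number.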


Step 1b: Install all recommended updates from Microsoft


Updates for Microsoft software are available using Windows Update. Before proceeding, install all recommended updates, along with any optional updates that apply to host systems. Updates are available from the following site:

http://windowsupdate.microsoft.com

Windows Server 2003
Verify that you have installed hotfixes KB891957 and KB898790. These two hotfixes address specific Windows Server 2003 issues when using this DR procedure.
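One way to verify the hotfixes on each node (a hedged sketch; wmic ships with Windows Server 2003, but your patch inventory tooling may differ) is to query the quick-fix engineering list:

C:\> wmic qfe where "HotFixID='KB891957'" get HotFixID,InstalledOn
C:\> wmic qfe where "HotFixID='KB898790'" get HotFixID,InstalledOn

An empty result means the hotfix is not installed on that node.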

Step 1c: Install Microsoft iSCSI Initiator


The minimum supported version for the Microsoft iSCSI Initiator is 2.0. The Initiator is used to connect to the shared iSCSI LUNs on the Celerra systems. Full details on installing and using the iSCSI Initiator can be found in the accompanying documentation.

Step 1d: Install EMC Solutions Enabler


Install Solutions Enabler v6.3.2.0 (edit level 787) or later from http://Powerlink.EMC.com. Solutions Enabler provides a common set of tools for communicating with other EMC products. Accept all default options during installation. Refer to the EMC Solutions Enabler Version 6.3 Installation Guide for more information.
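To confirm the installed Solutions Enabler version on each node, you can run the SYMCLI base command, which reports version information (an illustrative check, not a documented step; output format varies by release):

C:\> symcli

Verify that the reported version is 6.3.2.0 or later before proceeding.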

Step 1e: Create source LUNs


You need to set up an iSCSI LUN for each application that you want to protect in the DR environment. The following list gives an example of which LUNs need to be replicated:

- All LUNs containing files that are being shared from the nodes on this cluster
- All Microsoft SQL Server database and transaction log files. In most cases, these will be on separate iSCSI LUNs. All of these LUNs must be replicated.


- All Microsoft Exchange Server storage groups

Apart from the data LUNs, additional LUNs need to be created; these need not be replicated to the secondary data center. The additional LUNs are required for hosting the quorum drive and for storing the RM database, which is shared by all the nodes of the cluster. An illustrative creation command is shown after the note below.

After creating the iSCSI LUNs, record the following information in Appendix A, Celerra Worksheet:

- The Target #, LUN #, Hostname, and Drive letter information for each application LUN you want to replicate, in the source section of Table 7, Celerra LUN configuration.
- The iSCSI targets' information in Table 5, Celerra iSCSI targets.

Note: The quorum drive should be on an iSCSI target separate from all of the data LUNs.
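For illustration, a source LUN can be created from the pdc-celerra CLI with the same server_iscsi syntax shown for destination LUNs in Step 2e, but without the -readonly option, since source LUNs must be writable (the mover, target alias, file system, LUN number, and size below are hypothetical):

[root@pdc-celerra root]# server_iscsi server_2 -lun -number 25 -create Clust_tgt -size 25G -fs primary_site_iscsi
server_2 : done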

Step 1f: Perform Replication Manager pre-installation steps for Microsoft Clusters
Before installing the Replication Manager Server component in a MSCS environment, verify that:

- The appropriate Microsoft hotfixes are installed. The EMC Replication Manager Support Matrix provides more information.
- A supported version of MSCS is installed. The EMC Replication Manager Support Matrix provides the latest support information.
- One or more file systems are configured for failover where you can install the Replication Manager Server binaries and the internal database.
- The file systems configured for failover are in the same cluster resource group.
- The cluster resource group has a virtual IP address and a corresponding virtual name.

A quick command-line spot check of the last two items is shown below. For detailed steps of RM installation in a clustered environment, refer to the Replication Manager 5.0.3 Administrator's Guide.
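A hedged sketch using cluster.exe on Windows Server 2003 (the resource group name is hypothetical):

C:\> cluster group "RM Server Group" /status
C:\> cluster res

The group listing should show the resource group online, and the resource listing should include the group's shared disks, IP Address, and Network Name resources.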


Step 1g: Install Replication Manager


Replication Manager manages the replication jobs that replicate data from the production pdccluster to the sdccluster at the secondary data center (SDC). In a clustered environment:

- You install the Replication Manager Server component on only one node of the cluster.
- Replication Manager Server fails over automatically to the other node when the cluster fails over.

Replication Manager Agent and console components do not fail over with the cluster. If you install agent or console components on a clustered system, you should install them locally on all nodes.

Install Replication Manager 5.0 on the production cluster and choose Cluster as the type of install. The Replication Manager installation wizard displays the cluster preinstallation steps as shown in Figure 4, Installshield Wizard - Cluster pre-installation steps.

Figure 4

Installshield Wizard - Cluster pre-installation steps

During RM 5.0 installation, you must enter the path to a resource shared by all nodes of the cluster where the software can be installed.


Once the RM 5.0 basic installation completes, install RM Service Pack 3. Before installing Service Pack 3, fail over the RM server resource group to the node where the RM server was originally installed; then install the service pack on that node only. All upgrade changes automatically propagate to other nodes in the cluster if a failover occurs. The service pack setup prompts for the following information:

Note: You may notice in the EMC Replication Manager Version 5.0.3 Administrator's Guide that it is recommended that you upgrade the secondary RM server to Service Pack 3 before upgrading the primary RM server. In this document, in order to maintain a consistent flow, you upgrade the primary RM server first. Both upgrade sequences work. When you upgrade the primary server first, you may notice some warning messages in your server event log. These are expected, since the primary server is attempting to contact the secondary server, which has not yet been upgraded. You will also notice that the primary RM server stays in read-only mode until the secondary server is available. For more information, please see the EMC Replication Manager 5.0 Service Pack 3 Release Notes.


1. For the DR Server type (Secondary, Standalone, or Primary), choose Primary Server to designate this cluster as the primary RM server.

Figure 5

InstallShield Wizard - Designate primary server type


2. For the communication port, enter the desired communication port through which the primary RM server will communicate with the secondary RM server to synchronize data in their respective RM databases. The default communication port is 1964.

Figure 6

InstallShield Wizard - Enter communication port for servers

Step 1: Setting up the production cluster

37

Setting up Disaster Recovery

3. When asked if the secondary Replication Manager Server is installed, select No, the Secondary Server is not currently installed.

Figure 7

InstallShield Wizard - Secondary RM server is not currently installed

4. Verify the status of RM server by using the commands provided by the Replication Manager command line interface (CLI).
Example 1

Steps to verify status of RM Server

A set of commands is used to verify the status of the RM server. The syntax of those commands is as follows:

> rmcli connect host=<RM_host_name> port=<server_control_port_number>
> login user=<Administrator> password=<mypassword>
> dr-get-state

The following example illustrates the commands used to verify the RM server status:
C:\Program Files\EMC\rm\gui> rmcli connect host=rm2_host port=65432
Connected to rm2_host (port=65432).
RM-CLI : login user=Administrator password=mypassword


Login as user 'Administrator' successful.
RM-CLI : dr-get-state
0 PRIMARY ALONE

The status PRIMARY ALONE indicates that the current status of the RM server is primary and there is no secondary server configured for this server.
Note: After the secondary server is configured, the status of primary server will be PRIMARY ACTIVE.

Step 1h: Set up your user application


Note: The example used in this document assumes that the RM server and application server are the same host. This is an acceptable configuration if you are protecting only one host. For larger configurations your RM server and application servers should be on separate hosts. Please see the EMC Replication Manager Version 5.0.3 Administrators Guide for more information about deploying Replication Manager in your environment.

It is assumed that this procedure is well understood outside of the DR context. When installing each application, ensure that you select the appropriate iSCSI LUN drives for data storage during the installation process. The following sections outline specific considerations for applications in the DR environment.

NTFS
This procedure protects only shared data that resides on iSCSI LUNs on pdc-celerra.

Microsoft SQL Server
Your Microsoft SQL Server system databases may reside on local storage or on iSCSI LUNs if they do not need to be replicated. All databases that require protection must be on iSCSI LUNs on pdc-celerra. Refer to the published best practices from Microsoft regarding installation of Microsoft SQL Server on Windows 2003 clusters.

Microsoft Exchange Server
All of the Microsoft Exchange storage groups for the server must be located on iSCSI LUNs on pdc-celerra.


Refer to the published best practices from Microsoft regarding installation of Microsoft Exchange on Windows 2003 clusters.

Fill out the application information required for the sdccluster in Appendix B, Cluster Worksheet, in:

- Table 8, General cluster information
- Table 9, Application databases

CAUTION
The RM database LUN should not be used for any application data. This configuration is not supported and may negatively impact your ability to recover the system in the event of a disaster.

Step 1i: Optionally set up the production site mount host


In a clustered environment, you cannot mount replicas from the cluster to an individual cluster node for processing. Therefore, you must set up an RM mount host that is separate from the cluster to administer and mount replicas. The hardware requirements for the mount host are identical to the requirements for a single cluster node.

The mount host should be running Windows 2003 with EMC Solutions Enabler 6.3 and RM agent software installed. The mount host does not need to have any shared storage dedicated to it at this time, but it will need to have the Microsoft iSCSI Initiator installed. The mount host must not be a member of a cluster, and does not need to have cluster software installed. Replication Manager Agent software on the production host and corresponding mount host must be at the same version.
Note: The mount host in the example is pdc-mount.
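As an example of preparing the mount host's initiator, the Microsoft iscsicli utility can register the Celerra target portal and log in to a target (a hedged sketch; the portal address and IQN are hypothetical):

C:\> iscsicli QAddTargetPortal 10.0.1.20
C:\> iscsicli ListTargets
C:\> iscsicli QLoginTarget iqn.1992-05.com.emc:apm000550050290000-15

The same connectivity can also be configured through the iSCSI Initiator control panel applet.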


Step 2: Setting up the disaster recovery cluster


Setting up the disaster recovery cluster at the secondary data center is similar to setting up the production cluster. The cluster needs to be configured according to the appropriate recommendations for the application you are running.

Step 2a: Install Windows Server 2003 on each node and create a cluster
It is assumed that this procedure is well understood, so it is not covered in any detail here.

1. Make copies of Appendix B, Cluster Worksheet, for every cluster you are using in the DR environment.

2. For the disaster recovery cluster, note the location, contact, and identifying information on the worksheet in Table 8, General cluster information, in Appendix B, Cluster Worksheet. For this example, the disaster recovery cluster installed at the secondary data center is sdccluster.

3. Create a cluster using the nodes that you have installed, with the quorum drive on the DR site Celerra. The quorum drive should be on an iSCSI target separate from all of the data LUNs. Make a copy of Appendix A, Celerra Worksheet, for sdc-celerra and fill in Table 4, Celerra general information; then fill in the iSCSI target data in Table 5, Celerra iSCSI targets, for the quorum drive using this example.

Determining the iSCSI target number
If the IQN for the Celerra iSCSI target is iqn.1992-05.com.emc:apm000550050290000-15, then the target number is 15 (the number after the final hyphen). Note it in the Number column. This target number should be unique per Data Mover and will help you fill out the other tables. Fill out the Alias and Data Mover columns for the iSCSI target.

Step 2b: Install all recommended updates from Microsoft


Updates for Microsoft software are available using Windows Update. Before proceeding, install all recommended updates, along with any optional updates that apply to host systems.


Updates are available from the following site:

http://windowsupdate.microsoft.com

Windows Server 2003 servers
Verify that you have installed hotfixes KB891957 and KB898790. These two hotfixes address specific Windows Server 2003 issues when using this DR procedure.

Step 2c: Install Microsoft iSCSI Initiator


The minimum supported version for the Microsoft iSCSI Initiator is 2.0. The Initiator is used to connect to the shared iSCSI LUNs on the Celerra systems. Full details on installing and using the iSCSI Initiator can be found in the accompanying documentation.

Step 2d: Install EMC Solutions Enabler


Install Solutions Enabler v6.3.2.0 (edit level 787) or later from http://Powerlink.EMC.com. Solutions Enabler provides a common set of tools for communicating with other EMC products. Accept all default options during installation. Refer to the EMC Solutions Enabler Version 6.3 Installation Guide for more information.

Step 2e: Create destination LUNs


You need to set up a destination LUN for each application that you want to protect in the DR environment. The following list gives an example of which LUNs need to be replicated, and why:

- All LUNs containing files that are being shared from this server
- All Microsoft SQL Server database and transaction log files. In most cases, these will be on separate iSCSI LUNs. All of these LUNs must be replicated.
- All Microsoft Exchange Server storage groups

The following steps show you how to set up a LUN for one application.


CAUTION
All drive letters on pdccluster associated with data that will be replicated must be reserved on sdccluster and may not be used at this time.

Before you begin

As you go through the procedure in this step, notice that while you create replication destinations based on iSCSI LUNs, the replication jobs are created based on the applications that use those LUNs. For that reason, create all iSCSI destination LUNs for each application before you create a replication job for that application.

1. Determine the size of the LUN to replicate. The DR LUN must be the same size as the production LUN. The size was specified when you set up the LUN, but if you need to verify it, use the following command:

Note: All of the commands in step 1 should be run from pdc-celerra.

$> server_iscsi <movername> -lun -list

The resulting list displays the iSCSI LUNs configured for a specific Data Mover. Obtain the LUN number that you need for the next command:

$> server_iscsi <movername> -lun -info <lun_number>

The resulting information lists the LUN size and the target with which it is associated.

2. Create the target LUN from the CLI on sdc-celerra.

Note: All of the commands in step 2 should be run from sdc-celerra.

The command is as follows, as shown in Example 2:

$> server_iscsi <movername> -lun -number <lun_number> -create <target_alias_name> -size <size>[M|G|T] -fs <fs_name> -readonly yes

Note: The read-only option is required. It designates the LUN as a replication target; you can confirm this by getting LUN information for the newly created LUN. You can only create destination LUNs from the CLI, not from Celerra Manager.


Example 2

Creating iSCSI LUNs from sdc-celerra

[root@sdc-celerra root]# server_iscsi server_2 -lun -number 25 -create NonClust_tgt -size 25G -fs secondary_site_iscsi -readonly yes
server_2 : done
[root@sdc-celerra root]# server_iscsi server_2 -lun -number 35 -create NonClust_tgt -size 35G -fs secondary_site_iscsi -readonly yes
server_2 : done
[root@sdc-celerra root]#

Step 2f: Perform Replication Manager pre-installation steps for Microsoft Clusters
Before installing the Replication Manager Server component in a MSCS environment, verify that:

- The appropriate Microsoft hotfixes are installed. The EMC Replication Manager Support Matrix provides more information.
- A supported version of MSCS is installed. The EMC Replication Manager Support Matrix provides the latest support information.
- One or more file systems are configured for failover where you can install the Replication Manager Server binaries and the internal database.
- The file systems configured for failover are in the same cluster resource group.
- The cluster resource group has a virtual IP address and a corresponding virtual name.

For detailed steps for installing Replication Manager in a clustered environment, refer to the Replication Manager 5.0.3 Administrator's Guide.

Step 2g: Install Replication Manager


In a clustered environment:

- You install the Replication Manager Server component on only one node of the cluster.
- The Replication Manager Server fails over automatically to the other node when the cluster fails over.

Replication Manager Agent and Console components do not fail over with the cluster. If you install Agent or Console components on a clustered system, you should install them locally on all nodes.


Install Replication Manager 5.0 on the DR cluster and choose Cluster as the installation type. The RM installation wizard displays the cluster pre-installation steps, as shown in Figure 8, InstallShield Wizard - Cluster pre-installation steps.

Figure 8    InstallShield Wizard - Cluster pre-installation steps

During RM 5.0 installation you must enter the path to a resource shared by all nodes of the cluster where the software can be installed. Once the RM 5.0 base installation completes, install RM Service Pack 3. Before installing Service Pack 3, fail over the RM server resource group to the node where the RM server was originally installed; then install the service pack on that node only. All upgrade changes will automatically propagate to the other nodes in the cluster if a failover occurs.
Note: You may notice in the EMC Replication Manager Version 5.0.3 Administrator's Guide that it is recommended that you upgrade the secondary RM server to Service Pack 3 before upgrading the primary RM server. In this document, in order to maintain a consistent flow, you upgrade the primary RM server first. Both upgrade sequences will work. When you upgrade the primary server first, you may notice some warning messages in your server event log. These are expected, since the primary server is attempting to contact the secondary server, which has not yet been upgraded. You will also notice that the primary RM server stays in read-only mode until the secondary server is available. For more information, please see the EMC Replication Manager 5.0 Service Pack 3 Release Notes.

1. For the DR Server type (Secondary, Standalone, or Primary), choose Secondary Server to designate this cluster as the secondary RM server.

Figure 9    InstallShield Wizard - Designate secondary server type


2. For the communication port, enter the desired communication port through which the primary RM server will communicate with the secondary RM server to synchronize data in their respective RM databases. The default communication port is 1964.

Figure 10    InstallShield Wizard - Enter communication port for servers


3. When asked if the primary Replication Manager Server is installed, select Yes, the Primary Server is currently installed.

Figure 11    InstallShield Wizard - Primary RM server is currently installed


4. For the name and port number of the primary server, enter the name of the Primary Replication Manager Server and the communication port number that you entered for the Secondary RM server in Step 2g: Install Replication Manager, substep 2.

Figure 12    Replication Manager secondary server name and port

5. Verify the status of RM server by using the commands provided by the Replication Manager command line interface (CLI).
Example 3

Steps to verify the status of RM Server

A set of commands is used to verify the status of the RM server. The syntax of those commands is:

> rmcli connect host=<RM_host_name> port=<server_control_port_number>
> login user=<Administrator> password=<mypassword>
> dr-get-state

The following example illustrates the commands used to verify the RM server status:
C:\Program Files\EMC\rm\gui> rmcli connect host=rm2_host port=65432
Connected to 'rm2_host' (port=65432).


RM-CLI : login user=Administrator password=mypassword
Login as user 'Administrator' successful.
RM-CLI : dr-get-state
0 SECONDARY ACTIVE.

The status SECONDARY ACTIVE indicates that the RM server is currently a secondary and that a primary server is configured for it.

RM database synchronization

When installation completes, the primary RM server's database synchronizes with the secondary RM server's database. Larger RM databases may take longer to synchronize than smaller databases. Each time a replication job is created or run on the primary server, the primary and secondary databases synchronize. The secondary RM server's database is a read-only copy of the primary RM server's database. Therefore, any action on the secondary server that would require a database write, such as editing or running jobs, cannot be executed.


Step 2h: Install user applications


At this time you may set up the user-facing applications on the disaster recovery cluster. If you create a new network name or IP address resource, note the appropriate values for sdccluster in Table 8, General cluster information, of Appendix B, Cluster Worksheet. Refer to the application-specific notes in this section.

Microsoft SQL Server and NTFS

If you are running Microsoft SQL Server or NTFS, you may set up the application as described in the published best practices. Doing so now will save time if you need to fail over to the disaster recovery system. The configuration of the sdccluster SQL Server instances must match those on pdccluster.

Microsoft Exchange Server

You can install Microsoft Exchange Server on the DR cluster at this time by following the relevant instructions from Microsoft. Do not install any Virtual Server instances at this time, since they will be set up after failover.

Step 2i: Optionally set up the DR site mount host


In a clustered environment, you cannot mount replicas from the cluster to an individual cluster node for processing. Therefore, you must set up an RM mount host that is separate from the cluster to administer and mount replicas.

The hardware requirements for the mount host are identical to the requirements for a single cluster node. The mount host should be running Windows Server 2003 with EMC Solutions Enabler 6.3 and the RM Agent software installed. The mount host does not need any shared storage dedicated to it at this time, but it will need the Microsoft iSCSI Initiator installed. The mount host must not be a member of a cluster, and does not need cluster software installed. The Replication Manager Agent software on the production host and the corresponding mount host must be at the same version.


Note: The mount host in the example is sdc-mount.


Step 3: Setting up replication


Replication requires that you set up a relationship between the Celerra systems at both the primary and secondary data centers. Both systems must be running at least version 5.5 of the Celerra Network Server software.
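To confirm the software version, you can run the nas_version command from each Control Station; the output shown below is illustrative only, and your version string will differ:

[root@pdc-celerra root]# nas_version
5.5.30-2

Any version string of 5.5 or later satisfies the requirement for this procedure.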

Step 3a: Set up trust relationship


At this point, you have directed your effort at setting up two systems that are almost identical. Now you need to set up replication between them so that disaster recovery becomes possible. To do so, the production and disaster recovery Celerra systems must have a trust relationship. The trust relationship establishes bidirectional communication between the systems. The following steps summarize the procedure for the purposes of disaster recovery. Refer to Using Celerra Replicator for iSCSI for a complete description of how to set up a trust relationship.

1. Make copies of Appendix A, Celerra Worksheet, for pdc-celerra and sdc-celerra. If you have not already done so, fill out Table 4, Celerra general information, for each Celerra.

2. Determine a passphrase to be shared between systems. The passphrase must be the same for both pdc-celerra and sdc-celerra.

3. Run the following command as root on pdc-celerra:
#> nas_cel -name <cel_name> -create <ip> -passphrase <phrase_name>

where <cel_name> is the name of sdc-celerra, <ip> is the IP address of the primary Control Station on sdc-celerra, and <phrase_name> is the passphrase that you chose in the previous step. Note these values in corresponding columns of the pdc-celerra worksheet, in Table 6, Celerra trust relationships. Refer to Example 4 for an example illustration.
Example 4

Establish trust relationship from production Celerra to disaster recovery Celerra

[root@pdc-celerra root]# nas_cel -name sdc-celerra -create 10.6.29.126 -passphrase passphrase
operation in progress <not interruptible>...


id - 4
name - sdc-celerra
owner - 0
device -
channel -
net_path - 10.6.29.126
celerra_id - APM000548000410000
passphrase - passphrase
[root@pdc-celerra root]#

4. In the same way, establish trust from the disaster recovery Celerra in the secondary data center to the production Celerra in the primary data center. Run the command from the previous step from sdc-celerra, with the appropriate values.
#> nas_cel -name <cel_name> -create <ip> -passphrase <phrase_name>

where <cel_name> is the name of pdc-celerra, <ip> is the IP address of the primary control station on pdc-celerra, and <phrase_name> is the passphrase that you chose. Note these values in corresponding columns of the sdc-celerra worksheet, in Table 6 on page 122.

Step 3b: Create replication jobs


Repeat Step 3b: Create replication jobs for each application set containing data to be protected.

1. Start the Replication Manager Console on the primary RM server.

2. Right-click Hosts and select New Host to add the production host to the list of managed hosts. Add the SQL host name of the secondary data center to the list of managed hosts. The Exchange host name cannot be added because it has not been created at this time.
Note: For complete details on building a replication job, refer to the EMC Replication Manager Version 5.0.3 Product Guide. The following procedure highlights significant Replication Manager dialog boxes related to disaster recovery setup.

3. Right-click Application Sets and select New Application Set. Use the Application Set Wizard to configure an application set containing the production host's databases and/or file systems.


Microsoft SQL Server and Microsoft Exchange Server: You are prompted to supply an administrative username and password for these applications.

4. Right-click the application set and select Create job to create a replication job for the application set.

5. In the Replication Technology drop-down box, select Celerra iSCSI Copy as shown in the following figure, and then click Next.

Figure 13    Job Wizard - Job Name and Settings


6. In the Celerra replication storage panel, enter the name and Data Mover IP address for the secondary Celerra (sdc-celerra).

Figure 14    Job Wizard - Celerra replication storage

Note: When specifying the disaster recovery Celerra for the replication, use the IP address for the Data Mover where the destination iSCSI target is listening.

7. In the Job Wizard Mount Options panel, you can optionally set up this job to automatically mount the replica on another host. This can be used for various operations that are outside the scope of this procedure, including, but not limited to:

- Backup operations
- Consistency checking
- Reporting
- Disaster recovery testing


Figure 15    Job Wizard - Mount Options

Note: By default, the remote replica is mounted as a read-only device. There are options to mount it as a temporarily writable device. For more information, see the EMC Replication Manager Version 5.0.3 Administrator's Guide or Appendix D, Mounting Remote Replicas.

8. In the Users to be Notified panel, enter the email addresses of users to be notified when the job completes. This is a very important step in the Celerra disaster recovery process, because each email contains the following information:

- iSCSI Copy session names
- Destination LUN numbers

This information is invaluable later in the disaster recovery process, when you are required to manually fail over iSCSI Copy replication sessions and mask the destination LUNs.

9. Set up replication for your other applications at this time. The following points are important to remember:

- One of the available LUNs with the proper size will be selected from the destination machine, but the selection order is not guaranteed to be predictable. If your environment requires that a specific LUN number be used as the destination, ensure that when each replication job is set up, there is only one acceptable destination LUN available.
- While it is technically possible to replicate LUNs with more than one logical partition, this configuration is not supported.
- In a Microsoft SQL Server environment, you cannot replicate the system databases. Replicate user databases only.

10. Select Jobs in the tree panel. Individual jobs appear in the content panel. Right-click a job in the content panel and select Run. The job starts processing after you confirm that you want to run the job.
Figure 16    Viewing replicas in the content panel (a normal iSCSI snapshot replica and a clone replica that can be promoted to your host)

Step 3c: Verify replication


1. Run nas_replicate -list on each Celerra Network Server to verify that both servers know about the replication. The output should show one job for each volume that you are replicating, and each job should be tagged with the hostname that currently owns the volume as well as the drive letter on which that volume is mounted. Note this information in the Destination columns of the pdc-celerra worksheet in Table 7 on page 123. Refer to Example 5 on page 59 for examples.
Example 5

Replication information from pdc-celerra

login as: root
root@10.6.29.125's password:
Last login: Fri Jul 21 13:28:21 2006 from uscsgoliabl1c.corp.emc.com
EMC Celerra Control Station Linux
Tue Mar 28 12:02:02 EST 2006
[root@pdc-celerra root]# nas_replicate -list
server: Local Source
session_id fs29_T12_LUN23_APM00055005029_0000_fs32_T10_LUN23_APM00054800041_0000
application_label - RM_pdcserver_F:\
session_id fs29_T12_LUN22_APM00055005029_0000_fs32_T10_LUN22_APM00054800041_0000
application_label - RM_pdcserver_L:\
Local Destination
[root@pdc-celerra root]#

2. Optionally, enter the following command on each Celerra:

server_iscsi <mover_name> -lun -info <number>

For each LUN number that is a replication source or destination on that Celerra, you should see a representation of the replication job number (from nas_replicate -list) and an indicator that the LUN is snapped. This indicates that the LUN is part of the replication relationship. Refer to Example 6 and Example 7 on page 60 for examples.


Example 6

LUN information for sdc-celerra

server_2: Local Source Local Destination
session_id fs29_T12_LUN23_APM00055005029_0000_fs32_T10_LUN23_APM00054800041_0000
application_label - RM_pdcserver_F:\
session_id fs29_T12_LUN22_APM00055005029_0000_fs32_T10_LUN22_APM00044800041_0000
application_label - RM_pdcserver_L:\
[root@sdc-celerra root]# server_iscsi server_2 -lun -info 21
server_2 :
Logical Unit 21 on target rmse_dr_target: (Production)
fsid-32 size-20480MB alloc-0MB dense
path-/rm_fs_sdc/fs32_T10_LUN21_APM00054800041_0000/fs32_T10_LUN21_APM00054800041_0000 (snapped)
replication-destination
max_extension_size-0MB
[root@sdc-celerra root]# server_iscsi server_2 -lun -info 22
server_2 :
Logical Unit 22 on target rmse_dr_target: (Production)
fsid-32 size-20480MB alloc-0MB dense
path-/rm_fs_sdc/fs32_T10_LUN22_APM00054800041_0000/fs32_T10_LUN22_APM00054800041_0000 (snapped)
replication-destination
max_extension_size-0MB
[root@sdc-celerra root]# server_iscsi server_2 -lun -info 23
server_2 :
Logical Unit 23 on target rmse_dr_target: (Production)
fsid-32 size-20480MB alloc-0MB dense
path-/rm_fs_sdc/fs32_T10_LUN23_APM00054800041_0000/fs32_T10_LUN23_APM00054800041_0000 (snapped)
replication-destination
max_extension_size-0MB
[root@sdc-celerra root]#

Example 7

LUN information for pdc-celerra

[root@pdc-celerra root]# server_iscsi server_2 -lun -info 21
server_2 :
Logical Unit 21 on target rmse_dr_target: (Production)
fsid-29 size-20480MB alloc-0MB dense
path-/se_fs_sdc/fs29_T12_LUN21_APM00055005029_0000/fs29_T12_LUN21_APM00055005029_0000 (snapped)
replication-source
max_extension_size-89235MB
[root@pdc-celerra root]# server_iscsi server_2 -lun -info 22
server_2 :
Logical Unit 22 on target rmse_dr_target: (Production)
fsid-29 size-20480MB alloc-0MB dense
path-/se_fs_sdc/fs29_T12_LUN22_APM00055005029_0000/fs29_T12_LUN22_APM00055005029_0000 (snapped)
replication-source
max_extension_size-89235MB
[root@pdc-celerra root]# server_iscsi server_2 -lun -info 23
server_2 :
Logical Unit 23 on target rmse_dr_target: (Production)
fsid-29 size-20480MB alloc-0MB dense
path-/se_fs_sdc/fs29_T12_LUN23_APM00055005029_0000/fs29_T12_LUN23_APM00055005029_0000 (snapped)
replication-source
max_extension_size-89235MB
[root@pdc-celerra root]#

The disaster recovery solution is now set up and ready to respond in the event of a catastrophic failure at the primary data center. Continue to Chapter 3, Failing Over, to implement the failover phase of DR.


Chapter 3: Failing Over

This chapter contains the procedures for failing over the live data from a failed production system to the disaster recovery system in the secondary data center.

Failover overview .............................................................................. 64
Step 1: Preparing for application data failover .............................. 66
Step 2: Failing over the RM server ................................................... 70
Step 3: Promoting clone replica to production and failing over .. 71


Failover overview
This chapter details the procedures necessary to complete the Failover phase as shown in Figure 17.

[Figure shows the four phases: Setting Up Disaster Recovery (install hardware and software; establish trust; create replication jobs); Failing Over (prepare DR system; fail over live data to DR system); Preparing for Failback (install new production system; establish trust; create replication jobs); Failing Back and Final Recovery (fail back from DR system to new production system; initiate replication from new production system back to DR system); Done.]

Figure 17    Failover phase of disaster recovery solution

To successfully complete the failover phase of the disaster recovery solution, ensure that your environment meets the minimum requirements outlined under Limitations and minimum system requirements on page 22.


CAUTION Failure to follow the procedures exactly as outlined in this chapter may result in data corruption or loss.

The failover phase is shown in Figure 18. The primary data center (PDC) has failed and the live data services move to the secondary data center (SDC). Refer to Concepts and terms on page 25 for more information.

[Figure shows the primary data center (PDC, production system: pdcserver and pdc-celerra) and the secondary data center (SDC, disaster recovery system serving live data: sdcserver and sdc-celerra). During failover, Exchange and application services are configured on sdcserver.]

Figure 18    Failover locations and activities

CAUTION The failover procedure described in this chapter is not appropriate for testing DR setup. The procedure will fail over services to the DR site, and will require that you rebuild the production site hosts before they can be recovered. Do not perform this procedure on a production system unless you are actually recovering from a disaster.


Step 1: Preparing for application data failover


At this point in the procedure a disaster has caused the primary data center to cease normal operations.

Step 1a: Determine necessity of failover


Before deciding to fail over services, perform the following actions:

1. Determine if failover is appropriate for your situation. Failover is recommended if:
- the magnitude of the primary data center outage makes the overhead of failover worthwhile, or
- the primary data center has experienced data loss.

2. If you determine that failover is the necessary course of action, verify that the secondary data center is ready to accept the services that you need to fail over (a quick spot-check sketch follows this step):

a. Ensure that all required network services such as Active Directory, DNS, and NTP are up and running at the secondary data center. The exact list of necessary services depends on the applications being protected.

b. Identify the DR server at the secondary data center that will take over the responsibilities of the production server. Ensure that it meets the requirements stated in Limitations and minimum system requirements on page 22. In our example, the disaster recovery cluster at the secondary data center is called sdccluster. As detailed in Step 2: Setting up the disaster recovery cluster on page 41, there are several requirements for this system related to drive letter mapping and network name. These are not covered in detail here.

c. If the system at the primary site is running Microsoft Cluster Services, make sure that the disaster recovery site system is also running Microsoft Cluster Services.
Note: It is possible to successfully fail over a two-node primary cluster to a single node running Cluster Services at the DR site.
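A quick spot check of the supporting services might look like the following sketch; the names shown are hypothetical and should be replaced with your own DNS zone and cluster names:

C:\> nslookup sdccluster.corp.emc.com
C:\> w32tm /monitor

Successful name resolution and reachable time sources are necessary but not sufficient; confirm Active Directory health through your normal operational checks as well.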


Step 1b: Install user applications


If you did not install applications in Chapter 2, Step 2h: Install user applications on page 51 for any reason, install them now by following the instructions in that step.

Step 1c: Fill out sdccluster and sdc-celerra worksheets


In preparation for failing over sessions, transfer information from the pdccluster and pdc-celerra worksheets to the sdccluster and sdc-celerra worksheets. You will use this information when mapping drives and LUNs during failover.

1. From the pdccluster copy of Appendix B, Cluster Worksheet, copy the Drive letter, Application, and Purpose information to Table 9 on page 127 in the sdccluster worksheet.

2. As part of the failover process, the destination LUNs on sdc-celerra become source LUNs so that they can be used. In order to preserve information for failback, record this information in the worksheet for sdc-celerra. Table 7 on the worksheet for sdc-celerra should be empty at this time. In this table, fill in the Source Target #, LUN #, Drive Letter, and Purpose fields with information from the appropriate lines in Table 7 on page 123 on the worksheet for pdc-celerra. You are filling in the source target and LUN number fields for sdc-celerra with the destination target and LUN number fields from pdc-celerra.

3. In Table 7 of the sdc-celerra worksheet, write in the Source Hostname field the name of the host that will take over for the host listed on the same line of the pdc-celerra worksheet.

Step 1d: Fill out failover support worksheet


This step collects important information about the replication environment that will be necessary during failover.

1. Log in to sdc-celerra at the secondary data center.

2. Use the nas_replicate command to display information about your active replication sessions:
#nas_replicate -list


This command outputs a list of the current replication sessions, as shown in Example 8.
Example 8

nas_replicate -list command

[root@sdc-celerra root]# nas_replicate -list
server_2: Local Source Local Destination
session_id = fs23_T3_LUN4_APM00055005029_0000_fs23_T3_LUN14_APM00054800041_0000
application_label = RM_pdcserver_G:\
session_id = fs23_T3_LUN3_APM00055005029_0000_fs23_T3_LUN13_APM00054800041_0000
application_label = RM_pdcserver_F:\

Note: Certain session IDs are associated with drive letters. These are the drive letter locations where the LUNs are mounted on pdccluster, and they will need to be mounted in the same locations on sdccluster.

3. Make a copy of Appendix C, Failover Support Worksheet, for each host you want to fail over. In our example, the worksheet is for the failover of pdccluster to sdccluster.

4. On the worksheet, note the hostname and IP address for sdccluster in Table 10 on page 131. Record the correct device number with its associated LUN in Table 11 on page 131. The device number comes from the device detail tab for each logged-in target, while the drive letters and LUN numbers come from previously recorded information in the worksheet. Using the entries from Example 8, the destination information in the failover worksheet would be as shown in Table 1 on page 68:
Table 1    Destination examples for failover worksheet

Drive letter    Destination target/LUN    Device number
F:              13                        3
G:              14                        4

5. On the copy of Appendix B, Cluster Worksheet, for sdccluster, fill in Table 9 on page 127 from the sdc-celerra copy of Table 7 on page 123. Match the Drive letter in Table 9 to the Drive letter in Table 7 and fill in the target and LUN information in Table 9.


6. On the sdccluster copy of Appendix C, Failover Support Worksheet, note the Drive letter and Destination target/LUN number for each LUN on the host. Get this information from either sdc-celerra Table 7 or sdccluster Table 9. You will get the device number information later.


Step 2: Failing over the RM server


The Replication Manager command line interface (CLI) provides commands to fail over the RM server.
Note: This step assumes that the RM server is unavailable because of the disaster. If this is not true, you may skip this step.

Example 9

Steps to fail over the RM Server

The following example illustrates the commands used for RM server failover:
C:\Program Files\EMC\rm\gui> rmcli connect host=rm2_host port=65432
Connected to 'rm2_host' (port=65432).
RM-CLI : login user=Administrator password=mypassword
Login as user 'Administrator' successful.
RM-CLI : dr-set-primary
RM-CLI : dr-get-state
0 PRIMARY ALONE.

The status PRIMARY ALONE indicates that the current status of the RM server is primary and there is no secondary server configured for this server.


Step 3: Promoting clone replica to production and failing over


The LUNs on sdc-celerra are still set up as destinations for replication. This step will mount the LUNs as writable volumes on sdccluster.

Step 3a: Fail over appropriate replication sessions

CAUTION From this point you must complete the entire disaster recovery process to the end. After this step, you cannot force your environment back into its original state.

From sdc-celerra, run the nas_replicate -failover command for each of the session ID numbers listed in the nas_replicate -list output for pdc-celerra.

CAUTION When the disaster recovery Celerra Network Server (sdc-celerra) converts the LUNs to "writable," a message is sent to the source of that job to force it to be read-only. In a real disaster, this message will not go through to the source, or it will not matter, but in a test scenario or if an incorrect session ID is specified, then it may cause the data to go offline.

Example 10

nas_replicate -failover command

#nas_replicate -failover id=fs23_T3_LUN4_APM00055005029_0000_fs23_T3_LUN14_APM00054800041_0000

Once the sessions are failed over, they no longer appear in the nas_replicate -list output, and you will no longer be able to view the LUN number information that was collected in Step 1d: Fill out failover support worksheet on page 67.

Step 3b: Mask new LUNs to disaster recovery server


Since the new LUNs are now writable on the Celerra Network Server, use Celerra Manager or the CLI to mask the new LUNs so that sdccluster can mount them for recovery. Refer to Configuring iSCSI Targets on Celerra for more information.
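As an illustrative sketch only (Configuring iSCSI Targets on Celerra is the authority for the exact syntax on your release), a CLI mask that grants an initiator access to the failed-over LUNs might look like this; the target alias, initiator IQN, and LUN numbers are assumptions drawn from this guide's examples:

[root@sdc-celerra root]# server_iscsi server_2 -mask -set rmse_dr_target -initiator iqn.1991-05.com.microsoft:sdcnode1.corp.emc.com -grant 21-23

Repeat the grant for the initiator on each node of sdccluster so the LUNs remain reachable after a cluster failover.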

Step 3c: Promoting the iSCSI clone replica to Production


To promote the iSCSI clone replica:

1. Right-click the iSCSI clone replica and select Promote to Production as shown in the following figure.

Figure 19    EMC Replication Manager - Promoting the iSCSI clone replica to Production


2. The following confirmation message box appears and asks if you have manually mounted the replica to a new production host.

Figure 20    Confirmation for mounting the replica

3. Click Yes only if you have completed steps 3a and 3b successfully.

4. The Promote Replica dialog box appears. Enter the name of the new production host and application set on that host and click OK.

Figure 21    Promote Replica dialog box

Note: In the event of a disaster, you may want to disable schedules on the original application set.


5. The Promote Replica progress panel appears and displays current status of the replica promotion.

Figure 22    Promote Replica progress panel

Step 3d: Recover replicas


Replicas are now added to the newly created application set on the new production host. All of the replication data from pdccluster should be loaded, and you should be able to see the snapshots that are available for restore. To recover the replicas:

1. Mount the disaster recovery volumes on sdccluster.

a. From a command prompt, change to the following directory: C:\Program files\EMC\rm\client\bin>

b. Run the following command:
RmMountVolume.bat <device number> <drive letter>

Refer to Example 11 for example uses of the command.


Example 11

RmMountVolume commands

C:\Program files\EMC\rm\client\bin> RmMountVolume.bat 3 F:
C:\Program files\EMC\rm\client\bin> RmMountVolume.bat 4 G:

2. Log in to the RM Console of the current primary RM server.

3. Restore the replica of the newly created application set on the DR host. See the following application notes for settings to use.

NTFS: Use default options.

Microsoft Exchange Server: Use default options.

Microsoft SQL Server: Select the recovery option that leaves the database in an operational state.

Once this process completes without errors, continue to Step 3f: Recover applications.

Step 3e: Complete application setup


Microsoft SQL Server and NTFS

At this time, user application setup should be complete according to the instructions provided in Chapter 2, Step 2h: Install user applications on page 51.

Microsoft Exchange

In Chapter 2, Step 2h: Install user applications on page 51, you installed Microsoft Exchange Server following the appropriate documentation from Microsoft. However, you did not create any virtual server instances. Now create a Microsoft Exchange Virtual Server instance with the same name as the one originally on pdccluster. Use the newly available disks that were mounted on sdccluster during Step 3d: Recover replicas on page 74 as the required disk resources.

It may be necessary to change the IP address associated with the SMTP, HTTP, or other Exchange-related servers. Refer to Microsoft Knowledge Base article 315691, which covers conditions that can exist after changing the IP address of an Exchange server.

After creating the virtual server instance, stop any of the various Exchange cluster resources, except for the Exchange Information Store, that may be using the iSCSI LUNs containing the data for logging or other activities. For example, the MTA or SMTP instances may keep log data on the same drive as the Exchange databases. Your message tracking log may also use this drive. Stop these services before proceeding to the next step.

Step 3f: Recover applications


Replication Manager handles application recovery as part of the restore process. However, it is always a good idea to verify recovery. At this time your environment should be ready to service user requests.

NTFS: Verify that file data is available using methods appropriate to your environment.

Microsoft Exchange Server: Users should now be able to access their mailboxes.

Microsoft SQL Server: The database should now be available. You can check this using Microsoft SQL Server Management Studio. If you did not select the option to leave the database operational in Step 3d: Recover replicas on page 74, you will need to recover the database with the following Transact-SQL command:
restore database <databasename> with recovery

Run this command from the Microsoft SQL Server Management Studio application after the database has been attached.


Step 3g: Archive primary data center worksheets


At this point, the worksheets you filled out for pdccluster and pdc-celerra, and the failover support worksheet for the primary data center, no longer apply. These should be archived so that they do not create confusion during the failback phase. During failback you will create new failover support worksheets and worksheets for the new Celerra and Windows server in the primary data center.


Chapter 4: Preparing for Failback

This chapter describes the steps to fail back the production services to the primary data center from the disaster recovery system.

Overview ............................................................................................. 80
Step 1: Setting up the new production cluster ............................... 83
Step 2: Setting up replication ............................................................ 92


Overview
After the primary data center has been restored to full functionality, you need to engage in setup procedures to prepare for serving live data. This chapter contains procedures to prepare the production system for serving live data, as shown in Figure 23:

[Figure shows the four phases: Setting Up Disaster Recovery (install hardware and software; establish trust; create replication jobs); Failing Over (prepare DR system; fail over live data to DR system); Preparing for Failback (install new production system; establish trust; create replication jobs); Failing Back and Final Recovery (fail back from DR system to new production system; initiate replication from new production system back to DR system); Done.]

Figure 23    Failback preparation phase of disaster recovery solution

To successfully complete the failback phase of the disaster recovery solution, ensure that your environment meets the minimum requirements outlined under Limitations and minimum system requirements on page 22.


CAUTION Failure to follow the procedures exactly as outlined in this chapter may result in data corruption or loss.

The failback preparation phase is shown in Figure 24. The primary data center (PDC) has failed and the live data services have moved to the secondary data center (SDC) during failover. Refer to Concepts and terms on page 25 for more information.


[Figure shows the primary data center (PDC, production system: pdc-celerra, where iSCSI target LUNs are created, and pdccluster, where RM is installed) and the secondary data center (SDC, disaster recovery system: sdc-celerra, where replication jobs are created, and sdccluster), connected by an IP network for recovery setup.]

Figure 24    Failback preparation locations and activities

Note: The example used in this document assumes that the RM server and application server are the same host. This is an acceptable configuration if you are protecting only one host. For larger configurations, your RM server and application servers should be on separate hosts. Please see the EMC Replication Manager 5.0.3 Administrator's Guide for more information about deploying Replication Manager in your environment.

Note: While for the purposes of this example we are assuming that the new primary data center is the same site as the original primary data center, this is not a requirement.


Step 1: Setting up the new production cluster


When it is restored and ready, configure the new production cluster according to the recommendations for the applications you are running.

Step 1a: Install Windows Server 2003 and create a cluster


It is assumed that this procedure is well understood, so it is not covered in any detail here.

1. Make a fresh copy of Appendix B, Cluster Worksheet, for the new production cluster.

2. For the new production cluster, note the location, contact, and identifying information on the worksheet in Table 8 on page 127. For this example, we name the new production cluster installed at the primary data center pdccluster.

3. Create a cluster using the nodes that you have installed, with the quorum drive on the production site Celerra. In this example, we call the Celerra at the production site pdc-celerra. The quorum drive should be on an iSCSI target separate from all of the data LUNs.

4. Make a copy of Appendix A, Celerra Worksheet, for pdc-celerra and fill in Table 4, Celerra general information, for the quorum drive using this example.

Determining the iSCSI target number

If the IQN for the Celerra iSCSI target is iqn.1992-05.com.emc:apm000550050290000-15, then the target number is 15, the number following the final hyphen. Note it in the Number column. This target number should be unique per Data Mover and will help you fill out the other tables. Fill out the Alias and Data Mover columns for the iSCSI target.

Step 1b: Install all recommended updates


Updates for Microsoft software are available through Windows Update. Before proceeding, install all recommended updates, along with any optional updates that apply to your host systems.


Updates are available from the Microsoft Windows Update website:

http://windowsupdate.microsoft.com

Verify that you have installed hotfixes KB891957 and KB898790. These two hotfixes address specific Windows Server 2003 issues encountered when using this DR procedure.

Step 1c: Install Microsoft iSCSI Software Initiator


The minimum supported version for the Microsoft iSCSI Software Initiator is 2.0. The Initiator is used to connect to the shared iSCSI LUNs on the Celerra systems. Full details on installing and using the iSCSI Initiator can be found in the accompanying documentation.

Step 1d: Install EMC Solutions Enabler


Install Solutions Enabler v6.3.2.0 (edit level 787) or later from the EMC Powerlink website: http://Powerlink.EMC.com. Solutions Enabler provides a common set of tools for communicating with other EMC products. Accept all default options during installation. Refer to the EMC Solutions Enabler Version 6.3 Installation Guide for more information.

Step 1e: Create new destination LUNs


You need to set up a read-only LUN for each application that you want to fail back to the original production system.

CAUTION All drive letters on sdccluster associated with data that will be replicated must be reserved on the new pdccluster and may not be used at this time.

Before you begin

As you go through the procedure in this step, notice that while you create replication destinations based on iSCSI LUNs, the replication jobs are created based on the applications that use those LUNs. For that reason, create all iSCSI destination LUNs for each application before you create a replication job for that application.


1. Determine the size of the LUN to replicate. The destination LUN must be the same size as the source LUN. This was specified when you set up the LUN, but if you need to verify the size, use the following command:
Note: All of the commands in Step 1 should be run from sdc-celerra, which currently holds the live source LUNs.

$> server_iscsi <movername> -lun -list

The resulting list displays the iSCSI LUNs configured for a specific Data Mover. Obtain the LUN number that you need for the next command:

$> server_iscsi <movername> -lun -info <lun_number>

The resulting information lists the LUN size and the target with which it is associated.

2. Create the target LUN from the CLI on the new pdc-celerra.
Note: All of the commands in Step 2 should be run from the new pdc-celerra.

The command is as follows, as shown in Example 12:

$> server_iscsi <movername> -lun -number <lun_number> -create <target_alias_name> -size <size>[M|G|T] -fs <fs_name> -readonly yes
Note: The read-only option is required. It designates the LUN as a replication target. You can see this by getting LUN information for the newly created LUN. You can only create destination LUNs from the CLI, not from Celerra Manager.

Example 12

Creating iSCSI LUNs from the new pdc-celerra

[root@pdc-celerra root]# server_iscsi server_2 -lun -number 25 -create NonClust_tgt -size 25G -fs secondary_site_iscsi -readonly yes
server_2 : done
[root@pdc-celerra root]# server_iscsi server_2 -lun -number 35 -create NonClust_tgt -size 35G -fs secondary_site_iscsi -readonly yes
server_2 : done
[root@pdc-celerra root]#

Step 1f: Perform RM pre-installation steps for Microsoft Clusters

Perform the same pre-installation verification for the new production cluster as described in Step 2f: Perform Replication Manager pre-installation steps for Microsoft Clusters on page 44.


Step 1g: Install Replication Manager


In a clustered environment:

- You install the Replication Manager Server component on only one node of the cluster.
- The Replication Manager Server fails over automatically to the other node when the cluster fails over.

Replication Manager Agent and Console components do not fail over with the cluster. If you install Agent or Console components on a clustered system, you should install them locally on all nodes.

Install Replication Manager 5.0 on the new production cluster and then install Service Pack 3. The Service Pack setup prompts for the following information:

1. For the DR Server type (Secondary, Standalone, or Primary), choose Secondary Server to designate this cluster as the secondary RM server.

Figure 25    Replication Manager - Designate secondary RM server type


2. For the communication port, enter the desired communication port through which the primary RM server will communicate with the secondary RM server to synchronize data in their respective RM databases. The default communication port is 1964.

Figure 26    InstallShield Wizard - Enter communication port for servers


3. When asked if the primary Replication Manager Server is currently installed, select Yes, the Primary Server is currently installed.

Figure 27    Replication Manager - Primary RM server is currently installed


4. For the name and port number of the Primary Server, enter the name of the Primary Replication Manager Server which is now running at the DR location and the communication port number that you entered for the Secondary RM server in substep 2 of step 1g.

Figure 28    Replication Manager secondary server name and port

5. Verify the status of RM server by using the commands provided by the Replication Manager command line interface (CLI).
Example 13

Steps to verify the status of the RM Server

A set of commands is used to verify the status of the RM server. The syntax of those commands is:

> rmcli connect host=<RM_host_name> port=<server_control_port_number>
> login user=<Administrator> password=<mypassword>
> dr-get-state

The following example illustrates the commands used to verify the RM server status:
C:\Program Files\EMC\rm\gui> rmcli connect host=rm2_host port=65432


Connected to 'rm2_host' (port=65432).
RM-CLI : login user=Administrator password=mypassword
Login as user 'Administrator' successful.
RM-CLI : dr-get-state
0 SECONDARY ACTIVE.

The status SECONDARY ACTIVE indicates that the RM server is currently a secondary and that a primary server is configured for it.

Step 1h: Install user applications


At this time you may set up the user-facing applications on the new production cluster. If you create a new network name or IP address resource, note the appropriate values for the new pdccluster in Table 8, General cluster information, of Appendix B, Cluster Worksheet. Refer to the application-specific notes in this section.

Microsoft SQL Server and NTFS

If you are running Microsoft SQL Server or NTFS, you may set up the application as described in the published best practices.

Microsoft Exchange Server

You can install Microsoft Exchange Server on the new production cluster at this time by following the relevant instructions from Microsoft. Do not install any Virtual Server instances at this time, since they will be set up after failback.
Note: All of the drive letters on sdcserver that are associated with data that will be replicated must be reserved on the new pdcserver and may not be used at this time.

Step 1i: Optionally set up the production site mount host


In a clustered environment, you cannot mount replicas from the cluster to an individual cluster node for processing. Therefore, you must set up an RM mount host that is separate from the cluster to administer and mount replicas.

The hardware requirements for the mount host are identical to the requirements for a single cluster node. The mount host should be running Windows Server 2003 with EMC Solutions Enabler 6.3 and the RM Agent software installed. The mount host does not need any shared storage dedicated to it at this time, but it will need the Microsoft iSCSI Initiator installed. The mount host must not be a member of a cluster, and does not need cluster software installed. The Replication Manager Agent software on the new production host and the corresponding mount host must be at the same version.
Note: The mount host in the example is pdc-mount.


Step 2: Setting up replication


Replication requires a Celerra Network Server at both the primary and secondary data centers. Both systems must be running at least version 5.5 of the Celerra Network Server software.

Step 2a: Set up trust relationship


To start replication from the disaster recovery system at the secondary data center, set up trust between sdc-celerra and pdc-celerra. For a complete description of how to set up a trust relationship, refer to the Best Practices for Celerra iSCSI white paper. The following steps summarize the procedure for the purposes of disaster recovery:

1. Make a copy of Appendix A, Celerra Worksheet, for the new pdc-celerra and fill out Table 4 on page 122. Ensure that you also have the filled-out copy of Appendix A, Celerra Worksheet, for sdc-celerra. You will make notations on both during this procedure.

2. Determine a passphrase to be shared between systems. The passphrase must be the same for both pdc-celerra and sdc-celerra.
Note: If you use the same name for the new production Celerra as you used for the Celerra that was lost in the disaster event, your passphrase must be the same as it was in Step 3a: Set up trust relationship on page 53, recorded on the sdc-celerra Celerra worksheet. If you name the new production Celerra a different name from the original production Celerra, delete the trust between sdc-celerra and pdc-celerra and build new worksheet tables for both the new pdc-celerra and sdc-celerra.
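If you do need to delete the old trust entry, a minimal sketch (assuming the nas_cel -delete option available in your Celerra release, with pdc-celerra as the stale entry to remove) is:

[root@sdc-celerra root]# nas_cel -delete pdc-celerra

After deleting the stale entry, create the new trust as described in the steps that follow.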

3. Run the following command as root on the new pdc-celerra:


#> nas_cel -name <cel_name> -create <ip> -passphrase <phrase_name>

where <cel_name> is the name of sdc-celerra, <ip> is the IP address of the primary Control Station on sdc-celerra, and <phrase_name> is the passphrase that you chose in the previous step. Note these values in corresponding columns of the new pdc-celerra worksheet, in Table 6 on page 122.


Refer to Example 4 on page 53 for an example illustration.

If you are using a new name for the new production Celerra, run the command from the previous step from sdc-celerra with the appropriate values. If you are using the same name, continue to Step 2b: Create replication jobs.
#> nas_cel -name <cel_name> -create <ip> -passphrase <phrase_name>

where <cel_name> is the name of the new pdc-celerra, <ip> is the IP address of the primary control station on pdc-celerra, and <phrase_name> is the passphrase that you chose in the previous step. Note these values in corresponding columns of the sdc-celerra worksheet, in Table 6 on page 122. For examples of the values and entries in the worksheets, refer to Step 3a: Set up trust relationship on page 53.

Step 2b: Create replication jobs


Repeat Step 2b: Create replication jobs for each application set containing data to be failed back.

1. Start the Replication Manager Console on the primary RM server, which is on sdccluster.

2. Right-click Hosts and select New Host to add the new production host to the list of managed hosts. Add the SQL host name of the secondary data center to the list of managed hosts. The Exchange host name cannot be added because it has not been created at this time.
Note: For complete details on building a replication job, refer to the EMC Replication Manager Product Guide. The following procedure highlights significant Replication Manager dialog boxes related to disaster recovery setup.

3. Right-click Application Sets and select New Application Set. Use the Application Set Wizard to configure an application set containing the production host's databases and/or file systems. For Microsoft SQL Server and Microsoft Exchange Server, you are prompted to supply an administrative username and password.


4. Right-click the application set and select Create job to create a replication job for the application set.

5. In the Replication Technology drop-down box, select Celerra iSCSI Copy as shown in the following figure, and then click Next.

Figure 29    Job Wizard - Select Replication technology

6. In the Celerra replication storage panel, enter the name and Data Mover IP address for the primary Celerra (new pdc-celerra).
Note: When specifying the destination Celerra for the replication, use the IP address for the Data Mover where the destination iSCSI target is listening.


Figure 30    Job Wizard - Celerra replication storage

7. In the Job Wizard Mount Options panel, you can optionally set up this job to automatically mount the replica on another host. This can be used for various operations that are outside the scope of this procedure, including, but not limited to:

- Backup operations
- Consistency checking
- Reporting
- Disaster recovery testing


Figure 31    Job Wizard - Mount Options

8. In the Users to be Notified panel, enter the email addresses of users to be notified when the job completes. This is a very important step in the Celerra disaster recovery process, because each email contains the following information:

- iSCSI Copy session names
- Destination LUN numbers

This information is invaluable later in the disaster recovery process, when you are required to manually fail over iSCSI Copy replication sessions and mask the destination LUNs.

9. After you finish configuring the job, a summary screen displays. Review the information to ensure that your job has been set up properly.

10. Set up replication for your applications at this time. The following points are important to remember:

- One of the available LUNs with the proper size will be selected from the destination machine, but the selection order is not guaranteed to be predictable. If your environment requires that a specific LUN number be used as the destination, ensure that when each replication job is set up, there is only one acceptable destination LUN available.
- While it is technically possible to replicate LUNs with more than one logical partition, this configuration is not supported.
- In a Microsoft SQL Server environment, you cannot replicate the system databases. Replicate user databases only.

11. Select Jobs in the tree panel. Individual jobs appear in the content panel. Right-click a job in the content panel and select Run. The job starts processing after you confirm that you want to run the job.

12. When the job completes, select the application set from the tree panel to view all associated iSCSI clone and snapshot replicas.

Step 2c: Verify replication


1. Run nas_replicate -list on each Celerra Network Server to verify that both servers know about the replication. The output should show one job for each volume that you are replicating, and each job should be tagged with the hostname that currently owns the volume as well as the drive letter on which that volume is mounted. Note this information in the Destination columns of the sdc-celerra worksheet in Table 7 on page 123. Refer to Example 14 for examples.
Example 14  Replication information from sdc-celerra

login as: root
root@10.6.29.125's password:
Last login: Fri Jul 21 13:28:21 2006 from uscsgoliablic.corp.emc.com
EMC Celerra Control Station Linux
Tue Mar 28 12:02:02 EST 2006
[root@sdc-celerra root]# nas_replicate -list
server_2:
Local Source
session_id = fs29_T12_LUN23_APM00055005029_0000_fs32_T10_LUN23_APM00054800041_0000
application_label = RM_sdcserver_F:\
session_id = fs29_T12_LUN22_APM00055005029_0000_fs32_T10_LUN22_APM00054800041_0000
application_label = RM_sdcserver_L:\
Local Destination
[root@sdc-celerra root]#


Using the output from Example 14, the Destination information for sdc-celerra in Table 7 on page 123 would be as shown in the following table.
Table 2  Destination examples for sdc-celerra worksheet

Source information already recorded               Destination: Celerra # / Target #    Destination: LUN #
sdcserver, drive letter F, target 12 and LUN 23   1/10                                 23
sdcserver, drive letter L, target 12 and LUN 22   1/10                                 22

2. Optionally verify that the Replication Manager database on the secondary RM server has been updated for the jobs. Open the RM console from the new pdccluster and verify that the information about jobs and application sets is in sync with that of the primary RM server.

3. Optionally enter the following command on each Celerra Network Server:
server_iscsi server_2 -lun -info <number>

For each LUN number that is a replication source or destination on that Celerra Network Server, you should see a representation of the replication job number (from nas_replicate -list) and an indicator that the LUN is snapped. This indicates that the LUN is part of the replication relationship. Refer to Example 6 on page 60 and Example 7 on page 60 for examples. A minimal sketch of this check appears below.
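As a sketch only, using the command form shown above with the LUN numbers from Example 14 (22 and 23 are examples; substitute the LUN numbers from your own worksheet):

[root@sdc-celerra root]# server_iscsi server_2 -lun -info 22
[root@sdc-celerra root]# server_iscsi server_2 -lun -info 23

Each command should report the replication job number and show the LUN as snapped.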

Step 2d: Verify worksheet information


At this point, verify that your worksheets are filled out as indicated:

Primary data center worksheets
• For pdccluster, only Table 8 on page 127 should be filled out.
• For pdc-celerra, the following tables should be filled out:
  - Table 4 on page 122
  - Table 5 on page 122
  - Table 6 on page 122
  Tables 3, 4, and 5 should be blank.


Secondary data center worksheets
• All tables should be filled out for both sdcserver and sdc-celerra.

The disaster recovery solution is now ready to move live data back to the primary data center. Continue to Chapter 5, Failing Back and Final Recovery, to implement the failback phase of DR.


Chapter 5
Failing Back and Final Recovery

This chapter contains the procedures for failing back live data from a disaster recovery system to a new production system. Final recovery sets up replication from the new production system back to the disaster recovery system.

Overview ........................................................................................... 102
Before you begin ............................................................................... 104
Step 1: Preparing to fail back replication sessions ........................ 105
Step 2: Failing back the RM server .................................................. 109
Step 3: Promoting clone replica to production and failing back .. 110
Step 4: Serving live data from the new production system .......... 114
Step 5: Setting up final recovery ...................................................... 117


Overview
At this point in the procedure, the primary data center should be recovered from the disaster. It is now ready to accept user requests to the applications that were transitioned during failover. This chapter details the procedures necessary to complete the failback phase as shown in Figure 32:

Figure 32  Failback and final recovery phase of disaster recovery solution

[Flowchart: SETTING UP DISASTER RECOVERY (install hardware and software; establish trust; create replication jobs) → FAILING OVER (prepare DR system; fail over live data to DR system) → PREPARING FOR FAILBACK (install new production system; establish trust; create replication jobs) → FAILING BACK AND FINAL RECOVERY (fail back from DR system to new production system; initiate replication from new production system back to DR system) → Done]

To successfully complete the failback and final recovery phase of the disaster recovery solution, ensure that your environment meets the minimum requirements outlined under Limitations and minimum system requirements on page 22.

CAUTION
Failure to follow the procedures exactly as outlined in this chapter may result in data corruption or loss.

The failback phase is shown in Figure 33. The primary data center (PDC), now prepared to serve live data, is configured to bring the live data online. The secondary data center (SDC) goes offline and sdccluster is removed from the network and reformatted. The disaster recovery setup then begins again with Step 2: Setting up the disaster recovery cluster on page 41.

Figure 33  Failback locations and activities

[Diagram: the primary data center (PDC, production system), containing pdcserver and pdc-celerra, connects over the IP network to the secondary data center (SDC, disaster recovery system), containing sdcserver and sdc-celerra. Activities shown at the PDC: configure application services, install RM. The replication arrow, labeled FAILING BACK, runs from the SDC to the PDC.]


Before you begin

• Make sure that all required network services such as Active Directory, DNS, and NTP are up and running at the primary data center. The exact list of necessary services depends on the applications being protected. (A quick sanity check is sketched after this list.)
• Identify the new production cluster at the primary data center that will be taking over the responsibilities of sdccluster. We call this cluster pdccluster. Refer to Limitations and minimum system requirements on page 22 for the requirements necessary for the new pdccluster.
• Refer to Chapter 2, Setting up Disaster Recovery, for the requirements related to drive letter mapping and network naming in the destination system (in failback, the destination is pdccluster). These are not covered in detail here.
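As a minimal sketch of such a check from a node of the new production cluster; the domain name corp.example.com and the domain short name corp are hypothetical placeholders for your own environment:

C:\> nslookup corp.example.com
C:\> ping pdc-celerra
C:\> net time /domain:corp

nslookup confirms that DNS resolution is working, ping confirms basic reachability of the Celerra, and net time confirms that a domain time source can be located.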


Step 1: Preparing to fail back replication sessions


This procedure takes the live data offline from the disaster recovery system and sets it up on the new production system.
Note: Clients will not be able to access data once it is taken offline in this procedure. Ensure that you have made proper arrangements for the interruption of application services.

Step 1a: Prepare data for failback


1. Before beginning the process of failing back the live data, take sdccluster offline to prevent user updates.

2. Run another replication from the secondary data center to the primary data center so that you do not lose any data.

3. Make sure the Replication Manager job runs only after all other replication jobs complete.

Step 1b: Prepare applications for failback


If, for any reason, you did not install applications during Step 1h: Install user applications on page 90 in Chapter 4, Preparing for Failback, install them on pdccluster-main now, following the instructions from that step.

Step 1c: Fill out new pdccluster and new pdc-celerra worksheets
In preparation for failing back sessions, transfer information from the sdccluster and sdc-celerra worksheets to the new pdccluster and pdc-celerra worksheets. You will use this information when mapping drives and LUNs during failover.

1. From the sdccluster copy of Appendix B, Cluster Worksheet, copy the Drive letter, Application, and Purpose information to Table 9 on page 127 in the new pdccluster worksheet.

2. As part of the failback process, the destination LUNs on pdc-celerra become source LUNs so that they can be used. In order to preserve information for setting up for failover, we need to record this information in the worksheet for the new pdc-celerra. Table 7 on page 123 on the worksheet for the new pdc-celerra should be empty at this time. In this table, fill in the Source Target #, LUN #, Drive letter, and Purpose fields with information from the appropriate lines in Table 7 on the worksheet for sdc-celerra. You are filling in the source target and LUN number fields for pdc-celerra with the destination target and LUN number fields from sdc-celerra.

3. In Table 7 of the sdc-celerra worksheet, write in the Source Hostname field the name of the host that will take over for the same host listed in the same line on the pdc-celerra worksheet.

Step 1d: Fill out failover support worksheet


This step collects important information about the replication environment that will be necessary during failback.

1. Log in to the new pdc-celerra at the primary data center.

2. Use the nas_replicate command to display information about your active replication sessions:

# nas_replicate -list

This command outputs a list of the current replication sessions, as shown in Example 15 on page 107.


Example 15  nas_replicate -list command

[root@pdc-celerra root]# nas_replicate -list
server_2:
Local Source
Local Destination
session_id = fs23_T3_LUN4_APM00055005029_0000_fs23_T3_LUN14_APM00054800041_0000
application_label = RM_sdcserver_G:\
session_id = fs23_T3_LUN3_APM00055005029_0000_fs23_T3_LUN13_APM00054800041_0000
application_label = RM_sdcserver_F:\

Note: Certain session IDs are associated with drive letters. These are the drive letter locations where the LUNs were mounted on sdccluster. They need to be mounted in the same locations on pdccluster.
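To pull just the drive-letter labels out of a longer listing, a quick filter on the Control Station can help; this is a sketch using standard grep against the output format shown in Example 15:

[root@pdc-celerra root]# nas_replicate -list | grep application_label
application_label = RM_sdcserver_G:\
application_label = RM_sdcserver_F:\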

3. Make a copy of Appendix C, Failover Support Worksheet, for each host you want to fail over. In our example, the worksheet is for the failover of sdccluster to pdccluster.

4. On the worksheet, note the hostname and IP address for sdccluster in Table 10 on page 131. Record the correct device number with its associated LUN in Table 11, Drive and LUN information. The device number comes from the device detail tab for each logged-in target, while the drive letters and LUN numbers come from previously recorded information in the worksheet. Using the output from Example 15, the destination information in the failover worksheet would be as shown in Table 3:
Table 3  Destination examples for failover worksheet

Drive letter    Destination target/LUN    Device number
F:              13                        3
G:              14                        4

5. On the copy of Appendix B, Cluster Worksheet, for sdccluster, fill in Table 9 on page 127 from the sdc-celerra copy of Table 7 on page 123. Match the Drive letter in Table 9 to the Drive letter in Table 7 and fill in the target and LUN information in Table 9.


6. On the new pdccluster copy of Appendix C, Failover Support Worksheet, note the Drive letter and Destination Target/LUN number for each LUN on the host. Get this information from either the new pdc-celerra Table 7 or the new pdccluster Table 9. You will get the device number information later.


Step 2: Failing back the RM server


The Replication Manager command line interface (CLI) provides commands to fail over the RM server.
Example 16  Steps to fail over the RM server

The following example illustrates the commands used for RM server failover:

C:\Program Files\EMC\rm\gui> rmcli connect host=rm2_host port=65432
Connected to 'rm2_host' (port=65432).
RM-CLI : login user=Administrator password=mypassword
Login as user 'Administrator' successful.
RM-CLI : dr-set-primary
RM-CLI : dr-get-state
0 PRIMARY ALONE.

The status PRIMARY ALONE indicates that the RM server is currently the primary and that no secondary server is configured for it.
Note: You may also want to refer to the EMC Replication Manager Version 5.0.3 Administrator's Guide for more information about changing the state of the RM server environment.


Step 3: Promoting clone replica to production and failing back


The LUNs on pdc-celerra are still set up as destinations for replication. This step will mount the LUNs as writable volumes on pdccluster.

Step 3a: Fail over appropriate replication sessions

CAUTION
From this point you must complete the entire disaster recovery process to the end. After this step, you cannot force your environment back into its original state.

From the new pdc-celerra, run the nas_replicate -failover command for each of the session ID numbers listed in the nas_replicate -list output for sdc-celerra.

CAUTION
When the disaster recovery Celerra Network Server (sdc-celerra) converts the LUNs to writable, a message is sent to the source of that job to force it to be read-only. In a real disaster, this message will not go through to the source, or it will not matter; but in a test scenario, or if an incorrect session ID is specified, it may cause the data to go offline.
# nas_replicate -failover id=fs23_T3_LUN4_APM00055005029_0000_fs23_T3_LUN14_APM00054800041_0000

Once the sessions are failed over (failed back), they no longer appear in the nas_replicate -list output. A sketch of failing over both sessions from Example 15 follows.
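As a sketch only, assuming the two session IDs shown in Example 15 (substitute the session IDs from your own nas_replicate -list output):

# nas_replicate -failover id=fs23_T3_LUN4_APM00055005029_0000_fs23_T3_LUN14_APM00054800041_0000
# nas_replicate -failover id=fs23_T3_LUN3_APM00055005029_0000_fs23_T3_LUN13_APM00054800041_0000
# nas_replicate -list

The final nas_replicate -list should no longer show either session.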

Step 3b: Mask new LUNs to new Production cluster


Since the new LUNs are now writable on the Celerra Network Server, use Celerra Manager or the CLI to mask the new LUNs so that pdccluster can mount them for recovery. Refer to Configuring iSCSI Targets on Celerra for more information.


Step 3c: Promoting the iSCSI clone replica


To promote the iSCSI clone replica:

1. Right-click the iSCSI clone replica and select Promote to Production as shown in the following figure.

Figure 34  EMC Replication Manager - Promote the iSCSI clone replica


2. The following confirmation message box appears, asking whether you have manually mounted the replica on a new production host.

Figure 35  Confirmation before promoting the replica

3. Click Yes only if you have completed Steps 3a and 3b successfully.

4. The Promote Replica dialog box appears. Enter the name of the new production host and the application set on that host, and click OK.

Figure 36  Promote Replica dialog box


5. The Promote Replica progress panel appears and displays the current status of the replica promotion.

Figure 37  Promote Replica progress panel


Step 4: Serving live data from the new production system


The data is now ready to be set up as live data on pdccluster.

Step 4a: Install user applications in DR Mode


NTFS and Microsoft SQL Server: These applications should have already been installed.

Microsoft Exchange Server: Consult the Microsoft Exchange Server documentation and perform these procedures:

1. Install Microsoft Exchange Server in disaster recovery mode.
2. Start the Microsoft Exchange information store.
3. Start the Microsoft Exchange system attendant services.

Step 4b: Recover replicas


Replicas are now added to the newly created application set on the new production host. All of the replication data from pdccluster should be loaded, and you should be able to see the snapshots that are available for restore. To recover the replicas:

1. Mount the disaster recovery volumes on pdccluster.

Windows Server 2003:

a. From the new production server, open a command prompt and change to the following directory:

C:\Program files\EMC\rm\client\bin>

b. Run the following command:
RmMountVolume.bat <device number> <drive letter>

Refer to Example 17 for example uses of the command.


Example 17  RmMountVolume commands

C:\Program files\EMC\rm\client\bin>RmMountVolume.bat 3 F:
C:\Program files\EMC\rm\client\bin>RmMountVolume.bat 4 G:


NTFS: Use default options.

Microsoft Exchange Server: Use default options.

Microsoft SQL Server: Please select the recovery option to leave the database in an operational state.

Once this process completes without errors, continue to Step 4d: Recover applications on page 116.

Step 4c: Complete application setup


Microsoft SQL Server and NTFS: At this time, user application setup should be complete according to the instructions provided in Chapter 2, Step 2h: Install user applications on page 51.

Microsoft Exchange: In Chapter 2, Step 2h: Install user applications on page 51, you installed Microsoft Exchange Server following the appropriate documentation from Microsoft. However, you did not create any virtual server instances. Now create a Microsoft Exchange Virtual Server instance with the same name as the one originally on pdccluster. Use the newly available disks, which were mounted on sdccluster during Step 3d: Recover replicas on page 74, as the required disk resources. It may be necessary to change the IP address associated with the SMTP, HTTP, or other Exchange-related servers. Refer to Microsoft Knowledge Base article 315691, which covers conditions that can exist after changing the IP address of an Exchange server.

After creating the virtual server instance, stop any of the various Exchange cluster resources, except for the Exchange Information Store, that may be using the iSCSI LUNs containing the data for logging or other activities. For example, the MTA or SMTP instances may keep log data on the same drive as the Exchange databases. Your message tracking log may also use this drive. Stop these services before proceeding to the next step; a sketch using the cluster.exe command line follows.
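This is a hedged sketch using the Windows Server 2003 cluster.exe command line; the resource names shown are hypothetical, so run cluster res first to see the actual resource names in your cluster:

C:\> cluster res
C:\> cluster res "Microsoft Exchange MTA Instance (EXVS1)" /offline
C:\> cluster res "SMTP Virtual Server Instance (EXVS1)" /offline

cluster res with no arguments lists all cluster resources and their states; the /offline switch takes the named resource offline.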


Step 4d: Recover applications


Replication Manager handles application recovery as part of the restore process. However, it is always a good idea to verify recovery. At this time your environment should be ready to service user requests.

NTFS: Please verify that file data is available using methods appropriate to your environment.

Microsoft Exchange Server: Users should now be able to access their mailboxes.

Microsoft SQL Server: The database should now be available. You can check this using Microsoft SQL Server Management Studio. If you did not select the option to leave the database operational in Step 4b: Recover replicas on page 114, then you will need to recover the database. The command is:
Microsoft SQL Server> restore database <databasename> with recovery

Run this command from the Microsoft SQL Server Management Studio application after the database has been attached.
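A hypothetical example, assuming a user database named SalesDB (substitute your own database name):

restore database SalesDB with recovery

Run this from a query window in SQL Server Management Studio; the statement brings a database that was restored without recovery online.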


Step 5: Setting up final recovery


The final step in the disaster recovery solution is to set up Replication Manager protection for the live data now being served from the new production system at the primary data center.

Step 5a: Verify and archive worksheets


At this point, you should have the following worksheets filled out:

• Celerra worksheet on page 122 for the new pdc-celerra, except for the Destination section in Table 7 on page 123. You will complete that section in the next step.
• Cluster worksheet on page 127 for the new pdccluster

The worksheets for sdccluster, sdc-celerra, and failover support no longer apply. They should be archived so that they do not create confusion. During the setup for DR, which you will complete in Step 5b: Set up protection for the new Production Server, you will create new worksheets for the Celerra and the reformatted Windows server in the secondary data center.

Step 5b: Set up protection for the new Production Server


Return sdccluster to a pre-installed state by performing the following steps:

a. Format the system disk.

b. Return to Chapter 2, Setting up Disaster Recovery, and follow the procedures beginning with Step 2: Setting up the disaster recovery cluster on page 41, finishing through the end of the chapter.

You have now returned to a fully recovered state in which disaster recovery is implemented.

CAUTION
EMC does not recommend allowing the disaster recovery server to boot on the network before it is reinstalled. Doing so could cause network address and namespace conflicts.


Appendix A
Celerra Worksheet

This appendix is designed to help organize configuration information for each Celerra Network Server in the DR environment.

Filling out the worksheet ................................................................ 120
Celerra worksheet ............................................................................ 122


Filling out the worksheet


1. Make a copy of Appendix A, Celerra Worksheet, for every Celerra in your DR environment.

2. Fill out the information required for each Celerra as described in Chapter 2, Setting up Disaster Recovery, and Chapter 4, Preparing for Failback.

3. Keep completed worksheets wherever you keep critical system recovery information in the secondary data center. Use them in the event of failover. When the worksheets become obsolete, the procedures will instruct you when to discard them.

Worksheet definitions
Use the following definitions to help you fill out each item in the worksheet appropriately:

Table 4
• Control Station hostname: The network hostname of the Celerra Control Station to which this worksheet refers.
• Control Station IP address: The Internet Protocol (IP) address of the Celerra Control Station to which this worksheet refers.
• Celerra location and contact information: Specific information related to your environment detailing where this system can be found physically. For example: Location: Primary Site, Data Center 1, Row 8, Cabinet 4. Contact: pdc-sysadmins@company.com, or John Doe.

Table 5
• Number: The unique target number for the iSCSI target being recorded. Refer to Step 3b: Create replication jobs on page 54 for an example of finding the iSCSI target number.
• Alias: The target alias name assigned during iSCSI target creation.
• Data Mover: The Celerra Data Mover associated with this target.
• Notes: Any additional information that may be necessary. For example, if you have a target that publishes data to other hosts, it may be important not to change certain LUN masks associated with that target.

Table 6
• Remote Celerra name: The hostname of the remote Celerra Control Station. This will match the information in Table 4 on page 122 on the worksheet for that Celerra system.


• Remote Celerra IP address: The Internet Protocol (IP) address of the remote Celerra Control Station. This will match the information in Table 4 on page 122 on the worksheet for that Celerra system.
• Passphrase: The passphrase associated with the trust relationship. This is described in Step 3a: Set up trust relationship on page 53.

Table 7
• Source target #: The target number from Table 5 that is associated with the LUN being described.
• Source LUN #: The LUN number of the LUN being described.
• Source hostname: The hostname that should have access to, or that is using, this LUN.
• Source drive letter: The drive letter where this LUN is mounted on the host.
• Destination Celerra # / Target #: The target number (from Table 5 on the destination Celerra worksheet) of the remote target where this LUN is being replicated.
• Destination LUN #: The LUN number of the remote LUN where this LUN is being replicated.
• Purpose: Any additional or important information regarding this LUN that may be important to your environment. For example, it may be important to note that a specific LUN holds home directory data, or the files for a particular project.


Celerra worksheet
Table 4  Celerra general information

Control Station hostname:
Control Station IP address:
Celerra location and contact information:

Table 5  Celerra iSCSI targets

Number    Alias    Data Mover    Notes

Table 6  Celerra trust relationships

#     Remote Celerra name    Remote Celerra IP address    Passphrase
1.
2.


Table 7  Celerra LUN configuration

SOURCE                                               DESTINATION
Target #    LUN #    Hostname    Drive letter       Celerra # / Target #    LUN #       PURPOSE


Appendix B
Cluster Worksheet

This appendix is designed to capture configuration information for each Windows cluster that you may need to fail over during a catastrophic event.

Filling out the worksheet ................................................................ 126
Cluster worksheet ............................................................................ 127


Filling out the worksheet


1. Make a copy of Appendix B, Cluster Worksheet, for every Windows cluster in your DR environment.

2. Fill out the information required for each server as described in Chapter 2, Setting up Disaster Recovery, and Chapter 4, Preparing for Failback.

3. Keep completed worksheets wherever you keep critical system recovery information in the secondary data center. Use the worksheets in the event of failover. When the worksheets become obsolete, the procedures will instruct you when to discard them.

Worksheet definitions
Use the following definitions to help you fill out each item in the worksheet appropriately:

Table 8
• Hostname: The network name for the server being described.
• IP address: The IP address for the server being described.
• Additional network identities: It is possible for a single host to have multiple network identities. If this is the case for your server, this section will be filled out.

Table 9
• Application database: The application database being protected that uses data on the LUN being described.
• Drive letter: The drive letter that the application database is using to access data.
• Target #: The number of the Celerra iSCSI target that is publishing the LUN being used.
• LUN #: The LUN number being used by the application database.
• Purpose: Any additional information that may be important to your environment. For example, E: may hold a mailstore for one department, while F: holds a mailstore for another department.


Cluster worksheet
Server Location and Contact Information:

Table 8  General cluster information

Cluster hostname:
IP address:

Cluster node identities
Node name        IP address       Notes

Additional network identities
Network name     IP address       Notes


Table 9  Application databases

Application database    Drive letter    Target #    LUN #    Purpose


Appendix C
Failover Support Worksheet

This appendix is designed to capture failover information for each host that you are failing over during a catastrophic event.

Filling out the worksheet ................................................................ 130
Failover support worksheet ............................................................ 131


Filling out the worksheet


1. Make a copy of Appendix C, Failover Support Worksheet, for every host you are failing over in your DR environment.

2. Fill out the information required for each host as described in Chapter 3, Failing Over, and Chapter 5, Failing Back and Final Recovery.

3. Keep completed worksheets wherever you keep critical system recovery information in the secondary data center. You will need the worksheets when you fail back to a new production server. When the worksheets become obsolete, the procedures will instruct you when to discard them.

Worksheet definitions
Use the following definitions to help you fill out each item in the worksheet appropriately:

Table 10
• Hostname: The network name for the server being described.
• IP address: The IP address for the server being described.

Table 11
• Drive letter: The drive letter where the LUN has been mounted on the production server, and where it will be mounted on the disaster recovery server.
• Destination target / LUN number: Contains the target number (if more than one target is active) and the LUN number that should be associated with the drive letter on the disaster recovery server.
• Device number: The device number that the disaster recovery server detects for the LUN.


Failover support worksheet


Table 10  Host information

Hostname:
IP address:

Table 11  Drive and LUN information

Drive letter    Destination target/LUN number    Device number


Appendix D
Mounting Remote Replicas

This appendix contains a high-level outline of the steps you need to take if you are mounting remote replicas for Microsoft Exchange or SQL Server.

Mounting remote replicas............................................................... 134


Mounting remote replicas


In some situations it may be necessary to mount a remote replica on a host for further processing outside the context of disaster recovery. Generally this is accomplished by selecting mount options while you are creating a replication job. Refer to Step 3: Setting up replication on page 53, and Step 2: Setting up replication on page 92, for detailed procedures. Keeping the same example system names as in the main text of this procedure, the following steps describe a high-level overview of mounting remote replicas.

1. Mount a LUN that has been replicated by pdccluster from pdc-celerra to sdc-celerra on another server, referred to as the Mount Host. In other words, mount the replica on a different server. The source for the LUN will be sdc-celerra, and the mountpoint may be different from what it was on pdccluster. This will not impact application access to the original data on pdccluster.
Note: This scenario would be appropriate if you needed to remount the image in order to validate that the data was replicated properly, or to do data mining against the remote replica.

When you create the replication job, the mount options screen requires values for: Mount Host, Mount Path, and Mount Option. These are explained at length in the EMC Replication Manager 5.0.3 Product Guide. Refer to Figure 38 on page 135 for examples.


Figure 38  Mount options example

Note: For Microsoft Exchange, the mounted Exchange replica will have to be tested to ensure that it is valid. Please see the appropriate Exchange documentation to ensure that your Mount Host has eseutil.exe installed.

2. Before you execute the replication job, ensure that your Mount Host will be able to access the destination LUN. Use Celerra Manager to mask the appropriate destination LUNs to the Mount Host, and use the iSCSI Initiator on the Mount Host to log in to the appropriate iSCSI target on sdc-celerra (a sketch follows the note below).
Note: If either of these steps is not completed, your replication job will fail, and the log will indicate that the remote replica is not accessible by the Mount Host. Please correct the issues and ensure that your replication job completes successfully. An unsuccessful replication job will not produce a remote replica that will be usable in the event of a disaster.
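As a minimal sketch of the Mount Host login using the Microsoft iSCSI Initiator command line (iscsicli); the portal address 10.6.29.200 and the target IQN shown here are hypothetical placeholders for your own Data Mover IP address and target name:

C:\> iscsicli QAddTargetPortal 10.6.29.200
C:\> iscsicli ListTargets
C:\> iscsicli QLoginTarget iqn.1992-05.com.emc:apm000548000410000-10

QAddTargetPortal registers the Data Mover portal, ListTargets shows the target IQNs discovered there, and QLoginTarget logs in to the chosen target.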
