
Using Celerra Replicator (V1)

P/N 300-004-184 Rev A08

Version 5.6.47
December 2009

Contents
Introduction to Celerra Replicator . . . 5
    Terminology . . . 5
    Restrictions . . . 7
    Cautions . . . 9
Celerra Replicator concepts . . . 10
    Local replication . . . 10
    Remote replication . . . 11
    Activating the destination file system as read/write . . . 13
    Communication between Celerra Network Servers . . . 17
    How resynchronization works . . . 18
    How suspend works . . . 19
    How replication relationship restarts . . . 20
System requirements for Celerra Replicator . . . 22
    Local replication . . . 22
    Remote replication . . . 22
Upgrading from previous Celerra Network Server versions . . . 24
    Upgrade from a version earlier than 5.5.39.2 . . . 24
    Upgrade from Celerra Network Server version 5.5.39.2 or later . . . 25
Planning considerations for Celerra Replicator . . . 29
    Replication policies . . . 29
    SavVol size requirements for remote replication . . . 33
    Determine the number of replications per Data Mover . . . 34
    Configuration considerations . . . 35
User interface choices for Celerra Replicator . . . 37
Roadmap for Celerra Replicator . . . 38
Initiating replication . . . 39
    Task 1: Establish communication . . . 40
    Task 2: Verify communication . . . 40
    Task 3: Create SnapSure checkpoint of source file system . . . 42
    Task 4: Create the destination file system . . . 45
    Task 5: Copy checkpoint to the destination file system . . . 45
    Task 6: Begin replication . . . 48
    Task 7: Create a second checkpoint of the source file system . . . 50
    Task 8: Copy incremental changes . . . 52


    Task 9: Verify file system conversion . . . 54
    Task 10: Check replication status . . . 55
    Task 11: Create restartable checkpoints . . . 59
Recovering replication data . . . 61
    Task 1: Replication failover . . . 62
    Task 2: Resynchronize the source and destination sites . . . 66
    Task 3: Replication reversal . . . 74
Abort Celerra Replicator . . . 79
Suspend a replication relationship . . . 81
    Verify the suspended replication relationship . . . 88
Restarting a replication relationship . . . 89
    Verify that the replication relationship is not synchronized . . . 90
    Restart replication relationship . . . 91
Extending the size of a file system . . . 98
    Extend file system size automatically . . . 98
    Extend file system size . . . 101
Resetting replication policy . . . 105
    High water mark and time-out policies . . . 105
    Modify replication policy . . . 106
    Change flow-control policies . . . 107
    Set bandwidth size . . . 109
    Set policies using parameters . . . 110
Reverse the direction of a replication relationship . . . 111
    Verify the reverse direction of replication relationship . . . 113
Monitor replication . . . 114
Checking playback service and outstanding delta sets . . . 115
    Task 1: Determine playback service status . . . 115
    Task 2: Playback delta set . . . 118
    Task 3: Verify delta set . . . 119
Events for Celerra Replicator . . . 121
Change the Celerra Replicator SavVol default size . . . 123
Change the passphrase between Celerra Network Servers . . . 124
Managing and avoiding IP replication problems . . . 125
    Preventive measures to avoid IP replication problems . . . 125
    Replication restart methods . . . 128
    Recovering from a corrupted file system . . . 130
    Managing anticipated destination site or network outages . . . 131
    Managing unanticipated destination site or network outages . . . 132
    Managing unanticipated source site outages . . . 133
    Managing expected source site outages . . . 133
    Mount the destination file system read/write temporarily . . . 133
    Recovering from an inactive replication state . . . 135
    Creating checkpoints on the destination site . . . 136
    Copy file system to multiple destinations with fs_copy . . . 136
Transporting replication data using disk or tape . . . 139
    Disk transport method . . . 140
    Tape transport method . . . 144
Setting up the CLARiiON disk array . . . 147
    Review the prerequisites . . . 147
    Run the setup script . . . 149
    Create data LUNs . . . 151
Troubleshooting Celerra Replicator . . . 154
    Where to get help . . . 154


    E-Lab Interoperability Navigator . . . 154
    Log files for troubleshooting . . . 154
    server_log messages . . . 155
    Network performance troubleshooting . . . 156
    Failure during transport of delta set . . . 156
    Failure of fs_copy command process . . . 156
    Control Station restarts during replication . . . 156
    Control Station fails over . . . 157
    NS series loses power . . . 157
    Return codes for fs_copy . . . 157
Error messages for Celerra Replicator . . . 161
Related information . . . 162
    Training and Professional Services . . . 163
Appendix A: fs_replicate -info output fields . . . 164
Index . . . 169


Introduction to Celerra Replicator


EMC Celerra Replicator produces a read-only, point-in-time copy of a source file system and periodically updates this copy, making it consistent with the source file system. This read-only copy can be used by a Data Mover in the same Celerra cabinet, or by a Data Mover at a remote site, for content distribution, backup, and application testing. This technical module is part of the EMC Celerra Network Server documentation set and is intended for system administrators who are responsible for IP replication in the Celerra environment. Administrators establishing replication should understand Celerra volumes and file systems before using Celerra Replicator. This technical module is one of several technical modules that describe replication using different implementations of Celerra Replicator. Use the following guidelines to navigate the technical modules:

- Read this technical module to learn the basics about the Celerra Replicator product and how it performs local replication or remote replication of Production File Systems (PFSs) over an IP network to a destination. This technical module describes how to set up the IP environment for replication and how to use Celerra Replicator to replicate your file systems. Because Celerra Replicator relies on EMC SnapSure checkpoints (read-only, logical, point-in-time images) for the initial copy and for replication restart, you might find it useful to read Using SnapSure on EMC Celerra to learn more about checkpoints.
- Read Replicating EMC Celerra CIFS Environments (V1) to learn how to use Celerra Replicator to perform replication in the CIFS environment. Replicating EMC Celerra CIFS Environments (V1) describes how to replicate the CIFS environment information in the root file system of a Virtual Data Mover (VDM) as well as how to replicate the file systems mounted to that VDM. To learn about VDMs in general, read Configuring Virtual Data Movers for EMC Celerra.
- Read Using EMC Celerra Replicator for iSCSI (V1) to learn about iSCSI replication using the Celerra Replicator for iSCSI product. Using EMC Celerra Replicator for iSCSI (V1) describes how to replicate production iSCSI LUNs by asynchronously distributing local point-in-time copies of the LUNs to a destination.
- Read Using EMC Celerra Replicator (V2) to learn about the new Celerra Replicator product.
- Read Managing EMC Celerra Volumes and File Systems with Automatic Volume Management and Managing EMC Celerra Volumes and File Systems Manually to learn more about establishing replication.

Terminology
This section defines terms important to understanding replication on the Celerra Network Server. The EMC Celerra Glossary provides a complete list of Celerra terms.
Automatic File System Extension: A configurable Celerra file system feature that automatically extends a file system created or extended with Automatic Volume Manager (AVM) when the high water mark (HWM) is reached. See also high water mark.
checkpoint: Read-only, logical, point-in-time image of a file system. A checkpoint is sometimes referred to as a checkpoint file system or a SnapSure file system.

delta set: Set containing the block modifications made to the source file system that Celerra Replicator uses to update the destination file system (read-only, point-in-time, consistent replica of the source file system). The minimum delta-set size is 128 MB.

high water mark (HWM): The trigger point at which the Celerra Network Server performs one or more actions, such as sending a warning message, extending a volume, or updating a replicated file system, as directed by the related feature's software/parameter settings.

IP replication service: The service that uses the IP network to transfer the delta sets from the replication SavVol on the source site to the replication SavVol on the destination site.

local replication: Replication of a file system on a single Celerra Network Server with the source file system on one Data Mover and the destination file system on another Data Mover.

loopback replication: Replication of a file system with the source and destination file systems residing on the same Data Mover.

playback service: The process of reading the delta sets from the destination SavVol and updating the destination file system.

remote replication: Replication of a file system from one Celerra Network Server to another. The source file system resides on a different Celerra system from the destination file system.

replication: A service that produces a read-only, point-in-time copy of a source file system. The service periodically updates the copy, making it consistent with the source file system.

replication reversal: The process of reversing the direction of replication. The source file system becomes read-only and the destination file system becomes read/write.

replication service: Service that copies modified blocks from the source file system to a replication SavVol prior to transferring the data to the destination file system.

Replicator ConfigVol: An internal information store for replication. Provides a storage vehicle for tracking changes in the source file system.

Replicator failover: The process that changes the destination file system from read-only to read/write and stops the transmission of replicated data. The source file system, if available, becomes read-only.

Replicator SavVol: A Celerra volume, required by replication, used to store modified data blocks from the source file system.

SnapSure SavVol: A Celerra volume to which SnapSure copies point-in-time data blocks from the PFS before the blocks are altered by a transaction. SnapSure uses the contents of the SavVol and the unchanged PFS blocks to maintain a checkpoint of the PFS.

timeout: Time interval at which the system takes a predetermined action.


Virtual Data Mover (VDM): A Celerra software feature enabling users to administratively separate CIFS servers, replicate CIFS environments, and move CIFS servers from Data Mover to Data Mover.

Virtual Provisioning: A configurable Celerra file system feature that can only be used in conjunction with Automatic File System Extension. This option lets you allocate storage based on longer-term projections, while you dedicate only the file system resources you currently need. Users (NFS or CIFS clients and applications) see the virtual maximum size of the file system, of which only a portion is physically allocated. Combined, the Automatic File System Extension and Virtual Provisioning options let you grow the file system gradually on an as-needed basis.

Restrictions
The following restrictions apply to Celerra Replicator:

- Celerra Data Migration Service (CDMS) is unsupported (an mgfs file system cannot be replicated).
- Multi-Path File System (MPFS) is supported on the source file system, but not on the destination file system.
- EMC E-Lab Interoperability Navigator provides information about disaster recovery replication products such as EMC SRDF/Synchronous (SRDF/S) and SRDF/Asynchronous (SRDF/A).
- For EMC TimeFinder/FS:
  - A business continuance volume (BCV) cannot be a source or a destination file system for replication. You can replicate the underlying source file system, but not the BCV.
  - Do not use the TimeFinder/FS -Restore option for a replicated source file system. Replication will be unaware of any changes because these changes occur at the volume level. However, you can restore on a single-file basis using an NFS/CIFS client that has access to the source file system and the BCV of the source file system.
  - Do not use TimeFinder/FS with a file system that was created on a slice volume. Creating a file system using the samesize option slices the volume. TimeFinder does not recognize sliced partitions. Using TimeFinder/FS, NearCopy, and FarCopy with EMC Celerra further details this feature.
- For TimeFinder/FS Near Copy and Far Copy: A BCV cannot be a source or a destination file system for replication. You can replicate the underlying source file system, but cannot replicate the BCV.
- Do not extend the source file system while fs_copy is running.
- On a per-Data-Mover basis, the total size of all file systems, the size of all SavVols used by SnapSure, and the size of all SavVols used by the Celerra Replicator feature must be less than the total supported capacity of the Data Mover. The EMC Celerra Network Server Release Notes, available at http://Powerlink.EMC.com, the EMC Powerlink website, provide a list of Data Mover capacities.


- When replicating databases, additional application-specific actions may be necessary to bring the database to a consistent state (for example, quiescing the database).
- If you plan to enable international character sets (Unicode) on your source and destination sites, you must first set up translation files on both sites before starting Unicode conversion on the source site. Using International Character Sets with EMC Celerra describes this action in detail.
- In the case of multiple file systems, all fs_replicate commands must be executed sequentially.
- IP replication failover is not supported for local groups unless you use VDMs. Configuring Virtual Data Movers for EMC Celerra describes VDM configuration.
- IP replications created prior to Celerra Network Server version 5.5 in which the source file system contained iSCSI LUNs are no longer supported. These replications will continue to run in their current state, but you cannot actively manage (suspend or resume) them. Any attempt to perform such an operation prompts an error stating that the item is currently in use by iSCSI. You can abort or delete the replication. EMC recommends that you convert any existing IP replications to iSCSI replications as soon as possible.
- Do not use IP aliasing for IP replication. Use IP aliasing only for Control Station access from the client.
- For full management capability, ensure that you have the same Celerra version on the source and destination Celerra Network Servers. For example, you cannot have version 5.5 on the source side and version 5.6 on the destination side. For limited management capability and no fs_copy support, out-of-family replication support is available and requires NAS version 5.5.39.2 or later on the source Celerra and 5.6.47 or later on the destination Celerra. "Upgrade from Celerra Network Server version 5.5.39.2 or later" on page 25 provides more details.

- For FLR-E-enabled file systems, both the source and destination file systems must have the same FLR setting enabled. For example, if the source file system has FLR-E enabled, then the destination file system must also have FLR-E enabled. If the source file system does not have FLR-E enabled, then the destination cannot have FLR-E enabled.
- File systems enabled for processing by Celerra Data Deduplication cannot be replicated using Celerra Replicator. In addition, you cannot enable deduplication on a file system that is already being replicated by Celerra Replicator.
- In Celerra Replicator (V1) version 5.6, the fs_replicate -refresh command does not create a new delta set until the production file system is updated. Consequently, policy values for timeout, high water mark (HWM), autofreeze, autoreadonly, and so on, that are modified using the fs_replicate -modify or fs_replicate -refresh command remain ineffective until a new delta set is created. In versions prior to 5.6, the fs_replicate -refresh command creates a new delta set even if no updates are made to the PFS.


Cautions
This section lists the cautions for using this feature on Celerra Network Server. If any of this data is unclear, contact EMC Customer Service for assistance:

- To provide a graceful shutdown in the event of an electrical power loss, the Celerra Network Server and the storage array need to have Uninterruptible Power Supply (UPS) protection. If this is not provided, replication will become inactive and might result in data loss. If replication becomes inactive, consult "Restarting a replication relationship" on page 89 to determine if you can resume the replication relationship.
- Replicating file systems from a Unicode-enabled Data Mover to an ASCII-enabled Data Mover is not supported. I18N mode (Unicode or ASCII) must be the same on the source and destination Data Movers.
- Replication sessions should run serially, not concurrently. That is, they should start one after the other, not simultaneously.


Celerra Replicator concepts


This section explains how replication produces and uses a read-only, point-in-time copy of a source file system on the same or different Celerra Network Servers. It also describes how Celerra Replicator enables you to activate failover to the destination site for production, if the source site experiences a disaster and is unavailable for data processing.

Local replication
Replication produces a read-only copy of the source file system for use by a Data Mover in the same Celerra cabinet. The source and destination file systems are stored on separate volumes. Local replication can use different Data Movers or the same Data Mover.

Local replication process


Figure 1 on page 10 and subsequent steps show the processing that occurs when using local replication for the first time.
[Figure 1: Local replication. The figure shows a Celerra Network Server with a primary Data Mover and a secondary Data Mover, the source file system, the SavVol, and the destination file system on one storage unit; the numbered callouts correspond to the steps that follow.]


The process is as follows:
1. Throughout this process, network clients read and write to the source file systems through the primary Data Mover without interruption.
2. For the initial replication start, the source and destination file systems are manually synchronized using the fs_copy command.
3. After synchronization, the replication service uses the addresses of all block modifications made to the source file system to create one or more delta sets. The modified blocks are copied to the SavVol shared by the primary and secondary Data Movers.
4. The local replication playback service periodically reads any available, complete delta sets and updates the destination file system, making it consistent with the source file system. During this time, the system tracks all subsequent changes made to the source file system.
5. The secondary Data Mover exports the read-only copy to use for content distribution, backup, and application testing. This optional step is done manually (see the example that follows).
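The export in step 5 is typically done with the server_export command. The following is a minimal sketch only; the Data Mover name (server_3), the mount point (/dstfs), and the use of the ro export option are placeholder assumptions for illustration, not values taken from this module:

  # Export the read-only destination file system over NFS (placeholder names)
  $ server_export server_3 -Protocol nfs -option ro /dstfs

Adjust the Data Mover name, protocol, and export options to match your environment.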

Remote replication
Remote replication creates and periodically updates a read-only copy of a source file system at a remote (destination) site. This is done by transferring changes made to a source file system at a local site to a file system replica (destination) at the destination site over an IP network. These transfers are automatic and are based on user-definable replication policies.


Remote replication process


Figure 2 on page 12 and subsequent steps show the processing that occurs when using remote replication for the first time:
[Figure 2: Remote replication. The figure shows the source site (Celerra Network Server 1 with a Data Mover, the source file system, and the source SavVol on the source storage unit) and the destination site (Celerra Network Server 2 with a Data Mover, the destination SavVol, and the destination file system on the destination storage unit); the numbered callouts correspond to the steps that follow.]

1. Throughout this process, network clients read and write to the source file systems through a Data Mover at the source site without interruption.
2. For the initial replication start, the source and destination file systems are synchronized using the fs_copy command. This can be performed over the IP network or, if the source file system contains a large amount of data, by physically transporting the data to a remote site using disk or tape.
3. The addresses of all subsequent block modifications made to the source file system are used by replication to create one or more delta sets. The replication service creates a delta set by copying the modified blocks to the SavVol at the source site.
4. Replication transfers any available, complete delta sets, which include the block addresses, to the destination SavVol. During this time, the system tracks subsequent changes made to the source file system on the source site.
5. At the destination site, the playback service plays back any available, complete delta sets to the destination file system, which makes it consistent with the source file system.


6. The Data Mover at the destination site exports the read-only copy for content distribution, backup, and application testing. This optional step is done manually.

Activating the destination file system as read/write


If the source file system becomes unavailable, usually as the result of a disaster, you can make the destination file system read/write for local or remote scenarios. After the source site is available again, you can restore replication so that the source site is again read/write and the destination site is read-only. This three-stage process includes:

- Using the destination file system for production when the source file system is unavailable (failover).
- Resynchronizing the file systems.
- Restoring the replication process to its original state.

Replication failover
In this example, the source site has experienced a disaster and is unavailable. Failover ends the replication relationship between the source and destination file systems and changes the destination file system from read-only to read/write. When failing over, the following actions occur:
1. The system stops replication and plays back the outstanding delta sets on the destination site to the destination file system according to the options specified by the user. The system can play back either all or none of the delta sets. The beginning of the failover process is shown in Figure 3 on page 14.


[Figure 3: Source site becomes unavailable. The figure shows the source site (Celerra Network Server 1 with its Data Mover, source file system, and source SavVol) and the destination site (Celerra Network Server 2 with its Data Mover, destination file system, and destination SavVol) at the beginning of the failover process.]

2. The system stops the playback service and the destination file system becomes read/write, as shown in Figure 4 on page 15. This illustration displays a remote replication scenario with the destination site as read/write. This state is possible in a local replication scenario as well.


[Figure 4: Failover. The figure shows the same source and destination sites after failover, with the destination file system now read/write.]

3. The destination site can be enabled to allow read/write access to the destination file system from network clients (local or remote scenario). "After a failover or reversal" on page 78 provides more information. This optional step is done manually.
Note: If the source file system is online, it becomes read-only.


Resynchronization
When the original source file system becomes available, the replication relationship can be reestablished. The fs_replicate -resync option is used to populate the source file system with the changes made to the destination file system while the source site was unavailable. This establishes the replication relationship in the reverse direction. The destination file system is read/write and the source file system is read-only, as shown in Figure 5 on page 16.
[Figure 5: Resynchronization. The figure shows replication running in the reverse direction, from the destination site (Celerra Network Server 2) back to the source site (Celerra Network Server 1).]


Replication reversal
In this example, the reversal is used after a failover and resynchronization to change the direction of the replication. The source site again accepts the source file system updates from the network clients, and the replication service transfers these updates to the destination site for playback to the destination file system, as shown in Figure 6 on page 17.
[Figure 6: Replication reversal. The figure shows replication restored in the original direction, from the source site (Celerra Network Server 1) to the destination site (Celerra Network Server 2).]

Note: A reversal requires both sites to be available and results in no data loss. During the reversal phase, the source and destination file systems are set as read-only while the last updates are transferred.

"Recovering replication data" on page 61 further describes the replication reversal feature.

Communication between Celerra Network Servers


At the source and destination sites, you must build a trust relationship enabling HTTP communication between Celerra Network Servers. This trusted relationship is built upon a common passphrase set for both Celerra Network Servers. The 6- to 15-character passphrase is stored in clear text. It is used to generate a ticket for Celerra-to-Celerra communication.


Note: You use the nas_cel -create command to establish the relationship between your Celerra Network Servers. For this command, ensure that the passphrase has at least 6 characters and no more than 15 characters. Remember that you must run the command on both Celerra Network Servers and use the same passphrase.

The time on the Data Movers involved in a replication relationship and the Control Stations at both sites must be synchronized with a maximum allowable skew of 10 minutes. Take into account time zones and daylight savings time, if applicable, when using the Network Time Protocol (NTP) to synchronize the time. Configuring EMC Celerra Time Services offers more information. To establish communication, first, you must have root privileges and each site must be active and configured for external communication. Table 1 on page 18 shows information about the source and destination sites used in these examples.
Table 1: Source and destination sites information

  Site                 Celerra name   IP address
  [source_site]        cs100          192.168.168.114
  [destination_site]   cs110          192.168.168.102

Second, there must be IP network connectivity between both Control Stations. Verify whether a relationship exists using the nas_cel -list command. If communication is established, go to Task 3: "Create SnapSure checkpoint of source file system" on page 42.
Note: This task is performed only for remote replication.
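For reference, the following is a minimal sketch of how the trust relationship might be created and verified, using the Celerra names and IP addresses from Table 1. The -passphrase flag name and the passphrase value (nasadmin) shown here are assumptions for illustration; Task 1: "Establish communication" on page 40 gives the exact procedure.

  # On cs100 (source), register the destination Control Station:
  $ nas_cel -create cs110 -ip 192.168.168.102 -passphrase nasadmin
  # On cs110 (destination), register the source Control Station with the same passphrase:
  $ nas_cel -create cs100 -ip 192.168.168.114 -passphrase nasadmin
  # Verify from either side:
  $ nas_cel -list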

How resynchronization works


After a failover completes and the source site becomes operational, you can resume replication using the -resync option. When the file systems are resynchronizing, changes that occurred after the failover are copied to the source site and replication is started but the replication direction is reversed. The replication service is running on the destination site, and the playback service is running on the source site. If this resynchronization is successful, you need not perform a full file system copy. Some reasons a resynchronization may not be possible are:

- You performed a failover but the system was unable to make the source file system read-only. When the source site became available, it continued to receive I/O to your source file system. If your replication service becomes inactive, you must abort replication. You should not continue to allow I/O to the original source file system.
- After you performed a failover, you decided to abort replication when the source site became available.


If the replication service is active and you receive one of the following error messages when attempting a resynchronization, you can run the -resync command again by specifying the autofullcopy=yes option:

Error 2242: <file_system_name>: replication is not active, incremental resync not possible. Abort and start replication or resync with autofullcopy=yes.

or

Resync copy failed. Incremental resync not possible. Abort and start replication or resync with autofullcopy=yes.
Note: If this incremental resynchronization fails, restarting replication using a full file system copy might take considerable time and resources. Plan carefully before using this option.

CAUTION

Any data on the source file system not played back to the destination file system prior to the failover is permanently lost.
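As a hedged illustration only, a retried resynchronization might look like the following, where src_fs and dst_fs are placeholder file system names and cs110 is the destination Celerra from Table 1. The argument order shown is illustrative; Task 2: "Resynchronize the source and destination sites" on page 66 documents the exact syntax.

  # Retry the resynchronization, allowing a full copy if an incremental resync is not possible
  $ fs_replicate -resync src_fs dst_fs:cel=cs110 -option autofullcopy=yes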

How suspend works


Suspend is an option that allows you to temporarily stop an active replication relationship and leave the replication in a condition that allows it to be restarted. Suspend, when used in conjunction with the restart option, allows you to temporarily stop replication, perform some action, and then restart the replication relationship using an incremental rather than a full data copy. After suspending a replication relationship, you can:

- Change the replication SavVol size. During the course of using replication, the size of a SavVol may need changing because:
  - The SavVol is too large and you want to reclaim the unused disk space.
  - The SavVol is too small, which activates flow control.

- Mount the replication source or destination file system on a different Data Mover.
- Change the IP addresses or interfaces the replication is using. When you are restarting a replication relationship, you can specify a source interface or allow the system to select it:
  - If you specify an interface for the source site, replication uses that interface until the user changes it.
  - If you allow the system to select the interface, the interface can change to keep the replication relationship running. For example, if the network interface currently in use becomes unavailable, the system attempts to select another interface. If it finds one, the replication relationship continues to function.
  The destination interface, regardless of how it is selected, is unchanged by the system.


When suspending a replication relationship, the system:


- Ensures all delta sets are transferred to the destination site.
- Plays back all outstanding delta sets.
- Creates a checkpoint on the source site, which is used to restart replication.

Note: The suspend checkpoint is named root_suspend_ckpt_xxx_1, where xxx is the ID of the suspended replication relationship destination and 1 represents the Celerra ID of a remote replication relationship. No number appears for a local replication relationship.

The replication and playback services are no longer running after the suspend action is complete.
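As a hedged illustration, a suspend might be issued as follows, where src_fs and dst_fs are placeholder file system names and cs110 is the destination Celerra; "Suspend a replication relationship" on page 81 documents the exact syntax.

  # Quiesce the session and leave it in a restartable state
  $ fs_replicate -suspend src_fs dst_fs:cel=cs110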

How replication relationship restarts


The restart option allows you to restart a replication relationship by using an incremental rather than a full data copy. Use the restart option when a replication relationship is:

- Suspended
- Out-of-sync

Restart a suspended replication relationship


After you suspend a replication relationship using the -suspend option, only the restart option can restart it. This command verifies that the replication is in a condition to allow a restart. It begins the process with an incremental copy using a checkpoint of the source file system created when the replication was suspended. You must include this checkpoint when determining the maximum number of checkpoints per file system used with replication.
Note: Before you restart a replication, make sure that all checkpoints are mounted. Otherwise, a full data copy will be initiated instead of an incremental copy.

If you are using this procedure to:

- Increase the size of the replication SavVol, ensure that you specify the new SavVol size using the savsize=<newsize> option.
- Change interfaces or IP addresses, specify them when you restart the replication relationship.

When replication is restarted, default values are used for the replication policies. For example, high water mark and timeout are set to 600. Specify new policies when you restart replication using -option <options>.
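Continuing the same placeholder example, a restart that also increases the replication SavVol might look like the following; the savsize value is a placeholder, the argument order is illustrative, and "Restart replication relationship" on page 91 documents the exact syntax and the available -option values.

  # Restart the suspended session with a larger replication SavVol (placeholder size)
  $ fs_replicate -restart src_fs dst_fs:cel=cs110 -option savsize=20000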

Out-of-sync replication relationship


The source and destination file systems can become out-of-sync because:

- Network connectivity is lost.
- The incoming write rate is greater than the delta-set replay rate on the destination file system.


This causes the following:
1. Delta sets accumulate on the source site until the SavVol is filled.
2. Changes are logged in memory until they can no longer be tracked.
3. The file systems fall out-of-sync if no source policy is specified.
Prevent an out-of-synchronization condition by specifying a source policy for a replication relationship. Either stop accepting writes or stop all client access to the file system.

Reestablishing the replication relationship

When the network connection is reestablished or the source file system is available again, you can restart replication. To restart, you must have usable checkpoints of the source file system. When restarting, all previous configuration parameters are maintained and cannot be changed during the restart. When restarting an out-of-sync replication relationship, the system:

- Determines the optimal checkpoint that meets the criterion of having a delta number less than the delta-set number of the destination file system.
- Aborts the replication relationship.
- Restarts replication using the original configuration information.
- Performs an incremental copy of the file system using the appropriate checkpoint.

If no valid checkpoint is available, you must abort and reestablish the replication relationship. This means a complete (not an incremental) copy of the file system must be done.

Checkpoint availability

To ensure that a valid checkpoint is available to restart an out-of-sync replication relationship, create two checkpoints of the source file system. These restartable checkpoints, named <source_fs_name>_repl_restart_1 and <source_fs_name>_repl_restart_2, are automatically refreshed by a system CRON job that runs every hour at 25 minutes after the hour. When a replication relationship is out-of-sync, these checkpoints are available for the restart process. You must include these two checkpoints when determining the maximum number of checkpoints per file system used with replication.

Verify that these checkpoints are being refreshed by occasionally checking your server log (server_log) file. If the system is not refreshing your checkpoints regularly, call your service provider. The following is a sample message for a successful refresh, where <fs_name> is the file system name:

Dec 29 07:29:23 2004 NASDB:7:13 refresh scheduled replication restart ckpt <fs_name>_repl_restart_1 succeeded.
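As an example of that verification, assuming the source file system is mounted on a Data Mover named server_2 (a placeholder), you can search the Data Mover log for the hourly checkpoint-refresh messages:

  # Look for the scheduled restart-checkpoint refresh messages
  $ server_log server_2 | grep repl_restart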


System requirements for Celerra Replicator


This section details Celerra Network Server software, hardware, network, and storage settings to use Celerra Replicator as described in this technical module.

Local replication
Table 2 on page 22 lists the system requirements for local replication.
Table 2: Local replication system requirements

  Software: Celerra Network Server version 5.6. Licenses for Celerra Replicator and SnapSure.
  Hardware: One Celerra-storage (EMC Symmetrix or EMC CLARiiON) pair. CNS-14 or CFS-14 requires a minimum of a 510 Data Mover or later. To provide a graceful shutdown in an electrical power loss, the Celerra Network Server and the storage array need to have Uninterruptible Power Supply (UPS) protection. If this is not provided, replication becomes inactive.
  Network: IP addresses configured for the primary and secondary Data Movers.
  Storage: Sufficient storage space available for the source and destination file systems. Sufficient SavVol space available for use by Celerra Replicator and SnapSure.

Remote replication
Table 3 on page 22 lists the system requirements for remote replication.
Table 3: Remote replication system requirements

  Software: Celerra Network Server version 5.6 with the same Celerra version on the source and destination Celerra Network Servers.
    Note: For limited management capability and no fs_copy support: Celerra Network Server version 5.5.39.2 on the source and 5.6.47 on the destination Celerra Network Server.
    Licenses for Celerra Replicator and SnapSure.
  Hardware: Minimum of two Celerra-storage (Symmetrix or CLARiiON) pairs. CNS-14 and CFS-14 require a minimum of one 510 Data Mover or later for each Celerra-storage pair. To provide a graceful shutdown in an electrical power loss, the Celerra Network Server and the storage array need to have UPS protection. If this is not provided, replication becomes inactive.
  Network: IP addresses configured for the source and destination Data Movers (ports 8888 and 8887 are used by replication for transferring data and internal operations; contact Customer Service to change this setting). HTTPS connection between the Control Station on the source site and the Control Station on the destination site (port 443; cannot be changed). Internet Control Message Protocol (ICMP) ensures that a destination Celerra Network Server is accessible from a source Celerra Network Server. The ICMP protocol reports errors and provides control data about IP packet processing.
  Storage: Sufficient storage space available for the source and destination file systems. Sufficient SavVol space available for use by Celerra Replicator and SnapSure.

Consult "Configuration considerations" on page 35 for more information.


Upgrading from previous Celerra Network Server versions


This section describes the upgrade process when upgrading from a previous version of Celerra Network Server to version 5.6 when using Celerra Replicator (V1):

- Upgrade from a version earlier than 5.5.39.2
- Upgrade from version 5.5.39.2 or later (with out-of-family support)

Upgrade from a version earlier than 5.5.39.2


When upgrading to version 5.6 from a Celerra Network Server version earlier than 5.5.39.2, note the following:

- Local replication: If you are upgrading a local replication, stop the playback service for all the destination file systems by setting the timeout and high water mark to 0. Upgrade the Celerra Network Server, then reset the timeout and high water marks for the destination file systems to their previous settings.

- Remote replication: If you are upgrading where only destination file systems are on the remote system, upgrade the Celerra Network Server at the destination site first, and then upgrade the source site.
  - You cannot upgrade a destination Celerra that is the target of a Celerra running a 5.5 version earlier than 5.5.39.2.
  - You cannot upgrade a Celerra running a version earlier than 5.5.39.2 that is the source of an active replication session if the destination is also running a 5.5 version.
  - You cannot upgrade a Celerra to 5.6 if there are bi-directional sessions targeting a 5.5 Celerra as a destination.

- General replication considerations:
  - Do not perform replication operations while upgrading (for example, -failover, -resync, -reverse, -restart, -refresh).
  - When upgrading the source and destination Celerra Network Servers from version 5.5 to 5.6, replication sessions cannot be administered until both sides have been upgraded to 5.6.


- Before starting an upgrade, ensure that no failed over or suspended replication relationships are present. If you upgrade with replication relationships in this state, these relationships will be unusable when the upgrade completes.
- Introduced in version 5.4, SnapSure checkpoints use a new pageable blockmap structure. Checkpoints used for replication are in this format. Using SnapSure on EMC Celerra describes the pageable blockmap structure in depth.
- Replication policies for the time-out interval and high water mark are as follows:
  - Time out: Acceptable timeout values are 0 or greater than 60 seconds, up to a limit of 24 hours.
  - High water mark: The high water mark maximum value is 8000 MB. The value should not exceed the size of the SavVol.
Note: You do not have to abort replication when upgrading Celerra Network Server version.

Upgrade from Celerra Network Server version 5.5.39.2 or later


Introduced in version 5.5.39.2, out-of-family replication support is recommended for customers in a multi-Celerra, edge-to-core environment who want to migrate to NAS version 5.6 but want to upgrade their Celerras at different times without interrupting data transfer. In an edge-to-core configuration, where there are multiple source Celerras replicating to one destination Celerra, you upgrade the destination (core) Celerra to NAS version 5.6 first and then upgrade each of the source (edge) Celerras when appropriate.

Out-of-family replication is not intended for use over an extended period of time because there is limited replication management capability while in this environment. For example, you cannot start a new replication, restart a suspended or inactive replication, resync a failed-over replication, suspend or reverse a replication, or perform a copy (full or differential) of a file system. Table 4 on page 26 provides the support matrix for replication commands when in an out-of-family configuration. All management commands are supported after both sides of the replication session have been upgraded to the same NAS code family version.

When upgrading from version 5.5.39.2 or later of Celerra Network Server to 5.6, note the following:

- Out-of-family replication support requires NAS version 5.5.39.2 or later on the source Celerra and 5.6.47 or later on the destination Celerra.
- You cannot upgrade a 5.5.39.2 Celerra that is the source of an active replication session if the destination is also running version 5.5.39.2.
- You cannot upgrade a Celerra to 5.6 if there are bi-directional sessions targeting a 5.5 Celerra as a destination.


- Out-of-family replication is unidirectional from a 5.5.39.2 source to a destination running 5.6.47 or later. Replication from a 5.6 source to a 5.5 destination is not supported.
- Out-of-family replication does not support fs_copy.
- After the upgrade, there is limited replication management support for Celerra Replicator (V1) sessions running with different NAS code family versions (source running 5.5.39.2 and destination running 5.6). Table 4 on page 26 provides the support matrix of replication management commands when in an out-of-family configuration.

Table 4

Out-of-family replication command support matrix (page 1 of 2)

Replication state Command Active


Abort on source Abort on destination Abort on both Refresh on source Refresh on destination Refresh on both Modify on source Modify on destination Modify on both Failover on destination (default | Now | Sync) Allowed Allowed Not allowed Allowed Allowed

Inactive
Allowed Allowed Not allowed NA Allowed

Suspended
NA NA NA NA NA

Failed-over
Allowed NA Not allowed Allowed NA

Not configured
NA NA NA NA NA

Not allowed Allowed Allowed Not allowed Allowed (default | now) Not Allowed (sync) NA

NA Allowed Allowed Not allowed Allowed (default | now) Not Allowed (sync) NA

NA NA NA NA NA

Not allowed Allowed NA Not allowed NA

NA NA NA NA NA

Resync on destination Reverse on source Restart on source Suspend on source List from source List from destination

NA

Not allowed

NA

Not allowed NA Not allowed Allowed Allowed

Not allowed Not allowed Not allowed Allowed Allowed

NA Not allowed NA NA NA

NA NA NA Allowed NA

NA NA NA NA NA


Table 4

Out-of-family replication command support matrix (page 2 of 2)

Replication state Command Active


Info from source Info from destination Start from source fs_copy from source Allowed Allowed NA NA

Inactive
Allowed Allowed Not allowed NA

Suspended
NA NA NA NA

Failed-over
Allowed NA NA NA

Not configured
NA NA Not allowed Not allowed

Prerequisites
Before upgrading, make sure that the Celerra to be updated:

- Is running version 5.5.39.2 or later.
- If the Celerra to be upgraded is a destination, make sure that the source hosting the active replication sessions is running version 5.5.39.2 or later.
- If the Celerra to be upgraded is a source, make sure that the destination hosting the active replication sessions is running version 5.6.47 or later.
- Has no fs_copy sessions running.
- Is running under minimum load. A high rate of I/O during the upgrade may cause replications to become inactive.
- Is not hosting both the source and destination sides of active replications running version 5.5. If bi-directional sessions exist on the Celerra to be upgraded, do the following:
  a. Suspend all replication sessions in one direction (either the sessions running from A to B or the sessions running from B to A).
  b. Upgrade the Celerra that is hosting only the destination side of the active replication sessions.
  c. Data transfer will continue but with limited replication management capability.
  d. Upgrade the source Celerra.
  e. Restart all the suspended replication sessions.

Procedure
To upgrade to Celerra Network Server version 5.6.47:
1. Upgrade the destination Celerra from NAS code version 5.5 to 5.5.39.2.
2. Upgrade all of the source Celerras from NAS code version 5.5 to 5.5.39.2.
3. Upgrade the destination Celerra from NAS code version 5.5.39.2 to 5.6.47.
   Data continues to transfer between the source and destination sites, but there is limited replication management capability until you upgrade the source Celerra to 5.6.47. For example, you cannot start a new replication, restart a suspended or inactive replication, resync a failed-over replication, suspend or reverse a replication, or perform a copy (full or differential) of a file system. Table 4 on page 26 provides the out-of-family replication command support matrix.
4. Upgrade the source Celerra from 5.5.39.2 to 5.6.47.
5. Repeat step 4 for all of the source Celerras to be upgraded.


Planning considerations for Celerra Replicator


Before you use Celerra Replicator:

- Review replication policies
- Determine SavVol size
- Determine the number of replications per Data Mover
- Review the configuration considerations

Replication policies
Most replication policies can be established for one replication relationship (using the fs_replicate command) or all replication relationships on a Data Mover by setting a parameter. Celerra Replicator has policies to:

- Control delta-set generation using a time-out interval and high water mark.
- Control how to handle data if network connectivity is lost (flow control). "Celerra Replicator flow control" on page 31 describes this action.
- Set the maximum IP bandwidth size used by a replication session. "Set bandwidth size" on page 109 details this policy.
- Set the amount of data to be sent across the IP network before an acknowledgment is required from the receiving side. This is controlled by a parameter for the TCP window size (tcpwindow). "Accommodating network concerns" on page 127 provides more information.

IP Alias with IP replication


Celerra Network Server versions 5.5.28.1 and later support IP Alias with IP replication. All restrictions on Control Station failover also apply to IP Alias with IP replication configurations. The following guidelines apply to this feature:

- When using IP replication for the first time, or on new systems, configure the IP Alias first, and use the IP Alias in the -ip <ipaddr> option of the nas_cel -create command. For existing systems with existing IP replication sessions, the current slot_0 IP address (primary Control Station IP address) must be used. For example:

  $ nas_config -IPalias -create 0
  Do you want slot_0 IP address as your alias [yes or no] yes

- If the Control Station fails over while IP replication is running, the IP replication command (fs_replicate) might need to be re-executed manually. Check the logs (/nas/log/cmd_log*, server_log command output, and so on) to determine how to proceed. Keep the fs_replicate command output for the resync, suspend, restart, failover, and reverse options in case of failure, then execute the remaining steps based on the instructions in the command output.
- When the IP Alias is deleted using the nas_config -IPalias -delete command, the IP address of the primary or the secondary Control Station is not changed. Changes to the IP address of the primary or the secondary Control Station must be done separately. IP replication depends on communication between the source Control Station and the destination Control Station. When an IP Alias is used for IP replication, deleting the IP Alias breaks the communication. The IP address that was used as the IP Alias must be restored on the primary Control Station to restore the communication.
- While performing a Celerra code upgrade using ssh or telnet, do not use an IP Alias in the ssh or telnet session to log in to the Control Station.

Control delta-set generation


A delta set contains the block modifications made to the source file system and is used by the replication service to periodically synchronize the destination file system with the source file system. The amount of information within a delta set is based on the source file system's activity and on how you set the replication policy time-out and high water mark values. The minimum delta-set size is 128 MB. The replication service is triggered by either the time-out policy or the high water mark policy, whichever is reached first. However, when the maximum delta-set size is reached (8 GB), a new delta set is generated, regardless of the time-out policy:

- Time-out policy (where the system either generates or plays back a delta set):
  - Source site: The interval at which the replication service automatically generates a delta set (for example, every 1200 seconds).
  - Destination site: The interval at which the playback service automatically plays back all available delta sets to the destination file system (for example, every 600 seconds).
  At both sites, the default time-out value is 600 seconds. Acceptable time-out values are 0 or greater than 60 seconds, up to a limit of 24 hours. A value of 0, indicating there is never a timeout, pauses the replication activities for this policy.

- High water mark:
  - Source site: The size in MB of the file system changes accumulated since the last delta set, at which the replication service automatically creates a delta set on the SavVol. For example, when the amount of change reaches 1200 MB in size, a delta set is generated.
  - Destination site: The size in MB of the delta sets present on the destination SavVol, at which the replication service automatically plays back all available delta sets to the destination file system. For example, when the amount of change reaches 600 MB in size, playback occurs.
  At both sites, the default high water mark value is 600 MB. Acceptable high water mark values are 0 or greater than 30 MB, up to a maximum value of 8 GB. This value should not exceed the size of the SavVol. A value of 0 pauses the replication activities and disables this policy.

Note: A delta set may not be generated or copied if a flow control is triggered. "Celerra Replicator flow control" on page 31 provides further information.
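For example, a higher delta-set creation interval and a lower playback interval could be set on an existing session as sketched below. This is only a sketch: the session name src_ufs1 matches the examples used later in this module, and the to=, dto=, hwm=, and dhwm= option names are assumed from the restart options listed in "Task 2: Resynchronize the source and destination sites" on page 66; confirm them against the fs_replicate man page before use.
$ fs_replicate -modify src_ufs1 -option to=1200,hwm=1200,dto=600,dhwm=600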


How time-out and high water mark policies work


The time-out and high water mark policies work differently for the source and destination file systems.
Replication policies for the source file system: when one of these replication policies is triggered, the following occurs:
1. The replication service automatically generates a delta set from the accumulated changes to the file system and stores it in the SavVol. Each delta set is recorded and processed per replication policy trigger. Each delta set contains the set of changes made to the source file system since the creation of the previous delta set.
2. Replication tracks all subsequent changes made to the source file system.
3. The delta sets are transferred from the SavVol on the source site to the SavVol on the destination site (for remote replication), and the destination file system is updated through the replication playback service (for local and remote replication).
4. The replication service waits for the next event to create the next delta set.
Replication policies for the destination file system:

The playback service continually polls the SavVol on the destination site to play back each delta set to the destination file system, synchronizing it with the source file system. This playback rate is based on the specified replication policy. After the delta set is copied to the destination file system, the next delta set is processed. For optimization, delta sets available before a trigger is reached are merged into one active playback session.

Celerra Replicator flow control


Celerra Replicator flow control activates when the playback rate on the destination site is too slow, when network connectivity is lost, or when the write rate from the source site is too fast. Celerra Replicator activates flow control in an attempt to allow the network to catch up. In most cases, Celerra Replicator should not be in a flow control state. Flow control is activated:

If a delta set that has not been replayed on the destination file system is about to be overwritten by a newer delta set, the destination file system temporarily holds the data flow until the delta set is replayed. This happens when the delta-set playback rate on the destination file system is too slow to handle the source file system updates.
When a delta set cannot be transferred from the SavVol on the source site to the SavVol on the destination site (for example, the network is unavailable), the replication service stops the data flow to the destination site. During this time, Celerra Replicator tracks the source file system modifications and continually retries connectivity to the destination site. If the network is down, a message is sent to the system log on the source site.
When the SavVol at the source site is full, the replication service suspends copying the modified blocks to the SavVol. This prevents the overwriting of delta sets not yet transferred to the SavVol on the destination site. During this time, modifications to the source file system are still tracked, but are not copied to the SavVol until it has available space. Celerra Network Server tracks changes in memory only until these changes represent a delta set the same size as the Celerra Replicator SavVol. If the system can no longer track changes in memory, the system behaves in one of three ways, as explained in Table 5 on page 32.
Table 5 Replication flow-control options

Behavior: Temporarily halts all I/Os to the source file system until sufficient space is available on the source SavVol. During this time, the file system is inaccessible to network clients. When space is available on the Celerra Replicator SavVol, the source file system is mounted and begins accepting I/Os.
  How to specify for a single replication session: Set an option using fs_replicate -modify -option autofreeze=yes. "How time-out and high water mark policies work" on page 31 describes this policy.
  How to specify for all replication sessions on a Data Mover: Set the VRPL freeze parameter as described in the EMC Celerra Network Server Parameters Guide.

Behavior: Temporarily stops writing data to the source file system by mounting it read-only. Users still have read-only access to the source file system. When space becomes available on the Celerra Replicator SavVol, the source file system is remounted read/write and begins accepting all I/Os to the source file system.
  How to specify for a single replication session: Set an option using fs_replicate -modify -option autoro=yes. "How time-out and high water mark policies work" on page 31 describes this policy.
  How to specify for all replication sessions on a Data Mover: Set the VRPL read-only parameter as described in the EMC Celerra Network Server Parameters Guide.

Behavior: Allows the replication service to fall out of sync.
  How to specify for a single replication session: "Restarting a replication relationship" on page 89 describes how to restart replication after it has fallen out of sync.
  How to specify for all replication sessions on a Data Mover: Not applicable.

As shown in Table 5 on page 32, these policies can be set for one replication session or for all replication sessions on a Data Mover. If there is a conflict between these two policies, the one defined for a single replication session takes precedence. You can set up alerts to notify you when these events occur. "Events for Celerra Replicator" on page 121 details Celerra Replicator events, and Configuring EMC Celerra Events and Notifications describes how to use them.


SavVol size requirements for remote replication


Replication SavVols store changes that occurred on the source file system not yet replayed to the destination file system. Celerra Replicator requires a SavVol at source and destination sites. The minimum SavVol size is 1 GB and the maximum is 500 GB. Determine the size of these SavVols based on the following criteria:

Investigate the source file system size, the update ratio of the source file system per day, and the WAN network bandwidth between the source and destination Celerra Network Servers. Use nas_fs -size to calculate the SavVol size and nas_disk -list to find the entire file system size. The EMC Celerra Network Server Command Reference Manual provides more command details.
Evaluate the risk tolerance to network outages. For example, if the network experiences long outages, such as two days, ensure that the SavVol on the source site will allow capturing two days of delta sets.
If performing replication on multiple file systems per Data Mover, consider the available network bandwidth per file system.
Determine whether the network bandwidth is sufficient to transfer the changed data from the source to the destination file system. If the rate of change on the source is continuously greater than the available network bandwidth, the replication service will not transfer data quickly enough and eventually becomes inactive.

You may also contact EMC Customer Service or read the E-Lab Interoperability Navigator on Powerlink to size the SavVols.
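As a quick check of the sizing inputs above, the source file system and disk layout can be listed directly (a sketch; src_ufs1 is the example source file system used later in this module, and both commands are the ones named in the first criterion):
$ nas_fs -size src_ufs1
$ nas_disk -list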

Change replication SavVol requirements


In most cases, the default size of the SavVol (10 percent of the source file system size) is sufficient. To accommodate a long network outage, to allow for brief periods where the incoming change rate significantly exceeds the ability of the network to send changes to the destination site, or both, consider increasing your replication SavVol size. For example, a 50 GB SavVol is sufficient for a 500 GB file system that incurs 20 GB of change per day to cover approximately two and one-half days of network outage without any flow-control events. To cover a longer outage period, you can enlarge the SavVol.


Table 6 on page 34 describes changing the replication SavVol size and explains when to use each method.
Table 6 Changing Celerra Replicator SavVol size

When to use: Before starting any replication processing.
  What to do: Change the default size of each Celerra Replicator SavVol from 10% of the source file system. By default, the system allocates 10% of the size of the source file system for the replication SavVol on the source and destination sites.
  Procedure: "Change the Celerra Replicator SavVol default size" on page 123

When to use: At the start of a replication instance.
  What to do: Control the SavVol size for a file system by specifying a specific SavVol size. Use the savsize option of fs_replicate -start.
  Procedure: Task 6: "Begin replication" on page 48

When to use: After replication is running.
  What to do: Revise the SavVol size to meet your changing requirements.
  Procedure: "Suspend a replication relationship" on page 81

Determine the number of replications per Data Mover


Determine the maximum number of replication sessions per Data Mover based on your configuration (such as the WAN network bandwidth, delta-set update ratio, and the production I/O workload). This number is also affected by whether you are running SnapSure and Celerra Replicator on the same Data Mover. Both of these applications share the available memory on a Data Mover. For all configurations, there is an upper limit to the number of replications allowed per Data Mover. E-Lab Interoperability Navigator on Powerlink details the current number of replications allowed per Data Mover.
Note: If you plan to run loopback replications, remember that each loopback replication counts as two replication sessions because each session encapsulates outgoing and incoming replications.

To learn the number of replication sessions per Data Mover:
1. Determine the SavVol size of each replication, as described in "SavVol size requirements for remote replication" on page 33.
2. Verify that the total storage on a Data Mover (including any source and destination file systems and associated SavVols) does not exceed the guidelines for that Data Mover. These guidelines are detailed in the E-Lab Interoperability Navigator on Powerlink.
3. Verify that the delta sets per Data Mover can be transferred to the destination site with the available WAN network bandwidth; Celerra Replicator should be transferring delta sets faster than it creates them. An active flow-control condition may indicate that this is not the case.
Note: To provide a stable network transfer rate for delta sets, it is strongly recommended that you configure a dedicated network port for Data Mover transfers.

4. Verify that the Data Mover can handle all replication sessions and production I/Os. You can also monitor memory usage and CPU usage using the server_sysstat command. This command shows total memory utilization, not just Celerra Replicator and SnapSure memory usage.
Note: Use Celerra Manager to monitor memory and CPU usage by creating a new notification on Celerras > Notifications > Data Mover Load tab.
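For example, a minimal check of overall Data Mover load using the command named in step 4 (server_2 is the Data Mover name used throughout this module):
$ server_sysstat server_2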

Contact EMC Customer Service for additional advice.

Configuration considerations
Before setting the replication policy triggers, consider the following:

To prevent the source and destination file systems from becoming out of sync, do not allow the replication service to create delta sets significantly faster than it can copy them to the destination file system. Set the delta-set creation replication policy to a higher number (for example, 1200 seconds) than the delta-set playback number (for example, 600 seconds). The replication policies you establish for creating and replaying delta sets depend on the size and number of transactions processed on the source file system.
Determine whether the network bandwidth can effectively transport the production change data generated at the source site to the destination site.
During delta-set playback on the destination file system, network clients can access the destination file system. However, at the beginning of delta-set playback for CIFS (Common Internet File Service) clients, there is a temporary freeze/thaw period that may cause a network disconnect. As a result, do not set the replication policy to a low number because this reduces the availability of the destination file system. To eliminate this freeze/thaw period, create a checkpoint of the destination file system and mount it for client access at the destination site. However, this checkpoint will not contain the most up-to-date production data.

Carefully evaluate the infrastructure of the destination site by reviewing items such as:
Subnet addresses
Unicode configuration
Availability of name resolution services (for example, WINS, DNS, and NIS)
Availability of WINS/PDC/BDC/DC in the correct Microsoft Windows NT or Windows Server domain
Share names
Availability of user mapping (for example, using EMC Usermapper for Celerra systems)

The CIFS environment requires more preparation to set up a remote configuration than the network file system (NFS) environment because of the higher demands on its infrastructure (for example, authentication is handled by the domain controller). For the CIFS environment, you must map the usernames/groups to UIDs/GIDs with EMC Usermapper or local group/password files on the Data Movers.
Note: Replicating EMC Celerra CIFS Environments (V1) describes configuration considerations in depth.

Local groups are not supported on replicated file systems unless you use VDMs. Replicating EMC Celerra CIFS Environments (V1) describes this consideration more fully.
The replication SavVol for the delta sets must be large enough to store and process all the delta-set write I/Os, and the SnapSure SavVol for the checkpoints must be able to store all the source file system block changes for the initial synchronization.
The destination file system can be mounted on only one Data Mover, even though it is read-only. At the application level, as well as the operating system level, some applications might have limitations on the read-only destination file system due to caching and locking.
If you plan to enable international character sets (Unicode) on your source and destination sites, you must first set up translation files on both sites before starting Unicode conversion on the source site. Using International Character Sets with EMC Celerra covers this consideration.
The Celerra FileMover feature supports replicated file systems, as described in Using EMC Celerra FileMover.
The Celerra File-Level Retention capability supports replicated file systems. Using File-Level Retention on EMC Celerra provides additional configuration information.


User interface choices for Celerra Replicator


Celerra Network Server offers flexibility in managing networked storage based on your support environment and interface preferences. This technical module describes how to set up and manage replication using the command line interface (CLI). You can also perform most tasks using Celerra Manager Basic Edition. The following documents provide additional information about managing Celerra:

Getting Started with Celerra details user interface choices.
Learning about EMC Celerra on the EMC Celerra Network Server Documentation CD and each application's online help system describe each application's capabilities.
The EMC Celerra Network Server Release Notes provide additional, late-breaking information about Celerra management applications.


Roadmap for Celerra Replicator


This section lists the tasks for configuring and managing Celerra Replicator.
Celerra Replicator configuration tasks:
1. "Initiating replication" on page 39
2. "Recovering replication data" on page 61
Celerra Replicator management tasks:

"Abort Celerra Replicator" on page 79 "Suspend a replication relationship" on page 81 "Restarting a replication relationship" on page 89 "Extending the size of a file system" on page 98 "Resetting replication policy" on page 105 "Reverse the direction of a replication relationship" on page 111 "Monitor replication" on page 114 "Checking playback service and outstanding delta sets" on page 115 "Events for Celerra Replicator" on page 121 "Change the Celerra Replicator SavVol default size" on page 123 "Change the passphrase between Celerra Network Servers" on page 124 "Managing and avoiding IP replication problems" on page 125 "Transporting replication data using disk or tape" on page 139 "Setting up the CLARiiON disk array" on page 147


Initiating replication
In most cases, you will have a functioning NFS or CIFS environment before you use Celerra Replicator. If you do not, ensure that you set up the following on the source and destination Celerra Network Servers before using Celerra Replicator:

Establish the IP infrastructure.
Establish name service. Read Configuring EMC Celerra Naming Services for more information about establishing name service.
Synchronize the time between the Data Movers and Control Stations involved in the replication relationship. The maximum allowable time skew is 10 minutes (a quick check is shown after this list). Read Configuring EMC Celerra Time Services for more information.
Establish user mappings. Read Configuring EMC Celerra User Mapping for more information about establishing user mappings.
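For example, one way to compare the current time on a Data Mover against the Control Station clock (a sketch; server_date is the standard Celerra command for displaying a Data Mover's date and time, and server_2 is the Data Mover name used in this module):
$ server_date server_2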

The process of setting up a local or remote replication relationship assumes the following:

The source file system is created and mounted as read/write on a Data Mover (see the check after this list).
The destination file system is not created.
The Celerra Network Server version at the destination site must be the same as the version at the source site.
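For example, to confirm that the source file system is mounted read/write on its Data Mover (a sketch; server_2 is the Data Mover used in the examples that follow, and rw in the output indicates a read/write mount):
$ server_mount server_2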

Note: The communication between Celerra Control Stations uses HTTPS.

When using remote replication, it is useful to create SnapSure checkpoints of the source file system to restart the source and destination sites if they fall out-of-sync. After you create the checkpoints, the system automatically keeps them up to date, as described in "Out-of-sync replication relationship" on page 20. To set up replication, complete the following tasks:
1. "Establish communication" on page 40
2. "Verify communication" on page 40
3. "Create SnapSure checkpoint of source file system" on page 42
4. "Create the destination file system" on page 45
5. "Copy checkpoint to the destination file system" on page 45
6. "Begin replication" on page 48
7. "Create a second checkpoint of the source file system" on page 50
8. "Copy incremental changes" on page 52
9. "Verify file system conversion" on page 54
10. "Check replication status" on page 55
11. "Create restartable checkpoints" on page 59


Note: The first two tasks are used for remote replication and do not apply to setting up a local replication relationship. The commands for setting up a local replication relationship are included for your reference. The output reflects a remote replication.

Task 1: Establish communication


To establish communication:
Action
To establish a trusted relationship at each site, logged in as root, use this command syntax:
# nas_cel -create <cel_name> -ip <ip> -passphrase <passphrase>
where:
<cel_name> = name of the remote (destination) Celerra Network Server in the configuration
<ip> = IP address of the remote Control Station in slot 0
<passphrase> = secure passphrase used for the connection, which must be 6 to 15 characters and be the same on both sides of the connection
Example: To set up a trust relationship, type the following commands at both sites:
[source_site]# nas_cel -create eng25271 -ip 172.24.252.71 -passphrase nasadmin
[destination_site]# nas_cel -create eng25246 -ip 172.24.252.46 -passphrase nasadmin
Note: If you need to change the passphrase later, follow the procedure described in "Change the passphrase between Celerra Network Servers" on page 124.

Output
From source site eng25271, to set up relationship with destination site eng25246:
operation in progress (not interruptible)...
id         = 3
name       = eng25271
owner      = 0
device     =
channel    =
net_path   = 172.24.252.71
celerra_id = 0001901003890010
passphrase = nasadmin

Task 2: Verify communication


This task is performed only for remote replication. To verify communication:

"Verify communication at the source site" on page 41 "Verify communication at the destination site" on page 41 "View passphrase" on page 42


Verify communication at the source site


At the source site, check whether Celerra Network Servers can communicate with one another.
Action
To verify that the source and destination sites can communicate with each other, type the command at each site: [source_site]$ nas_cel -list

Output
id   name        owner  mount_dev  channel  net_path          CMU
0    cs100       0                          192.168.168.114   APM000340000680000
1    eng168123   201                        xxx.xxx.xxx.xxx   APM000437048940000
3    eng16853    501                        xxx.xxx.xxx.xxx   0001835017370000
5    cs110       503                        192.168.168.102   APM000446038450000

Note
The sample output shows the source site can communicate with the destination site, cs110.

Verify communication at the destination site


At the destination site, check whether Celerra Network Servers can communicate with one another.
Action
To verify that the source and destination sites can communicate with each other, at each site, type: [destination_site]$ nas_cel -list

Output
id   name    owner  mount_dev  channel  net_path          CMU
0    cs110   0                          192.168.168.102   APM000446038450000
2    cs100   501                        192.168.168.114   APM000340000680000

Note
The sample output shows the destination site can communicate with the source site, cs100.


View passphrase
The 6- to 15-character passphrase is used to authenticate with a remote Celerra Network Server.
Action
To view the passphrase of a Celerra Network Server, use this command syntax: $ nas_cel -info id=<cel_id> where: <cel_id> = Celerra ID

Note: Celerra ID is assigned automatically. To view this ID of a remote system, use the nas_cel -list command. You can also use the hostname.
Example: To view the passphrase of the Celerra system, type: $ nas_cel -info id=5

Output
id         = 5
name       = cs110
owner      = 503
device     =
channel    =
net_path   = 192.168.168.102
celerra_id = APM000446038450000
passphrase = nasadmin

Task 3: Create SnapSure checkpoint of source file system


A SnapSure checkpoint is used as the baseline of data to copy to the destination file system. Copying this baseline data from the source to the destination site over the IP network can be a time-consuming process. When using remote replication, you can use an alternate method to copy the initial checkpoint of the source file system. You can back it up to a disk array or tape drive and transport it to the destination site. To use this alternate method, go to "Transporting replication data using disk or tape" on page 139 to continue with this procedure. Using SnapSure on EMC Celerra provides details on SnapSure checkpoints.

CAUTION
When creating checkpoints, be careful not to exceed your system's limit. Celerra permits 96 checkpoints per PFS, regardless of whether the PFS is replicated, for all systems except the Model 510 Data Mover (which permits 32 checkpoints with PFS replication and 64 checkpoints without). This limit counts existing checkpoints, including those already created by a schedule, and might count two restartable checkpoints as well as a third checkpoint created by certain replication operations on either the PFS or SFS. If you are at the limit, delete existing checkpoints to create space for newer checkpoints, or do not create new checkpoints if the existing ones are more important. Be aware that when you start to replicate a file system, the facility must create two checkpoints; otherwise, replication will not start. For example, if you have 95 checkpoints and want to start a replication, the 96th checkpoint is created, but replication fails when the system tries to create the 97th checkpoint because the limit is exceeded.

Also, when scheduling, be careful not to keep any checkpoints that will surpass the limit; otherwise you cannot start a replication. In other words, if all checkpoints you specify to keep are created, they must be within the limit.
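For example, to review how many checkpoints a source file system already has, and to remove one that is no longer needed before starting replication (a sketch; the checkpoint name in the delete command is illustrative):
$ fs_ckpt src_ufs1 -list
$ nas_fs -delete src_ufs1_old_ckpt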

Action
To create a SnapSure checkpoint of the source file system, use this command syntax: $ fs_ckpt <fs_name> -Create where: <fs_name> = file system name on which a checkpoint is created Remote replication example: To create a checkpoint of the source file system src_ufs1, type: $ fs_ckpt src_ufs1 -Create Local replication example: To create a checkpoint of the source file system local_src, type: $ fs_ckpt local_src -Create Note: The output shown is for the remote replication example.


Output
operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_ckpt1 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 id = 90 name = src_ufs1_ckpt1 acl = 0 in_use = True type = ckpt worm = off volume = vp246 pool = clar_r5_performance member_of = rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = checkpt_of= src_ufs1 Mon Feb 7 06:58:10 EST 2005 used = 1% full(mark)= 90% stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2


Task 4: Create the destination file system


The destination file system is created as rawfs because replication starts on a rawfs file system. When all steps to set up replication are complete, the file system will be a normal uxfs type. The destination file system must be the same size as the source file system. Managing Celerra Volumes and File Systems Manually and the appropriate man page describe how to create and use a file system.
Step
1.

Action
Create a destination file system using the samesize= option by typing: $ nas_fs -name dst_ufs1 -type rawfs -create samesize=src_ufs1:cel=cs100 pool=clar_r5_performance Local replication example: $ nas_fs -name local_dst -type rawfs -create samesize=local_source pool=clar_r5_performance

2.

Create a mount point on the destination Data Mover by typing:
$ server_mountpoint server_2 -create /dst_ufs1
Local replication example:
$ server_mountpoint server_2 -create /local_dst

3.

Mount the file system as read-only on the destination Data Mover by typing: $ server_mount server_2 -option ro dst_ufs1 /dst_ufs1 Local replication example: $ server_mount server_2 -option ro local_dst /local_dst Note: The destination file system can only be mounted on one Data Mover, even though it is read-only.

Task 5: Copy checkpoint to the destination file system


This task copies the entire checkpoint of the source file system created in Task 3: "Create SnapSure checkpoint of source file system" on page 42 to the destination file system. This creates a baseline copy of the source file system on the destination file system. This copy is updated incrementally with changes occurring to the source file system. Perform this copy task once per file system that is to be replicated. The checkpoint must be copied without converting it to uxfs, by using the convert=no option. Use of the monitor=off option runs this command as a background process, allowing you to run several copy sessions simultaneously.
Note: If the primary file system extends during the running of the fs_copy command and before replication starts in Task 6: "Begin replication" on page 48, you must extend the destination file system manually to keep file system sizes identical. Use the nas_fs -xtend command.


Action
To copy a checkpoint to the destination file system, use this command syntax:
$ fs_copy -start <srcfs> <dstfs>:cel=<cel_name> -option convert=no,monitor=off
where:
<srcfs> = source file system checkpoint.
<dstfs> = destination file system.
<cel_name> = destination Celerra Network Server name.
-option convert=[yes|no] = allows the conversion of the <dstfs> to uxfs after the file system copy is executed. If no is specified, the <dstfs> remains a rawfs file system type when the copy has completed. The default is yes.
-option monitor=off = progress of the copy is printed to the screen by default. The off option forces the command to run as a background process.
Remote replication example: To copy the checkpoint, src_ufs1_ckpt1, to the destination file system without converting it to uxfs, type:
$ fs_copy -start src_ufs1_ckpt1 dst_ufs1:cel=cs110 -option convert=no,monitor=off
Local replication example: To copy the checkpoint, local_src_ckpt1, to the destination file system, local_dst, type:
$ fs_copy -start local_src_ckpt1 local_dst -option convert=no,monitor=off
Note: The output shown is for the remote replication example.

Output
operation in progress (not interruptible)...id = 90 name = src_ufs1_ckpt1 acl = 0 in_use = True type = ckpt worm = off volume = vp246 pool = clar_r5_performance member_of = rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = checkpt_of= src_ufs1 Mon Feb 7 06:58:10 EST 2005 used = 1% full(mark)= 90% stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2


Output
id = 126 name = dst_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3:cs110 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = src_ufs1 Mon Feb 7 06:58:10 EST 2005 stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 done

Verify the copied checkpoint to the destination file system


Action
To verify that the fs_copy command completes, type: $ fs_copy -list

Output
Local Source Filesystems
Id   Source   Destination   Status   %Remaining   CommState

Local Destination Filesystems
Id   Source   Destination   Status   %Remaining   CommState

Note
The fs_copy session is not listed in the output indicating the copy is complete.


Task 6: Begin replication


When you start replication, the system verifies that primary and secondary Data Movers can communicate with each other. Next, it starts replicating and then begins tracking all changes made to the source file system. You start this process once per file system to be replicated. Set your replication policies when you establish this replication relationship. "Replication policies" on page 29 describes this feature. If you want to specify a specific interface or IP address for the replication relationship, do so when you start replication. If you specify an interface for the source site, replication uses that interface until it is changed by the user. If you allow the system to select the interface, it can change to keep the replication relationship running. For example, if the network interface currently being used becomes unavailable, the system attempts to select another interface. If one is found, the replication relationship continues to function. The destination interface, regardless of how selected, is unchanged by the system. Any future changes to this information requires suspending and restarting replication, as detailed in "Suspend a replication relationship" on page 81 and "Restarting a replication relationship" on page 89.
Action
To start replication for the first time, use this command syntax: $ fs_replicate -start <srcfs> <dstfs>:cel=<cel_name> where: <srcfs> = source file system <dstfs> = destination file system <cel_name> = destination Celerra Network Server Note: Multiple fs_replicate -start processes must be executed sequentially, not in parallel. Run only one fs_replicate -start command at a time. Remote replication example: To start replication for source file system src_ufs1 and a destination file system dst_ufs1, type: $ fs_replicate -start src_ufs1 dst_ufs1:cel=cs110 Local replication example: To start replication for source file system local_src and a destination file system local_dst, type: $ fs_replicate -start local_src local_dst Note: The output shown is for the remote replication example.


Output
operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_ckpt1 ip_copies = dst_ufs1:cs110 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 id = 126 name = dst_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3:cs110 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = src_ufs1 Mon Feb 7 06:58:10 EST 2005 stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 done


Note
The system selects the interface if you do not specify one.
Error messages:
If you receive an error message stating the interface is not configured or is invalid, the IP addresses for the interface ports are not configured on the destination site. Define these interface ports by running the server_ifconfig command at the destination site.
If Error 2211: Sec: invalid id specified appears, the local and destination sites have different passphrases. To modify this, follow the procedure in "Change the passphrase between Celerra Network Servers" on page 124.
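For example, to review which interfaces and IP addresses are configured on the destination Data Mover (a sketch; server_2 is the Data Mover used in the examples, and server_ifconfig is the command named in the error description above):
$ server_ifconfig server_2 -all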

Task 7: Create a second checkpoint of the source file system


The changes between the two checkpoints are copied to the destination file system in the next task.
Action
To create a second checkpoint of the source file system, which is compared to the initial checkpoint, use this command syntax: $ fs_ckpt <fs_name> -Create where: <fs_name> = file system name for which a checkpoint is created Remote replication example: To create a SnapSure checkpoint of source file system src_ufs1, type: $ fs_ckpt src_ufs1 -Create Local replication example: To create a SnapSure checkpoint of source file system local_src, type: $ fs_ckpt local_src -Create Note: The following output shown is for the remote replication example.


Output
operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_ckpt1,src_ufs1_ckpt2 ip_copies = dst_ufs1:cs110 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 id = 97 name = src_ufs1_ckpt2 acl = 0 in_use = True type = ckpt worm = off volume = vp246 pool = clar_r5_performance member_of = rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = checkpt_of= src_ufs1 Mon Feb 7 07:05:00 EST 2005 used = 3% full(mark)= 90% stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2


Task 8: Copy incremental changes


To ensure that the file system type is uxfs when the copy completes, do not use the convert=no option. Use of the monitor=off option runs this command as a background process, which enables you to run several copy sessions simultaneously.
Action
To copy the incremental changes (the delta set) between the two source file system checkpoints to the destination file system, use this command syntax:
$ fs_copy -start <new_check_point> <dstfs>:cel=<cel_name> -fromfs <previous_check_point> -option monitor=off
where:
<new_check_point> = last checkpoint taken, as described in Task 7: "Create a second checkpoint of the source file system" on page 50.
<dstfs> = destination file system.
<cel_name> = Celerra Network Server where the destination file system resides.
<previous_check_point> = first checkpoint taken.
-option monitor=off = progress of the copy is printed to the screen by default. The off option forces the command to run as a background process.
Remote replication example: The command to use here differs, depending on how you copied your source file system to the destination site. Use the -Force option only if you used a physical transport method (disk or tape). To copy the incremental changes between checkpoints of source file system src_ufs1 to destination file system dst_ufs1, type:
$ fs_copy -start src_ufs1_ckpt2 dst_ufs1:cel=cs110 -fromfs src_ufs1_ckpt1 -option monitor=off
If you used a physical transport to perform the original source file system copy to the destination site, type:
$ fs_copy -start src_ufs1_ckpt2 dst_ufs1:cel=cs110 -fromfs src_ufs1_ckpt1 -Force -option monitor=off
Local replication example: To copy the incremental changes between checkpoints of source file system local_src, type:
$ fs_copy -start local_src_ckpt2 local_dst -fromfs local_src_ckpt1 -option monitor=off
Note: The output shown is for the remote replication example.


Output
operation in progress (not interruptible)...id = 97 name = src_ufs1_ckpt2 acl = 0 in_use = True type = ckpt worm = off volume = vp246 pool = clar_r5_performance member_of = rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = checkpt_of= src_ufs1 Mon Feb 7 07:05:00 EST 2005 used = 3% full(mark)= 90% stor_devs = APM00034000068-001F,APM00034000068-001E disks . = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 id = 126 name = dst_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3:cs110 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = src_ufs1 Mon Feb 7 07:05:00 EST 2005 stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 done


Verify the copied incremental changes


Action
To verify that the fs_copy command completes, type: $ fs_copy -list

Output
Local Source Filesystems
Id   Source   Destination   Status   %Remaining   CommState

Local Destination Filesystems
Id   Source   Destination   Status   %Remaining   CommState

Note
The fs_copy session is not listed in the output which indicates that the copy is complete.

Task 9: Verify file system conversion


At the destination site, verify file system conversion:
Action
To verify that the file system is converted to a uxfs type file system, type: [destination_site]$ nas_fs -info dst_ufs1


Output
id = 126 name = dst_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3:cs110 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = src_ufs1 Mon Feb 7 07:05:00 EST 2005 stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2

Task 10: Check replication status


You can optionally check the status of all replication sessions that are running on a Celerra system and an individual replication session:

"List all replication sessions (optional)" on page 55 "List individual replication session (optional)" on page 57

List all replication sessions (optional)


Action
To list the current active replication sessions, type: $ fs_replicate -list


Output
Source:
Local Source Filesystems
Id    Source           FlowCtrl  State   Destination      FlowCtrl  State   Network
138   src_ufs1         inactive  active  dst_ufs1:cs110   inactive  active  alive

Local Destination Filesystems
Id    Source           FlowCtrl  State   Dest.            FlowCtrl  State   Network

Destination:
Local Source Filesystems
Id    Source           FlowCtrl  State   Destination      FlowCtrl  State   Network

Local Destination Filesystems
Id    Source           FlowCtrl  State   Dest.            FlowCtrl  State   Network
165   src_ufs1:cs100   inactive  active  dst_ufs1         inactive  active  alive

Note
For local replication, the Destination FlowCtrl status and Network status always contain N/A (not applicable).


List individual replication session (optional)


Execute this command from either Celerra Network Server; the system reports the replication status of each current delta set on the source and destination sites.
Action
To check the replication status and generate historical data about the replication up to the number of lines specified, use this command syntax: $ fs_replicate -info <fs_name> -verbose <number_of_lines> where: <fs_name> = name of the file system (in the example it is the source file system). <number_of_lines> = lines to display historical replication data. The maximum number is 128. Remote replication example: To check the replication status of the replication relationship, type: $ fs_replicate -info src_ufs1 -verbose 10 Local replication example: To check the replication status of the replication relationship, type: $ fs_replicate -info local_src -verbose 10 Note: The following output shown is for the remote replication example. "Appendix A: fs_replicate -info output fields" on page 164 gives the output definition description for the fs_replicate -info command.


Output
id                        = 88
name                      = src_ufs1
fs_state                  = active
type                      = replication
replicator_state          = active
source_policy             = NoPolicy
high_water_mark           = 600
time_out                  = 600
current_delta_set         = 3
current_number_of_blocks  = 1
flow_control              = inactive
total_savevol_space       = 1048576 KBytes
savevol_space_available   = 917504 KBytes (Before Flow Control)

id                        = 126
name                      = dst_ufs1:cs110
type                      = playback
playback_state            = active
high_water_mark           = 600
time_out                  = 600
current_delta_set         = 3
flow_control              = inactive
total_savevol_space       = 1048576 KBytes
savevol_space_available   = 786432 KBytes (Before Flow Control)

outstanding delta sets: <None>

communication_state       = alive
current_transfer_rate     = ~ 13312 Kbits/second
avg_transfer_rate         = ~ 169984 Kbits/second
source_ip                 = 192.168.168.18
source_port               = 57273
destination_ip            = 192.168.168.20
destination_port          = 8888
QOS_bandwidth             = 0 kbits/sec

     |           Source              |           Destination
Delta|Create Time       Dur   Blocks |Playback Time      Dur   Blocks  DSinGroup
-----|----------------- ----- -------|------------------ ----- ------- ---------
    2|2005-02-08 00:20      1      1 |
    1|2005-02-08 00:20           128 |2005-02-08 00:20           128           2
    0|2005-02-08 00:10           333 |2005-02-08 00:10           333           1

Note
All times are GMT. Block size is 8 KB.


Task 11: Create restartable checkpoints


To ensure that a valid checkpoint is available to restart an out-of-sync replication relationship, create two checkpoints of the source file system. These restartable checkpoints, named <source_fs_name>_repl_restart_1 and <source_fs_name>_repl_restart_2, are automatically refreshed by a system CRON job that runs every hour at 25 minutes after the hour. When a replication relationship is out-of-sync, these checkpoints are available for the restart process. You must include these two checkpoints when determining the maximum number of checkpoints per file system used with replication.
Action
To create checkpoints for use when a replication relationship falls out-of-sync, use this command syntax:
$ fs_ckpt <fs_name> -name <name> -Create
where:
<fs_name> = file system name on which a checkpoint is created
<name> = name of the restartable checkpoint, which must follow the naming convention <source_fs_name>_repl_restart_1 and <source_fs_name>_repl_restart_2
Example: To create the checkpoints of the source file system for use when a replication relationship falls out-of-sync, type:
$ fs_ckpt src_ufs1 -name src_ufs1_repl_restart_1 -Create
$ fs_ckpt src_ufs1 -name src_ufs1_repl_restart_2 -Create
The output only shows the creation of the first checkpoint.


Output
operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_ckpt1,src_ufs1_ckpt2,src_ufs1_repl_restart_1 ip_copies = dst_ufs1:cs110 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 id = 100 name = src_ufs1_repl_restart_1 acl = 0 in_use = True type = ckpt worm = off volume = vp246 pool = clar_r5_performance member_of = rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = checkpt_of= src_ufs1 Mon Feb 7 07:14:26 EST 2005 used = 4% full(mark)= 90% delta_number= 1 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2
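If you do not want to wait for the hourly CRON refresh, a restartable checkpoint can also be refreshed manually (a sketch; -refresh is the standard SnapSure checkpoint refresh operation, and the checkpoint name matches the example above):
$ fs_ckpt src_ufs1_repl_restart_1 -refresh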


Recovering replication data


Occasionally, you might need to recover from a situation in which a replication relationship must be terminated because of a disaster. For example, the source site experiences a disaster and becomes unavailable for data processing, or the source file system becomes corrupted and is unrecoverable. In this situation:
1. Perform a failover from the site where you replicate your data. A checkpoint is created on the destination site, which allows the system to track changes that occur after the failover. The failover process makes the destination file system read/write. "Replication failover" on page 62 provides more information.
2. When the source site becomes available again, resynchronize your source and destination sites and restart replication. Note that after resynchronization, replication runs in the reverse direction. "Resynchronize the source and destination sites" on page 66 provides more information.
3. Use a replication reversal to restore the direction of replication to what it was prior to the failover. When you schedule a replication reversal, write activity on the destination file system is stopped, and the changes are applied to the source file system before the source site becomes read/write. The reversal process synchronizes the source and destination file systems. "Replication reversal" on page 74 provides more information.
Note: Replication failover, resynchronization, and reversal must be performed sequentially. For example, if you replicate multiple file systems from one Celerra system to another, you must run separate failover commands and wait for one command to complete before running the next command.

Table 7 on page 61 shows the relationships between the source and destination file systems when replication is initiated, failed over, resynchronized, and reversed.
Table 7 Replication file system relationships

Replication option: Begin replication
  Source site: Source file system is read/write.
  Destination site: Destination file system is read-only.
  Explanation: Normal replication processing establishes the source file system as read/write and the destination file system as read-only.

Replication option: Failover
  Source site: Source file system becomes read-only, if the source site is still available.
  Destination site: Destination file system becomes read/write.
  Explanation: Changes which file system acts as the source file system and which acts as the destination file system. Brings the destination file system to read/write to service the I/O in the case of disaster.

Replication option: Resynchronize
  Source site: Source file system remains read-only.
  Destination site: Destination file system remains read/write.
  Explanation: Use to reestablish replication after a failover. Both sites must be available.

Replication option: Reversal
  Source site: The read-only file system becomes read/write.
  Destination site: The read/write file system becomes read-only.
  Explanation: Changes which file system acts as the source file system and which acts as the destination file system. Perform a reversal from whichever file system is read/write. When used after a failover, restores the direction of replication to what it was prior to the failover. Both sites must be available.

Task 1: Replication failover


Failover is the process that changes the destination file system from read-only to read/write and stops the transmission of replicated data. The source file system, if available, becomes read-only. Use the failover operation if the source site is unavailable due to a disaster or if the source site is still available but you want to activate the destination file system as read/write. Perform a failover using the options specified in Table 8 on page 62.
Table 8 Failover options

Failover option: default
  Use if: The source site is totally corrupt or unavailable.
  What happens: Plays back all available delta sets at the destination site before failing over.
  Site that must be available: Destination site

Failover option: now
  Use if: The source site is totally corrupt or unavailable.
  What happens: Initiates an immediate failover and no delta sets are played back. Note: If you perform a failover using this option and delta sets are in the SavVol at the destination site, an incremental resynchronization might not be possible in all cases.
  Site that must be available: Destination site

Failover option: sync
  Use if: The source site is still available.
  What happens: Fails over without any data loss by making the source file system read-only, making the destination file system read/write, and creating a restart checkpoint of the destination file system. Synchronized failover takes longer to invoke and cannot be performed if the source site is unavailable. It is more suited to a maintenance-related failover as part of a failover plan. Note: The sync option is not used in a disaster situation because both sites must be available.
  Site that must be available: Source and destination

Using the failover command when the source site is unavailable results in data loss because delta sets cannot be transferred from the source site to the destination site. Using the default failover option reduces the amount of data loss, by replaying any available delta sets (pending data) at the destination before initiating failover. Failover processing creates a file system checkpoint to use to resynchronize the replication relationship. You must include this checkpoint when determining the maximum number of checkpoints per file system used with replication. When a failover completes, replication is stopped and the destination file system becomes read/write. Replication is no longer running because the source site is usually unavailable when a failover is initiated. If the source site becomes available, reestablish replication in the opposite direction (from the destination site to the source site) by resynchronizing the source and destination file systems. For replication failover:

"Verify status of destination file system" on page 64 "Initiate replication failover" on page 64 "Verify file system is read/write" on page 66


Verify status of destination file system


Action
To check the status of the destination file system and verify if it is mounted as read-only, use this command syntax: $ server_mount <movername> where: <movername> = name of the Data Mover on which the file system is mounted Example: To verify the status of the destination file systems on server_2, type: $ server_mount server_2 Note: The ro in the output indicates a read-only file system.

Output
server_2 :
root_fs_2 on / uxfs,perm,rw
fs1 on /fs1 uxfs,perm,rw
ckpt1 on /ckpt1 ckpt,perm,ro
fsk on /fsk ckpt,perm,ro
root_fs_common on /.etc_common uxfs,perm,ro
dst_ufs1 on /dst_ufs1 uxfs,perm,ro

Initiate replication failover


Action
To initiate a failover from the destination site, use this command syntax: [destination_site]$ fs_replicate -failover <srcfs>:cel=<cel_name> <dstfs> -option now|sync where: <srcfs> = source file system name <cel_name> = Celerra Network Server name of the source site <dstfs> = destination file system name Example: To fail over the source file system to the destination file system, type: $ fs_replicate -failover src_ufs1:cel=cs100 dst_ufs1


Output
operation in progress (not interruptible)...id = 126 name = dst_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = root_restart_ckpt_88_2 stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10:cs100 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_repl_restart_1:cs100,src_ufs1_repl_restart_2:cs100 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 done
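If both sites are still available and you want a maintenance-style failover with no data loss, the sync option can be added (a sketch based on the syntax shown above, using the same file system and Celerra names as the example; the output shown above is for the default failover):
[destination_site]$ fs_replicate -failover src_ufs1:cel=cs100 dst_ufs1 -option sync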


Verify file system is read/write


Action
To verify that the file system is mounted as read/write and is accessible to the network clients, type: $ server_mount server_2

Output
server_mount server_2
server_2 :
root_fs_2 on / uxfs,perm,rw
root_fs_common on /.etc_common uxfs,perm,ro
dst_ufs1 on /dst_ufs1 uxfs,perm,rw

Task 2: Resynchronize the source and destination sites


After the failover command is run, a checkpoint is created on the destination site and the destination file system becomes read/write. New writes are then allowed on the destination file system. When replication is resynchronized, default values are used for the replication policies. For example, high water mark and timeout are set to 600. You can specify new policies when you restart replication using any of these options:

autofullcopy={yes}
to=<timeout>
dto=<destination timeout>
hwm=<high_water_mark>
dhwm=<destination high_water_mark>
qos=<qos>
autoro={yes|no}
autofreeze={yes|no}

Note: If you need to increase your file system's size and plan to resynchronize your source and destination sites after a failover, you must complete the resynchronization (fs_replicate -resync) before increasing the size of your destination file system.
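For illustration only, the sketch below restarts replication in the reverse direction with explicit policy values. The example names (src_ufs1, cs100, dst_ufs1) come from this procedure; combining the policy options in a single comma-separated -option list is an assumption based on the -option syntax used with fs_replicate -modify later in this document, and the 300/300 values are arbitrary:

$ fs_replicate -resync src_ufs1:cel=cs100 dst_ufs1 -option hwm=300,to=300,autofullcopy=yes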

To resynchronize the source and destination sites:


"Verify file system is read-only" on page 67 "Resynchronize the source and destination file systems and restart replication" on page 68


Verify file system is read-only


Step 1.
Action: Before resynchronizing the file systems, verify whether the file system on the original source site is mounted as read-only by typing:
$ server_mount server_2
Note: If the source file system is not mounted read-only, the following message appears:
Error 4124: <file_system> : is not mounted ro
Result:
server_2 :
root_fs_2 on / uxfs,perm,rw
root_fs_common on /.etc_common uxfs,perm,ro
src_ufs1 on /src_ufs1 uxfs,perm,ro

Step 2.
Action: If the file system (src_ufs1 in this example) is read/write, change it to read-only by typing:
$ server_mount server_2 -option ro src_ufs1
Result:
server_2 : done


Resynchronize the source and destination file systems and restart replication
Action
To attempt to resynchronize the source and destination file systems and restart replication from the destination site, use this command syntax:
[destination_site]$ fs_replicate -resync <dstfs>[:cel=<cel_name>] <srcfs> -option autofullcopy=yes
where:
<dstfs> = current read-only file system name
<cel_name> = name of the Celerra Network Server of the original source site
<srcfs> = current read/write file system name
-option autofullcopy=yes = executes a full copy of the file system if an incremental resynchronization does not complete
Example: To resynchronize the file systems and resume replication, type:
$ fs_replicate -resync src_ufs1:cel=cs100 dst_ufs1
For the system to automatically perform a full copy of the source file system if the incremental resynchronization fails, type:
$ fs_replicate -resync src_ufs1:cel=cs100 dst_ufs1 -option autofullcopy=yes
The full copy of the file system using autofullcopy=yes can be time-consuming. Consider when you want to run this command.
Note: If a disaster occurs during the transfer, some delta sets might become lost. As a result, the replication process will not be able to completely replicate the source file system on the destination site.

Output: Convert to rawfs


operation in progress (not interruptible)...Converting filesystem type id = 88 name = src_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = ckpts = src_ufs1_repl_restart_2,src_ufs1_repl_restart_1 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2


Output: Start copy


Starting baseline copy... operation in progress (not interruptible)...id = 133 name = root_restart_ckpt_88_2 acl = 0 in_use = True type = ckpt worm = off volume = vp284 pool = clar_r5_performance member_of = rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = checkpt_of= dst_ufs1 Mon Feb 7 07:25:11 EST 2005 used = 1% full(mark)= 90% delta_number= 4 stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 id = 88 name = src_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10:cs100 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = dst_ufs1 Mon Feb 7 07:25:11 EST 2005 ckpts = src_ufs1_repl_restart_2:cs100,src_ufs1_repl_restart_1:cs100 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 IP Copy remaining (%) 100..Done. done


Output: Start copy


operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = dst_ufs1:cs110 Mon Feb 7 07:25:11 EST 2005 ckpts = src_ufs1_repl_restart_2,src_ufs1_repl_restart_1 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 done


Output: Starting replication


Starting replication... operation in progress (not interruptible)...id = 126 name = dst_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = root_restart_ckpt_88_2 ip_copies = src_ufs1:cs100 stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 id = 88 name = src_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10:cs100 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = dst_ufs1 Mon Feb 7 07:25:11 EST 2005 ckpts = src_ufs1_repl_restart_2:cs100,src_ufs1_repl_restart_1:cs100 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 done


Output: Generating a new checkpoint


Generating new checkpoint...
operation in progress (not interruptible)...
id        = 126
name      = dst_ufs1
acl       = 0
in_use    = True
type      = uxfs
worm      = off
volume    = v272
pool      = clar_r5_performance
member_of = root_avm_fs_group_3
rw_servers= server_2
ro_servers=
rw_vdms   =
ro_vdms   =
ckpts     = root_restart_ckpt_88_2,root_new_ckpt_dst_ufs1
ip_copies = src_ufs1:cs100
stor_devs = APM00044603845-0008,APM00044603845-0007
disks     = d8,d9
disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2
disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2
disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2
disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2
id        = 140
name      = root_new_ckpt_dst_ufs1
acl       = 0
in_use    = True
type      = ckpt
worm      = off
volume    = vp284
pool      = clar_r5_performance
member_of =
rw_servers=
ro_servers= server_2
rw_vdms   =
ro_vdms   =
checkpt_of= dst_ufs1 Mon Feb 7 07:34:31 EST 2005
used      = 3%
full(mark)= 90%
stor_devs = APM00044603845-0008,APM00044603845-0007
disks     = d8,d9
disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2
disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2
disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2
disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2


Output: Starting differential copy


Starting diff copy... operation in progress (not interruptible)...id = 140 name = root_new_ckpt_dst_ufs1 acl = 0 in_use = True type = ckpt worm = off volume = vp284 pool = clar_r5_performance member_of = rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = checkpt_of= dst_ufs1 Mon Feb 7 07:34:31 EST 2005 used = 3% full(mark)= 90% stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 id = 88 name = src_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10:cs100 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = dst_ufs1 Mon Feb 7 07:34:31 EST 2005 ckpts = src_ufs1_repl_restart_2:cs100,src_ufs1_repl_restart_1:cs100 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 IP Copy remaining (%) 100..Done. done


Output: Deleting checkpoints that are used to restart replication


Deleting root_restart_ckpt_88_2... id = 133 name = root_restart_ckpt_88_2 acl = 0 in_use = False type = ckpt worm = off volume = rw_servers= ro_servers= rw_vdms = ro_vdms = Deleting root_new_ckpt_dst_ufs1... id = 140 name = root_new_ckpt_dst_ufs1 acl = 0 in_use = False type = ckpt worm = off volume = rw_servers= ro_servers= rw_vdms = ro_vdms = Operation complete done

Task 3: Replication reversal


A replication reversal is a scheduled change in the direction of replication and requires both sites to be available. If you successfully resumed replication using the -resync option, replication is now running in the reverse direction, from the destination site to the source site. You can now execute a reversal to change the direction of replication. This reversal returns the original source file system to read/write and the original destination file system to read-only. Write activity on the destination file system is stopped and any changes are applied to the source file system before the source site becomes read/write. The reversal process keeps the source and destination file systems synchronized. A reversal is usually performed after a failover and resynchronization, but it can be used at any time to reverse the replication direction. A combined command sketch follows the task list below.
Note: This can only be done when both sites are operational.

For replication reversal:


"Verify direction of replication process" on page 75 "Reverse the replication" on page 76


Verify direction of replication process


Before you reverse the replication direction, verify which file system is read/write and which is read-only by using the server_mount server_x command. An rw or ro entry indicates each file system's status. You then verify the direction.
Action
To verify the direction of your replication relationship, type:
$ fs_replicate -list
Note: Use this command on the site where the file system is read/write. In this example, the destination file system is read/write because of the failover.

Output
fs_replicate -list
Local Source Filesystems
Id   Source     FlowCtrl   State    Destination      FlowCtrl   State    Network
7    dst_ufs1   inactive   active   src_ufs1:cs100   inactive   active   alive

Local Destination Filesystems
Id   Source     FlowCtrl   State    Destination      FlowCtrl   State    Network

Note
Notice the file system named dst_ufs1 is acting as the source file system (read/write) and the file system named src_ufs1 is functioning as the destination file system (read-only).


Reverse the replication


Action
To initiate a reversal from the original destination site, use this command syntax:
[original_destination_site]$ fs_replicate -reverse <dstfs>:cel=<cel_name> <srcfs>
where:
<dstfs> = current read-only file system name. This file system will become the read/write file system.
<cel_name> = name of the Celerra Network Server at the original source site.
<srcfs> = current read/write file system name. This file system will become the read-only file system.
Example: To reverse the direction of the replication process and return the src_ufs1 file system to read/write and the dst_ufs1 file system to read-only, type:
$ fs_replicate -reverse src_ufs1:cel=cs100 dst_ufs1


Output
operation in progress (not interruptible)...id = 126 name = dst_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ip_copies = src_ufs1:cs100 stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10:cs100 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = dst_ufs1 Mon Feb 7 07:34:31 EST 2005 ckpts = src_ufs1_repl_restart_2:cs100,src_ufs1_repl_restart_1:cs100 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 done operation in progress (not interruptible)... done


Note
After this command completes, the original source file system is read/write and the original destination file system is read-only. Replication is now running in the direction it was before the failover. You can verify this by using the server_mount server_x command. When replication is reversed, default values are used for the replication policies. For example, high water mark and timeout are set to 600. You can specify new policies when you restart replication using -option <options>.

After a failover or reversal


After you complete a failover or reversal, to enable users to access the file system that has newly acquired read/write access permission, you must configure the Celerra system appropriately for NFS. In many cases, you can use the same configuration you used for the source file system, which might include:

File system exports
I18N mode (Unicode or ASCII) configuration
Network interfaces

Replicating EMC Celerra CIFS Environments (V1) describes all CIFS replication issues.


Abort Celerra Replicator


Abort Celerra Replicator when you no longer want to replicate the file system, or when your source and destination file systems are not synchronized and you want to end replication. After Celerra Replicator successfully aborts, the destination file system is a read-only file system at the destination site, and the source file system is a read/write file system at the source site. After a session has been aborted, it cannot be restarted using the -restart option.
Note: Aborting replication does not delete the underlying file systems.
Note: Multiple fs_replicate -abort processes are executed sequentially, not in parallel. Only run one fs_replicate -abort command at a time.

Action
To abort replication on the source and destination file systems simultaneously, use this command syntax from the source site:
$ fs_replicate -abort <srcfs>,<dstfs>:cel=<cel_name>
where:
<srcfs> = name of the source file system
<dstfs> = name of the destination file system
<cel_name> = name of the remote Celerra Network Server
Example: To stop replication for the replication relationship, type:
[source_site]$ fs_replicate -abort src_ufs1,dst_ufs1:cel=cs110


Output
operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_repl_restart_2,src_ufs1_repl_restart_1 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 id = 126 name = dst_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3:cs110 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 done
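To confirm the end state that the abort leaves behind, a minimal hedged check might look like the following; the site prompts and names are the examples used in this document, and the expectation that the pair disappears from -list is an inference from the suspend behavior shown later in this guide:

[source_site]$ server_mount server_2          (src_ufs1 should appear as rw)
[destination_site]$ server_mount server_2     (dst_ufs1 should appear as ro)
[source_site]$ fs_replicate -list             (the replication pair is expected to no longer be listed)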


Suspend a replication relationship


The suspend option temporarily stops an active replication relationship and leaves it in a state from which it can be restarted later.
Action
To suspend replication, use this command syntax:
[source_site]$ fs_replicate -suspend <srcfs> <dstfs>:cel=<cel_name>
where:
<srcfs> = name of the source file system
<dstfs> = name of the destination file system
<cel_name> = name of the destination Celerra Network Server
Example: To suspend a replication relationship, type:
$ fs_replicate -suspend src_ufs1 dst_ufs1:cel=cs110


Suspend output: Creating a new delta set


operation in progress (not interruptible)...operation in progress (not interruptible)...
id                       = 88
name                     = src_ufs1
fs_state                 = active
type                     = replication
replicator_state         = active
source_policy            = NoPolicy
high_water_mark          = 0
time_out                 = 0
current_delta_set        = 5
current_number_of_blocks = 0
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

id                       = 126
name                     = dst_ufs1:cs110
type                     = playback
playback_state           = active
high_water_mark          = 600
time_out                 = 600
current_delta_set        = 4
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

outstanding delta sets:
Delta  Source_create_time   Blocks
-----  -------------------  ------
4      2005-02-09 06:54:26  1

communication_state      = alive
current_transfer_rate    = ~ 13312 Kbits/second
avg_transfer_rate        = ~ 13312 Kbits/second
source_ip                = 192.168.168.18
source_port              = 62817
destination_ip           = 192.168.168.20
destination_port         = 8888
QOS_bandwidth            = 0 kbits/sec

Note: All times are in GMT. Block size is 8 KBytes.
done


Suspend output: Creating a baseline checkpoint


Generating new checkpoint... operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_repl_restart_1,src_ufs1_ckpt1, src_ufs1_repl_restart_2,root_susp end_ckpt_126_5 ip_copies = dst_ufs1:cs110 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 id = 125 name = root_suspend_ckpt_126_5 acl = 0 in_use = True type = ckpt worm = off volume = vp246 pool = clar_r5_performance member_of = rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = checkpt_of= src_ufs1 Tue Feb 8 13:51:01 EST 2005 used = 6% full(mark)= 90% delta_number= 5 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2


Suspend output: Creating another delta set


operation in progress (not interruptible)...
id                       = 88
name                     = src_ufs1
fs_state                 = active
type                     = replication
replicator_state         = active
source_policy            = NoPolicy
high_water_mark          = 0
time_out                 = 0
current_delta_set        = 6
current_number_of_blocks = 0
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

id                       = 126
name                     = dst_ufs1:cs110
type                     = playback
playback_state           = active
high_water_mark          = 600
time_out                 = 600
current_delta_set        = 4
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 786432 KBytes (Before Flow Control)

outstanding delta sets:
Delta  Source_create_time   Blocks
-----  -------------------  ------
5      2005-02-09 06:54:53  1
4      2005-02-09 06:54:26  1

communication_state      = alive
current_transfer_rate    = ~ 13312 Kbits/second
avg_transfer_rate        = ~ 13312 Kbits/second
source_ip                = 192.168.168.18
source_port              = 62817
destination_ip           = 192.168.168.20
destination_port         = 8888
QOS_bandwidth            = 0 kbits/sec

Note: All times are in GMT. Block size is 8 KBytes.
done


Suspend output: Playing back delta set


operation in progress (not interruptible)...
id                       = 88
name                     = src_ufs1:cs100
fs_state                 = active
type                     = replication
replicator_state         = active
source_policy            = NoPolicy
high_water_mark          = 0
time_out                 = 0
current_delta_set        = 6
current_number_of_blocks = 0
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

id                       = 126
name                     = dst_ufs1
type                     = playback
playback_state           = active
high_water_mark          = 0
time_out                 = 10
current_delta_set        = 6
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 786432 KBytes (Before Flow Control)

outstanding delta sets:
<None>

communication_state      = alive
current_transfer_rate    = ~ 13312 Kbits/second
avg_transfer_rate        = ~ 13312 Kbits/second
source_ip                = 192.168.168.18
source_port              = 62817
destination_ip           = 192.168.168.20
destination_port         = 8888
QOS_bandwidth            = 0 kbits/sec

Note: All times are in GMT. Block size is 8 KBytes.
done


Suspend output: Waiting for synchronization


operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_repl_restart_1,src_ufs1_ckpt1, src_ufs1_repl_restart_2,root_susp end_ckpt_126_5 ip_copies = dst_ufs1:cs110 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 done


Suspend output: Convert destination file system to rawfs ready for restart
name = dst_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 done Converting filesystem type id = 126 name = dst_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 Operation complete done


Verify the suspended replication relationship


Action
To verify that the replication relationship no longer exists, type either of the following two commands. The -list option no longer displays the replication pair:
$ fs_replicate -list
$ fs_replicate -info src_ufs1

Output
Local Source Filesystems
Id   Source   FlowCtrl   State   Destination   FlowCtrl   State   Network

Local Destination Filesystems
Id   Source   FlowCtrl   State   Dest.         FlowCtrl   State   Network

The replication session is no longer listed.

Error 2242: src_ufs1 : replication/playback is not set up


Restarting a replication relationship


To restart a replication relationship:

"Verify that the replication relationship is not synchronized" on page 90 "Restart replication relationship" on page 91


Verify that the replication relationship is not synchronized


Action
To verify the replication relationship's status, use this command syntax:
$ fs_replicate -info <srcfs>
where:
<srcfs> = name of the source file system
Example: To verify the replication relationship, type:
$ fs_replicate -info src_ufs1 -verbose 10

Output
id                       = 88
name                     = src_ufs1
fs_state                 = active
type                     = replication
replicator_state         = inactive
source_policy            = NoPolicy
high_water_mark          = 600
time_out                 = 600
current_delta_set        = 0
current_number_of_blocks = 0
flow_control             = active
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 0 KBytes (Before Flow Control)

id                       = 126
name                     = dst_ufs1:cs110
type                     = playback
playback_state           = active
high_water_mark          = 600
time_out                 = 600
current_delta_set        = 146
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

outstanding delta sets:
<None>

communication_state      = down
current_transfer_rate    = ~ 0 Kbits/second
avg_transfer_rate        = ~ 0 Kbits/second
source_ip                = 0.0.0.0
source_port              = 0
destination_ip           = 192.168.168.20
destination_port         = 8888
QOS_bandwidth            = 0 kbits/sec

     |          Source               |         Destination
Delta|Create Time     Dur     Blocks |Playback Time   Dur     Blocks   DSinGroup
-----|--------------  ------  ------ |--------------  ------  ------   ---------

Note: All times are in GMT. Block size is 8 KBytes.


Restart replication relationship


Action
To restart a replication relationship, use this command syntax from the source site:
$ fs_replicate -restart <srcfs> <dstfs>:cel=<cel_name>
where:
<srcfs> = name of the source file system
<dstfs> = name of the destination file system
<cel_name> = name of the destination Celerra Network Server
Example: To restart a replication relationship, type:
$ fs_replicate -restart src_ufs1 dst_ufs1:cel=cs110
Note: The following output shown is for restarting an out-of-synchronization replication.


Restart output: Converting file system type


operation in progress (not interruptible)...operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_repl_restart_2,src_ufs1_repl_restart_1 ip_copies = dst_ufs1:cs110 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 done operation in progress (not interruptible)...id = 126 name = dst_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 done


Restart output: Converting file system type


Converting filesystem type id = 126 name = dst_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms =

stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2


Restart output: Starting replication


Starting replication... operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_repl_restart_2,src_ufs1_repl_restart_1 ip_copies = dst_ufs1:cs110 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 id = 126 name = dst_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3:cs110 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = src_ufs1 Tue Feb 8 08:49:46 EST 2005 stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 done


Restart output: Create checkpoint


Generating new checkpoint... operation in progress (not interruptible)...id = 88 name = src_ufs1 acl = 0 in_use = True type = uxfs worm = off volume = v243 pool = clar_r5_performance member_of = root_avm_fs_group_10 rw_servers= server_2 ro_servers= rw_vdms = ro_vdms = ckpts = src_ufs1_repl_restart_2,src_ufs1_repl_restart_1, root_new_ckpt_src_ufs1 ip_copies = dst_ufs1:cs110 stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 id = 115 name = root_new_ckpt_src_ufs1 acl = 0 in_use = True type = ckpt worm = off volume = vp246 pool = clar_r5_performance member_of = rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = checkpt_of= src_ufs1 Tue Feb 8 08:50:13 EST 2005 used = 4% full(mark)= 90% stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2


Restart output: Start differential copy


Starting diff copy... operation in progress (not interruptible)...id = 115 name = root_new_ckpt_src_ufs1 acl = 0 in_use = True type = ckpt worm = off volume = vp246 pool = clar_r5_performance member_of = rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = checkpt_of= src_ufs1 Tue Feb 8 08:50:13 EST 2005 used = 4% full(mark)= 90% stor_devs = APM00034000068-001F,APM00034000068-001E disks = d21,d15 disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2 disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2 disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2 id = 126 name = dst_ufs1 acl = 0 in_use = True type = rawfs worm = off volume = v272 pool = clar_r5_performance member_of = root_avm_fs_group_3:cs110 rw_servers= ro_servers= server_2 rw_vdms = ro_vdms = backup_of = src_ufs1 Tue Feb 8 08:50:13 EST 2005 stor_devs = APM00044603845-0008,APM00044603845-0007 disks = d8,d9 disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2 disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2 disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2 IP Copy remaining (%) 100..Done. done


Restart output: Delete restart checkpoints


Deleting root_new_ckpt_src_ufs1... id = 115 name = root_new_ckpt_src_ufs1 acl = 0 in_use = False type = ckpt worm = off volume = rw_servers= ro_servers= rw_vdms = ro_vdms = Operation complete done


Extending the size of a file system


File systems can be extended automatically and manually. Follow these procedures to extend file system size:

"Extend file system size automatically" on page 98 "Extend file system size" on page 101

Note: You cannot extend file systems using non-sliced volumes if replication is running. When using Automatic File System Extension, the slice option must be enabled for an Automatic Volume Management (AVM) pool when replication is running. Managing EMC Celerra Volumes and File Systems with Automatic Volume Management describes how to configure slices in detail.

Extend file system size automatically


File systems in a replication relationship can be extended automatically. For replication, the Automatic File System Extension policy can be set on the source file system only. When the policy is set, the destination file system is extended first, and then the source file system, just as with extending a file system manually. This feature is available only on file systems created using AVM. Managing EMC Celerra Volumes and File Systems with Automatic Volume Management details Automatic File System Extension. Automatic File System Extension provides the following options in the CLI and in Celerra Manager (they are called out here as they appear in Celerra Manager):

Auto Extend enabled: To select the function.
Virtual Provisioning enabled: To allocate storage capacity based on anticipated need, but dedicate resources only as needed. When Virtual Provisioning is enabled, the maximum file system size or real file system size, whichever is larger, is reported to NFS and CIFS clients through Celerra Manager or the CLI.
High water mark: To specify the threshold, ranging from 50 to 90 percent of the file system, which triggers file system extension.
Maximum capacity (MB): The maximum size to which the file system can be extended.

Note: The virtual size of the source file system, but not of the destination file system, is visible from the NFS/CIFS client.


Action
To automatically extend the size of a file system (setting a high water mark, maximum size, and virtual provisioning), use this command syntax:
$ nas_fs -modify <fs_name> -auto_extend yes -hwm <50-99>% -max_size <integer>[T|G|M] -vp <yes|no>
where:
<fs_name> = name of the file system
<50-99> = percentage full that must be reached before the file system is automatically extended
<integer> = file system's maximum size (entered in TB, GB, or MB)
-vp = used with a specified Maximum Capacity value to report the anticipated or actual file system size. When turned on, the virtual size is reported to clients.
Example: To automatically extend the source file system src_ufs1 with a high water mark of 50% and a maximum size of 70 MB with virtual provisioning enabled, type:
$ nas_fs -modify src_ufs1 -auto_extend yes -hwm 50% -max_size 70M -vp yes

Output
id = 2707 name = pfs001 acl = 0 in_use = True type = uxfs worm = off volume = v7283 pool = clarata_archive member_of = root_avm_fs_group_10 rw_servers= 123secnfs ro_servers= rw_vdms = ro_vdms = auto_ext = hwm=50%,max_size=16777216M,virtual_provision=yes ckpts = pfs001_ckpt60,pfs001_ckpt61,pfs001_ckpt62, pfs001_ckpt63,pfs001_ckpt6 stor_devs = APM00043200225-0029,APM00043200225-002C, APM00043200225-0027,APM00043 disks = d35,d21,d34,d20,d32,d16,d30,d14,d33,d17,d31, d15,d38,d41,d36,d22,d29,d13,d39,d42,d28,d12,d37,d23 disk=d35 stor_dev=APM00043200225-0029 addr=c16t2l9 server=123secnfs [nasadmin@lnsgc123 nasadmin]$


Recover from Automatic File System Extension failure


Automatic File System Extension employs an internal script that checks for adequate space on the source and destination sites. If replication is blocked because extension of either file system fails due to insufficient space, an error message displays the cause. For example:
Nov 21 10:37:53 2005 CFS:3:101 fs auto extension failed: no space available to extend src1
In the great majority of Automatic File System Extension failures, both file systems fail to extend. By consulting the sys_log, you can determine whether only the source file system extension failed: in that case, the log contains a line that directs you to use the src_only option. Whether one or both file system extensions fail, find the file systems' sizes and compare their block counts and volume sizes by using the nas_fs -size command. The numbers displayed are a second way to determine which file system extension failed, and they tell you how much space to reserve when you manually extend the file systems. Perform the following steps to recover from a failure to automatically extend the file system.
Step 1.
Action: Display the sys_log by typing:
$ cd /nas/log/
$ more sys_log

Step 2.
Action: Find the size of the source and destination file systems, and compare the values using this command syntax:
$ nas_fs -size <fs_name>
The following example shows sample output:
total = 100837 avail = 99176 used = 1660 ( 1% ) (sizes in MB) ( blockcount = 209715200 )
volume: total = 102400 (sizes in MB) ( blockcount = 209715200 )
Note: Because total, available, and used values are generated from the operating system and not updated until the destination file system is refreshed, that data will differ from total volume data derived by the Control Station. Block counts agree on the same file system, however, and can be used to accurately compare totals on source and destination file systems.

Step 3.
Action: To extend the source file system, type the following command from the source site. For size, type the difference between the source and destination file systems determined in step 2:
$ nas_fs -xtend <fs_name> size=<integer>[T|G|M] -option src_only
Note: Because replication always extends the destination first, the source size will never be larger than the destination size. If you specify the wrong size when you run this command, the command reports the error and you must rerun the command with the correct size.
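As an illustration of step 3, the following sketch extends only the source file system; the 1024M value is hypothetical and stands in for the size difference you calculated in step 2 with nas_fs -size:

$ nas_fs -size src_ufs1                               (note the volume total and block count)
$ nas_fs -xtend src_ufs1 size=1024M -option src_only  (extend the source by the calculated difference)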


Extend file system size


To extend file system size:

"Extend file system size manually" on page 101 "Extend a file system after replication has been suspended" on page 103 "Extend a file system after replication failover" on page 104 "Start replication when the source file system is inactive" on page 104

Extend file system size manually


File systems can be manually extended while replication is running. If only the destination file system extends, you can extend the source file system manually. The procedure to do this (and extend both file systems) is the same for manual and Automatic File System Extension. Extending the source file system extends the destination file system by default. This maintains the same file system sizes at both sites. "Recover from Automatic File System Extension failure" on page 100 provides more information. Before extending the source file system, verify that the destination Celerra system contains enough unused space. Use the nas_fs -size command to determine the current size of the file system.
Action
To extend the source file system, use this command syntax:
[source_site]$ nas_fs -xtend <fs_name> size=<integer>[T|G|M] pool=<pool>
where:
<fs_name> = name of the source file system
<integer> = file system's size in terabytes, gigabytes, or megabytes
<pool> = assigns a rule set for the file system
Example: To extend the source file system src_ufs1 by 1024 megabytes, type:
$ nas_fs -xtend src_ufs1 size=1024M pool=clar_r5_performance


Output
id = 88
name = src_ufs1
acl = 0
in_use = True
type = uxfs
worm = off
volume = v243
pool = clar_r5_performance
member_of = root_avm_fs_group_10
rw_servers= server_2
ro_servers=
rw_vdms =
ro_vdms =
ckpts = src_ufs1_repl_restart_2,src_ufs1_repl_restart_1
ip_copies = dst_ufs1:cs110
stor_devs = APM00034000068-001F,APM00034000068-001E
disks = d21,d15
disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2
disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2
disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2
disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2

Source file system size:
total = 3025 avail = 2010 used = 1014 ( 33% ) (sizes in MB) ( blockcount = 6291456 )
volume: total = 3072 (sizes in MB) ( blockcount = 6291456 )
Destination file system size:
total = 3025 avail = 2010 used = 1014 ( 33% ) (sizes in MB) ( blockcount = 6291456 )
volume: total = 3072 (sizes in MB) ( blockcount = 6291456 )

Note: Both file systems have the same block count (6291456). The nas_fs -xtend command extends the destination file system and then the source file system. If either file system fails to extend, an error message displays the cause. If neither file system extends, the following sample output appears:

Error 5008: Remote command failed:
remote celerra     = cs0
remote exit status = 5
remote error       = 0
remote message     = CLSTD : volume(s) are not available

Note: Step 3 of "Recover from Automatic File System Extension failure" on page 100 explains how to extend both file systems.
If only the destination file system extends, the following output appears:
Error 5008: pfsA volume(s) are not available
PFS extension failed. Please extend PFS with nas_fs -xtend... -option src_only
Note: Go to step 3 of the "Recover from Automatic File System Extension failure" on page 100 to extend only the source file system.


Extend a file system after replication has been suspended


If the source file system is extended automatically or manually after replication has been suspended, perform the following steps to ensure that the source and destination file systems are the same size before restarting replication. If you do not perform these steps before restarting replication, the restart will fail with an error because the source file system and destination file system are no longer the same size (that is, the block counts differ). In this case, you must perform these steps to recover from the error.
Step 1.
Action: Verify that the destination file system type is set to rawfs. If it is set to uxfs, convert the destination file system from uxfs to rawfs by using this command syntax:
$ nas_fs -Type rawfs <dstfs> -Force
where:
<dstfs> = name of the destination file system
Example: To convert the destination file system dst_ufs1 to rawfs, type:
$ nas_fs -Type rawfs dst_ufs1 -Force
Note: A read-only file system must be set to rawfs prior to extending the file system or restarting replication.

Step 2.
Action: Extend the destination file system manually, using the same size as the source file system, with this command syntax:
$ nas_fs -xtend <dstfs> size=<integer>[T|G|M] -option <options>
where:
<dstfs> = name of the destination file system
<integer> = size of the destination file system in terabytes, gigabytes, or megabytes
<options> = any comma-separated options, such as slice={y|n}, which specifies whether the disk volumes used by the file system might be shared with other file systems using a slice
Example: To extend the destination file system dst_ufs1 by 2 MB, using the slice option, to match the source file system extension, type:
$ nas_fs -xtend dst_ufs1 size=2M slice=y

Step 3.
Action: Restart the replication relationship by using this command syntax from the source site:
$ fs_replicate -restart <srcfs> <dstfs>:cel=<cel_name>
where:
<srcfs> = name of the source file system
<dstfs> = name of the destination file system
<cel_name> = name of the destination Celerra Network Server
Example: To restart a replication relationship, type:
$ fs_replicate -restart src_ufs1 dst_ufs1:cel=cs110
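Grouped together, the three steps above might be typed as follows. This is only a sketch using the example names from the steps; it assumes steps 1 and 2 are run on the Control Station that manages the destination file system, while step 3 runs from the source site as stated in the steps:

$ nas_fs -Type rawfs dst_ufs1 -Force                   (step 1: ensure the destination is rawfs)
$ nas_fs -xtend dst_ufs1 size=2M slice=y               (step 2: match the source extension)
$ fs_replicate -restart src_ufs1 dst_ufs1:cel=cs110    (step 3: restart replication, from the source site)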


Extend a file system after replication failover


After a replication failover ends, you must reconfigure the Automatic File System Extension policy if you want to continue using the function on the destination file system as your PFS. This is because the Celerra Network Server database, which stores the automatic extension configuration, is not copied to the destination file system during replication. As a result, the destination file system cannot be extended automatically. The procedure on page 98 provides an example, and Managing EMC Celerra Volumes and File Systems with Automatic Volume Management offers detailed instructions to configure this function's policy. If the source file system is still active and the original destination file system is extended, and you want to resynchronize both file systems, follow this procedure:
Step 1.
Action: Verify that the source file system is mounted read-only. If it is not, mount it read-only using this command syntax:
$ server_mount <servername> -option ro <srcfs>/<srcfs_mountpoint>

Step 2.
Action: Convert the source file system from uxfs to rawfs using this command syntax:
$ nas_fs -Type rawfs <srcfs> -Force

Step 3.
Action: Manually extend the source file system to match the extended destination file system size using this command syntax:
$ nas_fs -xtend <srcfs> size=<bytes>

Step 4.
Action: Convert the source file system back to uxfs using this command syntax:
$ nas_fs -Type uxfs <srcfs> -Force

Step 5.
Action: Resynchronize the destination and original source file systems using this command syntax:
$ fs_replicate -resync <srcfs>:cel=<cel_name> <dstfs>

Step 6.
Action: Reverse the replication and return to the original configuration using this command syntax:
$ fs_replicate -reverse <dstfs>:cel=<cel_name> <srcfs>

Start replication when the source file system is inactive


If the source file system is inactive, you have to consider other options. Consult "Abort Celerra Replicator" on page 79 for guidance on whether to attempt an incremental resynchronization of the source and destination file systems or start replication from the beginning.


Resetting replication policy


Replication policies are established when you initially set up the replication relationship. Use the fs_replicate command to change the policies for each replication session. To reset replication policy:

"High water mark and time-out policies" on page 105 "Modify replication policy" on page 106 "Change flow-control policies" on page 107 "Set bandwidth size" on page 109 "Set policies using parameters" on page 110

High water mark and time-out policies


High water mark and time-out policies prevent the replication service from creating delta sets faster than it can copy them to the destination file system. To avoid file systems dropping out-of-sync, reset these source file system policies to higher values than the destination file system policies. The following example resets the high water mark replication policy to:

Trigger the replication service to create delta sets every 300 MB of change for the source file system.
Replay these delta sets to the destination file system every 300 MB of change.
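For reference, the invocation that implements this 300 MB example is the same one shown in "Modify replication policy" on page 106, using the example names from this document:

$ fs_replicate -modify src_ufs1,dst_ufs1:cel=cs110 -option hwm=300,dhwm=300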

Table 9 on page 105 explains the fs_replicate command options to change the policies.
Table 9  Options to set high water mark and time-out policies

Option     Behavior
-modify    To specify values for the source and destination sites, include both file systems in the
           command syntax. The values are effective the next time a trigger for these policies is reached.
           For example, if policies are changed from 600 to 300 for the high water mark and time-out
           interval, respectively, the next time replication reaches 600, the trigger is changed to 300.
           If you set source high water mark and time-out interval values without specifying values for
           the destination, the source values are applied to the destination site.
-refresh   A refresh replication initiates playback of outstanding data already on the destination site
           and then creates a delta set of modified data on the source site. This has the same effect as
           reaching the next high water mark or time-out interval.
           Note: Multiple fs_replicate -refresh processes must be run sequentially, not concurrently.
           Run only one fs_replicate -refresh command at a time.


Modify replication policy


Action
To modify the replication policy for a source and destination file system, use this command syntax:
$ fs_replicate -modify <srcfs> -option hwm=<high_water_mark>,to=<timeout>,dhwm=<high_water_mark>,dto=<timeout>
where:
<srcfs> = name of the source file system
hwm=<high_water_mark>, dhwm=<high_water_mark> = source and destination high water mark policies in megabytes
to=<timeout>, dto=<timeout> = source and destination time-out policies in seconds
Example: To reset the high water mark for the source and destination file systems, type:
$ fs_replicate -modify src_ufs1,dst_ufs1:cel=cs110 -option hwm=300,dhwm=300

Output
operation in progress (not interruptible)...
id                       = 88
name                     = src_ufs1
fs_state                 = active
type                     = replication
replicator_state         = active
source_policy            = NoPolicy
high_water_mark          = 600 (Pending: 300)
time_out                 = 600
current_delta_set        = 11
current_number_of_blocks = 1
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

id                       = 126
name                     = dst_ufs1:cs110
type                     = playback
playback_state           = active
high_water_mark          = 600 (Pending: 300)
time_out                 = 600
current_delta_set        = 11
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

outstanding delta sets:
<None>

communication_state      = alive
current_transfer_rate    = ~ 13312 Kbits/second
avg_transfer_rate        = ~ 13312 Kbits/second
source_ip                = 172.24.168.123
source_port              = 62815
destination_ip           = 192.168.168.20
destination_port         = 8888
QOS_bandwidth            = 0 kbits/sec

Note: All times are in GMT. Block size is 8 KBytes.
done


Note
Any changes to the time-out interval or high water mark take effect when the next trigger point is reached, so the Pending entry shown above is removed when the new policy value becomes active.

Change flow-control policies


When the source SavVol is full and the Data Mover can no longer track changes to the source file system, use one of these two options to keep the replication session active:

Set a policy to freeze the file system, which temporarily halts all I/O to the source file system until sufficient space is available on the source SavVol.
Set a policy that temporarily halts all writes to the source file system and makes the file system read-only. This condition persists until sufficient space is available on the source SavVol.

Consider the following when establishing these policies:

These values take effect when the next trigger for creating a delta set is reached. If replication is currently in one of these flow-control states, the new setting takes effect after the read-only or freeze condition ends.
The autofreeze and autoro options can be set only on the source file system.
Flow-control policies are not enabled by default, which can result in replication becoming inactive. If that happens, try to restart your replication relationship using the procedure described in "Restarting a replication relationship" on page 89. If that is not possible, abort replication (described in "Abort Celerra Replicator" on page 79) and initiate it again (described in "Initiating replication" on page 39).
Action
To change flow-control policies for a file system, use this command syntax:
$ fs_replicate -modify <fs_name>:cel=<cel_name> -option <options>
where:
<fs_name> = name of the source file system
<cel_name> = name of the remote Celerra Network Server
<options> = flow-control setting for the source file system. To freeze all I/O to the source file system, use autofreeze=yes. To allow users to continue read-only access to the source file system, use autoro=yes.
Example:
To freeze all I/O to the source file system, src_ufs1, specify the option autofreeze=yes. Type:
$ fs_replicate -modify src_ufs1 -option autofreeze=yes


Output
operation in progress (not interruptible)...
id                       = 88
name                     = src_ufs1
fs_state                 = active
type                     = replication
replicator_state         = active
source_policy            = NoPolicy (Pending: Freeze)
high_water_mark          = 600
time_out                 = 600
current_delta_set        = 30
current_number_of_blocks = 0
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

id                       = 126
name                     = dst_ufs1:cs110
type                     = playback
playback_state           = active
high_water_mark          = 600
time_out                 = 600
current_delta_set        = 29
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

outstanding delta sets:
Delta  Source_create_time    Blocks
-----  ------------------    ------
29     2005-02-09 11:57:08   1

communication_state      = alive
current_transfer_rate    = ~ 13312 Kbits/second
avg_transfer_rate        = ~ 13312 Kbits/second
source_ip                = 192.168.168.18
source_port              = 62819
destination_ip           = 192.168.168.20
destination_port         = 8888
QOS_bandwidth            = 0 kbits/sec

Note: All times are in GMT. Block size is 8 KBytes.
done

Note
Determine whether the file system is in a read-only or freeze state by using the fs_replicate -info command and checking the fs_state field. When the file system is read-only, the fs_state field shows romounted; when no I/O is allowed, it shows frozen.
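If you prefer the read-only behavior instead of a freeze, a minimal illustration (reusing the source file system name and the autoro option described above) would be:
$ fs_replicate -modify src_ufs1 -option autoro=yes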


Set bandwidth size


Setting the bandwidth size limits the total bandwidth for this replication session. The default value of 0 uses the maximum available network bandwidth.
Action
To specify the maximum bandwidth used for a replication session, use this command syntax:
$ fs_replicate -modify <fs_name>:cel=<cel_name> -option qos=<bandwidth>
where:
<fs_name> = name of the source file system
<cel_name> = name of the remote Celerra Network Server
qos=<bandwidth> = maximum bandwidth in kilobits per second
Example:
To set the maximum bandwidth for this replication session to 8000 kb/s, type:
$ fs_replicate -modify src_ufs1 -option qos=8000
Note: This setting takes effect the next time data is sent across the IP network.


Output
operation in progress (not interruptible)...
id                       = 88
name                     = src_ufs1
fs_state                 = active
type                     = replication
replicator_state         = active
source_policy            = NoPolicy
high_water_mark          = 600
time_out                 = 600
current_delta_set        = 30
current_number_of_blocks = 1
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

id                       = 126
name                     = dst_ufs1:cs110
type                     = playback
playback_state           = active
high_water_mark          = 600
time_out                 = 600
current_delta_set        = 30
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 917504 KBytes (Before Flow Control)

outstanding delta sets: <None>

communication_state      = alive
current_transfer_rate    = ~ 13312 Kbits/second
avg_transfer_rate        = ~ 13312 Kbits/second
source_ip                = 192.168.168.18
source_port              = 62819
destination_ip           = 192.168.168.20
destination_port         = 8888
QOS_bandwidth            = 8000 kbits/sec

Note: All times are in GMT. Block size is 8 KBytes.
done
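To remove the limit later, a reasonable assumption (based on the documented default of 0 meaning maximum available bandwidth) is to set the value back to 0:
$ fs_replicate -modify src_ufs1 -option qos=0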

Set policies using parameters


Use the fs_replicate command to set flow-control policies and bandwidth size for each replication relationship rather than for all replication sessions on a Data Mover. If you choose to set these policies for all sessions on a Data Mover, change the VRPL read-only and VRPL freeze parameters for the flow-control policies. You can view and dynamically modify parameter values using the server_param command or the Celerra Manager graphical user interface. This technical module describes only the command-line procedures. Celerra Manager online help details how to use the graphical user interface to modify parameter values.
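For example, to review the current values before changing anything, a sketch of the command (the facility name VRPL and the -list option are assumptions here, not confirmed syntax for these parameters) might look like:
$ server_param server_2 -facility VRPL -list
Confirm the exact facility and parameter names in the EMC Celerra Network Server Parameters Guide or Celerra Manager online help before modifying them.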


Reverse the direction of a replication relationship


Use a replication reversal to change the direction of replication. You might want to change the direction of replication to perform maintenance on the source site or to do testing on the destination site.
Action
To change the direction of replication, use this command syntax:
[read/write side]$ fs_replicate -reverse <dstfs>:cel=<cel_name> <srcfs>
where:
<dstfs> = name of the file system that is currently read-only
<cel_name> = name of the Celerra Network Server where the destination file system currently resides
<srcfs> = name of the file system that is currently read/write
Example:
For the current read/write file system, src_ufs1, to become the read-only file system, type:
$ fs_replicate -reverse dst_ufs1:cel=cs110 src_ufs1


Output
operation in progress (not interruptible)...
id        = 88
name      = src_ufs1
acl       = 0
in_use    = True
type      = uxfs
worm      = off
volume    = v243
pool      = clar_r5_performance
member_of = root_avm_fs_group_10
rw_servers= server_2
ro_servers=
rw_vdms   =
ro_vdms   =
ckpts     = src_ufs1_ckpt1,src_ufs1_repl_restart_2,src_ufs1_repl_restart_1
ip_copies = dst_ufs1:cs110
stor_devs = APM00034000068-001F,APM00034000068-001E
disks     = d21,d15
disk=d21 stor_dev=APM00034000068-001F addr=c16t1l14 server=server_2
disk=d21 stor_dev=APM00034000068-001F addr=c0t1l14 server=server_2
disk=d15 stor_dev=APM00034000068-001E addr=c0t1l13 server=server_2
disk=d15 stor_dev=APM00034000068-001E addr=c16t1l13 server=server_2

id        = 126
name      = dst_ufs1
acl       = 0
in_use    = True
type      = uxfs
worm      = off
volume    = v272
pool      = clar_r5_performance
member_of = root_avm_fs_group_3:cs110
rw_servers=
ro_servers= server_2
rw_vdms   =
ro_vdms   =
backup_of = src_ufs1 Tue Feb 8 13:53:59 EST 2005
stor_devs = APM00044603845-0008,APM00044603845-0007
disks     = d8,d9
disk=d8 stor_dev=APM00044603845-0008 addr=c0t1l2 server=server_2
disk=d8 stor_dev=APM00044603845-0008 addr=c32t1l2 server=server_2
disk=d8 stor_dev=APM00044603845-0008 addr=c16t1l2 server=server_2
disk=d8 stor_dev=APM00044603845-0008 addr=c48t1l2 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c16t1l1 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c48t1l1 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c0t1l1 server=server_2
disk=d9 stor_dev=APM00044603845-0007 addr=c32t1l1 server=server_2
done


Note
When this command completes, the current read/write file system (src_ufs1) becomes read-only and the current read-only file system (dst_ufs1) becomes read/write. If you try to run this command from the incorrect (read-only) side, this error message appears:
Error 2247: this command must be issued on the current source site:cs100

Verify the reverse direction of replication relationship


Action
To verify if the direction of the replication reversed, type: $ fs_replicate -list

Output
Local Source Filesystems
Id    Source            FlowCtrl   State    Destination   FlowCtrl   State    Network

Local Destination Filesystems
Id    Source            FlowCtrl   State    Dest.         FlowCtrl   State    Network
135   dst_ufs1:cs110    inactive   active   src_ufs1      inactive   active   alive


Monitor replication
Table 10 on page 114 shows the commands to use to monitor different aspects of replication. You can also monitor replication using Celerra Manager, which is described in Celerra Manager online help.
Table 10 Ways to monitor replication

Monitor: Data Movers
Description: Returns information about available memory and the CPU idle percentage.
Command: server_sysstat <movername>

Monitor: File systems
Description: Reports the amount of used and available disk space for a file system and the amount of the file system's total capacity that is used.
Command: server_df <movername>

Monitor: Replication
Description: Shows information for all replication sessions, as described in "List all replication sessions (optional)" on page 55.
Command: fs_replicate -list

Monitor: Replication
Description: Shows information for an individual replication session, as detailed in "List individual replication session (optional)" on page 57.
Command: fs_replicate -info <fs_name>
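For example, a quick health check of the Data Mover server_2 and the replication session for src_ufs1 (illustrative names) combines the commands above:
$ server_sysstat server_2
$ server_df server_2
$ fs_replicate -list
$ fs_replicate -info src_ufs1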


Checking playback service and outstanding delta sets


The -info option of fs_replicate lets you verify that the playback service is running and which delta sets have not been replayed to the destination file system. The following tasks show how to determine that the playback service is running and to verify the outstanding delta sets:
1. "Determine playback service status" on page 115
2. "Playback delta set" on page 118
3. "Verify delta set" on page 119

Task 1: Determine playback service status


Action
To determine whether the playback service is running, use this command syntax:
$ fs_replicate -info <fs_name> -verbose <number_of_lines>
where:
<fs_name> = file system name
<number_of_lines> = number of lines to display in the output
Example:
To determine whether the playback service is running, type:
$ fs_replicate -info src -verbose 20


Output
id                       = 18
name                     = src
fs_state                 = active
type                     = replication
replicator_state         = active
source_policy            = NoPolicy
high_water_mark          = 60000
time_out                 = 3600
current_delta_set        = 92
current_number_of_blocks = 1
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 393216 KBytes (Before Flow Control)

id                       = 30
name                     = dest:eng168102
type                     = playback
playback_state           = active
high_water_mark          = 300
time_out                 = 600
current_delta_set        = 87
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 393216 KBytes (Before Flow Control)

outstanding delta sets:
Delta  Source_create_time   Blocks
-----  ------------------   ------
90     06/04 10:08:54       1
89     06/04 09:58:54       1
88     06/04 09:58:43       1
87     06/04 09:48:43       14457
91     06/04 10:10:52       1

communication_state      = alive
current_transfer_rate    = ~ 13312 Kbits/second
avg_transfer_rate        = ~ 18140.4 Kbits/second
source_ip                = 10.168.0.11
source_port              = 59068
destination_ip           = 10.168.0.180
destination_port         = 8888
QOS_bandwidth            = 0 kbits/sec


Output
| Source | Destination Delta|Create Time Dur Blocks|Playback Time Dur Blocks DSinGroup -----|-------------- ------ ------|-------------- ------ ------ -------91 06/04 10:10:52 0 1 90 06/04 10:08:54 0 1 89 06/04 09:58:54 0 1 88 06/04 09:58:43 0 1 87 06/04 09:48:43 9 14457 86 06/04 09:39:25 0 1 06/04 09:44:35 0 1 1 85 06/04 09:29:25 0 1 06/04 09:34:35 0 1 1 84 06/04 09:19:25 0 1 06/04 09:24:35 0 1 1 83 06/04 09:09:25 0 1 06/04 09:14:35 0 1 1 82 06/04 08:59:25 0 1 06/04 09:04:34 0 1 1 81 06/04 08:49:25 0 1 06/04 08:54:34 0 1 1 80 06/04 08:39:25 0 1 06/04 08:44:34 0 1 1 79 06/04 08:29:25 0 1 06/04 08:34:34 0 1 1 78 06/04 08:19:25 0 1 06/04 08:24:34 0 1 1 77 06/04 08:09:25 0 1 06/04 08:14:34 0 1 1 76 06/04 07:59:25 0 1 06/04 08:04:34 0 1 1 75 06/04 07:49:25 0 1 06/04 07:54:34 0 1 1 74 06/04 07:39:25 0 1 06/04 07:44:34 0 1 1 73 06/04 07:29:25 0 1 06/04 07:34:34 0 1 1 72 06/04 07:19:25 0 1 06/04 07:24:34 0 1 1 Note: All times are in GMT. Block size is 8 KBytes.


Task 2: Playback delta set


Action
To play back all delta sets up to a specified delta-set number, use this command syntax:
$ fs_replicate -refresh <dstfs> -option playuntildelta=<delta_number>
Example:
To play back all delta sets up to delta set 91 on the destination file system dest, type:
$ fs_replicate -refresh dest -option playuntildelta=91
Note: In this example, the system plays back all delta sets up to 91. Any delta sets greater than that number are not replayed.

Output
operation in progress (not interruptible)...
id                       = 18
name                     = src:eng16853
fs_state                 = active
type                     = replication
replicator_state         = active
source_policy            = NoPolicy
high_water_mark          = 60000
time_out                 = 3600
current_delta_set        = 92
current_number_of_blocks = 1
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 393216 KBytes (Before Flow Control)

id                       = 30
name                     = dest
type                     = playback
playback_state           = active
high_water_mark          = 300
time_out                 = 600
current_delta_set        = 92
flow_control             = inactive
total_savevol_space      = 1048576 KBytes
savevol_space_available  = 393216 KBytes (Before Flow Control)

outstanding delta sets: <None>

communication_state      = alive
current_transfer_rate    = ~ 13312 Kbits/second
avg_transfer_rate        = ~ 18140.4 Kbits/second
source_ip                = 10.168.0.11
source_port              = 59068
destination_ip           = 10.168.0.180
destination_port         = 8888
QOS_bandwidth            = 0 kbits/sec

Note: All times are in GMT. Block size is 8 KBytes.
done


Task 3: Verify delta set


Action
To verify that the specified delta set was replayed, type: $ fs_replicate -info src -verbose 20

Output
id name fs_state type replicator_state source_policy high_water_mark time_out current_delta_set current_number_of_blocks flow_control total_savevol_space savevol_space_available id name type playback_state high_water_mark time_out current_delta_set flow_control total_savevol_space savevol_space_available outstanding delta sets: communication_state current_transfer_rate avg_transfer_rate source_ip source_port destination_ip destination_port QOS_bandwidth | Source Delta|Create Time -----|-------------91 06/04 10:10:52 90 06/04 10:08:54 89 06/04 09:58:54 88 06/04 09:58:43 87 06/04 09:48:43 86 06/04 09:39:25 85 06/04 09:29:25 84 06/04 09:19:25 83 06/04 09:09:25 82 06/04 08:59:25 81 06/04 08:49:25 80 06/04 08:39:25 = = = = = = = = = = = = = = = = = = = = = = = 18 src active replication active NoPolicy 60000 3600 92 1 inactive 1048576 KBytes 393216 KBytes (Before Flow Control) 30 dest:eng168102 playback active 300 600 92 inactive 1048576 KBytes 393216 KBytes (Before Flow Control)

<None> = = = = = = = = alive ~ 13312 Kbits/second ~ 18140.4 Kbits/second 10.168.0.11 59068 10.168.0.180 8888 0 kbits/sec

| Destination Dur Blocks|Playback Time Dur Blocks DSinGroup ------ ------|-------------- ------ ------ -------0 0 0 0 9 0 0 0 0 0 0 0 1 1 1 1 14457 1 1 1 1 1 1 1 06/04 06/04 06/04 06/04 06/04 06/04 06/04 06/04 06/04 06/04 06/04 06/04 10:58:42 10:48:42 10:38:42 10:28:42 10:18:42 09:44:35 09:34:35 09:24:35 09:14:35 09:04:34 08:54:34 08:44:34 2 2 2 2 2 0 0 0 0 0 0 0 1 1 1 1 14457 1 1 1 1 1 1 1 5 5 5 5 5 1 1 1 1 1 1 1


Output
79 78 77 76 75 74 73 72 Note: 06/04 08:29:25 0 06/04 08:19:25 0 06/04 08:09:25 0 06/04 07:59:25 0 06/04 07:49:25 0 06/04 07:39:25 0 06/04 07:29:25 0 06/04 07:19:25 0 All times are in GMT. 1 06/04 08:34:34 0 1 06/04 08:24:34 0 1 06/04 08:14:34 0 1 06/04 08:04:34 0 1 06/04 07:54:34 0 1 06/04 07:44:34 0 1 06/04 07:34:34 0 1 06/04 07:24:34 0 Block size is 8 KBytes. 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


Events for Celerra Replicator


Celerra Replicator provides the following file system events for use in SNMP traps and email. System event traps and email notifications are configured by the user. Configuring EMC Celerra Events and Notifications covers file system events. Table 11 on page 121 explains Volume Replication (VRPL) facility events (Facility ID : 77).
Table 11 Celerra Replicator events

Event ID  Event                                               Description
0         Replication ok                                      Not in use.
1         Replication on Source Filesystem Inactive           Replication service is inactive.
2         Resync asked by (previous) Destination Filesystem   Request for a replication relationship resynchronization after a failover.
3         Source Filesystem Switch Delta on HWM               Not in use.
4         Destination Filesystem in Error                     Playback service is inactive.
5         IP Rep Svc failed - Transport                       IPRepSender stopped.
6         IP Rep Svc NetWork or Receiver Down                 Source or destination IP network is down.
7         IP Rep Svc Network or Receiver Up                   Source or destination IP network is up.
8         Rep Svc Source Filesystem Flow on Hold              No source SavVol space available.
9         Rep Svc Source Filesystem Flow Resumed              Source SavVol space available.
10        Source Filesystem Frozen                            Result of specifying the autofreeze option or parameter.
11        Source Filesystem Thawed                            Release of the autofreeze option or parameter.
12        Last Delta Replayed on Destination Filesystem       Issued at the end of a failover and reverse.
13        Redo Buffer close to overflow event                 In write-intensive environments, sometimes the redo buffers are not flushed fast enough.
14        Source mounted RO                                   Result of specifying the autoro option or parameter.
15        Source mounted RW                                   Release of the autoro option or parameter.
16        Replication in error                                Replication service is inactive or an internal error occurred.

The events mentioned in Table 12 on page 122 apply to Celerra Replicator functionality. VMCAST: Events for fs_copy (Facility ID: 84)

Table 12 Events description

ID   Description
0    Not an event
1    FS Copy over ip done
2    FS Copy over ip failed
3    Volume Copy over ip done
4    Volume Copy over ip failed


Change the Celerra Replicator SavVol default size


By default, the system allocates 10 percent of the source file system size for each Celerra Replicator SavVol. You can enlarge this default amount by changing the value in the /nas/site/nas_param file.
1. Log in to the Control Station.
2. Open /nas/site/nas_param with a text editor. A short list of configuration lines appears.
CAUTION: Do not edit anything in /nas/sys/, as these settings are overwritten with each code upgrade.
3. Locate the replication configuration line:
Replication:10:
where:
10 = size (in %) of the source file system that Celerra allocates for the replication SavVols. The minimum SavVol size is 1 GB. EMC recommends not using a value of less than 10%.
Note: Do not change any other lines in this file without a thorough knowledge of the potential effects on the system. Contact EMC Customer Service for guidance.
4. Change the parameter to represent the percentage of space you want to allocate.
5. Save and close the file.
Note: Changing this value does not require a Control Station restart.
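For example, to allocate 20 percent instead of the default 10 percent (an illustrative value), the replication line would read:
Replication:20: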


Change the passphrase between Celerra Network Servers


For a configuration with remote replication, use this procedure to change the passphrase on the Celerra system at each site.
Step
1.

Action
At each site, you can review the current passphrase using this command syntax: $ nas_cel -info <cel_name> where: <cel_name> = name of Celerra Network Server

2.

At each site, establish the new passphrase using this command syntax:
$ nas_cel -modify <cel_name> -passphrase <passphrase>
where:
<cel_name> = name of Celerra Network Server
<passphrase> = new secure passphrase to be used for the connection, which must be 6 to 15 characters and be the same on both sides of the connection
Example:
# nas_cel -modify cs110 -passphrase nas_replication
operation in progress (not interruptible)...
id         = 5
name       = cs110
owner      = 503
device     =
channel    =
net_path   = 192.168.168.102
celerra_id = APM000446038450000
passphrase = nas_replication


Managing and avoiding IP replication problems


This section describes procedures to save replication sessions in the event of planned or unplanned network, source site, or destination site outages. Additionally, this section details recommended practices to follow before performing replication. Failure to follow these recommendations might lead to nasdb inconsistencies and replication failures. You can use Celerra Manager to perform much of the functionality described in this section. Consult Celerra Manager online help for instructions. This section describes:

"Preventive measures to avoid IP replication problems" on page 125
"Replication restart methods" on page 128
"Recovering from a corrupted file system" on page 130
"Managing anticipated destination site or network outages" on page 131
"Managing unanticipated destination site or network outages" on page 132
"Managing unanticipated source site outages" on page 133
"Managing expected source site outages" on page 133
"Mount the destination file system read/write temporarily" on page 133
"Recovering from an inactive replication state" on page 135
"Creating checkpoints on the destination site" on page 136
"Copy file system to multiple destinations with fs_copy" on page 136

Preventive measures to avoid IP replication problems


You can take the following preventive measures to avoid IP replication problems:

"Creating restartable checkpoints for out-of-sync operations" on page 125 "Controlling delta set size" on page 126 "Enlarging SavVol size" on page 126 "Calculating modifications rate on the source file system" on page 127 "Accommodating network concerns" on page 127

Creating restartable checkpoints for out-of-sync operations


In any IP replication recovery scenario, it is critical to have restartable checkpoints as a base from which to work. Verifying that each file system contains checkpoints that are occasionally refreshed ensures that Celerra can perform an out-of-sync restart if required. Restartable checkpoints created using fs_ckpt <srcfs> -name <srcfs>_repl_restart_1 -Create and fs_ckpt <srcfs> -name <srcfs>_repl_restart_2 -Create are automatically refreshed when starting from a differential copy. Restartable checkpoints are supported in version 5.4 only.


Be sure to verify that these checkpoints are being refreshed by using fs_ckpt -list. For example, examine the creation times of the checkpoints, as shown in the following sample output:
$ fs_ckpt pfs3 -list
id    ckpt_name    creation_time             inuse  full(mark)  used
258   pfs3_ckpt1   01/03/2006-04:20:10-EST   y      90%         10%
271   pfs3_ckpt2   01/03/2006-05:35:09-EST   y      90%         10%
282   pfs3_ckpt3   01/03/2006-06:22:07-EST   y      90%         10%
283   pfs3_ckpt4   01/03/2006-06:22:55-EST   y      90%         10%
284   pfs3_ckpt5   01/03/2006-06:34:26-EST   y      90%         10%

If restartable checkpoints do not exist, create them. If they do exist, but their timestamps indicate they are not refreshing with the replication updates, check that the names are correct and replication is healthy. "Out-of-sync replication relationship" on page 20 provides more information about these special checkpoints.
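For example, for a source file system named src_ufs1 (an illustrative name, following the naming convention shown above), the two restartable checkpoints are created with:
$ fs_ckpt src_ufs1 -name src_ufs1_repl_restart_1 -Create
$ fs_ckpt src_ufs1 -name src_ufs1_repl_restart_2 -Create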

Controlling delta set size


Controlling the size of delta sets is integral to managing efficient replications. Version 5.4 and later enforce an 8 GB delta-set limit. Manageable delta sets are preferable to large deltas because:

Large delta sets dramatically increase failover time (blockmap recovery).
Large delta sets consume more operating system resources.
Playback and create times for large delta sets do not rise proportionally with size. Reasonably sized delta sets are created and replayed faster.
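Because the source high water mark determines how much change accumulates before a delta set is cut, one way to keep delta sets manageable is to lower that value. As an illustration only (2048 MB is an arbitrary example, using the -modify syntax described in "Modify replication policy" on page 106):
$ fs_replicate -modify src_ufs1 -option hwm=2048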

Enlarging SavVol size


Before beginning replication, determine whether the default source-side SavVol size (10 percent), as related to the file system size, is sufficient for the anticipated delta sets. Find the SavVol size by calculating 10 percent of the file system size obtained by using nas_fs -size <fs_name>. If you previously changed the SavVol size, you can learn its present size using the fs_replicate -info <fs_name> command and checking the total_savevol_space field. To manage a large network outage or account for brief intervals when the incoming modification rate significantly exceeds the network's ability to send changes to the destination site, you can increase the size of the replication SavVol. For example, a 500 GB file system that incurs 20 GB of change daily will, with a 50 GB SavVol, accommodate approximately two and one-half days of outage. If replication has already begun, change the SavVol size as follows:
Note: If you are starting replication, specify the SavVol size rather than use the default value.


Step
1.

Action
Record existing policy parameters using this command syntax: $ fs_replicate -info <srcfs> where: <srcfs> = name of the source file system

2.

Stop replication using this command syntax: $ fs_replicate -suspend <srcfs> <dstfs>:cel=<cel_name> where: <srcfs> = name of the source file system <dstfs> = name of the destination file system <cel_name> = name of the Celerra Network Server for the file system

3.

Restart replication with the revised SavVol and old parameters using this command syntax: $ fs_replicate -restart <srcfs> <dstfs>:cel=<cel_name> savsize=<MB> -sav <srcsavvol_name> -option to=<value>, hwm=<MB> where: <srcfs> = name of the source file system <dstfs> = name of the destination file system <cel_name> = name of the Celerra Network Server for the file system savsize=<MB> = size of the SavVol in MB <srcsavvol_name> = name of the source file system SavVol -option to=<value> = time-out interval in seconds -option hwm=<value> = high water mark in MB

Calculating modifications rate on the source file system


Estimate the rate of modifications expected on the source file system. If the rate of anticipated change on the source file system is continuously greater than the available network bandwidth, the replication service cannot transfer data quickly enough to avoid becoming inactive. To avoid this state, create two restartable checkpoints (consult "Start replication without initiating a full data copy" on page 128). Note that when autoro=yes is set, the source file system becomes read-only in this situation.
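As a rough, hypothetical comparison, 20 GB of daily change corresponds to an average of roughly 1.9 Mbit/s of sustained bandwidth, which you can verify on the Control Station:
$ echo "scale=2; 20 * 1024 * 8 / 86400" | bc
1.89
If the available replication bandwidth is consistently below this figure, expect the session to fall behind.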

Accommodating network concerns


Consider the following quality of service (QoS) and network suggestions suitable to your bandwidth requirements:

Use dedicated network devices for IP replication data transfer to avoid an impact on users.
Apply QoS policy using a value that matches the bandwidth of one network to another. For instance, even if you have a 1 MB/s line from A to B and want to fill the pipe, set QoS at 1 MB/s so the line will not flood, causing packets to drop.
Configure the qos parameter to throttle bandwidth for a replication session (or bandwidths for different sessions) on a Data Mover using fs_replicate <fsname> -modify -option qos=<kbps>. The EMC Celerra Network Server Command Reference Manual further details the fs_replicate command.


If you do not expect significant packet loss, enable the fastRTO parameter. fastRTO determines which TCP timer to use when calculating the retransmission timeout. The TCP slow timer (500 ms) is the default, causing the first time-out retransmission to occur in 1 to 1.5 seconds. Setting fastRTO to 1 selects the TCP fast timer (200 ms) for use when calculating the retransmission timeout, which causes the first timeout to occur in 400 to 600 ms. Use server_param <movername> -facility tcp fastRTO=1 to configure the setting.
Note: This setting may actually increase network traffic. Make the change cautiously, recognizing it might not improve performance.

To ensure a stable network transfer rate for delta-set transfers on a Data Mover, use a dedicated network port.
Correctly set the TCP window size for network latency. Configuring the tcpwindow parameter sets the window size used by replication (and fs_copy). This value indicates the amount of data that can be sent before acknowledgment by the receiving site. Increasing the value is most effective with high latency. Window size is calculated by multiplying the round-trip delay by the required transfer rate. For example, to send 10 MB/s across an IP network with a round-trip delay of 100 ms, a window size of 1 MB (0.1 sec x 10 MB/s = 1 MB) is needed. Use server_param <movername> -facility rcp tcpwindow=bytes to configure the setting.

EMC Celerra Network Server Parameters Guide provides more information on setting tcp and rcp facilities.
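For example, for the 100 ms round-trip, 10 MB/s case above, the 1 MB window corresponds to 1048576 bytes. Following the parameter syntax quoted above (the Data Mover name server_2 is illustrative):
$ server_param server_2 -facility rcp tcpwindow=1048576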

Replication restart methods


If file systems are corrupted or out-of-sync, you can:

"Start replication without initiating a full data copy" on page 128 "Start replication from scratch" on page 129

You must start replication from scratch if the source or destination file system is corrupted. If file systems are only out-of-sync, you can restart the replication relationship.
Note: If you restart a replication session and there are unmounted checkpoints, a full data copy will be initiated instead of a differential copy.

Start replication without initiating a full data copy


"Restarting a replication relationship" on page 89 provides details on how to start a replication session without having to perform a full data copy.


Start replication from scratch


Step
1.

Action
Terminate the replication relationship using this command syntax: $ fs_replicate -abort <srcfs> <dstfs>:cel=<cel_name> where: <srcfs> = name of the source file system <dstfs> = name of the destination file system <cel_name> = name of Celerra Network Server for the file system

2.

Create a checkpoint of the source file system using this command syntax: $ fs_ckpt <srcfs> -Create where: <srcfs> = name of the source file system

3.

Copy the checkpoint to the destination file system using this command syntax:
$ fs_copy -start <src_ckpt> <dstfs>:cel=<cel_name>
where:
<src_ckpt> = source checkpoint that is copied to the destination
<dstfs> = name of the destination file system
<cel_name> = name of the destination Celerra Network Server

4.

Convert the destination file system to rawfs using this command syntax: $ nas_fs -Type rawfs <dstfs> -Force where: <dstfs> = name of the destination file system

5.

Start replication from the source to destination file system using this command syntax: $ fs_replicate -start <srcfs> <dstfs>:cel=<cel_name> where: <srcfs> = name of the source file system <dstfs> = name of the destination file system <cel_name> = name of the destination Celerra Network Server

6.

Create a second checkpoint of the source file system using this command syntax: $ fs_ckpt <srcfs> -Create where: <srcfs> = name of the source file system

7.

Perform a differential copy by typing: $ fs_copy -start <src_newckpt> <dstfs>:cel=<cel_name> -fromfs <previous_ckpt> monitor=off where: <dstfs> = name of the destination file system <cel_name> = name of the destination Celerra Network Server

8.

Check the copy's progress and completion using this command syntax:
$ fs_copy -info <srcfs> or fs_copy -list
where:
<srcfs> = name of the source file system


Recovering from a corrupted file system


To recover from a corrupted file system:

"Run nas_fsck on the source file system" on page 130 "Suspend and restart replication" on page 130

Note: You cannot use nas_fsck if the destination file system is corrupted as a result of an improper fs_copy operation. File system replication fails due to pending nas_fsck.

Run nas_fsck on the source file system


The nas_fsck command checks and repairs a file system when replication is threatened. You can run the command to check the source file system while replication is running. If nas_fsck detects inconsistencies in the primary file system, the changes that occur as a result of using nas_fsck are replicated to the destination file system like any other file system modifications.
Step
1.

Action
Modify time-out and HWM values to zero using this command syntax: $ fs_replicate -modify <srcfs> -option hwm=0,to=0 where: <srcfs> = name of the source file system Note: Setting trigger points to zero causes Celerra to keep replication active and track changes, but not cut delta sets.

2.

Run nas_fsck on the source file system to replicate and replay changes on the destination file system using this command syntax: $ nas_fsck -start <srcfs> where: <srcfs> = name of the source file system Note: Running nas_fsck repairs corruption on the source file system, bringing it into a consistent, but not original, state. While nas_fsck runs, the file system is not mounted to avoid system instability. When the command is complete and inconsistencies addressed, the file system is brought back online.

3.

Revert to your previous time-out and HWM values using this command syntax: $ fs_replicate -modify <srcfs> -option hwm=<MB>, to=<second> where: <srcfs> = name of the source file system hwm=<value> = original high water mark value in MB to=<value> = original time-out interval in seconds

Suspend and restart replication


If the destination is so unstable or corrupted that it might fail before the nas_fsck changes are replicated and replayed, suspend replication, run nas_fsck -start <srcfs>, and restart replication. While it is unlikely that only the destination file system would be corrupted, you can diagnose a destination-only failure on the
source file system using the nas_fsck -start <srcfs> command. Any inconsistencies found by nas_fsck on the primary file system are replicated to the secondary file system.
Step
1.

Action
Suspend replication using this command syntax: $ fs_replicate -suspend <srcfs> <dstfs>:cel=<cel_name> where: <srcfs> = name of the source file system <dstfs> = name of the destination file system <cel_name> = name of Celerra Network Server for the file system

2.

Run nas_fsck using this command syntax at the source site: $ nas_fsck -start <srcfs> where: <srcfs> = name of the source file system

3.

Restart the replication using this command syntax: $ fs_replicate -restart <srcfs> <dstfs>:cel=<cel_name> where: <srcfs> = name of the source file system <dstfs> = name of the destination file system <cel_name> = name of Celerra Network Server for the file system

Managing anticipated destination site or network outages


This section describes how to handle expected destination or network outages. When planning for an outage, for instance, to restart a secondary Data Mover or to conduct standard network maintenance involving the secondary file system, follow the guidelines in the following sections to protect replication:

"Anticipated destination site outage over a short period" on page 131 "Anticipated destination site outage over a long period" on page 131

Anticipated destination site outage over a short period


Begin by evaluating the outage period and whether the site can survive it. If the data queues easily in the SavVol, nothing needs to be done. For example, if the planned outage period is one day, the SavVol is 100 MB, the file system is 1 GB, and 200 MB of modifications occur daily, then survival is ensured for only half a day because the SavVol will fill in 12 hours. On the other hand, if only 100 MB of modifications occur daily, a whole day's worth of changes is protected.
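The survival time can be estimated as the SavVol size divided by the daily change rate, times 24 hours. For the hypothetical figures above (100 MB SavVol, 200 MB of change per day), the calculation can be checked on the Control Station:
$ echo "scale=1; 100 / 200 * 24" | bc
12.0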

Anticipated destination site outage over a long period


In situations in which an outage is expected for a long period and the replication service continues running, the SavVol might become full or trigger flow control, eventually leading replication to drop out-of-sync. If this scenario is likely, perform a replication suspend and restart, as directed in the following procedure.


Note: To restart after suspending replication, you must use the -restart option.

Step
1.

Action
Suspend each replication session individually using this command syntax: $ fs_replicate -suspend <srcfs> <dstfs>:cel=<cel_name> where: <srcfs> = name of the source file system <dstfs> = name of the destination file system <cel_name> = name of the destination Celerra Network Server Note: The suspend operation lets Celerra track changes in a SavVol that automatically expands. Less total storage is needed because tracking is done only once in the checkpoint SavVol, not twice in the checkpoint and replication SavVol. Also, checkpoints retain only one overall changed block instead of one per delta set.

2.

When all replication sessions are suspended, to check that the session you suspended no longer appears in output for fs_replicate -list, type: $ fs_replicate -list Note: Use this command on source and destination Celerra systems to verify that no sessions are running. Output from the command should display no sessions, as shown below: Local Source Filesystems Id Source FlowCtrl State Destination FlowCtrl State Network

3. Verify the size of the suspend checkpoint to ensure that there is enough disk space to expand the SavVol. The suspend checkpoint, root_suspend_ckpt, is used to restart replication and is added by the -suspend option. Verify the size by typing:
$ nas_fs -size root_suspend_ckpt
4. Restart replication after the outage is over using this command syntax:
$ fs_replicate -restart <srcfs> <dstfs>:cel=<cel_name>
where:
<srcfs> = name of the source file system
<dstfs> = name of the destination file system
<cel_name> = name of the destination Celerra Network Server

Managing unanticipated destination site or network outages


This section describes coping with unanticipated Celerra destination platform or network outages. Most destination outages require no action. For instance, if Celerra restarts or goes offline and the system does not fall out of sync, no remediation is necessary. Consider the following:

If fs_replicate -list shows replication is still active after the unplanned destination outage is finished, nothing need be done.
If fs_replicate -list shows replication is inactive and out-of-sync after the unplanned destination outage is finished, consult "Replication restart methods" on page 128.


During an unplanned destination outage, the only way to mitigate the system impact, other than restoring the destination file system before the SavVol fills up, is to adjust the time-out and HWM settings on Celerra, if applicable. In this case, use fs_replicate -modify -option to=0,hwm=8192 to conserve SavVol space, allowing the system more time before falling out of sync. This causes delta sets to be cut as large as possible and with as little duplication as possible. Contrast a TO of zero with a TO of 60 seconds, which consumes 128 MB of SavVol per minute even without many modifications, because the minimum delta-set size is 128 MB.

Note: If replication failed and corrupted the destination file system, or you mistakenly suspended replication, mounted the destination read/write, and restarted replication, you must abort the session and restart using a full fs_copy. See "Replication restart methods" on page 128 for more information.

Managing unanticipated source site outages


This section describes how to manage unplanned Celerra source outages such as power interruptions or restarts. Consider the following if you decide to activate the Data Recovery (DR) site:

"Replication restart methods" on page 128 describes the source that resides on a CLARiiON system that lost power, and replication is inactive. If the source resides on a Symmetrix system that lost powerwhere replication should still be activeno remediation is necessary. If only the source is down and you do not want to activate the DR site, no remediation is necessary. "Recovering replication data" on page 61 describes replication failover, resynchronization, and reversal.

Managing expected source site outages


This section describes coping with both anticipated and unexpected outages on the source Celerra. Perform the same steps as described in "Managing unanticipated source site outages" on page 133, and decide whether to activate the DR site.

Mount the destination file system read/write temporarily


This section describes temporarily mounting the secondary file system as a read/write volume to test data recovery without doing a failover and taking the source server offline. To temporarily mount the destination read/write, perform this procedure.


CAUTION

The following procedure cannot be used for VDM root file systems. A checkpoint of a VDM's root file system is designed in such a way that it cannot be restored. Replication for a VDM root file system needs to be restarted with a full fs_copy to ensure that the source and destination file systems are synchronized. Failure to do this will lead to file system corruption and data unavailability.

Step
1.

Action
Suspend replication using this command syntax: $ fs_replicate -suspend <srcfs> <dstfs>:cel=<cel_name> where: <srcfs> = name of the source file system <dstfs> = name of the destination file system <cel_name> = name of Celerra Network Server for the file system

2.

Assign the file system type to default using this command syntax: $ nas_fs -Type uxfs <dstfs> -Force where: <dstfs> = name of the destination file system

3.

Unmount the destination file system using this command syntax: $ server_umount <movername> -perm <dstfs> where: <movername> = name of the Data Mover <dstfs> = name of the destination file system

4.

Mount the destination file system read-write using this command syntax: $ server_mount <movername> -option rw <dstfs> /<dstfs_mountpoint> where: <movername> = name of the Data Mover <dstfs> = name of the destination file system <dstfs_mountpoint> = point at which the file system is mounted

5. Perform DR testing with an appropriate program after both sites are read/write. A number of diagnostics help to ensure that your database can start up correctly, such as:
Reading/writing to a single file or every file
Creating new files
Modifying existing files
Reading or deleting file systems
6. Make the now-writable destination file system available to clients either by exporting it through NFS, or sharing it through CIFS (assuming a CIFS server is active on the destination side), using this command syntax:
$ server_export <movername> -Protocol nfs -option <options> <pathtoexport>
or
$ server_export <movername> -Protocol cifs -name <sharename> -option <options> <pathtoshare>
where:
<movername> = name of the Data Mover


Step
7.

Action
Unmount the destination file system after testing ends using this command syntax: $ server_umount <movername> -perm <dstfs> where: <movername> = name of the Data Mover <dstfs> = name of the destination file system

8.

Mount the destination file system as read-only using this command syntax: $ server_mount <movername> -option ro <dstfs> /<dstfs_mountpoint> where: <movername> = name of the Data Mover <dstfs> = name of the destination file system <dstfs_mountpoint> = point at which the file system is mounted

9.

Convert the destination file system to rawfs after the restore ends using this command syntax: $ nas_fs -Type rawfs <dstfs> -Force where: <dstfs> = name of the destination file system

10.

Restart replication at the source site using this command syntax: $ fs_replicate -restart <srcfs> <dstfs>:cel=<cel_name> where: <srcfs> = name of the source file system <dstfs> = name of the destination file system <cel_name> = name of Celerra Network Server for the file system

Recovering from an inactive replication state


If the state of replication and playback is inactive due to a power failure on Celerra or because flow control was exercised, and the source and destination file systems are out of sync, you must restart the replication relationship. Go to "Replication restart methods" on page 128.


Creating checkpoints on the destination site


When using the fs_copy command to regularly copy modifications from source to destination file system, you should employ checkpoints on the destination site for these reasons:

While an fs_copy is in progress, the file system on the destination is in a rawfs state and unavailable. However, if you preserve a checkpoint on the destination, this checkpoint remains available during the copy. "Copy file system to multiple destinations with fs_copy" on page 136 provides more details.
By using checkpoints, you can preserve the copied views of the file system on the destination site.
When checkpointing the destination, if an fs_copy is in progress, the operation suspends until the copy is complete.

The following caveats apply when creating checkpoints on the destination file system with replication running:

Checkpoints created on the destination file system are supported just as on the source file system, except that they fall on delta-set boundaries.
Checkpoints processed during playback suspend until the playback ends. So, if you are in the midst of a large delta-set playback, the refresh or create could take an extended period to process while Celerra waits for the playback to finish.
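For example, to preserve a view of a destination file system named dst_ufs1 before a copy starts (an illustrative name, using the fs_ckpt syntax shown elsewhere in this document):
$ fs_ckpt dst_ufs1 -name dst_ufs1_ckpt1 -Create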

Copy file system to multiple destinations with fs_copy


The fs_copy command is used typically in a first-time replication start to manually synchronize source and destination file systems. It is also often employed in a script to copy one source file system to many destinations, either by cascading one file system to a number of destinations or directly to each location. Using fs_copy does not constitute replication. It is a copy command that emulates replication which requires considerable scripting and integration. The command should be used sparingly by a knowledgeable operator. The following scenario describes regularly copying a file system to three destination sites (A, B, and C). The procedure requires an operator to perform differential copies on alternating checkpoint sets.
Note: It is highly recommended that slice volumes be used to create and extend file systems for this fs_copy solution. Managing EMC Celerra Volumes and File Systems with Automatic Volume Management provides more information about slice volumes.


Step
1.

Action
Create a checkpoint of the source file system using this command syntax:
$ fs_ckpt <srcfs> -name <src_ckpt1> -Create
where:
<srcfs> = name of the source file system
<src_ckpt1> = first checkpoint on source
Note: If the primary file system is extended before checkpoint creation, the fs_copy command will fail. In this case, you must extend the destination file system manually to keep file system sizes identical. First convert the file system to rawfs using the nas_fs -Type rawfs command. Then use the nas_fs -xtend command. If a slice volume is not used, an incremental copy might fail and you might need to run a full fs_copy from scratch when the source file system is extended.

2.

Copy it to the destination A file system after the checkpoint is created using this command syntax: $ fs_copy -start <src_ckpt1> <dstfs>:cel=<cel_name> where: <src_ckpt1> = first checkpoint on source <dstfs> = destination name for file system A <cel_name> = name of Celerra Network Server for file system A Note: While an fs_copy is running, the destination file system is inaccessible. To make it accessible, you must create a checkpoint before the data transfer starts.

3.

After the fs_copy operation ends, create a checkpoint of destination file system using this command syntax: $ fs_ckpt <dstfs> -name <dst_ckpt1> -Create where: <dstfs> = destination file system name A <dst_ckpt1> = first checkpoint on destination file system A

4.

Create a second checkpoint of the source file system by typing: $ fs_ckpt <srcfs> -name <src_ckpt2> -Create where: <srcfs> = name of the source file system <src_ckpt2> = second checkpoint on source

5.

Convert destination A file system to rawfs using this standard syntax: $ nas_fs -Type rawfs <dstfs> -Force where: <dstfs> = destination name of file system A


Step
6.

Action
Perform a differential copy between checkpoints 1 and 2 using this command syntax: $ fs_copy -start <src_ckpt2> <dstfs>:cel=<cel_name> -fromfs <src_ckpt1> where: <src_ckpt2> = second checkpoint on the source <dstfs> = destination name for file system A <cel_name> = name of Celerra Network Server for file system A <src_ckpt1> = first checkpoint on the source

7.

After the fs_copy command completes again, refresh the first checkpoint of destination file system A using this command syntax: $ fs_ckpt <dst_ckpt1> -refresh where: <dst_ckpt1> = first checkpoint on destination file system A

8. Perform steps 2 through 7 for destinations B and C. The source file system is now saved on destinations A, B, and C.
9. To refresh the copy of the source file system at destinations A, B, and C, refresh the first source checkpoint using this command syntax:
$ fs_ckpt <src_ckpt1> -refresh
where:
<src_ckpt1> = first checkpoint on source
10. Repeat steps 2 through 8, swapping checkpoints <src_ckpt1> and <src_ckpt2>.


Transporting replication data using disk or tape


Copying the baseline source file system from the source to the destination site over the IP network can be a time-consuming process. You can use an alternative method by copying the initial checkpoint of the source file system, backing it up to a disk array or tape drive, and transporting it to the destination site, as shown in Figure 7 on page 139.
Figure 7 Physical transport of data: a file system on the source site Celerra Data Mover and its CLARiiON or Symmetrix storage system is copied to disk or tape, physically transported (for example, by truck) to the destination site, and restored to the file system on the destination site Celerra Data Mover and storage system.

Note: Use the IP network to make the initial copy of the root file system for a VDM.

This section consists of the following tasks:


"Disk transport method" on page 140 "Tape transport method" on page 144


Disk transport method


If the source file system holds a large amount of data, the initial copy of the source to the destination file system can be time-consuming to move over the IP network. Preferably, move the initial file system copy by disk, instead of over the network. If you want to physically transport a copy of your source file system using a CLARiiON disk array, the Data Movers that run replication must connect to the storage system using Fibre Channel switched fabric connections. To transport the baseline copy of the source file system to the destination site, use the disk transport method described in this section or contact EMC about a customized replication baselining service offering.
Note: You can use any qualified CLARiiON storage system for this transfer.

To transport replication data using disk:
1. "Capture data from the source site on disk" on page 141
2. "Transfer data to the destination site from disk" on page 143


Step 1: Capture data from the source site on disk


1. List the disks attached to the Celerra Network Server using the nas_disk -list command. Keep this list to use for comparison later in this procedure.
2. Attach a supported CLARiiON array (for example, a CX300) with the appropriately bound LUNs to the Celerra Network Server. This procedure assumes you will use a dedicated array. "Setting up the CLARiiON disk array" on page 147 describes preparing the CLARiiON array.
3. Probe and verify SCSI disks by typing:
$ server_devconfig server_2 -probe -scsi -disks
where server_2 is the Data Mover with access to the CLARiiON array.
4. Create SCSI disks by typing:
$ nas_diskmark -mark -all
The disks will be available to all Data Movers, including the standby Data Mover.
5. List the disks that are attached to the Celerra Network Server using the nas_disk -list command. Then perform a diff command between this list and the one created in step 1 of "Capture data from the source site on disk" on page 141. For example:
> 378 n 260607 APM00034402893-0000 CLSTD d378 2
> 379 n 260607 APM00034402893-0001 CLSTD d379 2
6. Create a file system on the CLARiiON array that is the same size as the source file system. To do so:
a. Create a user-defined pool by typing:
$ nas_pool -create -name transport_disks -volumes d378,d379
b. Create a file system by typing:
$ nas_fs -name transport_fs -type rawfs -create samesize=src pool=transport_disks -option mover=server_2
c. Create a mountpoint for the file system.
d. Mount the file system read-only.
Note: Ensure that you create the file system as rawfs and use the samesize= option to ensure that it is identical in size to the source file system. When creating the pool, ensure that the disks are added in the same order on the source and destination sites. If you are creating more than one file system, ensure that they are created in the same order on the source and destination sites. EMC recommends that you create the largest file system first.
7. Create a checkpoint of the source file system by typing:
$ fs_ckpt src -Create
8. Copy the source file system checkpoint to the file system created on the new disks by typing:
$ fs_copy -start src_ckpt1 transport_fs -option convert=no,monitor=off
9. Monitor the fs_copy progress by typing:
$ fs_copy -list
$ fs_copy -info <session_id>
Verify the size of the transport file system against the source file system using the nas_fs -size command.
10. Delete the disks from the source site. To do so:
a. Unmount the file system transport_fs (server_umount)
b. Delete the mountpoint (server_mountpoint)
c. Delete the file system (nas_fs -delete)
d. Delete the pool (nas_pool -delete)
e. Delete the disks that were the result of the diff command in step 5 of "Capture data from the source site on disk" on page 141 (nas_disk -delete)
11. Verify that the disks were removed by using the nas_disk -list command. The results you obtain from this step should be the same as those derived from the first step of "Capture data from the source site on disk" on page 141.
12. Disconnect and uninstall the CLARiiON array from the source site.
13. Transport the disk array to the destination site.


Step 2: Transfer data to the destination site from disk


1. List the disks that are attached to the Celerra Network Server using the nas_disk -list command. Keep this list to use for comparison later in this procedure.

2. Attach the CLARiiON array at the destination site.

3. Probe and verify SCSI disks by typing:
$ server_devconfig server_2 -probe -scsi -disks
where server_2 is the Data Mover with access to the CLARiiON array.

4. Create SCSI disks by typing:
$ nas_diskmark -mark -all
The disks become available to all Data Movers, including standby Data Movers.

5. List the disks now attached to the Celerra Network Server using the nas_disk -list command, and then perform a diff between this list and the one created in step 1 of "Transfer data to the destination site from disk" on page 143. For example:
> 375 n 260607 APM00034402893-0000 CLSTD d375 1
> 376 n 260607 APM00034402893-0001 CLSTD d376 1

6. Create a file system on the transport disk array. To do so:
a. Create a user-defined pool by typing:
$ nas_pool -create -name transport_disks -volumes d375,d376
b. Create a file system by typing:
$ nas_fs -name transport_fs -type rawfs -create samesize=src:cel=eng16853 pool=transport_disks -option mover=server_2
c. Create a mountpoint.
d. Mount the file system read-only.
Note: Ensure that you create the file system as rawfs and use the samesize= option so that it is identical in size to the source file system. When creating the pool, ensure that the disks are added in the same order on the source and destination sites. If you are creating more than one file system, ensure that they are created in the same order on the source and destination sites.

7. Create a destination file system (in this example, the destination Celerra file system is attached to a Symmetrix storage system). To do so:
a. Create a file system by typing:
$ nas_fs -name dest -type rawfs -create samesize=src:cel=eng16853 pool=symm_std
b. Create a mountpoint and mount the file system.
Note: Ensure that you create the file system as rawfs and use the samesize= option so that it is identical in size to the source file system.

8. Copy the file system on the transport disk array to the destination file system created in step 7 of "Transfer data to the destination site from disk" on page 143:
$ fs_copy -start transport_fs dest -option convert=no

9. The destination file system is now rawfs and contains a copy of the source file system checkpoint.

10. Delete the disks from the transport disk array. To do so:
a. Unmount the file system transport_fs (server_umount).
b. Delete the mountpoint (server_mountpoint).
c. Delete the file system (nas_fs -delete).
d. Delete the pool (nas_pool -delete).
e. Delete the disks that were the result of the diff in step 5 of "Transfer data to the destination site from disk" on page 143 (nas_disk -delete).

11. Verify that the disks were removed by using the nas_disk -list command. The results you obtain from this step should be the same as those derived from step 1 of "Transfer data to the destination site from disk" on page 143.

12. Disconnect and uninstall the CLARiiON array from the destination site.

13. Continue with the next step for setting up replication, Task 6: "Begin replication" on page 48, in Using EMC Celerra Replicator (V1).
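On the destination side, the key difference from the source-side sketch is the remote samesize=src:cel=<source_Celerra> syntax, which sizes both the transport file system and the destination file system against the source file system on the remote system. A minimal sketch reusing the example names above (eng16853, d375 and d376, transport_fs, dest, and symm_std are illustrative only):

$ nas_pool -create -name transport_disks -volumes d375,d376
$ nas_fs -name transport_fs -type rawfs -create samesize=src:cel=eng16853 pool=transport_disks -option mover=server_2
$ nas_fs -name dest -type rawfs -create samesize=src:cel=eng16853 pool=symm_std
$ fs_copy -start transport_fs dest -option convert=no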

Tape transport method


If the source file system contains a large amount of data, moving the initial copy of the source file system to the destination file system over the IP network can be time-consuming. In that case, it is preferable to move the initial copy by backing it up to tape instead of sending it over the network. When using this method of transport, note the following:

You must have a valid NDMP infrastructure on both Celerra Network Servers. The restore is performed to a rawfs file system that must be mounted on the Data Mover. The restore will be rejected if the destination file system is not the same size as the source file system used for the backup.

Note: This special backup is used only for transporting replication data.

CAUTION

Backing up file systems from a Unicode-enabled Data Mover and restoring to an ASCII-enabled Data Mover is not supported. I18N mode (Unicode or ASCII) must be the same on the source and destination Data Movers.

To transport replication data using tape:
1. "Capture data from the source site on tape" on page 145
2. "Transfer data to the destination site from tape" on page 146


Step 1: Capture data from the source site on tape


1. Create the checkpoint of the source file system by typing:
$ fs_ckpt src_ufs1 -Create
Note: Using the NDMP backup feature, you can back up only a checkpoint of an IP replication read-only target file system. If you attempt to back up the replication read-only target file system itself, NDMP will fail while replication is updating changes. Celerra Network Server version 5.5.27 and later supports NDMP backup of integrated checkpoints and manually created checkpoints of a target replication file system.

2. Set the NDMP environment variable for your backup software. For example, set the VLC=y NDMP environment variable before you run the backup. The NDMP technical module for your particular backup software provides information about environment variables. For information about how to set this variable, read your backup software vendor's documentation.
Note: The source file system and the checkpoint must be mounted on the NDMP Data Mover.

3. Use your normal backup procedure to back up the source file system checkpoint.

4. Transport the backup tapes to the destination site.


Step 2: Transfer data to the destination site from tape


1. When the tapes are at the destination Celerra Network Server, create a file system (rawfs) that is the same size as the source file system. Create the file system on a metavolume, create a mount point, and then mount the file system. Managing EMC Celerra Volumes and File Systems Manually describes how to create a file system. Ensure that you create the file system as rawfs and use the samesize= option to ensure that it is identical in size to the source file system.

2. Determine the volume number of the destination file system created in step 1. In this example, the volume number for the rawfs file system is 66:
$ nas_fs -list
id  inuse type acl volume name  server
1   y     1    0   66     rawfs server_2
2   y     1    0   68     new   server_3
3   y     1    0   70     last  server_4

3. Using your normal NDMP restore procedure, restore the backup using the following as the file system name:
/.celerra_vol_<fs_volume_ID>
where:
<fs_volume_ID> = volume number of the rawfs file system (66 in the example)
Note: The file system must be restored to the NDMP Data Mover.

4. The destination file system is now rawfs and contains the source file system checkpoint. Start the replication between the source and destination. Follow the procedure, Task 6: "Begin replication" on page 48, to set up remote replication.

5. Create a second checkpoint. Follow the procedure, Task 7: "Create a second checkpoint of the source file system" on page 50.

6. Perform an incremental copy and allow the destination file system to convert to uxfs. Follow the procedure, Task 8: "Copy incremental changes" on page 52. Be sure to specify the -Force option, for example:
$ fs_copy -start src_ufs1_ckpt2 dst_ufs1:cel=cs110 -fromfs src_ufs1_ckpt1 -Force -option monitor=off

7. Verify the replication session:
$ fs_replicate -list
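As a recap, once the tape restore has completed and replication has been started per Task 6, the remaining commands from the example above are as follows (src_ufs1, dst_ufs1, cs110, and the checkpoint names are illustrative; the second checkpoint of src_ufs1 is assumed to be named src_ufs1_ckpt2):

$ fs_ckpt src_ufs1 -Create       # creates the second checkpoint (src_ufs1_ckpt2 in this example)
$ fs_copy -start src_ufs1_ckpt2 dst_ufs1:cel=cs110 -fromfs src_ufs1_ckpt1 -Force -option monitor=off
$ fs_replicate -list             # confirm that the replication session is active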


Setting up the CLARiiON disk array


If you want to physically transport a copy of your source file system using a CLARiiON disk array, you must set up replication on a Celerra gateway. Use the following documents in this setup procedure:

EMC CLARiiON CX300, CX500, and CX700 Storage Systems Initialization Guide
EMC CLARiiON CX300 2-Gigabit Disk Processor Enclosure (DPE2) Setup and Cabling Guide

Note: Use the appropriate setup and cabling guide depending on the disk array used.

When configuring the CLARiiON disk array for transporting replication data, run the appropriate setup script after setting up zoning for the network switches. To prepare the CLARiiON disk array to receive the copy of the source file system:

"Review the prerequisites" on page 147 "Run the setup script" on page 149 "Create data LUNs" on page 151

Review the prerequisites


1. Cable and zone the CXxxx disk array to the Celerra Network Server.

2. Ensure that the required software components are installed on the CLARiiON disk array:
CXxxx Base Array (EMC FLARE)
EMC Navisphere ArrayAgent
Navisphere Management UI
EMC Access Logix

3. Read the E-Lab Interoperability Navigator for the most recent Celerra software and FLARE microcode compatibility specifications. The E-Lab Interoperability Navigator, which is available on Powerlink, provides definitive information on supported software and hardware, such as backup software, Fibre Channel switches, and application support for Celerra network-attached storage (Celerra) products. The E-Lab Interoperability Navigator is for EMC use only. Do not share this information with customers.

4. Make sure you have the following:
Network information (hostname, IP address, subnet mask, and gateway for the SPs).
A service computer set up with: Windows 2000 Server or Windows NT 4.0 with Service Pack 5 (or later); dial-up networking using a direct connection; and the latest version of the Navisphere CLI installed in C:\program files\emc\navisphere cli.
A null modem cable with 9-pin female-to-female connectors.
An IP connection between the Celerra Network Server and the SPs (required for installing the system software).

5. Create the PPP link by adding a modem and creating a connection as described in the EMC CLARiiON CX300, CX500, and CX700 Storage Systems Initialization Guide.


Run the setup script


1. Using the null modem cable, connect the service computer to the SP A serial port.
[Figure CNS-000760: location of the SP A serial port]

2. Establish a dial-up connection between the service computer and SP A by selecting Start > Settings > Network and Dial-up Connections > Direct Connection. The Connect Direct Connection dialog box opens.

3. Establish the dial-up connection by filling in the dialog box as follows:
Username: clariion
Password: clariion!

4. Select Connect.
Note: Establishing this connection might require several redial attempts.

5. Open a Command Prompt window by selecting Start > Programs > Accessories > Command Prompt. The Command Prompt window opens.

6. Insert the Celerra Installation CD into the CD-ROM drive.

7. In the Command Prompt window, change to the \clariion directory on the CD-ROM by typing the drive letter and then typing cd \clariion. For example, type:
D:
cd \clariion


8. Run the script by typing setup, and press Enter:
===========================================================
Program commandline: setup
Startup values:
  Version: 7.0
  Jumppoint into program:
  Pathname to navicli: c:\program files\emc\navisphere cli\navicli
  Debug (1=yes):
  Logging (1=yes): 1
  Logfile: C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\laptop_script.log
  IP Address for PPP: 128.221.252.1
Commands will all be sent via the dial-up connection.
Disconnect any existing Dial-Up Networking connections.
Connect serial cable to SPA and start Dial-up Networking.
When the connection is established, press any key to continue configuring the array.
Connection to SPA (128.221.252.1) has been established and verified.
Waiting to ensure SP is up and stable ...
NOTE: Some operations, like changing the serial number, result in multiple reboots. This is expected.
Checking 128.221.252.1
SP at 128.221.252.1 has responded. Waiting 60 seconds to ensure it stays up.
SP at 128.221.252.1 is up.
Configuring CX Series ...
Creating the hotspare ...
Setting Failover Mode
When prompted to create Celerra Control Volumes, type N:
Do you want to configure Celerra Control Volumes? [Y,N,] : N
Configuring cache
Configuring the cache ...
Cache Configured.
Tasks complete. The serial connection to the SP can now be disconnected.
This procedure was logged into file: C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\laptop_script.log

When the script completes, type exit to close the Command Prompt window.


Create data LUNs


1. Log in to the Celerra Network Server. Change to root (su) and type the password.

2. Determine the CXxxx serial number from a DOS window by typing:
$ navicli -h <SPA_IP_address> getagent
Output:
Agent Rev:       6.6.0 (3.1)
Name:            K10
Desc:
Node:            A-APM00035106458
Physical Node:   K10
Signature:       979596
Peer Signature:  885857
Revision:        2.06.500.4.004
SCSI Id:         0
Model:           500
Model Type:      Rackmount
Prom Rev:        3.00.00
SP Memory:       2048
Serial No:       APM00035106458
SP Identifier:   A
Cabinet:         DPE2
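Only the Serial No value is needed for the next step, so you can filter the getagent output. This is a sketch for the Windows service computer; the findstr filter is illustrative and <SPA_IP_address> is a placeholder:

navicli -h <SPA_IP_address> getagent | findstr "Serial"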


3. Run the setup script using this command syntax:
$ /nas/sbin/setup_clariion <CXxxx_serial_number>
Use the serial number from the previous step. Output:
# /nas/sbin/setup_clariion APM00035106458
CLARIION(s) APM00035106458 will be setup.
Setup CLARiiON APM00035106458 storage device...
Enter the ip address for A_APM00035106458: 172.24.168.72
Enter the ip address for B_APM00035106458: 172.24.168.73
System 172.24.168.72 is up
System 172.24.168.73 is up
Clariion Array: APM00035106458   Model: CX500   Memory: 2048
Committed Base software package
The following 5 template(s) available:
1. CX_All_4Plus1_Raid_5
2. CX_Standard_Raid_5
3. CX_Standard_Raid_1
4. CX_Standard_Raid_5_Legacy
5. CX_Standard_Raid_1_Legacy
Please select a template in the range of 1-5 or 'q' to quit: 1
Summary:
2 disk group(s) are created. 8,9
5 spare(s) are created. 200,201,202,203,204
Enclosure(s) 0_0 are installed in the system.
Enclosure info:
----------------------------------------------------------------
      0    1    2    3    4    5    6    7    8    9   10   11   12   13   14
----------------------------------------------------------------
0_0: 146  146  146  146  146  146  146  146  146  146  146  146  146  146  146
      *8   *8   *8   *8   *8  *HS   *9   *9   *9   *9   *9  *HS  *HS  *HS  *HS
----------------------------------------------------------------
"*" indicates a diskgroup/spare which will be configured
Size  Type  Disks  Spares
-------------------------
146   FC    15     5

Do you want to continue and configure as shown [yes or no]?: yes
Enclosure 0_0.
Created disk group 8, luns 16,17
Created spare 200
Created disk group 9, luns 18,19
Created spare 201
Created spare 202
Created spare 203
Created spare 204
Binding complete. All luns are created successfully!
Enclosure(s) 0_0 are installed in the system.
Enclosure info:
----------------------------------------------------------------
      0    1    2    3    4    5    6    7    8    9   10   11   12   13   14
----------------------------------------------------------------
0_0: 146  146  146  146  146  146  146  146  146  146  146  146  146  146  146
       8    8    8    8    8   HS    9    9    9    9    9   HS   HS   HS   HS
----------------------------------------------------------------
Configuration completed!
Setup of CLARiiON APM00035106458 storage device complete.

4. Register the World Wide Names using the Connectivity Status window in Navisphere.

5. Proceed with the "Disk transport method" on page 140 procedure.


Troubleshooting Celerra Replicator


As part of an effort to continuously improve and enhance the performance and capabilities of its product lines, EMC periodically releases new versions of its hardware and software. Therefore, some functions described in this document might not be supported by all versions of the software or hardware currently in use. For the most up-to-date information on product features, refer to your product release notes. If a product does not function properly or does not function as described in this document, please contact your EMC Sales Representative.

Where to get help


Product information: For documentation, release notes, software updates, or for information about EMC products, licensing, and service, go to the EMC Powerlink website (registration required) at http://Powerlink.EMC.com.

Troubleshooting: For troubleshooting information, go to Powerlink, search for Celerra Tools, and select Celerra Troubleshooting from the navigation panel on the left.

Technical support: For technical support, go to Powerlink and choose Support. On the Support page, you can access Support Forums, request a product enhancement, talk directly to an EMC representative, or open a service request. To open a service request, you must have a valid support agreement. Contact your EMC sales representative for details about obtaining a valid support agreement or to answer any questions about your account.
Note: Do not request a specific support representative unless one has already been assigned to your particular system problem.

Problem Resolution Roadmap for EMC Celerra contains additional information about using Powerlink and resolving problems.

E-Lab Interoperability Navigator


The EMC E-Lab Interoperability Navigator is a searchable, web-based application that provides access to EMC interoperability support matrices. It is available on the EMC Powerlink website at http://Powerlink.EMC.com. After logging in to Powerlink, go to Support > Interoperability and Product Lifecycle Information > E-Lab Interoperability Navigator.

Log files for troubleshooting


The following log files are available to help troubleshoot replication:

server_log for messages
/nas/log/sys_log for messages
/nas/log/cmd_log for internal commands run for replication

Additionally, use the fs_replicate -info command with the -verbose <number_of_lines> option to view the delta-set status.
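For example, assuming server_2 is the Data Mover that owns the replication session, the following Control Station commands are one way to scan these logs (the grep and tail filters are only illustrative, and the server_log command is assumed to be available as on current Celerra versions):

$ server_log server_2 | grep -i replication
$ grep fs_replicate /nas/log/cmd_log | tail -20
$ tail -50 /nas/log/sys_log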

server_log messages
Table 13 on page 155 shows an example message generated in the server_log for a Data Mover performing replication at the source site:
Replication::Valid v:CISn363 Delta:2564 ad:525312 g:2564 nc:1

Table 13 Sample server_log message from source site
v: Volume ID of the source file system. In the above example, 363 is the volume ID.
Delta: Delta-set number copied to the SavVol.
ad: Address at the block level (1 block = 512 bytes) on the SavVol where the delta set is created.
g: Chunk number on the SavVol. If all delta sets consist of one chunk, this number is the same as the delta-set number.
nc: Number of chunks in the delta set. One chunk equals 128 MB.

Table 14 on page 155 shows an example message generated in the server_log for a Data Mover performing replication at the destination site:
Playback: v:361, Delta:3557, g:3557, ad:263168, nDelta:7

Table 14 Sample server_log message from destination site
v: Volume ID of the destination file system.
Delta: First delta-set number in the group replayed to the destination file system.
g: First chunk number in the first delta set. If all the delta sets consist of one chunk, this number is the same as the delta-set number.
ad: Address on the SavVol where the first delta set in the group is located.
nDelta: Number of delta sets in the group replicated to the destination file system.


Network performance troubleshooting


When you experience a performance issue during the transfer of delta sets, check the following:

Duplex network configuration mismatch (Full, Half, Auto, and so on) between the Data Mover and the network switch
Packet errors and input/output bytes, using the server_netstat -i command on the Data Mover
Packet errors and input/output bytes on the network switch port
Transfer rate in the fs_replicate -info command output

Create a file from the network client, refresh the source file system, and then check the transfer rate in the fs_replicate -info command output.
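A minimal check along these lines, assuming server_2 is the source Data Mover and src_ufs1 is the source file system (both names are placeholders):

$ server_netstat server_2 -i     # look for input/output errors on the replication interface
$ fs_replicate -info src_ufs1    # compare current_transfer_rate and avg_transfer_rate (see Appendix A)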

Failure during transport of delta set


If a network failure occurs during the transport of a delta set over IP, the replication service continues to resend this delta set until it is accepted by the destination site.

Failure of fs_copy command process


If the output of the fs_copy -list command reports failed in the Status field, or the /nas/log/sys_log has one or more Copy failed messages, the fs_copy process has failed and you must do the following:
1. Abort the fs_copy operation by running the fs_copy -abort command.
2. Start replication from scratch, as described in "Start replication from scratch" on page 129.
The following /nas/log/sys_log excerpt shows messages related to an fs_copy failure:
Aug 29 23:00:38 2006 VMCAST:3:9 Slot 2: 1156910199: Group:4673_0001854002020039 FSID:4666 Resync Copy failed. Full Copy has to be started. (t:17978065009336)
Aug 29 23:00:38 2006 VMCAST:3:2 Slot 2: 1156910199: Group:4673_0001854002020039 FSID:4666 Copy Failed restartad:0x0 (t:17978065072171)
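A quick way to confirm the failure and clear the session before restarting, sketched with placeholder names (dst_ufs1 stands for the destination file system of the failed session; as noted under return code 6 later in this section, a copy session should be aborted using the destination file system):

$ fs_copy -list                         # the Status field shows "failed" for the broken session
$ grep "Copy failed" /nas/log/sys_log   # confirm the failure messages
$ fs_copy -abort dst_ufs1               # abort the session using the destination file system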

Control Station restarts during replication


During replication processing:

If the Control Station restarts during an fs_copy, you must abort the file system copy using the fs_copy -abort command for the source and destination file systems. Then start the copy again. If the Control Station restarts with active file system replications, the replication sessions are not affected and continue to run.


Control Station fails over


For remote replication, if a primary Control Station fails over to a standby Control Station, the replication service continues to run but the replication management capabilities are unavailable. For example, you cannot perform list, abort, start, or refresh replication functions from the standby Control Station. First resolve the problem with the primary Control Station before executing any of these commands. This applies to the Control Stations at the source and destination sites.

NS series loses power


If any of the NS series systems or a CNS-14 or NSX series system attached to a CLARiiON storage system loses power, after you power up the system, determine if you can restart replication using an incremental copy of the source file system as described in "Restarting a replication relationship" on page 89. If not, abort as detailed in "Abort Celerra Replicator" on page 79 and start replication from the beginning as described in "Initiating replication" on page 39.

Return codes for fs_copy


This section describes the possible fs_copy return codes, which are helpful when writing scripts that check for errors from the fs_copy command. Table 15 on page 157 lists the possible return codes, and Table 16 on page 158 lists the return codes with corresponding error message IDs, a brief description, and error severity. To obtain detailed information about a particular error, use the nas_message -info <error_id> command.
Table 15 Return codes and description

0: Command completed successfully.
1: CLI usage error.
2: The object ID specified in the command line does not exist or is invalid.
3: Unable to acquire locks or resources for the command to start successfully.
4: Not applicable.
5: Possible communication error.
6: Transaction error reported during internal process transaction or during object verification or post transaction processing.
7: Data Mover error.
8: Not applicable.
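These return codes can be acted on directly in a Control Station shell script. The following is a minimal sketch, not part of the original procedure; the file system and checkpoint names are placeholders taken from the tape-transport example, and only the exit status and the nas_message command described in this section are used:

#!/bin/sh
# Start a differential copy and branch on the fs_copy exit status.
fs_copy -start src_ufs1_ckpt2 dst_ufs1:cel=cs110 -fromfs src_ufs1_ckpt1 -Force -option monitor=off
rc=$?
case "$rc" in
  0) echo "fs_copy command completed successfully" ;;
  1) echo "CLI usage error: check the command options" ;;
  5) echo "Possible communication error: check Data Mover connectivity" ;;
  *) echo "fs_copy failed with return code $rc; check /nas/log/sys_log" ;;
esac
# For the detailed text of a specific error ID reported in the output or logs:
# nas_message -info <error_id>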


Table 16 fs_copy return codes with error IDs and message descriptions

Return code 0:
10246: This warning indicates that the time skew between the local and remote Celerra may have exceeded 10 minutes or that there is a passphrase mismatch between the local and remote Celerra.
10247: This warning indicates that the remote Celerra could not be reached with an http connection.

Return code 1:
2102: This CLI usage error indicates that an invalid command option, such as the file system name, Celerra system name, interface name, or IP address, was specified.
10142: This CLI usage error indicates that an invalid interface or IP address was specified for the local or remote Celerra system.

Return code 2:
2211: This error occurred because an invalid ID was specified.

Return code 3:
2201: This error occurred because a process is not able to lock or reserve a resource, such as a file system, Data Mover, or interface.

Return code 5:
4001: This is a communication error that occurred because the Data Mover is not connected.
4002: This communication error occurred because either the source Data Mover or the destination Data Mover (on which the destination file system is mounted) is not connected.
4036: This communication error indicates that the interface specified for the source Data Mover is not connected to the interface on the destination Data Mover.

Return code 6:
2001: This error occurred during the transaction when another process was modifying the database being accessed by the current command.
2103: This CLI usage error occurred during argument verification and indicates that an invalid command option, such as convert, autofullcopy, qos, or resync, was specified.
2207: This error occurred when the transaction was aborted due to an internal failure or abnormal termination of the command.
2225: This error indicates that an invalid checkpoint was specified in the -fromfs option of the differential copy argument, or that the abort command was issued and the file system is involved in multiple copy sessions. The copy session should be aborted using the destination file system.
2227: This error indicates that the specified destination file system has an FLR status of Enterprise. FLR-C enabled file systems (source or destination) are not supported for fs_copy or Celerra Replicator (V1).
2237: This error occurred during the transaction as specified in the error message.
2241: This error indicates that the command executed on the remote system failed.
2243: This error indicates that the checkpoint specified in the command is inactive.
2245: This error indicates that the file system type is invalid.
3105: This error indicates that either 1) an invalid file system is specified for the copy session or the copy session already exists, 2) the file system is not part of the specified session, or 3) the restart checkpoint specified is not newer than the destination file system.
3128: This error indicates that the source file system was restored from the checkpoint.
3134: This error indicates that the checkpoint specified is not newer than the destination file system.
3136: This error occurred because the destination file system has replication set up.
3138: This error indicates that the file system is already part of another copy session.
3139: This error indicates that the file system is not part of a copy session.
4019: This error occurred when polling the progress of the copy session and indicates that the command failed to complete.
4103: This error occurred because the file system is not mounted.
4109: This error occurred because the file system is not mounted.
4111: This error occurred because the file system is mounted read/write by the server specified in the message.
4124: This error occurred because the file system is mounted read/write by the server specified in the error.
4205: This error occurred because there is an invalid interface specified for the source file system. Check the interfaces specified for the Data Mover on which the file system is mounted.
4420: This error occurred because the destination file system is a backup of another replication or copy session.
4424: This error occurred because fsck is being executed on the file system.
4425: This error occurred because aclck is being executed on the file system.
4446: This error occurred because the remote Celerra system is running an out-of-family NAS code version with respect to the NAS code version running on the local Celerra system.
4447: This error occurred because IP Alias is configured on the system and the remote system's version could not be identified.
5000: This error occurred during the transaction because of an internal error.
10233: This error indicates that the query executed on the remote system failed.
10272: This error occurred because the destination file system could not be found.
10273: This error indicates that the copy session failed.
10274: This error occurred because either the user aborted the copy session or there is a problem with the Data Mover.
10277: This error indicates that the checkpoint specified is older than the replication configured on the source file system.
10299: This error occurred because a writable checkpoint is specified in the CLI.
10311: This error indicates that the cleanup process for the copy session on the destination failed.
10312: This error occurred because the size of the source and destination file systems does not match.
10310: This error occurred because the system was unable to retrieve copy session information from the Data Mover.

Error messages for Celerra Replicator


As of version 5.6, all new event, alert, and status messages provide detailed information and recommended actions to help you troubleshoot the situation. To view message details, use any of the following methods:

Celerra Manager: Right-click an event, alert, or status message and select to view Event Details, Alert Details, or Status Details.

Celerra CLI: Use the nas_message -info <error_id> command to retrieve the detailed information for a particular error.

EMC Celerra Network Server Error Messages Guide: Use this guide to locate information about messages that are in the earlier-release message format.

Powerlink: Use the text from the error message's brief description or the message's ID to search the Knowledgebase on Powerlink. After logging in to Powerlink, go to Support > Knowledgebase Search > Support Solutions Search.


Related information
Specific information related to the features and functionality described in this document is included in:

EMC Celerra Glossary
EMC Celerra Network Server Command Reference Manual
EMC Celerra Network Server Error Messages Guide
EMC Celerra Network Server Parameters Guide
EMC Celerra Network Server Release Notes
Configuring EMC Celerra Events and Notifications
Configuring EMC Celerra Time Services
Managing EMC Celerra Volumes and File Systems Manually
Managing EMC Celerra Volumes and File Systems with Automatic Volume Management
Online Celerra man pages
Problem Resolution Roadmap for EMC Celerra
Replicating EMC Celerra CIFS Environments (V1)
Using EMC Celerra FileMover
Using EMC Celerra Replicator for iSCSI (V1)
Using File-Level Retention on EMC Celerra
Using International Character Sets with EMC Celerra
Using SnapSure on EMC Celerra
Using TimeFinder/FS, NearCopy, and FarCopy with EMC Celerra

The EMC Celerra Network Server Documentation CD, supplied with Celerra and also available on the EMC Powerlink website, provides the complete set of EMC Celerra customer publications. After logging in to Powerlink, go to Support > Technical Documentation and Advisories > Hardware/Platforms Documentation > Celerra Network Server. On this page, click Add to Favorites. The Favorites section on your Powerlink home page provides a link that takes you directly to this page. Celerra Support Demos are available on Powerlink. Use these instructional videos to learn how to perform a variety of Celerra configuration and management tasks. After logging in to Powerlink, go to Support > Product and Diagnostic Tools > Celerra Tools > Celerra Support Demos.


Training and Professional Services


EMC Customer Education courses help you learn how EMC storage products work together within your environment in order to maximize your entire infrastructure investment. EMC Customer Education features online and hands-on training in state-of-the-art labs conveniently located throughout the world. EMC customer training courses are developed and delivered by EMC experts. Go to EMC Powerlink at http://Powerlink.EMC.com for course and registration information. EMC Professional Services can help you implement your Celerra Network Server efficiently. Consultants evaluate your business, IT processes, and technology and recommend ways you can leverage your information for the most benefit. From business plan to implementation, you get the experience and expertise you need, without straining your IT staff or hiring and training new personnel. Contact your EMC representative for more information.


Appendix A: fs_replicate -info output fields


The first part of the output from the fs_replicate -info command presents information about the source file system. Table 17 on page 164 describes these output fields.
Table 17 First section of fs_replicate -info output fields

id: Source file system ID.
name: Source file system name.
fs_state: Condition of the source file system. Indicates whether the source file system is active (active), mounted read-only (romounted), or unmounted (frozen).
type: Whether it is a replication or playback service.
replicator_state: Indicates if the replication service is active, inactive, or creating a delta set. If inactive, replication has fallen out-of-sync.
source_policy: How replication handles flow-control situations. Values are ReadOnly, Freeze, or NoPolicy, where: ReadOnly = the file system is only available for reads; Freeze = reads and writes are not allowed; NoPolicy = no policy is defined.
high_water_mark: Size in MB of the file system changes accumulated since the last delta set was created. When this size is reached, the replication service automatically creates a delta set on the SavVol.
time_out: Interval (in seconds) when the replication service creates a delta set.
current_delta_set: Current delta set being processed. This reflects the current delta set being tracked in memory.
current_number_of_blocks: Number of modified blocks in the current delta set. One block size is 8 KB.
flow_control: Indicates if there is enough space in the SavVol to write a new delta set to the SavVol. Note: An active status indicates there is not enough SavVol space to process the next delta set.
total_savevol_space: The total size of the SavVol.
savevol_space_available: The amount of free space in the SavVol.


The second part of the output from the fs_replicate command presents information about the destination file system. Table 18 on page 165 describes these output fields.
Table 18 Second section of fs_replicate -info output fields

id: Destination file system ID.
name: Destination file system name.
type: Indicates whether it is a replication or playback service.
playback_state: Indicates whether the playback service is active, inactive, or replaying a delta set. If inactive, replication has not started.
high_water_mark: Size in MB of the delta sets at which the replication service automatically replays a delta set on the SavVol.
time_out: Interval (in seconds) when the replication service creates or replays a delta set.
current_delta_set: Lists the next delta set to replay to the destination file system.
flow_control: Indicates whether there is enough space to write an incoming delta set to the destination SavVol. Note: An active status indicates there is not enough SavVol space to process the next delta set.
total_savevol_space: The total size of the SavVol.
savevol_space_available: The amount of free space in the SavVol.
outstanding delta sets: The delta sets on the destination SavVol that the replication service has not played back to the destination file system.


The third part of the output from the fs_replicate command presents information on the communication state between the source and destination Data Movers. Table 19 on page 166 describes these output fields.
Table 19 Third section of fs_replicate -info output fields

communication_state: Indicates whether the Data Movers on each Celerra system can communicate with each other.
current_transfer_rate: The rate at which the last 128 MB of data was sent across the IP network.
avg_transfer_rate: The average rate at which the last 128 sets of data were sent across the IP network.
source_ip: IP address of the source Data Mover.
source_port: Port number of the source Data Mover. This is assigned dynamically.
destination_ip: IP address of the destination Data Mover.
destination_port: Port number of the destination Data Mover. This is always assigned the number 8888 by default.
QOS_bandwidth: The IP network bandwidth throttle used for this replication relationship. Zero (0) represents maximum available network bandwidth.

When using the verbose option with the fs_replicate -info command, the replication service generates this additional output. Table 20 describes these output fields.
Table 20 fs_replicate -info -verbose output fields

Source file system fields:
Delta: Delta-set number or delta-set ID that was created.
Create Time: Date and start time the delta set was created.
Dur: Duration of time (in seconds) to create the delta set.
Blocks: Number of blocks modified in the delta set.

Destination file system fields:
Playback Time: Date and start time the delta set replayed to the destination file system.
Dur: Duration of time (in seconds) to replay the delta set or DSinGroup.
Blocks: Number of blocks modified in the delta set.
DSinGroup: Number of delta sets in the group played back. In some instances, the playback service can play back more than one delta set at once. In this case, the Dur and Blocks fields refer to the group as a whole, not an individual delta set.



Index
A
aborting replication 79 Automatic File System Extension 98, 100, 104 definition 5 extending after replication failure 104 recovering from failure 100 starting replication from scratch 104

nas_cel, verify Control Station relationship 18, 41 nas_cel, view passphrase 42 nas_fs, creating destination 45 server_mount, mounting file systems 45 server_mountpoint, creating 45 server_sysstat, monitor memory usage 35 configuration considerations 35 replication policies 30 Control Station preinitialization 40

B
bandwidth size changing policy 109 modifying 109

D
data flow control 31 delta set checking status 115 definition 6 minimum size 30 overview 30 delta-set transport failure 156 disk transport 140

C
cautions graceful shutdown 9 serial replication sessions 9 system 9 Unicode to ASCII replication 9 Celerra Replicator cautions 9 checking status of 55, 114 log files 154 restarting replication 89 restrictions 7 starting replication 48 system requirements 22 upgrade considerations 24, 25 checkpoint definition 6 Commands fs_replicate, flow control options 32 nas_fs, calculate SavVol size 33 commands fs_ckpt, copy checkpoints 50 fs_ckpt, creating 43 fs_ckpt, using 59 fs_copy, checkpoint copy 46 fs_copy, copy checkpoints 45 fs_copy, copy incremental changes 52 fs_copy, events for 122 fs_copy, using 46, 52 fs_replicate, aborting 79 fs_replicate, changing bandwidth 109 fs_replicate, check status 55 fs_replicate, control SavVol size 34 fs_replicate, description of output fields 165 fs_replicate, failover 64 fs_replicate, failure 50 fs_replicate, output definitions 165 fs_replicate, resynchronize 68 fs_replicate, reverse 76 fs_replicate, starting 48 fs_replicate, suspending 81 fs_replicate, using 57 Using Celerra Replicator (V1)

E
exit codes for fs_copy 157 extending file system after replication failover 104

F
failover initiate 64 options 62 failure delta-set transport 156 failure, fs_replicate 50 file system automatically extending size 98 events 121 manually extending size 101 flow control freeze 107 read-only 107 fs_copy return codes 157

H
high water mark definition 6 resetting policy 105 HTTP communication 17

I
information, related 163 Initial Copy transporting by disk 144 initial copy transporting by disk 140 transporting by tape 144


IP replication service definition 6

L
local replication definition 6 description of 10 process overview 10 system requirements 22 log files 154 loopback replication, definition 6 losing power, CLARiiON storage system 157

M
messages, server_log 155

N
nas_cel, using 41 nas_fs, using 45, 101 network performance, troubleshooting 156 NS series, loses power 157

O
overview local replication 10 remote replication 12 replication failover 13 replication reverse 17 suspend 81

P
parameters, changing Replicator SavVol size 123 passphrase establishing 17 viewing 42 physical transport of replication data 139 playback service checking 115 definition 6 policy bandwidth size 109 flow control 31 high watermark 30 time-out 30

description of 11 establish communication between Celerra systems 17 process overview 12 start replication 48 system requirements 22 verify communication between Celerra systems 41 replication aborting 79 starting from scratch 104 replication failover definition 6 process overview 13 replication policy configuring 35 flow control 107 high watermark 30 setting 105 time-out 30 triggers 35 replication service, definition 6 replications per Data Mover 34 Replicator SavVol definition 6 parameter for changing size 123 size requirements 33 restart out of synchronization checkpoints 21, 59 suspended replication 20 restrictions system 7 TimeFinder/FS 7 TimeFinder/FS Near and Far Copy 7 resynchronize process overview 16 reversing replication after failover 75 definition 6 maintenance procedure 111 process overview 17

S
server_df, monitoring file systems 114 server_log 155 server_param command, SavVol default 123 server_sysstat, monitoring replication 114 SnapSure, SavVol definition 6 SNMP traps 121 status checking 55 of replication 114 suspend overview 81 procedure 20 system requirements 22 restrictions 7

R
recovering from a corrupted file system using nas_fsck 130 recovering from auto extension failure 100 remote replication copy checkpoint 45 incremental changes 52 create checkpoint 42 destination file system 45 definition 6


T
tape, transporting initial copy procedure 144 timeout definition 6 replication policy 30 resetting policy 105 triggers, data flow 31 troubleshooting accommodating network concerns 127 calculating modifications rate on source file system 127 changing the passphrase 124 Control Station failovers 157 Control Station reboots during replication 156 copying a file system to multiple destinations with fs_copy 136 creating checkpoints on the destination site 136 creating restartable checkpoints 125 enlarging SavVol size 126 error messages 162 failure during transport of delta set 156 network performance 156 NS series loses power 157 reading server_log messages 155 recovering from an inactive replication state 135 return codes for fs_copy 157 starting replication from a differential copy 128 starting replication from scratch 129 temporarily mount destination file system read/write 133 using log files 154 trust relationship 17

U
upgrading 24, 25

V
Virtual Data Mover, definition 7 virtual provisioning definition 7


About this document


As part of its effort to continuously improve and enhance the performance and capabilities of the Celerra Network Server product line, EMC periodically releases new versions of Celerra hardware and software. Therefore, some functions described in this document may not be supported by all versions of Celerra software or hardware presently in use. For the most up-to-date information on product features, see your product release notes. If your Celerra system does not offer a function described in this document, contact your EMC Customer Support Representative for a hardware upgrade or software update.

Comments and suggestions about documentation


Your suggestions will help us improve the accuracy, organization, and overall quality of the user documentation. Send a message to techpubcomments@EMC.com with your opinions of this document.

Copyright 1998-2009 EMC Corporation. All rights reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED AS IS. EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date regulatory document for your product line, go to the Technical Documentation and Advisories section on EMC Powerlink. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners.
