Release 7.1
EMC Corporation
Corporate Headquarters:
Hopkinton, MA 01748-9103
1-508-435-1000
www.EMC.com
Copyright © 1998 - 2012 EMC Corporation. All rights reserved.
Published July 2012
EMC believes the information in this publication is accurate as of its publication date. The
information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO
THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an
applicable software license.
For the most up-to-date regulatory document for your product line, go to the Technical
Documentation and Advisories section on EMC Powerlink.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on
EMC.com.
All other trademarks used herein are the property of their respective owners.
Preface
Chapter 1: Introduction
    System requirements
    Restrictions
        VNX restrictions
        Limitations
    Cautions and warnings
    User interface choices
    Related information
Chapter 2: Concepts
    Data Mover failover
    Data Mover failure detection
    Control Station failover considerations
    Standby operations
    Choose standby Data Movers
        Failover policies
        Failover example
        Testing standby relationships
        Standby Data Mover rules
    Uses of the server_standby command
Chapter 3: Configuring
    Configure a standby Data Mover
Chapter 4: Managing
Chapter 5: Troubleshooting
    EMC E-Lab Interoperability Navigator
    VNX user customized documentation
    Troubleshoot a failover that has failed
    Error messages
    EMC Training and Professional Services
Glossary
Index
As part of an effort to improve and enhance the performance and capabilities of its product lines,
EMC periodically releases revisions of its hardware and software. Therefore, some functions described
in this document may not be supported by all versions of the software or hardware currently in use.
For the most up-to-date information on product features, refer to your product release notes.
If a product does not function properly or does not function as described in this document, please
contact your EMC representative.
Note: Emphasizes content that is of exceptional importance or interest but does not relate to personal
injury or business/data loss.
DANGER: Indicates a hazardous situation which, if not avoided, will result in death or serious
injury.
Note: Do not request a specific support representative unless one has already been assigned to
your particular system problem.
Your comments
Your suggestions will help us continue to improve the accuracy, organization, and overall
quality of the user publications.
Please send your opinion of this document to:
techpubcomments@EMC.com
Introduction
EMC VNX for File/Unified can be configured with one or multiple Data
Movers and one or two Control Stations. The standby Data Mover or
Control Station assumes operation from the failed component. The VNX
Installation Assistant also supports the initialization of dual Control Station
systems.
Each Data Mover is a completely autonomous file server with its own
operating system image. During normal operations, the clients interact
directly with the Data Mover, not only for NFS or CIFS access but also for
control operations such as mounting and unmounting file systems. Data
Mover failover can protect the system against hardware or software failure.
The primary role of the Control Station is to monitor a Data Mover and to
effect administrative changes in a Data Mover’s configuration.
This document is part of the VNX documentation set and is intended for
the system administrators responsible for managing high availability for
Data Movers and Control Stations in a system.
Topics included are:
◆ System requirements
◆ Restrictions
◆ Cautions and warnings
◆ User interface choices
◆ Related information
System requirements
Table 1 on page 8 describes the EMC® VNX™ software, hardware, network, and storage
configurations required for configuring standbys as described in this document.
Restrictions
When configuring standby Data Movers, you cannot:
◆ Create a standby Data Mover for a standby
◆ Use a primary Data Mover as a standby
◆ Create a new interface for a Data Mover while it is failed over to its standby
Data Mover failover is supported for physical Data Movers only. Virtual Data Movers
(VDMs) associated with a physical Data Mover fail over with the physical Data Mover.
VNX restrictions
VNX for File/Unified can be configured with one or multiple Data Movers and one or two
Control Stations. Table 2 on page 8 lists the Data Movers and Control Station combinations
for the VNX series.
Table 2. Control Station and Data Mover combinations for VNX Series
Limitations
More information on limitations in using EMC Symmetrix Remote Data Facility (SRDF®)
with redundant Control Stations is available in Using SRDF/S with VNX for Disaster Recovery.
Limitations
You cannot use Unisphere to configure the following actions:
Related information
Specific information related to the features and functionality described in this document is
included in:
◆ VNX Glossary
◆ EMC VNX Command Line Interface Reference for File
◆ VNX for File man pages
◆ Parameters Guide for VNX
◆ VNX System Operations
◆ Using SRDF/S with VNX for Disaster Recovery
◆ Using SRDF/A with VNX
◆ Configuring and Managing Networking on VNX
◆ Celerra Network Server Error Messages Guide
◆ VNX for File Release Notes
◆ Problem Resolution Roadmap for VNX
VNX wizards
Unisphere software provides wizards for performing setup and configuration tasks. The
Unisphere online help provides more details on the wizards.
Concepts
Failover occurs when a standby component takes over for a failed primary
component by immediately routing data to an alternate data path or device
to avoid interrupting services during a failure. Data Movers and Control
Stations support failover.
Topics included are:
◆ Data Mover failover
◆ Data Mover failure detection
◆ Control Station failover considerations
◆ Standby operations
◆ Choose standby Data Movers
◆ Uses of the server_standby command
If the Control Station is not running, Data Mover failover cannot occur. When the
Control Station returns to service, it will recognize the Data Mover failure and initiate
the appropriate action depending on the automatic, retry, or manual failover policy.
Failover occurs under any of these conditions:
◆ Data Mover panic: Any hardware faults or software exceptions that may result in a Data
Mover panic.
◆ Stale reason code: Any software exceptions or hardware failures (including sticky memory
errors) that result in a stale reason code for a Data Mover.
◆ Data Mover hang: Any conditions that bring down both internal interfaces and cause the
Data Mover to hang.
◆ Failover failure: Catastrophic failure during a failover operation.
◆ Spontaneous Data Mover reset: Any software or hardware exception that causes a
spontaneous reset of the Data Mover without causing a Data Mover panic (for release
5.6.48.x and later).
Failover does not occur under these conditions:
◆ Manually restarting a Data Mover
◆ Removing a Data Mover from its slot
Note: If a standby Data Mover is not configured, a Data Mover may panic and restart itself in certain
cases when it detects a software problem, that is, a software panic or an exception. Typically, the restart
takes less than 100 seconds. NFS applications and NFS clients do not see any interruption, but might
see the message "server not responding" during the restart. In similar cases, if the standby Data Mover
is configured, then failover occurs.
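The conditions above can be summarized as a small decision function. This is a minimal sketch for illustration only: the Event names, the expected_outcome function, and the outcome strings are hypothetical and are not part of any VNX interface.

```python
from enum import Enum


class Event(Enum):
    """Hypothetical labels for the events listed in this section."""
    PANIC = "Data Mover panic"
    STALE_REASON_CODE = "stale reason code"
    HANG = "Data Mover hang"
    FAILOVER_FAILURE = "failover failure"
    SPONTANEOUS_RESET = "spontaneous Data Mover reset"
    MANUAL_RESTART = "manual restart"
    REMOVED_FROM_SLOT = "removed from slot"


# Events that this section lists as failover conditions.
FAILOVER_EVENTS = frozenset({
    Event.PANIC,
    Event.STALE_REASON_CODE,
    Event.HANG,
    Event.FAILOVER_FAILURE,
    Event.SPONTANEOUS_RESET,
})


def expected_outcome(event, standby_configured, control_station_running=True):
    """Return a short description of what this section says should happen."""
    if event not in FAILOVER_EVENTS:
        # Manual restart or removal from the slot never triggers failover.
        return "no failover"
    if not standby_configured:
        # Without a standby, the Data Mover may panic and restart itself.
        return "no failover; Data Mover may restart itself"
    if not control_station_running:
        # The Control Station acts on the failure once it is back in service.
        return "failover deferred until the Control Station returns"
    return "failover according to policy"
```

For example, expected_outcome(Event.MANUAL_RESTART, True) yields "no failover", while a panic with a configured standby and a running Control Station yields "failover according to policy".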
Standby operations
When any failover condition occurs, you can transfer functionality from the primary to the
standby Data Mover without disrupting the availability of the file system. The standby Data
Mover assumes the following identities from the faulted Data Mover:
◆ Network identity — IP and MAC addresses of all its NICs
◆ Storage identity — File systems controlled by the faulted Data Mover
◆ Service identity — Shares and exports controlled by the faulted Data Mover
The standby Data Mover assumes user file system services (if the policy is set to automatic)
within a few seconds of the failure, transparently, and without requiring users to unmount
and remount the file system.
Failover policies
The failover policy determines how a standby Data Mover takes over after a primary Data
Mover fails. The Control Station invokes this policy when it detects the failure of the primary
Data Mover. Table 4 on page 17 lists the supported failover policies.
Note: The failover policy is associated with the primary Data Mover. Each primary can have only one
policy assigned to it at a time.
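Table 4 remains the authoritative description of each policy. As a rough illustration only, the sketch below assumes semantics that are not spelled out in this section: auto activates the standby immediately, retry first attempts to recover the primary and activates the standby only if recovery fails, and manual waits for the administrator. The function name and the recover_primary callback are hypothetical.

```python
def on_primary_failure(policy, recover_primary=lambda: False):
    """Return the action taken for a failed primary under the given policy.

    The policy semantics here are assumptions for illustration; consult
    Table 4 for the supported behavior.
    """
    if policy == "auto":
        return "activate standby"
    if policy == "retry":
        # Assumed: try to recover the primary first, then fall back.
        return "primary recovered" if recover_primary() else "activate standby"
    if policy == "manual":
        # Nothing happens until an administrator activates the standby.
        return "wait for administrator"
    raise ValueError(f"unknown failover policy: {policy!r}")
```

Because the policy belongs to the primary Data Mover, a caller would look up the policy by primary name before calling a function like this.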
Failover example
Figure 2 on page 18 shows how failover works. In this example, server_2 is the primary and
server_7 is the standby. When the standby is activated, the following occurs:
1. The faulted primary is renamed server_2.faulted.server_7.
2. The standby Data Mover assumes the name of the failed primary Data Mover, server_2.
When the primary Data Mover is restored, each Data Mover takes on its original name:
server_2 for the primary Data Mover and server_7 for the standby.
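The renaming in this example follows a fixed pattern, which the following sketch reproduces. The function names are hypothetical; only the <primary>.faulted.<standby> naming pattern comes from this document.

```python
def names_after_activation(primary, standby):
    """Names held by (primary hardware, standby hardware) after failover."""
    return (f"{primary}.faulted.{standby}", primary)


def names_after_restore(primary, standby):
    """Names held by (primary hardware, standby hardware) after restore."""
    return (primary, standby)


# Reproduces the example above: server_2 is the primary, server_7 the standby.
faulted_name, active_name = names_after_activation("server_2", "server_7")
print(faulted_name)  # server_2.faulted.server_7
print(active_name)   # server_2
```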
If a Data Mover fails, the VNX clients retain normal NFS functions, but FTP, archive,
or NDMP sessions are lost and not restarted. Connections between CIFS clients and
the Data Mover are lost, but the redirector on the client reconnects with the Data
Mover after the failover. However, all data cached by the clients before failover is
lost and data loss can occur. Client applications that use the shares might not recover.
Note: Using SRDF/S with VNX for Disaster Recovery or Using SRDF/A with VNX provides additional
information on configuring standby Data Movers.
Configuring
Action
To configure a standby Data Mover for a primary Data Mover, use this command syntax:
$ server_standby <movername> -create mover=<source_movername> [-policy <policy_type>]
where:
<movername> = name of the primary Data Mover
<source_movername> = name of the standby Data Mover
<policy_type> = failover policy (auto, retry, or manual). If this option is omitted, the manual policy is used.
Example:
To configure server_5 as a standby for the primary Data Mover server_4 with the automatic failover policy, type:
$ server_standby server_4 -create mover=server_5 -policy auto
Output
server_4 : server_5 is rebooting as standby
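When this command is issued from a script, it helps to validate the policy before running anything. The sketch below only assembles the argument list shown in the syntax above and does not execute it. The helper name create_standby_argv is hypothetical.

```python
VALID_POLICIES = ("auto", "retry", "manual")


def create_standby_argv(primary, standby, policy=None):
    """Build the server_standby -create argument list without running it."""
    argv = ["server_standby", primary, "-create", f"mover={standby}"]
    if policy is not None:
        if policy not in VALID_POLICIES:
            raise ValueError(f"policy must be one of {VALID_POLICIES}")
        argv += ["-policy", policy]
    # With no -policy option, the manual policy applies.
    return argv


print(" ".join(create_standby_argv("server_4", "server_5", "auto")))
# server_standby server_4 -create mover=server_5 -policy auto
```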
If the failover policy of a primary Data Mover is set to manual, use the server_standby
-activate procedure when that primary Data Mover fails.
Managing
Procedure
Note:
◆ This process can take some time to complete, depending on the size of the file systems. A message
appears that indicates completion.
◆ The primary Data Mover assumes the name <primary>.faulted.<standby>. The standby Data Mover
assumes the name of the failed primary Data Mover.
Action
To activate the standby Data Mover for a failed primary Data Mover, use this command syntax:
$ server_standby <movername> -activate mover
where:
<movername> = name of the failed primary Data Mover
Example:
To activate the standby Data Mover for server_4, type:
$ server_standby server_4 -activate mover
Output
server_4 :
server_4 : going offline
server_5 : going active
replace in progress ...done
failover activity complete
commit in progress (not interruptible)...done
server_4 : renamed as server_4.faulted.server_5
server_5 : renamed as server_4
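A script that wraps the activation step may want to confirm the renames from the command output. The following sketch parses the "renamed as" lines from the sample output above. It assumes the output format matches that sample exactly, and the parse_renames helper is hypothetical.

```python
import re

# Matches lines such as "server_5 : renamed as server_4".
RENAME_LINE = re.compile(r"^(\S+)\s*:\s*renamed as\s+(\S+)$")

SAMPLE_OUTPUT = """\
server_4 :
server_4 : going offline
server_5 : going active
replace in progress ...done
failover activity complete
commit in progress (not interruptible)...done
server_4 : renamed as server_4.faulted.server_5
server_5 : renamed as server_4
"""


def parse_renames(output):
    """Return a mapping of old Data Mover name to new name."""
    renames = {}
    for line in output.splitlines():
        match = RENAME_LINE.match(line.strip())
        if match:
            renames[match.group(1)] = match.group(2)
    return renames


renames = parse_renames(SAMPLE_OUTPUT)
```

With the sample output, renames maps server_4 to server_4.faulted.server_5 and server_5 to server_4.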
Use only this procedure to restart and restore a Data Mover. If you manually restart
the faulted Data Mover, it broadcasts the same MAC address as the active standby
Data Mover. This can cause system conflicts and loss of network connections.
Note: The Data Mover names revert to their original settings. To verify the Data Mover type, use the
nas_server -list command.
Action
To restore the primary Data Mover, use this command syntax:
$ server_standby <movername> -restore mover
where:
<movername> = original name of the failed primary Data Mover. Do not use the faulted name.
Example:
To restore a faulted Data Mover as the primary Data Mover (server_4), type:
$ server_standby server_4 -restore mover
Output
server_4 :
server_4 : going standby
server_4.faulted.server_5 : going active
replace in progress ...done
failover activity complete
commit in progress (not interruptible)...done
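Because the restore command must be given the original primary name, a wrapper script can reject a faulted name before invoking it. This is a minimal sketch; the restore_argv helper is hypothetical and only builds the argument list shown in the syntax above.

```python
def restore_argv(original_name):
    """Build the server_standby -restore argument list without running it."""
    if ".faulted." in original_name:
        # The command expects the original name, for example server_4,
        # not server_4.faulted.server_5.
        raise ValueError("use the original primary name, not the faulted name")
    return ["server_standby", original_name, "-restore", "mover"]


print(" ".join(restore_argv("server_4")))
# server_standby server_4 -restore mover
```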
Action
To delete the relationship between a primary and a standby Data Mover, use this command syntax:
$ server_standby <movername> -delete mover[=<standby_movername>]
where:
<movername> = name of the primary Data Mover
<standby_movername> = name of the standby Data Mover (if the primary has multiple standbys)
Example:
To remove the standby relationship between server_4 and its only standby Data Mover, type:
$ server_standby server_4 -delete mover
Output
server_4 : done
Action
To change a former standby Data Mover to a primary, use this command syntax:
$ server_setup <movername> -type nas
where:
<movername> = name of the former standby Data Mover
Example:
To make server_5 a primary Data Mover, type:
$ server_setup server_5 -type nas
Output
id = 0
name = jimmy
owner = 0
device =
channel =
net_path = 10.6.1.148
celerra_id = APM000637012930000
done
You must assign a new IP address for the new primary Data Mover. Configuring and Managing
Networking on VNX provides instructions.
Action
To verify the readiness of a standby, use this command syntax:
$ server_standby <movername> -verify mover
where:
<movername> = name of the Data Mover
Example:
To verify the standby status for server_4, type:
$ server_standby server_4 -verify mover
Output
server_4 : ok
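A monitoring script could check the verify output for the "ok" result shown above. This sketch assumes that only a line of the form "<movername> : ok" indicates readiness and treats any other output as not ready. The document does not list other possible outputs, and the standby_ready helper is hypothetical.

```python
def standby_ready(output, movername):
    """Return True only if the output contains '<movername> : ok'."""
    for line in output.splitlines():
        name, sep, status = line.partition(":")
        if sep and name.strip() == movername and status.strip() == "ok":
            return True
    # Anything else is treated conservatively as not ready.
    return False
```

For the example above, standby_ready("server_4 : ok", "server_4") returns True.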
Note: The standby Control Station does not take on the IP address of the faulted Control Station. Each
Control Station is configured with its own IP address. However, you can configure an IP alias that
connects to the active Control Station. VNX System Operations provides more information on IP alias
configuration.
Failover from a standby Control Station on page 28 provides the procedure for changing
the state of the standby to that of the primary (takeover).
Failover from a primary Control Station on page 29 provides the procedure for changing
the state of the primary (failover) to that of standby, and activating the standby to take over
the role of primary.
You must log in as nasadmin and use the su command to become root to perform these
tasks.
Action
To change a standby Control Station to a primary Control Station, from the /nasmcd/sbin directory, type:
# ./cs_standby -takeover
Output
Action
To change a primary Control Station to a standby Control Station, from the /nas/sbin directory, type:
# ./cs_standby -failover
Output
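The two Control Station commands differ in both the option and the directory from which they are run. The lookup below records that pairing as described in the two actions above. It is an illustrative sketch, and the cs_standby_action helper and the role labels are hypothetical.

```python
# Role of the Control Station you are logged in to -> (directory, command).
CS_STANDBY_ACTIONS = {
    "standby": ("/nasmcd/sbin", ["./cs_standby", "-takeover"]),
    "primary": ("/nas/sbin", ["./cs_standby", "-failover"]),
}


def cs_standby_action(role):
    """Return (working directory, argument list) for the given role."""
    try:
        return CS_STANDBY_ACTIONS[role]
    except KeyError:
        raise ValueError("role must be 'standby' or 'primary'") from None


directory, argv = cs_standby_action("standby")
print(directory, " ".join(argv))
# /nasmcd/sbin ./cs_standby -takeover
```

Both commands require root privileges, so a script would run them only after logging in as nasadmin and switching to root.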
Troubleshooting
Error messages
All event, alert, and status messages provide detailed information and recommended actions
to help you troubleshoot the situation.
To view message details, use any of these methods:
◆ Unisphere software:
• Right-click an event, alert, or status message and select to view Event Details, Alert
Details, or Status Details.
◆ CLI:
• Use this guide to locate information about messages that are in the earlier-release
message format.
• Use the text from the error message's brief description or the message's ID to search
the Knowledgebase on the EMC Online Support website. After logging in to EMC
Online Support, locate the applicable Support by Product page, and search for the
error message.
The example describes the various tasks associated with a standby Data
Mover:
◆ Standby Data Mover example on page 36
server_2 : done
Note: If you define multiple standby Data Movers for the same primary Data Mover, the most
recent policy for the primary Data Mover is used if no policy is specified.
id = 2
name = server_3
acl = 1432, owner=nasadmin, ID=201
type = standby
slot = 3
member_of =
standbyfor = server_2
status :
defined = enabled
actual = online, ready
id = 1
name = server_2
acl = 1432, owner=nasadmin, ID=201
type = nas
slot = 2
member_of =
standby = server_3, policy=manual
status :
defined = enabled
actual = online, active
server_2 :
server_2 : going offline
server_3 : going active
replace in progress ...done
failover activity complete
commit in progress (not interruptible)...done
id = 2
name = server_2
acl = 1432, owner=nasadmin, ID=201
type = nas
slot = 3
member_of =
standby = server_2.faulted.server_3, policy=manual
status :
defined = enabled
actual = online, active
Note: After a server fails over to one of its standby servers (in this example, server_3), no other
standby for the faulted server can be used. When the primary Data Mover is restored, all standbys
originally defined for the Data Mover will be available again.
server_2 :
server_2 : going standby
server_2.faulted.server_3 : going active
replace in progress ...done
failover activity complete
commit in progress (not interruptible)...done
server_2 : renamed as server_3
server_2.faulted.server_3 : renamed as server_2
Both servers return to their original names.
server_2 : done
10. Verify that server_2 no longer has an associated standby Data Mover:
$ nas_server -info server_2
Output:
id = 1
name = server_2
acl = 1432, owner=nasadmin, ID=201
type = nas
slot = 2
member_of =
standby =
status :
defined = enabled
actual = online, active
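The nas_server -info output shown in this example is a list of "key = value" lines, so a script can check the standby field directly. The sketch below assumes that layout and uses sample text adapted from the outputs above. The parse_info and has_standby helpers are hypothetical.

```python
def parse_info(output):
    """Parse 'key = value' lines into a dict; lines without '=' are skipped."""
    info = {}
    for raw in output.splitlines():
        line = raw.strip()
        if "=" not in line:
            continue  # for example the "status :" line
        key, value = (part.strip() for part in line.split("=", 1))
        info[key] = value
    return info


def has_standby(info):
    """True if the standby field is present and non-empty."""
    return bool(info.get("standby"))


WITH_STANDBY = """\
id = 1
name = server_2
type = nas
slot = 2
member_of =
standby = server_3, policy=manual
status :
defined = enabled
actual = online, active
"""

WITHOUT_STANDBY = WITH_STANDBY.replace(
    "standby = server_3, policy=manual", "standby ="
)

print(has_standby(parse_info(WITH_STANDBY)))     # True
print(has_standby(parse_info(WITHOUT_STANDBY)))  # False
```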
Control Station
Hardware and software component of VNX for file that manages the system and provides the
user interface to all VNX for file components.
Data Mover
In VNX for file, a cabinet component that is running its own operating system that retrieves
data from a storage device and makes it available to a network client. This is also referred to as
a blade.
failover
Process of immediately routing data to an alternate data path or device to avoid interrupting
services in the event of a failure. The impact to service is dependent on the application’s ability
to handle the change gracefully.
standby device
Device held in reserve against a failure of its active partner. When the active device fails, the
standby device takes over.