
NFS FILE MIGRATION TO A DELL EMC ISILON

CLUSTER
Guidance for optimal data migration of NFS workflows

ABSTRACT
This paper provides technical information and recommendations to help you migrate
data from a single NFS protocol workflow on another NAS vendor to a Dell EMC Isilon
storage cluster. It includes the best practices for planning, setting up, and executing the
migration.

December 2016

WHITE PAPER
The information in this publication is provided “as is.” DELL EMC Corporation makes no representations or warranties of any kind with
respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular
purpose.

Use, copying, and distribution of any DELL EMC software described in this publication requires an applicable software license.

EMC², DELL EMC, and the DELL EMC logo are registered trademarks or trademarks of DELL EMC Corporation in the United States
and other countries. All other trademarks used herein are the property of their respective owners. © Copyright 2016 DELL EMC
Corporation. All rights reserved. Published in the USA. 12/16 White Paper H12517

DELL EMC believes the information in this document is accurate as of its publication date. The information is subject to change without
notice.

DELL EMC is now part of the Dell group of companies.

Table of Contents

INTRODUCTION
    Assumptions
    Audience
    Prerequisites
    The challenge of data migration
    Risk management
    Data integrity
    Data availability

PROJECT PHASES AND METHODOLOGY OVERVIEW
    Discovery and planning phase
        Key aspects of the planning phase
        Migration approach and requirements
        Migration methodology
        Migration toolset selection
    Testing the migration methodology
        Data migration testing
        User acceptance testing
        Cutover methodology testing
        Rollback strategy testing
    Executing the migration
        Data transfer
        Cutover
        Acceptance
        Rollback
        Repetition
    Post-migration

SINGLE PROTOCOL NFS DATA MIGRATION
    Challenges of a single protocol NFS data migration
    Data-specific considerations
    Migration requirements and customer data collection
        Requirements gathering
        Current infrastructure and data analysis
    Determine migration methodology
        Migration methodology considerations
        Migration sequencing
        Type of migration
        Host-based migrations
        Isilon-based migrations
        Migration tool selection and use
        Migration tools
        rsync
        Isilon-based migrations – isi_vol_copy
        Isilon-based migrations from VNX—isi_vol_copy_vnx

MIGRATION PREPARATION
    Infrastructure and environment setup
    Source host preparation
    Migration host preparation—Source and target access
    Isilon cluster configuration preparation
    Additional Isilon cluster considerations
    NFS group membership limitation
    Isilon guidelines for large workloads

MIGRATION APPROACH—TESTING AND PROOF OF CONCEPT
DATA VALIDATION
PERFORMANCE
USER ACCEPTANCE TESTING—DATA AND WORKFLOW TESTING
START OF MIGRATION EXECUTION
PRE-CUTOVER PREPARATION
CUTOVER EVENT
THE GO OR NO-GO DECISION
ROLLBACK
MIGRATION EVENT COMPLETION
STEADY STATE
CONCLUSION
APPENDIX: SAMPLE MIGRATION USE CASE

Introduction
This white paper outlines the recommended approach for migrating single protocol network file system (NFS) data from
other network-attached storage (NAS) systems to a Dell EMC® Isilon® storage cluster. Single protocol NFS data is defined
as data read, written, or modified using NFSv2 or NFSv3 protocols. This paper includes best practices on Isilon cluster
configuration, tool selection, and host setup to optimize an NFS-based data migration. The paper also includes best
practices to optimize performance, management, and support.
Although this paper addresses a single NFS protocol data migration, the approach and many of the best practices
described can be used as a foundation for other types of data migration.

Much of the relevant information for planning, provisioning, and supporting end-user directories on an Isilon storage cluster
is available through white papers and guides from Dell EMC Isilon at http://support.emc.com. As a result, this paper avoids
duplicating this other content and includes only the information that pertains to setting up and operating an Isilon cluster as
a destination for a single protocol NFS data migration.

Assumptions
This document focuses on the data migration of NFS; it does not specifically address the migration to an Isilon cluster of
NFS exports, local users and groups, or any other NFS configuration from another NAS system.

This document should not be used in a multiprotocol migration. A multiprotocol migration requires many additional
considerations, and the specific actions required for such a migration may be different.

Additionally, this paper assumes that:


 The source data is accessed only through a single protocol NFS workflow. Further, authentication and
authorization for POSIX users (UID/GID) need to be consistent on the source and destination migration clusters.
Lightweight Directory Access Protocol (LDAP), Network Information Services (NIS), distributed local files,
and Active Directory (AD) with RFC2307 enabled (SFU) are all centralized
methods of authentication that should be successfully implemented before the migration begins.
 An authoritative and consistent source for user authentication is assumed (for example, AD with RFC2307, LDAP,
NIS, and so on). If there are multiple sources of authentication from disparate clusters targeted for migration (that
is, several different local /etc/passwd files), then they must be inspected for collisions and manually combined, if
necessary, to prevent UID and GID collisions. Identity management that is optimized with preference for a single
external directory service is recommended. If necessary, work with the customer to establish a consistent and
authoritative set of users and groups.
 Workflows will be for NFSv3 with the possibility of adding Server Message Block (SMB) workflows in the future;
NFSv4 is not addressed in this document. The Dell EMC Isilon file system will have a “balanced” on-disk identity
setting for global access control list (ACL) policy.
 Files with POSIX mode bits only are in scope for this document; additional permissions such as SMB ACLs are
not covered.
 Multiple source clusters may have identical export directories. A directory consolidation plan must be developed to
address directory name collisions in the newly created single namespace on the Isilon cluster.
 Files and directories have unique permissions that restrict access to the intended users and groups; a typical
migration preserves and transfers them. Post-migration permission transformation is not covered in this paper.
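
Where multiple sources of authentication must be merged, the UID and GID collision inspection mentioned above can be sketched with standard shell tools. The file names and entries below are purely illustrative:

```shell
# Two illustrative passwd files taken from disparate source filers
cat > /tmp/passwd_server1 <<'EOF'
alice:x:1001:1001::/home/alice:/bin/sh
bob:x:1002:1002::/home/bob:/bin/sh
EOF
cat > /tmp/passwd_server2 <<'EOF'
carol:x:1001:1001::/home/carol:/bin/sh
dave:x:1003:1003::/home/dave:/bin/sh
EOF
# Print any UID that maps to more than one username: a collision that
# must be resolved before the files are merged
cat /tmp/passwd_server1 /tmp/passwd_server2 \
  | awk -F: '{print $3, $1}' | sort -u \
  | awk '{print $1}' | uniq -d
```

Here UID 1001 is reported because it is assigned to both alice and carol. The same check applies to GIDs, using field 3 of the group files.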

Audience
This paper is intended for experienced system and storage administrators who are familiar with file services and network
storage administration.

The document assumes you have a working knowledge of the following:


 NAS systems
 The NFS storage protocol, as is appropriate for the specific migration requirements

 The Isilon scale-out storage architecture and the Dell EMC Isilon OneFS® operating system
 Additional Isilon features, including Dell EMC Isilon SmartConnect™, SmartPools policy
management, SnapshotIQ™, and SmartQuotas™
 File system management concepts and practices, including provisioning, permissions, and
performance optimization
 Integration practices for connecting and establishing authentication relationships using centralized
sources (LDAP, NIS, local files, and so on)
 Basic shell commands, command line operation, and basic shell scripting

While this paper is intended to provide a consolidated reference point for data migrations to an Isilon storage cluster, it is
not intended to be the authoritative source of information on the technologies and features used to provide and support
a file services platform. In the event that additional services are required, Dell EMC IT services are available to assist with
streamlining data migrations, reducing risk, and minimizing impact.

Prerequisites
Some of the features that are described or recommended in this document may require separate per-node licensing from
Dell EMC Isilon. For more information, please contact your Dell EMC Isilon representative.

The challenge of data migration


The migration of a storage system’s data and all the existing user access permissions is a complex process. Moving the
data while limiting downtime and protecting the data can be challenging. While you execute a migration, it is critical that
access to the data is available at all times and that data integrity is ensured to protect against data loss or corruption.

It is critical to understand that a data migration is a unique project. Few environments are the same, and, as a result, each
migration should be considered a unique event. No preexisting approach will necessarily be appropriate for all migrations.
That is not to say that common approaches cannot be used after you evaluate and understand the requirements of a specific
migration. The goal of this white paper is to introduce the recommended approach to designing and executing an NFS data
migration to an Isilon cluster. Every migration is different; this paper provides some examples and guidance.
Dell EMC Professional Services provides expertise in working with customers to build an individual plan that meets their
needs. Whether you will use Dell EMC Professional Services for the migration or you plan to manage the migration in
house, the following are key areas that must be considered before the start of any migration project:
 Investigate the composition of the source data. Is it a deep directory structure or a wide structure with many files
per directory? The type and number of files and directories will directly influence how the migration is planned and
executed.
 Understand the sequence of a migration project—this is critical to its success.
The ability to predict and manage the time required to execute the data movement is paramount—it may be the
single biggest factor that will affect the project’s success.
 Maintaining data availability throughout the lifecycle of the data migration project is also critical. Very little of
today’s data can tolerate being unavailable for days at a time. In order to maintain data availability, you will need a
strategy to maintain access to data throughout the migration process.

Risk management
It is not uncommon to have a number of challenges or perceived problems that can potentially act as blocking issues or barriers to
executing the migration. With sufficient planning and testing, such perceived risks can be addressed and managed successfully.

Common risks associated with data migrations include the following:


 Amount of data (that is, large total volume, high file counts) and the potential for needing an extended
period of time and effort to migrate it
 Potential for performance impacts to the existing data solution and the customer network during the
migration
 Maintaining continuous access to the customer’s data throughout the migration
 Potential for required changes to the data permission models on the new target system for the
migration
 Inherent challenges with moving multiple client connections to the new
target system
 Maintaining a consistent security model after the migration
 Execution of the actual cutover event—with many moving elements that can increase the probability
of error

This paper helps you understand these challenges and risks, and enables you to develop a data migration methodology
that manages these risks while implementing Isilon best practices to facilitate and optimize the migration.

Data integrity
Data integrity is critical. The data must be moved exactly as it is, and any modification to the data during migration may
impact the availability of the data and the success of the migration. The goal of the project is to ensure that the data is
successfully migrated and that its integrity is not compromised during its movement. In most cases, this includes the
migration of all relevant file metadata, as well as the underlying data blocks. A complete backup of the source data should
be made, and the validity of the backup should be verified before the migration begins.

Cases with stringent data integrity requirements can be addressed with the use of checksums. An MD5 checksum can be
calculated for each file on both the source and destination systems post-migration, which can verify bit-for-bit integrity.
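
Such a comparison can be sketched with standard tools, as below. The /tmp paths stand in for the real source and target mount points and are purely illustrative:

```shell
# Stand-ins for the source NFS mount and the migrated Isilon copy
mkdir -p /tmp/mig/src/dir /tmp/mig/dst/dir
echo "payload" > /tmp/mig/src/dir/file.txt
cp -p /tmp/mig/src/dir/file.txt /tmp/mig/dst/dir/file.txt

# Build a sorted MD5 manifest for each tree, then compare them;
# any differing, missing, or extra file shows up in the diff output
(cd /tmp/mig/src && find . -type f -exec md5sum {} + | sort -k2) > /tmp/mig/src.md5
(cd /tmp/mig/dst && find . -type f -exec md5sum {} + | sort -k2) > /tmp/mig/dst.md5
diff /tmp/mig/src.md5 /tmp/mig/dst.md5 && echo "trees match"
```

Note that checksumming every file reads the full dataset twice, so this level of verification is usually reserved for the most sensitive data.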

Data availability
Any migration activity will require a transition or cutover from the existing source systems to the new destination systems
so that a customer’s data clients (end users) can continue to access their data once the data has been moved. This
cutover will require a window of time when the data is unavailable to a customer’s end users. Minimizing this time is the
goal of all migrations, and the time needed for this
process is often determined by the type and the amount of data. A number of migration strategies can be employed to
reduce the period when data is unavailable during the cutover.

Project phases and methodology overview


A data migration project should be broken into distinct phases. The goal of the project phases is to develop a robust and
repeatable migration strategy that aids the migration’s execution and leads to a successful migration cutover.

Discovery and planning phase


The goal of the discovery and planning phase is to design a migration methodology and plan that enables you to execute
the project with minimal risk and downtime. Following are components of the discovery and planning phase:
 Qualify the project
 Identify the migration scope
 Understand expectations
 Identify risks
 Define the timeline
 Identify all migration requirements (that is, a rollback plan)

Key aspects of the planning phase


During this phase, a detailed review of the existing source environment and data is undertaken, and then developed into a
plan to migrate the data to the new target Isilon environment. The planning phase should be completed and validated
before the other project phases are started. The key aspects of the planning phase include discovery of the existing
infrastructure, the data, and the Isilon cluster.

Infrastructure discovery
This is the point where the infrastructure of the existing storage system, network architecture, and the network path
between the source data and the Isilon cluster
are evaluated.

If, for example, multiple source filers are to be combined into a single Isilon cluster, then an extensive analysis of existing
exports should occur. Export directory naming collisions (that is, server1:/exports/home and
server2:/exports/home are to be combined from different source filers) should be investigated to verify that the
data and directories contained within them will be able to coexist in a unified export on the target Isilon cluster (for example,
isilon:/ifs/data/home). For example:

server1:/exports/home
/user1
/user2
/user3
server2:/exports/home
/user4
/user5
/user1
If the /exports/home directories are to be unified into a single /home on the Isilon cluster, the duplicate /user1 directory
must be addressed. Possible solutions include combining (if the user is the same) or renaming the directories (if there are
different users), along with making a corresponding UID change, if necessary. Work with the customer to address these
issues and craft a solution that will be minimally disruptive to their existing environment.
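
Colliding top-level directory names like the duplicate /user1 above can be found mechanically. A minimal sketch using `comm`, where the /tmp paths stand in for the mounted source exports:

```shell
# Stand-ins for server1:/exports/home and server2:/exports/home
mkdir -p /tmp/exp1/user1 /tmp/exp1/user2 /tmp/exp1/user3
mkdir -p /tmp/exp2/user4 /tmp/exp2/user5 /tmp/exp2/user1
ls /tmp/exp1 | sort > /tmp/exp1.list
ls /tmp/exp2 | sort > /tmp/exp2.list
# comm -12 prints only the names present in both sorted listings
comm -12 /tmp/exp1.list /tmp/exp2.list
```

Each name printed (here, user1) needs a consolidation decision: combine the directories if they belong to the same user, or rename one and review the associated UID/GID if they do not.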

Data discovery
This is the point where you analyze the data and workflows that you plan to migrate and determine how they map to the
target end state on the Isilon cluster.

Quotas
If quotas are utilized on the source volume, they will need to be recreated on
the target Isilon cluster. Implementation of the quotas, however, should not take place until after the migration is complete
to avoid any potential issues while transferring data.
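
Recreating source quotas might look like the following OneFS CLI sketch. The path and threshold are illustrative, and the exact option syntax should be verified against the OneFS version in use; run these commands only after the data transfer is complete:

```
# Directory quota on a migrated home directory (illustrative values)
isi quota quotas create /ifs/data/home/user1 directory --hard-threshold=10G
```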

Isilon cluster configuration design and discovery


This is the point where activities such as the design of the Isilon network, disk pools, shares, and authentication can affect
the migration design. The configuration of the cluster is critical to the success of the migration.

The output of the discovery phase influences the migration design and drives the execution of the project.

Migration approach and requirements


The analysis of the data that you collected during the discovery phase drives the migration requirements and the migration
plan. The migration requirements break down into subcategories that ask the questions what, how, and when:
 What—What are you migrating?
  o All the data or a subset of the data
  o Replicate the existing data as it is or transform it during the migration
  o Copy the data but implement a new security model
  o Take a hybrid approach
 How—How are you going to migrate the data, security, and workflows?
  o Tools used to copy the data and security
  o Cutover strategy; how will client connections be moved?
  o If the data has a rapid rate of change, how will you accommodate it?
  o Data is static and can be moved without impact
  o Full data copies and follow-up incremental copies to gather recently updated data
  o Clients access this data currently by method x/y/z
  o Limit access to the old data and redirect during the cutover
 When—When are you implementing the cutover?
  o A single mass event
  o Several large cutover events
  o A series of smaller cutovers sustained over a longer time frame
  o Rolling migration

Once these requirements have been clearly defined, a migration methodology can be developed to address them.
Migration methodology
Analysis of the migration requirements leads to the development of a migration methodology. The migration methodology
follows a waterfall methodology with phases generally occurring on completion of the prior phase. Although the preparation
for upcoming phases can occur before prior phases are completed, the execution is defined by the completion of its
dependent phase.

Figure 1. Sample migration plan

Figure 1 shows a sample migration plan. The plan addresses how all aspects of the migration are achieved: sequencing,
tools, timing, communication, and implementation. After you develop a migration plan, a proof of concept can help
you evaluate the approach and test the phases of the plan.

Migration toolset selection


After you finish the discovery phase and develop a methodology, you can select a toolset (host based, array based, and so
on) to copy the data and permissions.

Testing the migration methodology


After you develop a migration methodology, you must review, validate, and test the migration plan. A test migration is
usually run on a subset of the data. Running a
test migration is also invaluable in helping to estimate the performance and timing
of a migration.

Data migration testing


Testing the actual data movement process and execution is the first phase in testing the overall methodology. The data
migration testing determines whether the proposed methodology meets the requirements and accomplishes the goals of
the project.

The role of data migration testing is as follows:


 Validates the tool selection; does the tool do what you want it to? Does it copy the data and attributes?
Does it preserve hard and/or soft links?
 Validates the data transfer; is the data moved as expected?
 Validates that the permissions are copied over; are the permissions correct, functional, and operational?
 Benchmarks the performance of the data transfer; how long does it take to run full and incremental data
transfers?
 Tests the new data: Is the data available? Are the read/write settings correct? Does the new workflow
function correctly?
 Gives you the option to experiment with different methods, tools, and flags
 Enables you to tune the process to achieve the best results

The testing should give you confidence that the data will be accessible and available after all the data and users are
transferred to the new system.

User acceptance testing


Before you execute the full migration and cutover, user acceptance testing (UAT) should be undertaken against the new
storage system and a sample of the migrated data. UAT validates that the data is ready for cutover by checking that:
 Data is accessible; users and applications can access the data correctly
 Permission models are correct; the required security is applied to the migrated data
 Workflows are operational; there are no issues with using the data

Cutover methodology testing


Cutover methodology testing helps determine how you will move client connections—and how the clients will respond to
the cutover. Through testing, you can gauge how long it takes to move the connections, what kind of issues may occur,
and how to troubleshoot any issues. Testing the cutover strategy thoroughly provides feedback on how to execute the final
cutover.

Rollback strategy testing


You should also test your rollback methodology. The rollback testing should validate that your plan to failback or abort a
migration works so that you are prepared in case any issues occur during the cutover. Be sure to validate that access to
the data on the old system can be restored quickly and efficiently without affecting users.

Executing the migration


Once all the methodology and processes have been completed and validated, you can move on to the main migration.

The core migration phases are as follows:


 Data transfer—all the data is migrated from the old system to the new system
 Cutover—connections and clients are moved to the data on the new storage
 Acceptance—the new data source is ratified
 Rollback—a process used only if required
 Steady state and repeat—migration phase is considered complete but additional separate
migrations may occur
 Post-migration monitoring—the new system and data are monitored following the cutover

Data transfer
A standard approach to data transfer is to execute an initial “full” data copy to move all the initially identified data and to
follow it up with a series of incremental copies, which move only the data that had changed since the initial full copies had
run. This gives you the most flexibility in executing the cutover, as the additional incremental copies will be substantially
shorter to execute than the initial large data copies.

Cutover
After data migration, the process of actively moving clients from the old storage system to the new storage system will
occur during a cutover event.

With a high-level cutover plan, the following steps will occur:


 For the old source—remove write access to ensure that clients are unable to write any new data
 Execute a final incremental copy to migrate any remaining data from the old system to the new
system
 Test new target data and connectivity; selective UAT
 Go or no go—decide whether to move forward with the cutover event
 Update the client-to-storage connection mechanisms—Domain Name System (DNS), Distributed File
System (DFS), virtual IPs (VIPs), and so on
 Monitor the new storage system—monitor load and connections as the clients
are transferred
 Validate clients—review and validate that clients can successfully connect
and operate
 Validate workflow—verify that business operations work as expected
 The cutover is complete
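
On an NFS client, the connection-update step above often amounts to an fstab edit and a remount. A sketch, operating on a scratch copy of fstab with illustrative server names (on a real client you would edit /etc/fstab itself):

```shell
# Scratch fstab with an illustrative entry pointing at the old filer
cat > /tmp/fstab.test <<'EOF'
oldfiler:/exports/home  /mnt/home  nfs  defaults,vers=3  0 0
EOF
# Swap in the Isilon SmartConnect zone name for the old filer
sed -i 's|oldfiler:/exports/home|isilon.example.com:/ifs/data/home|' /tmp/fstab.test
cat /tmp/fstab.test
# On a real client this would be followed by:
#   umount /mnt/home && mount /mnt/home
```

Pointing clients at a SmartConnect zone name rather than a node IP lets the cluster balance the incoming connections across nodes.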

Acceptance
Once the data and client connections have been migrated over to the new storage solution, the storage availability and
workflow acceptance of the new data and storage solution must be validated.

Once you begin writing new data to the new storage system, the ease with which you can roll back to the old storage
system diminishes. If you were to roll back to the old system, any changed data would need to be copied back to the old
environment. Unless newly written data can be discarded, rewritten, or manually reconciled, Dell EMC Isilon strongly
recommends that any required rollback be executed before significant changes are made to the data on the new storage
system.

Rollback
Be sure that you have a fully tested rollback plan in place. A rollback may be needed for a variety of reasons:
 Client connectivity or storage name resolution issues develop following the cutover.
 The final incremental copy is not completed during the outage window, so not all data was migrated.
 An unplanned IT outage or issue occurs at the same time as the migration.
 Data access on the new storage system is invalid and workflows are impaired.

The goal of a rollback plan is to quickly restore access to the old data storage solution. Assuming that the cutover was
executed correctly, restoring the prior data access should be straightforward, and it should be possible to implement the
data restore with minimal additional disruption. The primary goal is to restore access within the cutover window so that no
additional downtime and interruption to data occurs. It is critical to have a tested rollback plan that can be used if an issue
with the cutover occurs.

Repetition
After you validate the data transfer through cutover and client acceptance, most migration projects consist of multiple
migration cycles. The methodology can be executed again on different datasets in migration waves that encompass the
entire project.

Post-migration
Following the migration cutover, it is important to monitor both the new storage system and the old storage system. You
should find that client connections are moving to the new system and that active data connections are no longer initiated
on the old storage system. Can clients connect to and work with the new storage system without issues? As connection
counts increase on the new storage system, you should monitor the load and performance, and make performance
adjustments as needed.

You should be monitoring the following items during and after the cutover:
 New system: System load and performance, number of connections, movement of users, security, and
performance
 Old system: Are users still connecting to it? Are there legacy connections being made to it from old applications?

If user quotas were utilized on the source system, it is appropriate to reimplement them on the target system following the
migration.
The Dell EMC Isilon OneFS® SmartLock feature should likewise be implemented post-migration. Files should be committed only once the data has been completely migrated.

You should have a transition plan for what you will do with the old storage system. Some common approaches are as
follows:
 Keep it around for a while but with administrator access only
 Provide read-only access for users
 Mothball the system while the new system transitions
 Decommission it
 Purge the data after a defined retention period has been reached

Single protocol NFS data migration


Although this paper addresses a single protocol NFS data migration, the approach and many of the best practices
described can be used as the foundation for other types of data migrations.

Challenges of a single protocol NFS data migration


Moving large amounts of data presents a number of challenges:
 It is difficult to perform such a migration without downtime. Most source clusters are overloaded: the data must be available at all times and the systems operate at near capacity. These conditions justify the migration, but they also mean that the migration itself can place a significant additional load on the source cluster.
 A large number of exports may need to be migrated. You must move not only the data but also the exports and
export permissions. This introduces a second type of migration (configuration) that must be undertaken during the
project.
 The consolidation of multiple source filers into a single unified namespace and directory structure can be difficult
to manage.
 There may be a large number of differently connected clients that require separate cutover and validation events.
 NFS exports may be mounted deeper in the exported tree.
 There may be restricted exports to specific hosts and unique permissions. Export options must be verified.
 There may be a high rate of change. Often large environments contain a large number of concurrently connected
clients. In such cases, you must account for the rapid rate of data change during and after cutover.

Data-specific considerations
When you design a migration strategy, you should determine how you would like your data to appear after it has been
migrated.

Consider the scenario of multiple smaller filers being consolidated into a single Isilon cluster. If the source filers are not using a consistent source of authentication, UID/GID collisions may occur. In that case, manual remediation may be necessary to combine and fix user and group accounts that have duplicate IDs.

For example:

filer1: has users user1 (UID:305), user2 (UID:423), and user3 (UID:424).

filer2: has users user1 (UID:305), user4 (UID:423), and user5 (UID:424).

The UIDs for user2 (UID:423) and user4 (UID:423) collide, so one of the two users must be assigned a new UID, and the ownership of that user's files must be corrected during the migration. The same issue occurs with user3 (UID:424) and user5 (UID:424). One possible resolution is to change user4 to UID:1423 and user5 to UID:1424, and to change the ownership of all their files prior to the cutover.

Keep in mind that OneFS will store all UID/GID information regardless of the source. OneFS does not require NFS
authentication.
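The duplicate-UID check described above can be sketched with standard shell tools. The passwd-style dumps, file paths, and user entries below are fabricated for illustration; in practice you would export the user databases from each source filer:

```shell
# Illustrative passwd-style dumps from two hypothetical source filers
cat > /tmp/filer1.passwd <<'EOF'
user1:x:305:100::/home/user1:/bin/sh
user2:x:423:100::/home/user2:/bin/sh
user3:x:424:100::/home/user3:/bin/sh
EOF
cat > /tmp/filer2.passwd <<'EOF'
user1:x:305:100::/home/user1:/bin/sh
user4:x:423:100::/home/user4:/bin/sh
user5:x:424:100::/home/user5:/bin/sh
EOF

# Report any UID that is mapped to more than one user name
# (prints one line per colliding UID: 423 and 424 here; 305 maps to
# the same user1 on both filers, so it is not reported)
awk -F: '{
    if (($3 in seen) && seen[$3] != $1)
        print "UID collision: " $3 " (" seen[$3] " vs " $1 ")"
    else
        seen[$3] = $1
}' /tmp/filer1.passwd /tmp/filer2.passwd
```

Each reported UID must then be remediated manually, as described above, before the affected users can share the new cluster.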

Duplicate export paths present a similar issue:

filer1: has exports /vol/share/acct, /vol/share/work, and /vol/share/eng


filer2: has exports /vol/share/acct2, /vol/share/work and /vol/share/engineering

The problem is that /vol/share/work is the same from both source filers. This issue must be discussed with the
customer prior to migration. A plan for directory consolidation must be developed to deal with export path collisions. One
typical solution is to have an additional directory layer that identifies the original source filer:

isilon: would have exports /ifs/data/filer1/acct, /ifs/data/filer2/acct2, /ifs/data/filer1/work, /ifs/data/filer2/work, /ifs/data/eng, and /ifs/data/engineering

With this methodology, duplicate export paths can safely be consolidated from multiple source filers into a single cohesive
namespace. Clients, however, must be updated to reflect the updated export paths.
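The per-filer directory layer can be sketched as follows. The local path stands in for /ifs/data on the cluster, and the copy command in the comment is illustrative only:

```shell
# Stand-in for /ifs/data on the target cluster (illustrative local path)
BASE=/tmp/ifs_demo/data

# One directory per source filer removes the /vol/share/work collision
mkdir -p "$BASE/filer1/acct" "$BASE/filer1/work" "$BASE/filer1/eng"
mkdir -p "$BASE/filer2/acct2" "$BASE/filer2/work" "$BASE/filer2/engineering"

# Each source export is then copied into its own target directory, e.g.:
#   rsync -a filer1:/vol/share/work/ "$BASE/filer1/work/"
find "$BASE" -mindepth 2 -type d | sort
```

With this layout, both filers' work exports coexist under distinct paths, and the new export paths map one-to-one back to their source filer.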

The metadata of files, in particular, can add complexity to a migration. You must identify the metadata that you want to
migrate with the data. The following metadata can affect your migration strategy:
 File properties such as access time, creation time, modification time, and owner
 File attributes such as read-only or archive (an Isilon cluster does not support the compressed and encrypted attributes)
 Extended proprietary file attributes that are in use; these are not supported on Isilon clusters
 Local users and groups; are any files owned by them?
 Deduplication, archive stub files, or Mac OS X resource forks, if present

Other data-specific considerations include:


 Date/access time/creation time retention requirements—these may not be preserved across migrations, depending on which tool is used; for example, creation time is not preserved with isi_vol_copy.
 Symbolic links will break—depending on where the symbolic links point, the underlying paths will probably change after a migration, and the links will need to be rebuilt.
 Automount maps will need to be repointed (NIS, NIS+)—Similar to symbolic links, the export paths may change
and the hostname may also change.
 Character encoding—Verify that it is the same on source and target; international characters in file names may be
problematic.

How the data needs to appear post-migration:


 Direct replication of all data and attributes
 Migrate just the data, then make updates, fix problems, change the security model, and so on—implementing an entirely new security model on the target

Migration requirements and customer data collection
Before you can plan your migration, you must collect requirements.

Requirements gathering
The data migration planning begins with identifying the data that you want to move from the old storage system to the new
storage system. Here is what you need to document:
Current state—what is the current state of:
Source infrastructure
 Existing storage platforms
 Network design and implementation
 Name resolution infrastructure: DNS, DFS, or global namespace
 Servers, clients, OS, and applications
Source infrastructure configurations
 Volumes
 Shares/exports
 Access
 Authentication
Source data
 Logical structure—data layout and directory depth
 Is the structure wide or deep?
 Physical structure—total size, minimum/maximum/average file size
 Number of files
Source data security
 Current security model and how file access is enforced
 Local users and groups
 POSIX permissions
 LDAP users and groups
Target state—what will be the target state:
Target infrastructure: Isilon cluster
 Network configuration
Target configurations:
 Directory layout and structure
 Shares/exports
 Access and authentication model
Target data
 Logical structure—same as the source or new system
 Physical structure—same as the source or new system
Target data security
 Keep it the same as the current security model
 Migrate and change the security model
 Move the data and implement a new security model

How to gather the data:
 Interview stakeholders
 Gather documents: network diagrams, run books, and infrastructure and application details
 Create a list of exports
 Develop storage reports, and so on
 Review the share permissions
 Examine the directory structure (shallow versus deep), file composition (small versus large), and the
number of files

Current infrastructure and data analysis


Start the migration design phase by collecting the data needed to develop the migration requirements.

Best practice
Create and utilize a standardized data collection and migration planning document, along with a standard target
configuration guide. By using a structured document to gather and collect all your source data and information, you can
identify your migration requirements, which will lead to clear migration design decisions.

Why: This will simplify and consolidate migration planning and implementation.

You need to collect the following information:


 The amount of data; the actual file data, not compressed or deduplicated data
 If there is deduplicated data, the amount of such data; this number will need to be added to the total
 The number of directories and files; identify the directory trees and the quantity of them
 The directory structure: shallow and flat, wide and deep, or otherwise
 The number of directories with more than 10,000 files in them
 The number of exports; are there share name collisions or reuse on multiple source hosts?
 The way these exports are used—for home directories, application, or group use
 How permissions are applied to source—at the individual or at the group level
 The number of source locations; single source system or multiple
 How clients access data; protocols and how they resolve storage names
 The rate of file changes; how often and where files are changing
 Networking architecture; source systems and network between it and the Isilon cluster
 Source system load; understand what load the source storage is under and how much additional overhead from the migration would be tolerable
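A minimal survey of a mounted source export might look like the following sketch. The sample tree and paths are fabricated for illustration; in practice you would point the commands at the NFS mount of each source export:

```shell
# Build a tiny demo tree standing in for a mounted source export
SRC=/tmp/survey_demo
mkdir -p "$SRC/proj/sub"
touch "$SRC/proj/f1" "$SRC/proj/sub/f2" "$SRC/f3"

# Headline numbers for the planning document
echo "files:       $(find "$SRC" -type f | wc -l)"
echo "directories: $(find "$SRC" -type d | wc -l)"
echo "size (KB):   $(du -sk "$SRC" | awk '{print $1}')"

# Directories holding more than 10,000 entries deserve special attention
find "$SRC" -type d | while read -r d; do
    n=$(find "$d" -mindepth 1 -maxdepth 1 | wc -l)
    [ "$n" -gt 10000 ] && echo "large directory: $d ($n entries)"
done
```

Running such a survey per export feeds directly into the data collection document and flags the wide or deep directory structures that affect tool choice and cutover timing.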

Determine migration methodology


After you collect information on the source system, the data, and the infrastructure, you are ready to develop a migration
methodology.

Migration methodology considerations


The elements described in the text that follows detail some of the key things you must consider as you develop a migration methodology.

Logical migration design


By analyzing the structure and layout of the source data, you can make logical migration design decisions—structuring the
migration into distinct executable units. A goal of the migration methodology is to identify logical boundaries that facilitate
the cutover of your clients and workflows.

Some logical migration boundaries are as follows:


 Hosts, filers, servers, and arrays
 Volumes
 Exports
 Directories—users or groups

Best practice—Define migration boundaries

Identify logical migration boundaries: identify clearly defined data structures for migration and cutover—for example, entire exports or directories. Be aware of the size of data inside a migration boundary, as the size of the data affects the outage window required to complete a cutover.

Why: This best practice organizes the migration into segment waves, making the migration easier to manage.

After you segment the logical boundaries into distinct migration phases, you can address other elements of your workflow,
such as metadata, that you need to migrate.

File attributes and security

Most data migration also includes the migration of the file’s metadata: ownership, access times, creation time, and security
descriptors. Before you can execute your migration, determine how you plan to handle metadata and file security.

Common migration approaches:


 Migrate data files as-is (with no change to permissions or ownership).
 Migrate data and permissions, but also reset (recalibrate) the permissions on the destination.
 Migrate the data only. Create new permissions on the destination, or create a new security design.
 Migrate away from an existing security model and implement a new model. Dell EMC Isilon
recommends that you use a central authentication scheme on Isilon clusters. For example, if the NAS
system that you are migrating from uses several directory services, you should consider consolidating
the directory services into a single directory service for the new NAS system.

Best practice—Understand the attributes of the source data before the migration

Be data aware: Identify any DOS attributes, nonstandard extended file attributes, and nonstandard permissions that are
not supported by an Isilon cluster. Also, identify your local users or groups and have a plan to deal with them.

Why: Before you execute the migration, you may need to take additional steps to prepare the data for migration so that it
will be available on the new storage system.

Migration sequencing
The execution of a migration will likely require multiple iterations of the data transfer. If the source data is constantly
changing, try to find a window when the source data can be locked in a read-only state, or deny access to clients. Once
access to the source data is removed, the final data transfer can take place. Otherwise, differences between the data on
the source and the target system might result.

The recommended approach for a data migration is to use a multistep migration. A multistep migration consists of an initial
“full” or “level 0” data copy. The initial data copy is followed by a series of “incremental copies” that update only the new or
changed data. The initial data copy moves an entire copy of the source data. It can often take a long time to execute
because all the data must be assessed and transferred over the network to the migration target.

After the initial copy completes, additional differential transfers copy only the data that has changed since the initial full data
copy was executed. Additionally, any data that is deleted on the source will also be deleted on the target through the
incremental process. The size of an incremental copy is affected by the rate of change of the source data.

You should run multiple over-the-top incremental copies to guarantee the integrity and consistency of any data that encountered issues during the initial full copy. Incremental copies also keep the two data sources in sync with each other and reduce the catch-up work required by the final copy.

A final incremental data copy should always be executed as part of the migration cutover plan to ensure that all the latest
data is on the new target storage system.

Best practice—Run initial full copies followed by incremental copies

Run initial full copies followed by multiple incremental copies. Always execute a final incremental data copy during cutover to ensure that the latest data from the source is migrated.

Why: Executing multiple migration passes will ensure that all the data is transferred and that the latest version of the files
will be stored on the target storage system.
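The full-then-incremental pattern can be sketched with rsync in local mode. All paths here are illustrative scratch directories; in a real migration the source and destination would be NFS mounts or remote rsync targets:

```shell
# Illustrative stand-ins for the source export and the migration target
SRC=/tmp/mig_src/ DST=/tmp/mig_dst/
mkdir -p "$SRC" "$DST"
echo one > "${SRC}file1"
echo two > "${SRC}file2"

# Initial "level 0" full copy of everything
rsync -a "$SRC" "$DST"

# The source keeps changing while clients are still on the old system
echo three > "${SRC}file3"
rm "${SRC}file2"

# Incremental pass: transfers only the changes; --delete mirrors deletions
# so files removed on the source are also removed from the target
rsync -a --delete "$SRC" "$DST"
```

The final incremental pass during the cutover window is exactly this second invocation, run once more after client access to the source is frozen.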

Type of migration
You must determine how the migration will be executed. There are two possibilities: indirect execution from a host and direct execution from an Isilon cluster. With a host-based migration, an intermediary host executes a copy process from the source system to the target system, as shown in Figure 2.

[Figure omitted: the source NAS storage array connects over the LAN/WAN to an rsync intermediary host, which connects over the LAN/WAN to the target Isilon cluster.]

Figure 2. Host-based migration

With a host-based migration, all data is transferred through an intermediary host en route to the Isilon cluster.

If the source system is supported, the Isilon cluster can execute a direct source to the Isilon data copy by using the Isilon
OneFS isi_vol_copy command, which will copy data using the Network Data Management Protocol (NDMP).

Host-based migrations
A host-based approach might be selected for a number of reasons:
 The source system does not support an Isilon-based migration—isi_vol_copy is not supported.
 Connectivity is restricted—storage is on different networks; a host may bridge the networks.
 There is flexibility in execution—separate the execution from the administration of the storage systems.
 There are security restrictions—these can be used to limit access to systems.

In a host-based migration, the toolset executing the migration makes a connection to the source and to the target system,
and then copies the data through the host. For the purpose of this paper, the primary host-based tool is rsync.

Best practice—Select a suitable host

Select a suitable host to run the migration that has adequate network bandwidth and processing power.

Why: Because all the data will move through the host, incorrect sizing may lead to a bottleneck or an interruption in the migration. Using multiple hosts may facilitate multistreamed migrations in which you can maximize network usage and the Isilon nodes by executing multiple migrations concurrently. A host with 10 Gb network connectivity is highly recommended.

Some common considerations are as follows:


 Adequate resources to execute the migrations—CPU, RAM, and network; use 10-gigabit network infrastructure where possible
 Connectivity between the host and the source and target storage systems
 Availability—the host is stable and reliable, with no reboots or downtime
 Dedicated host—not running many other parallel workloads, with restricted user access

The migration host should be optimized for the migration workload, with as much network throughput as possible, because it will send and receive all the data to be migrated. Figure 3 shows an Isilon-based migration.

[Figure omitted: the source NAS storage array connects over the LAN/WAN directly to the target Isilon cluster via isi_vol_copy over the NDMP protocol.]

Figure 3. Isilon-based migration

With an Isilon-based migration, data is pulled directly from the source system to the Isilon cluster utilizing isi_vol_copy.

Another method of migration can be achieved by directly running the Linux rsync replication utility on the Isilon cluster itself.
This approach is shown in Figure 4.

[Figure omitted: the source NAS storage array connects over the LAN/WAN directly to the target Isilon cluster, with rsync running on individual nodes.]

Figure 4. Isilon-based migration with rsync on individual nodes

Similar to the OneFS isi_vol_copy tool, rsync can be run natively on the individual Isilon nodes against locally mounted
NFS source exports that are mounted directly on each Isilon node. Data is transferred directly from the source cluster to
the Isilon cluster, reducing latency and network congestion and eliminating the need for external host computers to move
the data.

Isilon-based migrations
If the source system supports an Isilon-based migration through isi_vol_copy or through direct access with rsync, the connectivity exists, and the migration methodology supports this approach, a direct migration may be the more suitable technique. The main advantage of the direct approach is that no intermediary host is needed to execute the process, and the data does not traverse an external host.

Type of Isilon-based migration

There are two primary types of Isilon-based migrations:


 NDMP-based migration with isi_vol_copy
 Rsync-based migration—run the UNIX rsync tool natively on the Isilon cluster to push or pull data directly to the Isilon target

NetApp migration

A NetApp Isilon-based NDMP migration requires the following:

Isilon requirements: Isilon OneFS 6.5.5.6 or later

NetApp requirements: Data ONTAP 7.x or Data ONTAP 8.x operating in 7-mode

It is anticipated that additional source systems will be supported in future releases of Isilon OneFS.

As in all migration strategies, it is critical to evaluate the migration methodology against the selected approach to determine
if the method selected will facilitate your migration goals.

Best practice—Evaluate migration approach

Evaluate and select the most appropriate migration approach by selecting the method that meets your specific migration requirements, provides cutover flexibility, and optimizes data throughput.

Why: The selected approach will impact the migration schedule and planning.

Once you have identified the migration approach you will use, you can select the appropriate migration tool.

Migration tool selection and use


The data migration requirements will help define the tool selected to execute the data migration.

Tool selection
A number of tools are available that will work for your migration. Any file copy method that can connect over NFS to the source and target storage can be used to move data between the systems. Dell EMC Isilon recommends that you use a tool that can be automated and that provides robust functionality—one that can copy attributes and security, provide logging, and so on.

The common NFS data copy tools are shown in Table 1.

isi_vol_copy
Advantages:
 Is included with Isilon OneFS
 Pulls across all user and group permissions
 Supports both SMB and NFS protocols
 Utilizes NDMP
 Provides a direct source-to-target migration
Disadvantages:
 Is supported only against specific source storage systems—NetApp systems running Data ONTAP 6.5 and later with NDMP v4
 Has limited error reporting

isi_vol_copy_vnx
Advantages:
 Is included with Isilon OneFS
 Pulls across all user and group permissions
 Supports both SMB and NFS protocols
 Utilizes NDMP
 Provides a direct source-to-target migration
Disadvantages:
 Is supported only against specific source storage systems (VNX 7.x OE and Celerra DART 5.6.x or later)
 Has limited error reporting

rsync
Advantages:
 Is the Dell EMC preferred tool
 Is designed for synchronizing directories
 Sends only differences in data when files change
 Can be used with many available switches
 Can be scripted
 Is open source—widely available
Disadvantages:
 Has limited error reporting

tar, cpio
Advantages:
 Can be scripted
 Is open source—widely available
 Is good for one full push of data
Disadvantages:
 Has limited error reporting
 Is designed for backup and restore, not active copying
 Does not make incremental copies

Table 1. Summary of NFS copy tools

Tool versions
It is important to understand that different tools may behave differently on different hosts. Dell EMC Isilon strongly
recommends that you test tool versions and observe their behavior.

Best practice—Use the correct tool for the job

For NetApp filers: use isi_vol_copy
For VNX: use isi_vol_copy_vnx
For general NFS filers: use rsync

Why: Using the correct tool for the job will give you the best chance for a successful migration.

Best practice—Use the correct version of the tool

Rsync is available on nearly every UNIX and Linux distribution, as well as natively on the Isilon cluster. You must use the correct version for the OS of the host that is running the tool.

Why: Using the correct version of the tool will optimize throughput and performance, that is, use a 64-bit version if your
host OS is 64-bit.

Best practice—Use the latest version of migration tools

You should always use the latest versions of the chosen file copy tool.

Why: Performance is optimized when you use the latest versions of a file copy tool, and they often have newer features
and bug fixes.

Migration tools
The following section provides an overview of the primary NFS migration tools that can be used in Isilon data migrations.

rsync

Overview:
The rsync tool provides a method for copying files, directories, and subdirectories from NFS exports to other NFS exports
with the ownership and attributes intact. It was designed to efficiently synchronize files and directories from one location to
another, minimizing data transfer while using delta encoding where appropriate. If the source and destination have many
files (and parts of files) in common, the utility need only transfer the differences between them. Incremental change copies
are thus extremely efficient.

Rsync can operate in both local and remote modes (as a service) and behaves similarly to rcp. It can "pull" or "push" files from or to filers.

Rsync should be run as root to preserve file permissions and ownership. It can also use Secure Shell (SSH), if necessary, for secure environments.

Source code is available, and rsync is implemented on nearly every modern operating system.

Usage:

rsync [options] <source> <destination>

Features:
 Enables you to copy file data, ownership, and time stamp information
 Is extremely efficient for incremental copies

For a full list of rsync features and switches, run the following from a command shell:

man rsync

Sample rsync command:

rsync $OPTS [$SOURCE:]$SOURCEDIR [$TARGET:]$TARGETDIR

where variables are typically defined:


$OPTS="--force --ignore-errors --delete-excluded --exclude-from=$EXCLUDES --delete --backup -a"

$EXCLUDES=/path/filestoexclude

$SOURCE=name of source filer

$SOURCEDIR=/path/sourcefiles

$TARGET=name of the destination Isilon node

$TARGETDIR=/path/targetdir

Rsync by default runs in a local mode, but with the addition of [$HOST:] in front of either $SOURCEDIR or $TARGETDIR,
it can transfer files remotely between systems.

For example:

rsync -avh /tmp/foo root@host2:/tmp/bar

If run from the source system, it would transfer the local directory (/tmp/foo) to the remote host (host2) and place them in
the target directory (/tmp/bar).

Note that a shell script is usually created to automate and distribute the rsync jobs. Entire migrations can be automated
and incremental passes scripted to run automatically. Review the scripts with the customer to verify that the sequence of
commands matches the expected migration plan.

Best practice—rsync and compression

If the data is mostly binaries or large uncompressible files, Dell EMC Isilon does not recommend that you use
compression, as this will slow the migration considerably. Text files, however, will readily compress and, if the source data
consists of text files, using this option will greatly speed the migration. Know the source file composition.

Why: Trying to compress non-compressible data will greatly slow the migration.

Best practice—Exclude snapshots from replication

It generally does not make sense to migrate snapshots because they will not automatically work on the target system as
intended. Therefore, exclude them from the migration to speed the process.

Why: Snapshots will not migrate.

For example:

rsync -avh --exclude='.snapshot*' /tmp/foo root@host2:/tmp/bar

If run from the source system, this would transfer the local directory (/tmp/foo) to the remote host (host2) and place data in
the target directory (/tmp/bar) while excluding snapshots.

Best practice—Watch for spaces in names

Be aware that spaces in file and directory names can cause problems.

For example, a directory named “/spaces in my name” and a file named “some file.avi” would require special handling on
both a command line and in a script:

rsync -av foo@foomachine:'/spaces\ in\ my\ name/some\ file.avi' /local_directory/

The “\” character is used before spaces to prevent the shell from parsing the next word as a separate argument.

Why: Spaces in file and directory names can cause scripts to fail, so be alert for them.

Best practice—Starting rsync switches

Suggested initial rsync switches:

-a=archive mode; equals -rlptgoD (no -H, -A, -X)

--delete=delete extraneous files on the target system (useful on incremental copies if the source data has been deleted)

--force=force the deletion of directories even if they are not empty (during incremental passes if directories are deleted on the source)

-z, --compress=compress the data being transferred (if the data is compressible)

Why: Dell EMC Isilon recommends that you start with a baseline of switches and test the copy, validate the results and
behavior of the copy, and make the appropriate adjustments to the rsync switches. No single default set of switches will
work for all migrations. Remember, rsync can be run multiple times incrementally, and different directories/exports may
require different options.

You should become familiar with many of the rsync switches and their use. The following highlights a few possible options
that you should be familiar with. It is important to recognize that each migration will require different switches because of
the unique requirements of each dataset.

A few useful switches to be aware of include the following:

-r,--recursive

Recurses into directories

-l,--links

Copies symlinks as symlinks

-p,--perms

Preserves permissions

-h

Outputs numbers in a human readable format

--progress

Shows progress during the transfer

-z,--compress

Compresses the file data during the transfer

-g,--group

Preserves the group

-o,--owner

Preserves the owner

-D

Preserves special files and device files

--protect-args

Enables you to transfer files that contain white space; you can either specify --protect-args or escape the white space with
a “\”

--stats

Provides a detailed list of the total number of files, files transferred, benchmarks, and an average transfer speed

-t,--times

Preserves modification times

-v,--verbose

Increases verbosity

-n,--dry-run

Performs a trial run with no changes being made

--exclude=PATTERN

Excludes files matching PATTERN

--exclude-from=FILE

Reads exclude patterns from FILE

Symbolic links and hard links

Be aware of symbolic links within the source file system. They may not point to the same target after migration if paths
change.

Rsync has multiple methods of dealing with symbolic links. Choose the most appropriate option after consulting with the
customer. These links can also be addressed in a separate migration pass.

By default, links are not transferred at all. A message such as “skipping non-regular file” is generated for any symbolic links
that rsync encounters. Switches to deal with links include:

--links

Symbolic links are recreated with the same target on the destination. Note that --archive implies --links.

-L,--copy-links

Symbolic links are “collapsed” by copying their referent, rather than the symbolic link.

-H,--hard-links

Preserves hard links

--safe-links

Ignores symbolic links that point outside the tree that is being replicated. This is useful for preventing sensitive system files
such as /etc/passwd from being inadvertently copied.

Best practice—Know the rsync switches

You should understand all the rsync switches and when and how to use them.

Why: Different migrations will require the use of different switches to meet the requirements of the data copy and the final
state of the migrated data. Discuss this with the customer before the migration begins to determine the optimal selection
of switches.

Best practice—Parallelizing the rsync processes

Examine the source directory structure and look for obvious ways to divide the source directory tree into smaller, more
manageable chunks.

For example, suppose that copying a file system of 4,000,000 files takes six hours as a single job. Consider if the file system tree were divided into something like the following:

drwxr-xr-x 2 root root 179 Jul 19 15:00 directory_a
drwxr-xr-x 2 root root 179 May 1 00:00 directory_b

It would cut the migration time roughly in half if you could run two simultaneous rsync jobs (assuming the content of the directories is nearly equally balanced):

rsync -av --include="/directory_a*" --exclude="/*" --progress remote::/ /localdir/


rsync -av --include="/directory_b*" --exclude="/*" --progress remote::/ /localdir/

The best performance would result from spreading requests across multiple Isilon nodes and multiple source network
interfaces. Multiple rsync jobs can be run on individual nodes as well, but these processes tend to be network limited. You
want to spread the load across as much of the Isilon cluster as possible, maximizing the available bandwidth on each
node. Be aware if the NFS option “map root to nobody” is implemented, as this may affect access to files.

Why: You will see increased performance, but you may be limited by network bandwidth and source cluster throughput.

Isilon-based migrations – isi_vol_copy


Overview:

isi_vol_copy is a native Isilon OneFS tool that supports data migration through NDMP. The tool allows the cluster to mimic the behavior of a backup target and allows the data to be copied directly from the source system to the Isilon cluster, which preserves permissions and ownership.

Usage:

isi_vol_copy <src_filer>:<src_dir> [options] <dest_dir> [ -full | -incr ]

[-sa user: | user:password]

[-sport ndmp_src_port]

[-dhost dest_ip_addr]

[-maxino maxino]

[-h]

Features:
 Utilizes native NDMP functionality and connectivity
 Supports full and incremental backup levels
 Migrates data and all security and attribute information
 Will restore the set of permissions and ACLs that existed on the source data
 Will migrate NFS and SMB source data
 Does not impact or interact with client data access
 Provides a dedicated data transfer pipe between the source and cluster
 Starting with OneFS 7.0.2, it supports the Backup Restartable Extension, so that full backups can be interrupted
and restarted from a checkpoint

Limitations:
 Source NAS arrays have limits on the number of NDMP threads and simultaneous backup jobs; you should
therefore avoid overloading the source NAS system.
 It can be limited by source filer network bandwidth.

Sample isi_vol_copy command

isi_vol_copy <source_filer_IP>:/<source> -sa <ndmpuser>:<ndmppassword> /ifs/data/<source_filer> -full

Best practice—isi_vol_copy target data use

Do not alter data on the target Isilon system until after the isi_vol_copy has completed.

Why: Altering data mid-copy will create inconsistencies, and you may have to re-run a full copy.

Best practice—Simultaneous isi_vol_copy use

Do not point multiple isi_vol_copy migrations at the same target directory. For example, avoid the following:

filer1:/vol/sourcedir -> isilon:/ifs/data


filer2:/vol/sourcedir2 -> isilon:/ifs/data

Why: This creates problems for the copy process and may require remediation after migration.

Instead: Use an additional directory level:

filer1:/vol/sourcedir -> isilon:/ifs/data/filer1/sourcedir


filer2:/vol/sourcedir2 -> isilon:/ifs/data/filer2/sourcedir2

If consolidation is required, this can occur after the data is migrated and any potential merging of identically named
subdirectories can be addressed.

Best practice—isi_vol_copy use

isi_vol_copy is optimized to stream as much data as possible across a network; always monitor load on the source and
target systems for any potential impact.

Why: Because isi_vol_copy is optimized to stream as much data as possible, it can overwhelm older source systems and cause link saturation or disk problems, especially if users are connected and attempting to access files.

Best practice—isi_vol_copy limits

Dell EMC Isilon recommends that you use fewer than 40 million files per volume transfer when using isi_vol_copy.

Why: All programs have limits, and this is the recommended maximum when using isi_vol_copy for each individual
transfer. Larger source volumes should be broken up into smaller chunks (that is, use a separate isi_vol_copy stream for
multiple subdirectories instead of one large transfer of an entire volume).

Once the initial copy is complete, then incremental copies can be run:

isi_vol_copy filer1:/vol/sourcedir -sa root:<password> /ifs/data/filer1/source -incr

Important: Do not start an incremental copy job until a full copy has been completed. A successful full copy must finish before an incremental copy is started. Unlike rsync, which automatically performs incremental copies, isi_vol_copy must be explicitly called with -incr to perform an incremental copy.
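One way to enforce the full-before-incremental ordering is a small wrapper that records a successful full copy before allowing -incr. This is a sketch only: isi_vol_copy is stubbed with echo so the control flow can be exercised off-cluster, and the filer name and credentials are placeholders.

```shell
#!/bin/sh
# Refuse to run an incremental copy until a full copy has completed.
set -e
ISI_VOL_COPY="echo isi_vol_copy"            # stub; use the real binary on-cluster
MARKER=/tmp/ivc_demo/full_copy.done
mkdir -p /tmp/ivc_demo

if [ ! -f "$MARKER" ]; then
    # No full copy recorded yet: run the full pass, then record success.
    $ISI_VOL_COPY filer1:/vol/sourcedir -sa root:password /ifs/data/filer1/source -full
    touch "$MARKER"
else
    # A full copy already completed: incrementals are now safe.
    $ISI_VOL_COPY filer1:/vol/sourcedir -sa root:password /ifs/data/filer1/source -incr
fi
```

The marker file guards each source/target pair; one marker per migration stream keeps independent streams from blocking one another.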

An entire volume does not need to be migrated; subsets of source directories can be migrated individually.

For example:

A volume, /export/vol1, is exported containing subdirectories /work, /scratch, /tmp, and /home. You could migrate the entire
vol1 or any/all of the individual subdirectories under vol1, for example, /export/vol1/work and /export/vol1/home might be
the only necessary directories to move.
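The per-subdirectory split can be scripted by emitting one isi_vol_copy stream per subdirectory instead of one large transfer of the whole volume. In this sketch the commands are printed rather than executed, and the filer name, credentials, and subdirectory list are illustrative assumptions.

```shell
#!/bin/sh
# Emit one isi_vol_copy command per subdirectory (printed, not executed).
set -e
FILER=filer1
VOL=/export/vol1
OUT=/tmp/ivc_cmds.txt

: > "$OUT"
for sub in work home; do    # only the subdirectories that need to move
    echo "isi_vol_copy $FILER:$VOL/$sub -sa ndmpuser:password /ifs/data/$FILER$VOL/$sub -full" >> "$OUT"
done
cat "$OUT"
```

Each generated command can then be launched as its own stream, keeping every transfer under the recommended file-count limit.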

Isilon-based migrations from VNX—isi_vol_copy_vnx


Overview:

isi_vol_copy_vnx is a native Isilon OneFS tool that supports data migration from VNX through NDMP. The tool allows the cluster to mimic the behavior of a backup target and allows the data to be copied directly from the source system to the Isilon cluster, preserving permissions and ownership.

Usage:

isi_vol_copy_vnx <src_filer>:<src_dir> [options] <dest_dir> [ -full | -incr ]

[-sa user: | user:password]

[-sport ndmp_src_port]

[-dport ndmp_data_port]

[-dhost dest_ip_addr]

[-h]

Features:
 Utilizes native NDMP functionality and connectivity
 Supports full and incremental backup levels
 Migrates data and all security and attribute information
 Will restore the set of permissions and ACLs that existed on the source data
 Will migrate NFS and SMB source data
 Does not impact or interact with client data access
 Provides a dedicated data transfer pipe between the source and cluster

Limitations:
 Source filers have a limit on the number of NDMP threads and simultaneous backup jobs; you should therefore
avoid overrunning the source filer.
 It can be limited by source filer network bandwidth.

Check with Dell EMC Isilon Support for the latest compatibility with tools, DART codes, and OneFS.

Migration preparation
After you finish planning the migration and selecting the tools you will use, you can prepare the source and target systems
for the migration.

Infrastructure and environment setup


Network connectivity

Because all the data in the migration will traverse the network, you should optimize the network infrastructure and
connectivity between the source system(s) and the target Isilon cluster.

Common recommendations include the following:


 Maximize network bandwidth; 10 Gb/s is preferred to 1 Gb/s, optimized end to end, with a Maximum Transmission
Unit (MTU) of 9000 bytes
 Limit hops and latency between the source and target storage systems
 Isolate migration traffic so that it does not compete with client access
 Limit potential network bottlenecks that can occur with routers, firewalls, Intrusion Detection System (IDS), and
shared network infrastructure

Best practice—Optimize the network for the migration traffic

Optimize the migration network path; try to limit other production traffic on this network and limit the network devices the traffic traverses (firewalls, IDS, and so on). Ideally, try to create a dedicated private migration network that can be optimized for only the migration traffic.

Why: Separating the migration traffic from other network traffic will allow for maximum throughput and reduce potential impact to existing production traffic by limiting network saturation.

Migration account

In order for the migration data to be copied from the source to the target system, the tool accessing the data must be able
to access all of the source and target data.

Commonly used migration accounts:


 root
 User accounts created explicitly for the execution of isi_vol_copy, for example, ndmp

The account used to connect to the source and target storage systems will depend on the security model implemented in
the environment.

Best practice—Use a specific migration account to execute migration tasks

Use a specific migration account, or an account with group membership that has the required access to all source and target data, such as root.

Why: Using a dedicated account will allow for oversight and management of the migration data access. It will also allow migration tasks and users to be separated from other production accounts.

Source host preparation


The source data storage system should be prepared and optimized for the migration.

Best practice

Restrict access to the source cluster exports to prevent users from changing data on the source cluster instead of on the migration target cluster. Change exports to read-only status once the migration and incremental copies are completed to prevent clients from writing to them.

Why: This allows you to separate migration events from normal production access. Post-cutover, the same process denies normal users read/write access to the source cluster, preventing updates to the data during cutover and post-migration while still allowing administrative access.
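On a Linux-based source, the read-only flip can be as simple as editing the exports file and re-exporting. The sketch below operates on a sample file under /tmp; on a real server the file would be /etc/exports and the change would be applied with exportfs -ra.

```shell
#!/bin/sh
# Flip an export from rw to ro after the final incremental completes.
set -e
mkdir -p /tmp/exports_demo
cat > /tmp/exports_demo/exports <<'EOF'
/vol/sourcedir 10.10.10.0/24(rw,no_root_squash)
EOF

sed -i 's/(rw,/(ro,/' /tmp/exports_demo/exports   # clients can read, not write
cat /tmp/exports_demo/exports
# On the real source server: exportfs -ra   (re-read the modified exports file)
```

Other NAS platforms expose the same rw-to-ro change through their own export or share administration commands.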

Migration host preparation—Source and target access


The migration host should be prepared and optimized for running the migration copies:
 Limit workload and access to optimize throughput
 Restrict access and reduce service issues with the host
 Prepare all migration jobs as scripts
 Test and validate network throughput

Best practice—Watch out for root_squash

On the source cluster, exports sometimes restrict access by using root_squash to prevent root users from connecting
remotely and having root privileges. But this access is needed for migrating data. Instead, use the option
“no_root_squash” to turn off root squashing.

Why: Root access (or its equivalent) is needed to migrate all files and directories.

We can also set root squashing on the Isilon NFS exports as shown in Figure 5.

Figure 5. Setting root squashing on Isilon NFS exports

In Figure 5, “User/Group Mappings: Use default: Map root users to nobody” has been set for the export /ifs/data/work. This
should be disabled to allow the root full access to the file system. Note that this can be set on an export-by-export basis. In
addition, root access to a specific client, 192.168.43.200, has been restricted in the figure example. Typically, this would be
the host performing the migration.

Isilon cluster configuration preparation


All primary setup and configuration of the Isilon cluster should be completed before data migration begins. The
configuration includes, but is not limited to, the following:
 Authentication provider integration—ensures that all authentication providers are online and fully operational.
 If local users, all UIDs and GIDs are created and tested.
 Access zone and role-based access control (RBAC) setup—complete any zone and RBAC setup.
 Exports for clients are created and tested.
 Networking design and setup—complete the setup and implementation of the network configuration.
 SmartPools—complete the implementation of any SmartPools policies to limit post-migration work.

 Dell EMC Isilon SyncIQ®—prepare any existing SyncIQ policies to operate alongside any data migration events.
 SnapshotIQ—prepare any SnapshotIQ policies to operate alongside any data migration events.
 SmartLock—execute all preliminary SmartLock work prior to migration.
 SmartQuotas—disable SmartQuotas until the migration is completed.

Dell EMC Isilon recommends that you use a dedicated Isilon migration directory export to execute all migrations against.
Using a dedicated administrative migration export with the appropriate access and configuration can facilitate the migration
without impacting workflows or data permissions. Normal user clients will not mount this export; it will be mounted only by
migration hosts. Data can then be moved into place after the migration is completed with minimal disruption.

Best practice—Implement a logical NFS export path structure methodology

Be aware of export rules and how they interact with each other:
Path—should be unique, though nesting is possible if used with caution
Permission options—restricted by user ID mapping and IP addresses
Security—netgroups, authentication system, and Kerberos

The order of evaluation is path, client ACLs, then security types (unix, krb5).

For example, if you have exports:

/ifs/data --client=8.8.8.0/24
/ifs/data/something --client=10.10.10.0/24

and your client IP is 10.10.10.10, you would not have access to /ifs/data/something because the export, /ifs/data, has a
different IP restriction. The client must be able to traverse the path if the exports are nested. Access for 10.10.10.0/24
would need to be added to the /ifs/data export:

/ifs/data --client=8.8.8.0/24 --client=10.10.10.0/24


/ifs/data/something --client=10.10.10.0/24

Why: Complex and restrictive rules may prevent clients from connecting to exports that are nested. Clients may
encounter problems mounting a directory that is nested from a different export. If the parent directory has more restrictive
permissions, a client may not be able to mount a child export of that directory.

Best practice—Create the NFS exports

Create the new Isilon NFS exports prior to data migration.

Why: This will allow the creation and setup of the exports and export permissions prior to data migration and cutover for
initial testing and access validation/UAT.

Best practice—Do not use default /ifs export

The default /ifs export allows clients to mount any subdirectory and gives open access to the whole file system.

Why: It should be used for quick initial setup only, as it is a potential security issue.

Best practice—Create the correct NFS export permissions

Set up the correct export permissions on the newly created user exports.

Why: Setting the correct export permissions will allow you to test and validate workflows when test migrations are
undertaken and maintain security.

Note: The migration methodology may include adding explicit "deny" permission settings on users or directories, so that they cannot write data to these exports until the cutover has been executed, as well as restricting specific IP addresses to prevent clients from accessing the exports.

Additional Isilon cluster considerations


The following are some additional Isilon cluster considerations that may need to be addressed prior to and during a data
migration.

NFS group membership limitation


The NFS standard by default does not support membership in more than 16 groups per individual user. This limitation can be addressed on a Dell EMC Isilon cluster by enabling "Map Lookup UID" under NFS Settings -> NFS Export Settings -> Export Behavior Settings -> Map Lookup UID, as shown in Figure 6.

Figure 6. Increasing the number of group memberships in an Isilon cluster

Additional information can be found in the Dell EMC document “NFS supplemental groups limited to 15 in OneFS 6.5.4 and
earlier” (Article Number:000089550).

Production or preproduction cluster


An important consideration when planning and executing a data migration is the current status of the Isilon cluster. Is the
cluster in production or will the migration mark the initial cutover to active production traffic? The primary goal should be to
lessen any impact on a production cluster during migration activities, so appropriate steps should be taken to address
these concerns.

Common factors to be aware of while migrating to a cluster are as follows:


 Administratively destructive actions
 The saturation of network links
 Cluster load and ingest, and the impact they have on production workflows

Access zones and role-based access control


If the cluster uses an Isilon access zone or RBAC, the migration methodology may need to be adjusted to accommodate
this configuration. Note that OneFS 7.0 allows NFS exports only in the default System Zone and in no other zones.

NFS RPC threads
By default, the number of NFS RPC threads is set to 16 per node. This number can be increased for specific workflows.
Consult Dell EMC Isilon Support for assistance.

Isilon OneFS SmartConnect or direct node connections


The current status of the cluster may dictate that you should try to optimize and segregate migration traffic within the cluster's network configuration. You can do this by:
 Using SmartConnect to autobalance traffic
 Separating migration traffic from existing production traffic by using a direct node or separate SmartConnect zone
connection for migration traffic

If you use SmartConnect, you should validate and optimize the configuration before you transfer data across the network.

Isilon OneFS SyncIQ considerations


If the data to be migrated will be replicated to a secondary cluster through a SyncIQ policy, additional planning should be
undertaken to address the impact of the data migration and its interaction with active SyncIQ policies. With this scenario,
you should:
 Pause active SyncIQ policies if they include migration paths
 Schedule SyncIQ jobs to run outside of data copy windows
 Utilize SmartConnect zones for copying and SyncIQ replication

Isilon OneFS SmartPools

Any SmartPools data policies should be in place prior to data migration, or additional cluster overhead may be required to
move data within the cluster post-migration.

Isilon OneFS SnapshotIQ

Any active SnapshotIQ policies should be analyzed for any impact during the data migration.

Antivirus integration

You should review and disable any active antivirus scanning policies that may be running against the target data.

Best practice—Disable antivirus scanning

Disable active antivirus scanning on migrated data during the initial full and incremental copies.

Why: The large influx of data associated with the migration can place an excessive load on
the antivirus scanning architecture and create a slowdown and potential bottleneck for
inbound data.

Isilon guidelines for large workloads


Be careful not to exceed the maximum configuration values listed in Table 2.

NFS max. read size
  Tested or default max. value: 128 KB
  Theoretical or practical value: 1 MB
  Comments: This applies to both NFS3 and NFS4. Prior to OneFS 7.0, the maximum read size was 128 KB.

NFS max. write size
  Tested or default max. value: 512 KB
  Theoretical or practical value: 1 MB
  Comments: This applies to both NFS3 and NFS4. Prior to OneFS 7.0, the maximum write size was 512 KB.

NFS3 connections (per node)
  Tested or default max. value: 1,000
  Theoretical or practical value: N/A
  Comments: The number of TCP sockets available on the node is typically what limits NFS connections. Unlike the Isilon SMB server, nfsd uses file handles instead of file descriptors to represent files and internally handles work items much differently. 1,000 connections is a very conservative tested limit and represents 1,000 mounts over 20 exports. NFS connection testing is an ongoing test effort; no maximum connection limit for NFSv3 has been established at this time.

NFS3 exports (per cluster)
  Tested or default max. value: 750
  Theoretical or practical value: 2,000
  Comments: Beyond 2,000 exports, manageability becomes a problem. Cluster size does not matter.

nfsd threads (per node)
  Tested or default max. value: 16
  Theoretical or practical value: 16
  Comments: This is a kernel limit, which is exposed via the sysctl command. The value should not be changed without first consulting Dell EMC Isilon Support. This limit represents the maximum number of simultaneous work items the server can service; work items beyond this number are queued and serviced when resources become available.

File name length
  Tested or default max. value: 255 B
  Theoretical or practical value: 255 B
  Comments: Most Unicode character encodings (like UTF-8, which is the OneFS default) allow for multiple bytes per character (UTF-8 allows for up to 4 B/character). So the 255 B in the filename could represent 255 characters, 63 characters, or some number of characters in between.

Path length
  Tested or default max. value: 1024 B
  Theoretical or practical value: 1024 B
  Comments: This is the maximum absolute path (for example, /ifs/data/foo/bar/baz/) length that can be passed into a syscall, not the maximum depth of a directory in the file system (see "Directory depth").

Directory depth
  Tested or default max. value: 8470
  Theoretical or practical value: Unlimited*
  Comments: *No specific hard limit is in place, but several other limits could come into play (inode limits, metadata storage limits, and so on). In tests, command line (shell) utilities begin to experience problems at a depth of 8470 (EBADF from many commands). At a depth of about 30000, internal utilities (for example, Job Engine TreeDelete) also begin to fail. For utilities that make calls with absolute paths (for example, cd /1/2/3/…), depth will be limited by "Path length" as described above; this applies to path-based OneFS commands like "isi snapshot" and "isi quota". For utilities that access relative paths (for example, cd 1, cd 2, cd 3…), these higher limits may apply, although the value of extraordinarily deep directories is questionable.

File size
  Tested or default max. value: 4 TB
  Theoretical or practical value: 4 TB
  Comments: This is the hardcoded OneFS limit. Note that Job Engine performance can be impacted on files larger than 1 TB due to inefficient per-file threading.

Table 2. Isilon guidelines for large workloads

Migration approach—Testing and proof of concept


Once you have developed the migration approach, selected the toolset, and prepared the infrastructure for the data
migration, you can proceed with your initial testing of the methodology. The goal of the testing is to validate the outcome—
is data migrated, are the permissions moved, and are the timestamps moved? The testing phase also allows you to tune
and modify the migration approach to optimize all parts of the process.

The recommended testing approach is as follows:


 Run the full copy—benchmark and monitor
 Review and validate—potentially look at tuning or tweaking the methodology and re-run the full copy
 Run the incremental copy—benchmark and monitor
 Review and validate—potentially look at tuning or tweaking the methodology and re-run the
incremental copy
 Continue to run the incremental copy—and continue to monitor it

Dell EMC Isilon recommends that you test different copy methodologies to tune and optimize the throughput while meeting
your migration requirements.

Best practice—Execute multiple test migrations to validate the methodology

Dell EMC Isilon recommends that you execute multiple migration tests on smaller subsets of different data.

Why: Because different data will tend to have different properties and access profiles, it is important to test all data types
and how the migration methodology may need to be modified for different datasets.

Critical areas to evaluate and monitor during data migration testing are the following:
 Network performance—throughput, saturation, and impact
 The time to execute a full data copy—will allow for refinement of project plans
 The time needed to execute an incremental copy a set number of days after the data changes occur—will help define cutover windows
 Cluster load, source load, and host load—will help tune and refine the migration methodology

Best practice—Test all phases of the migration methodology

Execute all steps in the migration methodology to identify the time involved and to verify that the proposed methodology
fulfills all the migration requirements.

Why: It is important to identify issues with the methodology before executing production migrations and cutovers.

Data validation
After you migrate the data, you must validate the data and the file attributes. You must verify that:
 File data copied correctly—data is intact and integrity is maintained
 File security, ownership, and attributes migrated correctly
 File timestamps are correct

Next, review the access control entries on a file by running ls -led followed by a file name, as shown in Figure 7.

Figure 7. Reviewing access control entries on a file

You should also validate the data. Common methods include the following:
 File size compares
 Checksum/file hash compares—MD5 checksums
 Tools—md5, sum, and cksum
 Audit and review directory structures
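The checksum comparison can be scripted as a manifest built on the source and verified against the target. In this sketch, two local directories stand in for the source export and the Isilon target, and GNU md5sum is assumed to be available.

```shell
#!/bin/sh
# Build an MD5 manifest on the "source" tree and verify it on the "target".
set -e
SRC=/tmp/val_demo/src; DST=/tmp/val_demo/dst
rm -rf /tmp/val_demo
mkdir -p "$SRC" "$DST"
echo "record 1" > "$SRC/a.dat"
cp "$SRC/a.dat" "$DST/a.dat"       # stands in for a completed migration copy

# Manifest of relative paths and checksums, taken from the source side.
( cd "$SRC" && find . -type f -exec md5sum {} + ) > /tmp/val_demo/manifest.md5

# Verification on the target side; a non-zero exit means missing or altered data.
( cd "$DST" && md5sum -c /tmp/val_demo/manifest.md5 )
```

For large datasets, verify a sampled subset per directory rather than every file, and record the manifest with the migration documentation.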

Once you have reviewed the data attributes directly, it is critical that you validate that the data works in client workflows.

Performance
A migration often moves a large amount of data. You must ensure that the migration methodology, toolset, and
environment are optimized for performance and throughput to work within the migration timeline. The common areas to
focus on when evaluating performance are as follows:
 Identify bottlenecks—attempt to identify the worst-performing component:
 Disable antivirus scanning processes on target and/or source file systems during initial migration copies to minimize CPU impact on client access and its potential to extend copy times.
 WAN bandwidth—physical (circuit) limitations and concurrency impact other systems that replicate data (SAN, backup, and so on) over a shared link. This could affect replication performance for SyncIQ jobs that need to run to completion before certain cutovers can be conducted.
 Review the timing of the execution—how the time of day and day of week of test executions compared with performance
 Collect metrics on the data copies, network throughput, and the source, host, and target systems—evaluate the copy as a whole

Best practice—Time the incremental copies

Benchmark the incremental copies by timing how long they take to execute so that you can plan and orchestrate the
cutover phases appropriately.

Why: Knowing how long an incremental copy will take will likely help you determine the length of time required to execute
a cutover and will help you determine the data outage window that may occur.
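A simple way to benchmark each pass is to wrap the copy command with timestamps and append the duration to a log. The copy itself is stubbed with sleep in this sketch; substitute the real rsync or isi_vol_copy invocation, and treat the log path as an illustrative choice.

```shell
#!/bin/sh
# Record how long each incremental copy takes, for cutover-window planning.
set -e
LOG=/tmp/incr_times.csv
start=$(date +%s)
sleep 1                            # stand-in for the incremental copy command
end=$(date +%s)
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ),$((end - start))s" >> "$LOG"
tail -n 1 "$LOG"
```

After several passes, the log shows whether incremental durations are trending down toward a value that fits the planned outage window.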

User acceptance testing—Data and workflow testing


The final step of migration data testing is the UAT in which the data is tested for integrity with existing workflows. Dell EMC
Isilon recommends that you use test workflows because this data should be considered test data only at this time and may
be removed by later migration steps.

Best practice—Check workflows with test migrated data

Review all workflows on test migrated data.

Why: It is critical that you validate that newly migrated data can be integrated into workflows (that is, user home directory or group share access, and so on) at cutover time without issues. By testing the workflows, you can ensure that cutovers will occur without incident.

Start of migration execution
After you complete all the testing and validation, you can begin to move into the production migration phases. All the
information obtained from testing and tuning should be used to modify and optimize the overall methodology so that the
production migrations are as clean and quick as possible.

The migration execution phases are as follows:


 Execute the initial full copy
 Execute incremental copies to keep the new storage up to date and as close to production as
possible
 Based on performance, you can execute multiple migrations simultaneously if they are supported and
if the network infrastructure can support the additional load

Multiple hosts (if using a host-based migration) can replicate data in parallel; be aware of load placed on the source cluster.
To reduce contention, coordinate hosts so that they are not all writing to the same subdirectory. Spread hosts across the
Isilon cluster to maximize network bandwidth and processing power.

Alternative migration methodology: Mount the source NFS exports directly on the individual Isilon nodes and run the rsync commands directly on the cluster. The benefit of this method is that if you have multiple nodes, your transfer bandwidth scales, and you remove the "middle man" from the process. Data moves directly from the source filer to the Isilon cluster, without having to go to a host first and then back out to the Isilon cluster. As a result, latency is greatly reduced and migration times will drop.
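The node-direct approach can be scripted by assigning each node its own chunk of the source tree. The sketch below only prints the per-node commands; the node IPs, export path, and chunk layout are all illustrative assumptions, not values from this migration.

```shell
#!/bin/sh
# Generate one mount-and-rsync command per Isilon node (printed, not executed).
set -e
SRC_EXPORT=sourcefiler:/export/vol1
OUT=/tmp/node_rsync_cmds.txt

: > "$OUT"
i=0
for node_ip in 192.168.43.11 192.168.43.12 192.168.43.13; do
    i=$((i + 1))
    echo "ssh root@$node_ip 'mkdir -p /mnt/src; mount -t nfs $SRC_EXPORT /mnt/src; rsync -a /mnt/src/chunk$i/ /ifs/data/vol1/chunk$i/'" >> "$OUT"
done
cat "$OUT"
```

Reviewing the generated command list before execution makes it easy to confirm that no two nodes are assigned the same chunk.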

Best practice—Continue to run incremental copies

Continue to run incremental copies, even if the cutovers are not scheduled.

Why: This will keep the source and target data more in sync and require less data transfer during the final pre-cutover
copy.

Depending on the size of the data migrations, the initial full copies may take a while to execute. During this time, you can
prepare for the final cutover events.

Pre-cutover preparation
After you start to migrate data, you can begin to prepare your cutover events.

Best practice—Create a detailed migration plan

Create a detailed migration plan with all the specific steps and timing of the migration execution.

Why: This document will dictate the commands and work that are being executed. The plan controls the entire migration
from start to finish. All roles, tasks, and responsibilities are defined.

The detailed migration plan dictates how the migration is executed.

Best practice—Create a cutover document

Create a cutover document that defines the high-level cutover tasks, responsibilities, and timings. The document should
outline the phases and sequence in which tasks are executed.

Why: This document will outline the sequence of events that need to occur during a cutover. It can be used to track and
monitor the progress of the cutover.

Best practice—Create a schedule and define outage windows

Have a well-defined cutover schedule and outage window.

Why: The schedule helps execute the migration cutover. The outage window can be scheduled when you have clearly determined that access to storage will be unavailable and that you can make storage system changes without impact to clients.

Best practice—Create a communication plan

Have a communication plan.

Why: This communication plan will clearly outline the protocols needed to keep all users up to date on the status of a
migration and enable storage administrators to stay focused on the execution of the cutover and not be distracted by
information requests from end users.

Best practice—Prepare the DNS name resolution infrastructure for cutover

Lower the DNS time to live (TTL).

Why: This will facilitate the cutover of clients using DNS name resolution by reducing the time between authoritative
updates to DNS.
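In BIND-style zone-file terms, lowering the TTL ahead of cutover looks like the fragment below; the record name and addresses are illustrative placeholders.

```
; Before cutover: drop the TTL (in seconds) so clients re-resolve quickly.
nas.example.com.    300    IN  A   192.168.43.10    ; old filer, TTL now 5 minutes
; At cutover, repoint the record at the Isilon SmartConnect service IP:
; nas.example.com.  300    IN  A   192.168.43.100
```

Once the cutover is stable, raise the TTL back to its normal value to reduce resolver load.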

Additional pre-cutover preparation steps often include the following:


 Prepare the DFS namespace, if applicable
 Create CNAMEs in DNS
 Update scripts used by clients for storage connections
 Prepare clients and applications

Cutover event
Once you have migrated the data and prepared the environment for the cutover, the actual final migration event can occur.

In general, the high-level cutover sequence resembles the following steps:


 Initiate the migration cutover window—communicate the event
 Restrict access or make source data read only—prevent new writes to the old
data source
 Execute a final incremental—copy all final data to the new storage system
 Validate final incremental—validate that the source data is ready for the cutover
 Execute final testing—the final cutover testing is completed
 Make a go or no-go call on a full cutover—decide if the migration should continue or roll back
 Update connection and name resolution protocols; DNS, CNAMEs, DFS, and scripts
 Enable new storage to read/write—enable writes to the new storage system
 Continue testing and user acceptance—continue to test and monitor as production traffic moves over
to the new storage system
 Execute the redirection of client to the new storage system—initiate the client redirection process
 Monitor—assess the cutover, new storage system, and clients

When executing a cutover event, the best practices that follow are recommended.

Best practice—Follow the cutover schedule

Follow a cutover schedule.

Why: By following a well-defined schedule, you can monitor and control the migration. Dell EMC Isilon recommends that
you execute cutovers during off hours or when the number of active connections is low.

Best practice—Test the migrated cutover data

Prepare a set of data and workflow tests to execute against the migrated data. Have well-defined production use cases, data tests, and test users available to conduct post-cutover testing and review.

Why: Well-defined use cases and test users who can validate the migrated data will help you decide whether to continue with the cutover.

Best practice—Monitor clients and applications during migrations

Monitor client and application connections to the new storage system during the cutover.

Why: This will verify that your cutover methodology is working as defined and that clients are moving and connecting to the
new storage system successfully.
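One low-tech way to watch clients arrive is to count established NFS connections (TCP port 2049) on the new system. The helper below parses `ss -tn`-style output piped into it; the field layout and IPv4 addressing are assumptions based on common Linux `ss` output:

```shell
#!/bin/sh
# Count unique client IPs with established connections to NFS (port 2049).
# Usage: ss -tn | count_nfs_clients
count_nfs_clients() {
  awk '$1 == "ESTAB" && $4 ~ /:2049$/ {   # local address ends in :2049
         split($5, a, ":")                # $5 is the peer (client) address
         if (!(a[1] in seen)) { seen[a[1]] = 1; n++ }
       }
       END { print n + 0 }'
}
```

Running this periodically during the cutover shows whether the client count on the new system is climbing as expected.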

Best practice—Develop a client connection remediation plan

Have a client remediation plan in place that is ready to execute against any clients that exhibit issues connecting to the new data targets.

Why: A well-defined strategy for handling client connection issues—including a dedicated support line, email address, or IT desk—lets you resolve problems quickly without derailing the cutover.

The go or no-go decision


During the migration cutover window, a critical point will be reached. This threshold determines whether you will continue
with the cutover or abort the cutover and roll back to the existing storage system.

Common abort cutover situations include the following:


 Final incremental does not complete in the outage window
 Cutover methodology fails; clients are not connecting correctly
 Security issues with the new storage system
 Workflow issues post-cutover
 Load and availability problems
 Other unknown issues

Best practice—Clearly define your cutover criteria

It is critical that you have a series of cutover criteria that clearly defines when a migration will continue or be aborted and
rolled back.

Why: Clear criteria remove uncertainty, aid decision-making, and dictate the best action to take.

Once you begin to write data to the new storage system, reverting to the old system becomes much more complicated
because you now need to reconcile data with the original storage system.

Rollback
If a decision to abort the cutover is made, there should be a well-defined rollback plan in place that was developed and
tested ahead of time so that you can restore data access as quickly as possible.

Rollback plan:
 Prevent any new writes to the new storage system
 Move client connections back to the old storage system
 Enable writes to the old storage system

Best practice—Develop a rollback plan

Have a clearly defined rollback strategy that is easy to implement and can restore user access to data quickly and cleanly. Also, make sure the plan is tested.

Why: A rollback plan will help you restore client data access quickly in the event that a migration cutover event fails.

If any data has already been written to the new storage system and a rollback is executed, then steps to remediate this
data must be taken to restore the new data back to the original storage system.

Common strategies for reconciling data during a rollback are as follows:
 Manually reconcile the data—identify and manually move any data from the new storage system to
the old one.
 Perform a reverse incremental—have migration type jobs to run in a reverse direction to update the
old storage system.
 Discard the data—consider the data noncritical and decide not to reconcile it.
 Rewrite the data from the client or application to the old storage system—allow applications and
clients to rewrite the data.
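A reverse incremental is essentially the original copy job with source and target swapped. The sketch below uses local temporary directories as stand-ins for the new Isilon path and the old filer path; in a real rollback the target would be the remote source system:

```shell
#!/bin/sh
# Stand-in demo: "new" plays the Isilon target, "old" plays the source filer.
new=$(mktemp -d); old=$(mktemp -d)
echo "written after cutover" > "$new/orders.csv"

# Reverse incremental: copy post-cutover data from new back to old.
# -a preserves permissions, ownership, and timestamps, as in the forward jobs.
rsync -a "$new/" "$old/"

ls "$old"            # orders.csv now exists on the "old" system
rm -rf "$new" "$old"
```

Because only differences are transferred, re-running this job is cheap; the expensive part of a rollback is usually deciding which post-cutover data must flow back at all.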

The goal of any rollback strategy is to limit the impact on end users and restore data access as seamlessly as possible. It is
for this reason that your migration cutover criteria should be well defined and that the rollback strategy should have been
well tested in the event that you need to implement it.

Migration event completion


After you successfully complete a cutover, you should continue to monitor the new storage system.

Best practice—Monitor the new storage post-cutover

Continue to monitor the cutover storage after the cutover event for any issues that may result from the cutover.

Why: Production load and workflow may be unpredictable; closely monitor the new storage system to rectify any post-
migration issues.

Steady state
Repetition

Most data migrations will consist of multiple cutover events. Once you have developed a well-structured migration
methodology, these additional cutover events should be run with the same plan and strategy.

Lessons learned

After you complete a migration, assess the successes and failures of the methodology. If an additional migration
needs to be performed, the lessons you've learned from the migration process will enable you to refine it.

Ask yourself the following questions:


 What worked during the migration and cutover?
 What did not work during the migration and cutover?
 Can the migration methodology be modified or optimized?

Conclusion
The goal of this paper is to supply you with solid guidance for conducting an NFS single-protocol file migration from a NAS system to an Isilon cluster. The guidance is based on a comprehensive set of industry knowledge and best practices covering the technical aspects and process of data migration. As stated at the beginning of this document, this paper does not aim to be an exhaustive, authoritative source on the subject of NFS single-protocol migrations, but rather a comprehensive reference that covers the key areas that will help ensure your success.
Dell EMC can provide comprehensive services, including migration services, Isilon training and education, and residency
services, to reduce risk and maximize system uptime and service levels during and after a data and system migration.

Appendix: Sample migration use case
This appendix provides a sample high-level overview of how to collect information for and plan a migration of NFS source
directories from a single NFS server. Keep in mind that the information in this section provides only a skeleton of some of the
information that you would want to collect, as well as an overview of the strategy that you will want to define for your migration.
The use case that follows answers the following question at a high level: what are the recommendations, best practices, and supported Isilon configurations for migrating all directories and data?

Tables A-1 through A-3 describe a sample migration from an NFS filer storage system to an Isilon cluster.

Source configuration and data:

 Single source system: NFS filer, 4 x 1 Gb Ethernet ports
 Total data: 25 TB
 Max. file size: 4 GB
 Min. file size: 0 B
 Avg. file size: 256 KB
 File count: 8,000,000
 8 top-level exports: acct, engineering, home, production, RandD, scratch, temp, work
 User home directories: each user has a single home directory under a higher-level share, /exports/home

Sample directory structure:

/exports
    /acct
    /engineering
    /home
    /production
    /RandD
    /scratch
    /temp
    /work

Table A-1. Source data

Exports and permissions

drwxr-xr-x  10 root wheel  512 Aug 27 14:50 .
drwxr-xr-x  21 root wheel  512 Aug 27 14:50 ..
drwxrwxr-x   2 root rd     512 Aug 27 14:50 RandD
drwxrwxr-x   2 root acct   512 Aug 27 14:50 acct
drwxrwxr-x   2 bob  eng    512 Aug 27 14:50 engineering
drwxrwxr-x   2 root users  512 Aug 27 14:50 home
drwxrwxr-x   2 prod prod   512 Aug 27 14:50 production
drwxrwxrwx   2 root wheel  512 Aug 27 14:50 scratch
drwxrwxrwx   2 root wheel  512 Aug 27 14:50 temp
drwxrwxr-x   2 root eng    512 Aug 27 14:50 work

Table A-2. Exports and permissions

Additional source information:

 All LDAP: single domain
 Source system network connectivity: 4 x 1 Gb
 No firewalls, IDS, or QoS
 Monthly full backups
 Antivirus scanning in place
 Same data center as the Isilon cluster
 No deduplication or offline files
 DNS: 2 CNAMEs
 No routing or VLAN restrictions

Additional Isilon information:

 Dell EMC Isilon X200 x 3: ~61 TB
 OneFS 7.0.x
 LDAP authentication
 3 x LACP (2 x 1 Gb/s each)
 SmartQuotas
 SnapshotIQ

Table A-3. Additional source data information

Requirements
 All data and permissions are moved as is with no changes. All existing POSIX permissions and ownership are
retained.
 Eight cutover events; 12-hour window—Saturday 8:00 P.M. through Sunday 8:00 A.M.
 One migration is performed per weekend.
 Each user has a defined quota; quota limits are to be replicated on the Isilon cluster.

Migration project assumptions (including, but not limited to):


 Customer will have approved change controls submitted for any migration activity.
 Migration plan and design will have been reviewed and approved by the customer prior to the start of the
cutovers.
 Any recommended array OS upgrades (and firmware updates) necessary for the migration will be applied before
any migration cutover activity occurs.
 Source NAS and target Isilon systems must be in a known good state prior to conducting the migration.
 The customer will have successfully completed a full system backup and verified its reliability.

Strategy
 For an Isilon-based migration, mount source NFS exports directly on individual Isilon nodes
 Conduct pilot migration; validate methodology, document performance metrics, refine and tune rsync
switches and scripts; test migrated data with clients and users
 Investigate sizes and file counts in each export; this will help determine the order for the migration (that
is, the largest directories will take the longest time and should be started first)
 Validate change rates and time to execute incremental copies
 Complete a full backup of all file systems that are to be migrated before the cutover; verify that the
backup is good
 Develop a detailed project timeline and cutover schedule with the customer
 Develop a backout plan, and review it with the customer
 Execute the migration in phases: execute initial full copies, followed by nightly incremental copies
 Use a DNS update methodology to redirect clients
 During the cutover, make sure that each source file system is changed to a read-only state after the
source directory is successfully replicated to prevent clients from making any changes
 Reduce DNS TTLs in advance of cutover windows
 Develop client communication; the customer should provide dedicated cutover IT support desk/personnel
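The "investigate sizes and file counts" step above can be scripted against a local mount of the source exports. The /import mount point and the export names below come from this appendix; adjust both for your environment:

```shell
#!/bin/sh
# Report size (KB) and file count for each top-level export so the
# largest-first migration order can be chosen from the output.
# /import is assumed to be a local mount of the source filer's exports.
for d in acct engineering home production RandD scratch temp work; do
  size=$(du -sk "/import/$d" 2>/dev/null | cut -f1)
  count=$(find "/import/$d" -type f 2>/dev/null | wc -l)
  printf '%-12s %12s KB %12s files\n' "$d" "${size:-0}" "$count"
done
```

The output also gives a baseline for estimating transfer times and validating post-copy file counts on the Isilon side.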

Source system configuration

If possible, restrict access so that clients cannot modify data during the cutover (change to read only).
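On a Linux-based source, one way to do this is to flip the export options from rw to ro and re-export; the paths and options below are hypothetical, and a dedicated filer would expose the equivalent setting through its own administration interface:

```
# /etc/exports on the source server, edited just before the final incremental:
#   before:  /export/acct  *(rw,sync,no_subtree_check)
#   after:
/export/acct  *(ro,sync,no_subtree_check)
# Then apply without disrupting existing mounts:
#   exportfs -ra
```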

Isilon configuration
 Pre-create NFS shares with identical share permissions
 Disable snapshots on data until the cutover is completed
 Disable antivirus scanning

Toolset selection

Use rsync.

Example:

rsync -a --delete sourcefiler:/exports/acct/ /ifs/data/exports/acct --exclude '.snapshot*'

Migration testing
 Map source and target directories from the migration host
 Replicate a small test set of data
 Use a small set of test users to validate full data access (discard or overwrite data following the test)

Migration

This example directory structure can be broken down into at least eight separate rsync jobs. Several of the jobs can be run
in parallel, assuming the source cluster can tolerate the additional load while the migration is occurring. Monitor the source
cluster and scale the number of parallel jobs up or down accordingly.

If the customer has a preference for the order of directory migration, then plan the transfers accordingly. Otherwise, start
with the largest directories because they will take the most time to replicate.

Example:

Initial migration testing indicated that three simultaneous rsync jobs would be an acceptable additional load on the source
cluster. Run three rsync jobs concurrently so that you do not overload the source cluster. Note that in other scenarios,
additional nodes could run additional rsync jobs in parallel, if the source cluster has enough performance and bandwidth to
accommodate this.

Mount an individual export to a different node and start an rsync:

Isilon node 1:

Mount filer1:/export/acct with a command similar to:
mount filer1:/export/acct /import/acct

(This mounts the remote export on a directory created locally.)

Rsync to /ifs/data/filer1/acct with a command similar to:


rsync -av /import/acct /ifs/data/filer1 --exclude '.snapshot*'

(This copies the files in "archive" mode, which ensures that symbolic links, devices, attributes, permissions, ownership,
and so on are preserved in the transfer. No compression is used, and snapshot directories are excluded from the transfer.)

Repeat with Isilon node 2:

Mount filer1:/export/engineering and rsync to /ifs/data/filer1/engineering.

Do the same with Isilon node 3:

Mount filer1:/export/home and rsync to /ifs/data/filer1/home.

Monitor the source cluster to verify that it is not overloaded with the additional strain of migrating data. The idea is to avoid
impacting clients as the migration progresses.

Repeat this process with the remaining shares until all of the source data is migrated.

Once an initial full copy of the source data has been completed, incremental copies should be run to propagate any
changes that were made once the migration began.
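Because rsync transfers only differences, an incremental pass is simply the same job run again. The local stand-in below shows a second run picking up only the post-copy changes; temporary directories substitute for the filer export and the Isilon path:

```shell
#!/bin/sh
# First run = initial full copy; second run = incremental.
src=$(mktemp -d); dst=$(mktemp -d)
echo "v1" > "$src/report.txt"
rsync -a --delete "$src/" "$dst/"   # initial full copy

echo "v2" > "$src/report.txt"       # change made after the full copy
echo "new" > "$src/extra.txt"
rsync -a --delete "$src/" "$dst/"   # incremental: transfers only the changes

cat "$dst/report.txt" "$dst/extra.txt"
rm -rf "$src" "$dst"
```

The --delete option also removes files from the target that were deleted on the source, keeping the two trees in step between passes; timing these nightly runs gives a realistic estimate of how long the final cutover incremental will take.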

On the day of the cutover, final incremental copies should be run and access to the source cluster should be restricted if
possible to prevent clients from writing data that may not be migrated.

User acceptance testing


 Verify that data and permissions on the Isilon cluster are replicated correctly following the final copies
 Review user access and verify that users have connectivity and correct permissions
 Use a small set of test users to validate that full data access occurs during the migration event to identify any
problems early on
 Verify that a user can read/write to a file, create a new file and directory, and traverse the directory structure
 Monitor performance of the Isilon cluster and client connections as load increases

Cutover plan
 Make source file systems read only
 Execute final incremental copies
 Update DNS
 Verify that clients can connect to Isilon
 Test automated workflows, if possible
 Initiate user logoff and re-logon

Rollback plan
 Reverse DNS update
 Make old source file systems read/write and remove any restrictions
 Remove any connections to the Isilon cluster and stop exports
 Note that any data that was written to the Isilon cluster is considered lost; no reverse or reconciliation process will
be performed.

Exit criteria
 DNS resolves to new target storage and shares
 Confirm that client can successfully read/write data to directories
 Confirm that workflows were successfully completed with no user or permission problems
 Verify that there are no connectivity issues

Post-cutover
 Conduct a customer meeting to review and triage the migration
 Create documentation of the migration protocol
 Conduct a “lessons learned” discussion with both the internal customer and their team
