
HP 3PAR StoreServ 7000 Advanced Training

Day 4

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Remote Copy Overview

Agenda

• What is Remote Copy?


• Terminology
• The use of snapshots
• Sync and periodic operation
• Topologies
• Disaster recovery
• Performance (impact)

These slides are HP Confidential and are designed for restricted HP internal use only.
If you want to share the information with partners or customers, use the official
HP-permitted documents/presentations instead.

What is Remote Copy? (1 of 2)
Remote Copy is the disaster recovery feature for 3PAR products. It has two
modes of operation:

Synchronous (sync):
Data is replicated in real time from primary to secondary. This gives an RPO (Recovery
Point Objective) of zero at the expense of increased service times.

Asynchronous Periodic (async / periodic):
Snapshot based. Data is periodically replicated from primary to secondary. This allows for
low service times, but the RPO is 5 minutes at best.

What is Remote Copy? (2 of 2)
Starting with HP 3PAR StoreServ OS 3.1.2, some Remote Copy topologies support
mixing synchronous and periodic asynchronous replication between a pair of systems
-For the topologies that support mixing synchronous and periodic asynchronous modes, you cannot mix
the modes on a shared pair of RC links; each mode must reside on its own pair of links

-Each mode requires its own unique target definition

Mixed Sync/Periodic mode requirements:

-Two targets between the same system pair: one for synchronous mode and one for periodic mode.
-Each target has its own links. Two links are required for each target, so 4 links are
required between the same node pair (4 RCIP links, or 2 RCIP and 2 RCFC links)
-Synchronous Remote Copy groups are created using the synchronous-mode target
-Periodic groups are created using the periodic-mode target

Remote Copy Advantages

• We use zero detect on replication links to minimise traffic.
• Thin aware, so only real data is replicated.
• A special patented (7539790) synchronous replication protocol is used. It
transfers data with a single round trip between StoreServs, using
dummy reads from the secondary system.
− Normal SCSI = Write-Ack-Data-Ack
− 3PAR Replication = Read-Data
• You will see READS between sites (not writes, as you might expect) and high service times (e.g. 60 seconds). This is
normal, as we are PULLING data from the remote site.
• Async periodic is snapshot based and minimises traffic by not transferring every
write.
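As a rough sketch (not an HP tool), the benefit of the single-round-trip protocol can be expressed as replication overhead ≈ protocol round trips × link round-trip time. The 2.6 ms figure used below is the synchronous latency limit quoted later in this deck:

```python
# Back-of-envelope sketch (not HP tooling): synchronous replication adds
# roughly (protocol round trips x link RTT) to each host write.

def replication_overhead_ms(round_trips: int, rtt_ms: float) -> float:
    """Approximate added service time from the replication protocol."""
    return round_trips * rtt_ms

# Normal SCSI write sequence (Write-Ack-Data-Ack): two round trips.
print(replication_overhead_ms(2, 2.6))  # 5.2
# 3PAR protocol (Read-Data): a single round trip.
print(replication_overhead_ms(1, 2.6))  # 2.6
```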

Remote Copy Terminology ( 1 of 2)

• Volumes on the primary (source) are added to a Remote Copy group.


• The group is the basic configuration unit of Remote Copy
• The data in volumes in a group are time consistent

• The group has a target associated with it


• The target is either an FC (Fibre Channel) or IP connected remote system
• A target has links associated with it. The links are used to communicate with the remote system
• Targets must be configured on all systems using Remote Copy

• The target of a Remote Copy group is known as the secondary (Backup).


• Each group can either be in periodic or synchronous mode. All the volumes in
the group operate in the same mode.

• Data replication starts when a group is started

Remote Copy Terminology ( 2 of 2)

• Direction - Natural or Current


− Natural direction is Primary to Secondary (Backup)

• Role - Primary, Secondary, Primary-Rev, Secondary-Rev


− Primary(-rev) is writeable
− Secondary(-rev) is read only
− Rev means reversed

• Group Status – Started or Stopped


− Started means data is currently being replicated

RC Transport Layers

Remote Copy over IP (RCIP)


− Storage systems that are connected to each other over an IP network and use native Remote Copy IP ports
to replicate data to each other. F, T, 7000 and 10000 storage systems use embedded GbE ports for RCIP.
One per node.
− The IP address must be on a different subnet from the management network.
− MTU 1500 or 9000.
Remote Copy over Fibre Channel (RCFC)
− Storage systems that are connected to each other over a Fibre Channel Storage Area Network (FCSAN) and
use Fibre Channel ports to replicate data to each other. One RCFC link is supported per node.
Remote Copy over FCIP (Fibre Channel over IP)
− FCIP as a transport layer enables Fibre Channel frames to be sent over an IP network. Storage systems use
FC ports, the same ones they use for RCFC, and the SAN they use is extended across an IP network by
using FC-to-IP routing with external FCIP gateways.

Remote Copy Maximum Latencies

Remote Copy Type             Max Supported Latency
Synchronous FC               2.6 ms round trip *
Synchronous IP               2.6 ms round trip *
Asynchronous Periodic FC     2.6 ms round trip *
Asynchronous Periodic IP     150 ms round trip
Asynchronous Periodic FCIP   120 ms round trip

*
• Optical fibre networks typically have a delay of ~5 µs/km (0.005 ms/km)
• Thus 2.6 ms allows fibre link distances of up to 260 km:
  2 x 260 km = 520 km; 520 km x 0.005 ms/km = 2.6 ms
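A quick way to sanity-check a proposed site separation against these limits (a sketch using the ~0.005 ms/km one-way delay figure above; not an HP tool):

```python
# Sketch: fibre propagation delay vs. the synchronous RC latency budget.

ONE_WAY_MS_PER_KM = 0.005  # ~5 us/km in optical fibre, per the slide

def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation delay for a fibre link of the given length."""
    return 2 * distance_km * ONE_WAY_MS_PER_KM

def max_sync_distance_km(budget_ms: float = 2.6) -> float:
    """Largest link distance whose round trip fits in the latency budget."""
    return budget_ms / (2 * ONE_WAY_MS_PER_KM)

print(round_trip_ms(260))        # ~2.6 ms at 260 km
print(max_sync_distance_km())    # ~260 km for the 2.6 ms sync budget
```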

RC Directions

Unidirectional
− A given storage system pair is said to be under unidirectional replication if replication
occurs only in one direction (Primary → Secondary).

Bidirectional
− A given storage system pair is said to be under bi-directional replication if replication
is occurring in both directions.
Remote Copy Links
Sending Link: created manually during the Remote Copy setup.
The commands that configure the “Sending Link” are:
• admitrcopylink
• creatercopytarget
HP 3PAR Remote Copy uses sending links to transmit data from
a system to its remote-copy target system (the other system in
the remote-copy pair).

Receiving Link: automatically created on all nodes that have sending links
configured.
HP 3PAR Remote Copy uses receiving links to:
• listen for remote-copy data and commands from the target system
in the remote-copy pair
• read the incoming data and commands
• send the data and commands to the appropriate remote-copy
process

Remote Copy Groups
• Volumes on the primary are added to a Remote Copy group
− The group is the basic configuration unit of Remote Copy
− The data in volumes in a group are time consistent
• The group has a target associated with it
− The target is either FC or IP and identifies a remote system
− A target has links associated with it. The links are used to communicate with the remote system
− Targets must be configured on all systems using Remote Copy
• The target of a Remote Copy group is known as the secondary
• Each group can either be in periodic or synchronous mode. All the volumes in the
group operate in the same mode.
• Data replication starts when a group is started
• Secondary group name =
− <primary_group_name>.r<primary_system_ID>
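The naming rule can be illustrated with a tiny sketch (the group name and system ID below are made-up example values):

```python
# Sketch of the secondary group naming rule from this slide.

def secondary_group_name(primary_group: str, primary_system_id: int) -> str:
    """Secondary group name = <primary_group_name>.r<primary_system_ID>."""
    return f"{primary_group}.r{primary_system_id}"

# Example values only; real names come from your arrays.
print(secondary_group_name("group1", 34089))  # group1.r34089
```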

Snapshots

• Snapshots are critical to Remote Copy operation (no license required but
strongly recommended)
• A snapshot is a view of a volume at a particular point in time
• Remote Copy typically uses coordinated snapshots
− Traffic to all volumes in a group is blocked and snapshots are taken
− This ensures that the snapshots of all volumes represent the state of the volumes at a distinct single point
in time
• Remote Copy uses resync snapshots when syncing data
− A resync snapshot is the starting point of a sync (can also be used as a recovery point)
− A sync snapshot is the target state of a sync

Remote Copy Synchronous

1. I/O to local array
2. Replicate to remote site (X ms latency)
3. Acknowledgment from remote
4. Acknowledgment to host
5. I/O complete

Primary (Source) System → Secondary (Backup) System

Sync Mode Basic Operation

• Group started for first time
− All data in base volumes is duplicated from primary to secondary
• When group is running
− Data written to base volumes is replicated to secondary before the host is acknowledged
• When group is stopped
− A snapshot is taken on the primary – this is a resync snapshot
− When data is written to the base volume, old data is written to the resync snapshot
• When group is started again
− A snapshot is taken on the secondary – this is the resync (recovery) snapshot
− Differences between the resync snapshot and the base volume on the primary are copied to the secondary
− If links go down during the resync, the volume on the secondary is promoted back to the recovery snapshot
− When the sync is completed all snapshots are deleted

Disaster Tolerant Solution Considerations
Synchronous replication server performance impact

• Host-initiated write is performed on both the active and the backup storage servers before
acknowledging the host write
• On the active storage server, the write is written to the write cache of two nodes; this is the standard redundancy of the 3PAR InServ
• Concurrently, the write is sent via the communication link to the backup storage server
• The write request is written to the write cache on two nodes before acknowledgment is sent to the active system
• Host write is acknowledged once the local cache update completes, as well as receiving the remote acknowledgment
• Server write I/O performance will be paced by both the speed of the inter-subsystem links
and the network latency on these links
• Total write IO service time on the primary array includes
• Local array IO service time
• Replication latency
• Replication latency can be much higher than the network “ping” time
• Remote array IO service time
• Server write IO performance decreases as the link latency increases or link speed decreases
• Provisioning a larger (faster) link can help reduce the IO service time for an individual
replicated IO but the IO service time the server receives will always be something larger than
the link latency plus local array IO service time (it may be a lot larger)

Asynchronous Periodic Mode
• Host writes are performed on the active server

• Host write is acknowledged as soon as the data is sent to the cache of two nodes
(normal host write acknowledgment)

• The “active” and “backup” servers are resynchronized periodically

• The resynchronization can be scheduled or done manually

Remote Copy Asynchronous Periodic
Primary Site / Remote Site sequence (base volume states A, B; snapshots SA, SB):

1. Initial copy: snapshot SA of base volume A is taken on the primary and its contents are copied to the remote base volume.
2. Resynchronization: starts with snapshots – the primary base volume has advanced to B, snapshot SB is taken, and the delta (B − A) is copied to the remote.
3. Upon completion: the old snapshot SA is deleted; SB is ready for the next resynchronization.
Periodic Mode Basic Operation

• Group started for the first time


− A sync snapshot is taken on the primary
− The contents of the sync snapshot are duplicated to the secondary
− When the sync completes the sync snapshot becomes the resync snapshot
• When a periodic resync is done
− A sync snapshot is taken on the primary
− A resync (recovery) snapshot is taken on the secondary
− Differences between sync and resync snapshots are duplicated to the secondary
− If links go down during the resync, then secondary can be promoted to recovery snapshot
− When resync is complete, old resync snapshot is deleted and old sync snapshot becomes new resync
snapshot on primary. Recovery snapshot is deleted on secondary.
• When group is stopped
− No need to take snapshots as there already are resync snapshots for next resync
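The snapshot rotation above can be sketched as a toy state machine (illustrative only, not 3PAR code; snapshot names are invented):

```python
# Toy model of periodic-mode snapshot rotation: each resync takes a new
# sync snapshot, sends the delta vs. the resync snapshot, then rotates
# (old resync snapshot deleted, sync snapshot becomes the resync snapshot).

class PeriodicGroup:
    def __init__(self):
        self.resync_ss = None   # starting point of the next resync
        self.counter = 0

    def start_first_time(self):
        # Sync snapshot taken and fully duplicated to the secondary;
        # when the sync completes it becomes the resync snapshot.
        self.counter += 1
        self.resync_ss = f"sync_ss.{self.counter}"

    def resync(self):
        self.counter += 1
        sync_ss = f"sync_ss.{self.counter}"
        delta = (self.resync_ss, sync_ss)  # differences sent to secondary
        self.resync_ss = sync_ss           # rotation: sync -> resync
        return delta

g = PeriodicGroup()
g.start_first_time()
print(g.resync())  # ('sync_ss.1', 'sync_ss.2')
print(g.resync())  # ('sync_ss.2', 'sync_ss.3')
```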

Remote Copy Asynchronous Periodic multi-volume IO
consistency

• For an application using a single volume by itself, IO consistency is always ensured, as the delta data
is applied in an all-or-nothing fashion
• For an application that spans multiple volumes, a Remote Copy “Volume Group” containing the
volumes ensures IO consistency is maintained across the volumes
• During the delta resynchronization of volumes in the Volume Group, RC creates snapshots of all the
volumes before the resynchronization starts
• If RC is in the process of updating volumes in the Volume Group when a failover occurs, all
volumes in the Volume Group are promoted back to the last snapshot that was taken, hence
ensuring IO consistency across the volumes

showrcopy
root@inodee4cd3:~# showrcopy

Remote Copy System Information


Status: Started, Normal

Target Information

Name ID Type Status Options Policy


s626 1 FC ready 2FF70002AC01852A mirror_config

Link Information

Target Node Address Status Options


s626 0:0:1 20010002AC01852A Up
s626 1:0:1 21010002AC01852A Up
s626 2:0:1 22010002AC01852A Up
s626 3:0:1 23010002AC01852A Up
receive 0:0:1 20010002AC01852A Up
receive 1:0:1 21010002AC01852A Up
receive 2:0:1 22010002AC01852A Up
receive 3:0:1 23010002AC01852A Up

Group Information

Name Target Status Role Mode Options


group1 s626 Started Primary Sync
LocalVV ID RemoteVV ID SyncStatus LastSyncTime
cpvv.1 2283 cpvv.1 2003 Synced NA
cpvv.2 2284 cpvv.2 2004 Synced NA

Name Target Status Role Mode Options


group2 s626 Started Primary Sync
LocalVV ID RemoteVV ID SyncStatus LastSyncTime
cpvv.3 2285 cpvv.3 2005 Synced NA
cpvv.4 2286 cpvv.4 2006 Synced NA

showrcopy -d

During sync:
Group Information

Name ID Target Status Role Mode LocalUserCpg LocalSnapCpg RmUserCpg RmSnapCpg Options

group1 66 s626 Started Primary Periodic Last-Sync 2012-11-05 11:46:11


EST , over_per_alert
LocalVV ID RemoteVV ID SyncStatus Resync_ss Sync_ss VV_iter R_iter S_iter LastSyncTime
cpvv.1 2283 cpvv.1 2003 Synced rcpy.66.2283.1 rcpy.66.2283.1.1 34089/2 34089/1 34089/2 2012-11-05 11:47:40 EST
cpvv.2 2284 cpvv.2 2004 Syncing (0%) rcpy.66.2284.1 rcpy.66.2284.1.1 34089/2 34089/1 34089/2 2012-11-05 11:45:51 EST

After sync:
Group Information

Name ID Target Status Role Mode LocalUserCpg LocalSnapCpg RmUserCpg RmSnapCpg Options

group1 66 s626 Started Primary Periodic Last-Sync 2012-11-05 11:46:11


EST , over_per_alert
LocalVV ID RemoteVV ID SyncStatus Resync_ss Sync_ss VV_iter R_iter S_iter LastSyncTime
cpvv.1 2283 cpvv.1 2003 Synced rcpy.66.2283.1.1 none 34089/2 34089/2 NA 2012-11-05 11:47:40 EST
cpvv.2 2284 cpvv.2 2004 Synced rcpy.66.2284.1.1 none 34089/2 34089/2 NA 2012-11-05 11:48:35 EST

Supported Topologies

Remote Copy One-to-One (1:1) Topology

• Supported between a pair of StoreServ arrays
• Supported with synchronous OR asynchronous periodic mode
• With HP 3PAR OS 3.1.2 and later you can run synchronous AND asynchronous periodic at the
same time between a pair of HP 3PAR arrays in a 1:1 configuration
• Mixed mode requires multiple targets

Sync
− RCFC or RCIP, 2–4 nodes, bidirectional

Periodic
− RCFC or RCIP, 2–4 nodes, bidirectional
− FCIP, 2 nodes, bidirectional

Mixed Sync and Periodic
− RCIP or RCFC, 4 nodes, bidirectional
− RCIP periodic with RCFC sync, 2–4 nodes, bidirectional

HP 3PAR Remote Copy Many to One (N:1) Topology

N:1 Topology
• Only supported with Asynchronous Periodic replication
• StoreServ requirements
− Current max support is 4:1 InServs
− One of the four primary InServs can mirror bi-directionally with the target, hence protecting the
target array’s data
− If the solution is FCIP based, then the target StoreServ requires two nodes for every primary
StoreServ
− One relationship may be bi-directional
− Cannot mix different transports (all links are RCIP, or all RCFC, or all FCIP)
− Sync not supported

One to Many (1:N) Topology 3.1.2 Support
Source periodic
− RCIP, 2–4 nodes, bidirectional
− RCFC or FCIP, 4 nodes, bidirectional
− RCFC/FCIP and RCIP, 4 nodes, bidirectional

Source sync
− RCIP, 2–4 nodes, bidirectional
− RCFC, 4 nodes, bidirectional
− RCFC and RCIP, 4 nodes, bidirectional

Source periodic and sync (mixed)
− Sync RCIP/RCFC, periodic RCIP/RCFC/FCIP, 4 nodes, bidirectional

N is a max of 2

Supported configs
− Async Periodic between all arrays
− Sync between one pair and Async Periodic between the other pair

Note: different volume groups are being replicated (1:N)
* Bi-directional replication supported starting with HP 3PAR OS 3.1.2
Remote Copy: Supported Topology

• Synchronous Long Distance


• Synchronous Long Distance (SLD) is intended to provide a solution with a potential for an RPO of
zero at the end of an asynchronous replication link if the primary data center fails
• SLD allows a given RC group to be replicated to two different HP 3PAR arrays simultaneously
− One mode is synchronous and the other is asynchronous periodic
− This is the ONLY configuration that allows replicating the same remote copy group from a source HP 3PAR
StoreServ array to two separate target HP 3PAR StoreServ arrays
• Synchronous replication occurs to one target 3PAR StoreServ and Asynchronous Periodic
replication occurs to the other target 3PAR StoreServ
• Requires the synchronous replication relationship link to be RCFC or RCIP (FCIP is not allowed)
and the Asynchronous Periodic relationships to be RCIP or FCIP
• With HP 3PAR OS 3.1.2 we allow bi-directional replication between the synchronous arrays
• This means we can support two SLD configurations using the same three arrays

Synchronous Long Distance Topology 3.1.2 Support
Source to sync target
− RCFC or RCIP, 2–4 nodes, bidirectional
− In an SLD configuration, bi-directional synchronous replication is supported between Source and
Sync Target starting with HP 3PAR OS 3.1.2
− Supported configs: only one Source and Sync Target pair is allowed to be bi-directional; the
Source/DR pair is uni-directional as before
− Allows customers with 3 datacenters (DCs) to deploy primary applications on their Source and Sync
sites, and protect their data/applications using a storage system located in another DC

Source to periodic target, and sync target to periodic target
− RCFC or RCIP, 2–4 nodes
− FCIP, 4 nodes
− Uni-directional

Can now support two SLD configs among three arrays


Synchronous Long Distance Topology 3.1.2 Support

Support for two SLD configurations across three arrays, with bi-directional
synchronous replication between the two sync-connected arrays:

− Array 1 is the Source of SLD config 1 (P1) and the Sync Target of SLD config 2 (P2`)
− Array 2 is the Sync Target of SLD config 1 (P1`) and the Source of SLD config 2 (P2)
− The third array is the Async (periodic) Target for both configurations (P1``, P2``)
Topology Enhancements in 3.1.3
Many‐to‐Many (M‐to‐N) Remote Copy

The latest and greatest 3PAR Remote Copy Config Guide, including 3.1.3
enhancements, is available at:

Disaster Recovery

Disaster Recovery Actions
Supported Disaster Recovery Actions:
• Reverse
• Failover
• Recover
• Restore
• Revert Failover
• Switchover

Reverse

• Changes the natural and current direction of all specified groups.


• The operation is mirrored resulting in a direction change on both systems.
• Used in the background by the Peer Persistence switchover command
• This option should not be used as part of the normal disaster recovery process

Failover

• This action is typically performed when the primary has failed.


• It is run on the secondary and the group must be stopped.
• The secondary becomes the primary reverse and data can now be written to the
volumes on this system.
• This allows continuation of service following a primary disaster
• When replication is stopped (either due to link issues or user action), an rcopy
snapshot (RO) is created for each volume on the source, and source writes
continue
• When you perform the failover, an rcopy snapshot (RO) is created on the destination as
well, and the volumes are writeable on both source and destination

Recover

• This action can be performed on groups where the failover command has already
been completed successfully.

• It changes the original primary to a secondary reverse.

• The groups are then started and synchronized. Delta changes from the backup array
are synced to the original primary array

• When recover is complete the Remote Copy system is working normally but in a
reverse direction.

Restore

• Restore follows Recover in order to return Remote Copy to its original
direction
• Used on groups where the recover operation has been performed.

• The secondary-reverse is changed back to primary and primary-reverse changed


back to secondary.

• The groups are started and synced.

• When complete the groups are operating in the normal, natural direction.

Note: this sounds similar to a reverse, but a reverse does not sync the data from the
secondary to the primary!

Revert failover
You can undo a failover operation by reverting the Remote Copy groups to their normal
state. When Remote Copy starts syncing after the revert, it will overwrite any data
that was written to the Backup system volumes.

• When you revert, the data on the destination volumes is restored, using the snapshot,
to its state at the time replication was stopped.
• The volumes remain writeable (RW) on the source, but on the destination they become
read only (RO).

Note: the snapshots on both source and destination remain until replication is started.

• Replication has not started at this stage; user action is required to start
replication unless autostart is enabled.
• Start replication; when the volumes are synced, both snapshots are deleted.

Switchover

• Operation added for the Peer Persistence feature in 3.1.2

• This will be covered in the VMware Peer Persistence section

Disaster Recovery Process

1. Change secondary volume groups to primary.


• setrcopygroup failover

2. Reverse the natural direction of data flow and synchronize the systems.
• setrcopygroup recover

3. Restore the natural direction of data flow.


• setrcopygroup restore
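The role changes this sequence produces can be sketched as follows (illustrative only, not 3PAR code; array names "A" and "B" are invented placeholders):

```python
# Toy model of group roles through failover / recover / restore, per the
# Failover, Recover, and Restore slides.

def failover(roles):
    # Run on the secondary: it becomes Primary-Rev (writeable).
    return {**roles, "B": "Primary-Rev"}

def recover(roles):
    # Original primary becomes Secondary-Rev; replication runs reversed.
    return {**roles, "A": "Secondary-Rev"}

def restore(roles):
    # Roles return to the natural direction after a resync.
    return {"A": "Primary", "B": "Secondary"}

roles = {"A": "Primary", "B": "Secondary"}   # A = original primary array
roles = failover(roles)
print(roles)  # {'A': 'Primary', 'B': 'Primary-Rev'}
roles = recover(roles)
print(roles)  # {'A': 'Secondary-Rev', 'B': 'Primary-Rev'}
roles = restore(roles)
print(roles)  # {'A': 'Primary', 'B': 'Secondary'}
```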

Limits

Remote Copy Limits
• Synchronous OR mixed (sync/async) support 800 replicated volumes in 3.1.2 and 2400
in 3.1.3

• Asynchronous only = 2,400 replicated volumes in 3.1.2 and 6000 in 3.1.3

• Max of 4 links between a source and target

• Maximum number of volumes per group is 100 in 3.1.2 and 300 in 3.1.3
• More volumes means longer snapshot creation, hence longer failover time

• Can share an HBA between host and RC: a dedicated port for RC, the rest for host use
• StoreServ 10000, E, F, S, T all require dedicated RCFC HBAs if running 3.1.2MU1 or earlier

• One RCIP and/or one RCFC port per node until 3.1.2; up to 4 RCFC ports per node in
3.1.3
• You can configure one sending link per node per target system

Remote Copy Timeouts

When a failure occurs such that all links between the systems are broken:

Synchronous:
• After 15 seconds, the system marks the sending links as Down
• After another 15 seconds, the system marks the targets as failed

Asynchronous:
• After 25 seconds, the system marks the sending links as Down
• After another 200 seconds, the system marks the targets as failed
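A sketch of how these thresholds classify link state (assumed behavior based on the timeouts above; not HP code):

```python
# Classify link/target state from the seconds elapsed since the last
# successful contact, using the thresholds on this slide.

THRESHOLDS = {
    "sync":  {"link_down": 15, "target_failed": 15 + 15},
    "async": {"link_down": 25, "target_failed": 25 + 200},
}

def link_state(mode: str, seconds_since_contact: float) -> str:
    t = THRESHOLDS[mode]
    if seconds_since_contact >= t["target_failed"]:
        return "target failed"
    if seconds_since_contact >= t["link_down"]:
        return "links down"
    return "up"

print(link_state("sync", 10))    # up
print(link_state("sync", 20))    # links down
print(link_state("sync", 35))    # target failed
print(link_state("async", 100))  # links down
print(link_state("async", 230))  # target failed
```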

Gotchas

You need to stop the remote copy group when:


− Admitting VVs
− Dismissing VVs
− Growing VVs

You can’t rename a volume group
− Use an inter-group transfer instead

Performance (impact)

Local Write Request with Remote Copy

After the write data is mirrored between the nodes connected to common HDDs,
the write data must be DMAed to one of the nodes performing Remote Copy
replication. Note that the nodes used to implement Remote Copy can
(and will) see higher CPU utilization, and they will consume more write cache than
nodes not performing Remote Copy.

• A node can tell from the information in the write request (LUN and the LBA for the
write) which LD (and hence which node) the write data ultimately will be going to
• The node can also tell which nodes are responsible for Remote Copy replication
and hence must receive a copy of the data
• For this example, the write is destined for an LD which is owned by Node 2 or
Node 3 even though the request comes in to Node 2

[Figure: four-node StoreServ cluster (Nodes 0–3). Each node has Intel multi-core
processors, a multifunction controller, control cache, data cache, 3PAR Gen4 ASICs,
and PCIe switches; a virtual volume is presented to the server, and a Remote Copy
link leaves the cluster. The callouts trace a host write: a write IO request arrives
from the host; the receiving processor determines which node the write is for; Xfer
Ready is sent to the server; the server sends the data; the data is DMAed to the
owning node and placed in that node's cache; IO Complete is returned to the host; the
data is then sent over the Remote Copy link.]
Performance Statistics

• statport
− displays read/write (I/O) statistics for ports
• statrcopy
− displays statistics for Remote Copy volume groups
• statrcvv
− displays statistics for Remote Copy volumes
• checkrclink
− Performs a connectivity, latency, and throughput test between two systems

Interrupt Coalescing

• Enabled by default on RC
• Disabling can reduce service times for single-threaded apps
• Only use on sync replication
Performance Impact

During an initial sync or resync on one Remote Copy group:


• High service times are observed on host IOs to volumes in a separate Remote Copy group
operating in synchronous mode. This also results in massively reduced throughput.
• High service times are observed on host IOs to non-Remote Copy volumes on the secondary.

Remote Service Times

[Chart: remote service times (ms) vs. elapsed time (s), roughly 0–1400 s, for Group1
and Group2 with throttling disabled.]

Host traffic to group1 and initial sync on group2

Throughput

[Chart: data rate (kBps) vs. elapsed time (s), roughly 0–1400 s, for Group1 and
Group2 with throttling disabled.]

Host traffic to group1 and initial sync on group2
What’s Happening?
• Essentially, the secondary system is being overloaded.
• It is being asked to do so many writes that it runs into problems with disk access and CMP availability.
• Frequently (though not always) we see delayed ACKs on the secondary.
• High service times are the mechanism a system uses to indicate to a host that it is heavily loaded.
− Normally this causes the host to back off and reduce throughput.
• Upon examination of the secondary we see that Remote Copy queues up huge numbers of IOs.
− The high service times are largely a result of time spent queuing on the secondary.
• Zero detect makes the problem worse because the pipe no longer limits throughput!
How Writes are Done
Host writes to volumes in sync groups are done in the context of vio server threads:
• There are thousands of these per node
• The local write is done first, then the thread sends the write to the secondary and blocks waiting for a response
• When the remote service times are high, the effect of the blocking mechanism is that throughput is reduced

Initial syncs and resyncs are done using sync threads:
• There are 20 threads per node for initial syncs and 20 for resyncs. Each thread syncs one volume
• The syncs are done in 10G sections. Each node scans the volume and syncs the blocks that it owns
• When a block needs to be synced, the local data is read and an IO is sent to the secondary
• The write is non-blocking. Each thread will send IOs until one of a number of criteria is hit:
− There can be no more than 400 outstanding IOs on the node
− There can be no more than 50MB of outstanding data on the node
− There can be no more than 20MB of outstanding data on a single thread
• Threads see high service times as pure latency and do not attempt to back off
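The outstanding-IO criteria above can be sketched as a simple admission check. This is a minimal illustration, not the actual kernel code; the class and method names are invented, and only the three limits are taken from the slide:

```python
# Sketch of the per-node outstanding-IO limits used by sync threads.
# Hypothetical structure; the real logic lives in the kernel (mirror.c).

MAX_NODE_IOS = 400               # max outstanding IOs per node
MAX_NODE_BYTES = 50 * 1024**2    # max outstanding data per node (50MB)
MAX_THREAD_BYTES = 20 * 1024**2  # max outstanding data per sync thread (20MB)

class Node:
    def __init__(self):
        self.outstanding_ios = 0
        self.outstanding_bytes = 0

class SyncThread:
    def __init__(self, node):
        self.node = node
        self.outstanding_bytes = 0

    def can_send(self, io_size):
        """A sync thread keeps sending until any one of the limits is hit."""
        return (self.node.outstanding_ios < MAX_NODE_IOS and
                self.node.outstanding_bytes + io_size <= MAX_NODE_BYTES and
                self.outstanding_bytes + io_size <= MAX_THREAD_BYTES)

    def send(self, io_size):
        if not self.can_send(io_size):
            return False           # must wait until an ACK frees resources
        self.node.outstanding_ios += 1
        self.node.outstanding_bytes += io_size
        self.outstanding_bytes += io_size
        return True

    def ack(self, io_size):
        self.node.outstanding_ios -= 1
        self.node.outstanding_bytes -= io_size
        self.outstanding_bytes -= io_size
```

Because the thread treats high service times as pure latency, nothing in this check reacts to slow ACKs; the thread simply stalls when a limit is reached and resumes as soon as an ACK arrives.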
Solution

• The solution is to reduce the throughput of initial syncs/resyncs in the event of high service times.
− Generally when the secondary is overloaded we start seeing a few excessive service times (> 75ms)
− As soon as an ACK is received with an excessive service time, the rate is reduced by a small amount
− When no excessive service times have been seen for a certain period, the rate is gradually increased
− The rate is controlled on a per-VV, per-node basis
• This is enabled by default in 3.1.2 MU2; prior to that release it is not supported
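The throttle described above can be pictured with a small sketch. Only the 75ms threshold comes from the slide; the step sizes and the quiet period are made-up values for illustration:

```python
# Illustrative sketch of the resync throttle: back off on excessive service
# times, creep the rate back up after a quiet period. BACKOFF_FACTOR,
# RECOVER_FACTOR and QUIET_PERIOD_S are invented values, not 3PAR internals.

EXCESSIVE_SVT_MS = 75       # service time considered "excessive" (from slide)
BACKOFF_FACTOR = 0.9        # reduce the rate by a small amount
RECOVER_FACTOR = 1.05       # gradually increase the rate
QUIET_PERIOD_S = 10         # quiet time before increasing

class VvNodeThrottle:
    """The rate is controlled on a per-VV, per-node basis."""
    def __init__(self, max_rate_kbps):
        self.max_rate = max_rate_kbps
        self.rate = max_rate_kbps
        self.last_excessive = None

    def on_ack(self, svc_time_ms, now_s):
        if svc_time_ms > EXCESSIVE_SVT_MS:
            self.rate *= BACKOFF_FACTOR        # back off immediately
            self.last_excessive = now_s
        elif (self.last_excessive is not None and
              now_s - self.last_excessive >= QUIET_PERIOD_S):
            self.rate = min(self.rate * RECOVER_FACTOR, self.max_rate)
```

The key design point is asymmetry: backoff happens on the very first excessive ACK, while recovery only starts after a sustained quiet period, so a struggling secondary is relieved quickly but not re-flooded.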
Performance measurement
Traditional performance checks do not apply:
----- statport -rw -d -iter -----
r/w I/O per second KBytes per sec Svt ms IOSz KB
Port D/C Cur Avg Max Cur Avg Max Cur Avg Cur Avg Qlen
0:1:1 Data r 0 0 0 0 0 0 0.0 0.0 0.0 0.0 -
0:1:1 Data w 6 6 6 9 9 9 0.4 0.4 1.4 1.4 -
0:1:1 Data t 6 6 6 9 9 9 0.4 0.4 1.4 1.4 0
0:1:2 Data r 98 98 98 8599 8599 8599 1.0 1.0 87.7 87.7 -
0:1:2 Data w 112 112 112 2241 2241 2241 3.6 3.6 19.9 19.9 -
0:1:2 Data t 210 210 210 10840 10840 10840 2.4 2.4 51.5 51.5 4
0:2:1 Data r 223 223 223 422 422 422 1734.1 1734.1 1.9 1.9 -
0:2:1 Data w 0 0 0 0 0 0 0.0 0.0 0.0 0.0 -
0:2:1 Data t 223 223 223 422 422 422 1734.1 1734.1 1.9 1.9 382
0:2:2 Data r 91 91 91 356 356 356 4253.4 4253.4 3.9 3.9 -
0:2:2 Data w 0 0 0 0 0 0 0.0 0.0 0.0 0.0 -
0:2:2 Data t 91 91 91 356 356 356 4253.4 4253.4 3.9 3.9 382

For RC we use a pre-read of 80KB, so a long service time means the port waited that long for the primary side to place data in the read buffer.
The timeout is 5 seconds, so 5000 ms is the highest value you will see there.
Performance in sync mode is improved if interrupt coalescing (IntCoal) is disabled on FCRC ports.
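When eyeballing these counters in scripts, the relevant columns of a statport data row can be pulled out with a few lines of Python. This is a hypothetical helper that assumes the whitespace-separated column layout shown in the sample above; real output formats can vary by OS release:

```python
# Hypothetical parser for one data row of 'statport -rw' output, assuming
# the column layout of the sample above:
# Port D/C r/w  IOPS(Cur Avg Max)  KBps(Cur Avg Max)  Svt(Cur Avg)  IOSz(Cur Avg)  Qlen

def parse_statport_row(line):
    f = line.split()
    return {
        "port": f[0],
        "type": f[1],               # D/C column, e.g. 'Data'
        "rw": f[2],                 # r, w or t (total)
        "iops_cur": int(f[3]),
        "kbps_cur": int(f[6]),
        "svt_cur_ms": float(f[9]),  # on RC ports this includes pre-read wait
        "iosz_cur_kb": float(f[11]),
    }
```

Applied to the `0:2:1` row above, the parser reports a current service time of 1734.1 ms, which by the pre-read explanation reflects waiting on the primary rather than a slow disk.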
Appendix
• Some background operations
• Remote Copy data verification
Background Operations
Code
• Remote Copy spans both user space and kernel space.
• In user space it is part of the sysmgr process and most code resides in
rmconfig.c and rmutil.c.
• The sysmgr code is largely responsible for configuration and management.
• Kernel code is contained in mirror.c, rmt_mirror.c and tickets.c.
• Kernel code is responsible for:
− Requesting reads/writes from volumes and identifying differences between snapshots
− Managing IO requests and handling link downs/node failures
− Sending data to the links
RTI
• An rti is an instance of rti_t (defined in mirror.h).
• It is a Remote mirror Ticket Information structure and contains all the
information required to manage a Remote Copy write request.
• These are tracked by timer code. If an rti is active for too long then a “hung rti”
assertion is seen in development builds.
• An rti contains a ticket number, which is allocated by the ticket dispenser.
Ticket Dispenser
• This is a kernel task and is active on a single node
• Whenever Remote Copy wants to send a write to the secondary it must request
a ticket from the ticket dispenser.
• Primary job is to ensure that all Remote Copy IOs are successfully written to the
secondary.
• The ticket dispenser checks for overlapping IOs and will not issue a ticket if an
active IO would write to the same memory as the requesting IO.
• It also handles load balancing i.e. decides which link an IO should be sent on
• In the case of a node down the ticket dispenser can replay active tickets to
ensure continuity of service.
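The overlap check can be pictured as follows. This is a toy model only: the real dispenser is a kernel task, hands out tickets tracked inside RTIs, and manages far more state (link load balancing, ticket replay on node down):

```python
# Toy model of the ticket dispenser's overlap check: a ticket is refused
# while an active IO covers an overlapping byte range of the same volume.

class TicketDispenser:
    def __init__(self):
        self.next_ticket = 0
        self.active = {}     # ticket -> (volume, start, end)

    def request_ticket(self, volume, start, length):
        end = start + length
        for vol, s, e in self.active.values():
            if vol == volume and start < e and s < end:
                return None  # overlapping active IO: no ticket issued
        ticket = self.next_ticket
        self.next_ticket += 1
        self.active[ticket] = (volume, start, end)
        return ticket

    def complete(self, ticket):
        del self.active[ticket]
```

Refusing a ticket for an overlapping range is what guarantees that two writes to the same blocks cannot be in flight to the secondary at once, so they cannot be applied there out of order.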
Syncs/Resyncs in Sysmgr
• rm_util_sync_group() is called to sync the group
− This spawns threads running rm_util_vv_sync_thread(), one thread per volume
− Maximum of 20 threads at any time
• The volume is broken into chunks
− Chunk size is 1/32 of the volume size or 16GB, whichever is larger
• The thread sends a message to the kernel on each node to sync the particular chunk
− The RMCMD_VVSYNC ioctl is used
− When the chunk is synced on all nodes, the next chunk is synced
− When a chunk is synced, the percentage-synced count is updated
• When a volume is fully synced, rm_vv_sync_cb() is called
− When the final volume is synced, group maintenance is performed in this function
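The chunking arithmetic above is simple enough to sketch (illustrative only; the function names are invented):

```python
# Chunk-size rule from the slide: 1/32 of the volume size or 16GB,
# whichever is larger. A small volume therefore syncs as one 16GB chunk.

GIB = 1024**3

def chunk_size(volume_size_bytes):
    return max(volume_size_bytes // 32, 16 * GIB)

def chunks(volume_size_bytes):
    """Yield (offset, length) for each chunk, in the order they are synced."""
    size = chunk_size(volume_size_bytes)
    off = 0
    while off < volume_size_bytes:
        yield off, min(size, volume_size_bytes - off)
        off += size
```

The 1/32 rule caps the bookkeeping at 32 chunks for large volumes, which also makes the per-chunk percentage-synced updates coarse but cheap.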
Syncs/Resyncs in Kernel
• The main function in the kernel is rm_synch_vv_thread(), called once per chunk
• The function either reads and transmits the base volume, or identifies the segments that differ from a resync snapshot and transmits those
• Each node handles only the segments for which it is the owner node
• The thread does not explicitly wait for ACKs
− A separate mechanism waits for ACKs
− Data is transmitted until an outstanding data limit or IO count is reached
Sync Mode
• A write request is received from the host and handled in a vio server thread (vol_io_write())
• rm_get_resources() is called to create an RTI and to get a ticket
• The IO is enqueued to the secondary
− Data is copied to the link node if required
− Data is sent out on the link
− The secondary takes the data from the link and enqueues it to a VIO server thread
• The local write (to cache) is done on the primary
• Wait for the ACK to the remote write
− If no response is received, or a NACK is received, the group is stopped
• Acknowledge the host
• The number of vio threads used by Remote Copy is limited
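The sequence above can be sketched end to end. Only the ordering and the names vol_io_write()/rm_get_resources() come from the slide; every class here is a stand-in, not real 3PAR code:

```python
# Simplified sketch of the sync-mode write path. The ordering is the point:
# enqueue the IO to the secondary, do the local cache write, then block on
# the remote ACK before acknowledging the host.

class Secondary:
    def __init__(self, ack="ACK"):
        self.queue = []
        self._ack = ack

    def enqueue(self, io, rti):
        self.queue.append((io, rti))       # link hands IO to a VIO thread

    def wait_for_ack(self, rti):
        return self._ack                   # stand-in for the real ACK path

class Primary:
    def __init__(self):
        self.cache = []
        self.stopped_groups = set()
        self.next_ticket = 0

    def rm_get_resources(self, io):
        self.next_ticket += 1              # create RTI and get a ticket
        return self.next_ticket

    def write_to_cache(self, io):
        self.cache.append(io)

def sync_mode_write(io, group, primary, secondary):
    rti = primary.rm_get_resources(io)
    secondary.enqueue(io, rti)             # IO enqueued to the secondary
    primary.write_to_cache(io)             # local write done on the primary
    ack = secondary.wait_for_ack(rti)      # block waiting for the remote ACK
    if ack != "ACK":
        primary.stopped_groups.add(group)  # no response/NACK: stop the group
        return False                       # host is not acknowledged
    return True                            # acknowledge the host
```

Because the host ACK depends on the remote ACK, the remote service time adds directly to the host service time, which is why disabling interrupt coalescing on the RC ports helps in sync mode.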
Remote Copy data verification
Remote Copy data verification
• The need for Remote Copy Data Verification
− Remote copy does not currently provide any means for customers to compare live primary
and secondary volumes
• Volumes that belong to remote copy groups that are in the started state
− Inconsistencies between the two volumes are not easily detectable
− In fact there is no easy way to carry out a comparison without using third-party software or
running a file system check (such as fscheck).
• Remote Copy Data Verification
− Provides customers with a data verification tool which compares primary and secondary
volumes and reports inconsistencies.
− The key feature of the project is that the comparison can be carried out without stopping the
remote copy group.
− Additionally, an option is provided to automatically correct any miscompares which are
discovered.
How does it work? (Function)
• checkrcopyvv detects miscompares in passive volumes
− No I/Os occur during the compare operation
− Uses snapshots, rather than the base volumes, during the compare.
• Avoids the requirement to quiesce the volumes
• Repair option employs the existing resync code
− Writes the miscompared blocks back to the secondary volume
− The repair option makes some modifications
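Conceptually the compare is a block-by-block diff of the two point-in-time snapshots, with the primary treated as the source of truth for repair. The sketch below is a toy model: the real tool works on snapshot data inside the array, not on in-memory buffers, and its block size is internal:

```python
# Toy model of an online compare: diff two snapshot images block by block
# and optionally repair the secondary from the primary. The 4KB block size
# is an assumption for illustration.

BLOCK = 4096

def compare(primary_snap, secondary_snap, repair=False):
    """Return the list of miscompared block numbers (bytearray inputs)."""
    miscompares = []
    for off in range(0, len(primary_snap), BLOCK):
        p = primary_snap[off:off + BLOCK]
        s = secondary_snap[off:off + BLOCK]
        if p != s:
            miscompares.append(off // BLOCK)
            if repair:
                # primary data is assumed correct: write it back
                secondary_snap[off:off + BLOCK] = p
    return miscompares
```

Comparing snapshots rather than the live base volumes is what lets the group stay started: the diff runs against a frozen image while host IO continues.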
Configuration and installation
• Remote Copy Data Verification can be used on all Remote Copy configurations
• It can only be used to check one volume at a time, so for example, on an SLD system
the command will need to be issued once for the sync target and once for the
periodic target.
• The command must be issued from the primary 3PAR 7000 system
• The data on the primary 3PAR 7000 system is assumed to be correct. When
miscompares are discovered, the primary volume is used as a source to repair any
errors on the target volume.
• The feature requires 3.1.2 or later
• There is no impact to upgrade/downgrade functionality.
• There are no special installation or configuration requirements.
Performance benefits
• Previously undetected inconsistencies can be detected and repaired with a single
command.
Any gotchas?
• The command runs at a relatively low priority so as not to impact normal system operation.
• This means that very large volumes can take a considerable amount of time to
compare/repair.
• It also means that the command can run slowly on very busy systems.
• This is deliberate.
Troubleshooting
• What files to gather (file name and path)
− Output is logged to the sysmgr log:
− /var/log/tpd/sysmgr
• Commands to run and what output to collect
• No special commands – just the sysmgr log
• The repair operation task log can be viewed using the showtask -d <tid> command.
Common error messages and limitations
• checkrcopyvv cannot be run using the -v option while the remote copy group is stopped.
• checkrcopyvv cannot be run using the -r option while the remote copy group is stopped.
Troubleshooting (Cont)
• A second instance of checkrcopyvv cannot be started while an existing instance is running.
• checkrcopyvv can only be run from the primary 3PAR 7000 system.
• checkrcopyvv cannot continue if there is a configuration change since the online compare
command was issued.
• The showtask -d command returns a log detailing the progress of the repair sync. The log will indicate a problem if the repair operation did not complete successfully.
• A secondary snapshot might be left behind if all Remote Copy links go down during a compare
operation. This will expire and delete automatically.
What does each error message mean?
• Error messages have been written to be self explanatory.
• Log messages related to this feature are prefixed with the letters RCDV (Remote Copy Data Verification).
• Bugs for this feature can be found by searching bugzilla for the feature id RCOPY2
Thank you