Performance analysis of the HP StorageWorks Enterprise Virtual Array storage systems using the HP StorageWorks Command View EVAPerf software tool

White paper

Contents

Introduction
  Scope
  Target audience
  Additional resources
HP Command View EVAPerf software overview
Capturing performance information with the HP Command View EVAPerf software
  Single sample
  Short-term duration
  Long-term duration
Organizing HP Command View EVAPerf software output data for analysis
Performance objects and counters
  Comparing HP StorageWorks Enterprise Virtual Array 3000/5000 and HP StorageWorks Enterprise Virtual Array 4000/6000/8000 counters
  Host port statistics performance object
  EVA Virtual Disk performance object
  Physical Disk performance object
  HP Continuous Access Tunnel performance object
  EVA Host Connection performance object
  Histogram metrics performance objects
  Array Status performance object
  Controller Status performance object
  Port Status performance object
Viewing individual performance objects
  Separating EVA storage subsystems
  Host port statistics
  Virtual disk statistics
  Physical disk statistics
  HP Continuous Access tunnel statistics
  Host connection statistics
  Virtual disk by disk group statistics
Recombining for more information
  Storage subsystem
  Disk groups
Performance problems
Reviewing the data
  Host port
  Controller statistics
  Host connection
  Virtual disk
  Physical disk
  HP Continuous Access tunnel
  Controller workload
  Disk group
Performance base lining
  Normal operation
  When something goes wrong
  Comparison techniques

Introduction
Scope
This document provides guidelines and techniques for analyzing the data provided by the HP StorageWorks Command View EVAPerf software tool, along with detailed information on the data it presents. The HP Command View EVAPerf software is included, at no charge, with the HP Command View EVA software suite. The software is also available as a free download at http://h20000.www2.hp.com/bizsupport/TechSupport/DriverDownload.jsp?pnameOID=471498&locale=en_US&taskId=135&prodSeriesId=471497&prodTypeId=12169&swEnvOID=1005. This document is not intended to be an in-depth performance troubleshooting or storage array-tuning manual, and it does not cover the information required to install, configure, or execute the tool.

Target audience
This document is intended for storage administrators or architects involved in performance characterization of an HP StorageWorks Enterprise Virtual Array (EVA). You should have a basic understanding of storage performance concepts, as well as EVA architecture and management. It is assumed that you are familiar with the operation and output selections available in the HP Command View EVAPerf software.

Additional resources
• For information on acquisition, installation, and execution of the HP StorageWorks Command View EVAPerf tool, refer to the HP StorageWorks Command View EVA User Guide included on the HP StorageWorks Command View EVA distribution CD.
• For information on the EVA models, refer to the following websites:
  http://www.hp.com/go/eva3000
  http://www.hp.com/go/eva5000
  http://www.hp.com/go/eva4000
  http://www.hp.com/go/eva6000
  http://www.hp.com/go/eva8000

• For information on the HP Command View EVAPerf software and related documentation, refer to the HP StorageWorks Command View EVA website at http://h18006.www1.hp.com/products/storage/software/cmdvieweva/index.html.

HP Command View EVAPerf software overview
Many performance monitoring tools and utilities are available for host (server) and storage area network (SAN) infrastructure. These tools measure the workload that a host presents to an attached EVA and provide an indication of the ability of the array to process that load. For example, host-based parameters such as request rates, data rates, and latencies can provide a good picture of the activity on the array. However, a server only reports on its own activity against the virtual volumes that it accesses. If a composite picture of the workload to the EVA is required, the HP Command View EVAPerf software is needed. The HP Command View EVAPerf software also provides information on activity within the storage system that can be useful in determining saturation levels or assessing capacity for additional workload. Measurements reported at the host port and physical disk levels can, for example, expose possible bottlenecks within the array. This analysis is similar to looking into the discrete components within a computer system (CPU, memory, paging file) to identify bottlenecks within the host.
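The composite picture described above can be assembled by simple post-processing of the tool's output. The following sketch is illustrative only: the column names are simplified stand-ins for the actual EVAPerf CSV headers, and the sample values are invented. It groups host port rows by array node and sums the request rates to approximate the total load presented to each EVA.

```python
import csv
from collections import defaultdict
from io import StringIO

# Simplified single-sample host port data; the real EVAPerf CSV headers differ.
SAMPLE = """Name,ReadReqPerSec,WriteReqPerSec,Ctlr,Node
FP1,582,400,A,5000-1FE1-5001-8680
FP2,0,0,A,5000-1FE1-5001-8680
FP1,584,393,B,5000-1FE1-5001-8680
"""

def total_load_per_node(csv_text):
    """Sum read and write request rates over all host ports of each array."""
    totals = defaultdict(float)
    for row in csv.DictReader(StringIO(csv_text)):
        totals[row["Node"]] += float(row["ReadReqPerSec"]) + float(row["WriteReqPerSec"])
    return dict(totals)

print(total_load_per_node(SAMPLE))
# {'5000-1FE1-5001-8680': 1959.0}
```

The same grouping approach works for any of the per-object outputs discussed later (virtual disks, physical disk groups), since each row carries its Node column.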


While the HP Command View EVAPerf software provides a substantial amount of performance and associated information for several components within the storage array, it cannot discern other important array configuration parameters, which can be critical in understanding the collected data. It might therefore be necessary to acquire additional information not directly available from the HP Command View EVAPerf software, such as:

• Virtual disk RAID (Vraid) levels
• Snapshot, Snapclone, and HP Continuous Access EVA remote mirror volume identification
• Disk group configuration and physical layout
• Virtual and physical disk sizes or capacities
• Disk group occupancy

Additional information that might be helpful for documentation and problem solving includes:

• Host bus adapter (HBA) firmware and driver revisions
• All driver and registry parameters
• SAN fabric map and switch firmware levels
• EVA firmware
• Application workload profiles

Capturing performance information with the HP Command View EVAPerf software

The HP Command View EVAPerf software provides many parameter choices for capturing workload and performance information from the EVAs visible to the tool. The first step is to define the intended use for the resulting data. Three decisions must then be made regarding data capture: the duration of the capture, the frequency at which the counters will be sampled, and the selection of counters to be sampled. All choices depend on how the data is to be used.

Single sample

If you want to monitor the storage system interactively to determine the current state of the system, a capture consisting of a single sample is appropriate and effective. Command line execution is straightforward and enables you to display the results on the screen or save them to a file for later review. This method enables you to quickly determine the workload and some configuration parameters for the system. Some examples where this method might be helpful are:

• Total load to each EVA system
• Controller processor utilization
• Virtual disk read hit percentage
• Disk group loading

These quick views provide a convenient snapshot of the system behavior at the moment of sample and enable you to determine where to focus more in-depth investigation. You have maximum flexibility in viewing different objects in quick succession to determine where the point of interest lies or where more investigation must be done. For an in-depth performance investigation, it is usually most helpful to restrict the data to a single performance object, such as host port statistics. Viewing all objects in a single sample can be difficult to review after capture and requires the longest time to execute.

Short-term duration

A longer view of system activity is helpful when investigating behavior that cannot be fully demonstrated with a single sample. Examples of such behavior are:

• Transient effects of workload
• Internal system behavior (such as physical disk activity)
• External factors such as inter-site link (ISL) behavior
• Problems which occur at specific times during the work day

In such cases, durations of 15 minutes to a few hours would be appropriate, depending on the objective. Sample intervals, particularly for transient behaviors (for example, latency spikes), would then be as short as practically possible, keeping with the timing constraints discussed in the following section.

It is often desirable to limit the data to that which is most useful. The most valuable information is captured from the controller statistics, host ports, virtual disks, and physical disk groups. Additional benefit can be derived from host connections and, if applicable, HP Continuous Access tunnels. Restricting the number of performance objects captured optimizes the sample rate, reduces the amount of storage required to hold the data, and reduces the effort to review the data later. To accomplish this, you must execute multiple instances of the HP Command View EVAPerf software, each capturing a single desired performance object.

An example of multiple execution instances with 20-second sample intervals and two-hour duration follows:

evaperf cs -cont 20 -dur 7200 > cs<datetime>.txt
evaperf hps -cont 20 -dur 7200 > hps<datetime>.txt
evaperf vd -cont 20 -dur 7200 > vd<datetime>.txt
evaperf pdg -cont 20 -dur 7200 > pdg<datetime>.txt

where <datetime> is the timestamp set by the scripting execution. As written, these commands will be executed sequentially, which is a problem because concurrent data is usually required. It is therefore necessary to execute the HP Command View EVAPerf software in multiple, separate terminal sessions or use a background scheduling utility for concurrent execution of each command. In the example, each execution samples at 20-second intervals and should therefore correlate to the others, although start times can be skewed.

Long-term duration

The longest durations of data capture would be used for background performance and workload logging (base lining), for capturing an infrequent event, or for multiple, long-term events. This duration ranges from hours to continuous; if no duration is specified, the capture runs forever. After selecting counters, the next decision involves the sample rate and duration. Sample intervals for this mode of operation would tend to be longer, ranging from tens of seconds to minutes per sample, again depending on the objective. If the information is to be used primarily for long-term characterization of workload, then sample intervals of one to five minutes should be adequate. However, for capturing an infrequent event, or if it is necessary to keep a detailed record of workload and performance for anticipated problem-solving activities, a finer granularity is required, and 10 to 60 seconds is appropriate. Capturing short-term phenomena requires short sample intervals as described in the previous section. Long sample intervals on the HP StorageWorks Enterprise Virtual Array 4000 (EVA4000), HP StorageWorks Enterprise Virtual Array 6000 (EVA6000), and HP StorageWorks Enterprise Virtual Array 8000 (EVA8000) will show less variation because of averaging over the sample interval.

When generating an historical record of EVA workload and response, consecutive iterations of one- or two-hour duration are desirable for data manageability and selection for retrieval. This requires scripting to launch the HP Command View EVAPerf software executable at those intervals and creating the corresponding time-stamped output files.

It is critical that all data captures complete in less time than the sample interval, or the slowest will run continuously and fall behind the others, which would happen if the sample interval is set too low. It is always preferable to have the sample rate determined by timed execution, as opposed to continuously running. Performance objects in systems that have many members, specifically virtual disks and physical disks, take longer to capture. For example, if there are 750 virtual disks over all systems sampled, a sample can take up to a minute to capture, and it is unlikely the vd command execution shown in the preceding example will remain in lock-step with the others. The larger the requested set of data, the longer it takes to capture it. Therefore, it is best to perform a brief test execution to determine the minimum setting of the collection interval before commencing a long-term data-gathering session. Run the HP Command View EVAPerf software in continuous screen mode for a few samples; the amount of time required to capture the data in each sample is shown in milliseconds at the bottom of the screen capture. Add a comfortable margin to this number (for example, 20%), and use that time in seconds as the sample interval for that object. If multiple instances of the HP Command View EVAPerf software will be executed, use the longest interval for all of the sample sets.

Organizing HP Command View EVAPerf software output data for analysis

HP Command View EVAPerf software data can be viewed using the Microsoft® Windows® Performance Monitor (Perfmon) utility or through the HP Command View EVAPerf software command line interface (CLI). In the current release of the product, most, but not all, metrics and ancillary information are available through Perfmon. Refer to the HP EVA documentation for a complete list of objects and counters supported in the Perfmon implementation. All information in this paper regarding corresponding Perfmon performance objects and counters is still valid, as are their interpretation and use. This paper discusses only the HP Command View EVAPerf software CLI.

Selecting a continuous output mode (-cont) creates a text stream of performance data that can be viewed directly or redirected to an output file. This data is collected from all EVAs attached to the HP Command View EVAPerf software host and for as many components as you want to monitor. You can select multiple output formats for the HP Command View EVAPerf software data: plain text, comma separated variable (CSV), and tab separated variable (TSV). The plain text output format is excellent for readability. However, if the data is to be further processed with post-processing tools, it is best to capture in a tighter format; HP recommends the CSV format.

Performance objects and counters

The following sections provide examples of output data, which is the output from the preceding command examples, captured in plain text format for easy readability.

IMPORTANT: In all examples provided, the data is shown strictly as an example of output format. All data examples are generated using synthetic workloads and therefore show great variation between virtual disks but very little variability over time. These examples do not represent any real workload observed in a customer environment, so there can be no inference that the numbers represent good, bad, or even valid performance information.
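The interval-sizing rule of thumb (measure the capture time, add roughly a 20% margin, use the result in seconds) and the construction of the timestamped command lines can both be scripted. This sketch assumes nothing about evaperf beyond the cs/hps/vd/pdg object commands and the -cont/-dur flags shown earlier; the helper names are our own, and each generated command would still need to be launched in its own session for concurrency.

```python
import math
from datetime import datetime

def safe_interval_seconds(capture_ms, margin=0.20):
    """Turn a measured capture time (ms) plus a comfort margin into a whole-second interval."""
    return max(1, math.ceil(capture_ms / 1000.0 * (1.0 + margin)))

def build_commands(objects=("cs", "hps", "vd", "pdg"), interval=20, duration=7200, stamp=None):
    """Build one evaperf command line per performance object, with timestamped output files."""
    stamp = stamp or datetime.now().strftime("%Y%m%d%H%M%S")
    return [f"evaperf {obj} -cont {interval} -dur {duration} > {obj}{stamp}.txt"
            for obj in objects]

print(safe_interval_seconds(53000))  # a 53-second capture rounds up to a 64-second interval
for cmd in build_commands(stamp="20240101T000000"):
    print(cmd)
```

Using the longest measured interval across all sampled objects, as the text recommends, keeps every instance inside its sample window.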

The output data file consists of performance counters and their associated reference information. When viewing the output data file, the performance data is tabulated into a matrix of rows and columns. Thirteen performance objects are available in the HP Command View EVAPerf software monitor: host ports, host connections, virtual disks, virtual disks by disk group, physical disks, physical disks by disk group, data replication tunnels, transfer size histogram, read latency histogram, write latency histogram, array status, controller status, and port status. Each of these performance objects includes a set of counters that characterize the workload and performance metrics for that object. These objects and counters and their meanings are defined in the following sections.

By default, the HP Command View EVAPerf software uses MB/s for data rates, which is equivalent to 1,000,000 bytes per second. There is an option for data rates to be presented in KB/s, which is equivalent to 1,024 bytes per second. The default unit for latencies is milliseconds (ms), but latencies can also be specified in microseconds. The HP Command View EVAPerf software displays individual counters with the appropriate data rate metrics in the header.

In some of the examples in this paper, the node or object name has been abbreviated for clarity. For example, the array node name 5000-1FE1-5001-8680 has been abbreviated to …-8680, and the virtual disk WorldWide Name (WWN) 6005-08B4-0010-2E24-0000-7000-015E-0000 has been abbreviated to …-015E. Friendly names are much easier to work with but were not used when these examples were captured.

Some non-performance information is common to several objects:

• The Node column indicates the EVA storage array from which specific data has been collected.
• The Ctlr column indicates for which controller the metrics are being reported. The A or B nomenclature may or may not correlate to that displayed in Command View, so unambiguous identification must be done using controller serial numbers.
• The GroupID column indicates in which disk group a virtual disk or physical disk is a member.

Comparing HP StorageWorks Enterprise Virtual Array 3000/5000 and HP StorageWorks Enterprise Virtual Array 4000/6000/8000 counters

The counters are managed somewhat differently in HP StorageWorks Enterprise Virtual Array 3000 (EVA3000) and HP StorageWorks Enterprise Virtual Array 5000 (EVA5000) systems than in EVA4000, EVA6000, and EVA8000 systems. In EVA3000 and EVA5000 systems, most counters (Req/s, and so on) are one-second averages per sample. Therefore, each sample represents an instantaneous snapshot of the activity at that moment, regardless of the interval between samples. In EVA4000, EVA6000, and EVA8000 systems, the counters are true averages over the sample interval, so highly variable data will have different characteristics for longer samples than for shorter ones.

In EVA3000 and EVA5000 systems, count metrics such as HP Continuous Access tunnel retries and host connection busies are non-resetting incremental counters. In EVA4000, EVA6000, and EVA8000 systems, these counters present the rate of counts that have been logged during the sample interval.
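On EVA3000/5000 systems the non-resetting counters only ever grow, so per-interval rates must be derived by differencing consecutive samples. A minimal sketch, with an invented sample format of (timestamp in seconds, cumulative count):

```python
def counter_rates(samples):
    """Turn (timestamp_seconds, cumulative_count) pairs into per-second rates.

    A non-resetting counter only increases; the rate over each interval is
    the count delta divided by the elapsed time between the two samples."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append((c1 - c0) / (t1 - t0))
    return rates

# Hypothetical cumulative tunnel-retry counts sampled every 20 seconds.
samples = [(0, 100), (20, 160), (40, 160), (60, 220)]
print(counter_rates(samples))  # [3.0, 0.0, 3.0]
```

On EVA4000/6000/8000 systems this differencing is unnecessary, since the tool already reports the rate logged during the sample interval.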

Host port statistics performance object

The EVA host port statistics collect information for each of the EVA host ports. These metrics provide information on the performance and data flow through each of these ports. Enumerating requests per second, data rates, and latencies for both reads and writes provides an immediate picture of the activity for each port and each controller. The EVA8000 has four host ports per controller, or eight ports per controller pair. All other controller models have two ports per controller, or four ports per controller pair.

For this document, the term "request" means an action initiated by a client to transfer a defined length of data to or from a specific storage location. With respect to the controller, the client is a host; with respect to a physical disk, the client is a controller.

NOTE: All host port statistics include only host-initiated activity. No remote mirror traffic is included in any of these metrics.

All data rates in these examples are expressed in the default unit of MB/s. All rates can optionally be listed in KB/s.

Host port statistics with related counters

Name  Read   Read  Read     Write  Write  Write    Av.    Ctlr  Node
      Req/s  MB/s  Latency  Req/s  MB/s   Latency  Queue
                   (ms)                   (ms)     Depth
----  -----  ----  -------  -----  -----  -------  -----  ----  -------------------

(The sample rows, which list ports FP1 through FP4 on controllers A and B of array 5000-1FE1-5001-8680, are synthetic examples of output format only.)

Read Req/s
This counter shows the read request rate from host-initiated requests that were sent through each host port for read commands.

Read MB/s
This counter tabulates the read data rate of host-initiated requests that were sent through each host port for read commands.

Read Latency
Read latency is the elapsed time from when the EVA receives a read request until it returns completion of that request to the host client over the EVA host port. This time is an average of the read request latency for all virtual disks accessed through this port and includes both cache hits and misses.

Write Req/s
This counter tabulates the write request rate of host-initiated requests that were sent through each host port for write commands.

Write MB/s
This counter tabulates the write data rate of host-initiated requests that were sent through each host port for write commands.

Write Latency
Write latency is the elapsed time from when the EVA receives a write request until it returns completion of that request to the host client over a specific EVA host port. This time is an average of the write request latency for all virtual disks accessed through this port.

Av. Queue Depth
This counter tracks the average number of outstanding host requests against all virtual disks accessed through this EVA host port over the sample interval. This metric includes all host-initiated commands, including non-data transfer commands. Queue depths by themselves should not be construed as good or bad, but rather as an additional metric to help understand performance issues.

EVA Virtual Disk performance object

The Virtual Disk object reports workload and performance for each virtual disk on the EVA. Virtual disk statistics provide a wealth of information about the activity against each of the virtual disks in the storage system. A virtual disk can also be a Snapshot, Snapclone, or replication volume. Virtual disks must be presented to a host to be seen by the HP Command View EVAPerf software; however, remote mirror volumes on the replication system are visible without being presented. In the tool headers, the term "Logical Unit Number" (LUN) is used interchangeably with virtual disk.

The EVA3000 and EVA5000 controllers function in Active-Standby mode, where the disk is accessed solely through the controller to which it is online. Thus, all activity to a given virtual disk is through a single controller.

The EVA4000, EVA6000, and EVA8000 controllers, however, support a feature called Active-Active mode, in which one controller is the online, or preferred, controller for a given virtual disk (the owning controller), but requests can still be processed by the other controller (the proxy). All requests to the owning controller are processed in the same fashion as for an EVA3000 or EVA5000 controller. Read commands to the proxy controller are passed to the owning controller, which, in turn, retrieves the data from disk or cache. The data is then passed to the proxy controller through the mirror port, which then satisfies the host read request. Writes to the proxy controller are stored in local cache but copied across the mirror port. The owning controller is responsible for cache flushes to physical disk, in the same manner as with EVA3000 and EVA5000 systems. Therefore, there is little additional impact to write performance for a proxy controller in Active-Active mode, compared with writes to the owning controller.

Activity for each virtual disk is reported separately for each controller accessing that virtual disk. In EVA4000, EVA6000, and EVA8000 systems, all host requests are logged only by the receiving controller, whether owning or proxy. Therefore, all request rate and data rate activity for a given virtual disk is simply the respective sum for both controllers, and the total activity for each virtual disk is the sum of the reported activity for each controller.

Information unique to virtual disk data

The "Online To" column defines which controller is the owning controller for each virtual disk.

"Mirr" indicates whether cache mirroring is enabled for each virtual disk. In EVA4000, EVA6000, and EVA8000 controllers, cache mirroring is always enabled to support the Active-Active feature. Thus, write data is always mirrored in EVA4000, EVA6000, and EVA8000 systems.

"Wr Mode" indicates whether the specific controller is in cache write-back or write-through mode. While write-back is the default mode, you can specify write-through mode on an individual virtual disk basis. You must change write-back mode to write-through on each virtual disk before initiating a three-phase Snapshot or Snapclone operation against that virtual disk. For these cases, the mode reverts to write-back when the action is complete. The write mode also immediately changes to write-through if a controller detects a battery condition in which there is a possibility of not being able to guarantee cache integrity. That mode persists until the battery condition is corrected.
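Because each controller reports its own share of a virtual disk's activity, totals must be formed by summing the rows for both controllers. A sketch with illustrative field names and invented values:

```python
from collections import defaultdict

# Per-controller virtual disk rows: (vdisk_name, controller, read_req_s, write_req_s).
ROWS = [
    ("...-015E", "A", 120.0, 40.0),
    ("...-015E", "B", 30.0, 10.0),   # proxy-side activity for the same virtual disk
    ("...-0163", "A", 0.0, 75.0),
]

def per_vdisk_totals(rows):
    """Aggregate request rates across both controllers for each virtual disk."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for name, _ctlr, rd, wr in rows:
        totals[name][0] += rd
        totals[name][1] += wr
    return {name: tuple(v) for name, v in totals.items()}

print(per_vdisk_totals(ROWS))
# {'...-015E': (150.0, 50.0), '...-0163': (0.0, 75.0)}
```

The same summation applies to data rates; latency columns, by contrast, are averages and would need request-weighted blending rather than simple addition.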

EVA virtual disk with related counters

Columns: ID, Read Hit Req/s, Read Hit MB/s, Read Hit Latency (ms), Read Miss Req/s, Read Miss MB/s, Read Miss Latency (ms), Write Req/s, Write MB/s, Write Latency (ms), Flush MB/s, Mirror MB/s, Prefetch MB/s, GroupID, Online To, Mirr, Wr Mode, Ctlr, LUN, Node.

(The sample rows report virtual disks …-015E, …-0163, and …-0168 in the Default disk group, online to controller B, with cache mirroring enabled and write-back mode, from both controllers of array …-8680; the numeric values are synthetic examples of output format only.)

Read Hit Req/s
This counter reports the rate at which read requests are satisfied from the EVA cache memory. Data will be in the EVA read cache because of a previous cache miss or a prefetch operation generated by a sequential read data stream.

Read Hit MB/s
This counter reports the rate at which data is read from the EVA cache memory because of host virtual disk read hit requests.

Read Hit Latency
This counter reports the average time taken from when a read request is received until that request has been satisfied from the EVA cache memory.

Read Miss Req/s
This counter reports the rate at which host read requests could not be satisfied from the EVA cache memory and had to be read from physical disks.

Read Miss Data Rate
This counter reports the rate of read data that was not present in the EVA cache memory and had to be read from physical disks.

Read Miss Latency
This counter reports the average time from when a host read request is received until that request has been satisfied from the physical disks.

Write Req/s
This counter reports the rate of write requests to a virtual disk that were received from all hosts.

Write Data Rate
This counter reports the rate at which data is written to the virtual disk by all hosts and includes transfers from an HP Continuous Access EVA source system to this system for data replication, and host data written to Snapshot or Snapclone volumes.

Write Latency
This counter reports the time between when a write request is received from a host and when request completion is returned.
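The read hit percentage mentioned earlier as a useful quick view can be derived from these counters, and an overall read latency can be formed as the request-weighted blend of hit and miss latencies. A sketch with invented numbers:

```python
def read_hit_pct(hit_req_s, miss_req_s):
    """Fraction of read requests satisfied from cache, as a percentage."""
    total = hit_req_s + miss_req_s
    return 100.0 * hit_req_s / total if total else 0.0

def overall_read_latency(hit_req_s, hit_lat_ms, miss_req_s, miss_lat_ms):
    """Request-weighted average of the hit and miss latencies, in ms."""
    total = hit_req_s + miss_req_s
    if not total:
        return 0.0
    return (hit_req_s * hit_lat_ms + miss_req_s * miss_lat_ms) / total

print(read_hit_pct(300.0, 100.0))                     # 75.0
print(overall_read_latency(300.0, 0.5, 100.0, 12.5))  # 3.5
```

The weighted form matters because miss latency is typically an order of magnitude higher than hit latency, so a small drop in hit percentage can move the overall read latency considerably.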

Flush Data Rate
This counter represents the rate at which host write data is written to physical storage on behalf of the associated virtual disk. Host writes to Snapshot and Snapclone volumes are included in the flush statistics, but data flow for internal Snapshot and Snapclone normalization and copy-before-write activity is not included. Data replication data written to the destination volume is included in the flush statistics. The aggregate of flush counters for all virtual disks on both controllers is the rate at which data is written to the physical drives and is equal to the aggregate host write data.

Prefetch Data Rate
This counter, on the controller that owns the virtual disk, reports the rate at which data is read from physical storage into cache in anticipation of subsequent reads when a sequential read stream is detected. HP Continuous Access EVA initial copy data traffic for a replicated virtual disk is reported by the owning controller in the source volume prefetch statistics.

Mirror Data Rate
NOTE: The mirror data rates referenced in this section refer only to the data across the mirror ports and are not related to the physical disk mirroring for Vraid1 redundancy.

This counter represents the rate at which data moves across the mirror port in servicing read and write requests to a virtual disk. The mirror data rate includes read data from the owning controller that must be returned to the requesting host through the proxy controller. Write data is always copied through the mirror port when cache mirroring is enabled for redundancy. If a write stream has been determined to be sequential access, or if the virtual disk is Vraid 1 or Vraid 0, the mirror traffic for that stream includes only its write data. However, a random workload to a Vraid 5 volume requires parity to be copied as well, so the mirror data rate for that specific stream is double the host write data rate (one parity byte for every host data byte). Reported mirror traffic is always outbound from the referenced controller to the other controller.

Snapclone normalization writes and copy-before-write activity for both Snapshots and Snapclones are done by the owning controller and do not pass through the mirror port. Host writes to Snapshot and Snapclone volumes are mirrored and included in the mirror data rate statistics for the replication volumes. Mirror rates for the virtual disk on the destination system of a remote mirror pair include initial copy and merge traffic to the replication volumes on the destination system, as with normal host write operations.

Physical Disk performance object
The physical disk counters report information on each physical disk on the system that is or has been a member of a disk group. These counters record all the activity to each disk and include all disk traffic for a host data transfer, as well as internal system support activity. This activity includes metadata updates, sparing, leveling, cache flushes, prefetch, Snapclone and Snapshot support, and redundancy traffic such as parity reads and writes or mirror copy writes. Each controller's metrics are reported separately, so the total activity to each disk is the sum of both controllers' activity. The ID of each disk is from an internal representation and is not definitive for disk identification.

Physical disk with related counters
[Sample output: one row per physical disk per controller, with columns ID, Drive Queue Depth, Drive Latency (ms), Read Req/s, Read MB/s, Read Latency (ms), Write Req/s, Write MB/s, Write Latency (ms), Grp ID, Enc, Bay, Ctlr, and Node.]

Drive Queue Depth
The queue depth metric is the average number of requests, both read and write, in process from each controller.

Drive Latency
This counter, which is not in EVA3000 and EVA5000 systems, reports the average time between when a data transfer command is sent to a disk and when command completion is returned from the disk. This time is not broken into read and write latencies but is a "command processing" time. Completion of a disk command does not necessarily imply host request completion because the request to a specific physical disk might be only a part of a larger request operation to a virtual disk.

Read Req/s
This counter reports the rate at which read requests are sent to the disk drive for each controller.

Read MB/s
This counter reports the rate at which data is read from a drive for each controller.

Read Latency
This counter, which is not in EVA4000, EVA6000, or EVA8000 systems, reports the average time for a physical disk to complete a read request from each controller.

Write Req/s
This counter reports the rate at which write requests are sent to the disk drive from each controller.
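Because each controller reports its physical disk counters separately, a small helper can coalesce the two records into one total per disk. This is a sketch; the dictionary keys are assumed names, not EVAPerf column headers:

```python
def combine_controllers(ctlr_a, ctlr_b):
    """Sum the rate counters reported by controllers A and B for one
    physical disk. Rates (req/s, MB/s) and queue depths are additive;
    latencies are not and need a request-weighted average instead."""
    return {key: ctlr_a[key] + ctlr_b[key] for key in ctlr_a}

# Hypothetical per-controller samples for a single disk (field names assumed)
disk_ctlr_a = {"read_rps": 35.0, "write_rps": 12.0, "read_mbps": 1.1, "write_mbps": 0.4}
disk_ctlr_b = {"read_rps": 28.0, "write_rps": 9.0, "read_mbps": 0.9, "write_mbps": 0.3}
total = combine_controllers(disk_ctlr_a, disk_ctlr_b)
```

Applying this per sample interval yields the true per-disk workload that either controller's view alone would understate.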

Write MB/s
This counter reports the rate at which data is sent to a drive from each controller.

Write Latency
This counter, which is not in EVA4000, EVA6000, or EVA8000 systems, reports the average time for a physical disk to complete a write request from each controller.

Enc
This is the number of the drive enclosure that holds the referenced drive.

Bay
This is the enclosure slot number in which the drive resides.

HP Continuous Access Tunnel performance object
The EVA HP Continuous Access tunnel counters report the intensity and behavior of the link traffic between the source and destination EVAs. The HP Continuous Access tunnel statistics have values only if there is at least one active HP Continuous Access EVA group on the storage array; otherwise, only a header prints. On EVA4000, EVA6000, and EVA8000 controllers, there can be up to four tunnels open on a given host port. Multiple HP Continuous Access EVA groups can share the same tunnel. More than one tunnel can be reported as open, but for a single HP Continuous Access EVA group, only one will be active.

Statistics for each tunnel are reported by both the source EVA and the destination EVA, except that the directional counters will be complementary: Write Out MB/s for one controller will be duplicated as Write In MB/s on the other.

The first five columns of the output define the physical and logical representation for each tunnel. The numbering refers to that reported by the EVA controller and is reported by HP StorageWorks Command View EVA. The hex source and destination IDs are with respect to each controller and should correlate between controllers.

Two systems, the source and destination system on each end of the tunnels, are represented in this data capture. In this example, the active tunnel is the one tunnel that shows activity, 16 MB/s of write activity in this case. It is tunnel 3 on host port 3 of controller B on the source side (node …-8680) and tunnel 2 on host port 2 of controller B on the destination side (node …-2350).

HP Continuous Access tunnel statistics with related counters
[Sample output: one row per open tunnel per controller, with columns Ctlr, Tunnel Number, Host Port, Source ID, Dest. ID, Round Trip Delay (ms), Copy Retries, Write Retries, Copy In MB/s, Copy Out MB/s, Write In MB/s, Write Out MB/s, and Node.]

Round Trip Delay
The Round Trip Delay counter shows the average time in milliseconds during the measurement interval for a signal (ping) to travel from source to destination and back again. It includes the intrinsic latency of the link round trip time and any queuing encountered from the workload. If there is no data flow between source and destination, the Round Trip Delay counter measures the link delay. If there is replication traffic on the link, the ping can be queued behind any data transmissions already in process, thus increasing round trip delay. If the destination controller is busy, the round trip time is extended accordingly. Round trip delay is reported for all open tunnels, whether active or inactive, and for each active tunnel it is reported by both local and remote systems. A value of 10,000 is reported when the tunnel is initially opened and then is updated if successful pings are executed. A value of 10,000 is also shown for any tunnel interrupted by a link problem.

Copy In MB/s
The Copy In MB/s counter measures the inbound data from the source EVA to create a remote mirror on the destination EVA during initial copy, or from a full copy initiated by the user or by a link event. Copy retries are included.

Copy Out MB/s
The Copy Out MB/s counter is identical to the corresponding Copy In MB/s counter, except that the direction is outbound from the local source EVA to the remote destination EVA.

Write In MB/s
The Write In MB/s counter measures the inbound data rate from the source EVA to the local destination EVA for host writes, merges, and replication write retries. A merge is an action initiated by the source controller to write new host data that has been received and logged while replication writes to the destination EVA were interrupted and now have been restored.

Write Out MB/s
The Write Out MB/s counter is identical to the corresponding Write In MB/s counter, except that the direction is outbound from the local source EVA to the remote destination EVA.

Copy Retries
Copy Retries is a count of all copy actions from the source EVA that had to be retransmitted during the sample interval in response to a failed copy transaction. Each retry results in an additional copy of 128 KB. Retries are reported by both the local and remote systems.

Write Retries
Write Retries is a count of all write actions from the source EVA that had to be retransmitted during the sample interval in response to a failed write to the replication volume. If the replication write consists of multiple 8 KB segments, only the failed segments are retransmitted; each retry results in an additional copy of 8 KB. Retries are reported by both the local and remote systems.
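The retry counters can be turned into an estimate of extra link traffic per sample interval. This sketch assumes, per the sizes discussed above, that a copy retry retransmits a 128 KB transaction and a write retry retransmits an 8 KB segment:

```python
def retry_overhead_kb(copy_retries, write_retries):
    """Estimated retransmitted data (KB) in one sample interval, assuming
    128 KB per copy retry and 8 KB per retried write segment."""
    return copy_retries * 128 + write_retries * 8

overhead_kb = retry_overhead_kb(3, 10)  # 3*128 + 10*8 = 464
```

Comparing this estimate against the Copy In/Out and Write In/Out rates gives a rough sense of how much of the link bandwidth is being spent on retransmission.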

EVA Host Connection performance object
The two counters in this object deal with the host port connections on the EVA and provide information for each external host adapter that is connected to the EVA. The host connection metrics provide some information on the activity from each adapter seen as a host to the array. The average number of requests in process (queue depth) over the sample interval is reported, along with the number of busies that were issued to each adapter in the interval.

Host connection with related counters
[Sample output: one row per host connection, with columns Host Name, Port, Queue Depth, Busies, Ctlr, and Node.]

Port
This counter, which is not in EVA4000, EVA6000, and EVA8000 systems, is the host port number on the controller through which the host connection is made.

Queue Depth
This counter records the average number of outstanding requests from each of the corresponding host adapters.

Busies
This counter represents the number of busy responses sent to a specific host during the sample interval. Busies represent a request from the controller to the host to cease I/O traffic until some internal job queue is reduced.

Histogram metrics performance objects
Each of the preceding metrics reports a single average value for each counter over the sample interval. In a complex system such as the EVA, some variation in a given metric is normal. Histograms provide a broader description of those metrics by indicating the range of values observed in each interval and the frequency with which different values were observed. Histograms enable you to assess that variability as normal or problematic. The histograms consist of a set of 10 ranges of milliseconds in which the counter reports the number of requests that exhibited latency in the given range during the sample interval. Histogram metrics are reported for each virtual disk for each controller in the EVA4000, EVA6000, and EVA8000 systems. Histograms are not available in the EVA3000 and EVA5000.

Read or write latency histogram
The latency histograms provide insight into the distribution of response times for requests. Distributions also show how well the response times are grouped or if they have values across many levels of latency. The read latency distribution combines both read hits and read misses and sometimes shows the bimodal distribution associated with the two processes. For example, in the following histogram, there were 1,229 requests to virtual disk 0 that experienced a latency of at least 13 ms but less than 26 ms in the sample interval.

Distributions are reported individually for each virtual disk by each controller providing the host service. It is therefore necessary to sum the statistics from each controller to accurately report the full histogram counts. All histogram buckets are fixed according to an exponentially increasing scale, so finer granularity is provided at the lower end of the scale and coarser granularity is provided at the upper end; each of the 10 ranges is fixed. Latency units are milliseconds. The following example was taken from read latency histograms. Only one example of latency histogram is presented, because a write latency histogram is indistinguishable from a read latency histogram in format and information.

Read or write latencies histogram
[Sample output: for each virtual disk (LUN) and controller, the number of requests falling into each of 10 latency ranges with upper bounds 1.6, 3.3, 6.6, 13, 26, 52, 105, 210, 420 ms, and INF.]
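Because each controller reports its own bucket counts, the two histograms for a virtual disk must be summed before analysis; from the merged counts you can then estimate, for example, the fraction of requests at or below a given latency bound. A sketch, using the 10 bucket upper bounds described above:

```python
# Upper bound (ms) of each of the 10 latency buckets; INF for the last
BUCKET_UPPER_MS = [1.6, 3.3, 6.6, 13, 26, 52, 105, 210, 420, float("inf")]

def merge_histograms(ctlr_a, ctlr_b):
    """Sum the per-controller bucket counts for one virtual disk."""
    return [a + b for a, b in zip(ctlr_a, ctlr_b)]

def fraction_within(buckets, limit_ms):
    """Fraction of requests whose bucket lies entirely at or below limit_ms."""
    total = sum(buckets)
    if total == 0:
        return 0.0
    covered = sum(n for n, ub in zip(buckets, BUCKET_UPPER_MS) if ub <= limit_ms)
    return covered / total

# Controller A counts for virtual disk 0 from the example; controller B idle
merged = merge_histograms([0, 0, 0, 15, 1229, 889, 52, 6, 0, 0],
                          [0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
```

Note that the bucket boundaries only bound the answer: all requests in a bucket are known to be below its upper edge, but their exact latencies are not recorded.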

Transfer size histogram
Transfer size distributions are somewhat different than those for latencies because transfer size distributions are defined by the application workload and are, in essence, deterministic values. Latencies show how the system responds to a given workload, but transfer sizes show the makeup of that workload.

Most operating systems have an upper limit of transfer size. Larger transfers can be specified at the application level, but the operating system breaks those transfers into multiple smaller chunks. For example, if the maximum transfer size is 128 KB and the application tries to write a 512-KB chunk, the controller will see four transfers of the smaller size. For this case, array statistics will not match host statistics for the same operation.

In the following example, the synthetic workload used to capture this data used a single transfer size for each virtual disk's load, and that is clearly shown in each line. Real workloads usually show more variation than this example.

Transfer size histogram
[Sample output: for each virtual disk (LUN) and controller, the number of requests falling into each of 10 transfer size ranges with upper bounds 2K, 4K, 8K, 16K, 32K, 64K, 128K, 256K, 512K, and INF.]

Array Status performance object
The Array Status object reports basic workload to the overall storage system. The array status counters provide an immediate representation of the total workload to a controller pair in request rate and data rate.

Array status with related counters
Total Host Req/s   Total Host MB/s   Node
894                1.01              5000-1FE1-5001-8680
0                  0.00              5000-1FE1-0015-2350

Controller Status performance object
The controller status output shows the controller processor utilizations and host data service utilizations. Controller status metrics for two storage systems are reported in the following example.

Controller status with related counters
CPU %   Data %   Ctlr   Serial           Node
84      82       A      P8398FXAAR2003   5000-1FE1-5001-8680
99      97       B      P8398FXAAR200D   5000-1FE1-5001-8680
3       3        A      P8398FXAAR301Q   5000-1FE1-0015-2350
12      8        B      P8398FXAAQZ001   5000-1FE1-0015-2350

CPU %
This counter presents the percentage of time that the CPU on the EVA controller is not idle. A completely idle controller shows a value of 0%, while one that is saturated shows a value of 100%.

Data %
Percent data transfer time is the same as CPU %, except that it does not include time for any internal process that is not associated with a host-initiated data transfer. Thus, it does not include background tasks such as sparing, leveling, Snapshot, Snapclone, virtual disk management, remote mirror traffic, or communication with tools such as the HP Command View EVAPerf software.

Port Status performance object
The port status counters report the health and error condition status for all Fibre Channel ports on each controller and are not performance related.

Port status with related counters
[Sample output: one row per port (DP-1A through DP-2B, MP1, MP2, FP1 through FP4) per controller, with columns Name, Type (Device, Mirror, or Host), Status, Loss of Signal, Bad Rx Char, Loss of Synch, Link Fail, Rx EOFa, Discard Frames, Bad CRC, Proto Err, Ctlr, and Node. In this example, all ports are Up with zero error counts.]

Viewing individual performance objects
While the plain text output data provides a general sense of performance over a short period, additional output file processing is required to facilitate performance analysis over a longer period and when graphing a performance counter value over time. When gathering performance data for many components over an extended period, the text file containing the performance data from the HP Command View EVAPerf software can become very large and difficult to analyze. The original raw output file can provide a more sophisticated and in-depth performance analysis by organizing the performance object data into separate files for each performance object, which enables you to view a time series representation of each reported component for analysis and graphing. Text processing tools, such as Perl and Awk, can help you with this task. This additional processing is also required to derive performance of components that are not directly represented in HP Command View EVAPerf software data, such as controller-level workload.

Separating EVA storage subsystems
If you have multiple EVA storage subsystems (controller pairs) represented in one output file, separate the data in the output file into multiple files, each containing data for a single EVA storage subsystem. When you have each EVA storage subsystem represented in its own file, use the following procedures to organize your performance data for component-level analysis for each storage system. Multiple concurrent executions of the HP Command View EVAPerf software can be performed to capture each system in a separate file, but care in setup and execution is required if the object samples in one file must be time-correlated to those in another file.

NOTE: The –sz command line switch restricts data capture to only the systems specified.
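The separation step can be scripted with any text-processing tool. The following Python sketch splits CSV-style EVAPerf rows into one group per storage system, keying on a Node field assumed to be the last comma-separated field of each data row (adjust the field index and header handling to your actual capture format):

```python
def split_by_node(lines):
    """Group CSV-style data rows into one list per storage system, keyed
    on the Node WWN, assumed here to be the last field of each row."""
    systems = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        node = line.split(",")[-1]
        systems.setdefault(node, []).append(line)
    return systems

# Hypothetical rows; only the trailing Node field matters for the split
rows = [
    "8:43:42,FP1,A,37,0.3,5.7,594,4.6,0.7,1.7,5000-1FE1-5001-8680",
    "8:43:42,FP1,A,12,0.1,4.2,100,1.0,0.5,0.9,5000-1FE1-0015-2350",
]
by_node = split_by_node(rows)  # one entry per EVA; write each to its own file
```

Each resulting group can then be written to its own per-subsystem file for the component-level procedures that follow.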

Host port statistics
To effectively view host port statistics, extract all host port data and present it chronologically for each port:

1. Extract the port statistics from the HP Command View EVAPerf software output file into one file of all host port data.
2. Rearrange the extracted data so that all of the ports for a single timestamp are grouped on a single line.

Example of host port statistics extracted from the output data file for both controllers
[Sample output: CSV rows with the fields Time, Name, Ctlr, Read Req/s, Read MB/s, Read Latency (ms), Write Req/s, Write MB/s, Write Latency (ms), Av. Queue Depth, and Node; one row per host port (FP1 through FP4) per controller at each timestamp.]
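Step 2, grouping all ports for one timestamp onto a single line, amounts to a pivot of the extracted rows. A minimal sketch, with the row layout assumed to be (timestamp, port name, metrics):

```python
def pivot_by_timestamp(rows):
    """Regroup per-port rows so that each timestamp maps to one record
    holding every port's metrics; rows are (timestamp, port, metrics)."""
    grouped = {}
    for timestamp, port, metrics in rows:
        grouped.setdefault(timestamp, {})[port] = metrics
    return grouped

# Hypothetical (timestamp, port, (Read Req/s, Read MB/s)) samples
rows = [
    ("8:43:42", "FP1", (37, 0.3)),
    ("8:43:42", "FP2", (41, 0.4)),
    ("8:43:52", "FP1", (40, 0.3)),
]
pivoted = pivot_by_timestamp(rows)
```

Writing each timestamp's record as one delimited line produces the time series layout that spreadsheets and graphing tools expect.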

Example of host port data grouped by timestamp
(All eight host ports appear on one line with a single timestamp.)
[Sample output: one long CSV record per timestamp, repeating the fields Name, Ctlr, Read Req/s, Read MB/s, Read Latency (ms), Write Req/s, Write MB/s, Write Latency (ms), and Av. Queue Depth for each of the eight host ports.]

The preceding illegible example is included to present the extraction of host port data as time series information as a tool would process it. The following sample is exactly the same information in spreadsheet form to more clearly show content.

Example of host port data grouped by timestamp from the previous example in a spreadsheet format
(This spreadsheet illustrates only the first three host ports for controller A.)
[Sample output: the same records laid out in spreadsheet columns.]

3. Graph the individual port data over the time interval of interest.

It might be helpful in a performance analysis effort to determine estimated transfer sizes at the host port level or the virtual disk level, calculated by dividing the data rate by the request rate for that interval. This estimation is an average for the sample interval. The unit for this calculation is MB/transfer; multiplying by 1,000,000 and dividing by 1,024 yields units of KB/transfer.
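The transfer size estimate can be computed directly from the two counters; this sketch applies the conversion factors given in the text:

```python
def avg_transfer_kb(mb_per_s, req_per_s):
    """Estimated average transfer size for one interval: data rate divided
    by request rate gives MB/transfer; multiplying by 1,000,000 and
    dividing by 1,024 converts the result to KB/transfer."""
    if req_per_s == 0:
        return 0.0
    return (mb_per_s / req_per_s) * 1_000_000 / 1_024

size_kb = avg_transfer_kb(4.6, 594.0)  # roughly 7.6 KB per transfer
```

Because the estimate is a ratio of two interval averages, it is most meaningful when the workload is reasonably steady within the sample interval.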

Virtual disk statistics
To view the virtual disk performance statistics, extract the data for each virtual disk into a separate time series file by performing the following suggested steps:

1. Organize all virtual disk statistics for each controller into separate files or separate fields in one file.
2. Combine read hit and read miss statistics into a single set of metrics for all reads (optional).
3. Produce additional information by calculating transfer sizes for read hits, read misses, and writes for each virtual disk as described previously for host ports (optional).

Example of virtual disk performance data extracted from the full output data file or from the single object data file
[Sample output: CSV rows with fields Time, ID, Read Hit Req/s, Read Hit MB/s, Read Hit Latency (ms), Read Miss Req/s, Read Miss MB/s, Read Miss Latency (ms), Write Req/s, Write MB/s, Write Latency (ms), Flush Req/s, Flush MB/s, Mirror MB/s, Prefetch MB/s, Group ID, Online To, Wr Mode, Mirr, Ctlr, and LUN.]

In the preceding example, the first four samples for virtual disks 0 and 1 are presented for each controller.

Read hits and read misses are reported separately for the virtual disks. For a simpler presentation of the virtual disk read activity, in most situations it is advisable to combine read hit and read miss statistics into a single set of metrics for all reads. Depending on what information you need, in many circumstances it is also helpful to combine the data from both controllers for each virtual disk; this procedure provides the complete activity picture for each virtual disk. Request rates and data rates can simply be summed. However, if latencies are to be merged, they must always be combined as weighted averages using the following formula:

combined latency = (rate1 × latency1 + rate2 × latency2 + …) / (rate1 + rate2 + …)

For further analysis, average all request rates and data rates for each virtual disk separately for the measurement interval for selection of those with dominant workloads.

For example, if a given virtual disk reports a read hit latency of 1.5 ms at a request rate of 80 requests/s and a read miss latency of 14.2 ms at a request rate of 327 requests/s, the combined read latency is:

(80 × 1.5 + 327 × 14.2) / (80 + 327) = 4,763.4 / 407 ≈ 11.7 ms

This formula is extensible to any number of elements by summing the products of all request rates and their associated latencies, then dividing by the aggregate of all request rates.

An example of the average workload and performance metrics for the first 17 volumes of the virtual disks is shown in the following example. Each line represents the average of the full data capture period, which enables you to generally compare long-term virtual disk performance metrics and quickly determine which volumes to focus on for further analysis.

Example of virtual disk performance summary over the data capture duration
[Sample output: one line per virtual disk with its workload and performance metrics averaged over the capture period.]

Physical disk statistics
In an EVA, where all disks are members of virtual disks, the value of the physical disk information is usually in providing detailed performance analysis at the disk group level. Because there are so many physical disks in an EVA system, it is difficult to extract meaningful information on an individual physical disk basis. However, if there are different capacity physical disks or different speed disks within a single disk group, individual or drive subgroup reporting might be necessary. Viewing overall data rates in a physical matrix of physical disks by shelf and bay is accomplished easily in the HP Command View EVAPerf software using the CLI pda command. However, combining request rates, queue depths, or latencies must be done manually.
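The weighted-average latency combination described earlier can be sketched as a small function that reproduces the worked example:

```python
def combined_latency_ms(pairs):
    """Request-weighted average latency from (request_rate, latency_ms)
    pairs: sum the rate*latency products, divide by the total rate."""
    total_rate = sum(rate for rate, _ in pairs)
    if total_rate == 0:
        return 0.0
    return sum(rate * latency for rate, latency in pairs) / total_rate

# Read hit 1.5 ms at 80 req/s, read miss 14.2 ms at 327 req/s -> ~11.7 ms
merged_ms = combined_latency_ms([(80, 1.5), (327, 14.2)])
```

The same function serves for merging hit/miss latencies, per-controller latencies, or any other latency streams that carry their own request rates.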

1.0.17.1.6.0.0.33.8.3.99.11. To identify physical disks that have might have disproportionate levels of workload skew.170.1.91.07.0.0.166.1.8.3.256.A 11:53:28.1.0.85.Read Req/s.57.16.1.8.93.B … The previous example shows the first few samples for physical disks 0 and 1 for both controllers.256.16.55.173.2.0.19.1. Physical disk performance data extracted from the output data file Time.9.256.5.2.1.Ctlr 11:41:00.0.1.256.0.17.03.22.1.1.6.21.B … 11:41:00.0.9.168. average all request rates and data rates for each physical disk separately for the measurement interval.151.8.23.11.16.256.85.A 11:53:28.11.18.3.13.146.1.256.1.1.256.Write MB/s.12.11.57.11.1.83. Latencies must be combined using the weighted average formula.12.256.88.0.A … 11:41:00.2.2.3.256.1.ID.0.A 11:49:26.1.0.146.256.1.83.21.21.Bay.0.To obtain performance data on the physical disks.B 11:53:28. The following example shows the contribution from each controller.0.6.0.256.3.16.0.1.0.57.07.256.99.31.59.3.B 11:53:28.50.Read Latency (ms).7.256.1.2.0.187.8.1.7.3.A 11:49:26. each controller presents its statistics separately.22.5.20.B 11:49:26. Again.3.4.4.10.8.8.149.0.30.11. but usually it is preferable to combine controller statistics for each physical disk.14.20.0.0.0.9.1.0.92.174.12.55.1.0.146.256.1.0.1.12.0.A 11:51:27.Enc.8.Write Req/s.16.95.0.0.2.12.3.Grp ID.93.2.0.20. rearrange the data for individual physical disks into time series.0.0.3.16.11.11.1.3.0.31. The next step would be to combine statistics from both controllers for each physical disk so that all activity is coalesced for each line.20.19.A 11:51:27.9.7.10.03.256.5.18.0.0.Read MB/s.7.A … 11:41:00.2.0.Drive Queue Depth.27.1.28.1.1.0.2.95.15.06.16.3.0.05.13.B 11:49:26.0.92.7.0.12. 24 .0.0.B 11:51:27.1.0.0.59.2.10.156.Write Latency (ms).5.0.16.08. Further reduction is discussed in the next section.256.1.11.3.B 11:51:27.2.

HP Continuous Access tunnel statistics

To obtain performance data on the HP Continuous Access tunnels, separate all tunnel records into separate time series files, which will show all active and inactive tunnel statistics in a contiguous file.

Single HP Continuous Access tunnel statistics extracted from the output data file (This should be done for each active tunnel, which is usually a subset of all open tunnels.)
(Sample rows omitted: columns include Time, ID, Host Port, Source ID, Dest ID, Tunnel Number, Round Trip Delay (ms), Write In MB/s, Write Out MB/s, Copy In MB/s, Copy Out MB/s, Write Retries, Copy Retries, Ctlr, and Node.)

Host connection statistics

To obtain performance data on the host connections, separate the queue depth for each host connection into a time series file, one host connection per field.

Host connection statistics arranged as a time series (In this example, three host connections are shown on one line for each timestamp.)

Virtual disk by disk group statistics

The vdg command line option is valuable for capturing controller workloads. This option provides the aggregate virtual disk statistics on a disk group basis and will display the controller workload and performance information. If there is a single disk group behind the controller, then the metrics exactly correlate to controller statistics. Otherwise, the metrics must be summed over all disk groups.

Virtual disk statistics by disk group (The first two systems each have a single disk group, so the data presented is exactly the controller workload. The third system has three disk groups, so controller workloads must be acquired through summation of each disk group's data.)
(Table omitted: for each disk group and controller it lists Total Read Hit Req/s and MB/s, Average Read Hit Latency (ms), Total Read Miss Req/s and MB/s, Average Read Miss Latency (ms), Total Write Req/s and MB/s, Average Write Latency (ms), Total Flush, Mirror, and Prefetch MB/s, Ctlr, and Node.)

Recombining for more information

The time series files can be further manipulated to combine specific object counters to pinpoint performance data at the controller and disk group level.

Storage subsystem

To obtain detailed information on controller activity and storage subsystem activity, recombine the host port data using the following procedure. This will display the controller workload and performance information, including the controller read and write latencies, which must be calculated from the individual host port statistics as part of this step.
1. Beginning with the host port data grouped by timestamp, combine all host port traffic for each controller.
2. Combine the latencies as weighted averages using the formula on page 23.

Recombined host port statistics provide controller-level performance data

In this example, the timestamps have been modified to show elapsed time in minutes instead of absolute wall-clock time. Timestamps are shown only with controller A because the two tables would normally be shown beside each other.

3. To acquire a more complete performance picture for the controllers, combine all virtual disk statistics for each timestamp by controller as shown in the following example. This provides back-end and front-end data and separates read hits from misses. The request rates, data rates, and latencies should agree quite closely with the host port rollups for the same timestamps.
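The host port rollup described in steps 1 and 2 can be sketched as a grouping pass: bucket rows by timestamp and controller, then sum the rates. This is an illustrative Python sketch with assumed field names; latencies are not shown because they must be merged with the weighted average formula rather than summed.

```python
from collections import defaultdict

def rollup_by_controller(rows):
    """Sum host port request and data rates per (timestamp, controller)."""
    totals = defaultdict(lambda: {"Req/s": 0.0, "MB/s": 0.0})
    for r in rows:
        key = (r["Time"], r["Ctlr"])
        totals[key]["Req/s"] += r["Req/s"]
        totals[key]["MB/s"] += r["MB/s"]
    return dict(totals)

rows = [
    {"Time": "09:40", "Ctlr": "A", "Req/s": 1200.0, "MB/s": 40.0},
    {"Time": "09:40", "Ctlr": "A", "Req/s": 800.0, "MB/s": 25.0},
    {"Time": "09:40", "Ctlr": "B", "Req/s": 500.0, "MB/s": 10.0},
]
print(rollup_by_controller(rows)[("09:40", "A")])  # {'Req/s': 2000.0, 'MB/s': 65.0}
```

The same pass, keyed by controller only, works for the virtual disk rollup in step 3.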

Recombined virtual disk statistics provide controller-level performance data

4. If desired, combine these performance matrices further to acquire a single table of array-level performance information.
5. Graph the controller-level data over the time interval of interest.

Disk groups

To determine how heavily the disks are being exercised, it is easier to view the physical disk data aggregated into disk groups, which enables you to determine whether the disk group size is sufficient for the workload. This is most easily accomplished using the CLI pdg command, which groups all activity from all physical disks according to disk group. In the continuous mode, all information for all counters is presented for each sample interval. To view each disk group as a time series, it would be necessary to separate the information for each disk group into separate files or separate fields on one line for each timestamp.

The following example is the output from a single EVA3000 or EVA5000 system with three disk groups. This plain text shows the activity for all disk groups on the system and for each controller providing the workload.

There are two opportunities for extracting additional information. First, in this example all workload is from a single controller, but because that is not always the case, it would be beneficial to combine workloads to view the full workload to a disk group. Secondly, if it is necessary to view the workload separately for drive subgroups, the entire physical disk data must be processed to extract data accordingly.

Performance problems

The first step is to determine if there is indeed a performance problem. This is done from a user's requirements perspective. Some examples are: "My database application is getting storage access time-outs," or "My backup window used to be fine, but now I'm impacting production," or "Users are complaining of long response times on their e-mail." It is far more helpful to define a performance problem in terms of its effect on the user instead of an interpretation of the metrics.

It is not unusual for a storage user to inspect performance statistics and decide that a certain metric is "bad." While some metrics might readily indicate a problem, a judgment of goodness must be made at the system or user level. Examples of non-problems are: "These host port queue depths are way too high," or "The HP Command View EVAPerf software shows lots of read activity on physical drives, but I'm doing only writes to my Vraid 5 volumes." In contrast, heavy proxy controller workloads, high response times at the virtual disk level, or unexpectedly finding virtual disks in write-through mode can all indicate problems with the storage array.

If possible, begin with host-based metrics that quantify the problem, then compare with controller performance metrics. If the host-based metrics that indicate a problem are consistently different from what is seen in the controller metrics, the problem is likely not in the storage system. Examples of nonstorage-related poor performance causes are:
• Host problems with the application.
• Operating system configuration parameters such as maximum transfer size, file system parameters, storage access restrictions, and link speed settings. Driver queue depths, I/O response coalescing, page swapping, and so on can have a profound effect on storage performance.
• Too few paths to the storage system, or paths poorly configured. Each adapter can have impressive performance; current host bus adapters can deliver on the order of 25,000 to 30,000 IO/s and 200 MB/s for reads. But if the SAN fabric is misconfigured, there can still be a performance problem.
While addressing those problems is well beyond the scope of this paper, there are several sources for system performance and tuning information for many applications and their environments.

Reviewing the data

Performance problems can sometimes be identified using the HP Command View EVAPerf software text output data as it is, without further processing. A lengthy time series is not always necessary to identify clear problems such as excessive port imbalances or saturation. It is critical, however, to be mindful that workloads are usually quite dynamic and variable, so what is observed at one time instance might not be representative of the workload character.

Physical disk performance data by disk group
(Table omitted: for each disk group it lists the number of disks and, per controller, the average drive queue depth, drive latency, read req/s, read MB/s, read latency, write req/s, write MB/s, and write latency, with Ctlr and Node columns.)

A reasonable alternative to examining top-level output data is to monitor the same counters in the Perfmon window, which has the added benefit of being able to select specific objects and counters within the performance object. Not all EVA counters are available in Perfmon, but most essential ones are. The Perfmon histogram function enables you to see the time-varying metrics in a stationary position.

At every level of analysis, determine if the reported behavior conforms to what is expected and if the metrics correspond to the user's observations. There are many factors to consider, a few of which are:
• Server activity—Are the appropriate host adapters providing expected relative contributions to workload?
• Virtual disk workload—Are the workloads to each virtual disk appropriate? For example, are sequential reads or a random read/write mix expected? Are transfer sizes in line with the given workload? Does workload variation correspond to known changes in usage patterns? Is each virtual disk active at the appropriate time?
• HP Continuous Access tunnel activity—Do Write Out/In values correlate to the corresponding remote mirror write activity? Are copies active when they should not be, or not active when they should be? Do ping times reflect expected link parameters and behavior?

Host port

The first look is often at the host port information. This provides a top-level view of what is going into and out of the array, and it is easiest to confirm the problem at this level. Verify latencies, request rates, data rates, and queue depths with respect to externally generated performance metrics if performance problem identification began there. Because those metrics, including those from server performance tools, database internal metrics, and fabric monitors, describe the same activity, short-term direct comparison is effective.

In absolute terms, determine the level of activity compared to practical performance limits for the given workload. Awareness of reasonable values can facilitate closing in on a performance problem. A single port has a fixed upper limit of about 200 MB/s based on the 2-Gb Fibre Channel protocol, so data rates close to that might be experiencing the technology limit. If the absolute levels are well below that, the port (link) is not the bottleneck. Single-port data is usually not as informative as the aggregate across the ports, so estimate the controller load from the aggregate of all ports on that controller.

Determine how port workload is balanced and whether there is controller workload imbalance. If imbalance is significant and one controller is approaching some limit, it indicates a rebalancing opportunity; otherwise, any imbalance should not be a problem. While the rebalancing concept is quite straightforward, it will be anywhere from simple to impossible to perform, depending on constraints such as fabric topology, ISL constraints, server workload skew, and virtual disk workload imbalance.

Controller statistics

Controller utilizations show how hard the controller processor is working. The data path utilization reported is how much of the controller processor time is allocated to servicing data requests. The difference between CPU utilization and data path utilization is the overhead the controller is allocating to non-client data transfer activities, such as external communication, sparing and leveling, and virtual volume management. Some of these activities might not be a priority and can be interrupted by data transfer requests from a client, which allows opportunity for further host activity despite the appearance of a very busy controller.

If the processor utilization is low, yet there is a performance problem with the array, then the physical disk storage might be the bottleneck. It is not unusual for CPU utilization and data rates to be inversely related: high data rate sequential workloads can saturate controller data paths yet leave the controller CPU with low utilization.
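The port-level sanity checks described above lend themselves to simple automation. This hypothetical Python sketch uses the roughly 200 MB/s practical limit for a 2-Gb Fibre Channel port cited in the text; the 90% flag threshold is an arbitrary assumption.

```python
# Practical single-port ceiling cited in the text for 2-Gb Fibre Channel.
PORT_LIMIT_MB_S = 200.0

def controller_load(port_mb_s):
    """Aggregate data rate across all host ports on one controller."""
    return sum(port_mb_s)

def saturated_ports(port_mb_s, threshold=0.9):
    """Indices of ports running above `threshold` of the practical limit."""
    return [i for i, mb in enumerate(port_mb_s) if mb / PORT_LIMIT_MB_S >= threshold]

ports = [185.0, 40.0, 120.0, 10.0]   # hypothetical per-port MB/s samples
print(controller_load(ports))        # 355.0
print(saturated_ports(ports))        # [0]
```

A large load spread between ports on the same controller, or between controllers, is the imbalance the text suggests investigating.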

It is important to realize that there are two critical resources within the controller: the processor itself and the data path. The processor manages all activity and communication in the controller, and the data path, which includes elements such as bus bandwidth, cache bandwidth, and fiber port parameters, moves the data. Therefore, the processor will be the critical performance element for small transfer sizes and high request rates, while for large transfer sizes, the data path from server through controller to physical disk will have a greater impact on performance.

Virtual disk

Workloads are most thoroughly defined, and the interaction between workloads most easily shown, at the virtual disk level. Read hit and read miss statistics are reported separately. Read hits usually have low, often sub-millisecond, latencies. Read hits generally occur primarily for sequential read workloads where the sequential detect and prefetch algorithms are highly effective. This can be confirmed by comparing virtual disk read hit data rates with their corresponding prefetch data rates. Because transaction workloads such as database queries and updates usually have a random data access range that far exceeds the read cache capacity, those workloads will see very few opportunities to access data already in cache.

Latencies should be appropriate for the workload. Read latencies less than 15 ms should be acceptable for most random workloads. Large transfer sequential workloads often can have higher latencies.

In write-back mode, writes are always accepted by cache. When cache mirroring is enabled, the request is not marked as complete until the data is in cache in both controllers. At that point the request is acknowledged as complete and the data marked for flush to the back end. Usually, write latencies are low, sometimes sub-millisecond. Write latencies of a few milliseconds indicate that writes to the controller are below its flush capability. When the write workload (or other back-end activity) reaches a level such that cache flushes cannot keep pace with the front-end write demand, the write request must wait for cache space before it can be completed, and write latencies increase. This can also happen when the mirror port utilization becomes very high and cannot copy the data to the other controller in time to support the workload. Double-digit write latencies might still be appropriate, but they indicate either that the virtual disk is in write-through mode (unlikely), that the cache flush mechanism is approaching its limits (more likely), or high utilization on the mirror port. The actual values will be workload-dependent.

Write latencies can also increase unexpectedly on one controller for the battery condition presented in the "Wr Mode" description under the "EVA Virtual Disk performance object" section on page 9. In that case, there will be a controller event logged and viewable in HP StorageWorks Command View.

Virtual disk latencies are not independent of each other. Cache flush operations and mirror port utilization depend only on the aggregate workload for the controller, so one virtual disk's workload can impact another's latencies, sometimes significantly.

Host connection

Host connection queue depth is a measure of the average number of requests in process for each host adapter. This value is not so much a performance metric as it is an indication of how much load each host is offering to the array. A quick look here can verify appropriate server loading balance. High queue depths that are deemed problematic should be traced to the virtual volumes and the back-end systems that support those volumes.

The presence of busies in the host connection metrics is usually an indication of a performance problem. Each one represents a signal from the controller to the host to suspend further requests until some internal queue in the controller is sufficiently reduced. Queue full responses are grouped into this metric; which signal is issued is operating system dependent. The subsequent server response to a busy or queue full signal is also operating system dependent and can affect performance, depending on server application sensitivities.

Mirror ports are active for the owning controller when a write request is received and cache mirroring is enabled for that virtual disk, or when a read request is received by a proxy controller for that virtual disk. Reads from a virtual disk through the proxy controller still require the owning controller to retrieve the information, and it uses a mirror port to transfer the data between controllers. This condition results in elongated read response times and higher mirror port utilization. Thus, the reported mirror port activity for a given virtual disk is the sum of write data to that virtual disk and read activity in support of its proxy counterpart. Writes to either controller require mirror transfers anyway, but the owning controller must still flush the data; a proxy controller will not flush cache for any host write requests. Proxy access to a virtual disk can have a negative impact with high workload intensity. This might not be a problem, but balancing the workload across controllers will require care if proxy workloads are involved.

Behavior can be markedly different for different Vraid levels. Vraid 1 forces duplicate writes, one for each member of the mirror pair, although these writes happen in parallel and asynchronously. Reads will be shared between the two mirror volumes, so there is the potential for Vraid 1 read performance to exceed Vraid 0. Vraid 5 requires parity management, which will behave differently under different conditions. When the volume being written to is a Vraid 5 volume and sequential activity is not detected, the mirror port must carry twice as much data as is written to the host port, because parity must be transferred with the data to ensure data integrity through all failure scenarios. Sequential writes to Vraid 5 volumes have only a minor parity impact, so the write performance will be close to Vraid 0, but non-sequential workloads will endure far more expensive back-end activity.

Flush data rates represent the movement of data from cache to physical disk. For each controller, this will be the aggregate of all data written to any virtual disk owned by that controller. Neither parity nor mirror redundancy data is included in the flush rates. Cache flush is asynchronous with respect to front-end writes, so if data is sampled at short intervals, the data rates from host writes and flush might not agree.

All virtual disks share storage system resources, so large transfer sequential workloads can negatively impact random workload response times on a different virtual disk. The sequential workload can easily drive up internal bus utilization, which can drive up unrelated random small-transfer latencies, and large transfers will take relatively long intervals to complete.

HP Continuous Access EVA source volumes will almost always be impacted by replication activity. For synchronous transfer mode, a write request cannot be completed until the data from the request is stored on the replication system and an acknowledgment returned. Link speeds and link delay will impact this process. Asynchronous mode will benefit light workloads, especially low write workloads, but will become less beneficial as the write workload to a source volume becomes more intense. When the internal link queue is full, the replication mode essentially reverts to synchronous mode. This condition forces queuing for internal system resources. The same effect is seen on mirror traffic for the volume.

Physical disk

It is usually difficult to extract good information from the raw physical disk output data because of the enormity of the data and the short-term variability within a disk group. A visual scan might be effective in identifying a disk or subset of disks that have relatively high queue depths and latencies in sample after sample; otherwise, it takes more aggressive post-processing to glean useful information from the physical disk data.

There is no association between individual virtual disks and particular physical disks in a single disk group. All virtual disk storage is physically allocated across all physical disks in the corresponding disk group, so all physical drives share in all virtual disk activity. Typically, there should be essentially uniform activity across all equal-size drives in a disk group when averaged over longer intervals.

There are conditions under which there can be long-term workload skew to individual drives. One is when there are disks of unequal capacity within the disk group. The larger disks will have proportionately larger I/O workloads because the EVA allocates storage in proportion to disk capacity. For example, on a system with 36-GB disks and 146-GB disks, you will see an average of four times the workload on the 146-GB disks compared with the 36-GB disks. In cases such as this, it would be helpful to examine physical disk subgroups of equal size drives.

Reported physical disk activity includes everything that happens behind the controllers, including parity and mirror redundancy writes as well as parity and data reads for parity calculations. It also includes background activity such as metadata management, sparing and leveling when appropriate, all I/O traffic to support Snapshots and Snapclones, and HP Continuous Access traffic. Mapping physical disk activity to front-end activity ranges from difficult to impossible.

HP Continuous Access tunnel

Look first at the round trip delay values (ping times) for active tunnels. These should be in line with what is expected for the link at the given time and should vary mostly with load. Delays should be essentially symmetrical as reported by the source and destination systems. Low values would be expected for local links with relatively light write workloads to the source volumes; higher values would be expected for heavy write traffic or for ISLs with long intrinsic delay times. Copy In and Copy Out values, as well as Write In and Write Out values and any write retries, along with ping time, provide a first look at ISL health.

The parameters and behavior of the inter-site link have a non-trivial effect on performance. A low speed link can limit the rate at which data is transferred to the replication system. If the write workload demand is greater than that rate, the link becomes the bottleneck, and this in turn is reflected directly in virtual disk latencies. The EVA controller architecture provides far better burst capability than the link itself, so it will experience high queuing for HP Continuous Access tunnel resources with slower links and highly intermittent workloads. Bursts of write activity can result in high latencies even if the long-term link utilization is low.

Write Out data rates include all write traffic to the corresponding remote mirror volumes and merge data from an HP Continuous Access EVA log on the source side. Merge data is "catch up" data from a temporary link disruption. Comparing this value to the write data for all source volumes, the two should be essentially the same except for deviations caused by short sample intervals or other timing anomalies.

Copy values will be 0 for all HP Continuous Access EVA groups in which the member volumes are fully normalized. Copy Out data should be 0 for all volumes that are fully normalized between source and destination sites, unless a link disruption has triggered a full copy. Non-zero data rates here indicate that normalization is not complete or that a new copy has been triggered by a previous link failure and subsequent restoration or by a full copy request.

Controller workload

To analyze controller workload, it is necessary to combine host port data or virtual disk data to acquire all of the relevant information on each controller. The former is easy to do either directly with a post-processing tool or by reorganizing port data as described on page 27 and then summing the data for the corresponding controller. Extracting controller-level data from virtual disk data provides a richer set of information. Read hits and misses will be tallied separately in the virtual disk rollup but can easily be combined by the processing tool. Both front-end and back-end activity should be reviewed in a virtual disk rollup per controller. The front-end information should correlate well with the host port rollup for request rates, data rates, and latencies. If there is a single disk group in the target EVA, controller information can be viewed directly using the vdg command.

Front-end activity gives a clear and detailed picture of the workload presented by all servers attached to the storage array and its response to that workload, which would include activity such as write-ins or copy-ins from an HP Continuous Access EVA source system or writes on behalf of the proxy controller. If transfer sizes are calculated, they can add supporting information for sequential and random workload identification.

Back-end activity includes prefetch, mirror, and flush data rates. Prefetch rates should correlate well with read hit rates. If EVA4000, EVA6000, or EVA8000 mirror rates are high compared with write rates for a controller, it could indicate high proxy read activity. If mirror rates are low compared with write rates, it could indicate write workload to volumes that have cache mirroring disabled.

Flush rates are designed to be adaptive based on the write workload intensity. A time series for a system with light write workloads will show intermittent flush activity. As the total write activity to the controller increases, flush rates become higher and steadier. When flush rates are consistently equal to the aggregate write rates, the maximum write rate to that controller has probably been achieved. At that point, write latencies might increase significantly.

Workload balance between controllers is clearly shown, and charted over time, if the time series data reduction is done. Host port rollup to the controller level clearly shows balance between controllers, and virtual disk rollup per controller provides back-end information as well. Controller balance should be a concern if one controller is nearing the upper limit for the current workloads. This is more often a consideration for data rates, as opposed to requests per second, because it is much easier to drive the internal controller buses into saturation than to exceed the capability of a controller to process requests.

One configuration in which controller balance can be an issue is an HP Continuous Access EVA implementation. Because all volumes in a single group must be presented to the same controller, all activity for those virtual disks, including replication traffic, must be handled by that single controller. In that case, some definite compromises on array performance might exist. This is not necessarily a problem except under extreme cases of skew and workload.

Disk group

The most common metric for disk groups is the workload rates averaged over all disks in the disk group, which enables you to quickly determine how heavily loaded the disk group is. As is the case for all metrics, if a time series chart is created, you can monitor the dynamic attributes of the disk group load. The workload skew across drives within a disk group is good information to have under the heterogeneous disk group conditions described on page 32.

Performance base lining

The capability to capture, process, and preserve workload and performance metrics for a storage array can be invaluable when encountering problems later or when modifications are considered. This process requires running the HP Command View EVAPerf software at regular intervals determined by the user and archiving the data. The essence of this practice is to capture all performance logs of interest across the time intervals of interest for an uninterrupted period. The benefits of this practice are many, a few of which are mentioned here.
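Transfer sizes mentioned above are derived rather than reported directly: divide the data rate by the request rate for the same counter and interval. A minimal sketch (treating 1 MB as 1024 KB is an assumption, not something the tool specifies):

```python
def transfer_size_kb(mb_per_s, req_per_s):
    """Average transfer size per request, in KB, from paired rate counters."""
    if req_per_s == 0:
        return 0.0
    return mb_per_s * 1024.0 / req_per_s

# 40 MB/s at 640 req/s implies 64 KB average transfers
print(transfer_size_kb(40.0, 640.0))  # 64.0
```

Large derived transfer sizes suggest sequential activity; small ones suggest random, transaction-style I/O.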

If only archival information is required in the event of a problem, then the data can be stored in raw form. If periodic (daily, for example) review of the information will be done, then immediate post-processing of the data would be required. You should be aware that, if correlation is to be made with other reporting mechanisms (for example, Command View event logs, iostat, or Perfmon server metrics), the time references for all host systems must be close, if not synchronized.

Normal operation
• Provide historical workload and performance characteristics of the target system. This establishes a baseline of the system while it is healthy and therefore provides a reference for when it is not.
• Periodic patterns can be determined from analysis of long-term data. Establishing a daily pattern is essential for characterizing the workload trends and variation throughout the day. Weekly patterns also help to distinguish abnormal behavior from normal variations. Even monthly patterns are useful in some cases.
• Long-term trends can be identified as relevant factors change, some of which are storage occupancy (more data on the subsystem), workload growth (more users on the e-mail system), and environment variables (more incident traffic on a shared ISL).
• Current array performance capacity assessment is essential when determining the need and character of potential upgrades. For example, additional physical drives will not help when the dominant workload is large transfer sequential I/O.

When something goes wrong
• Quantify the performance change when a problem is observed by comparing the current state against previous activity.
• Reference data enables you to identify the moment when a problem began to appear. This facilitates correlation with other events. Determining if there is an associated workload change at the same instant as the performance metric change is invaluable.
• If performance degrades, the data enables you to determine if there is an associated change in workload. If latencies increase, does the request rate increase or decrease? The introduction of a sequential workload on one controller can affect request latencies on the other.
• If the problem is intermittent or periodic, historical sampling enables you to determine precisely when the problem happens and when it does not.

Comparison techniques
• Process control charts are commonly used in a manufacturing environment to track some specific set of parameters. This technique employs means and standard deviations to determine if a process is within acceptable limits, and the graphs easily expose trends.
• With or without process control charts, sometimes you might want to set thresholds for visual or automated detection of a performance issue. Creating upper thresholds for request rates or data rates is problematic because higher is usually better. Lower limits are difficult because most workloads become inactive part of the time, experience periods of less than normal demand, or have high variability. Latencies tend to be the most reliable predictor for many workloads because lower is always better and there are common guidelines from many applications for that parameter. Queue depths might realistically employ an upper limit if the workload is well documented and reasonably predictable.
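A control-chart style check can be sketched directly from a baseline latency series. This Python sketch uses mean plus or minus three standard deviations, the classic Shewhart limits; the baseline values are invented for illustration.

```python
from statistics import mean, stdev

def control_limits(samples, k=3.0):
    """Lower and upper control limits: mean -/+ k standard deviations."""
    m, s = mean(samples), stdev(samples)
    return m - k * s, m + k * s

def out_of_control(samples, value, k=3.0):
    """True if `value` falls outside the baseline control limits."""
    lo, hi = control_limits(samples, k)
    return value < lo or value > hi

# Hypothetical baseline of daily average read latencies (ms)
baseline = [11.0, 12.5, 10.8, 11.9, 12.1, 11.4, 12.0, 11.6]
print(out_of_control(baseline, 25.0))  # True
print(out_of_control(baseline, 12.2))  # False
```

As the text notes, such limits suit latencies best; applying them to request or data rates is problematic because higher values are usually better.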

© 2005 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation.

5983-1674EN, Rev. 1, 06/2005