Symmetrix Performance Workshop Lab Guide1 PDF
Workshop
April 2012
Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license.
EMC2, EMC, Data Domain, RSA, EMC Centera, EMC ControlCenter, EMC LifeLine, EMC OnCourse, EMC
Proven, EMC Snap, EMC SourceOne, EMC Storage Administrator, Acartus, Access Logix, AdvantEdge,
AlphaStor, ApplicationXtender, ArchiveXtender, Atmos, Authentica, Authentic Problems, Automated
Resource Manager, AutoStart, AutoSwap, AVALONidm, Avamar, Captiva, Catalog Solution, C-Clip,
Celerra, Celerra Replicator, Centera, CenterStage, CentraStar, ClaimPack, ClaimsEditor, CLARiiON,
ClientPak, Codebook Correlation Technology, Common Information Model, Configuration Intelligence,
Configuresoft, Connectrix, CopyCross, CopyPoint, Dantz, DatabaseXtender, Direct Matrix Architecture,
DiskXtender, DiskXtender 2000, Document Sciences, Documentum, elnput, E-Lab, EmailXaminer,
EmailXtender, Enginuity, eRoom, Event Explorer, FarPoint, FirstPass, FLARE, FormWare, Geosynchrony,
Global File Virtualization, Graphic Visualization, Greenplum, HighRoad, HomeBase, InfoMover,
Infoscape, Infra, InputAccel, InputAccel Express, Invista, Ionix, ISIS, Max Retriever, MediaStor,
MirrorView, Navisphere, NetWorker, nLayers, OnAlert, OpenScale, PixTools, Powerlink, PowerPath,
PowerSnap, QuickScan, Rainfinity, RepliCare, RepliStor, ResourcePak, Retrospect, RSA, the RSA logo,
SafeLine, SAN Advisor, SAN Copy, SAN Manager, Smarts, SnapImage, SnapSure, SnapView, SRDF,
StorageScope, SupportMate, SymmAPI, SymmEnabler, Symmetrix, Symmetrix DMX, Symmetrix VMAX,
TimeFinder, UltraFlex, UltraPoint, UltraScale, Unisphere, VMAX, Vblock, Viewlets, Virtual Matrix, Virtual
Matrix Architecture, Virtual Provisioning, VisualSAN, VisualSRM, Voyence, VPLEX, VSAM-Assist,
WebXtender, xPression, xPresso, YottaYotta, the EMC logo, and where information lives, are registered
trademarks or trademarks of EMC Corporation in the United States and other countries.
All other trademarks used herein are the property of their respective owners.
Copyright 2012 EMC Corporation. All rights reserved. Published in the USA.
Step Action
https://hostname:8443/spa
If the browser warns you about security or certificate issues, accept the warning and
continue so that the program can launch.
Log in using the user and password your instructor has provided, or with the default:
It may take up to a minute after the initial launch for the program to become active. Be
patient.
Note that the default view when you log in to SPA is the default Dashboard Heat Map.
Examine the Dashboard and identify the components that are actively performing work.
a. Record any components that are color coded red indicating 100% utilization.
DA Directors: ______________________________________
Disks: ____________________________________________
DA Directors: ______________________________________
Disks: ____________________________________________
Select the Snapshot view and click on the local Symmetrix on the left.
Use the Utilization Distribution to find out what the overall condition of the Symmetrix
components is. Do any of the components have greater than 25% utilization? If so, which
components (Hint: click on the bar in the snapshot to show more details)?
FA Dirs: ______________________________
Cache: ______________________________
BE Dirs: _____________________________
FA: _________________________MB/s
Are there any Device and/or Storage groups listed in the Snapshot view? Use the Device
Groups I/Os Per Second & Response Time Distribution to find the device group (do not
include ungrouped devices) having the highest utilization. What rating is the group's
response time?
Use the navigation panel on the left to locate and click on the Storage Group that you
identified in the previous part. Make sure you are still viewing Snapshots.
Use the Storage Group Profile to find the I/O profile for this group:
Use the navigation panel on the left to locate and click on the Storage Group that is
Virtually Provisioned. Make sure you are still viewing Snapshots.
Use the Storage Group Profile to find the I/O profile for this group:
Open the sub-folders under the Symmetrix array in the navigation pane on the left and
click on the FE Directors folder. Make sure you are still viewing Snapshots.
Identify the Director(s) that are performing the most work (IOs Per Second and
Utilization). _________________________________
Identify the Port(s) that are performing the most work (MBs Per Second)._____________
Open the FE Director folder and choose just one front-end director identified in
step(6). Make sure you are viewing Snapshots.
What is the IO contribution (%) of this director to the overall workload on the Symmetrix:
______________________
What is the Throughput contribution (% and MB) of this director to the overall workload
on the Symmetrix: _____________________________
Are both front-end ports on this director equally utilized? ___________________
Click on the BE Directors folder, and make sure you are viewing Snapshots.
Use the Snapshot views to monitor the balance of the back-end directors.
Can you detect similar patterns of usage across two or more directors?
_______________________
If you detected imbalance on the front-end directors, but mostly balance on the back-
end directors, what does this say about the arrangement of data in the array?
Click the Disks folder, and make sure you are viewing Snapshots.
Use the Snapshot views to determine if any of the disk groups have an average utilization
of more than 25%:
_________________________________________________________________________
Do any of the disk groups have a peak utilization of more than 25%? ______________
Drill down under the Disks folder to click on a single disk group. Are the disks in the
group being utilized fairly evenly? __________________________________________
Compare a few of the disks in your group to see if they have similar traffic patterns.
10 Diagnostic view.
Select Diagnostic view. Observe the overall dashboard that is presented for the
Symmetrix.
Double-click on the Symmetrix and all the available Diagnostic view tabs are presented.
Explore the different components.
Select the FE Director tab. Then select the FE identified as busy in the earlier part of this
exercise. Switch to the Explore tab. Select both the Average Read Response Time and
Average Write Response Time and plot the graph.
Diagnostic view can help in quickly identifying changes that have occurred recently.
Select Real Time view. The overall metrics for the Symmetrix over the past hour are
displayed with finer granularity.
Select the Symmetrix. Then select FE Reqs/sec and CTRL select BE Reqs/sec. Plot the
graph.
Double-click on the Symmetrix. This will provide three tables for FE, BE and RDF
Directors. Take some time to explore the metrics available for these components in the
Real Time view.
Step Action
Click the Work Offline box in the login dialog and then OK to launch the
program.
At the Data Selection dialog, choose the Specific File radio button and
browse to locate and open the archive used in this part (listed at the start of the
exercise).
Click on the Vital Signs icon, and answer these questions using the graphs. You
might use the Window > Cascade or Window > Tile menus to view the Vital Signs
better.
What was the maximum I/O rate to the Symmetrix during the collection period?
_________
What was the maximum % Hit during the collection period? ________
Does the % Write tend to correspond with the % Hit measure? ________
3 Use the Window > Close All menu to close all the views.
Step Action
1 Launch Performance Manager using the desktop icon or the Windows Start
menu.
2 In the Data Provider dialog, Choose Symmetrix from the Class drop down
menu, or type in the word if it is not an available choice.
Click OK to save the selection and exit the Data Provider dialog.
If you did this correctly, a list of Symmetrix archives should appear in the Data
Providers dialog. If not, use the dialog to delete any mistakes and try again.
3 Use the Data Provider and Archives parts of the Data Selection dialog (not
the Specific File part) to launch the archive used in this part. You can now
retrieve any archives used in the class using the Data Provider and Archives
part of the Data Selection dialog.
You may leave this archive open for the next Part.
Symmetrix/000194900180/interval/20091124.btp
Step Action
1 Launch Performance Manager and open the archive used with this part, if you have
not already done so.
2 Plot the All Fibre Directors IO Rates graph. This graph (View) can be found in the
Dir-Fibre Folder. By default this graph will be plotted as a Line graph.
(a) Select the graph you opened (make sure its title bar is not gray) and click
the Graph Wizard icon ( ).
(c) Change the Fill Style for the Legend, Display, and Title, to None.
(f) Close the graph and then launch it again. Were the changes you made
retained? __________
Explore other features of the Graph Wizard throughout the rest of the lab
exercises.
(a) Right-click on the System folder and select New Data View.
(b) In the Data View Definition dialog, leave the Class as Symmetrix and
change the Identifier to an asterisk (*). What effect does changing the
Identifier to an asterisk have?
____________________________________________________________
(c) Select Dir-DA as the Category, ios per sec for the Metric, for All objects.
Check the Sum Across Selected Objects and click the Add to Contents
button.
(f) Enter a Name (this will appear under the Systems folder when the
definition is created), and a Title. If you are connected to a Repository,
check the Public Views box and click OK. What functionality does checking
the Public Views box provide?
_____________________________________________________________
___
(h) The Ribbon format does not add any utility when the two lines are so
dissimilar. Right-click on the View, choose Modify Data View, change the
Graph Style to Line, and click OK.
6 Metrics Tab.
(a) Click on the Dir-Fibre folder in the top left panel. Select any director. In
the metrics panel, click on ios per sec and CTRL-click on requests per
sec.
(b) Click on the Graph per Object button ( ). The resulting graph now
displays both ios per sec and requests per sec.
(d) In the Save Graph as Data View dialog, check both the Generalize Class
IDs and Generalize Objects boxes, and give it a name.
(f) What category in the Views tab did the new view appear under? Why?
(a) Launch the System View Total Throughput to-from hosts view (or any
similar line graph view) from the Views tab.
(b) Right-click on any part of the graph and enable Single Point Analysis mode.
(c) Select the Metrics tab and choose the Dir-Port folder in the top panel.
(d) Shift-click to choose all of the director ports that show a non-zero I/Os per
second value in the middle panel.
(e) Click on the ios per sec measure in the bottom panel.
(f) Click the Graph histogram from sorted metric button ( ). The resulting
histogram shows the I/O per second of each port at a given time in the time
graph.
(g) Click the time graph at the point of peak throughput. The histogram will
show the status of those objects at that time.
Record the top two ports and their I/O per sec at this time:
____________________________________________________________________
Experiment with this feature by adding another histogram from the Metrics tab
and changing the time.
When you are done experimenting, you can close all the graphs you created to end
Single Point Analysis mode.
8 Are the values in the graphs sometimes difficult to read? Open any Views or
Metrics graph, and then use the Table option from the Tools menu to show
the time and data points that make up the graph. Try this now.
You can copy the data out of a table to spreadsheet or text editor, or you can just
save the graph as a CSV file. Select the graph and use File > Save as csv > Graph
to save the data. View the output in a spreadsheet program or Notepad.
Try the File > Save Graph as menu to save an image of the graph. Double-click
the image to view it.
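Once a graph has been saved as CSV, the averages and peaks asked for in later parts can also be computed outside Performance Manager. The following is a minimal Python sketch, not a description of the actual export format: it assumes the file has a header row, a time label in the first column, and one numeric column per plotted metric. The function name `summarize_csv` and the sample data are hypothetical.

```python
import csv
import io


def summarize_csv(handle):
    """Return {column: (average, peak, time_of_peak)} for each numeric column.

    Assumed layout (not confirmed by the lab guide): header row first,
    time label in column 0, numeric samples in the remaining columns.
    """
    reader = csv.reader(handle)
    header = next(reader)
    samples = {name: [] for name in header[1:]}
    for row in reader:
        if len(row) != len(header):
            continue  # skip blank or malformed lines
        for name, cell in zip(header[1:], row[1:]):
            try:
                samples[name].append((float(cell), row[0]))
            except ValueError:
                pass  # ignore non-numeric cells
    summary = {}
    for name, points in samples.items():
        if not points:
            continue
        values = [value for value, _ in points]
        peak, peak_time = max(points, key=lambda p: p[0])
        summary[name] = (sum(values) / len(values), peak, peak_time)
    return summary


# Hypothetical export with two metrics sampled at three times.
example = io.StringIO(
    "time,ios per sec,requests per sec\n"
    "13:00,100,150\n"
    "13:15,300,450\n"
    "13:30,200,300\n"
)
for metric, (avg, peak, when) in summarize_csv(example).items():
    print(f"{metric}: average {avg:.1f}, peak {peak:.1f} at {when}")
```

To use it on a real export, pass an open file instead of the `io.StringIO` sample, and adjust the column handling if the saved file is laid out differently.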
Symmetrix/000284500356/daily/20021215d.btp
Workload Characteristics
Examine the archive used in this part and record the following information about the Workload Characteristics: the
basic measures for the overall system. Use any graph from the Views or Metrics tab of Performance Manager to get
your answers. To get an average number for a measure that is constantly changing, display the measure as a
histogram.
Average Back-End I/O per second (sum I/O per second over all disk directors)
Examine the archive to detect usage changes over time. Consider the measures you looked at in the previous
part, especially the I/O per second measure. Are there noticeable changes in the characteristics at certain
times? Do your best to identify the time periods that differentiate the workload, and record them here. Also
note what evidence led you to conclude that a change in the traffic had happened here: increased I/O per
second, change in write percentage, etc.
Component Usage
Examine the archive to determine what components are being utilized. Answer each of the following
questions by examining the I/O per second measure for the indicated components.
Step Action
2 Are any of the front end ports grouped? Do they seem to be sharing the
workload for one I/O stream? If so, list the ports:
4 List any physical disks that are mostly unused (if there are many, just write the
count out of the total number: 10/30):
5 Are any of the disks grouped? Do they seem to be sharing the workload for
one I/O stream? If so, list the disks (if there are many, just write the count out of
the total number: 10/30):
6 Are any of the devices grouped? Do they seem to be sharing the workload for
one I/O stream? If so, list the devices (if there are many, just write the count out
of the total number: 10/30):
IOSize: Symmetrix/000184501731/analyst/iosize02.btp
This Part uses the IOSize archive. This archive is not real-world data, nor is it intended to simulate any real-
world environment. It is simply a test of the effects of I/O characteristics on Fibre Channel and SCSI ports.
The activity was generated by single-threaded I/O generation programs (one for each device) that have only
one task: trigger a new I/O request immediately after the completion of the previous one. Four distinct
traffic cycles were issued, each signified by a change in one of the I/O characteristics.
Examine the archive and fill in the following table regarding the Symmetrix-wide activity.
Host Port:______________________________
Throughput
Host Port:______________________________
Throughput
Since these applications are simply waiting to send a new I/O request after the completion of the previous
one, any increase in traffic rate or volume indicates a performance improvement; any decrease, a reduction
in performance.
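When comparing the cycles, it can help to keep in mind how I/O rate and data throughput relate: throughput is the I/O rate multiplied by the I/O size. The Python sketch below illustrates the arithmetic only. The cycle names, rates, and sizes are hypothetical and are not taken from the IOSize archive, and the helper name `throughput_mb_per_sec` is an assumption made for illustration.

```python
def throughput_mb_per_sec(ios_per_sec, io_size_kb):
    """Data throughput in MB/s for a given I/O rate and I/O size (1 MB = 1024 KB)."""
    return ios_per_sec * io_size_kb / 1024


# Hypothetical cycles: (label, I/O per second, I/O size in KB).
cycles = [
    ("small I/O", 8000, 4),
    ("large I/O", 1000, 64),
]
for label, rate, size_kb in cycles:
    mb = throughput_mb_per_sec(rate, size_kb)
    print(f"{label}: {rate} I/O per sec x {size_kb} KB = {mb:.2f} MB/s")
```

In this made-up example the cycle with the lower I/O rate moves more data per second, which is why both measures should be checked before drawing a conclusion.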
How do you explain the wide difference in I/O rate across the four cycles?
When was the actual data throughput at its highest: when the I/O rate was at its lowest or at its highest?
The applications used to generate the activity on both ports are the same. Suggest a reason why the
performance on both ports is not identical.
This archive was recorded at a site that uses Solaris hosts to support an Email application. The application is
replicated to a disaster recovery site using four SRDF RA1 ports. Another four RA2 ports receive data from the
disaster recovery site.
Step Action
1 Look at the Dir-Fibre > All Fibre Directors IO Rates View (in the Views tab).
Do the hosts appear to be attached by 2 or more balanced ports? If so, list the port
pairings: ______________________________________________________________
Since director utilization is calculated from the I/O per second measure, the director
causing the highest utilization must be the one with the peak I/O rate that you
recorded earlier. From this information, determine how many I/O per second the
director would have to be processing to reach 100% utilization:
___________________________
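The extrapolation asked for here can be sketched as a simple proportion. This Python example assumes that utilization scales linearly with the I/O rate, which follows from the statement above that utilization is calculated from the I/O per second measure; real directors may not behave linearly near saturation. The function name and the sample numbers are hypothetical.

```python
def iops_at_full_utilization(observed_iops, utilization_pct):
    """Extrapolate the I/O rate at 100% utilization, assuming a linear relation."""
    if not 0 < utilization_pct <= 100:
        raise ValueError("utilization_pct must be in (0, 100]")
    return observed_iops * 100 / utilization_pct


# Hypothetical reading: a director at 40% utilization while serving 6,000 I/O per second.
print(iops_at_full_utilization(6000, 40))  # 15000.0
```

The same proportion applies to the port throughput question later in this part, with MB per second in place of I/O per second.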
4 Turn to the Metrics tab and plot the throughput for all of the Fibre ports.
What is the peak throughput of all ports, and when does it occur?
_________________
How much throughput would the port have to be processing to reach 100%
utilization?
_____________________________________________________________________
How can the port with the peak utilization not be attached to the director with the
peak utilization?
____________________________________________________________
Plot the measure that supports your answer. What did you plot?
______________________________________________________________________
This archive was taken from an array using 72GB RAID-1 drives. To help visualize the effects of adding a new
database application, a load operation (write, read back and verify) was performed while this archive was
created to capture the performance. Your analysis will explain the effects to the IT group and help them plan
for future applications to be added to the array.
Analyze the front-end performance of the Automobile Design archive. Thoroughly examine any issue (good
or bad performance) related to the topics discussed in this module. Put off detailed analysis of the other
components until later in the class; you will revisit this archive once you finish the other modules. Be
prepared to back up any claims with appropriate evidence.
You might want to start with a complete characterization as outlined during the Characterization exercise you
did previously. This will give you some idea of the basic environment of this archive, including the general
workload characteristics and component usage.
Following this, observe the Roadmap measures outlined in the module to detect issues and problems.
Finally, look for any minor issues discussed in the module (off-roadmap topics).
Here are some questions you should be able to answer once you have done your analysis:
4 What is the best strategy for adding new applications with regard to the front
end hardware?
24/7: Symmetrix/000190100720/interval/20080312.btp
This archive shows a full-sized DMX array that is used for a variety of applications around the clock. In this
growing environment, the IT staff would be very interested in any recommendations for adding new
applications to the array.
Analyze the front-end performance of the archive. Thoroughly examine any issue (good or bad performance)
related to the topics discussed in this module. Put off detailed analysis of the other components until later in
the class; you will revisit this archive once you finish the other modules. Be prepared to back up any
claims with appropriate evidence.
You should be able to answer these questions once you have finished your analysis:
4 What is the best strategy for adding new applications with regard to the front
end hardware?
5 If there are any issues, what recommendations do you have for resolving them?
Random: Symmetrix/000184501731/analyst/random01.btp
This Part uses the Random archive. This archive is not real-world data, nor is it intended to simulate any
real-world environment. It is simply a test of the effects of I/O characteristics on Fibre Channel and SCSI
ports. The activity was generated by single-threaded I/O generation programs (one for each device) that
have only one task: trigger a new I/O request immediately after the completion of the previous one. Three
distinct traffic cycles were issued, each signified by a change in one of the I/O characteristics.
Examine the archive and fill in the following table regarding the Symmetrix-wide activity.
Throughput
Since these applications are simply waiting to send a new I/O request after the completion of the previous
one, any increase in traffic rate or volume indicates a performance improvement; any decrease, a reduction
in performance.
1 How do you explain the differences in I/O rate across the three cycles? What
metrics support your explanation?
This archive was recorded at a site that uses Solaris hosts to support an Email application. The application is
replicated to a disaster recovery site using four SRDF RA1 ports. Another four RA2 ports receive data from the
disaster recovery site.
Step Action
From this view alone, answer this question: Is the write hit percentage equal to
100%?
What is the peak total I/O rate, and when does it occur?
________________________
Which is the predominant I/O type for this array: reads, sequential reads, or writes?
3 Use the Metrics tab to plot the slot collisions. Generate a graph that shows the
total slot collisions across the whole array.
4 Plot the Device>System bus Kbytes per sec for all devices, and view the result
in an Area graph.
What is the peak internal system throughput, and when does it occur?
_____________
5 Use the Metrics tab to plot the System>system max wp limit and
System>number write pending tracks on the same graph.
What is the system write pending limit for this array? _________________
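When reading the two lines on this graph, it can be useful to express the write pending count as a percentage of the limit. The Python sketch below shows only that arithmetic; the function name and the sample track counts are hypothetical and do not come from the archive.

```python
def write_pending_pct(write_pending_tracks, max_wp_limit):
    """Write pending tracks as a percentage of the system write pending limit."""
    if max_wp_limit <= 0:
        raise ValueError("max_wp_limit must be positive")
    return write_pending_tracks * 100 / max_wp_limit


# Hypothetical sample: 30,000 write pending tracks against a limit of 120,000 tracks.
print(write_pending_pct(30000, 120000))  # 25.0
```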
6 Plot the Device>write pending count for all devices on the array. Since this is a
pre-DMX-2 array, use the technique discussed in class to determine the base
device write pending limit.
What is the base device write pending limit for this array?
_______________________
7 Plot the Dir-Fibre>ios per sec and Dir-Fibre>requests per sec for each of the
Fibre directors individually; if you plot several directors in the same graph, it
may be too difficult to match the measures up.
What is the approximate ratio of requests / I/Os for the busiest (highest I/O) pair
of directors? _______________________________
Is large I/O size a factor in increasing the number of requests in this case?
__________
What pair of directors shows the highest (peak) requests to I/O ratio?
______________
Is large I/O size a factor in the large number of requests for these directors?
_________
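The ratio questions in this step come down to dividing requests per second by I/Os per second for each director and comparing the results. A minimal Python sketch follows; the director labels and rates are hypothetical, and `requests_per_io` is a name chosen for illustration.

```python
def requests_per_io(requests_per_sec, ios_per_sec):
    """Ratio of requests to I/Os; None when there is no I/O to compare against."""
    if ios_per_sec == 0:
        return None
    return requests_per_sec / ios_per_sec


# Hypothetical peak readings: director -> (requests per sec, ios per sec).
readings = {
    "director A": (1200, 1000),
    "director B": (900, 300),
    "director C": (0, 0),
}
ratios = {
    name: requests_per_io(req, ios)
    for name, (req, ios) in readings.items()
}
active = {name: r for name, r in ratios.items() if r is not None}
highest = max(active, key=active.get)
print(ratios)
print("highest requests-to-I/O ratio:", highest)
```

Note that in this made-up data the busiest director by I/O rate is not the one with the highest ratio, so the two questions may have different answers.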
This part uses the One Host archive. In this simulated environment, three hosts are connected to a single
Symmetrix. Each has a different application profile, and is maintained by a different application
administrator. At around 13:30 of this day, sun220 experienced a prolonged slowdown. Average response
time increased by 40% on this already heavily utilized server, causing a corresponding reduction in the
number of records processed. You will be able to identify this period by observing the drop in I/O rate for
sun220.
Analyze the archive and report your findings back to the administration team so that they are aware of the
cause of the slowdown and any potential solutions.
1 Describe the event (change in traffic issued from hosts) that caused the
performance problem:
3 Why did none of the other hosts connected to the same array report
performance issues during this period?
This archive was taken from an array using 72GB RAID-1 drives. To help visualize the effects of adding a new
database application, a load operation (write, read back and verify) was performed while this archive was
created to capture the performance. Your analysis will explain the effects to the IT group and help them plan
for future applications to be added to the array.
Analyze the cache performance of the Automobile Design archive. Thoroughly examine any issue (good or
bad performance) related to the topics discussed in this module. Put off detailed analysis of the other
components until later in the class; you will revisit this archive once you finish the other modules. Be
prepared to back up any claims with appropriate evidence.
You began analyzing the front-end performance of this archive in the previous Lab Exercise. Revisit your
findings. Then refer to the Roadmap measures to begin your analysis and consider the issues discussed in the
module.
Here are some questions you should be able to answer once you have done your analysis:
Step Action
4 If there are any issues, what recommendations do you have for resolving them?
24/7: Symmetrix/000190100720/interval/20080312.btp
This archive shows a full-sized DMX array that is used for a variety of applications around the clock. In this
growing environment, the IT staff would be very interested in any recommendations for adding new
applications to the array.
Analyze the system and cache performance of this archive. Thoroughly examine any issue (good or bad
performance) related to the topics discussed in this module. Put off detailed analysis of the other
components until later in the class; you will revisit this archive once you finish the other modules. Be
prepared to back up any claims with appropriate evidence.
You should be able to answer these questions once you have finished your analysis:
4 If there are any issues, what recommendations do you have for resolving them?
This archive was recorded at a site that uses Solaris hosts to support an Email application. The application is
replicated to a disaster recovery site using four SRDF RA1 ports. Another four RA2 ports receive data from the
disaster recovery site.
Step Action
Would you say that any of the directors is heavily utilized? _________
The average line is very close to the max line in this graph. What does that
mean about the variance between the directors (how different are their
individual utilizations)?
___________________________________________________________
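The relationship between an average line and a max line can be explored numerically. This Python sketch compares two hypothetical sets of director utilizations taken at one point in time; the values and the function name `utilization_spread` are illustrative assumptions, not readings from the archive.

```python
from statistics import mean, pstdev


def utilization_spread(utilizations):
    """Return (average, maximum, max minus average, standard deviation)."""
    avg = mean(utilizations)
    peak = max(utilizations)
    return avg, peak, peak - avg, pstdev(utilizations)


# Hypothetical utilization percentages for four directors at one point in time.
examples = {
    "similar directors": [30, 32, 31, 29],
    "dissimilar directors": [5, 10, 60, 15],
}
for label, values in examples.items():
    avg, peak, gap, spread = utilization_spread(values)
    print(f"{label}: avg {avg:.1f}, max {peak}, gap {gap:.1f}, std dev {spread:.1f}")
```

A small gap between the maximum and the average goes together with a small standard deviation, because no single director can sit far above the rest without pulling the maximum away from the mean.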
Does the traffic on the DAs generally correlate with the front end traffic? Are
the peak and valley times roughly the same? ______________
What is the maximum utilization of any disk, and when does it occur?
______________
4 Use the Metrics tab to show a histogram of the Devices>total ios per sec for all
active devices.
5 Use the Metrics tab to show the Disks>total SCSI commands per sec.
6 Pick the most heavily utilized disk, and plot the 6 measures which are summed
to generate the total SCSI commands per sec measure all on the same graph.
Do the same with the least heavily utilized disk.
What are the top two measures that make up most of the workload for the
heavily utilized disk?
___________________________________________________________
What is the top measure that makes up the workload for the lightly utilized disk?
_________________________________________________________________
Is XOR activity a large factor in the performance of any of the array's disks?
________
7 Plot the Dir-DA>prefetched tracks per sec, Dir-DA>tracks not used per sec
and Dir-DA>tracks used per sec on the same graph for one of the disk
directors.
How can the tracks used per sec be higher than the prefetched tracks per
sec?
______________________________________________________________
This archive was taken from an array using 72GB RAID-1 drives. To help visualize the effects of adding a new
database application, a load operation (write, read back and verify) was performed while this archive was
created to capture the performance. Your analysis will explain the effects to the IT group and help them plan
for future applications to be added to the array.
Analyze the back-end performance of the Automobile Design archive. You have already examined the front-end
and cache performance of this archive; now examine any issue (good or bad performance) related to the
topics discussed in this module. Refer to the back-end road map for guidance. Be prepared to back up any
claims with appropriate evidence.
Here are some questions you should be able to answer once you have done your analysis:
1 Are any back end directors and disks being over utilized?
4 If there are any issues, what recommendations do you have for resolving them?
SAP/Oracle: Symmetrix/000187401250/interval/20040707.btp
This exercise uses an earlier snapshot of the SAP/ORACLE archive to investigate some basic prefetch
metrics. The effectiveness of prefetching in the late night backup job is of the most concern.
Step Action
1 Compare the system-wide total ios per sec measure with the system-wide
prefetched tracks per sec measure.
Are any changes in the measures evident during the late night backup period?
What do these changes signify?
2 Compare the system-wide Prefetched tracks per second measure with the
system-wide Tracks not used measure. Since the second of these is not available
at the system level, you will have to plot it at the Dir-DA level. You will find it
easiest to create a custom view that sums the measures for all disk directors to
produce just one Tracks not used line.
Does the difference between prefetched tracks and tracks not used indicate an
improvement in the effectiveness of prefetching during the backup job?
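The comparison in this step can be reduced to two numbers per time period: the difference between prefetched tracks and tracks not used, and that difference as a fraction of the prefetched tracks. The Python sketch below is one possible way to express it. It assumes both measures have already been summed across all disk directors, and the period names and rates are hypothetical rather than taken from the SAP/Oracle archive.

```python
def prefetch_effectiveness(prefetched_per_sec, not_used_per_sec):
    """Return (difference, fraction of prefetched tracks not reported as unused).

    The fraction is None when nothing was prefetched in the interval.
    """
    difference = prefetched_per_sec - not_used_per_sec
    if prefetched_per_sec == 0:
        return difference, None
    return difference, difference / prefetched_per_sec


# Hypothetical summed Dir-DA rates: period -> (prefetched, tracks not used).
periods = {
    "daytime": (200, 150),
    "late night backup": (1000, 100),
}
for name, (prefetched, not_used) in periods.items():
    diff, fraction = prefetch_effectiveness(prefetched, not_used)
    print(f"{name}: difference {diff} tracks per sec, fraction {fraction:.2f}")
```

A higher fraction during the backup window would suggest that more of the prefetched data was being consumed, which is the kind of evidence the question asks for.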
24/7: Symmetrix/000190100720/interval/20080312.btp
Analyze the overall back end performance of this archive. Thoroughly examine any issue (good or bad
performance) related to the topics discussed in this module. Be prepared to back up any claims with
appropriate evidence.
You might have thoroughly examined the back end performance of this archive during previous exercises in
this Lab. If so, just make sure you have not missed anything.
Here are some questions you should be able to answer once you have done your analysis:
Step Action
1 Are any back end directors and disks being over utilized?
5 If there are any issues, what recommendations do you have for resolving them?
This exercise covers the Email SRDF archive. This lab will simply ask a lot of questions about the archive;
use Performance Manager to get the answers.
This 8730 archive was recorded at a site that uses Solaris hosts to support an Email application. The
application is replicated to a disaster recovery site using four SRDF RA1 ports. Another four RA2 ports receive
data from the disaster recovery site. They are using SRDF/S to replicate all data.
Step Action
1 View the I/O traffic on the four RA1 ports in one graph, and the I/O traffic on the four
RA2 ports in another graph.
2 Compare the I/O traffic on the front-end ports to the traffic on the two RA groups.
Which front-end ports are responsible for the outgoing SRDF traffic?
_________________
Does the sum of the I/Os on the two front-end ports appear to be equal to the sum of
the I/Os on the RA group? ___________
________________________________________________________________________
3 Compare the I/O traffic of a few of the most-utilized devices to the traffic on the front-
end ports you considered in the previous step.
Can you quickly identify devices that are carrying the outgoing SRDF load?
Symmetrix/000284500356/daily/20021214d.btp
Symmetrix/000284500356/daily/20021215d.btp
In this Part, you will be investigating application slowdowns that are occurring late in the night. Activity from
around 18:00 to 03:00 is noticeably slower than the rest of the day.
Step Action
2 What effects are these events likely to be having on the Symmetrix? Support your
answer with evidence whenever possible.
3 What software solution[s] might help reduce the negative performance effects?