Rev. 9.21
HP Restricted
Use of this material to deliver training without prior written permission from HP is prohibited.
Copyright 2008, 2009 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice. The only warranties for HP
products and services are set forth in the express warranty statements accompanying such products
and services. Nothing herein should be construed as constituting an additional warranty. HP shall
not be liable for technical or editorial errors or omissions contained herein.
This is an HP copyrighted work that may not be reproduced without the written permission of HP.
You may not use these materials to deliver training to any person outside of your organization
without the written permission of HP.
Intel® and Itanium® are trademarks and registered trademarks of Intel Corporation or its
subsidiaries in the United States and other countries.
Linux® is a U.S. registered trademark of Linus Torvalds.
Microsoft®, Windows®, and Windows NT® are U.S. registered trademarks of Microsoft
Corporation.
UNIX® is a registered trademark of The Open Group.
Printed in USA
HP StorageWorks Enterprise Virtual Array Advanced Troubleshooting
Lab guide
April 2009
HP Restricted — Contact HP Education for customer training materials.
Module 1 Lab 1 — Using the M5x14x Drive Enclosure EMU Serial Port
Objectives
Requirements
Introduction
Connecting the special RS232 cable
Starting a HyperTerminal session
Using the EMU console commands
Note
If doing this lab remotely, the instructor will ensure that the required equipment is
available and that connections are made for you.
Note
Your lab will have some combination of the storage arrays listed below.
This lab allows you to become familiar with the most important EMU console
commands used for troubleshooting. You will be able to connect to the console port
and then explore all of the EMU menu and sub-menus available.
As you go through the lab, remember to:
Read and perform all of the lab steps that you can in the allowable time.
Concentrate on those tasks that might help you to diagnose and troubleshoot the
array.
Note
Skip this topic if you are doing the lab remotely.
Note
Your instructor should make this cable available.
2. Plug the RJ45 male connector end of this cable into any EMU RS232 ONLY slot.
Note
The RS232 ONLY slot is the topmost connector on the EMU.
3. Using the diagram below as a reference, plug the J3 Console Commands DB9
female connector into the COM1 port on one of your lab PCs. Optionally, you
could use a personal laptop for this exercise.
Note
Some cables are actually stamped with J2 and J3 on the different DB9 ends and
others are not.
5. Click OK.
Note
When using the EMU commands, you will sometimes see command time-outs.
This does not indicate a device failure. If this happens, retry the command.
Note
You will not be able to do this step if you are performing the lab remotely.
a. Physically remove a drive from the drive enclosure you are monitoring.
b. Select d once again, then explain the difference between this output and
the one captured before the disk was removed.
.........................................................................................................
.........................................................................................................
c. Reinsert the drive you just removed. Immediately enter and continue entering
d to continuously output the results of the Summary Of All Drive Info
command.
Answer the following questions:
1) When did the WWN of the drive become known?
...................................................................................................
2) When the drive was inserted, which device removed the bypass first?
...................................................................................................
...................................................................................................
7. Enter ESC and then E to enter the Error menu.
8. Select a. Record the latest error below.
................................................................................................................
Note
The duration value indicates how long the error condition lasted.
Note
The internal revision numbers are not the spare part numbers printed on the I/O
modules themselves.
11. Select N – Display I2C NVRAM Resources from the System menu. View the
displayed output. This is another method to determine particular drive enclosure
information.
12. Return to the EMU main menu.
13. Select e – ESI Menu, then d – Summary of All Drive Info. Using the information
displayed, answer the following questions:
a. Which drive bays are populated?
.........................................................................................................
b. Which drive firmware version is being used?
.........................................................................................................
c. Which AL_PAs are being used on the shelf?
.........................................................................................................
.........................................................................................................
14. Observe this output and answer the following questions.
a. Explain the output of the bypass data for the disk in slot 9.
.........................................................................................................
.........................................................................................................
b. Is the slot bypassed by the EMU or by the disk itself?
.........................................................................................................
15. Return to the EMU main menu and select c - CRU Menu.
16. Select d – Drive Menu. The output should be similar to the following display.
--- Drive CRU Menu ---
b - Bypass reasons menu
c – Clear the drive present trace
i - Perform an INQUIRY command via EiESI
l - Set LED Behavior
p – Print drive present here
t – Print drive present tracing (On)
ESC - previous menu
-->
17. Return to the EMU main menu and select E - Error Menu. Output should be
similar to the following display.
--- Error Menu ---
a - Recent Alarm Log Listing ('A' for all)
d - Display Current Alarm Queue
f - Recent Fatal System Error Listing
l - Lookup/decode an error message
m - Mute alarm (Off)
r - Toggle the REMIND bit (Off)
t - Test errors and alarms
ESC - previous menu
-->
18. If you select a – Recent Alarm Log Listing (‘A’ for all), the last 10 events logged
by this EMU will display. If you select A, the last 63 events logged by this EMU
will display. Select a.
The output should be similar to the following display.
19. From the Error menu, select the l – Lookup/decode an error message option to
look up any of the previously listed TTNNEE numbers, for example, 0F0405 in
entry 01 of the list above.
--- Error Menu ---
a - Recent Alarm Log Listing ('A' for all)
d - Display Current Alarm Queue
f - Recent Fatal System Error Listing
l - Lookup/decode an error message
m - Mute alarm (Off)
r - Toggle the REMIND bit (Off)
t - Test errors and alarms
ESC - previous menu
--> l
You might see a display in the following format for your error code:
Enter an error code in the form 'tteenn', letter codes (c, h) or
empty line to exit. --> 0F0405 Searching for: 0F.04.05…
Helptext Msg:
--> Transceiver 4 invalid character; check module.
LCD Msg:
--> 0F0405:I/O Xcvr 4 invalid character; check module.
Enter an error code in the form 'tteenn', letter codes (c, h)
or empty line to exit. -->
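The lookup prompt echoes the six-digit code as three two-digit fields (0F.04.05). As a minimal sketch of that split, the hypothetical helper below breaks a code into its three hex pairs. The function name and the validation rules are illustrative assumptions and are not part of the EMU firmware.

```python
import string
from typing import Tuple


def split_error_code(code: str) -> Tuple[str, str, str]:
    """Split a six-hex-digit error code into three two-digit fields."""
    code = code.strip().upper()
    if len(code) != 6 or any(c not in string.hexdigits for c in code):
        raise ValueError("expected six hex digits, for example 0F0405")
    return code[0:2], code[2:4], code[4:6]


if __name__ == "__main__":
    fields = split_error_code("0F0405")
    # Same dotted form that the EMU prints after "Searching for:"
    print(".".join(fields))
```

The fields are deliberately left unnamed, because the guide refers to the layout as TTNNEE while the console prompt shows 'tteenn'.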
Note
If doing this lab remotely, the instructor will ensure that the required equipment is
available and that connections are made for you.
Note
Your lab will have some combination of the storage arrays listed below.
This lab allows you to become familiar with the most important M6412x serial
commands used for troubleshooting. You will be able to connect to the serial port on
an M6412x drive enclosure I/O module, explore the most important CLI commands
available, and download and analyze logs.
As you go through the lab, remember to:
Read and perform all of the lab steps that you can in the allowable time.
Concentrate on those tasks that might help you to diagnose and troubleshoot the
array.
Note
Skip this topic if you are doing the lab remotely.
Note
Your instructor should make this cable available. This cable is also used to
connect to the controller.
2. Plug the RJ45 male connector end of this cable into the I/O module of a
selected drive enclosure.
Note
The Mfg slot is in the right-most position on the I/O module.
3. Plug the DB9 female connector into the COM1 port on one of your lab PCs.
Optionally, you could use a personal laptop for this exercise.
5. Click OK.
4. Enter info to display shelf information. Note what you see here.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
5. Enter stat to display the dynamic status of the enclosure elements. Note what
you see here.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
6. Enter dbs 1 to display page 1 of the database. Note what you see here.
................................................................................................................
7. Enter wellRd 0 to read from the wellness log for the enclosure link monitor. Note
what you see here.
................................................................................................................
8. Enter ctsErrCnt a to display Cut-Through Switch error count absolutes in hex
format. Note what you see here.
................................................................................................................
9. Enter eeRead 0 to read from the EEPROM for the enclosure link monitor. Note
what you see here.
................................................................................................................
10. Enter fanTach to display the fan tachometer settings in 1-second intervals. Note
what you see here.
................................................................................................................
Enter fanTach again to disable it.
Note
For now, you will use Navigator only to import and view an M6412x shelf log.
You will examine all of the features of Navigator in a later lab.
The shelf log data files are used to examine specific hardware issues at the M6412x
shelf level. Each log contains information specific to one module on one loop in one
shelf. Each log file can contain data from multiple modules.
To analyze the shelf data in Navigator:
1. Double-click any entry in the Workspace tab.
2. View the log data in the Reports tab.
3. Filter your log for each of the following entity types (click below the column
heading and use the down arrow):
Event Log
Power Supply
Fan
Temp Sensor
ELMo (I/O module)
Display
XCVR
Array Device
Console
Enclosure
4. Display the entire log again.
5. Filter your log for entity type ELMo events and note the types of events and their
meanings.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
6. Filter your log for entity type Array Device events and note the types of events
and their meanings.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
7. Filter your log for entity type Fan events and note the types of events and their
meanings.
................................................................................................................
................................................................................................................
8. Filter your log for event codes of type Reset and note their meanings.
a. Note why the resets were issued.
.........................................................................................................
b. Note if there were any unexpected resets.
.........................................................................................................
9. Determine if shelf IDs were set from the console or from the midplane.
................................................................................................................
Note
If doing this lab remotely, the instructor will ensure that the required equipment is
available and that connections are made for you.
Note
Your lab will have some combination of the storage arrays listed below.
Note
The HSV300 and HSV4x0 controllers use the same serial port cable as the
M6412x drive enclosure. The HSV1x0 and HSV2x0 controllers use a different
serial port cable from the EMU console port.
This lab allows you to become familiar with the most important HSV controller
diagnostics used for troubleshooting. You will be able to view and set debug flags
and print flags, as well as input debug commands. You will also explore all of the
interfaces available for viewing and using HSV controller diagnostics.
As you go through the lab, remember to:
Read and perform all of the lab steps that you can in the allowable time.
Concentrate on those tasks that might help you to diagnose and troubleshoot the
array.
Note
Every lab configuration may be different, therefore keep in mind that the lab
exercises are not written for a specific configuration or one having a specific rack
infrastructure.
Setting flags and using debug commands to capture diagnostic outputs should rarely
be necessary when troubleshooting EVA storage systems. Typically, the controller
event log provides you with all of the information you need to determine the cause of
an EVA storage system failure.
There will be times when you will need to access and use debug flags, print flags,
and debug commands. For example, the PROMPT_FOR_GO debug flag may be set
at the factory and you may not have access to the OCP.
The subtopics describe how to access and use flags and commands.
Note
The steps below are written for the EVA 4000/6000/8000 and EVA
4100/6100/8100 series arrays, but are similar for the EVA 3000/5000 and
EVA 6400/8400 series. The process for print flags is similar.
1. On the master controller, press any OCP pushbutton while the default display is
in view.
Note
Determine the master controller by navigating through the System Information
menu.
Note
These changes do not go into effect until you restart the controller (see below).
Note
The steps below are written for the EVA 3000/5000, EVA 4000/6000/8000,
and EVA 4100/6100/8100 series arrays, but are similar for the EVA 6400/8400
series.
1. Start a HyperTerminal session to the console output of the same HSV controller
as you set the debug flags for PROMPT_FOR_GO and PRINTF_TO_CONSOLE.
Use the port settings indicated in the following table.
Note
If using the lab remotely, you will not be able to note the OCP display.
This denotes that you have set the PROMPT_FOR_GO flag to 1. You must use
the console port and boot menu to change the value back to 0.
3. The controller should have stopped during the boot sequence and displayed the
boot menu, which includes the current settings of the debug and print flags. The
debug flags value should be 9, which you set through the OCP.
Note
The debug flag value is displayed as the sum of the bitmap values of the
individual flags. For example, a value of 02000009 is the combined setting of
VERBOSE_MODE (02000000), PRINTF_TO_CONSOLE (00000008), and
PROMPT_FOR_GO (00000001).
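The bitmap arithmetic in this note can be sketched in a few lines of Python. The table below contains only the three flags named in the note. Any other bits are reported as unknown rather than guessed, and the function name is a hypothetical one chosen for illustration.

```python
# Bit values taken from the note above; the controller defines other
# flags that are not listed in this lab, so they are not included here.
KNOWN_DEBUG_FLAGS = {
    0x00000001: "PROMPT_FOR_GO",
    0x00000008: "PRINTF_TO_CONSOLE",
    0x02000000: "VERBOSE_MODE",
}


def decode_debug_flags(value):
    """Return (names of known flags that are set, leftover unknown bits)."""
    names = [name for bit, name in sorted(KNOWN_DEBUG_FLAGS.items())
             if value & bit]
    known_mask = 0
    for bit in KNOWN_DEBUG_FLAGS:
        known_mask |= bit
    return names, value & ~known_mask


if __name__ == "__main__":
    names, unknown = decode_debug_flags(0x02000009)
    print(names, hex(unknown))
```

A boot menu value of 9 decodes the same way to PROMPT_FOR_GO plus PRINTF_TO_CONSOLE.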
4. Enter f for debug flags, then enter g to bring up the prompt for debug flags.
Note
Though not shown here, you would perform similar steps for print flags.
Note
You will no longer be given any prompts because the PROMPT_FOR_GO flag is
now 0. Any time you want to return to the boot menu, press CTRL/p (halts the
controller) and CTRL/r (to restart the controller) on the PC keyboard.
Important
You should normally return the debug flags to a value of 0 before leaving the
console port and boot menu.
Note
The steps below are written for the EVA 3000/5000, EVA 4000/6000/8000,
EVA 4100/6100/8100, and EVA 6400/8400 series storage systems.
1. Start a Command View EVA session to the storage system and go to the Field
Service Options page.
Note
You may have to stop and start the Command View EVA service to see both
controllers.
2. Select a storage system to work with, and then select the Open command line
interface option.
3. Check the debug flags by clicking Get Debug Flags, then clicking Execute.
Determine if the debug flags are zeroed.
4. Whether or not the debug flags are zeroed, use the Set Debug Flags
command to zero them.
5. You are now ready to enter a sample debug command.
Enter the value 05 in the “Enter hex equivalent of command:” field and click
Execute.
Note
Value 05 generates the SCS_SHOW_CONFIG output. You will examine the
output of this and other debug commands in the next lab.
Note
In VCS 2.003 and later, the PRINTF_TO_CONSOLE flag is automatically set and
cleared by the four primary diagnostic commands: SCS_SHOW_CONFIG,
FCS_SHOW_CONFIG, FCS_LINK_ERRORS, and FCS_DELTA_LINKS. However, if
a controller crashes or resyncs during one of these commands, the
PRINTF_TO_CONSOLE flag may be left set.
SCS_SHOW_CONFIG request
If you have access to an HSV300 controller on an EVA4400, use the console port
and boot menu and the Field Service Options page to set and view debug and print
flags. Note that the web-based OCP (WOCP) on the EVA4400 does not allow
access to flags.
For the new HyperTerminal session, use a 115200 baud rate.
Note what you observe here.
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
.......................................................................................................................
Note
If doing this lab remotely, the instructor will ensure that the required equipment is
available and that connections are made for you.
Note
Your lab will have some combination of the storage arrays listed below.
Note
The HSV300 controller on the EVA4400 uses the same serial port cable as the
M6412 drive enclosure. The HSV1x0 and HSV2x0 controllers use a different
serial port cable from the EMU console port.
Being able to generate and analyze the outputs of various diagnostic tools is
essential for effective troubleshooting. This lab leads the student through the
generation and analysis of the outputs for the SCS_SHOW_CONFIG,
FCS_SHOW_CONFIG, FCS_LINK_ERRORS, and FCS_DELTA_LINKS commands.
Note
Remember that SCS_SHOW_CONFIG (05) and FCS_SHOW_CONFIG (29) data
can also be displayed in Navigator for easier reading and analysis.
Caution
Although this lab will step you through the generation and analysis of various
diagnostic outputs, debug commands should be utilized only as a final effort
when troubleshooting.
Use the following process to generate FCS and SCS diagnostic outputs:
1. Start a HyperTerminal session to the console output of the primary HSV
controller.
2. Select Transfer > Capture Text and use a file name such as debug_outputs.txt
as the destination of the captured controller output.
3. Start a Command View EVA session to the storage system and go to the Field
Service Options page.
4. Select a storage system to work with, and then select the Open command line
interface option.
Note
In VCS 2.003 and later, the PRINTF_TO_CONSOLE flag is automatically set and
cleared by the four primary diagnostic commands: SCS_SHOW_CONFIG,
FCS_SHOW_CONFIG, FCS_LINK_ERRORS, and FCS_DELTA_LINKS. However,
if a controller crashes or resyncs during one of these commands, the
PRINTF_TO_CONSOLE flag may remain set.
5. Check the debug flags by clicking Get Debug Flags, then clicking Execute. If the
debug flags are not zeroed, use the Set Debug Flags command to zero them.
6. Enter the value 05 in the “Enter hex equivalent of command:” field and click
Execute.
Note
Value 05 generates the SCS_SHOW_CONFIG output.
SCS_SHOW_CONFIG request
7. The GUI should indicate Operation Succeeded and the HyperTerminal session
should show the output of the debug command. Click OK.
8. After waiting at least 1 minute, enter the value 29 into the “Enter hex equivalent
of command:” field and click Execute.
Note
Value 29 generates the FCS_SHOW_CONFIG output.
Caution
Do not issue debug commands in quick succession. Doing so may cause the
storage system to crash.
9. The GUI should indicate Operation Succeeded and the HyperTerminal session
should show the output of the debug command. Click OK.
10. After waiting at least 1 minute, enter the value 34 into the “Enter hex equivalent
of command:” field and click Execute.
Note
Value 34 generates the FCS_LINK_ERRORS output.
11. The GUI should indicate Operation Succeeded and the HyperTerminal session
should show the output of the debug command. Click OK.
12. After waiting at least 1 minute, enter the value 35 into the “Enter hex equivalent
of command:” field and click Execute.
Note
Value 35 generates the FCS_CLEAR_LINKS output.
13. The GUI should indicate Operation Succeeded and the HyperTerminal session
should show the output of the debug command. Click OK.
14. After waiting at least 1 minute, enter the value 36 into the “Enter hex equivalent
of command:” field and click Execute.
Note
Value 36 generates the FCS_DELTA_LINKS output.
15. The GUI should indicate Operation Succeeded and the HyperTerminal session
should show the output of the debug command. Click OK.
16. You are finished using the Command Line Interface area of the Field Service
Options page. Before exiting the page, always check to ensure that you have
not accidentally left any debug flags set.
a. Under the “Select command from list:” drop-down menu, select Get Debug
Flag. This should show a hex equivalent of command value of 10.
b. Click Execute. The output should display as 0x00000000.
c. If the displayed value is anything other than 0x00000000, you must
execute the Set Debug Flag option with an Argument 1 value of 0.
17. Exit the Field Service Options page.
18. Using HyperTerminal, select Transfer > Capture Text > Stop.
19. Use WordPad to open the c:\debug_outputs.txt file.
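The command sequence in steps 6 through 14 can be written down as a simple checklist with the one-minute spacing applied. The sketch below only computes suggested times. It does not talk to Command View EVA, and the function name is hypothetical. It also assumes the gap is measured from one Execute click to the next, which the lab does not state explicitly.

```python
from datetime import datetime, timedelta

# (value typed into "Enter hex equivalent of command:", output it generates)
DEBUG_COMMANDS = [
    ("05", "SCS_SHOW_CONFIG"),
    ("29", "FCS_SHOW_CONFIG"),
    ("34", "FCS_LINK_ERRORS"),
    ("35", "FCS_CLEAR_LINKS"),
    ("36", "FCS_DELTA_LINKS"),
]

MIN_GAP = timedelta(minutes=1)  # "After waiting at least 1 minute"


def plan_schedule(start, gap=MIN_GAP):
    """Return (time, code, name) tuples spaced at least MIN_GAP apart."""
    if gap < MIN_GAP:
        raise ValueError("commands must be at least 1 minute apart")
    return [(start + i * gap, code, name)
            for i, (code, name) in enumerate(DEBUG_COMMANDS)]


if __name__ == "__main__":
    for when, code, name in plan_schedule(datetime.now()):
        print(when.strftime("%H:%M:%S"), code, name)
```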
Note
Instructional comments in the form of Notes, similar to the one you are reading,
are included throughout the SCS_SHOW_CONFIG output. In addition, the more
important data fields of the output are bolded and edited with comments. Tables
have been added to describe various column headings.
SCell: Active MFC via Port 4
The storage system state is active, and the controller communications are active
via the Mirror port.
6-00508b-4000142c2x 0008c-000003d-[0008]x
This is the storage system UUID.
SCell Master - Quorum POIDs:801x 80ax 806x 807x 800x
These are the quorum disk POIDs (Physical Object IDs) associated with this
storage system.
NSC: UUID = 5-00508b-40001464cx 00000-0000000-0000x
This is the UUID of the primary controller at the time this output was generated.
MEMBER_TAG = 6-00508b-4000142c2x 0008c-000004e-[0021]x
This is the UUID of the secondary controller at the time this output was
generated. In some instances, the MEMBER_TAG will show the UUID of the primary
controller.
NOID table
FC Nodes: Temp ID ID
Lp ALPA Lp ALPA Type Node WWN FNB Noid State Usablty
0. 9ex 1. 9ex disk 2-000-0004cf-399f00x 33603c 900 Invalid Usable
2. adx 3. adx disk 2-000-0004cf-395002x 33b010 0 Valid Usable
0. 63x 1. 63x disk 2-000-0004cf-29e205x 3380fc 902 Invalid Usable
0. b1x 1. b1x disk 2-000-0004cf-bf3d08x 33580c 0 Valid Usable
2. b2x 3. b2x disk 2-000-0004cf-398608x 33a9ec 904 Invalid Usable
2. 9bx 3. 9bx disk 2-000-0004cf-29e110x 33b840 0 Valid Usable
0. 9bx 1. 9bx disk 2-000-0004cf-399e14x 336454 0 Valid Usable
0. 31x 1. 31x disk 2-000-0004cf-394819x 339b98 907 Invalid Usable
2. 31x 3. 31x disk 2-000-0004cf-2cab1bx 33eb6c 908 Invalid Usable
2. 73x 3. 73x disk 2-000-0004cf-29e31dx 33c8a0 0 Valid Usable
2. 75x 3. 75x disk 2-000-0004cf-39531ex 33c488 90a Invalid Usable
2. 72x 3. 72x disk 2-000-0004cf-39a029x 33caac 90b Invalid Usable
0. 59x 1. 59x disk 2-000-0004cf-241c2ax 338720 0 Valid Usable
2. 79x 3. 79x disk 2-000-0004cf-399f34x 33c070 90d Invalid Usable
2. b3x 3. b3x disk 2-000-0004cf-39a034x 33a7e0 90e Invalid Usable
0. 66x 1. 66x disk 2-000-0004cf-1fe247x 337ad8 90f Invalid Usable
0. 2ex 1. 2ex disk 2-000-0004cf-39604cx 339da4 0 Valid Usable
2. 5cx 3. 5cx disk 2-000-0004cf-399e57x 33d2dc 0 Valid Usable
2. 32x 3. 32x disk 2-000-0004cf-395359x 33e960 912 Invalid Usable
0. 5ax 1. 5ax disk 2-000-0004cf-395e6ax 338514 0 Valid Usable
2. aex 3. aex disk 2-000-0004cf-395b6fx 33ae04 0 Valid Usable
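When a captured NOID table is long, it can help to load the rows into a script. The following is a minimal sketch that assumes each row has exactly the ten whitespace-separated columns shown above. The field names (lp1, alpa1, and so on) are hypothetical labels for those columns, and only three rows from the table are used as sample input.

```python
from collections import Counter, namedtuple

# Field names are illustrative labels for the ten columns in the listing.
FcNode = namedtuple(
    "FcNode", "lp1 alpa1 lp2 alpa2 type wwn fnb noid state usability")


def parse_fc_node(line):
    """Parse one 'FC Nodes' row from captured SCS_SHOW_CONFIG output."""
    f = line.split()
    if len(f) != 10:
        raise ValueError("unexpected column count: %r" % line)
    return FcNode(
        lp1=int(f[0].rstrip(".")),        # "0." -> 0
        alpa1=int(f[1].rstrip("x"), 16),  # "9ex" -> 0x9e
        lp2=int(f[2].rstrip(".")),
        alpa2=int(f[3].rstrip("x"), 16),
        type=f[4],
        wwn=f[5].rstrip("x"),
        fnb=f[6],
        noid=f[7],
        state=f[8],
        usability=f[9],
    )


SAMPLE_ROWS = [
    "0. 9ex 1. 9ex disk 2-000-0004cf-399f00x 33603c 900 Invalid Usable",
    "2. adx 3. adx disk 2-000-0004cf-395002x 33b010 0 Valid Usable",
    "0. 63x 1. 63x disk 2-000-0004cf-29e205x 3380fc 902 Invalid Usable",
]

if __name__ == "__main__":
    nodes = [parse_fc_node(row) for row in SAMPLE_ROWS]
    # Count disks per loop pair, for example (0, 1) versus (2, 3)
    print(Counter((n.lp1, n.lp2) for n in nodes))
```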
TrgRIdx
    Target RSS indication.
    For example, 11x indicates the target RSS NOID is 0x211.
MgrAr[0]
    A nibble array for the first 8 positions in the RSS group.
    For example, a merge source of 9876a0x indicates the following:
        index 0: Not present
        index 1: Volume to position a in dest RSS
        index 2: Volume to position 6 in dest RSS
        index 3: Volume to position 7 in dest RSS
        index 4: Volume to position 8 in dest RSS
        index 5: Volume to position 9 in dest RSS
MgrAr[1]
    A nibble array for the last 8 positions in the RSS group.
    For example, a merge target of 154x indicates the following:
        index 8: Comes from source RSS index 4
        index 9: Comes from source RSS index 5
        index a: Comes from source RSS index 1
Member Volnoid
    The NOID of the volume used in the RSS.
    0x400 – 0x7FF
Capacity
    The capacity of the member NOID, given in blocks.
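The MgrAr examples read the value one hex digit (nibble) at a time, starting from the rightmost digit. The sketch below reproduces the two worked examples. The function name is hypothetical, and treating a 0 nibble as "not present" follows the merge source example only, so it is an assumption for any other case.

```python
def decode_nibble_array(text, first_index):
    """Map RSS index -> nibble value, reading the rightmost nibble first.

    first_index is 0 for MgrAr[0] and 8 for MgrAr[1].
    """
    value = int(text.rstrip("x"), 16)  # drop the trailing "x" hex marker
    result = {}
    for offset in range(8):
        nibble = (value >> (4 * offset)) & 0xF
        if nibble:  # assumption: 0 means "not present", as in the example
            result[first_index + offset] = nibble
    return result


if __name__ == "__main__":
    for index, pos in decode_nibble_array("9876a0x", first_index=0).items():
        print("index %x -> position %x in dest RSS" % (index, pos))
    for index, src in decode_nibble_array("154x", first_index=8).items():
        print("index %x <- source RSS index %x" % (index, src))
```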
All RSSs:
RSS LDAD Free Members
Noid Noid PSEGS
Abnrml Missing MgrFlgs MbrMgrt SrcRIdx TrgRIdx MgrAr[0] MgrAr[1]
Member Blk
Volnoid Capacity
200x 0x 0. 5.
ffe0x ffe0x 0x 0x 0x 0x 0x 0x
401x 142264000.
40ax 142264000.
406x 142264000.
407x 142264000.
400x 142264000.
Null_DUB
Null_DUB
Note
This first listed RSS with NOID 200x is the quorum disk RSS. Notice it has five
members in it and that these are the exact five members that were initially listed
at the start of this SCS_SHOW_CONFIG output.
The constituent volumes of the quorum RSS are the only volumes that will ever be
listed in more than one RSS.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
201x Invalid RSS
ffffx ffffx 0x 0x 0x 0x 0x 0x
Note
It is not uncommon to have unused RSS NOID numbers (like 201x) strewn
throughout used RSS NOID numbers.
202x 101x 0. 8.
ff00x ff00x 0x 0x 0x 0x 0x 0x
40fx 142264000.
40ax 142264000.
410x 142264000.
40bx 142264000.
40cx 142264000.
411x 142264000.
412x 142264000.
413x 142264000.
Note
The previous output indicates that RSS 202x has eight members in it, all 72GB
capacity.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
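The 72 GB figure in the note for RSS 202x can be reproduced from the Blk Capacity column. The following Python sketch assumes a block size of 512 bytes, which the output itself does not state, and uses decimal gigabytes. The helper names are hypothetical.

```python
BLOCK_SIZE_BYTES = 512  # assumption: block size is not shown in the output


def parse_blocks(field):
    """Parse a capacity field such as '142264000.' (trailing period = decimal)."""
    return int(field.rstrip("."))


def blocks_to_gb(blocks, block_size=BLOCK_SIZE_BYTES):
    """Convert a block count to decimal gigabytes (10**9 bytes)."""
    return blocks * block_size / 1e9


blocks = parse_blocks("142264000.")
print(round(blocks_to_gb(blocks), 1))  # 72.8
```

Under the same assumption, the 286749488-block members that appear in the lab exercise output work out to roughly 146.8 GB each.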
203x 101x 0. 8.
ff00x ff00x 0x 0x 0x 0x 0x 0x
40dx 142264000.
40ex 142264000.
414x 142264000.
415x 142264000.
422x 142264000.
427x 142264000.
423x 142264000.
428x 142264000.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
204x 101x 0. 8.
ff00x ff00x 0x 0x 0x 0x 0x 0x
429x 142264000.
424x 142264000.
42bx 142264000.
42ax 142264000.
426x 142264000.
425x 142264000.
421x 142264000.
420x 142264000.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
206x 100x 0. 8.
fc0cx fc0cx 0x 0x 0x 0x 0x 0x
403x 142264000.
400x 142264000.
Null_DUB
Null_DUB
402x 142264000.
405x 142264000.
Note
This RSS illustrates that RSS index IDs do not have to run in consecutive numerical
order. In this case, RSS index values 0 and 1 are followed by 4, 5, 6, 7, 8, and
9, skipping 2 and 3.
406x 142264000.
407x 142264000.
408x 142264000.
401x 142264000.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Note
All possible RSS NOIDs are listed, even when most of them are not being used.
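The skipped index values in RSS 206x can also be read directly from its Missing mask. The Python sketch below assumes that each of the 16 bits corresponds to one RSS index and that a set bit marks an empty position. That interpretation is inferred from the sample rows (ffe0x with five members, ff00x with eight, fc0cx with positions 2 and 3 empty) and is not taken from a field definition. The function name is hypothetical.

```python
def present_indexes(mask_field, slots=16):
    """List the RSS index positions whose bit is clear in a mask such as 'fc0cx'.

    Assumption: bit n set means index n holds no member (Null_DUB).
    """
    mask = int(mask_field.rstrip("x"), 16)
    return [i for i in range(slots) if not (mask >> i) & 1]


print(present_indexes("fc0cx"))  # [0, 1, 4, 5, 6, 7, 8, 9]
print(present_indexes("ffe0x"))  # [0, 1, 2, 3, 4]
print(present_indexes("ffffx"))  # []
```

The first result matches the RSS 206x listing, where indexes 0 and 1 are followed by 4 through 9, and the last matches the unused RSS 201x.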
LDADs:
LDAD RSS
Noid noid
100x
206x
Note
The previous output indicates that the disk group with NOID 0x100 consists of a
single RSS.
101x
202x
203x
204x
205x
Note
This output indicates that the disk group with NOID 0x101 consists of four RSS
groups.
LDAD Member
Noid Volnoid
100x
400x
401x
402x
403x
405x
406x
407x
408x
Note
The previous output indicates that the disk group with NOID 0x100 consists of
only eight volumes.
101x
40ax
40bx
40cx
40dx
40ex
40fx
410x
411x
412x
413x
414x
415x
416x
417x
418x
419x
41ax
41bx
41cx
41dx
41ex
41fx
420x
421x
422x
423x
424x
425x
426x
427x
428x
429x
42ax
42bx
LDs
LD LDAD Max L2MAP Cache LDSB Pres Host Pres Rlzd
Noid Noid LDA addr Redun State Flags Count Qscs Othr SCS
2000x 100x 13fffffx add8c00x Parity OnlThis 00100030x 1. 0. 0. 1.
2001x 101x 31ffffffx 0x Parity OnlOthr 00000010x 1. 0. 1. 0.
SCVDs
SCVD LDAD LD
Noid Noid Noid LDSB
1000x 100x 2000x d7da5c0x
1001x 101x 2001x d7dac60x
Note
Instructional comments in the form of Notes, similar to the one you are reading,
are included throughout the FCS_SHOW_CONFIG output. In addition, some of the
more important output data fields are bolded and edited with comments. Tables
are added to describe various column headings.
Note
The following OB and IB Task information is not of any known use in
troubleshooting the EVA.
DP-1A/B (loop 0)
IB Task 0: 00000000
IB Task 1: 00000000
IB Task 2: 00000000
IB Task 3: 00000000
IB Task 4: 00000000
IB Task 5: 00000000
IB Task 6: 00000000
IB Task 7: 00000000
IB Task 8: 00000000
IB Task 9: 00000000
IB Task 10: 00000000
IB Task 11: 00000000
DP-2A/B (loop 1)
IB Task 0: 00000000
IB Task 1: 00000000
IB Task 2: 00000000
IB Task 3: 00000000
IB Task 4: 00000000
IB Task 5: 00000000
IB Task 6: 00000000
IB Task 7: 00000000
IB Task 8: 00000000
IB Task 9: 00000000
IB Task 10: 00000000
IB Task 11: 00000000
DP-1A/B (loop 0)
OB Task 0: 00000000
OB Task 1: 00000000
OB Task 2: 00000000
OB Task 3: 00000000
OB Task 4: 00000000
OB Task 5: 00000000
OB Task 6: 00000000
DP-2A/B (loop 1)
OB Task 0: 00000000
OB Task 1: 00000000
OB Task 2: 00000000
OB Task 3: 00000000
OB Task 4: 00000000
OB Task 5: 00000000
OB Task 6: 00000000
DP-1A pcb 330a04 loop map: 41 devices, reporting group 4052 Link Up
ALPA POID GUB/FNB PRODUCT ID WWN SERIAL # REV CAP SHELF BAY OL NL UM
1 20 3A5150 [Not a disk] : 7 0
33 931 339780# BD07254498 CF3960BC 3EK0PJVW 3BE8 72 4: 1 4 0 0
32 91B 33998C# BD07254498 CF29D493 3EK0LVMZ 3BE8 72 4: 1 5 0 0
31 907 339B98# BD07254498 CF394819 3EK0P66E 3BE8 72 4: 1 6 0 0
2E 823 1ADF5A0 BD07254498 CF39604C 3EK0PCYP 3BE8 72 4: 1 7 0 0
2D 817 1AE12E0 BD07254498 CF393694 3EK0P6S2 3BE8 72 4: 1 8 0 0
2C 80B 1AE0C60 BD07254498 CF393D90 3EK0P7LA 3BE8 72 4: 1 9 0 0
2B 801 1ADD520 BD07254498 CF393C9B 3EK0P7DZ 3BE8 72 4: 1 10 0 0
4C 93A 338B38# BD07254498 CF1FE0D1 3EK0HV62 3BE8 72 5: 2 4 0 0
4B 936 33892C# BD07254498 CF3949CF 3EK0PDWB 3BE8 72 5: 2 5 0 0
DP-1B pcb 3313d4 loop map: 41 devices, reporting group 4052 Link Up
ALPA POID GUB/FNB PRODUCT ID WWN SERIAL # REV CAP SHELF BAY OL NL UM
1 20 3A5150 [Not a disk] : 7 0
33 931 339780# BD07254498 CF3960BC 3EK0PJVW 3BE8 72 4: 1 4 0 0
32 91B 33998C# BD07254498 CF29D493 3EK0LVMZ 3BE8 72 4: 1 5 0 0
31 907 339B98# BD07254498 CF394819 3EK0P66E 3BE8 72 4: 1 6 0 0
2E 823 1ADF5A0 BD07254498 CF39604C 3EK0PCYP 3BE8 72 4: 1 7 0 0
2D 817 1AE12E0 BD07254498 CF393694 3EK0P6S2 3BE8 72 4: 1 8 0 0
2C 80B 1AE0C60 BD07254498 CF393D90 3EK0P7LA 3BE8 72 4: 1 9 0 0
2B 801 1ADD520 BD07254498 CF393C9B 3EK0P7DZ 3BE8 72 4: 1 10 0 0
4C 93A 338B38# BD07254498 CF1FE0D1 3EK0HV62 3BE8 72 5: 2 4 0 0
4B 936 33892C# BD07254498 CF3949CF 3EK0PDWB 3BE8 72 5: 2 5 0 0
4A 92C 338D44# BD07254498 CF29E3B3 3EK0LSL7 3BE8 72 5: 2 6 0 0
49 824 1AE2CE0 BD07254498 CF3964AF 3EK0PG7N 3BE8 72 5: 2 7 0 0
47 818 1AE29A0 BD07254498 CF393EA4 3EK0P68F 3BE8 72 5: 2 8 0 0
46 80C 1AE4060 BD07254498 CF399FD1 3EK0Q634 3BE8 72 5: 2 9 0 0
45 802 1AE0FA0 BD07254498 CF940C92 3EK1J25A 3BE8 72 5: 2 10 0 0
66 90F 337AD8# BD07254498 CF1FE247 3EKYNH2R 3BE8 72 2: 3 4 0 0
65 921 337CE4# BD07254498 CF39379B 3EK0P7BK 3BE8 72 2: 3 5 0 0
63 902 3380FC# BD07254498 CF29E205 3EK0LYT4 3BE8 72 2: 3 6 0 0
5C 822 1AE46E0 BD07254498 CF3940DB 3EK0P7M0 3BE8 72 2: 3 7 0 0
5A 816 1ADFC20 BD07254498 CF395E6A 3EK0MFZS 3BE8 72 2: 3 8 0 0
59 80A 1ADD860 BD07254498 CF241C2A 3EK0JA10 3BE8 72 2: 3 9 0 0
79 948 336E90# BD07254498 CF3986F1 3EK0PXE4 3BE8 72 6: 4 4 0 0
76 935 3372A8# BD07254498 CFBF40C9 3EK2FTBV 3BE8 72 6: 4 5 0 0
75 829 1AE36A0 BD07254498 CF1FECBE 3EK0HYY4 3BE8 72 6: 4 6 0 0
74 81D 1AE3020 BD07254498 CF3937B4 3EK0P77F 3BE8 72 6: 4 7 0 0
73 811 1AE4D60 BD07254498 CF399FDD 3EK0PXCN 3BE8 72 6: 4 8 0 0
72 805 1AE4A20 BD07254498 CF3947DD 3EK0PECK 3BE8 72 6: 4 9 0 0
9E 900 33603C# BD07254498 CF399F00 3EK0Q2GC 3BE8 72 1: 5 4 0 0
DP-2A pcb 331da4 loop map: 38 devices, reporting group 4077 Link Up
ALPA POID GUB/FNB PRODUCT ID WWN SERIAL # REV CAP SHELF BAY OL NL UM
1 20 3A5150 [Not a disk] : 7 0
33 94A 33E754# BD07254498 CF241CF2 3EK0J9FH 3BE8 72 14: 8 4 77 77
32 912 33E960# BD07254498 CF395359 3EK0PG93 3BE8 72 14: 8 5 77 77
31 908 33EB6C# BD07254498 CF2CAB1B 3EK0NBX8 3BE8 72 14: 8 6 77 77
2E 820 1AE1CA0 BD07254498 CF241F9A 3EK0J8E6 3BE8 72 14: 8 7 77 77
2D 814 1AE05E0 BD07254498 CF2CAE84 3EK0N7WN 3BE8 72 14: 8 8 77 77
2C 808 1AE3360 BD07254498 CF3949B8 3EK0PEW2 3BE8 72 14: 8 9 77 77
4C 93B 33D900# BD07254498 CF2CACD5 3EK0NAVW 3BE8 72 17: 9 4 77 77
4B 924 33DB0C# BD07254498 CF393EA1 3EK0P7VQ 3BE8 72 17: 9 5 77 77
4A 91F 33DD18# BD07254498 CF395498 3EK0PG9M 3BE8 72 17: 9 6 77 77
49 826 1AE1960 BD07254498 CF393697 3EK0P6N8 3BE8 72 17: 9 7 77 77
47 81A 1AE2660 BD07254498 CF3935A4 3EK0P6VK 3BE8 72 17: 9 8 77 77
46 80E 1AE53E0 BD07254498 CF399EE5 3EK0Q3BP 3BE8 72 17: 9 9 77 77
66 94B 33CCB8# BD07254498 CF393FFA 3EK0PCZF 3BE8 72 16:10 4 77 77
65 940 33CEC4# BD07254498 CF39A0DE 3EK0Q642 3BE8 72 16:10 5 77 77
63 949 33D0D0# BD07254498 CF399FF1 3EK0Q423 3BE8 72 16:10 6 77 77
5C 825 1ADF8E0 BD07254498 CF399E57 3EK0Q4KB 3BE8 72 16:10 7 77 77
5A 819 1AE02A0 BD07254498 CF395379 3EK0PG75 3BE8 72 16:10 8 77 77
59 80D 1AE50A0 BD07254498 CF39A0E1 3EK0Q637 3BE8 72 16:10 9 77 77
79 90D 33C070# BD07254498 CF399F34 3EK0Q4LH 3BE8 72 15:11 4 77 77
76 92E 33C27C# BD07254498 CF3937B7 3EK0P71V 3BE8 72 15:11 5 77 77
75 90A 33C488# BD07254498 CF39531E 3EK0PG3N 3BE8 72 15:11 6 77 77
74 821 1AE5A60 BD07254498 CF241FEB 3EK0JAMA 3BE8 72 15:11 7 77 77
73 815 1ADF260 BD07254498 CF29E31D 3EK0LMYZ 3BE8 72 15:11 8 77 77
72 90B 33CAAC# BD07254498 CF39A029 3EK0Q3NV 3BE8 72 15:11 9 77 77
9E 945 33B634# BD07254498 CF399EE9 3EK0Q5H5 3BE8 72 13:12 4 77 77
9D 941 33B428# BD07254498 CF3943E1 3EK0NP18 3BE8 72 13:12 5 77 77
9B 82B 1ADEBE0 BD07254498 CF29E110 3EK0LXYK 3BE8 72 13:12 6 77 77
98 81F 1AE2320 BD07254498 CF394AA1 3EK0PFA0 3BE8 72 13:12 7 77 77
97 813 1AE3D20 BD07254498 CF39A0C3 3EK0PKH6 3BE8 72 13:12 8 77 77
90 807 1ADDEE0 BD07254498 CF3986AD 3EK0PAN5 3BE8 72 13:12 9 77 77
2 21 3A50D4 [Not a disk] : 7 0
B3 90E 33A7E0# BD07254498 CF39A034 3EK0Q2WP 3BE8 72 12:13 4 77 77
B2 904 33A9EC# BD07254498 CF398608 3EK0PXH8 3BE8 72 12:13 5 77 77
B1 82A 1AE0920 BD07254498 CF1FED8B 3EK0J1MZ 3BE8 72 12:13 6 77 77
AE 81E 1ADFF60 BD07254498 CF395B6F 3EK0PCQ2 3BE8 72 12:13 7 77 77
AD 812 1ADE560 BD07254498 CF395002 3EK0NGAQ 3BE8 72 12:13 8 77 77
AC 806 1ADDBA0 BD07254498 CF39A0CF 3EK0Q5HC 3BE8 72 12:13 9 77 77
DP-2B pcb 332774 loop map: 38 devices, reporting group 4077 Link Up
ALPA POID GUB/FNB PRODUCT ID WWN SERIAL # REV CAP SHELF BAY OL NL UM
1 20 3A5150 [Not a disk] : 7 0
B3 90E 33A7E0# BD07254498 CF39A034 3EK0Q2WP 3BE8 72 12:13 4 77 77
B2 904 33A9EC# BD07254498 CF398608 3EK0PXH8 3BE8 72 12:13 5 77 77
B1 82A 1AE0920 BD07254498 CF1FED8B 3EK0J1MZ 3BE8 72 12:13 6 77 77
AE 81E 1ADFF60 BD07254498 CF395B6F 3EK0PCQ2 3BE8 72 12:13 7 77 77
Note
Instructional comments, in the form of Notes like the one you are reading, are
included throughout the FCS_LINK_ERRORS output. In addition, tables are
added to describe various column headings.
DP-1A pcb 330a04 loop map: 41 devices, reporting group 4052 Link Up
ALPA POID WWN SERIAL # ENC BAY BAD LINK_FL LS_SYNC BAD_WORD BAD_CRC
1 20 [Controller] 7 0##
33 931 CF3960BC 3EK0PJVW 1 4 16C 3F45 9AFC 0
Note
It is not uncommon for the first populated disk bay to have higher error counter
values than the rest of the disk devices in the drive enclosure.
Note
The statistics for all four device-side loops are displayed separately. What follows
are the statistics for loop 1B.
DP-1B pcb 3313d4 loop map: 41 devices, reporting group 4052 Link Up
ALPA POID WWN SERIAL # ENC BAY BAD LINK_FL LS_SYNC BAD_WORD BAD_CRC
1 20 [Controller] 7 0
33 931 CF3960BC 3EK0PJVW 1 4 1 BF 4C6 0
32 91B CF29D493 3EK0LVMZ 1 5 1 9A 616 0
31 907 CF394819 3EK0P66E 1 6 1 98 607 0
2E 823 CF39604C 3EK0PCYP 1 7 1 8B 5FA 0
2D 817 CF393694 3EK0P6S2 1 8 1 8C 5EB 0
2C 80B CF393D90 3EK0P7LA 1 9 1 8E 5F6 0
2B 801 CF393C9B 3EK0P7DZ 1 10 1 93 5CE 0
4C 93A CF1FE0D1 3EK0HV62 2 4 1 A4 5EA 0
4B 936 CF3949CF 3EK0PDWB 2 5 1 92 5EB 0
4A 92C CF29E3B3 3EK0LSL7 2 6 1 8E 5EB 0
49 824 CF3964AF 3EK0PG7N 2 7 1 A1 5EB 0
47 818 CF393EA4 3EK0P68F 2 8 1 8E 5F7 0
46 80C CF399FD1 3EK0Q634 2 9 1 97 5FA 0
45 802 CF940C92 3EK1J25A 2 10 1 99 5ED 0
66 90F CF1FE247 3EKYNH2R 3 4 1 9D 5DE 0
65 921 CF39379B 3EK0P7BK 3 5 1 8D 5EE 0
63 902 CF29E205 3EK0LYT4 3 6 1 7F 5DF 0
5C 822 CF3940DB 3EK0P7M0 3 7 1 88 5DF 0
5A 816 CF395E6A 3EK0MFZS 3 8 1 8A 5DF 0
59 80A CF241C2A 3EK0JA10 3 9 1 8D 5DF 0
79 948 CF3986F1 3EK0PXE4 4 4 1 9A 5E6 0
76 935 CFBF40C9 3EK2FTBV 4 5 1 94 5EF 0
75 829 CF1FECBE 3EK0HYY4 4 6 1 8C 5ED 0
74 81D CF3937B4 3EK0P77F 4 7 1 8F 5DE 0
73 811 CF399FDD 3EK0PXCN 4 8 1 90 5ED 0
72 805 CF3947DD 3EK0PECK 4 9 1 92 5ED 0
9E 900 CF399F00 3EK0Q2GC 5 4 1 9A 5ED 0
9D 947 CF399FEF 3EK0Q618 5 5 1 84 5ED 0
Note
You will sometimes see an error indication that states *****Port 2 Loop Order
Invalid—Using ordering from port 3*****. This is indicated when a device-side
loop fails and a loop map for it is not available. When this happens, the
controller uses the loop map from the other loop.
DP-2A pcb 331da4 loop map: 38 devices, reporting group 4077 Link Up
ALPA POID WWN SERIAL # ENC BAY BAD LINK_FL LS_SYNC BAD_WORD BAD_CRC
1 20 [Controller] 7 0
33 94A CF241CF2 3EK0J9FH 8 4 26 37B 5546 0
32 912 CF395359 3EK0PG93 8 5 1 93 28C3 0
31 908 CF2CAB1B 3EK0NBX8 8 6 1 75 28A5 0
2E 820 CF241F9A 3EK0J8E6 8 7 0 29 121 0
2D 814 CF2CAE84 3EK0N7WN 8 8 0 1D 110 0
2C 808 CF3949B8 3EK0PEW2 8 9 1 78 2883 0
4C 93B CF2CACD5 3EK0NAVW 9 4 3 2E7 2693 0
4B 924 CF393EA1 3EK0P7VQ 9 5 1 75 28F5 0
4A 91F CF395498 3EK0PG9M 9 6 1 6B 28B6 0
49 826 CF393697 3EK0P6N8 9 7 1 66 289C 0
47 81A CF3935A4 3EK0P6VK 9 8 4F 169A7 1DD9 0
Note
The previous disk device has orders of magnitude higher signal errors than the
rest of the devices on this loop. It is likely that this device will be sending a large
number of LIP F8s in an effort to obtain a cleaner signal. Assuming there are
errors being logged on the 2A loop, candidates for replacement include the disk
just before this one (9, 7), this disk (9, 8) or the A-side I/O module. In the rare
event none of those replacements fixes the problem, you must consider replacing
the disk enclosure.
Note
Typically, HP recommends that you replace the disk to the left of the disk that is
reporting the errors. In this case, because the disk is in Bay 8, consider replacing
the A-side I/O module first. There is extra signal regeneration logic in the I/O
modules between Bays 7 and 8.
Note
Disk 11, 8 is also reporting fairly high signal errors as compared to the rest of the
loop. The analysis of the 2A loop from the controller event log indicates that this
A-side I/O module should be replaced as well as the A-side I/O module in shelf
number nine.
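The comparison described in the notes above, one device against the rest of its loop, can be sketched in Python. The sketch assumes that the four values after ENC and BAY are hexadecimal counters in the order LINK_FL, LS_SYNC, BAD_WORD, BAD_CRC, and it uses an arbitrary threshold of 100 times the loop median to stand in for "orders of magnitude". The threshold and the function names are illustrative assumptions, not an HP-defined rule.

```python
from statistics import median

# Device rows copied from the DP-2A listing above (controller row omitted).
ROWS = """\
33 94A CF241CF2 3EK0J9FH 8 4 26 37B 5546 0
32 912 CF395359 3EK0PG93 8 5 1 93 28C3 0
31 908 CF2CAB1B 3EK0NBX8 8 6 1 75 28A5 0
2E 820 CF241F9A 3EK0J8E6 8 7 0 29 121 0
2D 814 CF2CAE84 3EK0N7WN 8 8 0 1D 110 0
2C 808 CF3949B8 3EK0PEW2 8 9 1 78 2883 0
4C 93B CF2CACD5 3EK0NAVW 9 4 3 2E7 2693 0
4B 924 CF393EA1 3EK0P7VQ 9 5 1 75 28F5 0
4A 91F CF395498 3EK0PG9M 9 6 1 6B 28B6 0
49 826 CF393697 3EK0P6N8 9 7 1 66 289C 0
47 81A CF3935A4 3EK0P6VK 9 8 4F 169A7 1DD9 0
""".splitlines()


def parse_row(line):
    """Split one device row; counters are hex, ENC and BAY are decimal."""
    f = line.split()
    return {
        "enc": int(f[4]), "bay": int(f[5]),
        "link_fl": int(f[6], 16), "ls_sync": int(f[7], 16),
        "bad_word": int(f[8], 16), "bad_crc": int(f[9], 16),
    }


def outliers(rows, counter="ls_sync", factor=100):
    """Return (enc, bay) pairs whose counter exceeds factor x the loop median."""
    devices = [parse_row(r) for r in rows]
    baseline = max(median(d[counter] for d in devices), 1)
    return [(d["enc"], d["bay"]) for d in devices if d[counter] > factor * baseline]


print(outliers(ROWS))  # [(9, 8)]
```

A flagged device is only a starting point. As the notes explain, the part to replace may be the neighboring disk or the I/O module rather than the disk that reports the errors.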
DP-2B pcb 332774 loop map: 38 devices, reporting group 4077 Link Up
ALPA POID WWN SERIAL # ENC BAY BAD LINK_FL LS_SYNC BAD_WORD BAD_CRC
1 20 [Controller] 7 0
B3 90E CF39A034 3EK0Q2WP 13 4 4 ED 9CC 0
B2 904 CF398608 3EK0PXH8 13 5 1 5D 753 0
B1 82A CF1FED8B 3EK0J1MZ 13 6 1 50 753 0
AE 81E CF395B6F 3EK0PCQ2 13 7 1 50 742 0
AD 812 CF395002 3EK0NGAQ 13 8 1 46 757 0
AC 806 CF39A0CF 3EK0Q5HC 13 9 1 4C 773 0
9E 945 CF399EE9 3EK0Q5H5 12 4 3 76 6DE 0
9D 941 CF3943E1 3EK0NP18 12 5 1 4D 72D 0
9B 82B CF29E110 3EK0LXYK 12 6 1 47 6F8 0
98 81F CF394AA1 3EK0PFA0 12 7 1 42 736 0
97 813 CF39A0C3 3EK0PKH6 12 8 1 4D 764 0
90 807 CF3986AD 3EK0PAN5 12 9 1 45 734 0
2 21 [Controller] 7 0 0 0 201 0
33 94A CF241CF2 3EK0J9FH 8 4 2 EA 6F7 0
32 912 CF395359 3EK0PG93 8 5 1 76 74C 0
31 908 CF2CAB1B 3EK0NBX8 8 6 1 6E 6FF 0
2E 820 CF241F9A 3EK0J8E6 8 7 0 1E 28B 0
2D 814 CF2CAE84 3EK0N7WN 8 8 0 1C 28A 0
2C 808 CF3949B8 3EK0PEW2 8 9 1 5B 71C 0
4C 93B CF2CACD5 3EK0NAVW 9 4 3 83 6B0 0
4B 924 CF393EA1 3EK0P7VQ 9 5 1 59 73B 0
4A 91F CF395498 3EK0PG9M 9 6 1 4D 72C 0
49 826 CF393697 3EK0P6N8 9 7 1 4E 72D 0
47 81A CF3935A4 3EK0P6VK 9 8 1 55 75B 0
46 80E CF399EE5 3EK0Q3BP 9 9 1 49 752 0
66 94B CF393FFA 3EK0PCZF 10 4 3 A7 6F7 0
65 940 CF39A0DE 3EK0Q642 10 5 1 46 755 0
63 949 CF399FF1 3EK0Q423 10 6 1 42 728 0
5C 825 CF399E57 3EK0Q4KB 10 7 1 51 746 0
5A 819 CF395379 3EK0PG75 10 8 0 48 6A5 0
59 80D CF39A0E1 3EK0Q637 10 9 0 49 6B0 0
79 90D CF399F34 3EK0Q4LH 11 4 3 B4 6EF 0
76 92E CF3937B7 3EK0P71V 11 5 1 46 736 0
75 90A CF39531E 3EK0PG3N 11 6 1 41 726 0
Note
The following area of the FCS_LINK_ERRORS output shows disk devices that
were once recognized on the loops, but that are no longer present or seen by the
controller.
Note
You might want to save the FCS_DELTA_LINKS output to a file before issuing the
FCS_CLEAR_LINKS command.
Lab exercise 1
The following SCS_SHOW_CONFIG output is actual output from a storage system that
was having problems recognizing its own configuration. Analyze the output and
answer the associated questions.
SCell: Active NO MFC Port
6-00508b-400011173x 0000f-0000006-[0008]x
SCell Master - Quorum POIDs:800x 815x 81ex 82cx 832x 84bx 80ax
NSC: UUID = 5-00508b-400011173x 00000-0000000-0000x
MEMBER_TAG = 6-00508b-400011173x 0000f-0000018-[0020]x
FC Nodes: Temp ID ID
Lp ALPA Lp ALPA Type Node WWN FNB Noid State Usablty
2. b6x 3. b6x disk 2-000-000c50-28cd01x 337968 0 Valid Usable
0. 67x 1. 67x disk 2-000-000c50-2b4a02x 332828 901 Valid Usable
2. 27x 3. 27x disk 2-000-000c50-2b4804x 341ff8 0 Valid Usable
0. 71x 1. 71x disk 2-000-000c50-22c105x 3317e8 0 Valid Usable
0. 31x 1. 31x disk 2-000-000c50-28e805x 336518 0 Valid Usable
2. 81x 3. 81x disk 2-000-000c50-28ce07x 33ac30 0 Valid Usable
2. 45x 3. 45x disk 2-000-000c50-28d007x 33fd70 0 Valid Usable
2. abx 3. abx disk 2-000-000c50-28cd09x 338598 0 Valid Usable
2. 66x 3. 66x disk 2-000-000c50-28cd0ex 33d4d0 0 Valid Usable
2. 6ax 3. 6ax disk 2-000-000c50-28ca0fx 33ceb8 0 Valid Usable
0. 34x 1. 34x disk 2-000-000c50-28ce0fx 335f00 0 Valid Usable
2. 55x 3. 55x disk 2-000-000c50-28ca13x 33e308 0 Valid Usable
0. a5x 1. a5x disk 2-000-000c50-22a614x 32e728 0 Valid Usable
0. 84x 1. 84x disk 2-000-000c50-22b915x 32fd80 0 Valid Usable
2. 5ax 3. 5ax disk 2-000-000c50-598016x 33dcf0 0 Valid Usable
0. 4bx 1. 4bx disk 2-000-000c50-28ce1ax 3346a0 0 Valid Usable
0. 4ex 1. 4ex disk 2-000-000c50-28cc1bx 334088 0 Valid Usable
0. 65x 1. 65x disk 2-000-000c50-28cd20x 332c38 0 Valid Usable
0. 66x 1. 66x disk 2-000-000c50-2b4a21x 332a30 912 Invalid Usable
2. 33x 3. 33x disk 2-000-000c50-28ca25x 340db0 0 Valid Usable
0. 98x 1. 98x disk 2-000-000c50-2b4526x 32f560 0 Valid Usable
2. acx 3. acx disk 2-000-000c50-28ca26x 3393d0 0 Valid Usable
0. b4x 1. b4x disk 2-000-000c50-289927x 32d2d8 0 Valid Usable
0. 90x 1. 90x disk 2-000-000c50-28cd27x 32f970 0 Valid Usable
0. 54x 1. 54x disk 2-000-000c50-28ce2ax 333868 0 Valid Usable
0. 82x 1. 82x disk 2-000-000c50-22a62bx 32ff88 0 Valid Usable
0. 4ax 1. 4ax disk 2-000-000c50-22c62bx 3348a8 0 Valid Usable
1. In the previous output, how many disks are not grouped, and how many
controllers currently comprise the storage system?
................................................................................................................
................................................................................................................
2. Analyze the previous InUse dub listing. List any indications of problems along
with any associated analysis.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
All RSSs:
RSS LDAD Free Members
Noid Noid PSEGS
Abnrml Missing MgrFlgs MbrMgrt SrcRIdx TrgRIdx MgrAr[0] MgrAr[1]
Member Blk
Volnoid Capacity
200x 0x 0. 7.
ff80x ff80x 0x 0x 0x 0x 0x 0x
400x 286749488.
415x 286749488.
41ex 286749488.
42cx 286749488.
432x 286749488.
44bx 286749488.
40ax 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
3. Based on the quorum RSS membership previously shown, how many total disk
groups should be present in this storage system?
Note
The answer to this question can be confirmed using the data displayed
immediately following question 4 of this lab.
................................................................................................................
201x 100x 0. 6.
ffc0x ffc0x 0x 0x 0x 0x 0x 0x
402x 286749488.
403x 286749488.
400x 286749488.
401x 286749488.
40ax 286749488.
40bx 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
202x 100x 0. 6.
f3f0x f3f0x 0x 0x 0x 0x 0x 0x
406x 286749488.
407x 286749488.
408x 286749488.
409x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
405x 286749488.
404x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
203x 101x 0. 6.
ffe0x ffc0x 0x 0x 0x 0x 0x 0x
40cx 286749488.
40dx 286749488.
40ex 286749488.
40fx 286749488.
410x 286749488.
411x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
204x 101x 0. 6.
ffc0x ffc0x 0x 0x 0x 0x 0x 0x
412x 286749488.
413x 286749488.
414x 286749488.
415x 286749488.
416x 286749488.
417x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
205x 102x 0. 5.
ffc1x ffc1x 6x 3ex 5x 6x 9876a0x 0x
Null_DUB
419x 286749488.
41ax 286749488.
41bx 286749488.
41cx 286749488.
41dx 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
206x 102x 0. 11.
f800x f800x 5x 7c0x 5x 6x 32000000x 154x
41ex 286749488.
41fx 286749488.
420x 286749488.
421x 286749488.
422x 286749488.
423x 286749488.
41ax 286749488.
41bx 286749488.
41cx 286749488.
41dx 286749488.
419x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
207x 103x 0. 5.
ffc1x ffc1x 6x 3ex 7x 8x 9876a0x 0x
Null_DUB
425x 286749488.
426x 286749488.
427x 286749488.
428x 286749488.
429x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
208x 103x 0. 11.
f800x f800x 5x 7c0x 7x 8x 32000000x 154x
42ax 286749488.
42bx 286749488.
42cx 286749488.
42dx 286749488.
42ex 286749488.
42fx 286749488.
426x 286749488.
427x 286749488.
428x 286749488.
429x 286749488.
425x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
209x 104x 0. 8.
ff00x ff00x 0x 0x 0x 0x 0x 0x
430x 286749488.
431x 286749488.
432x 286749488.
433x 286749488.
434x 286749488.
435x 286749488.
436x 286749488.
437x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
20ax 104x 0. 6.
ffc1x ffc0x 0x 0x 0x 0x 0x 0x
438x 286749488.
439x 286749488.
43ax 286749488.
43bx 286749488.
43dx 286749488.
43cx 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
20bx 104x 0. 6.
ffc0x ffc0x 0x 0x 0x 0x 0x 0x
43fx 286749488.
43ex 286749488.
441x 286749488.
440x 286749488.
443x 286749488.
442x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
20cx 105x 0. 8.
ff08x ff00x 0x 0x 0x 0x 0x 0x
444x 286749488.
445x 286749488.
446x 286749488.
447x 286749488.
448x 286749488.
449x 286749488.
44ax 286749488.
44bx 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
20dx 105x 0. 7.
ff20x ff20x 9x 8x 11x dx 5000x 0x
44cx 286749488.
44dx 286749488.
44ex 286749488.
488x 286749488.
4a3x 286749488.
Null_DUB
4a5x 286749488.
44fx 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
20ex 105x 0. 8.
ff00x ff00x 0x 0x 0x 0x 0x 0x
49dx 286749488.
49cx 286749488.
49fx 286749488.
49ex 286749488.
4a1x 286749488.
4a0x 286749488.
4a2x 286749488.
4a7x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
20fx 105x 0. 8.
ff00x ff00x 9x 1x 15x fx 1x 0x
468x 286749488.
498x 286749488.
49ax 286749488.
497x 286749488.
490x 286749488.
491x 286749488.
492x 286749488.
493x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
210x 105x 0. 6.
ff2cx ff0cx 0x 0x 0x 0x 0x 0x
494x 286749488.
495x 286749488.
Null_DUB
Null_DUB
48cx 286749488.
48fx 286749488.
48ex 286749488.
48bx 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
211x 105x 0. 8.
ff00x ff00x ax 20x 11x dx 300000x 0x
485x 286749488.
484x 286749488.
487x 286749488.
486x 286749488.
47ax 286749488.
488x 286749488.
481x 286749488.
48ax 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
212x 105x 0. 7.
ff01x ff01x ax 80x 12x 13x 10000000x 0x
Null_DUB
480x 286749488.
482x 286749488.
47fx 286749488.
478x 286749488.
479x 286749488.
47dx 286749488.
47bx 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
213x 105x 0. 8.
ff20x ff00x 9x 2x 12x 13x 70x 0x
47cx 286749488.
47bx 286749488.
47ex 286749488.
475x 286749488.
474x 286749488.
477x 286749488.
476x 286749488.
473x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
214x 105x 0. 8.
ff00x ff00x 0x 0x 0x 0x 0x 0x
46dx 286749488.
46cx 286749488.
46fx 286749488.
46ex 286749488.
471x 286749488.
470x 286749488.
469x 286749488.
472x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
215x 105x 0. 7.
ff01x ff01x ax 2x 15x fx 0x 0x
Null_DUB
468x 286749488.
46ax 286749488.
467x 286749488.
460x 286749488.
461x 286749488.
462x 286749488.
463x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
216x 105x 0. 8.
ff00x ff00x 0x 0x 0x 0x 0x 0x
464x 286749488.
465x 286749488.
466x 286749488.
45dx 286749488.
45cx 286749488.
45fx 286749488.
45ex 286749488.
45bx 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
217x 105x 0. 8.
ff00x ff00x 0x 0x 0x 0x 0x 0x
455x 286749488.
454x 286749488.
457x 286749488.
456x 286749488.
459x 286749488.
458x 286749488.
489x 286749488.
4a4x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
218x 105x 0. 8.
ff44x ff00x 0x 0x 0x 0x 0x 0x
452x 286749488.
499x 286749488.
453x 286749488.
450x 286749488.
451x 286749488.
45ax 286749488.
4a8x 286749488.
496x 286749488.
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
Null_DUB
4. Is there anything unusual about the RSS memberships shown for the above-listed
disk groups?
................................................................................................................
................................................................................................................
................................................................................................................
LDADs:
LDAD RSS
Noid noid
100x
201x
202x
101x
203x
204x
102x
205x
206x
103x
207x
208x
104x
209x
20ax
20bx
105x
20cx
20dx
20ex
20fx
210x
211x
212x
213x
214x
215x
216x
217x
218x
LDAD Member
Noid Volnoid
100x
400x
401x
402x
403x
404x
405x
406x
407x
408x
409x
40ax
40bx
101x
40cx
40dx
40ex
40fx
410x
411x
412x
413x
414x
415x
416x
417x
102x
419x
41ax
41bx
41cx
41dx
41ex
41fx
420x
421x
422x
423x
103x
425x
426x
427x
428x
429x
42ax
42bx
42cx
42dx
42ex
42fx
104x
430x
431x
432x
433x
434x
435x
436x
437x
438x
439x
43ax
43bx
43cx
43dx
43ex
43fx
440x
441x
442x
443x
105x
444x
445x
446x
447x
448x
449x
Truncated Output
4a7x
44bx
4a8x
LDs
LD LDAD Max L2MAP Cache LDSB Pres Host Pres Rlzd
Noid Noid LDA addr Redun State Flags Count Qscs Othr SCS
SCVDs
SCVD LDAD LD
Noid Noid Noid LDSB
5. Analyze and comment on the previous listing of storage system LDs and SCVDs.
................................................................................................................
................................................................................................................
................................................................................................................
6. What is your analysis of the previous SCS_SHOW_CONFIG output?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Answers
1. In the previous output, how many disks are not grouped, and how many
controllers currently comprise the storage system?
Four disks are not grouped, and the storage system consists of a single
controller. The disks with temp NOIDs 901 and 93E show a state of Valid/Usable.
That state normally indicates a grouped disk, but a grouped disk would not have
a temp NOID, so the combination is unusual and points to possible metadata
corruption. In addition, the InUse dub listing that follows does not reference
the WWNs associated with NOIDs 901 and 93E, which further indicates that those
two disks are not grouped.
2. Analyze the previous InUse dub listing. List any indications of problems along
with any associated analysis.
Seven of the disks are marked with an A in the MAIRFA flags area. An A in the
second position of MAIRFA indicates that the current disk state is Abnormal.
In addition, volume 46bx has a MAIRFA flags indication of '------'. This disk has
an RSS ID of 0 with an INDEX setting of 0, which is not a valid combination for a
device that is in use, because in-use devices have non-zero RSS IDs. Further, in
the FC Nodes output, this disk (with WWN ending in CC82) has a temp NOID of 0,
which indicates that the drive is grouped. Therefore, volume 46bx is regarded as
part of a disk group, even though it does not belong to any disk group.
3. Based on the quorum RSS membership previously shown, how many total disk
groups should be present in this storage system?
Seven quorum disks is an indication that there should be seven disk groups
present. However, when there are six disk groups present in the system (and the
default disk group has not been deleted), there will be two quorum disks present
in the default disk group. Therefore, in this situation, although there are seven
quorum disks, there are only six total disk groups.
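The count in this answer can be cross-checked against the LDAD Member listing shown after question 4. The Python sketch below hard-codes only the seven quorum volumes and the disk group each one is listed under, then counts the distinct groups. The mapping is read manually from the listing, and the variable names are hypothetical.

```python
from collections import Counter

# Members of the quorum RSS (NOID 200x), in listing order.
QUORUM_MEMBERS = ["400x", "415x", "41ex", "42cx", "432x", "44bx", "40ax"]

# Disk group (LDAD NOID) each quorum volume appears under in the
# "LDAD Member" listing; transcribed by hand from that output.
VOLUME_TO_LDAD = {
    "400x": "100x", "40ax": "100x",
    "415x": "101x",
    "41ex": "102x",
    "42cx": "103x",
    "432x": "104x",
    "44bx": "105x",
}

per_group = Counter(VOLUME_TO_LDAD[v] for v in QUORUM_MEMBERS)
groups_with_two = [g for g, n in per_group.items() if n > 1]

print(len(QUORUM_MEMBERS))  # 7
print(len(per_group))       # 6
print(groups_with_two)      # ['100x']
```

Seven quorum disks map onto six disk groups, with disk group 100x holding two of them, which is consistent with the explanation above if 100x is the default disk group.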
4. Is there anything unusual about the RSS memberships shown for the above-listed
disk groups?
Three of the disk groups (NOIDs 0x102, 0x103, and 0x105) contain more than one
odd-membered RSS group. Under normal circumstances, a disk group should never
have more than one RSS group with an odd number of members. In this case, three
different disk groups are undergoing merge/split operations.
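The odd-membership check used in this answer is easy to automate. The following Python sketch lists each RSS from the lab output with its disk group and member count, transcribed by hand from the All RSSs listing, and reports disk groups that contain more than one RSS with an odd number of members. The data structure and names are illustrative only.

```python
from collections import defaultdict

# (RSS NOID, LDAD NOID, member count) from the "All RSSs" listing.
RSS_TABLE = [
    ("201x", "100x", 6), ("202x", "100x", 6),
    ("203x", "101x", 6), ("204x", "101x", 6),
    ("205x", "102x", 5), ("206x", "102x", 11),
    ("207x", "103x", 5), ("208x", "103x", 11),
    ("209x", "104x", 8), ("20ax", "104x", 6), ("20bx", "104x", 6),
    ("20cx", "105x", 8), ("20dx", "105x", 7), ("20ex", "105x", 8),
    ("20fx", "105x", 8), ("210x", "105x", 6), ("211x", "105x", 8),
    ("212x", "105x", 7), ("213x", "105x", 8), ("214x", "105x", 8),
    ("215x", "105x", 7), ("216x", "105x", 8), ("217x", "105x", 8),
    ("218x", "105x", 8),
]

# Collect the odd-membered RSSs per disk group.
odd_by_group = defaultdict(list)
for rss, ldad, members in RSS_TABLE:
    if members % 2:
        odd_by_group[ldad].append(rss)

# Keep only disk groups with more than one odd-membered RSS.
suspect = {ldad: rss for ldad, rss in odd_by_group.items() if len(rss) > 1}
print(suspect)
# {'102x': ['205x', '206x'], '103x': ['207x', '208x'], '105x': ['20dx', '212x', '215x']}
```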
5. Analyze and comment on the previous listing of storage system LDs and SCVDs.
Either no logical disks or storage system virtual disks have been created in
the six disk groups that are present, or the storage system metadata has been
corrupted such that the logical disk information cannot be read.
6. What is your analysis of the previous SCS_SHOW_CONFIG output?
There are seven disks in abnormal states and one that believes it is in a disk
group when in fact it is not.
Three different disk groups are comprised of more than one odd-membered RSS
group.
No LD or SCVD information can be displayed, or no LD or SCVD information is
any longer present on the system.
Analysis: The metadata on this system has been corrupted.
In this situation, the corruption was caused by a bad pointer being passed to the
merge/split recovery subroutine, a bug in VCS 3.000/3.001 that was fixed in VCS 3.014.
This specific problem was a direct result of a bad power-on sequence. The
controllers were booted first, and then the disk enclosures. The controllers went
into a recursive bugcheck state.
In the process of dealing with disks that were ‘missing’, the VCS code ran into a
bug and this was the result.
The following brief output is from the same controller console port that
generated the previous outputs. Review the bolded area.
Note the seven RSS members that indicate that they are in merging state. These are
the same seven disks that appeared as Abnormal previously in the
SCS_SHOW_CONFIG output.
Lab exercise 2
The following FCS_LINK_ERRORS output is actual output from a storage system that
was having loop errors on both loop pairs. Analyze the output and answer the
associated questions.
DP-1A pcb 32cf94 loop map: 42 devices, reporting group 4052 Link Failed Delta
ALPA POID WWN SERIAL # ENC BAY BAD LINK_FL LS_SYNC BAD_WORD BAD_CRC
1 0 [Controller] 7 0##
33 0 CF3960BC 3EK0PJVW 1 4 0 D 5B 0
32 0 CF29D493 3EK0LVMZ 1 5 0 D 5C 0
31 0 CF394819 3EK0P66E 1 6 0 E 4C 0
2E 0 CF39604C 3EK0PCYP 1 7 0 F 4C 0
2D 0 CF393694 3EK0P6S2 1 8 0 C 4C 0
2C 0 CF393D90 3EK0P7LA 1 9 0 16 4C 0
2B 0 CF393C9B 3EK0P7DZ 1 10 0 D 4C 0
4C 0 CF1FE0D1 3EK0HV62 2 4 0 9 4D 0
4B 0 CF3949CF 3EK0PDWB 2 5 0 E 4C 0
4A 0 CF29E3B3 3EK0LSL7 2 6 0 E 4C 0
49 0 CF3964AF 3EK0PG7N 2 7 0 13 1004C 0
47 0 CF393EA4 3EK0P68F 2 8 0 F 4E 0
46 0 CF399FD1 3EK0Q634 2 9 0 E 61 0
45 0 CF940C92 3EK1J25A 2 10 0 A 40 0
2 0 [Controller] 7 0
B3 0 CF1FE1CF 3EKYNDZH 6 4 0 A A2 0
B2 0 CF39866F 3EK0PXYD 6 5 0 E 4E 0
B1 0 CFBF3D08 3EK2FS3L 6 6 0 E 4D 0
AE 0 CF39869B 3EK0PM5F 6 7 0 13 4C 0
AD 0 CF3964D6 3EK0MPTB 6 8 0 D 4E 0
AC 0 CF394AAD 3EK0PER0 6 9 0 E 4E 0
9E 0 CF399F00 3EK0Q2GC 5 4 0 9 4E 0
9D 0 CF399FEF 3EK0Q618 5 5 0 13 1004C 0
9B 0 CF399E14 3EK0Q20K 5 6 0 E 4C 0
98 0 CF399EA1 3EK0Q50Q 5 7 0 11 4C 0
90 0 CF3949BF 3EK0PF0N 5 9 13A 3CC0E 60100 0
8F 0 CF395E94 3EK0PJM2 5 10 0 B 4C 0
79 0 CF3986F1 3EK0PXE4 4 4 0 8 4E 0
76 0 CFBF40C9 3EK2FTBV 4 5 0 A 4C 0
75 0 CF1FECBE 3EK0HYY4 4 6 0 C 1004C 0
74 0 CF3937B4 3EK0P77F 4 7 0 9 4C 0
73 0 CF399FDD 3EK0PXCN 4 8 0 A 52 0
72 0 CF3947DD 3EK0PECK 4 9 0 5 4C 0
71 0 CF394AA1 3EK0PFA0 4 10 0 12 4C 0
66 0 CF1FE247 3EKYNH2R 3 4 0 12 4C 0
65 0 CF39379B 3EK0P7BK 3 5 0 10 4C 0
63 0 CF29E205 3EK0LYT4 3 6 0 B 1004C 0
5C 0 CF3940DB 3EK0P7M0 3 7 0 9 4C 0
5A 0 CF395E6A 3EK0MFZS 3 8 0 11 4C 0
59 0 CF241C2A 3EK0JA10 3 9 0 6 4C 0
56 0 CF393DA2 3EK0P88L 3 10 0 C 4C 0
7. What can you conclude from the analysis of the loop 1A error statistics?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
DP-1B pcb 32cf94 loop map: 42 devices, reporting group 4052 Link Up Delta
ALPA POID WWN SERIAL # ENC BAY BAD LINK_FL LS_SYNC BAD_WORD BAD_CRC
1 20 [Controller] 7 0 0 0 0 0
33 826 CF3960BC 3EK0PJVW 1 4 81 DF1 C15CB 0
8. How would you analyze the error statistics for drive 1, 4 on loop 1B?
................................................................................................................
................................................................................................................
10. How would you analyze the error statistics for drive 6, 4?
................................................................................................................
................................................................................................................
11. What can you conclude from the analysis of the loop 1B error statistics?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
DP-2A pcb 32d964 loop map: 36 devices, reporting group 4077 Link Up Delta
ALPA POID WWN SERIAL # ENC BAY BAD LINK_FL LS_SYNC BAD_WORD BAD_CRC
1 20 [Controller] 7 0 2B 1E2 39A 0
33 82C CF241CF2 3EK0J9FH 8 4 0 7 35 0
32 846 CF395359 3EK0PG93 8 5 0 4 F 0
31 83A CF2CAB1B 3EK0NBX8 8 6 0 6 F 0
2E 820 CF241F9A 3EK0J8E6 8 7 0 5 F 0
2D 814 CF2CAE84 3EK0N7WN 8 8 0 2 F 0
2C 808 CF3949B8 3EK0PEW2 8 9 0 4 F 0
4C 82F CF2CACD5 3EK0NAVW 9 4 0 6 2D 0
4B 849 CF393EA1 3EK0P7VQ 9 5 0 3 F 0
12. What can you conclude from the analysis of the loop 2A error statistics?
................................................................................................................
................................................................................................................
13. What does the first line of the loop 2B output indicate?
................................................................................................................
DP-2B pcb 32d964 loop map: 36 devices, reporting group 4077 Link Failed Delta
ALPA POID WWN SERIAL # ENC BAY BAD LINK_FL LS_SYNC BAD_WORD BAD_CRC
1 0 [Controller] 7 0
33 0 CF241CF2 3EK0J9FH 8 4 0 2 2D 0
32 0 CF395359 3EK0PG93 8 5 0 3 1E 0
31 0 CF2CAB1B 3EK0NBX8 8 6 0 2 1E 0
2E 0 CF241F9A 3EK0J8E6 8 7 0 3 1F 0
2D 0 CF2CAE84 3EK0N7WN 8 8 0 2 1E 0
2C 0 CF3949B8 3EK0PEW2 8 9 0 5 1E 0
4C 0 CF2CACD5 3EK0NAVW 9 4 0 1 1E 0
4B 0 CF393EA1 3EK0P7VQ 9 5 0 3 1E 0
4A 0 CF395498 3EK0PG9M 9 6 0 1 1E 0
49 0 CF393697 3EK0P6N8 9 7 0 4 1E 0
47 0 CF3935A4 3EK0P6VK 9 8 0 7 16 0
46 0 CF399EE5 3EK0Q3BP 9 9 0 1 1F 0
66 0 CF393FFA 3EK0PCZF 10 4 0 4 21 0
65 0 CF39A0DE 3EK0Q642 10 5 0 2 1E 0
63 0 CF399FF1 3EK0Q423 10 6 0 4 1E 0
5C 0 CF399E57 3EK0Q4KB 10 7 0 1 1E 0
5A 0 CF395379 3EK0PG75 10 8 0 2 21 0
59 0 CF39A0E1 3EK0Q637 10 9 0 1 1E 0
2 0 [Controller] 7 0
B3 0 CF39A034 3EK0Q2WP 13 4 0 6 22 0
B2 0 CF398608 3EK0PXH8 13 5 0 8 1E 0
B1 0 CF1FED8B 3EK0J1MZ 13 6 0 3 1E 0
AE 0 CF395B6F 3EK0PCQ2 13 7 0 4 1E 0
AD 0 CF395002 3EK0NGAQ 13 8 0 2 1E 0
AC 0 CF39A0CF 3EK0Q5HC 13 9 0 2 1E 0
9E 0 CF399EE9 3EK0Q5H5 12 4 0 4 1E 0
9D 0 CF3943E1 3EK0NP18 12 5 0 6 1E 0
9B 0 CF29E110 3EK0LXYK 12 6 0 2 1E 0
90 0 CF3986AD 3EK0PAN5 12 9 C F2FE 1506F7 0
14. How would you analyze the error statistics for drive 12, 9 on loop 2B?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
79 0 CF399F34 3EK0Q4LH 11 4 0 3 1C 0
76 0 CF3937B7 3EK0P71V 11 5 0 7 1E 0
75 0 CF39531E 3EK0PG3N 11 6 0 2 1E 0
74 0 CF241FEB 3EK0JAMA 11 7 0 4 1E 0
73 0 CF29E31D 3EK0LMYZ 11 8 0 7 1E 0
72 0 CF39A029 3EK0Q3NV 11 9 0 3 1E 0
15. What can you conclude from the analysis of the loop 2B error statistics?
................................................................................................................
................................................................................................................
................................................................................................................
FNBs not on loops - information is stale - oldest first
SIDE WWN SERIAL # ENC BAY BAD LINK_FL LS_SYNC BAD_WORD BAD_CRC
A CF29E2E6 3EK0LN1P 105 108 16 15E6 1 0
B CF29E2E6 3EK0LN1P 105 108 0 1 B 0
A CF399C72 3EK0Q4L6 112 107 0 A 1E 0
B CF399C72 3EK0Q4L6 112 107 0 6 F 0
A CF39A0C3 3EK0PKH6 112 108 0 0 0 0
B CF39A0C3 3EK0PKH6 112 108 0 0 0 0
16. Using the output shown above, determine which three disk drives were removed
from the storage system.
................................................................................................................
................................................................................................................
................................................................................................................
Answers
1. What does the first line of this FCS_LINK_ERRORS output indicate?
Loop 1A was down at the time this output was generated. If Loop 1A is down,
the controllers will obtain the drive error statistics (maintained on each drive for
both loops) for both loops using the 1B loop.
2. Should you be concerned that drive 2, 7 has a much larger number of
BAD_WORDs than the other members of enclosure 2?
Larger numbers of errors can be generated because there are different drive
types or different drive firmware versions being used. In this specific situation, an
FCS_SHOW_CONFIG output was checked and showed that the drive type
BD07254498 and firmware (3BE3) were the same as every other drive in the
enclosure (so drive type was not the cause).
Because the link synchronization and state are being maintained, outside of
potentially monitoring this drive, no action should be taken.
3. Should you be concerned that drive 5, 5 has a much larger number of
BAD_WORDs than the other members of enclosure 5? Also, do you notice
anything interesting about the actual number of BAD_WORDs detected?
Again, because the link synchronization and state are being maintained, outside
of potentially monitoring this drive, no action should be taken. What is
interesting is that the number of BAD_WORDS is identical to the number of
BAD_WORDS that were detected by the drive in 2, 7.
4. How would you analyze the error statistics for drive 5, 9?
Drive 5, 9 is indicating some serious problems. The number of detected link
failures and loss of syncs is much higher than the rest of the drives on the
enclosure and on loop 1A. The cause could be the drive itself, the previous
drive, the A-side I/O module, or in rare cases, the shelf enclosure itself.
Typically, you would first replace the drive just before 5, 9, but notice that no
drive is present in location 5, 8. The drive in slot 5, 8 was previously removed in
an attempt to resolve this problem. At this point, the likely candidate for
replacement is the A-side I/O module in enclosure 5.
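The comparison described in answer 4 can also be sketched in code. The following Python fragment is an illustration only, not an HP tool. It assumes that the last four fields of each drive row are the hexadecimal LINK_FL, LS_SYNC, BAD_WORD, and BAD_CRC counters (a mapping inferred from the answers in this lab). The parse_row and outliers helpers and the factor-of-100 threshold are hypothetical choices.

```python
from statistics import median

# Enclosure 5 rows copied from the DP-1A output in this exercise.
ROWS_1A_ENC5 = """\
9E 0 CF399F00 3EK0Q2GC 5 4 0 9 4E 0
9D 0 CF399FEF 3EK0Q618 5 5 0 13 1004C 0
9B 0 CF399E14 3EK0Q20K 5 6 0 E 4C 0
98 0 CF399EA1 3EK0Q50Q 5 7 0 11 4C 0
90 0 CF3949BF 3EK0PF0N 5 9 13A 3CC0E 60100 0
8F 0 CF395E94 3EK0PJM2 5 10 0 B 4C 0
""".splitlines()

COUNTERS = ("LINK_FL", "LS_SYNC", "BAD_WORD", "BAD_CRC")


def parse_row(line):
    """Split one drive row into enclosure, bay, and integer counters."""
    fields = line.split()
    row = {"enc": fields[4], "bay": fields[5]}
    # Assumption: the last four fields are hexadecimal counters.
    for name, value in zip(COUNTERS, fields[-4:]):
        row[name] = int(value, 16)
    return row


def outliers(rows, counter, factor=100):
    """Return (enc, bay) for drives far above the median of one counter."""
    # Use at least 1 as the baseline so an all-zero column flags nothing small.
    baseline = max(median(r[counter] for r in rows), 1)
    return [(r["enc"], r["bay"]) for r in rows if r[counter] > factor * baseline]


drives = [parse_row(line) for line in ROWS_1A_ENC5]
for counter in COUNTERS:
    print(counter, outliers(drives, counter))
```

With this data, only drive 5, 9 stands out for LINK_FL and LS_SYNC, while drives 5, 5 and 5, 9 stand out for BAD_WORD, which is consistent with answers 3 and 4.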
5. Should you be concerned that drive 4, 6 has a much larger number of
BAD_WORDs than the other members of enclosure 4?
Same answer as the previous two drives indicating a large number of
BAD_WORDs. Again, notice the number of BAD_WORDS detected was
identical to the number of BAD_WORDS detected by the drives in slots 2, 7 and
5, 5.
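The repeated BAD_WORD value noted in answers 3 and 5 is easier to see after converting it from hexadecimal. This small Python sketch uses values copied from the DP-1A output. The comments describe an observation about the numbers, not a documented explanation of how the counter behaves.

```python
# BAD_WORD values copied from the DP-1A output (hexadecimal).
typical = int("4C", 16)      # value reported by most drives on loop 1A
repeated = int("1004C", 16)  # value reported by drives 2, 7; 5, 5; and 4, 6

difference = repeated - typical
print(typical, repeated, hex(difference))
```

The repeated value is exactly 0x10000 (65536) above the common 0x4C baseline. The same 1004C value also appears for drive 3, 6 in the DP-1A output.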
12. What can you conclude from the analysis of the loop 2A error statistics?
Loop 2A is functional and is not indicating any problems at this time.
13. What does the first line of the loop 2B output indicate?
Loop 2B was down at the time this output was generated. If Loop 2B is down,
the controllers will obtain the drive error statistics (maintained on each drive for
both loops) for both loops using the 2A loop.
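When many outputs must be reviewed, the link state in the first line of each loop map can be picked out programmatically. The following Python sketch is only an illustration: the regular expression and the parse_header helper are hypothetical, and the header layout is assumed to match the four header lines shown in this exercise.

```python
import re

# Pattern modeled on the header lines in this exercise (an assumption,
# not a documented format).
HEADER_RE = re.compile(
    r"DP-(?P<loop>\w+) pcb (?P<pcb>\w+) loop map: (?P<devices>\d+) devices, "
    r"reporting group (?P<group>\d+) Link (?P<state>Failed|Up)"
)


def parse_header(line):
    """Return loop name, device count, and link state, or None if no match."""
    match = HEADER_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["devices"] = int(info["devices"])
    info["link_up"] = info.pop("state") == "Up"
    return info


headers = [
    "DP-1A pcb 32cf94 loop map: 42 devices, reporting group 4052 Link Failed Delta",
    "DP-1B pcb 32cf94 loop map: 42 devices, reporting group 4052 Link Up Delta",
    "DP-2A pcb 32d964 loop map: 36 devices, reporting group 4077 Link Up Delta",
    "DP-2B pcb 32d964 loop map: 36 devices, reporting group 4077 Link Failed Delta",
]

parsed = [parse_header(h) for h in headers]
down_loops = [p["loop"] for p in parsed if not p["link_up"]]
print(down_loops)
```

For the four headers in this exercise, the loops reported as down are 1A and 2B, which matches answers 1 and 13.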
14. How would you analyze the error statistics for drive 12, 9 on loop 2B?
Drive 12, 9 is indicating some major signaling problems. The number of Loss of
Syncs is orders of magnitude higher than the rest of the drives on the Enclosure
and on the loop.
Notice that no drives are present in locations 12, 7 and 12, 8. Both of these
drives were previously removed in an attempt to resolve the high LS_SYNC problems
with drive 12, 9. At this point, the B-side I/O module in enclosure 12 should be
replaced.
15. What can you conclude from the analysis of the loop 2B error statistics?
The B-side I/O module in enclosure 12 should be replaced.
16. Using the previous output, determine which three disk drives were removed from
the storage system.
The drives in the following enclosure, bay locations: 5, 8; 12, 7; and 12, 8.
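The answer to question 16 can be reproduced from the "FNBs not on loops" table. The Python sketch below is an illustration under a stated assumption: the ENC and BAY values in the stale entries appear to be offset by 100 (105, 108 corresponds to 5, 8). That offset is inferred from this answer key, not taken from product documentation, and the removed_drives helper is hypothetical.

```python
# Rows copied from the "FNBs not on loops" section of the output.
STALE_ROWS = """\
A CF29E2E6 3EK0LN1P 105 108 16 15E6 1 0
B CF29E2E6 3EK0LN1P 105 108 0 1 B 0
A CF399C72 3EK0Q4L6 112 107 0 A 1E 0
B CF399C72 3EK0Q4L6 112 107 0 6 F 0
A CF39A0C3 3EK0PKH6 112 108 0 0 0 0
B CF39A0C3 3EK0PKH6 112 108 0 0 0 0
""".splitlines()

OFFSET = 100  # assumption inferred from the answer key


def removed_drives(rows):
    """Return one (enclosure, bay) pair per WWN, in order of appearance."""
    seen = {}
    for line in rows:
        side, wwn, serial, enc, bay = line.split()[:5]
        # Each drive is listed once per side (A and B); keep the first entry.
        seen.setdefault(wwn, (int(enc) - OFFSET, int(bay) - OFFSET))
    return list(seen.values())


print(removed_drives(STALE_ROWS))
```

Each removed drive appears twice in the table, once for side A and once for side B, so the six rows reduce to three drives.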
This lab allows you to explore many of the features of Navigator, including viewing
configuration and event data in various formats.
As you go through the lab, remember to:
Read and perform all of the lab steps that you can in the allowable time.
Concentrate on those tasks that might help you to diagnose and troubleshoot the
array.
In this section, you can explore how to use all aspects of the Navigator GUI,
including how to use the general GUI functions, the menus, and the tabs.
Note
On your keyboard, use the Ctrl key to select multiple columns that are not
contiguous and use the Shift key to select a range of contiguous columns.
Note
This can be helpful for grouping portions of the display.
9. Click any column heading and note the small sort descriptor (small triangle) as
you toggle between ascending and descending order.
10. In the blank row used for filtering, enter a letter or a number in a column or
multiple columns to display a filtered list. Delete the value to return to the full list.
Edit
Delete (only in Workspace view) — Deletes the row where your cursor is
located
Find — Find where specified data is located
1) Open an event table.
2) Select Edit → Find, and when the dialog opens, enter a search string
and select Find All.
Note
You can restrict your search to a specific column by using the drop-down list on
the right.
3) Select Find Next and Find Prev to scroll through the search list
simultaneously with the event table.
View
Normal — Displays the default view of the data
By Group — Groups data by its attributes (columns)
1) Select View → By Group, and drag a column heading to group by that
column.
Note
The full effect of this option is more apparent when you open an event log (through
the Events tab, described below) and view the data by group.
Note
Return to normal view and zoom to your customary level.
Tools
Configuration Data — Creates a separate window for viewing the
configuration data of any storage array
Note
This option is described later in detail.
Reports — Displays a list of available reports. The reports use the event data
loaded through the Events tab. User-defined reports and analysis are supported.
a. Open an event table.
b. Select any of the available reports.
Note
For each report, make note of the description of the report at the bottom of the
page.
c. Sort the data for any report by clicking the column headers.
d. Drag the columns of the report to different locations.
Graphed Reports — Displays a list of available graphed reports based on the
data currently shown through the Reports tab.
a. Create a graphed report for Port Stats.
b. Review and use some of the toolbar options by placing your cursor over a
button, looking at the tooltips, and clicking the button.
c. Right-click anywhere within the graph and choose Data Editor. Navigator
displays a table showing the graph’s data.
1) Position the cursor over data in the table or over the graph to highlight
the related table or graph information.
2) Use the up/down or left/right scrollbar to view more data.
d. Review the data area below the graph by placing your cursor over portions
of the graph.
Analysis — Allows selection of a type of analysis of the event table.
a. Open an event table.
b. Using the drop-down list, select each type of analysis and review the results.
Important
The analysis information allows you to locate start and finish sequences in the
event log.
When you use Navigator, you can view the configuration information from any active
storage system as long as you have some basic information about that storage
system. The subtopics describe how to obtain configuration data using Navigator.
Note
Your instructor will provide any missing information.
Important
If you receive an EMClientAPITNG error indicating an invalid username or
password, your management server may have been loaded with SmartStart V7.2
or above.
From a command prompt, navigate to
c:\Program Files\Hewlett-Packard\SANworks\Element Manager for StorageWorks HSV\Bin\
and enter
elmsetup.exe -pA:administrator -f
where administrator is the password for the HP Command View API Administrator
account. Any password can be substituted for administrator.
Go to the Services window and restart the HP Command View EVA service. The
new password takes effect only after you restart this service.
A dialog box appears that contains a list of storage arrays that the server knows
about. The arrays managed by the server appear with a normal background,
while those arrays managed by other servers appear with a gray background.
5. Choose an array by either double-clicking the entry or clicking the entry and
then pressing Get.
After completing this lab, you should be able to relate Controller Event Log entries to
many common and uncommon EVA storage system events.
This lab allows you to generate and then correlate Controller Event Log entries. As
you go through the lab, remember to:
Read and perform all of the lab steps that you can in the allowable time.
Concentrate on those tasks that might help you to diagnose and troubleshoot the
array.
Note
If you are doing the lab remotely, you will not be generating event logs.
Carefully read and follow all of the steps in this lab. You will be performing many
actions and be required to note when those actions took place. It is important that
your lab group works as a team and does not rush while performing all of the lab
activities that follow.
Note
If you are doing the lab remotely, you will not be performing the actions to
generate event logs.
Important
If you are doing this lab remotely, you will not be performing the actions to
generate the event logs. Those logs will be supplied for you on the course DVD.
Read through this section but do not perform any actions. Continue with the
section called Capturing and correlating the event log entries, where you will
later be directed to come back to do the correlation exercise in this section
(under the topic called Actions to generate and then correlate).
This lab steps you through the process of creating, modifying, and deleting many
different object types. In addition, you will emulate a small number of storage system
component failures.
Important
It is critical that you do not deviate from this lab in any way until all the requested
lab actions have been completed.
The following items are a general outline of how this lab is designed to work:
Perform each of the actions listed in the section called Actions to generate and
then correlate one at a time and record the exact time you performed the action.
Once all actions are completed, capture the Controller Event Log, and then
translate it using Navigator.
Go back through each of the actions listed in the lab section called Actions to
generate and then correlate. This time, using the timestamps you wrote down
when first performing the actions, correlate each of the Controller Event Log
entries with the action they corresponded to.
Effectively, you will be going through the following lab section twice. The first time, to
generate an action and record the exact time that action was performed, and the
second time, to correlate each event log entry with the action taken.
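The correlation pass described above amounts to matching each event timestamp against the time window recorded for each action. The following Python sketch illustrates the idea with made-up data: the action times, the EVENT_B and EVENT_C placeholders, the correlate helper, and the five-second slack are all hypothetical. Only the 0x096c000f code is taken from this lab.

```python
from datetime import datetime, timedelta


def parse_time(text):
    """Parse a 24-hour HH:MM:SS timestamp."""
    return datetime.strptime(text, "%H:%M:%S")


# Hypothetical action log: (action, time started, time completed).
actions = [
    ("Create 14-member disk group", "15:21:03", "15:22:10"),
    ("Delete 14-member disk group", "15:25:40", "15:26:05"),
]

# Hypothetical translated event log entries: (event time, event code).
events = [
    ("15:21:05", "0x096c000f"),
    ("15:21:06", "0x096c000f"),
    ("15:25:41", "EVENT_B"),
    ("15:40:00", "EVENT_C"),
]


def correlate(actions, events, slack=timedelta(seconds=5)):
    """Assign each event to the first action whose window contains it."""
    matched = {name: [] for name, _, _ in actions}
    unmatched = []
    for stamp, code in events:
        when = parse_time(stamp)
        for name, start, end in actions:
            if parse_time(start) - slack <= when <= parse_time(end) + slack:
                matched[name].append(code)
                break
        else:
            # No action window covers this event; it may be an accidental action.
            unmatched.append((stamp, code))
    return matched, unmatched


matched, unmatched = correlate(actions, events)
print(matched)
print(unmatched)
```

Events that fall outside every recorded window end up in the unmatched list, which is one reason to record accidental actions and their times as you go.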
Important
While completing the lab actions, it is possible that accidental actions will be
performed—actions that the lab did not specifically request you to perform.
If you perform an accidental action, record this action along with the time it was
performed in the section called Accidental actions performed while completing
this lab at the end of this lab.
Note
All lab participants should consider synchronizing their watches to the time set
on the management server and make sure that time matches that of the server
you are browsing from.
3. Using the Command View EVA GUI on the uninitialized storage system, select
Use management server date/time and Re-sync controller time with the SAN
management time, then click Save changes.
Note
Having an accurate time setting is critical to this lab. Skipping this step
could make it more difficult to analyze your event log entries later.
Note
It is highly recommended that you always select Re-sync controller time with the
SAN management time. This enables you to correlate the Controller Event Log
entry times with the Management Server Event Log entry times, as well as
generate periodic events to determine if the storage system is still logging events.
Note
Consider recording the Time Action Started or Time Action Completed entries
below in 24-hour time format (for example, 15:21:03 instead of 3:21:03 PM).
This is the time format utilized by the translation tools.
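If you recorded a time in 12-hour format, it can be converted afterward. The following Python snippet is a minimal sketch using the standard datetime module. The to_24_hour helper name is hypothetical, and the AM/PM parsing assumes an English locale.

```python
from datetime import datetime


def to_24_hour(text):
    """Convert a 12-hour time such as '3:21:03 PM' to 24-hour HH:MM:SS."""
    return datetime.strptime(text, "%I:%M:%S %p").strftime("%H:%M:%S")


print(to_24_hour("3:21:03 PM"))
```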
Note
If this is your first time performing Step 1 in this lab section, go to Step 2. The
space provided below with headings titled Controller Event Code or Event time:
is used when you return to this step to correlate the Controller Event Log event
entries with the actions.
Note
As an alternative to documenting the event information in the format provided
below, you can insert a new column into your translated Excel spreadsheet
outputs and place your comments there.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
2. Create a second disk group with 14 members and a disk failure protection level
of Single.
Time action started: ……………… Time action completed: ………………
Note
Throughout this lab, for multiple events that have the same event number, only list
the event number once. For example, event 0x096c000f will be listed 14 times
for the disk group creation action just performed. Do not write down all 14 times
this event entry was logged.
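If the translated log is available as a list of event codes, collapsing repeats is straightforward. This Python sketch is illustrative only: the 14 repetitions of 0x096c000f follow the example in the note above, while EVENT_X and EVENT_Y are hypothetical placeholders for other entries.

```python
from collections import Counter

# 14 repeats of the disk group creation event, plus hypothetical extras.
logged = ["0x096c000f"] * 14 + ["EVENT_X", "EVENT_Y", "EVENT_X"]

counts = Counter(logged)                      # how often each code was logged
unique_in_order = list(dict.fromkeys(logged)) # each code once, first-seen order

print(unique_in_order)
print(counts["0x096c000f"])
```

Recording each code once, optionally with its count, keeps the written answer short while preserving how often the event occurred.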
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
3. Delete the 14-member disk group created in the previous step.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
4. Create another disk group with 14 members and a disk failure protection level of
Single.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
5. Add another member to the 14-member disk group making it a 15-member disk
group.
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
6. Ungroup one member from the 15-member disk group—when prompted, select
the Ungroup and wait option.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
7. Change the Disk failure protection Requested level: from Single to Double on
the 14-member disk group, then click Save changes. When the action is
complete, use the GUI to ensure the Actual level changed to Double.
Time action performed: ………………
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
8. Change the Occupancy Alarm level for the 14-member disk group to 90%,
then click Save changes.
Time action performed: ………………
................................................................................................................
Note
This is the first action that produces no Controller Event Log entries. You will see
event code #2052 in the Management Server Event Log for this action.
9. Sync up the controller time with the management server time using System
options → Set time options → Re-sync controller time with the SAN
management time.
Time action performed: ………………
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
10. Add a Windows host to the system.
Time action performed: ………………
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
Note
Ensure you set the correct Host OS type for the operating system you are using.
................................................................................................................
................................................................................................................
12. Delete the Windows host entry from the system.
Time action performed: ………………
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
13. Add the Windows host back into the system.
Time action performed: ………………
................................................................................................................
................................................................................................................
14. Add a second HBA port to the Windows host.
Time action performed: ………………
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
15. Create a 5GB VRAID5 virtual disk from the Default Disk Group, prefer it to the
“A” controller, and do not present it to any hosts.
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
16. Create a 3GB VRAID5 virtual disk from the Default Disk Group, prefer it to the
“A” controller, and do not present it to any hosts.
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
17. Create a 1GB VRAID0 virtual disk from the Default Disk Group, prefer it to the
“B” controller, and do not present it to any hosts.
Time action performed: ………………
18. Change the requested capacity for the VRAID0 virtual disk from 1GB to 2GB.
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
19. Delete the 5GB VRAID5 virtual disk.
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
20. Present the 2GB VRAID0 virtual disk to the Windows host.
Time action performed: ………………
................................................................................................................
Notes:
................................................................................................................
21. Unpresent the 2GB VRAID0 virtual disk from the Windows host.
Time action performed: ………………
................................................................................................................
Notes:
................................................................................................................
22. Create a Demand-allocated snapshot of the 2GB VRAID0 virtual disk, but do not
present it to any hosts.
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
23. Delete the Demand-allocated snapshot.
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
24. Restart the slave controller using the slave controller OCP: Shutdown Options →
Restart → Yes. Pause for 3 or 4 minutes before proceeding to the next lab step.
Time action performed: ………………
Note
It is not necessary to list the entire series of events associated with the controller
reboot. Space is provided below to capture the majority of them.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
25. Having paused for 3 or 4 minutes since the last lab action, shut down your slave
controller followed by your master controller using the Shut down → Shutdown
Options → Power down GUI option from the “Controller Shutdown” section of
the Shutdown Options page. When both controllers are completely shut down,
switch the On/Off switches on the back of the controllers to the Off position.
Time action performed: ………………
Note
It is not necessary to list the entire series of events associated with the system
shutdown. Space is provided below to capture many of them.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
26. Switch the On/Off switches of both controllers to the On position and take
about a 5-minute break while your controllers boot and re-establish
communication with the management server.
Time action performed: ………………
Note
The entire series of events associated with the system startup is much more than
will fit on this page. Space is provided below to capture the events that you find
most interesting.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
Caution
For the rest of the lab, pay extra attention when performing all actions and when
noting the times that those actions took place.
27. Use the Locate function to locate all of the disks in the Default Disk Group and
observe the disk positions. Cancel the Locate, and then physically remove one of
the Default Disk Group members you just located.
Note
When recording the time for this event, record the time the disk was physically
removed from the system. You are emulating a member failure of the VRAID0
virtual disk.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
28. Within a short time after you physically remove the disk in the previous
step, the VCS code pops up a window asking whether you would like to Continue with
no changes or Start deletion process. For this lab exercise, select Start deletion
process and record below the time when you select this option.
Note
Because of the selection you just made, the VRAID0 virtual disk information is
irretrievably lost. Never select this option unless all other attempts to resolve your
problem have failed. Often, one or more loop and/or disk issues can
be resolved, allowing the virtual disks to resume normal operations.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
29. Physically reinsert the disk that was pulled out previously. Wait 1 or 2 minutes
for the system to bring the disk online, then check the GUI to ensure it is ready
for use.
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
30. Regroup the reinserted disk into the Default Disk Group.
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
31. Create a 1GB VRAID0 virtual disk from the Default Disk Group, prefer it to the
“B” controller, and do not present it to any hosts.
Time action performed: ………………
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
32. Physically remove the disk device that was just added and grouped into the
Default Disk Group.
Note
This is the same disk that was removed earlier from the storage system.
Note
When recording the time for this event, record the time the disk was physically
removed from the system. You are once again emulating a member failure of the
VRAID0 virtual disk.
Note
It is not necessary to list the entire series of events associated with the disk
removal. You may want to indicate only those events (if there were any) that were
different from the last time you performed the disk removal.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
33. Within a short time after physically removing the disk in the previous
step, the controllers pop up a window asking whether you would like to Continue with
no changes or Start deletion process. For this lab exercise, select Continue with
no changes and record below the time when you select this option.
Time action performed: ………………
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
34. Physically reinsert the disk that was pulled out previously. Wait a minute for the
system to bring the disk online, and check the GUI to ensure it is ready for use.
How many disks are seen in the Default Disk Group?
................................................................................................................
What is the status of the VRAID0 virtual disk?
................................................................................................................
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
What did you learn about the difference between “Continue with no changes” and
“Start deletion process”?
................................................................................................................
................................................................................................................
Does what you learned apply only to VRAID0 virtual disks, or does it also apply to
VRAID1 and VRAID5 virtual disks?
................................................................................................................
................................................................................................................
35. Physically remove one of the EMUs from a drive enclosure shelf and wait 2
minutes.
Note
The drive enclosure shelf should now show up in the Unmappable Hardware
folder in the GUI navigation pane.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
36. After waiting for 2 minutes, physically reinsert the EMU that was removed in the
previous lab step.
Note
It may take several minutes for the Command View EVA GUI to rediscover the
reinserted EMU.
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
37. Physically remove a Fibre Channel cable from one of the I/O modules and wait
2 minutes.
Time action performed: ………………    Cable removed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
38. After waiting for 2 minutes, reconnect the Fibre Channel cable to the I/O
module.
Time action performed: ………………
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Notes:
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
To capture and correlate the event log entries, perform the following:
1. Capture the Controller Event log from the storage system.
Important
! If you are doing this lab remotely, you do not need to capture these logs; they
are on the course DVD. There are two: one named
Controller_Event_Log_Pod30_GetEventFileSC.5767168 and one named
Controller_Event_Log_Pod30_tmpScEventFile000001.idi.
2. Translate the Controller Event Log using your translation tool of choice.
If you are doing the lab remotely, choose one of the following files to translate:
a. Translate the Controller_Event_Log_Pod30_tmpScEventFile000001.idi log
beginning with a date of March 18, 2009, at 16:06:30, and ending at
18:37:59.
b. Translate the Controller_Event_Log_Pod30_GetEventFileSC.5767168 log
beginning with a date of May 1, 2004, at 17:37:18, and ending at
18:22:59.
Note
If you are doing this lab locally, note that although the storage system was
uninitialized when you started this lab, the management server still has older
events stored that are not related to your lab activities. Therefore, when
translating your Controller Event Log, translate only those events starting with
the timestamp at which you initialized the storage system.
3. Open the Excel spreadsheet output of your events and sort all of the events by
Date/Time.
Note
Any other output format of the Controller Event Log can be opened and used
as well.
4. On your course DVD, in a directory called Labs and extra related lab materials,
there are a few documents related to this exercise. Locate the following
documents:
<NAPP or EVE>_Pod30_main.xls — This is an Excel spreadsheet example
output of the above actions performed on a storage system. The entire
output (nearly 500 events) is fully documented with information regarding
the meaning of many of the events. These comments have been added to
the first column of the spreadsheet. Many items of particular interest are in
bold font. Open this spreadsheet and have it available for viewing before
you continue your lab. Do not consult it until you need it during the
correlation exercise.
Important
! The use of this commented output is a mandatory part of the lab. Compare your
results to the results of the output from the <NAPP or EVE>_Pod30_main.xls
spreadsheet.
Note
One or two actions did not generate any Controller Event Log entries. For those
actions, use the Management Server Event Log to find an event associated with
the action.
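The translate, trim, and sort steps above can also be approximated outside Excel. The following is a minimal sketch only, assuming the translated events were exported as CSV text. The column names (Sequence, Date/Time, Description), the timestamp layout, and the sample rows are hypothetical and depend on the translation tool you used. The window boundaries are the March 18, 2009 times given in step 2a.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp layout of the export; adjust it to match your tool's output.
TIME_FORMAT = "%m/%d/%Y %H:%M:%S"

def load_events(csv_text):
    """Parse CSV text into a list of dicts, adding a parsed 'when' timestamp."""
    events = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["when"] = datetime.strptime(row["Date/Time"], TIME_FORMAT)
        events.append(row)
    return events

def window_sorted(events, start, end):
    """Keep events with start <= when <= end, ordered by timestamp."""
    selected = [e for e in events if start <= e["when"] <= end]
    return sorted(selected, key=lambda e: e["when"])

# Hypothetical sample rows, deliberately out of order.
SAMPLE = """Sequence,Date/Time,Description
3,03/18/2009 16:10:05,Hypothetical event C
1,03/18/2009 15:59:00,Hypothetical event A
2,03/18/2009 16:06:30,Hypothetical event B
4,03/18/2009 18:40:00,Hypothetical event D
"""

events = load_events(SAMPLE)
start = datetime(2009, 3, 18, 16, 6, 30)   # start of the window in step 2a
end = datetime(2009, 3, 18, 18, 37, 59)    # end of the window in step 2a
in_window = window_sorted(events, start, end)
for e in in_window:
    print(e["Sequence"], e["Date/Time"], e["Description"])
```

Events 1 and 4 fall outside the window and are dropped, which mirrors translating only the events that belong to your own lab activities.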
Important
! You do not need to complete this section if you are doing the lab remotely.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Allow as much time as possible to thoroughly analyze each presented case study.
A total of 13 case studies are presented in this lab. Your instructor will choose two or
three of these to work through with you as classroom exercises.
For case studies 1, 2, 5, 6, and 7:
Use Navigator to translate the binary files on your student CD.
Thoroughly analyze the translated output.
Answer the questions in your lab guide for each case study. Check the answers
as directed by the instructor, who will provide instructions for finding the
answers to the questions.
Customer scenario
A customer had the following problem:
Device-side loops were failing.
The problem was first noticed at this time and date:
As soon as the storage system was booted on 4/16/04 at 19:33:00
The following customer and service actions were taken:
None
The problem is now:
Not fixed
Configuration information
The configuration is the following:
This is a 2C/12D configuration
No loop switches are installed
This system was just initialized
One disk group containing all installed disks was created
Two virtual disks were created (one preferred to each controller)
Heavy I/O was being issued from a single Windows server
Questions
1. Sequence number 93 — Why did the leveling of capacity in a Disk Group start?
................................................................................................................
................................................................................................................
2. Sequence number 99 — Is ALPA 0xEF a valid device ALPA for a single rack EVA
configuration? If not, how can you explain an N_Port login failure to it?
................................................................................................................
................................................................................................................
3. Sequence number 103 — Which I/O module transceiver is not detecting any
laser signal?
................................................................................................................
4. Sequence number 119 — How many directed LIPs were issued by the 4252
controller to all devices on loop 1B prior to the 4252 controller loop 1B failure?
................................................................................................................
5. Sequence number 193 — If the 4252 controller loop 1B port is currently in a
Failed state, how is it possible that it is still reporting loop receiver errors?
................................................................................................................
................................................................................................................
................................................................................................................
6. Sequence number 195 — Within the last minute, how many loop receiver errors
were detected by the 4247 controller from ALPA 6A?
................................................................................................................
7. Sequence number 209 — What is the likely cause of this status change for an
enclosure?
................................................................................................................
................................................................................................................
8. Sequence number 217 — What does this event indicate is about to happen?
................................................................................................................
................................................................................................................
9. Sequence number 316 — How long has it been since the indicated transceiver
last lost its laser signal?
................................................................................................................
................................................................................................................
10. Sequence number 409 — Why is a new loop pair 1 device map being
generated?
................................................................................................................
................................................................................................................
11. Sequence number 412 — Is this event in any way related to the loop errors that
are being reported on loop 1B? Also, based on this reported error, should this
disk be replaced?
................................................................................................................
................................................................................................................
12. Sequence number 418 — How much time has passed since the system was
initially booted?
................................................................................................................
13. Sequence number 419 — What is your analysis of the customer loop problems?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
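Several of the questions above ask how much time separates two events. The following is a minimal sketch of that arithmetic, assuming timestamps written in the same style as the customer scenario (for example, 4/16/04 19:33:00). The boot time comes from the scenario. The event timestamp is hypothetical and stands in for whatever your translated log shows.

```python
from datetime import datetime

# Timestamp style used in the customer scenarios, e.g. "4/16/04 19:33:00".
SCENARIO_FORMAT = "%m/%d/%y %H:%M:%S"

def elapsed(earlier, later):
    """Return the time between two scenario-style timestamps as a timedelta."""
    t0 = datetime.strptime(earlier, SCENARIO_FORMAT)
    t1 = datetime.strptime(later, SCENARIO_FORMAT)
    return t1 - t0

boot_time = "4/16/04 19:33:00"    # boot time stated in the customer scenario
event_time = "4/16/04 21:03:30"   # hypothetical event timestamp, not from the log
delta = elapsed(boot_time, event_time)
print(delta)  # 1:30:30
```

Reading the two timestamps directly from the translated output and subtracting them by hand gives the same result. The sketch only helps when many intervals have to be checked.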
Customer scenario
A customer reported a storage system with device-side loops that are failing. This is
the same system that was used for case study #1. The following service actions were
taken:
The disk in 1/1 was ungrouped and removed from the storage system on
4/23/04 at 18:04:00
The B-side I/O module in Enc 3 was replaced on 4/23/04 at 18:08:00
A few minutes later, the loop 1B device ports were restarted from the CV EVA
GUI
The time was set from the CV EVA GUI at 18:26:00
No other service actions were taken
The problem is now:
Still not fixed; in fact, things appear to have gotten worse
Configuration information
The configuration is the following:
This is a 2C/12D configuration
No loop switches are installed
One disk group containing all installed disks was created
Two virtual disks have been created (one preferred to each controller)
Heavy I/O was being issued from a single Windows server
Questions
1. Sequence number 10557 — Why was this directed LIP issued to all devices on
loop 1B?
................................................................................................................
................................................................................................................
2. Sequence number 10643 — To which device did the 4252 controller issue the
directed LIP?
................................................................................................................
3. Sequence number 10656 — How much time passes between this event and the
next event?
................................................................................................................
4. Sequence number 10657 — The 4252 controller is detecting loop receiver errors
from which devices?
................................................................................................................
................................................................................................................
5. Sequence numbers 10674 and 10675 — Which transceivers have detected a loss
of laser condition? Is either of these transceivers one that was reporting this
condition in case study #1?
................................................................................................................
................................................................................................................
6. Sequence number 10676 — What does this enclosure status change indicate?
................................................................................................................
................................................................................................................
7. Sequence number 10687 — What is the likely cause of this directed LIP to all
loop 1A devices?
................................................................................................................
................................................................................................................
8. Sequence number 10687 — How long has it been since the I/O module in
enclosure 3 was replaced?
................................................................................................................
9. Sequence number 10774 — What is the likely reason this non-data exchange
has timed out?
................................................................................................................
................................................................................................................
................................................................................................................
10. Sequence number 10853 — Which enclosures have all of their disks transition
to the SPOF state? Can you determine which controller (or both) is having
problems communicating on both loops to the indicated SPOF disks?
................................................................................................................
................................................................................................................
................................................................................................................
11. Sequence number 11008 — Why is the 4252 controller issuing a directed LIP to
the disk in enclosure 1 Bay 2?
................................................................................................................
12. Sequence number 11011 — Which sequence of events led up to the volume
transitioning to the MISSING state?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
13. Sequence number 11036 — Are all virtual disks from the indicated disk group
now inoperative?
................................................................................................................
14. Sequence number 11064 — How long has it been since the volume went
missing?
................................................................................................................
15. Sequence number 11107 — Which sequence of events led up to the disk drive
disappearing?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
16. Sequence number 11132 — What does this event indicate?
................................................................................................................
................................................................................................................
17. Sequence number 11134 — Is it normal for a device in a reconstructing state to
transition to a reverting state?
................................................................................................................
................................................................................................................
................................................................................................................
18. Sequence number 11162 — Why is a new device map generated here?
................................................................................................................
................................................................................................................
................................................................................................................
19. Sequence number 11169 — Why does the indicated leveling event take place?
................................................................................................................
................................................................................................................
20. Sequence number 11203 — Which service actions should be taken on the disk
in enclosure 1 Bay 2?
................................................................................................................
21. Sequence number 11714 — Why was there no leveling start event for this
leveling finished event?
................................................................................................................
................................................................................................................
................................................................................................................
22. Sequence number 11904 — How many hours have passed since the last
indication of any problems?
................................................................................................................
23. Sequence number 12210 — How many attempts were made to reconstruct the
data from this volume before the reconstruction was finally successful?
................................................................................................................
24. Sequence number 12212 — What is the likely cause of the indicated split
occurring?
................................................................................................................
................................................................................................................
25. Sequence number 12222 — At this point, which service actions should be
taken?
................................................................................................................
................................................................................................................
................................................................................................................
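Some of the questions above ask how many events of one kind were logged before a given sequence number, sometimes within a limited time span. The following is a minimal sketch of that kind of count. The event tuples, their descriptions, and their timestamps are hypothetical examples and are not taken from the case study log.

```python
from datetime import datetime, timedelta

# Hypothetical, hand-made events: (sequence number, timestamp, description).
events = [
    (1, datetime(2004, 4, 23, 18, 30, 0), "Volume transitioned to reconstructing"),
    (2, datetime(2004, 4, 23, 18, 30, 40), "Loop receiver error"),
    (3, datetime(2004, 4, 23, 18, 45, 0), "Volume transitioned to reconstructing"),
    (4, datetime(2004, 4, 23, 19, 10, 0), "Volume reconstruction finished"),
]

def count_before(events, sequence, phrase, within=None):
    """Count events before `sequence` whose description contains `phrase`.

    If `within` is given, only events no older than that span, measured
    back from the reference event, are counted.
    """
    ref_time = next(when for seq, when, _ in events if seq == sequence)
    total = 0
    for seq, when, text in events:
        if seq >= sequence or phrase.lower() not in text.lower():
            continue
        if within is not None and ref_time - when > within:
            continue
        total += 1
    return total

attempts = count_before(events, 4, "reconstructing")
recent = count_before(events, 4, "reconstructing", within=timedelta(minutes=30))
print(attempts, recent)
```

The phrase match is a plain substring test, so the wording must match what your translation tool actually prints for the event you are counting.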
Customer scenario
A customer had the following problem:
A disk device failed.
The problem was first noticed:
On 2/9/04
The following customer and service actions were taken:
The failed disk device was removed and replaced with a new disk device on
2/9/04 at approximately 14:36:00.
The problem is now:
Resolved. However, the customer would like to know what caused the disk to
fail.
Configuration information
Not relevant for this case study.
Questions
1. Sequence number 1310 — What does this event indicate?
................................................................................................................
................................................................................................................
................................................................................................................
2. Sequence number 1311 — Based on this first indicated check condition, should
this drive be replaced?
................................................................................................................
................................................................................................................
................................................................................................................
3. Sequence number 1313 — How many days have passed since the last “retry
count exhausted” event?
................................................................................................................
4. Sequence number 1328 — Should the drive in 3/9 be replaced?
................................................................................................................
................................................................................................................
5. Sequence number 1338 — How many exchange time-outs to the drive in 3/9
took place prior to this event being logged?
................................................................................................................
6. Sequence number 1339 — Which loop remains good for access to the drive in
3/9?
................................................................................................................
7. Sequence number 1341 — How long after the drive in 3/9 went SPOF did it
return to a Normal status?
................................................................................................................
................................................................................................................
8. Sequence number 1334 — Is this the first error logged by the 11d5 controller?
................................................................................................................
9. Sequence number 1345 — How many hours have passed since the last check
condition error? Which loop is this check condition error being reported on?
................................................................................................................
................................................................................................................
10. Sequence number 1347 — Because the error is reported on loop 1B, does this
indicate the problem is likely related to the FC loops or the disk drive itself?
................................................................................................................
11. Sequence number 1351 — Which type of error is logged against the drive in
3/9 and how many of these are allowed before a drive should be replaced?
................................................................................................................
................................................................................................................
12. Sequence number 1397 — The drive is rendered inoperable. When this event
takes place, will the controllers attempt to migrate the data off of the disk?
................................................................................................................
................................................................................................................
13. Sequence number 1406 — What does this event indicate?
................................................................................................................
................................................................................................................
................................................................................................................
14. Sequence number 1411 — Does this event indicate the drive in 3/9 is now
ready for normal data access?
................................................................................................................
................................................................................................................
15. Sequence number 1423 — What was the current state of the indicated volume
prior to it transitioning to the reconstructing state?
................................................................................................................
16. Sequence number 1426 — What is the likely cause of the drive map being
updated?
................................................................................................................
................................................................................................................
17. Sequence number 1427 — Which mistake was just made (indicated by the
action that just took place)?
................................................................................................................
................................................................................................................
18. Sequence numbers 1429/1430 — What do these two events indicate?
................................................................................................................
................................................................................................................
19. Sequence number 1435 — Is this a normal event, or something to be concerned
about?
................................................................................................................
................................................................................................................
20. Sequence number 1436 — How long did the reconstruction of the MISSING
volume take?
................................................................................................................
................................................................................................................
21. Sequence number 1438 — What is the likely cause for the split occurring?
................................................................................................................
................................................................................................................
22. Sequence number 1439 — The leveling event starts as a result of the completion
of which action?
................................................................................................................
................................................................................................................
23. Sequence number 1445 — Why did the leveling event take place?
................................................................................................................
................................................................................................................
24. Sequence number 1447 — Is this an anticipated event given that the old disk in
3/9 was physically removed from the storage system?
................................................................................................................
................................................................................................................
25. Sequence number 1465 — Which events over the last few days are missing?
Why are these events missing?
................................................................................................................
................................................................................................................
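Questions about missing events and long quiet periods come down to finding large gaps between consecutive timestamps. The following is a minimal sketch, assuming the event timestamps have already been parsed. The timestamps and the one-day threshold are hypothetical. A gap only shows that nothing was logged; it does not explain why.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, threshold):
    """Return (previous, next, gap) for consecutive timestamps more than threshold apart."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt - prev > threshold:
            gaps.append((prev, nxt, nxt - prev))
    return gaps

# Hypothetical event timestamps, not taken from the case study log.
stamps = [
    datetime(2004, 2, 5, 9, 0, 0),
    datetime(2004, 2, 5, 9, 0, 20),
    datetime(2004, 2, 9, 14, 36, 0),
    datetime(2004, 2, 9, 14, 40, 0),
]
gaps = find_gaps(stamps, timedelta(days=1))
for prev, nxt, gap in gaps:
    print(prev, "->", nxt, "gap of", gap.days, "days")
```

Lowering the threshold reports shorter quiet periods as well, which can be useful when the log covers only a few hours.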
Customer scenario
A customer has the following problem:
They noticed A-side loop problems and other A-side storage system anomalies.
The problem was first noticed:
On 10/08/03
The following customer and service actions were taken:
None
The problem is now:
Not resolved.
Configuration information
The configuration is the following:
This is a 2C/12D configuration
Loop switches are installed
Questions
1. Sequence number 28 — What is the reason for the controller resync taking
place?
................................................................................................................
................................................................................................................
2. Sequence number 402 — This is the second boot of the 11d5 controller. Why is
this the first time that any FP1 or FP2 errors were reported? That is, why were
there no errors on FP1 or FP2 during the first boot of the controllers?
................................................................................................................
................................................................................................................
................................................................................................................
3. Sequence number 58 — Are any events ever logged to indicate the transition of
fabric ports FP1 and FP2 from the Failed to the Normal state?
................................................................................................................
................................................................................................................
................................................................................................................
Note
In later VCS versions, the transition states to Normal would most likely have been
logged.
4. Sequence number 59 — How many hours have passed since the last event was
logged, and what does this event indicate?
................................................................................................................
................................................................................................................
................................................................................................................
5. Sequence number 67 — How many days have passed since the last event was
logged? How can these periods of no events being reported be avoided?
................................................................................................................
................................................................................................................
6. Sequence number 69 — Are there any indications why this directed LIP was
issued?
................................................................................................................
7. Sequence number 70 — Another directed LIP. Is this directed LIP issued to the
same or a different device-side loop?
................................................................................................................
................................................................................................................
................................................................................................................
8. Sequence number 71 — What does this event indicate?
................................................................................................................
................................................................................................................
9. Sequence numbers 406 and 407 — Which controller is issuing these directed
LIPs? What does that indicate about the current problem?
................................................................................................................
................................................................................................................
10. Sequence number 77 — The errors appear to be common between loop 1A
and loop 2A. What are the common elements between these two loops?
................................................................................................................
................................................................................................................
11. Sequence number 78 — Is the indicated power supply physically located nearer
the A-side or B-side device loop?
................................................................................................................
12. Sequence number 95 — How many power supplies have lost their AC power?
Over what timeframe did they lose AC power? What are the possible sources of
this AC power loss?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
13. Sequence number 99 — What does this event most likely indicate?
................................................................................................................
................................................................................................................
14. Sequence number 110 — How many days have passed since the last errors on
loop 1A were reported?
................................................................................................................
15. Sequence number 112 — What does this event mean, and what subsequent
action is taken by the controller to fix the anomaly?
................................................................................................................
................................................................................................................
16. Sequence number 116 — Can either controller communicate via FC to the drive
in enclosure 3 Bay 4?
................................................................................................................
17. Sequence number 117 — What is the cause of the loop map being
regenerated?
................................................................................................................
................................................................................................................
18. Sequence number 118 — How many days have passed since the A-side of the
storage system last lost AC input power?
................................................................................................................
19. Sequence number 139 — What happened to the drive in enclosure 3 Bay 4? Is
the reason for the bypass listed? How could you determine the actual cause for
this drive being bypassed?
................................................................................................................
................................................................................................................
................................................................................................................
20. Sequence number 142 — What does this event indicate?
................................................................................................................
................................................................................................................
21. Sequence number 190 — This event implies that the AC power was lost to the A-
side of the storage system for how long a period of time?
................................................................................................................
22. Sequence number 197 — Days have passed since the last indication of a power
problem. Which service actions should be taken?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Customer scenario
A customer had the following problem:
They noticed numerous check conditions taking place in the controller event log
and wanted an analysis done.
The problem was first noticed:
On 3/17/04
The following customer and service actions were taken:
On 3/23/04 at 10:18:09, the drive in 6/7 was physically removed from the
storage system.
On 4/13/04 at 16:45:54, the drive in 10/9 was ungrouped from a disk group.
The problem is now:
?
Configuration information
Not relevant.
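Several questions in these scenarios ask how much time passed between two logged events. The arithmetic can be sketched in Python using the two service-action timestamps listed above. The format string is an assumption about how the dates are written in this lab guide, not the actual controller event log format, and the helper name elapsed is hypothetical.

```python
from datetime import datetime

# Assumed timestamp layout (month/day/two-digit year, 24-hour time),
# matching the dates as written in the scenario description.
FMT = "%m/%d/%y %H:%M:%S"

def elapsed(start, end):
    """Return the time between two timestamps as a timedelta."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)

# Drive in 6/7 removed, then drive in 10/9 ungrouped.
delta = elapsed("3/23/04 10:18:09", "4/13/04 16:45:54")
hours, remainder = divmod(delta.seconds, 3600)
minutes, seconds = divmod(remainder, 60)
print(f"{delta.days} days, {hours} h {minutes} min {seconds} s")
```

The same helper can be applied to any pair of event timestamps once they are copied from the log in the assumed format.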
Questions
1. Sequence number 8092 — How many events are logged in association with the
controller syncing up its time with the SMA?
................................................................................................................
2. Sequence number 8121 — Based on the number and type of check conditions
within the last 24 hours, is the drive in 10/9 a candidate for replacement?
................................................................................................................
3. Sequence number 8139 — Is there any previous indication of problems on the
drive in 6/7?
................................................................................................................
................................................................................................................
4. Sequence number 8143 — What does this event indicate?
................................................................................................................
................................................................................................................
5. Sequence number 8147 — Does this event indicate that the same drive removed
from 6/7 was just reinserted back into the storage system?
................................................................................................................
................................................................................................................
6. Sequence number 8194 — Is the drive in 5/11 a candidate for replacement?
................................................................................................................
................................................................................................................
7. Sequence number 8223 — How many days have passed since the last check
condition error was reported by the drive in 10/9?
................................................................................................................
8. Sequence number 8282 — How many check condition errors took place on the
drive in 6/8 prior to this event being logged?
................................................................................................................
9. Sequence number 8285 — Should the drive in 6/8 be replaced?
................................................................................................................
10. Sequence number 8385 — What caused the migration of data off of the 10/9
drive to start taking place?
................................................................................................................
................................................................................................................
................................................................................................................
11. Sequence number 8388 — What is the cause for the start of this leveling event?
................................................................................................................
................................................................................................................
12. Sequence number 8428 — What caused the migration of data off of the 6/8
drive to start taking place?
................................................................................................................
................................................................................................................
................................................................................................................
13. Sequence number 8456 — Does this event take place during or after the 6/8
drive data migration?
................................................................................................................
................................................................................................................
14. Sequence number 8484 — Can drives that have been rendered inoperable
have their data migrated off of them?
................................................................................................................
15. Sequence number 8487 — Is this the first time the 0103 disk group transitioned
to a state of “Disk Group with no redundancy is inoperative”? Why?
................................................................................................................
16. Sequence number 8490 — The volume begins to be reconstructed. How long
after the volume went missing did this event take place?
................................................................................................................
17. Sequence number 8495 — Why didn’t the disk group indicate the start of a
leveling event after the successful completion of reconstruction?
................................................................................................................
................................................................................................................
................................................................................................................
18. Sequence number 8495 — What are the possible reasons for the merge event
taking place?
................................................................................................................
................................................................................................................
................................................................................................................
19. Sequence number 8559 — Why are we still getting SMA communication
failures to the drive in 6/8? How long ago was the data reconstructed
off of the 6/8 drive?
................................................................................................................
................................................................................................................
................................................................................................................
20. Sequence number 8559 — Has the merge (which started on sequence number
8495) completed? If not, why not?
................................................................................................................
................................................................................................................
21. Sequence number 8559 — Which service actions should be taken on this
storage system?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Customer scenario
A customer had the following problem:
Their storage system went offline and approximately 10 minutes later came back
online.
The problem was first noticed:
On 3/29/04 at approximately 3:30 AM
The following customer and service actions were taken:
None
The problem is now:
Fixed, but under investigation
Analyze
The customer would like root cause analysis for the outage.
Configuration information
The configuration is the following:
2C18D configuration
Loop switches are present
Questions
1. Sequence number 13749 — Is the directed LIP activity isolated to loop 1B?
................................................................................................................
2. Sequence number 13630 — The 020E controller has issued directed LIPs to both
loops of the loop 1 pair. What is the only device that could be causing both
controllers on both loops to issue directed LIPs to all devices?
................................................................................................................
................................................................................................................
3. Sequence number 13753 — This event indicates the 0219 controller is about to
perform which action (within the next couple of seconds or milliseconds)?
................................................................................................................
................................................................................................................
4. Sequence number 13759 — What is the cause for this controller resync?
................................................................................................................
................................................................................................................
5. Sequence number 13655 — How many loop switches were detected by the two
controllers? How many should have been detected?
................................................................................................................
................................................................................................................
6. Sequence number 13788 — How many directed LIPs were issued to the 1A loop
before the 0219 controller transitioned it to the Failed state?
................................................................................................................
7. Sequence number 13689 — Is this a standard controller startup message?
................................................................................................................
8. Sequence number 13805 — How many times has the 0219 controller marked
the 1A loop as Failed since the controller began booting?
................................................................................................................
9. Sequence number 13814 — Prior to controller 0219 failing the 1B loop for the
3rd time since boot, did it indicate it was going to enable the 1A loop?
................................................................................................................
................................................................................................................
10. Sequence number 13702 — At this point, all four loop pair 1 loops have failed
how many times since the controllers were booted?
................................................................................................................
................................................................................................................
11. Sequence number 13816 — What does the error message “An HSV110
controller has failed” indicate?
................................................................................................................
................................................................................................................
................................................................................................................
12. Sequence number 13704 — Since the errors were first logged against the drive
in 2/4, how much time has elapsed?
................................................................................................................
13. Sequence number 13705 — In a normal storage system startup sequence, which
boot events are typically logged prior to this event?
................................................................................................................
................................................................................................................
14. Sequence number 13718 — This is the first message associated with the LID
recovery code attempting to recover access to the devices on the indicated loop
port. Is this event logged by both controllers or just the master controller?
................................................................................................................
................................................................................................................
15. Sequence number 13721 — How much time passed between the LID recovery
code start and the bypassing of the troublesome drive in 2/4?
................................................................................................................
16. Sequence number 13724 — Why did this controller resync take place?
................................................................................................................
................................................................................................................
17. Sequence number 13736 — With the drive in 2/4 now bypassed, how many
loop switches were detected by the controllers on boot?
................................................................................................................
18. Sequence number 13737 — Which volume just transitioned to the Missing state?
................................................................................................................
................................................................................................................
19. Sequence number 13838 — Which device has just transitioned from the Failed
to the Normal state?
................................................................................................................
................................................................................................................
20. Sequence number 13872 — Which device has just transitioned from the Failed
to the Normal state? Now that both have transitioned to this state, how much
time passes before the storage system transitions to the online state?
................................................................................................................
................................................................................................................
21. Sequence number 13873 — How much time has passed since the controllers
originally resynced due to a VRAID1 inoperative condition?
................................................................................................................
................................................................................................................
22. Sequence number 13874 — Is this event cause for concern?
................................................................................................................
................................................................................................................
................................................................................................................
23. Sequence numbers 13878 through 13881 — Did the two disk groups level at the
same time, or did they level one after the other?
................................................................................................................
................................................................................................................
24. Sequence number 13882 — What are the possible causes for this event being
logged? Based on the rest of the controller event log, should any service actions
be taken on the indicated disk?
................................................................................................................
................................................................................................................
................................................................................................................
25. Sequence number 13889 — At this time, what service actions should be taken
on this storage system?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Customer scenario
A customer had the following problem:
Their storage system went offline and approximately 10 minutes later came back
online.
The problem was first noticed:
On 3/29/04 at approximately 3:28 AM
The following customer and service actions were taken:
On 3/29/04 at approximately 13:41, a drive was removed and replaced.
The problem is now:
Fixed, but under investigation
Analyze
The customer would like root cause analysis for the outage.
Configuration information
The configuration is the following:
2C18D configuration
Loop switches are present
Questions
1. Sequence number 13802 — Which drive is the apparent cause for the directed
LIPs on loop 1A?
................................................................................................................
2. Sequence number 13811 — How many directed LIPs to ALPA FF were done to
loop 1A prior to the 1334 controller telling the Failed loop 1B to enable itself?
................................................................................................................
................................................................................................................
3. Sequence number 13814 — What is the likely component that is causing both
loop 1A and loop 1B to fail?
................................................................................................................
................................................................................................................
4. Sequence number 13817 — Why was the disk on 5/4 rendered inoperable?
................................................................................................................
................................................................................................................
5. Sequence number 13819 — Why is the 1334 controller resyncing? Is there any
event notification yet of the 12F9 controller resyncing?
................................................................................................................
................................................................................................................
................................................................................................................
6. Sequence number 13829 — When the 1334 controller boots, how many loop
switches does it find? What does this indicate?
................................................................................................................
................................................................................................................
................................................................................................................
7. Sequence number 13872 — At this point, for controller 1334, what is the current
state of the two loop pair 1 loops?
................................................................................................................
................................................................................................................
................................................................................................................
15. Sequence number 13630 — Which recovery process is this event the start of?
................................................................................................................
................................................................................................................
16. Sequence number 13632 — Which device was bypassed by the LID recovery
code? Is this the device you would have expected to be
bypassed?
................................................................................................................
................................................................................................................
17. Sequence number 13632 — How long did the LID recovery code take to bypass
the LID drive?
................................................................................................................
18. Sequence number 13650 — Is it possible for a volume to transition from the
Missing to the Reverting state without first transitioning to the Reconstructing
state?
................................................................................................................
19. Sequence number 13776 — Why was this event reported?
................................................................................................................
................................................................................................................
20. Sequence number 13901 — How long has it been since the 1334 controller
transitioned to the Failed state?
................................................................................................................
................................................................................................................
21. Sequence numbers 13906, 13907 and 13908 — Why are 3 disks becoming
quorum disks? Can you determine where all system quorum disks will be
physically located after these three events take place?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
22. Sequence number 13911 — If the customer had VRAID5 virtual disks carved out
of disk group 0102, is all of that data now forever lost?
................................................................................................................
................................................................................................................
23. Sequence number 13922 — How many virtual disks transitioned to the Failed
state?
................................................................................................................
24. Sequence number 14037 — Approximately how long is it taking for the volumes
to complete their reverting processes?
................................................................................................................
................................................................................................................
25. Sequence number 14085 — Approximately how long has it been since the first
controller resynced (the system being brought back online)?
................................................................................................................
................................................................................................................
26. Sequence number 14087 — How many drives are running 3BE7 firmware?
Which caution exists with drives that are running this firmware version?
................................................................................................................
................................................................................................................
................................................................................................................
27. Sequence number 14184 — Which VRAID types will now be online to disk
group 0103?
................................................................................................................
................................................................................................................
28. Sequence number 14185 — Which disk group is the indicated virtual disk likely
to be carved out of?
................................................................................................................
................................................................................................................
29. Sequence number 14232 — Have all of the virtual disks, which were initially
marked as Failed, now transitioned to the Normal state?
................................................................................................................
................................................................................................................
30. Sequence number 14293 — Which volume is being reconstructed? Which disk
group does this volume belong to?
................................................................................................................
................................................................................................................
31. Sequence number 14302 — Have all of the disk groups transitioned to a
Normal state?
................................................................................................................
32. Sequence number 14328 — Why would the EMU be attempting to assign a
hard address to any device at this time?
................................................................................................................
................................................................................................................
................................................................................................................
33. Sequence number 14332 — What is this event a likely result of?
................................................................................................................
................................................................................................................
34. Sequence number 14332 — Are any service actions required on this storage
system at this time?
................................................................................................................
Customer scenario
A customer had the following problem:
OCP of one controller displays “FC Loop Misconfigured: Restart”
OCP of the other controller displays “STsys has been lost”
The problem was first noticed:
On 30-Mar-2006 at approximately 13:00
The following customer and service actions were taken:
EMUs reseated
The problem is now:
Resolved, but customer would like to know the cause
Analyze
Configuration information
The configuration is the following:
EVA6000 2C8D.
Customer scenario
A customer experienced the following problem:
Single Port on Fibre events logged on Loop 2B
All devices in shelf 13 showing single path on Loop 2A
The problem was first noticed:
On 11-12 September.
The following customer and service actions were taken:
No actions were taken by the customer or Services.
Analyze.
Configuration information
Configuration information: Not relevant.
Customer scenario
A customer experienced the following problem:
Multiple disk failures in his system
The problem was noticed:
On 10 October.
The following customer and service actions were taken:
No actions were taken by the customer or Services.
Analyze.
Configuration information
The configuration is the following:
Configuration information: Not relevant.
Customer scenario
A customer experienced the following problem:
LID recovery with VCS v3.014
The following customer and service actions were taken:
No actions were taken.
Analyze.
Configuration information
Configuration information: Not relevant
Customer scenario
A customer experienced the following problem:
Loop 1A and 2A logging
Exchange complete with missing data in the same time frame
The following customer and service actions were taken:
Actions taken: None.
Analyze.
Configuration information
Configuration information: Not relevant.
Customer scenario
A customer had the following problem:
End users of mail server reported intermittent hangs of up to 2–3 minutes.
These hangs were occurring 4–5 times a day.
The problem was noticed:
May, 2008
The following customer and service actions were taken:
Checked all GroupWise (mail server) applications, NetWare, and network
configurations and logs
Checked EVA logs through Navigator, SAN switch commands
Collected EVAPerf data for 24 hours and analyzed with PerfMonkey
Replaced both Fibre Channel cables, the HBAs, and the SFPs on the SAN switches
Upgraded the QLogic driver and the SAN switch firmware
The problem is now:
Resolved, but needs to be confirmed
Analyze.
Configuration information
The configuration is the following:
EVA4100 2C1D configuration
XCS 6.110
Fourteen 146GB 10K RPM disks running the latest firmware
Two servers attached (one file server and one mail server)
Customer scenario
A customer had the following problem:
Many disks were failing
The problem was first noticed:
May, 2008
The following customer and service actions were taken:
Reseating failed drives did not fix the problem.
Issued debug commands 34/35/36 to track suspected disks (1B loop would go
down).
The output from debug command 36 indicated many bad words on enclosure 2
starting at Bay 5.
Because drives downstream were suspected, the drive in enclosure 2, Bay 4 was
ungrouped, which appeared to clear up the problem.
When the drive was replaced and grouped around noon, the 1B loop would not
stay up and disk errors were logged in the controller event log.
The problem is now:
Resolved.
Analyze.
Configuration information
The configuration is the following:
EVA5000, 2C12D configuration
VCS 3.028
This lab allows you to explore many of the features of EVAPerf, including command
line and GUI features. The lab activities concentrate on how to use the tool, not how
to analyze the data that you retrieve.
Note
Examples in this lab pertain to the EVA8000; however, they apply just as easily
to the EVA4400/6400/8400 series.
Performance analysis is complex. Often, wrong conclusions are drawn from extracted
data due to an insufficient understanding of the operations of all components
involved in the analysis.
Without baseline performance numbers for a given system’s operation, it can be very
challenging to determine if a system’s performance is at or near optimal.
The performance of the EVA is impacted by many factors, some of which include:
The number of disks in a disk group
The total number of disk groups
The speed (10K or 15K RPM) of the drives used in a disk group
The number of virtual disks carved out of a disk group
The total number of LUNs being served by the storage system
The type of host I/O operations taking place (random versus sequential
I/O patterns and large versus small I/O data transfer sizes)
The types of ongoing EVA operations, such as:
Snapshots being used
Snapshots or snapclones being created
Data reconstruction or migration taking place
Virtual disks or other objects being created
The performance of the system beyond the EVA is impacted by many factors, some of
which include:
The number of hosts accessing the storage system
The fabric infrastructure used to access the storage system
Host computer properties, such as:
Application types and the number of applications being used
Operating system type being used
Number and speed of processors
Amount and types of cache
Amount and types of RAM
Number and types of HBAs being used
Number and type of buses on the host computer
Note
See the performance analysis white paper on the course CD for more
information about EVA performance analysis.
EVAPerf enables you to monitor and display EVA performance metrics for:
Replication data
Host connection data
Host port statistics
Physical disk data
Port status
Storage array data
Storage controller data
Virtual disk data
You can display performance metrics graphically in the Windows Perfmon utility or
you can display metrics in tabular form in a command prompt window (using
EVAPerf from the command line). You can also output the metrics in CSV (comma-
separated value) or TSV (tab-separated value) format for use with external
applications, such as Microsoft Excel.
EVA Data Collection Service (evapdcs.exe). Gathers data from EVAs that
are visible to the host and stores it in a memory cache. You can then use either
evaperf.exe or evapmext.dll to retrieve and view the information.
The service is set to manual when you install EVAPerf. When you run
evaperf.exe, the service starts and remains running until you reboot the host.
You can also start and stop the service using Windows Service Manager.
EVAPerf TLViz Formatter. Three files are installed to support the TLViz Formatter:
EVAPerf-TLViz-Format.exe — A file that formats the EVAPerf all command
output so you can view it with the TLViz tool.
EVADATA.MDB — A Microsoft Access database template used to view the
all command output in a database. The data from the all command
resides in individual tables.
MSADODC.OCX — A file necessary to operate the TLViz Formatter
interface. This should be located in c:\windows\system32.
Friendly names
EVAPerf queries the management server for storage system information from all EVAs
it is managing. The storage system information queried includes:
Storage system WWNs
Virtual disk WWNs
Host connection identifiers
Disk group identifiers
This information is stored in a text file named fnames.conf.
Note
The friendly names file must reside in the same directory as the EVAPerf tools.
The fnames.conf file is used to associate the storage system WWN, virtual disk
WWN, host connection, and disk group name information, with human-readable
names. These friendly names improve the readability of performance reports and
data.
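The association that the friendly names file provides can be pictured as a lookup table. The Python sketch below is illustrative only: the WWN and the friendly name are invented, and because the on-disk layout of fnames.conf is not shown in this lab, the mapping is built by hand rather than parsed from the file.

```python
# Hypothetical WWN-to-name mapping; both the WWN and the name are invented.
FRIENDLY_NAMES = {
    "6005-08B4-0001-0000-0000-0000-0000-0001": "mail_vdisk01",
}

def friendly(identifier, names=FRIENDLY_NAMES):
    """Return the friendly name for an identifier, or the identifier itself if unknown."""
    return names.get(identifier, identifier)

print(friendly("6005-08B4-0001-0000-0000-0000-0000-0001"))  # mail_vdisk01
print(friendly("unknown-wwn"))                               # unknown-wwn
```

Falling back to the raw identifier keeps a report readable even when an object has no friendly name yet.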
2. When prompted for the password, enter the account password that was created
during Command View EVA installation.
EVAPerf verifies that it can access the Command View EVA before adding the
information to the fnamehosts.conf file.
If you enter evaperf fnh without arguments, it displays a list of known
management servers running Command View EVA.
3. Locate the fnamehosts.conf file and open it with a text editor.
Note
At this point, you should only have a file named fnamehosts.conf.
Note
Every time you update the friendly names file, a backup (fnames_conf.bak) is
made of the previous version.
5. Locate and open the friendly names file, then verify that it is updated with the
new virtual disks.
A sample output is given below.
Note
If you reinstall EVAPerf, the fnames.conf file is removed. Therefore, save a copy
of the file in a different directory before uninstalling EVAPerf. After the newer
version of EVAPerf is installed, copy the saved fnames.conf back to the
installation directory.
To use the short name you entered in this file, add the –cn modifier to a command
you enter in the EVAPerf CLI. The short name is substituted when a long name is
encountered.
You can use Windows Perfmon to monitor and view EVA performance data.
The following example shows one of several ways to access and configure Perfmon
to display EVA performance metrics.
Note
It is assumed that you are somewhat familiar with Perfmon.
4. In the Select counters from computer drop-down list box, select the computer
where the EVA performance tools are running.
5. In the Performance object drop-down list box, select an HP EVA object to
monitor (for example HP EVA Storage Array).
6. Click All counters to select every listed counter, or select only those counters you
are interested in.
Note
If you click a counter in the list, you can click the Explain button for a brief
explanation of the meaning of the selected performance counter.
7. Click All instances to select every listed performance object type, or select those
instances you are interested in.
8. Click the Add button to add the counters to the window.
9. Click Close to close the Add Counters dialog box.
10. Using IOmeter or Winthrax, generate as much I/O activity as possible from your
host to a storage system virtual disk.
Note
Open IOmeter or Winthrax from a shortcut on your desktop or course CD.
Note
To add other objects, repeat the steps or use the plus “+” icon. To remove
metrics, select the metric from the list and click the remove icon (to the right of the
+ icon) in the toolbar.
5. When the next dialog appears, select the Add Counters button.
6. Select the EVA object that you want to log.
7. Select the counters and instances for the selected object, then click Add.
8. Repeat the above steps for each object you want to log.
9. Click the Close button.
10. Select the Log Files tab to specify the log file type (select a type of text file), then
click the Configure button.
11. Specify the log file location and name, and then click OK.
12. Select the Schedule tab to schedule running the Perfmon job, then click OK.
13. After running the job for a while, view the log from the Perfmon system monitor
by selecting the View Log Data button (Ctrl+L).
14. Select Log files as the data source and then click Add; then, when prompted for
the file, locate the file you created and click OK.
You can use a command prompt window to display EVA performance data in a
tabular format. You can also output the tabular data in CSV (comma-separated
value) or TSV (tab-separated value) format.
The following example assumes that you are familiar with command prompt use. The
example shows running EVAPerf to display storage cell tabular data.
1. Click Start → Run, enter cmd, and then click OK.
A command prompt window appears.
2. Change to the directory where the EVA Performance Tools are installed (for
example, c:\Program Files\Hewlett-Packard\EVA Performance Monitor).
3. To display only a summary of storage system information, enter evaperf ls.
The window displays a summary of current storage system data. This window is
not refreshed.
Important
See Appendix B for a list of supported qualifiers when running the command
prompt option.
Note
To stop the continuous display, press Ctrl+c.
Answers
1. evaperf all
2. evaperf as -csv
3. evaperf cs -cont 5 -dur 10
4. evaperf cs -cont 5 -dur 20 -tsv -fo name.txt
5. evaperf hc
6. evaperf hps -tsv
7. evaperf luns -nh
8. evaperf pdg
9. evaperf pdg -cont 5 -dur 20 -KB -fo name.txt
10. evaperf ps -tsv -ts1
11. evaperf vd -cont 5 -tsv -fvd vdisk1 vdisk2
12. evaperf ls -csv -fo name.txt
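The answer command lines above all follow the same pattern: a command followed by optional modifiers. As a rough illustration, the Python sketch below assembles such a command line from its parts. It emits only modifiers that appear in this lab (-cont, -dur, -csv, -tsv, and -fo); the helper name evaperf_args is hypothetical and is not part of EVAPerf.

```python
def evaperf_args(command, cont=None, dur=None, fmt=None, outfile=None):
    """Build an argument list for an evaperf command line.

    Only modifiers described in this lab guide are emitted:
    -cont n, -dur n, -csv or -tsv, and -fo filename.
    """
    args = ["evaperf", command]
    if cont is not None:
        args += ["-cont", str(cont)]      # sampling interval in seconds
    if dur is not None:
        args += ["-dur", str(dur)]        # total duration in seconds
    if fmt is not None:
        if fmt not in ("csv", "tsv"):
            raise ValueError("fmt must be 'csv' or 'tsv'")
        args.append("-" + fmt)            # output format
    if outfile is not None:
        args += ["-fo", outfile]          # also copy output to this file
    return args

print(" ".join(evaperf_args("cs", cont=5, dur=20, fmt="tsv", outfile="name.txt")))
# prints: evaperf cs -cont 5 -dur 20 -tsv -fo name.txt
```

Returning a list rather than a single string makes it straightforward to hand the result to a process launcher without worrying about quoting file names.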
Note
Allow the test to run for a couple of minutes to get a more accurate determination
of average MB/s performance.
5. Create a disk group with as many disks as possible with a disk failure protection
level of Single.
6. Create one 5GB VRAID5 virtual disk from the disk group and set the Preferred
path/mode as Path B-Failover only.
7. Create one 10GB VRAID1 virtual disk from the disk group and set the Preferred
path/mode as Path A-Failover only.
8. Present both created virtual disks to the Windows host.
9. Using Winthrax or any other I/O generation utility, generate as much I/O
activity as possible to the two virtual disks.
10. Using EVAPerf, determine how many MB/s you are able to transfer on average
between the Windows server and the EVA storage system. Write the achieved
number in MB/s below.
................................................................................................................
11. Calculate the difference in MB/s between the two tests and write the result
below.
................................................................................................................
Note
When this same test was performed on a first generation EVA lab system with 75
10K RPM disks, the difference between the two tests was 38MB/s.
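The comparison in the last two steps is simple arithmetic: average the MB/s readings from each test run, then subtract. A minimal Python sketch follows; the readings are invented for illustration and do not come from any real system.

```python
from statistics import mean

def average_mb_per_s(samples):
    """Average a list of MB/s readings taken during one test run."""
    return mean(samples)

# Invented readings for illustration only.
first_test = [100.0, 110.0, 105.0]
second_test = [140.0, 150.0, 145.0]

difference = average_mb_per_s(second_test) - average_mb_per_s(first_test)
print(f"{difference:.1f} MB/s")  # 40.0 MB/s
```

Averaging over several readings, rather than taking a single value, follows the advice to let each test run for a couple of minutes.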
HP EVA DR tunnels
The HP EVA DR tunnels object reports the intensity and behavior of the link traffic
between source and destination arrays. The counters for this object display
information only if there is at least one active DR group on the array. Otherwise, only
the header appears. You can display metrics in either MB/s or KB/s.
Although some arrays allow up to four open tunnels on a host port, only one tunnel is
active for a single DR group. Multiple DR groups can share the same tunnel. Statistics
for each tunnel are reported by both the source and destination arrays, but the
directional counters are complementary.
The counters are:
Round Trip Delay — The average time, in milliseconds, for a signal (ping) to
travel from the source to the destination and back. In replication traffic, the
signal is queued behind data transmissions, which increases the round trip
delay. If the destination controller is busy, the value also increases. Round trip
delay is reported for all active tunnels.
Copy Retries — The number of copies from the source EVA that were
retransmitted due to a failed copy transmission. Each retry creates a 128KB
copy. Retries are reported by both the source and destination arrays.
Write Retries — The number of writes from the source EVA that were
retransmitted due to a failed write to the destination EVA. Each retry creates an
8KB copy. If the write contains multiple 8KB segments, only the failed segments
are retransmitted. Retries are reported by both the source and destination
arrays.
Copy In MB/s — The rate at which data is copied to an array to populate the
members of a DR group with data when an initial copy or full copy is
requested.
Copy Out MB/s — The rate at which data is copied from an array to populate
the members of a DR group with data when an initial copy or full copy is
requested.
Write In MB/s — The rate at which data is written to an array because of write
activity to the members of the source array. The write activity includes host
writes, merges, and replication retries. A merge is an action initiated by the
source array to write new host data that has been received and logged while a
replication write to the destination array was interrupted, and now has been
restored.
Write Out MB/s — The rate at which data is written from an array because of
write activity to the members of the source array. The write activity includes host
writes, merges, and replication retries.
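The retry counters above can be turned into an estimate of retransmitted data, because each copy retry is described as a 128KB copy and each write retry as an 8KB copy. The Python sketch below assumes that every counted write retry corresponds to exactly one 8KB segment; that is an assumption made for illustration, and the function name is hypothetical.

```python
COPY_RETRY_KB = 128   # size of each copy retry, per the Copy Retries description
WRITE_RETRY_KB = 8    # size of each write retry, per the Write Retries description

def retransmitted_kb(copy_retries, write_retries):
    """Estimate the data retransmitted over a tunnel, in KB."""
    return copy_retries * COPY_RETRY_KB + write_retries * WRITE_RETRY_KB

print(retransmitted_kb(10, 100))  # 2080
```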
Note
For each counter, the results are an average of all disks in the disk group.
Note
On the HSV100 series of controllers, only average latency—the average of read
and write latencies—is reported. On the HSV200 series of controllers, separate
metrics are provided for read and write latency.
Average Read Req/s — The number of read requests (per second) sent to
physical disks.
Average Read MB/s — The rate at which data is read (per second) from
physical disks.
Average Read Latency — The average time it takes for a disk to complete a
read request. This average is weighted by requests per second. (HSV200
controller series only).
Average Write Req/s — The number of write requests (per second) sent to
physical disks.
Average Write MB/s — The amount of data written (per second) to physical
disks.
Average Write Latency — The average time it takes for a disk to complete a
write request. This average is weighted by requests per second. (HSV200
controller series only)
Number of Disks — The number of disks in the disk group.
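The latency counters above are described as averages weighted by requests per second. A minimal Python sketch of such a weighted average follows, assuming each disk contributes its latency in proportion to its request rate. The exact calculation that EVAPerf performs is not documented in this lab, so treat this as an illustration of the idea only.

```python
def weighted_latency(disks):
    """Average latency weighted by request rate.

    disks: iterable of (requests_per_second, latency_ms) pairs, one per disk.
    """
    disks = list(disks)
    total_reqs = sum(reqs for reqs, _ in disks)
    if total_reqs == 0:
        return 0.0  # no traffic, so there is no meaningful latency
    return sum(reqs * latency for reqs, latency in disks) / total_reqs

# A busy, slower disk pulls the weighted figure above the plain mean of 7.0 ms.
print(weighted_latency([(100, 5.0), (300, 9.0)]))  # 8.0
```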
Write Req/s — The number of write requests per second completed to a virtual
disk that were received from all hosts. Write requests may include transfers from
a source array to this array for data replication and host data written to
snapshot or snapclone volumes.
Write Data Rate — The rate at which data is written to the virtual disk by all
hosts and includes transfers from the source array to the destination array.
Write Latency — The average time it takes to complete a write request (from
initiation to receipt of write completion).
Flush Data Rate — The rate at which data is written to a physical disk for the
associated virtual disk. The sum of flush counters for all virtual disks on both
controllers is the rate at which data is written to the physical drives and is equal
to the total host write data. Data written to the destination array is included.
Host writes to snapshots and snapclones are included in the flush statistics, but
data flow for internal snapshot and snapclone normalization and copy-before-
write activity are not included.
Mirror Data Rate — The rate at which data travels across the mirror port to
complete read and write requests to a virtual disk. This data is not related to the
physical disk mirroring for VRAID1 redundancy. Write data is always copied
through the mirror port when cache mirroring is enabled for redundancy. In
active/active controllers, this counter includes read data from the owning
controller that must be returned to the requesting host through the proxy
controller. Reported mirror traffic is always outbound from the referenced
controller to the other controller.
Prefetch Data Rate — The rate at which data is read from the physical disk to
cache in anticipation of subsequent reads when a sequential stream is detected.
Note that a sequential data stream may be created by host I/O and other I/O
activity that occurs because of a DR initial copy or DR full copy.
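The Flush Data Rate description above states that the flush rates of all virtual disks on both controllers add up to the rate at which data is written to the physical drives. That relationship can be sketched as a simple sum; the controller names, virtual disk names, and rates below are invented for illustration.

```python
# (controller, virtual disk) -> flush data rate in MB/s; all values are invented.
flush_rates = {
    ("Controller A", "vdisk1"): 12.5,
    ("Controller A", "vdisk2"): 3.0,
    ("Controller B", "vdisk1"): 0.5,
    ("Controller B", "vdisk3"): 8.0,
}

def total_flush_rate(rates):
    """Sum the flush rates of every virtual disk on every controller."""
    return sum(rates.values())

print(total_flush_rate(flush_rates))  # 24.0
```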
Note
The following commands are from EVAPerf version 9.0.
Command options
Command options are included in the following table.
Command                       Description
all (or nall)                 Displays a summary of the array status by running the following
                              commands together: ls, as, cs, vd, vdg, hc, ps, hps, pd, pdg, drg,
                              and drt
as                            Displays array status
cs                            Displays controller status
cvconfig                      Configures Command View EVA login parameters to be used for state
                              data collection
dginfo                        Displays disk group configuration
drg                           Displays data replication groups
drt                           Displays data replication tunnel statistics
dpw wwn                       Deletes the password for the specified array from the host’s
                              Windows registry. The password is not deleted from the array.
fnh                           Displays a list of known friendly-name Command View EVA hosts and
                              adds a friendly name host to the friendly name host list
fn                            Reloads friendly names from known Command View EVA hosts to the
                              fnames.conf file
h, help, or evaperf           Displays help for EVAPerf
hc                            Displays host connections. The Port column in the output does not
                              display data for the HSV2x0 series of controllers (-a appears in
                              the Port column).
hist                          Displays historical information and state data
hps                           Displays host port statistics
ls                            Displays EVA storage systems visible to the host
luns                          Displays LUNs visible to this host
mof                           Displays output for the ls, as, cs, vd, vdg, hc, ps, hps, pd, and
                              pdg commands and saves the output for each command in a separate
                              file. The -csv and -od modifiers are required.
pd                            Displays physical disk data
pda                           Displays statistics for physical disk activity
pdg                           Displays the total physical disk data by disk group
pfa                           Sets the array filter list in Windows Perfmon
pfd                           Deletes the filter configurations for Windows Perfmon
pfh                           Displays help for the Windows Perfmon filter commands
pfs                           Displays the filter configuration for Windows Perfmon
pfvd                          Sets the virtual disk filter list in Windows Perfmon
ps                            Displays port status
rc                            Resets the error counters reported by the ps command
server host port [username]   Configures the EVAPerf RPC server for a remote client
sfn                           Shows the friendly name map
showcvconfig                  Displays Command View EVA login parameters used for state data
                              collection
spw wwn password              Sets the password for an array. This password must match the
                              password entered on the controller OCP of the array.
vd                            Displays virtual disk statistics for only those virtual disks that
                              have been presented to a host
vdg                           Displays total LUN activity by disk group
vdts                          Displays virtual disk transfer size histograms. This command is
                              only available on the HSV2x0 series of controllers.
vdtsg [lunwwn]                Graphs virtual disk transfer size histograms for all LUNs or a
                              given WWN. This command is only available on the HSV2x0 series of
                              controllers.
vdrl                          Displays virtual disk read latency histograms. This command is
                              only available on the HSV2x0 series of controllers.
vdrlg [lunwwn]                Graphs virtual disk read latency histograms for all LUNs or a
                              specific WWN. This command is only available on the HSV2x0 series
                              of controllers.
vdwl                          Displays virtual disk write latency histograms. This command is
                              only available on the HSV2x0 series of controllers.
vdwlg [lunwwn]                Graphs virtual disk write latency histograms for all LUNs or a
                              specific WWN. This command is only available on the HSV2x0 series
                              of controllers.
verifycvconfig                Verifies that Command View EVA is accessible for state data
                              collection
vpw                           Verifies array passwords for use with EVAPerf
Modifier        Description
-cn             Substitutes friendly names from the fnames.dict file.
-cont n         Runs an EVAPerf command continuously. You can specify the interval in
                seconds by adding a number (n). Otherwise, the default interval is one
                second. Press Ctrl+c to exit from this mode.
-csv            Displays data in CSV (comma-separated value) format and automatically
                includes a time stamp. The time stamp format can be modified using the
                -ts1 or -ts2 modifiers.
-dur n          Specifies the duration of a continuous mode session. For example, if you
                enter evaperf vd -csv -cont -dur 30, virtual disk data is displayed in
                CSV format at one-second intervals for a total of 30 seconds.
-fd keyword     Displays data that contains the specified keywords. You must enter at
                least one keyword. To enter multiple keywords, separate each keyword with
                a space. For example, if you enter evaperf -fd test prelim good, the data
                that displays contains the words test, prelim, and good.
-fnid           Displays the WWN, group, DRM group, and host name along with the
                corresponding friendly names.
-fo filename    Copies output to a file as well as displaying it in the command prompt.
                You can combine this modifier with -cont and -dur for a fixed-time data
                capture. For example, if you enter evaperf vd -cont 5 -dur 30 -fo
                capture.log, virtual disk data is displayed at five-second intervals for
                a total of 30 seconds and is also written to the capture.log file.
-fvd vdisk      Limits virtual disk data collection to the specified virtual disk(s). You
                must enter at least one virtual disk. You can also combine this modifier
                with -sz to limit data collection to the specified array(s). For example,
                if you enter evaperf vd -fvd test1 test2 -sz server1, data is collected
                for virtual disks test1 and test2 on array server1 only. You can use this
                modifier with the vd, vdrl, vdwl, and vdts commands.
-KB             Displays output data in kilobytes per second (1024). The default is
                megabytes per second (1,000,000).
-nfn            Specifies that friendly names should not be used.
-nh             Specifies that no headings be included in CSV (comma-separated value)
                output.
-nots           Specifies that a time stamp not be included in the CSV output.
-od             Specifies the directory in which the output files from the mof command
                are saved.
-sz             Filters the arrays that are interrogated.
-tlc            Displays TLC-compliant data for the mof command.
-tsv            Displays output in TSV (tab-separated value) format with a time stamp.
-ts1            Adds a time stamp to the -csv output in the following format:
                Fri Jul 23 16:23:05 2004.
-ts2            Adds a time stamp to the -csv output in the following format:
                23/Jul/2004 16:23:05 2004. This is the default format.
-us             Displays times in microseconds. By default, latencies are displayed in
                milliseconds (ms), and a time of less than one millisecond appears as
                zero. Use the -us option to show times in microseconds for more accuracy.
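Two of the modifier descriptions above translate directly into code. The Python sketch below parses a time stamp in the -ts1 layout shown in the table, and converts a rate from MB/s to KB/s using the unit sizes given for -KB (1,000,000 bytes per megabyte and 1024 bytes per kilobyte). The function names are illustrative and are not part of EVAPerf, and the parsing assumes English day and month abbreviations.

```python
from datetime import datetime

TS1_FORMAT = "%a %b %d %H:%M:%S %Y"  # layout of "Fri Jul 23 16:23:05 2004"

def parse_ts1(stamp):
    """Parse a -ts1 style time stamp (assumes an English locale)."""
    return datetime.strptime(stamp, TS1_FORMAT)

def mb_per_s_to_kb_per_s(mb_per_s):
    """Convert MB/s (1,000,000 bytes) to KB/s (1024 bytes), per the -KB description."""
    return mb_per_s * 1_000_000 / 1024

print(parse_ts1("Fri Jul 23 16:23:05 2004").isoformat())  # 2004-07-23T16:23:05
print(mb_per_s_to_kb_per_s(1.0))                           # 976.5625
```

Because the two units use different bases, 1 MB/s is slightly less than 1000 KB/s, which is worth keeping in mind when comparing figures captured with and without -KB.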
This lab allows you to familiarize yourself with the most important aspect of EVA
performance assessment data collection, that is, using and manipulating the TLViz-
formatted output files.
You will be able to take a given EVAPerf output file in comma-separated value (CSV)
format, use it as input to TLViz Formatter, and generate individual EVAPerf CSV files.
You can then examine, format, and graph these files in Excel. Another exercise
allows you to use the TLViz Formatter to build the Access database and run queries.
As you go through the lab, remember to:
Read and perform all of the lab steps that you can in the allowable time.
Concentrate on those tasks that might help you to diagnose and troubleshoot
performance problems with the array.
Before generating individual TLViz output files, it is useful to view the output of the
evaperf all command in a single Excel file.
Perform the following and answer the questions:
1. Locate and unzip the All Files.zip file.
Note
This file is available as part of the course files in the folder called 03-EVAPERF
Formatter.
2. Copy the EVA2.csv file to an empty directory. This is the raw comma-separated
value output from the evaperf all command.
3. Open EVA2.csv with Excel.
a. Which information is stored in this file?
...........................................................................................................
...........................................................................................................
b. How would you track a specific virtual disk or disk group read miss latency
over the entire file capture?
...........................................................................................................
...........................................................................................................
4. Close the file but do not save changes.
Note
If needed, your instructor will point you to the location of the TLViz Formatter.
e. PhyDiskStats.csv
...........................................................................................................
...........................................................................................................
f. PhysicalDisk.csv
...........................................................................................................
...........................................................................................................
g. PresentedEVA.csv
...........................................................................................................
...........................................................................................................
h. PresentedLUNS.csv
...........................................................................................................
...........................................................................................................
i. Tunnel.csv
...........................................................................................................
...........................................................................................................
j. Vdisk.csv
...........................................................................................................
...........................................................................................................
k. Vdiskstats.csv
...........................................................................................................
...........................................................................................................
You should have noted that each file contains an object class like Vdisk, host
ports, physical disks, and so on. Within each file are the specific counters
(MB/s, and so on).
Note
If you have any questions about what a specific metric in a file represents, you
can look it up in the white paper titled “Performance analysis of the HP
StorageWorks Enterprise Virtual Array storage systems using HP StorageWorks
Command View EVAPerf software tool,” located at
http://h71028.www7.hp.com/ERC/downloads/5983-1674EN.pdf
8. Does this format of data allow you to more simply follow a specific virtual disk
or disk group over time? If not, why?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
9. What other files are created?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Note
The other files that are created will be discussed later in this lab.
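Following a single virtual disk over time in a per-object CSV file amounts to keeping only the rows for that object, in order. The Python sketch below shows the idea on a small invented sample. The column names (Time, Name, Read Miss Latency) and the values are assumptions made for illustration; the real files produced by the TLViz Formatter will have their own headers.

```python
import csv
import io

# Invented sample; real column names and values will differ.
SAMPLE = """Time,Name,Read Miss Latency
10:00:00,vdisk1,4.2
10:00:00,vdisk2,7.9
10:00:05,vdisk1,4.8
10:00:05,vdisk2,8.1
"""

def series_for(csv_text, name, counter, name_col="Name", time_col="Time"):
    """Return [(time, value), ...] for one named object, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row[time_col], float(row[counter]))
            for row in reader if row[name_col] == name]

print(series_for(SAMPLE, "vdisk1", "Read Miss Latency"))
# [('10:00:00', 4.2), ('10:00:05', 4.8)]
```

The same filtering is what the Excel AutoFilter provides interactively; a script is mainly useful when the capture is too large to browse comfortably.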
This portion of the lab allows you to format and graph the virtual disk output file in
Excel.
Use the following procedure to format and view the virtual disk file:
1. Open Vdisk.csv in Excel.
2. Format time:
a. Select column A (Time).
b. Right-click and select Format Cells.
c. Select the Number tab.
d. Under Category, select Time, and under Type, select a time format that
displays seconds.
e. Click OK to apply changes.
3. Format the header row:
a. Select row 1.
b. Right-click and select Format Cells.
c. Select the Alignment tab.
d. Under “Text control”, select Wrap text.
e. Click OK to apply changes.
4. Lock the header row:
a. Select row 2.
b. From the menu, select Window → Freeze Panes.
5. Filter data:
a. Select row 1.
b. From the menu select Data → Filter → AutoFilter.
6. Take some time to see how your changes have impacted the file.
Note
At this point, both columns should be highlighted.
c. From the menu, select Insert → Chart to begin the chart wizard.
d. Under “Chart type”, select Line, then Next.
e. Select the defaults in the rest of the wizard, then Finish.
5. Change the controller by selecting controller F03S in the row, and view the
change in the chart.
6. Close Excel.
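The same filtering can be done outside Excel. The following is a minimal Python sketch, assuming the file uses header names such as Ctlr and Write Req/s as they appear in the Vdisk.csv header; the sample rows are made up for illustration and are not taken from the lab data.

```python
import csv
import io


def rows_for_controller(csv_text, controller):
    """Return the rows of a Vdisk-style CSV whose Ctlr column matches."""
    # skipinitialspace tolerates headers written as "Time, ID, ..."
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return [row for row in reader if row["Ctlr"].strip() == controller]


# Hypothetical three-row extract; real files contain many more columns.
sample = (
    "Time,ID,Write Req/s,Ctlr\n"
    "11:20:01,0,150,F03S\n"
    "11:20:01,0,20,T04K\n"
    "11:20:11,0,175,F03S\n"
)

f03s_rows = rows_for_controller(sample, "F03S")
write_reqs = [float(row["Write Req/s"]) for row in f03s_rows]
print(write_reqs)
```

To use it on a real file, read the file contents into a string first and pass that string in place of sample.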
This portion of the lab allows you to build the Access database through the TLViz
Formatter. Use the following procedure:
1. Copy the file EVADATA.mdb into the directory that contains the formatted files
from the previous section.
2. Start the TLVIZ Formatter.
3. For the input file, browse to and select the same file as before, EVA2.csv.
If all the required input files are in the directory, the Build Access Database
button should be enabled.
4. Click the Build Access Database button and wait until a done status appears at
the lower left.
5. Close the TLVIZ Formatter.
This portion of the lab allows you to run pre-canned queries against the Access
database. Use the following procedure:
1. Start Access and open the Access database.
2. Run the query DiskGroup Percent Read by selecting Open or by double-clicking
the query.
a. How many disk groups are on this EVA?
...........................................................................................................
b. What is the percentage of read for both disk groups?
...........................................................................................................
c. Why is knowing the percentage of read useful in analyzing EVA
performance issues?
...........................................................................................................
3. Run the query Avg-KBperRead-Avg-KBPerWrite.
a. What is the average read and write KB for each disk group?
...........................................................................................................
b. Do these values seem odd?
...........................................................................................................
c. Does EVAPerf use base 2 or base 10 numbers for megabytes?
...........................................................................................................
d. Why is knowing the size of I/O transfers useful?
...........................................................................................................
4. Run the VDISK-Top10TotaRequsts query. What does this query tell us?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
5. Run the VDISK-Top10ReadMissReq query. What does this query tell us?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
6. Run the VDISK-Top10WriteReq query. What does this query tell us?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
7. Close Access.
e. PhyDiskStats.csv
Time, Disk Group, Average Drive Queue Depth, Average Drive Latency
(ms), Average Read Req/s, Average Read MB/s, Average Read Latency
(ms), Average Write Req/s, Average Write MB/s, Average Write Latency
(ms), Number of Disks, Ctlr, Node
f. PhysicalDisk.csv
Time, ID, Drive Queue Depth, Drive Latency (ms), Read Req/s, Read MB/s,
Read Latency (ms), Write Req/s, Write MB/s, Write Latency (ms), Enc, Bay,
Grp ID, Ctlr, Node
g. PresentedEVA.csv
Time, Total Host Req/s, Total Host MB/s, Node
h. PresentedLUNS.csv
Time, Device, Path ID, Target ID, LUN, Product ID, Product Rev, Ctlr, Serial,
Hardware Ver, Name, Node
i. Tunnel.csv
Time, Ctlr, Tunnel Number, Host Port, Source ID, Dest. ID, Round Trip Delay
(ms), Copy Retries per second, Write Retries per second, Copy In MB/s,
Copy Out MB/s, Write In MB/s, Write Out MB/s, Node
j. Vdisk.csv
Time, ID, Read Hit Req/s, Read Hit MB/s, Read Hit Latency (ms), Read Miss
Req/s, Read Miss MB/s, Read Miss Latency (ms), Write Req/s, Write
MB/s, Write Latency (ms), Flush MB/s, Mirror MB/s, Prefetch MB/s, Group
ID, Online To, Mirr, Wr Mode, Ctlr, LUN, Node
k. Vdiskstats.csv
Time, Disk Group, Total Read Hit Req/s, Total Read Hit MB/s, Average
Read Hit Latency (ms), Total Read Miss Req/s, Total Read Miss MB/s,
Average Read Miss Latency (ms), Total Write Req/s, Total Write MB/s,
Average Write Latency (ms), Total Flush MB/s, Total Mirror MB/s, Total
Prefetch MB/s, Ctlr, Node
8. Does this format of data allow you to more simply follow a specific virtual disk
or disk group over time? If not, why?
Yes. Objects of each class, such as Vdisk, are now grouped together. However,
each row contains a specific object, and there are two rows for each object such
as a Vdisk. While this is better than the evaperf all output, it is still hard to
review a single LUN between samples.
9. What other files are created?
Files that start with TLVIZ-. These are files that can be directly opened with the
TLViz Viewer utility.
The mapping files, one for the virtual disks and one for the physical disks. These
files give a mapping from the TLVIZ display name to the Vdisk name or physical
disk shelf and bay.
5. Run the VDISK-Top10ReadMissReq query. What does this query tell us?
This query reports the EVA, group, LUN, and total read miss requests, along with
the average and maximum read latencies.
Read misses are typically caused by random I/O and can impact the read miss
response times on the server. If read miss latencies are high, the user or
application on the server can experience “pauses” or sluggish performance.
Depending on the nature of the application, read misses can be normal, but
configuring the application and disk group sometimes helps. Many times, with
applications like SQL or Oracle databases, queries with improper indexes can
cause large amounts of read miss activity. Many times, changing the query is
the simplest resolution to high read miss latencies.
6. Run the VDISK-Top10WriteReq query. What does this query tell us?
This query reports the EVA, group, LUN, and total write requests, along with the
average and maximum write latencies.
Writes and the RAID type can double or quadruple physical disk activity.
VRAID1 has two physical disk writes for each host write. VRAID5 has two reads
and two writes for each random write. Knowing which volumes are write-heavy
and which type of VRAID they use is important to understanding which LUNs are
contributing to the workload of the physical disks.
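The write multipliers above can be turned into a rough estimate of physical disk activity. The following is a minimal sketch, assuming every host read in the input is a read miss that costs one physical disk read, and that every write is a random write; the function name and the example numbers are illustrative only.

```python
# Physical disk operations per random host write, as stated in the answer
# above: (disk reads, disk writes).
WRITE_COST = {
    "VRAID1": (0, 2),
    "VRAID5": (2, 2),
}


def backend_ops(host_read_misses, host_writes, vraid):
    """Estimate physical disk requests/s for a given host load."""
    reads_per_write, writes_per_write = WRITE_COST[vraid]
    disk_reads = host_read_misses + host_writes * reads_per_write
    disk_writes = host_writes * writes_per_write
    return disk_reads + disk_writes


# Hypothetical load: 600 read misses/s and 400 writes/s.
print(backend_ops(600, 400, "VRAID1"))  # 600 + 800
print(backend_ops(600, 400, "VRAID5"))  # 600 + 800 + 800
```

With the same host load, the VRAID5 volume generates noticeably more physical disk requests than the VRAID1 volume, which is why the VRAID type matters when ranking LUNs by write requests.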
This lab allows you to get started with EVA performance data analysis by using and
manipulating the TLViz Formatter output files. You will be able to take any individual
TLViz file in CSV format, review and analyze its contents, remove or add items to it,
do comparisons of data using graphs, and save charts.
As you go through the lab, remember to:
Read and perform all of the lab steps that you can in the allowable time.
Concentrate on those tasks that might help you to diagnose and troubleshoot
performance problems with the array.
You must install the TLViz Viewer on your laptop or PC before using the utility. Follow
these instructions to install and start the TLViz Viewer:
1. Create a temporary directory.
2. Download the TLVIZ_1609_Kit.zip file to the directory.
Note
This file should be in the course files. If needed, your instructor will point you to
its location.
Note
The TLViz version should minimally be TLViz_V16-14.
The data from the EVAPerf array status (as) command is useful in giving the customer
a picture of the load on the array. The data represents the total MB/s and IOPs as
seen by the array, which makes this file useful for gauging the host workload on an
array. These counters do not account for multiple disk groups, nor do they
differentiate between read and write MB or IOPs. They also do not show cache
hit/miss ratios.
Perform the following exercises and answer the questions:
1. Using TLViz Viewer, open the TLVIZ-Array.csv file.
Note
This file is available as part of the course files in the folder called 04-TLVIZ.
Note
Hint: use the Safe IOPs spreadsheet.
...........................................................................................................
d. What is the peak Req/s?
...........................................................................................................
2. Select both items (use the CTRL key) and zoom in to the time between 11:20 and
11:23. To zoom, use your mouse to click and drag a rectangular boundary on
the screen.
Note
If you need to unzoom the display, select the Undo Zoom/Scroll button.
The data from the EVAPerf controller status (cs) command is useful for determining
the utilization of the EVA Power PC. The data is represented by total percentage of
CPU utilization and percentage of data. Subtracting the percentage of data from the
percentage of CPU yields the overhead of the array. This overhead is typically
1–2%. Operations that contribute to overhead are leveling, splits, merges, VRAID5
parity calculations, and other background processes.
The counters are useful in several ways:
First, you would like to see utilization under 40% per controller. This is because if
an EVA loses a controller, the total workload of the array should be capable of
running on just one controller.
Workload should be evenly distributed between controllers as much as possible.
This can be checked by comparing the percent CPU between controllers and the
percent data between controllers. If one controller has substantially higher
activity, then balancing the Vdisks between controllers can even out the
workload.
Last, if overhead (percent CPU minus percent data) is greater than 1–2%, this is
typically an indication of “something” else going on within the array.
Configuration and controller logs should be checked along with other EVAPerf
data.
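The checks described above can be expressed as a short calculation. The following is a minimal sketch that applies the 40% utilization guideline and the 1–2% overhead guideline to one sample per controller; the function names and sample values are illustrative and are not part of TLViz or EVAPerf.

```python
def controller_overhead(percent_cpu, percent_data):
    """Overhead is percent CPU minus percent data."""
    return percent_cpu - percent_data


def review_controllers(samples, cpu_limit=40.0, overhead_limit=2.0):
    """samples maps controller name -> (PercentCPU, PercentData)."""
    findings = []
    for name, (cpu, data) in sorted(samples.items()):
        if cpu >= cpu_limit:
            findings.append(f"{name}: CPU {cpu}% is not under {cpu_limit}%")
        overhead = controller_overhead(cpu, data)
        if overhead > overhead_limit:
            findings.append(
                f"{name}: overhead {overhead:.1f}% exceeds {overhead_limit}%"
            )
    return findings


# Hypothetical values for the two controllers used in this lab.
sample = {"F03S": (35.0, 33.5), "T04K": (52.0, 45.0)}
for finding in review_controllers(sample):
    print(finding)
```

Comparing the two PercentCPU values in the same sample also gives a quick view of how evenly the workload is split between the controllers.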
Perform the following exercises and answer the questions:
1. Using the TLViz Viewer, open the TLVIZ-CPU.csv file.
a. Which four items are displayed?
...........................................................................................................
b. What is the difference between PercentCPU and PercentData?
...........................................................................................................
2. Select and map PercentCPU and PercentData for controller T04K.
a. Why is there a difference between the PercentCPU and the PercentData?
...........................................................................................................
b. Approximately how much of a difference is there?
...........................................................................................................
Note
Next you will repeat the above for the second controller.
6. Add a new item for the other controller representing the total overhead.
a. From the menu, select Options → Add New Item to List Box.
b. Select PROC T04K – PercentCPU for item #1.
c. For the operator, select Subtract.
d. Select PROC T04K – PercentData for item #2.
e. Enter the new item name, Controller T04K Overhead.
7. Map the overhead of both controllers.
a. Which controller has more overhead?
.........................................................................................................
b. Why would one controller have a higher percentage of overhead than the
other?
.........................................................................................................
8. Close the TLVIZ-CPU.csv file.
The data from the EVAPerf host port statistics (hps) command is valuable for
reviewing load balancing between controllers and ports. Statistics for IOPs and MB/s
are indications of load. Queues and latencies are useful in gauging the EVA’s ability
to respond to load.
Most people like to start with the host port statistics when reviewing EVAPerf data.
However, because EVA performance is based on the configuration and load of a disk
group, and an EVA can have multiple disk groups, it is best to start with the virtual
disk group statistics and look at host port load balancing second.
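One way to quantify how evenly load is balanced across host ports is to express each port's MB/s or Req/s as a share of the total. The following is a minimal sketch; the port names and values are made up for illustration.

```python
def load_share(port_values):
    """Return each port's percentage of the total load."""
    total = sum(port_values.values())
    if total == 0:
        return {port: 0.0 for port in port_values}
    return {port: 100.0 * value / total for port, value in port_values.items()}


# Hypothetical peak MB/s per host port on a two-controller array.
peak_mb = {"F03S-FP1": 30.0, "F03S-FP2": 30.0, "T04K-FP1": 20.0, "T04K-FP2": 20.0}
for port, share in load_share(peak_mb).items():
    print(f"{port}: {share:.0f}%")
```

With four ports, a perfectly balanced load would show each port near 25%.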
Perform the following exercises and answer the questions:
1. Using the TLViz Viewer, open the TLVIZ-HostPort.csv file.
Which data is represented in this file?
................................................................................................................
2. From the menu, select Modify Item List → Remove Item(s) NOT containing string,
and use the string “ReadMB”.
Note
The search string is not case sensitive.
6. From the menu, select Modify Item List → Revert to Previous Item List.
What happens to the display?
................................................................................................................
Note
Next you will repeat Steps 2 through 5 for write MB.
7. From the menu, select Modify Item List → Remove Item(s) NOT containing string,
and use the string “WriteMB”.
On the left side, which items are left?
................................................................................................................
8. Select all of the WriteMB.
How well load-balanced is the demand across all four ports during the peak
utilization?
................................................................................................................
9. Stack all of the WriteMB.
How does the stacked value compare to the TLVIZ-Array.csv MB/s?
................................................................................................................
10. Un-stack the items.
11. From the menu, select Modify Item List → Revert to Previous Item List.
12. Remove all items not containing the string “Req”.
What is left in the display?
................................................................................................................
13. Stack all of these items.
How does this stacked value correspond to the TLVIZ-Array.csv host requests?
................................................................................................................
14. Un-stack these items.
How well-balanced are the requests between ports?
................................................................................................................
15. From the menu, select Modify Item List → Revert to Previous Item List.
16. Remove all items not containing the string “Latency”.
17. Review the latencies for each port.
How do the latencies for reads and writes compare for each port?
................................................................................................................
18. Close the TLVIZ-HostPorts.csv file.
The data from the EVAPerf host connections (hc) command is valuable for
determining whether EVA write buffers can handle workload. This is quantified by the
number of busies.
Perform the following exercises and answer the questions:
1. Using the TLViz Viewer, open the TLVIZ-HostStats.csv file.
a. Which two items are represented in this file?
.........................................................................................................
b. What is the difference between a host queue item and a busy?
.........................................................................................................
c. Under which circumstances would the EVA respond back with a busy?
.........................................................................................................
2. Select busies for host LAB9.
How many entries are there for this host?
................................................................................................................
3. Open the HostStats.csv file in Excel (not TLViz Viewer).
a. Format this file as per the following:
1) Time in seconds.
2) Headers wrap.
3) Data filter is auto filter.
4) Locked header on row 2.
b. Select only host LAB9.
c. Scroll down to a time when LAB9 has 11 or more queued I/Os per port.
Note
The simple answer is that this is a bug in the TLViz Formatter. The TLViz Formatter
builds its items list based on HostName+Port. Because both ports show up as “-”,
the formatter cannot tell which port is which and overwrites the values of one
with the other. This will be fixed in a future release of the formatter.
5. From the menu, select Modify Item List → Remove Items with Zero Values.
Are there any busies displayed in the items list?
................................................................................................................
6. Close the HostStats.csv and TLVIZ-HostStats.csv files.
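The item list problem described in the note above, where entries are keyed on HostName+Port, can be illustrated with a dictionary. The following is only a sketch of the general failure mode, not the TLViz Formatter's actual code; the queue depth values are made up.

```python
# Two rows for the same host. Both ports are reported as "-", so the
# rows cannot be told apart by host name and port alone.
samples = [
    ("LAB9", "-", 11),
    ("LAB9", "-", 3),
]

items = {}
for host, port, queue_depth in samples:
    key = host + port          # "LAB9-" for both rows
    items[key] = queue_depth   # the second row replaces the first

print(items)
```

Only one entry survives, which is why the viewer shows fewer entries for the host than the raw HostStats.csv file contains.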
The data from the EVAPerf port status (ps) command is valuable for determining
issues on the bus. There should be no increases in any of the loop counters during a
run. The items collected in the loop statistics are only cleared by a reboot, resync or
evaperf rc command.
Note
The values are counters and typically will increase over time as errors are
logged. Some EVAPerf traces will indicate a value like 65535 throughout the
whole trace. It has not been determined if these values have an upper limit and
will not increment beyond the limit.
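Because the loop statistics are running counters, the value of interest is the change during the run rather than the absolute number. The following is a minimal sketch that reports counters whose value rose between the first and last sample; the counter names and values are hypothetical.

```python
def counters_that_increased(samples):
    """samples: list of dicts (counter name -> value) in time order.

    Returns {counter: increase} for counters whose last value is
    higher than their first value.
    """
    if not samples:
        return {}
    first, last = samples[0], samples[-1]
    return {
        name: last[name] - first[name]
        for name in first
        if name in last and last[name] > first[name]
    }


# Hypothetical trace for one port: one counter climbs, one stays pinned.
trace = [
    {"Bad CRC": 4, "Link Fail": 65535},
    {"Bad CRC": 6, "Link Fail": 65535},
    {"Bad CRC": 9, "Link Fail": 65535},
]
print(counters_that_increased(trace))
```

A counter that stays at a constant value such as 65535 is not reported, which matches the caution in the note above that such values may have reached an upper limit.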
The data from the EVAPerf physical disk group statistics (pdg) command is valuable
for determining if the EVA load on a disk group is within guidelines. Typically, you
want to see the following:
Average read and write requests combined:
Under 120 for 10K drives
Under 170 for 15K drives
Disk queues close to an average of 1
Average read and write MB/s combined should not exceed 4MB/s with
average IOPs above these numbers
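The request guideline above can be checked with a one-line comparison. The following is a minimal sketch, assuming the inputs are the per-drive averages reported for a disk group (Average Read Req/s and Average Write Req/s); the function name and sample values are illustrative.

```python
# Combined read and write requests per second, by drive speed.
MAX_REQS_PER_DRIVE = {"10K": 120, "15K": 170}


def within_request_guideline(avg_read_reqs, avg_write_reqs, drive_speed):
    """True if combined average requests are under the guideline."""
    combined = avg_read_reqs + avg_write_reqs
    return combined < MAX_REQS_PER_DRIVE[drive_speed]


# Hypothetical samples.
print(within_request_guideline(70, 40, "10K"))
print(within_request_guideline(90, 40, "10K"))
print(within_request_guideline(90, 40, "15K"))
```

The queue depth and MB/s guidelines are left out of the sketch because they are stated as approximate targets rather than fixed limits.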
The physical disk group command can be useful, but due to EVA sampling time
issues, it can also be very misleading. Depending on the number of physical disks,
sampling rate, and version of VCS, XCS, and EVAPerf, sometimes these statistics
show little or no load. As a best practice, it is advisable to check a sample of
physical disk activity to see if you can trust these statistics.
Note
On an EVA3000 or EVA5000 running VCS V3.X, you will find that physical disk
statistics are reported against only one controller. Also, on VCS V3.X, read and
write latencies are not reported. Instead, you will get a drive latency. This is
normal and is a summation of both controllers. On EVAs running active-active
code, you will get statistics from both controllers and no drive latency statistics;
instead, VCS V4.X and XCS V5.X report per-drive read and write latencies.
You will see this discrepancy in a later lab.
Note
Now you will repeat the steps above for controller T04K.
Note
Now you will determine the load, latency, and queue depth for both controllers
using the FATA disk group.
11. For the FATA disk group, select the following from controller F03S:
a. Average read MB/s
b. Average read Req/s
c. Average read latency
12. For the FATA disk group, select average queue depth on controller F03S.
During the peak of activity, what is the queue depth?
................................................................................................................
13. For the FATA disk group, select the following from controller T04K:
a. Average read MB/s
b. Average read Req/s
c. Average read latency
14. For the FATA disk group, select average queue depth on controller T04K.
During the peak of activity, what is the queue depth?
................................................................................................................
15. Answer these questions:
a. How does the average workload of the FATA disk group compare to that of
the default disk group?
.........................................................................................................
b. Is the workload for the FATA drives within guidelines?
.........................................................................................................
16. Close the TLVIZ-PhyDiskStats.csv file.
The data from the EVAPerf physical disk (pd) command presents a lot of data. By
using TLViz and modifying the items you display, you can reduce the data to more
manageable comparisons. Normally, the physical disk output of EVAPerf is one of
the last things you should look at other than checking that the physical disk group
statistics are valid.
Note
On the EVA3000 or EVA5000 running VCS V3.X, you will find that physical disk
statistics are reported against only one controller. Also, on VCS V3.X, read and
write latencies are not reported. Instead, you will get a drive latency. This is
normal and is a summation of both controllers. On EVAs running active-active
code, you will get statistics from both controllers and no drive latency statistics;
instead, VCS V4.X and XCS V5.X report per-drive read and write latencies.
12. Compare the latencies between PD-5 and PD-42 by performing the following
tasks:
a. Select both latencies from the items list.
b. From the menu, select Options → Toggle Moving Average.
c. Zoom in on both disks.
Does PD-5 have higher latency, and, if so, what could possibly cause this?
.........................................................................................................
.........................................................................................................
d. Open EVA1_16_113_12_35.xml in EVA-CD. Find the bay and shelf used by
PD-5.
Which role does this drive have?
.........................................................................................................
13. Close the TLVIZ-Physical-Disk.1.csv and PDDISK-MAP.txt files.
The data from the EVAPerf tunnel statistics (drt) command is valuable when
troubleshooting replication issues.
Perform the following exercises and answer the questions:
1. Using the TLViz Viewer, open the TLVIZ-Tunnel.csv file.
2. Scroll down the item list.
a. Are there any statistics recorded from tunnels?
.........................................................................................................
b. What is the difference between CopyInMB and WriteInMB?
.........................................................................................................
c. Is there any activity on these tunnels?
.........................................................................................................
d. If not, why does the RT-Delay increase?
.........................................................................................................
e. Do these increases correspond to the times when the EVA was under heavy
load?
.........................................................................................................
3. Close the TLVIZ-Tunnel.csv file.
The data from the EVAPerf virtual disk group statistics (vdg) command is an excellent
starting point to look at load and responsiveness. The number of drives within a disk
group will dictate the maximum load that a disk group can support and still deliver
acceptable latencies. Typically, a good starting point is to look at just latencies and
see if there are times when the following instances occur:
Read miss latencies exceeding 20ms
Write latencies exceeding 8ms
If there are samples that do exceed these, then chances are that the load is
exceeding the design capabilities of the configuration. When this happens, you must
look at load (MB/s, IOPs) along with background work (flushing, prefetch, leveling,
splits, merges) to see which workload is causing the bottleneck.
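The two latency thresholds above lend themselves to a simple scan of the samples. The following is a minimal sketch; the sample times and latencies are made up, and the tuple layout is an assumption for illustration rather than the layout of any TLViz file.

```python
READ_MISS_LIMIT_MS = 20.0
WRITE_LIMIT_MS = 8.0


def samples_over_limits(samples):
    """samples: list of (time, read_miss_latency_ms, write_latency_ms).

    Returns a list of (time, reasons) for samples that exceed a limit.
    """
    flagged = []
    for time, read_miss_ms, write_ms in samples:
        reasons = []
        if read_miss_ms > READ_MISS_LIMIT_MS:
            reasons.append("read miss")
        if write_ms > WRITE_LIMIT_MS:
            reasons.append("write")
        if reasons:
            flagged.append((time, reasons))
    return flagged


# Hypothetical samples for one disk group on one controller.
data = [
    ("11:20", 12.0, 3.0),
    ("11:21", 27.5, 9.1),
    ("11:22", 18.0, 8.5),
]
for time, reasons in samples_over_limits(data):
    print(time, ", ".join(reasons))
```

The flagged times are the ones to compare against load and background work to find the cause of the bottleneck.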
Typically, the rule of thumb is that MirrorMB of around 80 on an EVA3000 or
EVA5000 will start to impact write latencies. You can see that you are at or near the
maximum bandwidth for the mirror port due to proxy read activity associated with
active-active configurations. You can see this proxy read in more detail looking at the
individual virtual disk files (TLVIZ-VDisk-#.csv).
Perform the following exercises and answer the questions:
1. Using the TLViz Viewer, open the TLVIZ-VDiskStats.csv file.
2. Modify the items list to display only the default disk group.
3. Select total backend requests from both controllers.
4. Stack these values.
How do the maximum values compare to the TLVIZ-Array.csv requests per
second?
................................................................................................................
5. Add read hit requests from both controllers into this value.
a. Now how do these values compare to the TLVIZ-Array.csv total requests?
.........................................................................................................
b. Why are read hit requests missing from total backend requests?
.........................................................................................................
6. Un-stack the display and compare the total backend requests between
controllers.
Are IOPs fairly balanced between controllers?
................................................................................................................
7. Add a new item to the display by using Options → Add New Item to List Box.
a. Select controller F03S total backend requests for item #1.
b. Select Add for the function.
c. Select controller T04K total backend requests for item #2.
d. Give the new item a name of Default Total Backend Requests.
8. Calculate Safe IOPs for the default disk group:
a. Open the EVADATA.mdb file and run the DiskGroup Percent Read query.
Record the percent read for both disk groups here.
.........................................................................................................
b. Open the EVA1_16_113_12_35.xml file in EVA-CD and record the number
and type of drives used in the default disk group.
.........................................................................................................
c. Open the Safe IOPs spreadsheet and input these numbers for number of
drives and percent read. Look at the appropriate disk type VRAID5 for total
IOPs.
What is the Safe IOPs number? Do the total backend requests exceed the
calculation? If so, by how much?
.........................................................................................................
9. Compare the new item (Default Total Backend Requests) to read miss latencies
on both controllers:
a. As requests go up, do latencies also go up?
.........................................................................................................
b. Are these latencies higher than 15ms when the Safe IOPs calculation is
exceeded?
.........................................................................................................
16. Compare the Array Total MirrorMB with the write latency from each disk group
on each controller.
As MirrorMB goes up, do latencies also go up?
................................................................................................................
17. Compare Array Total MirrorMB to write MB from each disk group on each
controller.
Why is mirror port MB so much higher than write MB?
................................................................................................................
18. Close the TLVIZ-VDiskStats.csv file.
The data from the EVAPerf virtual disk statistics (vd) command contains all the values
of load and latency for each virtual disk within the EVA. The data is useful in
determining which Vdisks or LUNs contribute heavily to a disk group’s load. Many
times, identifying the Vdisk and load pattern can help with performance isolation or
resolution because an application may be contributing to the load. Therefore,
changing the application’s behavior may be more cost-effective than adding more
drives to a disk group.
Note
The TLVIZ-Vdisk-#.csv output is limited to 10MB per file so you can more easily
open the files with TLViz. The consequence of this limitation is that it is hard to
profile a virtual disk over a large time period.
6. For the default disk group, select average queue depth on controller F03S.
a. During the peak of activity, what is the queue depth?
It is 69 at 11:27.
b. Why is knowing the queue depth helpful in gauging workload?
If there is a bottleneck, it helps determine if the workload is sufficient for the
array or if server-side tuning of queue depths is needed.
8. What are the average read requests at around 11:21?
6
9. Given these are 10K drives, are we exceeding the capability of the drives?
No
10. For the default disk group, select average queue depth on controller T04K.
During the peak of activity, what is the queue depth?
Zero
12. For the FATA disk group, select average queue depth on controller F03S.
During the peak of activity, what is the queue depth?
Zero
14. For the FATA disk group, select average queue depth on controller T04K.
During the peak of activity, what is the queue depth?
Zero
15. Answer these questions:
a. How does the average workload of the FATA disk group compare to that of
the default disk group?
Much lower
b. Is the workload for the FATA drives within guidelines?
Yes
16. Compare the Array Total MirrorMB with the write latency from each disk group
on each controller.
As MirrorMB goes up, do latencies also go up?
Yes
17. Compare Array Total MirrorMB to write MB from each disk group on each
controller.
Why is mirror port MB so much higher than write MB?
A review of EVA-CD shows that the customer has V4.004 and is using VRAID5.
Proxy reads and VRAID5 will cause higher mirror port traffic.
This lab allows you to use some additional analysis tools for EVA performance data
analysis. The current set of additional tools consists of PerfMonkey and EVApnggen.
The first part of the lab allows you to install, view, and use the PerfMonkey interface.
The second part of the lab allows you to install and use EVApnggen to generate
performance charts.
As you go through the lab, remember to:
Read and perform all of the lab steps that you can in the allowable time.
Concentrate on those tasks that might help you to diagnose and troubleshoot
performance problems with the array.
For this portion of the lab, you will be installing PerfMonkey and then viewing and
using the tool.
Note
The latest version of this file is 2.5.14, available through PerfMonkey 2.5.14.zip. If
the instructor makes the setup file available, just copy setup.exe.
Note
If these components are not available, ask your instructor how to acquire them.
4. Run PerfMonkey by using the Start menu, for example, select All Programs →
PerfMonkey → PerfMonkey.
Note
This file is available as part of the course files in the folder called 04-TLVIZ.
2. Expand the levels of the Counter Explorer hierarchy to view the counters.
3. Practice selecting a single counter, multiple counters, or a group of counters.
4. Double-click any counter to display the data.
5. Create a new calculated counter by doing the following:
a. Select Tools → Add Calculated Counter.
b. Drag the counter PercentCPU for controller F03S to Value 1 and drag the
counter PercentData for the same controller to Value 2.
c. Choose subtraction for the operator and give the name of the new counter,
for example, F03S Overhead.
d. Click the Create button and note the calculated counter in the Counter
Explorer.
4. Change the color theme to WarmTones, show the 95th percentile, and close the
properties display.
5. Clear the current chart by using the toolbar.
6. Chart the calculated counter you created earlier for F03S Overhead.
7. Right-click in the chart to add an annotation.
8. Save the file as a .png file.
9. Open the .png file you saved, view it, then close the file.
10. Plot counters on different scales by doing the following:
a. Open TLVIZ-VDiskStats.csv.
Note
This file is available as part of the course files in the folder called 04-TLVIZ.
For this portion of the lab, you will be installing EVApnggen and then using the tool
to generate charts.
Installing EVApnggen
To install EVApnggen:
1. Create a folder on your PC or laptop to hold EVApnggen installation files.
2. Copy the installation zip file to your folder.
Note
The latest version of this file is EVAPNGGEN1_13.zip. If the instructor makes the
csvpng.exe and evapnggen.bat files available, just copy them to your folder.
Generating charts
To generate charts:
1. Copy EVApnggen.bat and csvpng.exe to the folder containing the EVAPerf data
you used earlier.
Note
This file is available as part of the course files in the folder called 04-TLVIZ.
2. From the command line prompt, navigate to the folder and enter EVApnggen.
Note the command line output.
Note
Optionally, within Windows Explorer, double-click EVApnggen.bat. This will start
the command line.
3. In Windows Explorer, navigate to your folders and note the files that were
created.