
IBM TotalStorage

DS4000 series skill transfer

2009 IBM Corporation

Agenda
DS4000 series hardware introduction
DS4000 troubleshooting
DS4000 hardware maintenance
DS4000 firmware package
DS4000 data collection
DS4000 material

July 1, 2007


DS4000 hardware introduction


Product list

DS4300 (FAStT600)
DS4500/DS4400 (FAStT900/FAStT700)
DS4700
DS4800


DS4300 (FAStT600)
Base mode

Up to two 2 Gbps hot-swappable RAID controllers with 512 MB of battery-backed cache (256 MB per controller).
Support for up to three IBM TotalStorage DS4000 EXP700/EXP710 Expansion Units.
Support for one storage partition in standard configuration. There is an option to expand up to 4, 8, or 16 storage partitions.

Turbo mode

Increased cache from 256 MB per controller on base DS4300 to 1 GB per controller on Turbo.
Support for up to seven IBM TotalStorage EXP710 Expansion Units. EXP810 Enclosures can also be used behind the DS4300.
Host interface on base DS4300 is 2 Gbps. Turbo auto senses to connect to 1 Gbps or 2 Gbps.
Eight storage partitions standard, with upgrade to 16 or 64.


DS4300 (FAStT600)
Green Power LED:

This LED indicates that the DC power status is OK.

Amber General-System-Fault LED:

When a storage server component fails (such as a disk drive, fan, or power supply), this LED will be on.


DS4300 (FAStT600)
Host loop LED (green):

This LED should be on, which means that the host connection loop is good. If it is off, the following problems might have occurred:
  The host loop is down, not turned on, or not connected.
  A SFP has failed, or the host port is not occupied.
  The RAID controller circuitry has failed, or the RAID controller has no power.

Cache activity LED (green):

This LED is on when the data is in cache. If it is off, one of the following situations has occurred:
  There is no data in cache.
  The cache option is not selected for the array.
  The cache memory has failed, or the battery has failed.

Battery charged LED (green):

Normally, this LED should be on. If it is off, it indicates a battery fault. The LED blinks while the battery is charging or performing a self-test.

Expansion port bypass LED (amber):

The LED will be on if nothing is plugged into the expansion port, or the expansion is powering off.

Expansion loop link LED (green):

Normally on when the drive-side Fibre Channel loop is operating normally.


Host side connection


Drive side connection


DS4500 (FAStT900)
Dual, redundant 2 Gbps RAID controllers with 2 GB of Rambus cache memory (1 GB per RAID controller).
The data in the cache is protected by battery backup for at least seven days.
Supports connecting up to sixteen EXP100 or EXP710 enclosures, or up to fourteen EXP810 enclosures.
Has 16 storage partitions standard, with upgrade option to 64.



DS4500 (FAStT900)
Speed LED (green):

This LED is on when the selected link speed is 2 Gbps and a link is up. This LED is off when the DS4500 RAID Controller works on 1 Gbps.

Fault LED (amber):

This LED should normally be off. If on, it indicates a fault of the mini-hub or one of the SFP modules.

Two Bypass LEDs (amber):

There is one bypass LED for each SFP module. This LED should normally be off if no SFP module is installed. But if a SFP module is present, and a link error is detected (for example, no cable or faulty cable, or host not powered on), it will go on.

Loop good LED (green):

This LED should be normally on. It might be off if there are link errors.


Host side connection


Drive side connection


Only one port on each mini-hub of the DS4500 on the drive side is ever used. We recommend removing all SFP modules from mini-hub ports that are not connected to any device.


DS4700
The IBM System Storage DS4700 Express storage server uses 4 Gbps technology.
Model 70 contains 2 GB of cache memory (1 GB per controller), four 4 Gbps FC host ports (two ports per controller), and four shortwave small form-factor pluggable (SFP) modules.
Model 72 contains 4 GB of cache memory (2 GB per controller), eight 4 Gbps FC host ports (four ports per controller), and six shortwave SFP modules.
Supports up to six EXP810 enclosures.
Both models 70 and 72 have selectable storage partitions up to 128.


DS4700
Locate LED (white or blue)

  On: Indicates storage subsystem locate.
  Off: This is the normal status.

Service action allowed LED (blue)

  On: The service action can be performed on the component with no adverse consequences.
  Off: This is the normal status.

Service action required LED (amber)

  On: There is a corresponding needs attention condition flagged by the controller firmware. Some of these conditions might not be hardware related.
  Off: This is the normal status.

Power LED (green)

  On: The subsystem is powered on.
  Off: The subsystem is powered off.




DS4700
LED #1-2 (green): Host channel speed
LED #3 (blue): Service action allowed
LED #4 (amber): Needs attention
LED #5 (green): Caching active
LED #8-11 (amber): Drive channel bypass
LED #9-10 (green): Drive channel speed
LED #12 (green/yellow): Numeric display (enclosure ID/diagnostic display)


DS4700
Service action allowed (blue)

  Off: Normal status.
  On: Safe to remove.

Battery charging (green)

  On: Battery charged and ready.
  Blinking: Battery is charging.
  Off: Battery is faulted, discharged, or missing.

Needs attention or service action required (amber)

  Off: Normal status.
  On: Controller firmware or hardware requires attention.


DS4700
Power supply fan LED (AC power) (green)

  Off: Power supply fan is not providing AC power.
  On: Power supply fan is providing AC power.

Service action allowed (blue)

  On: Safe to remove.
  Off: Normal status.

Needs attention (amber)

  Off: Normal status.
  On: Power supply fan requires attention.

Power supply fan Direct Current Enabled (DC power) (green)

  Off: Power supply fan is not providing DC power.
  On: Power supply fan is providing DC power.


Host side connection


Drive side connection


DS4800
The 1815-80A and 1815-82A come with 4 GB of total cache.
The 1815-84A has 8 GB of total cache.
The 1815-88A has 16 GB of total cache.
Supports up to 16 EXP710 FC-only enclosures for a total of 224 disks.
Supports up to 14 EXP810 enclosures for a total of 224 disks.
Supports up to 512 host storage partitions.
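
Both disk totals come from multiplying the enclosure limit by the drive bays per enclosure. The sketch below shows that arithmetic; the bay counts (14 per EXP710, 16 per EXP810) are an assumption added here, since the slide does not state them, and the names are illustrative only.

```python
# Drive bays per expansion enclosure (assumed; not stated on the slide).
BAYS_PER_ENCLOSURE = {"EXP710": 14, "EXP810": 16}

# Maximum number of enclosures behind one DS4800, as listed on the slide.
MAX_ENCLOSURES = {"EXP710": 16, "EXP810": 14}


def max_disks(enclosure_model):
    """Return the maximum disk count when only one enclosure model is used."""
    return MAX_ENCLOSURES[enclosure_model] * BAYS_PER_ENCLOSURE[enclosure_model]


if __name__ == "__main__":
    for model in MAX_ENCLOSURES:
        print(model, max_disks(model))
```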



Host side connection


Drive side connection


The DS4800 supports four redundant drive channel pairs on which to place expansion enclosures.
  Ports 4 and 3 on controller A are channel group 1.
  Ports 2 and 1 on controller A are channel group 2.
  Ports 1 and 2 on controller B are channel group 3.
  Ports 3 and 4 on controller B are channel group 4.

The two ports on each drive channel group must run at the same speed.
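
The same-speed rule can be checked mechanically once the port speeds are known. The following is a minimal sketch, not a Storage Manager feature: the `port_speeds` input format and the function name are hypothetical, and only the group-to-port mapping is taken from the slide.

```python
# Channel group -> (controller, ports), as listed on the slide.
CHANNEL_GROUPS = {
    1: ("A", (4, 3)),
    2: ("A", (2, 1)),
    3: ("B", (1, 2)),
    4: ("B", (3, 4)),
}


def mismatched_groups(port_speeds):
    """Return the channel groups whose two ports run at different speeds.

    port_speeds maps (controller, port) to a speed in Gbps (hypothetical
    input format). Ports missing from the mapping are ignored.
    """
    bad = []
    for group, (controller, ports) in CHANNEL_GROUPS.items():
        speeds = {
            port_speeds[(controller, port)]
            for port in ports
            if (controller, port) in port_speeds
        }
        if len(speeds) > 1:
            bad.append(group)
    return bad
```

For example, with controller A port 4 at 4 Gbps and port 3 at 2 Gbps, the function reports channel group 1.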


Drive side connection


The best sequence (Figure 3-52) to populate drive channel pairs is:

1. Controller A, port 4/controller B, port 1 (drive channel pair 1)
2. Controller A, port 2/controller B, port 3 (drive channel pair 3)
3. Controller A, port 3/controller B, port 2 (drive channel pair 2)
4. Controller A, port 1/controller B, port 4 (drive channel pair 4)
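
The population sequence can be expressed as a small lookup. This is an illustrative sketch only; the helper name `next_pair` is hypothetical, and the table simply restates the four steps of the sequence.

```python
# (drive channel pair, controller A port, controller B port),
# in the recommended population order from the slide.
POPULATION_ORDER = [
    (1, 4, 1),
    (3, 2, 3),
    (2, 3, 2),
    (4, 1, 4),
]


def next_pair(populated_pairs):
    """Return the next (pair, port_a, port_b) to cable, or None if all are used."""
    for pair, port_a, port_b in POPULATION_ORDER:
        if pair not in populated_pairs:
            return pair, port_a, port_b
    return None
```

With pair 1 already populated, `next_pair({1})` points to pair 3 (controller A port 2, controller B port 3), which matches step 2 of the sequence.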


DS4000 troubleshooting
Basic tools

  Recovery Guru
  Major Event Log (MEL)

Other tools

  RLS, etc.


Recovery Guru
If there is an error condition on your DS4000, the Recovery Guru explains the cause of the problem and provides the actions needed to recover. It guides you through specific actions, depending on the event.


Major Event Log


The Major Event Log (MEL) is the primary source for troubleshooting a DS4000 storage server.
To access the MEL, select Advanced > Troubleshooting > View Event Log.
By default, only the last 100 critical events are shown, but you can choose how many events to list. The maximum number you can set is 8191.
When troubleshooting, use the full event log: it includes the actions that took place before the critical event, giving you the complete history of the problem.

DS4000 hardware maintenance


Disk replacement
Battery replacement
Remember: DO back up the configuration (ASD, profile) before normal maintenance
Remember: DO back up the data before any maintenance that may be harmful


DS4000 disk replacement


Failed drive/Bypassed drive

  Check Recovery Guru, verify the problem and the recovery method
  Replace the drive according to the service guide
    Pull out the failed drive (usually the amber LED will be on)
    Wait for about 30 sec
    Plug in the new drive
    Wait for the reconstruction to complete

Impending failure drive

  Check Recovery Guru, verify the problem and the recovery method
  Option 1: wait for the drive to fail
  Option 2: replace the drive directly
    Un-assign the hot spare
    Manually fail the drive
    Pull out the failed drive (usually the amber LED will be on)
    Wait for about 30 sec
    Plug in the new drive
    Wait for the reconstruction to complete
    Re-assign the hot spare

DS4000 disk replacement


If multiple drives failed at almost the same timestamp

  Collect data and wait for L2's action plan

If reconstruction failed

  It is recommended to order another drive; if it fails again, a logical error may have occurred.
  Collect data for L2 review


Cache battery replacement


Check the cache setting in Storage Manager
To ensure that no data is left in cache, it is recommended to disable the cache setting
Replace the cache battery according to the service guide
  DS4300: the controller must be taken offline.
  DS4400/DS4500/DS4700: the battery can be replaced directly.
  DS4800: ensure that Ctrl A is optimal.
Wait for the battery self-test/charge to complete
Reset the battery age
Restore the original cache setting


DS4000 battery policy


With firmware above 6.60, the age limit of the DS4000 cache battery has been changed to 10 years.
The battery should be replaced only if the battery status is Failed.
If the battery status is near expiration, it is recommended to update the firmware to a level above 6.60. After the upgrade, the warning will be cleared.
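
The policy amounts to a short decision rule. The sketch below is one possible reading, not an IBM tool: the status strings and function names are hypothetical, and it assumes that firmware 06.60 itself counts as having the 10-year battery age, which the slide does not state explicitly.

```python
def version_tuple(firmware):
    """Convert a level such as '06.12.16.00' into (6, 12, 16, 0)."""
    return tuple(int(part) for part in firmware.split("."))


def has_ten_year_battery_age(firmware):
    # Assumption: 06.60 and later carry the 10-year battery age.
    return version_tuple(firmware)[:2] >= (6, 60)


def battery_action(status, firmware):
    """Suggest an action for a battery status (hypothetical status strings)."""
    if status == "failed":
        return "replace the battery"
    if status == "near expiration":
        if has_ten_year_battery_age(firmware):
            return "not covered by the policy above"
        return "update the firmware"
    return "no action"
```

A battery reported as near expiration on 06.12.16.00 therefore leads to a firmware update rather than a replacement.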


DS4000 firmware
The firmware package includes the controller firmware, NVSRAM, ESM code, and DDM code
The controller firmware and the NVSRAM must match
The ESM code and the controller firmware must match
Pay attention when the DS4000 is attached to both EXP710 and EXP700/EXP810
Always check the firmware package readme file and the code matching before updating
All hardware errors should be resolved before updating the firmware, except the JFQ3/JFQ4 issue


DS4000 firmware update


The normal DS4000 firmware update sequence is:

1. Update the ESM code
2. Update the controller firmware/NVSRAM code

A DDM code update requires I/O to be stopped on the hosts
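
The ordering and the I/O precondition can be captured in a few lines. This is a planning sketch under stated assumptions: the component labels and the function are hypothetical, and placing the DDM update last is an assumption, since the slide only says that it requires host I/O to be stopped.

```python
# Order taken from the slide: ESM code first, then controller/NVSRAM.
UPDATE_ORDER = ["ESM", "controller/NVSRAM"]


def update_plan(components, host_io_stopped=False):
    """Return the requested firmware updates in the order to apply them."""
    steps = [name for name in UPDATE_ORDER if name in components]
    if "DDM" in components:
        if not host_io_stopped:
            raise RuntimeError("stop host I/O before updating DDM code")
        steps.append("DDM")  # assumed position: after the other updates
    return steps
```

Calling `update_plan({"controller/NVSRAM", "ESM"})` returns the ESM step first regardless of the order in which the components were requested.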


Check the DS4000 firmware


ESM update

ESM ESM 5


Controller firmware update

Controller firmware and NVSRAM can be updated at the same time
The update takes about 15-20 minutes


DDM drive update

The hosts must stop I/O while the drive firmware is being updated


Case scenarios
The DS4000 has JFQ3/JFQ4 disks and needs a firmware update
The DS4000 has EXP710 attached and needs a firmware update from 06.12.16.00 to 06.60.08.00
Storage Manager (SM) loses its connection to the DS4000 while the controller firmware is being updated


DS4000 data collection


All support data
Serial port output


All support data


Connect to both controllers through a hub when collecting the ASD
If a hub is not available, collect two ASDs, one from ctrl A and one from ctrl B
To make the drive link statistics more accurate, it is recommended to run the following script commands 15-30 minutes before collecting the ASD:

clear allDriveChannels stats;
reset storageSubsystem RLSBaseline;
reset storageSubsystem SOCBaseline;
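
If the baseline reset is done from the command line rather than the script editor, the three commands can be written to a script file and handed to SMcli. The sketch below only builds the script text and the command line and does not run anything. The `-f` script-file option and the controller addresses are assumptions to verify against the Storage Manager CLI documentation for your release, and the helper names are hypothetical.

```python
# Script commands from the slide, one per line.
BASELINE_COMMANDS = [
    "clear allDriveChannels stats;",
    "reset storageSubsystem RLSBaseline;",
    "reset storageSubsystem SOCBaseline;",
]


def baseline_script():
    """Return the script file content that resets the link statistics."""
    return "\n".join(BASELINE_COMMANDS) + "\n"


def smcli_command(ctrl_a, ctrl_b, script_path):
    """Build an SMcli argument list (assumes SMcli accepts '-f <scriptfile>')."""
    return ["SMcli", ctrl_a, ctrl_b, "-f", script_path]


if __name__ == "__main__":
    # Placeholder addresses; replace with the real controller addresses.
    print(baseline_script(), end="")
    print(" ".join(smcli_command("192.168.128.101", "192.168.128.102", "baseline.scr")))
```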



Serial port output


Use PuTTY with a serial cable to connect to the DS4000
Connection parameters


Serial port output


Commands whose output should be collected

FW6:
loadDebug
moduleList 1
arrayPrintSummary
netCfgShow
inetstatShow
moduleShow
cfgUnitList
vdAll
vdShow
ghsList
printBatteryAge
cfgPhyList
hwLogShow
excLogShow
spmShowMaps
spmShow
getObjectGraph_MT 1
getObjectGraph_MT 4
getObjectGraph_MT 8
ccmStateAnalyze 8
fcDevs 1
i
fc 111
ionShow 99
hdd 5
fcAll
socShow
showEnclosures
showEnclosuresPage81
unld ffs:Debug

loadDebug
moduleList 1
evfShowOwnership
cmgrShow
vdmShowDriveList
vdmShowRAIDVolList
vdmDrmShowMgr
vdmShowVGInfo
evfShowAllVols
bmgrShow 15
bidShow 255
tditnall
iditnall
fcnShow
chall
luall
ionShow 12
fcAll 10
showSdStatus
ionShow 99
discreteLineTableShow
ssmShowTree 2
socShow
showEnclosuresPage81
excLogShow
hwLogShow
spmShowMaps
spmShow
fcHosts 3
getObjectGraph_MT 1
getObjectGraph_MT 4
getObjectGraph_MT 8
ccmShowState
netCfgShow
inetstatShow
dqlist
taskInfoAll 3
tpgmShowSummary
unld ffs:Debug


DS4000 material
Redbook
Firmware package


Q&A
