
A short introduction to the SUN STORAGETEK 6140 array

AGR 2017
Overview

• Dual controller
• 1GB cache (2 FC host ports) or 2 GB cache (4 FC host ports) per controller
• 5 to 16 drives (FC or SATA) per tray
• Up to 7 trays (controller tray + 6 expansion trays)
• Up to 112 drives (112 TB raw capacity)
• RAID 0, 1, 1+0, 3, 5 and 6
• Multipathing
• 2 main firmware versions :
• 06.xx
• 07.xx
• Optional functionalities (Premium Features)
Front view

Front LED

Rear view

Controller

• 2 FC host ports / 1 GB cache (375-3375) : LSI 3992
• 4 FC host ports / 2 GB cache (375-3335 / 375-3581) : LSI 3994

Rear LED

Expansion trays

• Tray ID
• I/O module (ESM) : P/N 375-3336
• 2 GB controller : up to 6 expansion trays (7 x 16 = 112 disks)
• 1 GB controller : up to 3 expansion trays (4 x 16 = 64 disks)
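The tray limits above reduce to simple arithmetic. The sketch below only restates the figures from the slides; the names `DRIVES_PER_TRAY`, `MAX_EXPANSION_TRAYS` and `max_drives` are illustrative and do not come from any Sun tool.

```python
# Figures taken from the slides: 16 drive slots per tray, and the number of
# expansion trays allowed by each controller model (identified by cache size).
DRIVES_PER_TRAY = 16
MAX_EXPANSION_TRAYS = {"1GB": 3, "2GB": 6}


def max_drives(controller_cache: str) -> int:
    """Return the maximum drive count: controller tray plus expansion trays."""
    trays = 1 + MAX_EXPANSION_TRAYS[controller_cache]
    return trays * DRIVES_PER_TRAY


if __name__ == "__main__":
    for cache in ("1GB", "2GB"):
        print(cache, "controller:", max_drives(cache), "drives")
```

With 112 drives, the 112 TB raw capacity quoted in the overview corresponds to 1 TB per drive.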
Disks

• FC : 73 / 146 / 300 / 400 / 600 GB, 10k or 15k RPM

• SATA : 500 GB / 1TB / 2TB, 7200 RPM


• SATA drives use a SATA/FC interposer board built into the drive bracket (P/N 541-1406)

Batteries

• Non-SMART batteries
• To be replaced once expired
• Life expectancy : 1170 days (> 3 years)
• Age can be reset
• No learn cycle

• Hot-swappable

• Externally accessible
• No need to offline/remove the controller

• P/N 371-0717 (LSI 13695-0x)


Management Interfaces (1)
• Common Array Manager (CAM)
• BUI (Browser User Interface) : https://xxxxxxxxx:6789
• CLI (sscs and other command sets)
• Solaris, Windows, Linux (CAM 6.10 : Solaris only)

• SANtricity (no longer supported by Sun/Oracle)


• GUI (Java-based)
• CLI (SMcli)
• Any SANtricity-based Storage Manager from another vendor can be used (IBM DS Storage Manager, DELL MD Storage Manager, SGI IS Storage Manager …)

• CAM / SANtricity / Storage Manager are software that must be installed on a host

• Serial connection :
• Menu for configuration (controller IP, array password)
• Full shell access (not documented by Sun/Oracle)

Management Interfaces (2)
• Management methods :
• Out-of-band (network)
• Both controllers must be accessible for some operations
• The same CAM / SANtricity management station can manage several arrays
• Preferred method
• In-band
• Using the direct SCSI connection between host and array
• Requires a special LUN (Universal Xport, usually LUN 31) to be mapped to the host
• Using a special agent, an in-band managed array can be managed by a remote CAM / SANtricity management station

CAM : BUI

SANtricity : GUI

Diagnostic : Support Data (1)
• Collecting Support Data
• CAM (BUI)
https://xxxxxxx:6789
--> Sun StorageTek Common Array Manager
--> select the array (left pane)
--> Service Advisor button (FR: Grille de Service)
--> (new window) Array Troubleshooting and Recovery
(FR: Procédures de dépannage et de reprise de baie)
--> Collecting Support Data (FR: Collecte des données de support)
Then follow the instructions

• CAM (CLI)
--> Identify the array : # ras_admin device_list
--> Collect the Data : # supportData -d <identifier> -p <path> -o <filename>
where <identifier> may be the array name or the IP of one of the controllers

Path to the commands :

Solaris : /opt/SUNWsefms/bin/
Linux : /opt/sun/cam/private/fms/bin/
Windows (CAM <= 5.0.2) : C:\Program Files\Sun_Microsystems\StorageTek_Mgmt\Component\fms\bin\
Windows (CAM >= 5.1) : C:\Program Files\Sun\Common Array Manager\Component\fms\bin\
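As a minimal sketch, the CAM CLI collection step can be wrapped in a small helper that builds the argument list from the paths and options listed above. The function name `support_data_argv` and the example values are hypothetical, and only the CAM >= 5.1 Windows path is included. The helper does not run anything, and on Windows the executable may carry an extension not shown on the slide.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Install directories of the CAM CLI tools, as listed on the slide.
CAM_BIN_DIRS = {
    "solaris": PurePosixPath("/opt/SUNWsefms/bin"),
    "linux": PurePosixPath("/opt/sun/cam/private/fms/bin"),
    "windows": PureWindowsPath(
        r"C:\Program Files\Sun\Common Array Manager\Component\fms\bin"
    ),
}


def support_data_argv(platform, identifier, out_dir, filename):
    """Build the supportData command line (-d, -p, -o as shown on the slide).

    identifier is the array name or the IP address of one controller.
    """
    tool = CAM_BIN_DIRS[platform] / "supportData"
    return [str(tool), "-d", identifier, "-p", out_dir, "-o", filename]


if __name__ == "__main__":
    # Hypothetical array name and output location.
    print(support_data_argv("linux", "array01", "/var/tmp", "array01-sd"))
```

The resulting list can be handed to `subprocess.run` on the management host once the array identifier has been confirmed with `ras_admin device_list`.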
Diagnostic : Support Data (2)
• Collecting Support Data
• SANtricity / Storage Manager GUI
--> double-click on the array to launch the Array Management window
--> Advanced menu
--> Troubleshooting
--> Collect All Support Data
--> Specify a directory and a filename
--> Start

• SANtricity / Storage Manager CLI (SMcli)


--> Identify the array : # SMcli -d -i
--> Collect the Data : # SMcli -n <array_name> -c "save storageArray supportData filename=\"array_name-support.zip\";"

Path to the commands :


Solaris : /opt/SMgr/client/
Linux : /opt/SMgr/client/
Windows : C:\Program Files\StorageManager\client\
Troubleshooting : listing failures (1)
• LED
• Amber LED for tray / controller / IOM / disk / PSU
• CAM BUI
• Alarms pane
• SANtricity / SM
• Recovery Guru window

Troubleshooting : listing failures (2)
• Support Data from CAM
• File alarms.txt
• An alarm is always raised when the installed firmware does not match the baseline expected by the version of CAM (harmless, can be ignored)

Alarm list for device SUN.54065460150.0716AWF00B

Alarm ID : alarm1
Description: Tray.00.Controller.B is at revision "06.60.11.11" baseline version is "06.60.22.10"
Tray.01.IOM.B is at revision "98C1" baseline version is "98D3"
Tray.01.Drive.16 is at revision "3092" baseline version is "3292"

Severity : Major
Element : SUN.54065460150.0716AWF00B
GridCode : 57.75.42
Date : 2014-12-03 00:19:37

Alarm ID : alarm14
Description: A hot spare is in use. The affected virtual disk is vdisk.1, failed drive(s) Tray.00.Drive.03,
spare(s) used Tray.01.Drive.16, the affected volume(s) Volume_tray:0.vdisk:1.lun:0
Severity : Major
Element : t0drive3
GridCode : 57.66.1021
Date : 2017-02-03 00:45:48

• Support Data from SANtricity / SM
• File recoveryGuruProcedures.html (to be opened in a browser)
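To sift a long alarms.txt, a short parser can split the file into records and set aside the harmless baseline-mismatch alarms. This is a sketch that assumes the layout seen in the excerpt above (one `Alarm ID` line per record, continuation lines belonging to the previous field); `parse_alarms` and `is_baseline_mismatch` are illustrative names, not part of CAM.

```python
import re

# Field names as they appear in the alarms.txt excerpt.
FIELD = re.compile(
    r"^(Alarm ID|Description|Severity|Element|GridCode|Date)\s*:\s*(.*)$"
)

SAMPLE = """\
Alarm list for device SUN.54065460150.0716AWF00B

Alarm ID : alarm1
Description: Tray.00.Controller.B is at revision "06.60.11.11" baseline version is "06.60.22.10"
Tray.01.IOM.B is at revision "98C1" baseline version is "98D3"

Severity : Major
Element : SUN.54065460150.0716AWF00B
GridCode : 57.75.42
Date : 2014-12-03 00:19:37

Alarm ID : alarm14
Description: A hot spare is in use. The affected virtual disk is vdisk.1, failed drive(s) Tray.00.Drive.03,
spare(s) used Tray.01.Drive.16, the affected volume(s) Volume_tray:0.vdisk:1.lun:0
Severity : Major
Element : t0drive3
GridCode : 57.66.1021
Date : 2017-02-03 00:45:48
"""


def parse_alarms(text):
    """Split alarms.txt content into a list of dicts, one per alarm."""
    alarms, current, last_key = [], None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD.match(line)
        if match:
            key, value = match.groups()
            if key == "Alarm ID":          # a new record starts here
                current = {}
                alarms.append(current)
            if current is not None:
                current[key] = value
                last_key = key
        elif current is not None and last_key is not None:
            # Continuation line: append it to the previous field.
            current[last_key] += " " + line
    return alarms


def is_baseline_mismatch(alarm):
    """True for the firmware-baseline alarms that the slide says can be ignored."""
    return "baseline version" in alarm.get("Description", "")


if __name__ == "__main__":
    for alarm in parse_alarms(SAMPLE):
        if not is_baseline_mismatch(alarm):
            print(alarm["Alarm ID"], alarm["Element"], alarm["Description"])
```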
Troubleshooting : configuration and status

• Support Data : file storageArrayProfile.txt


• Detailed information on configuration and status

• Support Data : file majorEventLog.txt


• List of all events on the array

• Support Data : file stateCaptureData.[dmp|txt]


• Results of controller shell commands (low level – for advanced analysis)

Troubleshooting : storageArrayProfile.txt

• Contains configuration/status information about :


• Controllers
• Virtual Disks (aka Volume Groups, aka Arrays)
• Volumes (aka Logical Drives)
• Drives
• Channels (Hosts & drives)
• Trays (including batteries, PSUs, etc.)
• ...

• Content differs slightly depending on whether the Support Data comes from CAM or SANtricity

Serial Port : cable

• Mini-DIN / RJ45 cable


• P/N 530-3544
IBM : 13N1932 or 39M5908
DELL : CT109 or MN657
• Some 530-3544 cables are miswired and must be used with the RJ45/DB9 connector P/N 530-3100 (straight-through) instead of connector 371-1107 (null modem)

• Mini-DIN / DB9 (RS232)

• NetApp P/N 23698-00
Serial Port : setting & Service Menu

• Settings :
• Baud rate = 38400
• Data bits = 8
• Stop bits = 1
• Parity = None
• Flow control = None

• Establishing a connection :
• Send BREAK until you get the message
Press the space bar within 5 seconds: <S> for Service Interface. <BREAK> for baud rate
• Press SPACE to set the baud rate
• Send another BREAK until you get the message
Press the space bar within 5 seconds: <S> for Service Interface. <BREAK> for baud rate
• Press « S » to get the Service Interface, or « ESC » to reach the shell
Serial Port : Service Menu

• Password : kra16wen

• Service Interface :
• Showing/setting controller IP address
• Resetting array password (SYMbol password, used for communication between
CAM/SANtricity and the array)

Service Interface Main Menu
==============================
1) Display IP Configuration
2) Change IP Configuration
3) Reset Storage Array (SYMbol) Password
Q) Quit Menu

Enter Selection:
Serial Port : shell (advanced)

• Login / password : shellUsr / y2llojp
• Firmware 06.xx asks for the password only (no login)
• Command sets differ between fw 06.xx and fw 07.xx
• Do not use shell commands unless you know what you are doing

Usual interventions

• Look for alarms


• Either from CAM (select the array, then Alarms) or SANtricity (blinking stethoscope)
• Support Data :
• file alarms.txt (from CAM)
• file recoveryGuruProcedures.html (from SANtricity)

• Follow guidelines from Service Advisor (CAM) or Recovery Guru (SANtricity)

Usual interventions : batteries (1)
• Info about batteries in Support Data :

• SD from CAM : file alarms.txt

• SD from SANtricity : file recoveryGuruProcedures.html

• File storageArrayProfile.txt : look for « Battery »


Battery: Tray.85.Battery.A
Status: Optimal
Age in days: 1033
Days until replacement: 137

Battery: Tray.85.Battery.B
Status: Optimal
Age in days: 1034
Days until replacement: 136

• File stateCaptureData.dmp :
• Fw 06.xx : look for « BATTERY »
• Fw 07.xx : look for « bmgrShow »
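The battery entries of storageArrayProfile.txt can be read programmatically, which also shows how the two counters relate: in the excerpt above, age plus days until replacement equals the 1170-day life expectancy. The sketch below assumes the `key: value` layout of that excerpt; `parse_batteries` is an illustrative helper, not a CAM or SANtricity function.

```python
BATTERY_LIFE_DAYS = 1170  # life expectancy stated on the Batteries slide

SAMPLE_PROFILE = """\
Battery: Tray.85.Battery.A
Status: Optimal
Age in days: 1033
Days until replacement: 137

Battery: Tray.85.Battery.B
Status: Optimal
Age in days: 1034
Days until replacement: 136
"""

WANTED = ("Status", "Age in days", "Days until replacement")


def parse_batteries(profile_text):
    """Return one dict per 'Battery:' entry found in the profile text."""
    batteries, current = [], None
    for line in profile_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Battery":
            current = {"name": value}
            batteries.append(current)
        elif current is not None and key in WANTED:
            # Keep only the first occurrence so later, unrelated "Status:"
            # lines in a full profile do not overwrite the battery status.
            current.setdefault(key, int(value) if value.isdigit() else value)
    return batteries


if __name__ == "__main__":
    for bat in parse_batteries(SAMPLE_PROFILE):
        total = bat["Age in days"] + bat["Days until replacement"]
        print(bat["name"], bat["Status"], "age + remaining =", total)
```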
Usual interventions : batteries (2)

• Non-SMART batteries : to be replaced when they are Failed, Near Expiration or Expired
• No downtime or controller failover is required
• CAM : follow instructions from Service Advisor (FR: Grille de service)
• SANtricity : follow instructions from Recovery Guru
• Reset the age once replaced

Usual interventions : batteries (3)

• Resetting the age (GUI) :


• CAM : select array -> Service Advisor -> array Troubleshooting & Recovery
-> Resetting the Controller Battery Age

• SANtricity : click on the Components icon then Batteries

Usual interventions : batteries (4)
• Resetting the age (CLI) :
• CAM CLI :
service -d arrayname -c reset -t tXbatY
X : tray ID (usually 85) ; Y : slot ID (1 = Ctler A ; 2 = Ctler B)
Solaris: /opt/SUNWsefms/bin
Linux: /opt/sun/cam/private/fms/bin
Windows: c:\Program Files\Sun\Common Array Manager\Component\fms\bin

• SANtricity SMcli :
SMcli -n arrayname [-p password] -c "reset storageArray batteryInstallDate controller=X;"
SMcli <IP_A> [<IP_B>] [-p password] -c "reset storageArray batteryInstallDate controller=X;"
X : either A or B
Solaris, Linux : /opt/SMgr/client/
Windows : C:\Program Files\StorageManager\client\
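The naming rule for the CAM target (`tXbatY`) and the SMcli script can be captured in a few helper functions. This is a sketch built only from the syntax shown above; the function names and the example array name are hypothetical, and the helpers only build argument lists without executing anything.

```python
# Slot numbering given on the slide: 1 = controller A, 2 = controller B.
CONTROLLER_SLOT = {"A": 1, "B": 2}


def cam_battery_target(controller, tray_id=85):
    """Return the CAM target name tXbatY (tray ID is usually 85)."""
    return f"t{tray_id}bat{CONTROLLER_SLOT[controller.upper()]}"


def cam_reset_argv(array_name, controller, tray_id=85):
    """Argument list for the CAM 'service' command that resets the battery age."""
    target = cam_battery_target(controller, tray_id)
    return ["service", "-d", array_name, "-c", "reset", "-t", target]


def smcli_reset_argv(array_name, controller, password=None):
    """Argument list for the equivalent SMcli call (-p is optional)."""
    argv = ["SMcli", "-n", array_name]
    if password is not None:
        argv += ["-p", password]
    script = (
        "reset storageArray batteryInstallDate "
        f"controller={controller.upper()};"
    )
    return argv + ["-c", script]


if __name__ == "__main__":
    print(cam_reset_argv("array01", "B"))
    print(smcli_reset_argv("array01", "B"))
```

Passing the script as a single list element avoids the shell-quoting problems that typographic quotes cause when commands are pasted from slides.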

Usual interventions : batteries (5)
• Resetting the age (serial shell) :
• Serial shell :
• menu « M » (Boot Operations Menu) -> 8) Special Services Menu -> 6) Install Battery

-> M

NOTICE: The BOOT OPERATIONS MENU has been invoked too late for
proper operation of some activities, including Isolation Diagnostics.
You may wish to restart this controller again and press Control-B
IMMEDIATELY after seeing the start-up indicator ("-=<###>=-").

BOOT OPERATIONS MENU

1) Perform Isolation Diagnostics
2) Download Permanent File
3) Reserved
4) Dump NVSRAM Group
5) Patch NVSRAM Group
6) Set Real Time Clock
7) Display Board Configuration
8) Special Services Menu
9) Display Exception Message
10) Serial Interface Mode Menu
11) Display Hardware Configuration
12) Change Hardware Configuration Menu
13) Development Options Menu
14) Display Memory Error Log
15) Manufacturing Setup Menu
R) Restart Controller
Q) Quit Menu

Enter Selection: 8

BOOT SPECIAL OPERATIONS MENU

1) Change Board Serial Number
2) Reinitialize All NVSRAM
3) Change Password
4) Change Ethernet Node Address
5) Change Subsystem Name
6) Install Battery
7) Reserved
Q) Quit Menu

Enter Selection: 6

Please enter battery number to set installation date(0 or 1):0

Current date: 09/28/2017
Current battery 0 installation date: 09/15/2015

Use this operation to inform the controller that the batteries for the
cache memory have been replaced, and to identify the date the new
batteries were installed. (Avoid using this function if the batteries
have not been replaced; otherwise, data still remaining in cache may be lost.)

Do you wish to continue to set the battery installation date? (y/n) y

Enter installation date (mm/dd/yyyy): 09/28/2017

New battery installation date: 09/28/2017
New battery expiration date: 12/11/2020
New battery expiration warning date: 10/30/2020

Press <Enter> to continue

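The dates printed by the controller in the transcript above are consistent with the 1170-day life expectancy: the expiration date is the installation date plus 1170 days. The 42-day warning lead used below is inferred from this single transcript and is an assumption, as is the helper name `battery_dates`.

```python
from datetime import date, timedelta

BATTERY_LIFE_DAYS = 1170   # life expectancy from the Batteries slide
WARNING_LEAD_DAYS = 42     # assumption: inferred from the transcript dates


def battery_dates(installed):
    """Return (expiration date, expiration warning date) for a battery."""
    expiration = installed + timedelta(days=BATTERY_LIFE_DAYS)
    warning = expiration - timedelta(days=WARNING_LEAD_DAYS)
    return expiration, warning


if __name__ == "__main__":
    expiration, warning = battery_dates(date(2017, 9, 28))
    # The controller prints dates as mm/dd/yyyy.
    print("expiration:", expiration.strftime("%m/%d/%Y"))
    print("warning   :", warning.strftime("%m/%d/%Y"))
```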


Usual interventions : disks (1)
• Info about disks in Support Data :

• SD from CAM : file alarms.txt

• SD from SANtricity : file recoveryGuruProcedures.html

• File storageArrayProfile.txt : look for « DRIVES---- »


DRIVES-----------------------

TRAY,SLOT  STATUS   CAPACITY  CURRENT DATA RATE  PRODUCT ID      FIRMWARE VERSION
---------  -------  --------  -----------------  --------------  ----------------
0, 1       Optimal  279 GB    2 Gbit/s           HUS103030FLF21  JFQ8
0, 2       Optimal  279 GB    2 Gbit/s           HUS103030FLF21  JFQ8
0, 3       Optimal  279 GB    2 Gbit/s           HUS103030FLF21  JFQ8

• File stateCaptureData.dmp :
• Fw 06.xx : look for « cfgPhyList »
• Fw 07.xx : look for « vdmShowDriveList »
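The DRIVES table of storageArrayProfile.txt can be turned into records with one regular expression per row. This is a sketch assuming the six columns shown in the excerpt above; a real profile may contain more columns or other capacity units, and `parse_drive_rows` is an illustrative name.

```python
import re

# One data row: "tray, slot  status  capacity  data rate  product id  firmware".
ROW = re.compile(
    r"^\s*(\d+),\s*(\d+)\s+(.+?)\s+(\d+(?:\.\d+)?\s+[GT]B)"
    r"\s+(\d+(?:\.\d+)?\s+Gbit/s)\s+(\S+)\s+(\S+)\s*$"
)

SAMPLE_DRIVES = """\
TRAY,SLOT  STATUS   CAPACITY  CURRENT DATA RATE  PRODUCT ID      FIRMWARE VERSION
---------  -------  --------  -----------------  --------------  ----------------
0, 1       Optimal  279 GB    2 Gbit/s           HUS103030FLF21  JFQ8
0, 2       Optimal  279 GB    2 Gbit/s           HUS103030FLF21  JFQ8
0, 3       Optimal  279 GB    2 Gbit/s           HUS103030FLF21  JFQ8
"""


def parse_drive_rows(text):
    """Return a list of dicts for every line that looks like a drive row."""
    drives = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, separator or unrelated line
        tray, slot, status, capacity, rate, product, firmware = match.groups()
        drives.append({
            "tray": int(tray),
            "slot": int(slot),
            "status": status,
            "capacity": capacity,
            "data_rate": rate,
            "product_id": product,
            "firmware": firmware,
        })
    return drives


if __name__ == "__main__":
    for drive in parse_drive_rows(SAMPLE_DRIVES):
        print(drive["tray"], drive["slot"], drive["status"], drive["firmware"])
```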
Usual interventions : disks (2)

• To be replaced when status is Failed or Impending Drive Failure


• A drive marked as bypassed requires further investigation
• If Impending Drive Failure, drive must be manually failed before replacement
• CAM : select array -> Physical Devices -> Disks -> select a disk -> Fail
• SANtricity : select disk -> Advanced -> Recovery -> Fail Drive
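The replacement rules above can be summarised as a small decision function. This is a sketch of the slide's logic only; the status strings are assumed to match the wording used on the slide, and `drive_action` is an illustrative name.

```python
def drive_action(status):
    """Map a drive status to the action described on the slide."""
    normalized = status.strip().lower().replace("-", "")
    if normalized == "failed":
        return "replace"
    if normalized == "impending drive failure":
        # The drive must be failed manually before it is pulled.
        return "fail manually, then replace"
    if normalized == "bypassed":
        return "investigate further"
    return "no action"


if __name__ == "__main__":
    for status in ("Optimal", "Failed", "Impending Drive Failure", "Bypassed"):
        print(f"{status:25s} -> {drive_action(status)}")
```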

Usual interventions : disks (3)

• If Impending Drive Failure, drive must be manually failed before replacement

• CAM CLI :
service -d arrayname -c fail -t tXdriveY
X : tray ID (usually 85) ; Y : slot ID
Solaris: /opt/SUNWsefms/bin
Linux: /opt/sun/cam/private/fms/bin
Windows: c:\Program Files\Sun\Common Array Manager\Component\fms\bin

• SANtricity SMcli :
SMcli -n arrayname [-p password] -c "set physicalDisk [trayID,slotID] operationalState=failed;"
SMcli <IP_A> [<IP_B>] [-p password] -c "set physicalDisk [trayID,slotID] operationalState=failed;"
Solaris, Linux : /opt/SMgr/client/
Windows : C:\Program Files\StorageManager\client\

Usual interventions : controllers (1)

• Info about controllers in Support Data :

• SD from CAM : file alarms.txt

• SD from SANtricity : file recoveryGuruProcedures.html

• File storageArrayProfile.txt : look for « CONTROLLERS---- »

• File stateCaptureData.dmp :
• Fw 06.xx : look for « getObjectGraph_MT 99 »
• Fw 07.xx : look for « [Controller] »

Usual interventions : controllers (2)

• Follow Service Advisor / Recovery Guru instructions

• Replacement is performed online


• if other controller is online, of course
• if multipathing (from host end) is correctly configured

• If not failed, controller has to be offlined manually (from GUI or CLI)
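The conditions for an online controller replacement can be written down as a simple pre-check. The sketch below only encodes the three points listed above; the function name, parameters and returned structure are hypothetical, and the inputs are expected to come from the operator's own verification.

```python
def controller_replacement_plan(target_failed, other_online, multipath_ok):
    """Return whether an online replacement is possible and the steps to take."""
    blockers = []
    if not other_online:
        blockers.append("the other controller is not online")
    if not multipath_ok:
        blockers.append("host multipathing is not correctly configured")
    if blockers:
        return {"online": False, "blockers": blockers, "steps": []}

    steps = []
    if not target_failed:
        # A controller that has not failed must be placed offline first.
        steps.append("place the controller offline (GUI or CLI)")
    steps.append("replace the controller following Service Advisor / Recovery Guru")
    return {"online": True, "blockers": [], "steps": steps}


if __name__ == "__main__":
    plan = controller_replacement_plan(
        target_failed=False, other_online=True, multipath_ok=True
    )
    for number, step in enumerate(plan["steps"], start=1):
        print(number, step)
```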

Usual interventions : controllers (3)

• Failing (offlining) a controller (GUI) :


• CAM : Service Advisor -> Array Troubleshooting and Recovery
-> Place a Controller Offline -> select the controller -> follow the instructions
• SANtricity : select the controller -> menu Advanced -> Recovery -> Place Controller -> Offline

Usual interventions : controllers (4)

• Failing (offlining) a controller (CLI) :


• CAM CLI :
service -d arrayname -c fail -t X
X : a or b
Solaris: /opt/SUNWsefms/bin
Linux: /opt/sun/cam/private/fms/bin
Windows: c:\Program Files\Sun\Common Array Manager\Component\fms\bin

• SANtricity SMcli :
SMcli -n arrayname [-p password] -c "set controller [X] availability=offline;"
SMcli <IP_A> [<IP_B>] [-p password] -c "set controller [X] availability=offline;"
X : a or b
Solaris, Linux : /opt/SMgr/client/
Windows : C:\Program Files\StorageManager\client\

Usual interventions : controllers (5)

• Failing (offlining) a controller (serial shell) :

• From the other controller (the one to keep online) :

• Fw 06.xx and 07.xx :


-> ld </Debug
-> setControllerToFailed_MT 1
• Fw 07.xx only :
-> cmgrSetAltToFailed

Advanced topics, questions, …

• Issues :
• Controller lockdown
• Controller held in reset
• Unreadable sectors
• RDAC / AVT
• Volume recovery
• …

• Questions

On https://cloud.evernex.com/url/gtscuslh5wvh

- SANtricity 11.30 (for Windows, Linux, Solaris SPARC & x86)


- IBM DS Storage Manager 11.20 and 10.86 for Windows
- Four Support Data archives (from CAM and SANtricity, fw 06.xx and 07.xx)
- This PowerPoint file
