Extreme Flash and NVMe Service V5 PDF
Dec 2014
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 2
Introducing Exadata X5-2 Extreme Flash (EF) Storage Server
Industry Leading I/O Performance
• All Flash, Scale-out, Highly Available, InfiniBand Connected Smart Storage
• 8x front mounted 1.6TB PCIe flash drives
– State-of-the-art NVMe interface optimized for low overhead
– No flash cache misses, so predictably low flash response times
• Replaces High Performance (HP) disk configuration
– Similar capacity – 12.8 TB Extreme Flash vs 14.4 TB High Performance Disk
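The capacity figures above follow from simple arithmetic, sketched here as a quick check. The constant names are illustrative only; the numbers are the ones quoted on this slide.

```python
# Figures taken from the slide; names are illustrative.
DRIVES_PER_SERVER = 8        # front-mounted PCIe flash drives
DRIVE_CAPACITY_TB = 1.6      # per-drive capacity in TB
HP_DISK_CAPACITY_TB = 14.4   # High Performance disk configuration in TB

ef_capacity_tb = DRIVES_PER_SERVER * DRIVE_CAPACITY_TB
ratio = ef_capacity_tb / HP_DISK_CAPACITY_TB

print(f"Extreme Flash raw capacity: {ef_capacity_tb:.1f} TB")
print(f"Share of HP disk capacity: {ratio:.0%}")
```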
NVMe Technology Extreme Flash Server
NVMe Technology High Capacity Server
NVMe Technology
Traditional SAS-3 SSD Architecture
Servicing Differences Between NVMe and SAS
Servicing differences from SAS
● For SFF devices, the PCIe hot-plug procedure MUST be followed.
● If a drive is removed without a hot-removal operation, the system will crash and reset with a PCIe Surprise Link Down error against the drive. This is not a bug; it is a feature.
● A blue LED gives a clear visual indication that the drive is safe to remove. If the blue LED is not lit, do not pull the drive.
● Drives appear as /dev/nvme devices, numbered sequentially rather than by slot number.
● /dev/nvme1n1 – first storage namespace
● /dev/nvme1n1p1 – first partition on storage
● The drive should power on automatically when inserted. /var/log/messages will report that a drive is present and identify its slot ID.
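The device naming described above (controller number, then namespace, then partition) can be split apart with a small parser. This is a minimal sketch: the `NvmeName` type and `parse_nvme_name` function are hypothetical helpers, not part of any Oracle tool. Because numbering is sequential, the controller number it returns should not be read as a slot number.

```python
import re
from typing import NamedTuple, Optional


class NvmeName(NamedTuple):
    controller: int            # the N in /dev/nvmeN
    namespace: Optional[int]   # the n1 part, if present
    partition: Optional[int]   # the p1 part, if present


# /dev/nvme<controller>[n<namespace>[p<partition>]]
_NVME_RE = re.compile(r"/dev/nvme(\d+)(?:n(\d+)(?:p(\d+))?)?")


def parse_nvme_name(path: str) -> NvmeName:
    """Split an NVMe device path into controller, namespace and partition."""
    m = _NVME_RE.fullmatch(path)
    if m is None:
        raise ValueError(f"not an NVMe device path: {path!r}")
    ctrl, ns, part = m.groups()
    return NvmeName(
        int(ctrl),
        int(ns) if ns is not None else None,
        int(part) if part is not None else None,
    )


print(parse_nvme_name("/dev/nvme1n1p1"))
```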
name: NVME_10
deviceName: /dev/nvme7n1
diskType: FlashDisk
luns: 0_10
makeModel: "Sun Flash Accelerator F160 PCIe Card"
physicalFirmware: 8DV1RA05
physicalInsertTime: 2014-10-20T20:26:03-07:00
physicalSerial: CVMD426500941P6LGN
physicalSize: 1.4554837569594383T
slotNumber: 10
status: normal
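An attribute listing like the one above is easy to turn into a dictionary for scripted checks. The sketch below assumes one `key: value` pair per line, as shown; `parse_attributes` is a hypothetical helper. It splits on the first colon only, so the timestamp value keeps its own colons.

```python
SAMPLE = """\
name: NVME_10
deviceName: /dev/nvme7n1
diskType: FlashDisk
luns: 0_10
makeModel: "Sun Flash Accelerator F160 PCIe Card"
physicalFirmware: 8DV1RA05
physicalInsertTime: 2014-10-20T20:26:03-07:00
physicalSerial: CVMD426500941P6LGN
physicalSize: 1.4554837569594383T
slotNumber: 10
status: normal
"""


def parse_attributes(text):
    """Parse 'key: value' lines into a dict, splitting on the first colon."""
    attrs = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not key/value pairs
        attrs[key.strip()] = value.strip().strip('"')
    return attrs


disk = parse_attributes(SAMPLE)
print(disk["slotNumber"], disk["deviceName"], disk["status"])
```

Note how slot 10 maps to /dev/nvme7n1 here: the device number and the slot number are independent.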
nvme0 is missing the storage namespace [n1], which indicates that the controller has taken the storage offline. See next slide.
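The missing-namespace condition can be spotted by comparing controller nodes against namespace nodes. This is a minimal sketch, assuming the caller supplies a list of device paths (for example from a directory listing of /dev); `controllers_missing_namespace` is a hypothetical helper.

```python
import re

# Controller node, optionally followed by a namespace and a partition suffix.
_NODE_RE = re.compile(r"/dev/nvme(\d+)(n\d+(?:p\d+)?)?")


def controllers_missing_namespace(device_paths):
    """Return controller numbers that have no namespace device node."""
    seen = set()
    has_namespace = set()
    for path in device_paths:
        m = _NODE_RE.fullmatch(path)
        if m is None:
            continue  # ignore unrelated device nodes
        ctrl = int(m.group(1))
        seen.add(ctrl)
        if m.group(2):
            has_namespace.add(ctrl)
    return sorted(seen - has_namespace)


paths = ["/dev/nvme0", "/dev/nvme1", "/dev/nvme1n1", "/dev/nvme1n1p1"]
print(controllers_missing_namespace(paths))
```

With the sample list, only controller 0 is reported, matching the nvme0 case described above.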
Internal Device Error: The command was not completed successfully due to an internal
device error.
ERROR: FPGA presence bit is set for nvme drive NVMe 7, but 'Link Layer Link Active'
bit for downstream switch card port 5 on CPU 1 indicates pcie link is not active
checking pcie link on drive NVMe 8 OK
checking pcie link on drive NVMe 9 OK
checking pcie link on drive NVMe 10 OK
checking pcie link on drive NVMe 11 OK
NVME drives pcie link check: FAILED
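Output like the link check above can be summarized programmatically. The sketch below assumes the message wording is exactly as shown on this slide and that the long ERROR line may be wrapped across two lines; `summarize_link_check` is a hypothetical helper, not part of the diagnostic tool.

```python
import re

SAMPLE = """\
ERROR: FPGA presence bit is set for nvme drive NVMe 7, but 'Link Layer Link Active'
bit for downstream switch card port 5 on CPU 1 indicates pcie link is not active
checking pcie link on drive NVMe 8 OK
checking pcie link on drive NVMe 9 OK
checking pcie link on drive NVMe 10 OK
checking pcie link on drive NVMe 11 OK
NVME drives pcie link check: FAILED
"""

OK_RE = re.compile(r"checking pcie link on drive NVMe (\d+) OK")
ERR_RE = re.compile(
    r"ERROR: FPGA presence bit is set for nvme drive NVMe (\d+), but "
    r"'Link Layer Link Active' bit for downstream switch card port (\d+) "
    r"on CPU (\d+) indicates pcie link is not active"
)


def summarize_link_check(text):
    """Collect passing drives and failing drive/port/CPU triples."""
    flat = " ".join(text.split())  # undo line wrapping before matching
    return {
        "ok": [int(n) for n in OK_RE.findall(flat)],
        "failed": [
            {"drive": int(d), "switch_port": int(p), "cpu": int(c)}
            for d, p, c in ERR_RE.findall(flat)
        ],
        "overall_failed": "pcie link check: FAILED" in flat,
    }


result = summarize_link_check(SAMPLE)
print(result["failed"])
```

The failing entry names the drive, the switch card port, and the CPU, which are the components to inspect first.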
Example output when the cables for Port 0 and Port 1 on the switch card are swapped:
checking DSN on drive NVMe 4 OK
checking DSN on drive NVMe 5 OK
checking DSN on drive NVMe 6
ERROR: PCIE DSN and FRU DSN don't match on drive NVMe 6
PCIE DSN: 55CD2E404BCDB2FF
FRU DSN: 55CD2E404BD5E8A5
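The DSN check output above can be reduced to a list of mismatching drives with both values side by side. This is a sketch under the assumption that each ERROR line is followed by its PCIE DSN and FRU DSN lines, as in the sample; `dsn_mismatches` is a hypothetical helper.

```python
import re

SAMPLE = """\
checking DSN on drive NVMe 4 OK
checking DSN on drive NVMe 5 OK
checking DSN on drive NVMe 6
ERROR: PCIE DSN and FRU DSN don't match on drive NVMe 6
PCIE DSN: 55CD2E404BCDB2FF
FRU DSN: 55CD2E404BD5E8A5
"""

ERR_RE = re.compile(r"ERROR: PCIE DSN and FRU DSN don't match on drive NVMe (\d+)")
DSN_RE = re.compile(r"(PCIE|FRU) DSN: ([0-9A-Fa-f]+)")


def dsn_mismatches(text):
    """Return one dict per drive whose PCIe and FRU DSNs disagree."""
    mismatches = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        err = ERR_RE.fullmatch(line)
        if err:
            current = {"drive": int(err.group(1))}
            mismatches.append(current)
            continue
        dsn = DSN_RE.fullmatch(line)
        if dsn and current is not None:
            # Attach the value to the most recent mismatch entry.
            current[dsn.group(1).lower() + "_dsn"] = dsn.group(2).upper()
    return mismatches


found = dsn_mismatches(SAMPLE)
print(found)
```

A mismatch means the drive answering on that PCIe link is not the one recorded for the slot, which is consistent with crossed cabling.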
● Faults could implicate the CPU, switch card, PCIe slot, drive backplane, or drive.
● Cabling and the drive called out are the best places to start.
● Any unexpected system reset should be checked to see whether it is a Surprise Link Down error.
● If it is, this could indicate operator error, possibly an improper drive removal, rather than a hardware fault.
● Any NVMe SFF drive removed from the system without first being prepared for removal will result in a system reset and a Surprise Link Down fatal PCIe error.
● Improper removal of an SFF NVMe drive from an Extreme Flash server will result in a system crash.
● This error will be diagnosed by ILOM as the system comes back up.
FRU
Status : faulty
Location : /SYS/MB/PCIE2
Chassis
Manufacturer : Oracle Corporation
Name : ORACLE SERVER X5-2L
Part_Number : X5-2L-P1.0-UX1
Serial_Number : 12345678
----------------------------------------
Suspect 2 of 2
Fault class : fault.io.intel.iio.pcie-fatal
Certainty : 50%
Affects : /SYS/MB/P1
Status : faulted
FRU
Status : faulty
Location : /SYS/MB/P1
Name : Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
Part_Number : 060F
Chassis
Manufacturer : Oracle Corporation
Name : ORACLE SERVER X5-2L
Part_Number : X5-2L-P1.0-UX1
Serial_Number : 12345678
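When triaging a fault report like the one above, it can help to pull out just the suspect FRU locations. The sketch below assumes the `Location : <path>` layout shown in the excerpt; `suspect_locations` is a hypothetical helper and the sample is a shortened copy of the output above.

```python
import re

SAMPLE = """\
FRU
Status : faulty
Location : /SYS/MB/PCIE2
Chassis
Manufacturer : Oracle Corporation
Name : ORACLE SERVER X5-2L
----------------------------------------
Suspect 2 of 2
Fault class : fault.io.intel.iio.pcie-fatal
Certainty : 50%
Affects : /SYS/MB/P1
Status : faulted
FRU
Status : faulty
Location : /SYS/MB/P1
"""

LOCATION_RE = re.compile(r"^\s*Location\s*:\s*(\S+)\s*$", re.MULTILINE)


def suspect_locations(text):
    """Return FRU locations in the order they appear in the report."""
    return LOCATION_RE.findall(text)


locations = suspect_locations(SAMPLE)
print(locations)
```

As the bullets above note, a PCIe slot and a CPU listed as suspects after a Surprise Link Down may point to an improper drive removal rather than failed hardware.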