Theme: Storage “Where bad things happen”
Scope, scope, scope!
What is affected?
• A particular application
• Only one guest
• All guests accessing the same LUN/volume
• All guests running on the same ESXi host
• Are there guests on the same LUN NOT reporting issues?
So, you think it IS storage! Where do I start?
Storage – “Where bad things happen”
• Virtual SCSI: IOps/MBps maximums
• I/O: “not enough speed”
• Array cache: IOps/MBps maximums
• Back-end and device configuration
• Spindles: “just not enough disks”
ESXi host - VMkernel logs
Example #1:
vmkernel: 1:08:42:28.062 cpu3:8374)NMP: nmp_CompleteCommandForPath:2190: Command 0x16 (0x41047faed080) to NMP device "naa.600508b40006c1700001200000080000" failed on physical path "vmhba39:C0:T1:L16" H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
This status (D:0x28, TASK SET FULL) is returned when the LUN stops accepting SCSI commands from initiators due to a lack of resources, namely an exhausted queue depth on the array.
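The H:/D:/P: status triplet in the log line above can be decoded programmatically. Below is a minimal sketch: the `decode_status` helper is illustrative (not a VMware API), and the status table lists a few standard SCSI device-status codes.

```python
import re

# A few standard SCSI device-status codes; 0x28 is TASK SET FULL,
# the "queue full" condition described above.
DEVICE_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}

def decode_status(line):
    """Extract the H:/D:/P: status triplet from a vmkernel NMP log line."""
    m = re.search(r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)", line)
    if not m:
        return None
    host, device, plugin = (int(g, 16) for g in m.groups())
    return {"host": host,
            "device": DEVICE_STATUS.get(device, hex(device)),
            "plugin": plugin}

log = ('Command 0x16 to NMP device "naa.600508b40006c1700001200000080000" '
       'failed on physical path "vmhba39:C0:T1:L16" H:0x0 D:0x28 P:0x0')
print(decode_status(log))  # device decodes to "TASK SET FULL"
```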
ESXi host - VMkernel logs
Example #2:
This status is returned when the HBA driver is unable to issue a command to
the device. This status can occur due to dropped FCP frames in the
environment.
ESXi host - VMkernel logs (ESXi 5.x)
ESXTOP
Ways to run it:
- Local console
- SSH
- vMA (vSphere Management Assistant)
ESXTOP SCREENS
• c: CPU (default)
• i: interrupts
• p: power management
• m: memory
• n: network
• d: disk adapter
• u: disk device
• v: disk VM
(CPU scheduler: c, i, p; memory scheduler: m; virtual switch: n; vSCSI: d, u, v)
ESXtop Disk Adapter Screen (d)
Disk I/O – 3 Main Latencies
The I/O path runs Application -> Guest OS -> VMM -> vSCSI -> ESX storage stack -> driver -> HBA -> fabric -> array SP.
• GAVG (guest latency, seen at the vSCSI layer): GAVG = DAVG + KAVG
• KAVG (VMkernel latency, in the ESX storage stack): KAVG = QAVG + kernel processing time
• QAVG: time the I/O spends in the storage adapter queue
• DAVG (device latency): time spent in the driver, HBA, fabric, and array SP
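The two identities in the diagram are easy to sanity-check numerically. The sketch below uses made-up sample latencies (not real measurements) to show how GAVG decomposes:

```python
# Latency identities from the esxtop stack:
#   KAVG = QAVG + kernel processing time
#   GAVG = DAVG + KAVG
davg = 4.0    # ms spent below the kernel: driver, HBA, fabric, array
qavg = 2.5    # ms spent waiting in the storage adapter queue
kernel = 0.1  # ms of VMkernel processing time

kavg = qavg + kernel
gavg = davg + kavg
print(f"KAVG = {kavg:.1f} ms, GAVG = {gavg:.1f} ms")

# A KAVG much above zero usually means I/O is queuing in the VMkernel.
```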
Interpreting Latency Values
Queue Length (max # of active commands)
From the guest down to the array:
• GQLEN - Guest OS queue length (application / guest OS)
• AQLEN - Adapter queue length (VMkernel / ESX storage stack)
• DQLEN - Device/LUN queue length (driver / HBA)
• SQLEN - Array (SP) queue length (fabric / array SP)
Queuing example
Result: additional I/O will queue up in the VMkernel (up to the ~2,000-command maximum)
Queuing in the VMKernel
When the number of active requests exceeds the device queue depth (here, DQLEN is 32, so 32 I/Os are in flight and the queue is 100% active), all additional I/O (here, 32 queued commands) is queued in the VMkernel and is reflected in QAVG.
In esxtop "u": GAVG = DAVG + KAVG (QAVG + kernel time). A non-zero KAVG means queuing is occurring.
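The split between the device queue and the VMkernel queue can be sketched in a few lines of arithmetic. The numbers mirror the slide's DQLEN = 32 example; the function is illustrative, not an ESXi interface:

```python
def split_outstanding(outstanding, dqlen=32, kernel_max=2000):
    """Return (commands active on the device, commands queued in the VMkernel)."""
    active = min(outstanding, dqlen)                # up to DQLEN I/Os in flight
    queued = min(outstanding - active, kernel_max)  # overflow queues in the kernel
    return active, queued

print(split_outstanding(20))  # (20, 0): fits in the device queue, QAVG stays low
print(split_outstanding(64))  # (32, 32): queue 100% active, the rest shows up in QAVG
```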
Storage Latencies Will Affect CPU State Times (“c” esxtop)
• WAIT: the world is blocked; includes waiting on idle (idle VMX, not much activity) and waiting on a memory page to be swapped to disk
• IDLE: % of time the VCPU is in the idle loop
• SWPWT: % of time the world is blocked waiting for ESX swapping
• VMWAIT: % of time the world is waiting on storage I/O completion
• RDY: % of time the world was not scheduled due to CPU contention or limits
• CSTP: co-de-scheduled state for SMP VMs (the CPU scheduler pausing vCPUs)
• MLMTD: % of time the world was not scheduled due to CPU limit violations
• RUN: % of time the VM is running on a PCPU
No errors on the ESX host or the storage. Now what?
Fibre Channel
• CRC errors
– Bad SFP, cable
• C3 discards, BB_credit exhaustion
– Fabric overloaded, oversubscription
• Fabric routing issues, etc.
iSCSI / NAS
Error Check (FIBRE-CHANNEL)
cat /proc/scsi/qla2xxx/1
Things to check:
- HBA Driver known issues
- Fabric errors
*KB 1005576 – Enabling verbose logging on QLogic and Emulex Host Bus Adapters
Error Check (iSCSI / NFS)
ESXTOP “n”
PORT-ID USED-BY TEAM-PNIC DNAME PKTTX/s MbTX/s PKTRX/s MbRX/s %DRPTX %DRPRX
16777217 iSCSI n/a vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00
16777218 vmnic2 - vSwitch1 1.61 0.00 10.00 0.01 0.00 0.00
16777219 vmk1 vmnic2 vSwitch1 1.20 0.00 6.83 0.01 0.00 0.00
vmkping / tcpdump
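Scanning the %DRPTX / %DRPRX columns is the first thing to automate when hunting iSCSI/NFS drops. A minimal sketch, assuming the column layout matches the header shown above; the second row's drop value is made up for illustration:

```python
header = ("PORT-ID USED-BY TEAM-PNIC DNAME PKTTX/s MbTX/s "
          "PKTRX/s MbRX/s %DRPTX %DRPRX").split()
rows = [
    "16777217 iSCSI n/a vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00",
    "16777219 vmk1 vmnic2 vSwitch1 1.20 0.00 6.83 0.01 0.00 2.50",
]

flagged = []
for row in rows:
    fields = dict(zip(header, row.split()))
    drops = float(fields["%DRPTX"]) + float(fields["%DRPRX"])
    if drops > 0:
        flagged.append(fields["USED-BY"])
        print(f'{fields["USED-BY"]}: dropping packets ({drops:.2f}%)')
```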
Quick Tip
Where:
path is the particular path to be enabled/disabled
device is the NAA ID of the device
state is active or off
Helpful Tools
Guest-level issues
Iometer
Perfmon (Windows) / top (Linux)
Perfmon (Windows)
GAVG should be close to R
The guest and ESX measure latency at different layers:
• A = application latency (application / guest file system)
• R = Perfmon “Avg. Disk sec/Transfer” (Windows I/O drivers)
• S = Windows Physical Disk service time (device queue)
• G = guest latency, GAVG (virtual SCSI)
• K = ESX kernel latency, KAVG (VMkernel: VMFS / NFS client)
• D = device latency, DAVG
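Since Perfmon reports “Avg. Disk sec/Transfer” in seconds and esxtop reports GAVG in milliseconds, comparing them takes one unit conversion. A sketch with illustrative values; a large gap between R and GAVG points at the guest's own storage stack rather than ESX or the array:

```python
def latency_gap_ms(perfmon_sec_per_transfer, gavg_ms):
    """Guest-side overhead: Perfmon R (seconds) minus esxtop GAVG (ms)."""
    return perfmon_sec_per_transfer * 1000.0 - gavg_ms

gap = latency_gap_ms(0.012, 8.0)  # R = 12 ms, GAVG = 8 ms (sample values)
print(f"guest-side overhead: {gap:.1f} ms")
```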
Iometer (I/O workload generator tool)
Simulate I/O
vCenter “Disk” Performance Chart
KAVG
• Kernel Read latency
• Kernel write latency
• Kernel command latency
QAVG
• Queue command latency
• Queue write latency
• Queue read latency
GAVG
• Read latency
• Write latency
• Command latency
DAVG
• Physical device command latency
• Physical device read latency
• Physical device write latency
Capture ESXTOP results while issue exists
Batch mode:
esxtop -b -d 2 -n 100 > esxtopcapture.csv
Where "-b" stands for batch mode, "-d 2" sets a delay of 2 seconds, and "-n 100" runs 100 iterations. In this specific case esxtop will log all metrics for 200 seconds.
- ESXi 5.x
vm-support -p -d <duration in seconds> -i <interval in seconds>
KB articles:
- Gathering esxtop performance data at specific times using crontab (http://kb.vmware.com/kb/1033346)
- Collecting performance snapshots using vm-support (http://kb.vmware.com/kb/1967)
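Since the capture window is simply delay times iterations, the batch-mode arguments can be derived from a target duration. A small sketch; the helper is illustrative, not part of esxtop:

```python
def batch_args(total_seconds, delay=2):
    """Build an esxtop batch-mode command line covering total_seconds."""
    iterations = total_seconds // delay  # esxtop runs delay * iterations seconds
    return f"esxtop -b -d {delay} -n {iterations}"

print(batch_args(200))  # reproduces the slide's example: esxtop -b -d 2 -n 100
```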
Considerations and Recommendations
VAAI (VMware vStorage APIs for Array Integration)
Partition Alignment
Mis-aligned
Aligned
VMFS vs RDM
VMFS is a distributed file system
VMFS has negligible performance cost and superior functionality
Use VMFS, unless RDM is required
[Chart: VMFS scalability – IOPS (0 to 8000) for VMFS, RDM (virtual), and RDM (physical) at 4K, 16K, and 64K I/O sizes]
Virtual Disk modes
Independent persistent
• Changes are persistently written to disk
Independent non-persistent
• Changes are written to a redo log; the ESXi host reads the redo log first on reads (performance hit)
• Changes are lost when the VM is powered off
Snapshot
• Changes are written to a redo log; the ESXi host reads the redo log first on reads (performance hit)
Thick vs Thin (VMDK)
Thick VMDK:
• Eager-zeroed: blocks are zeroed out during VMDK creation; performance hit during creation, but faster later
• Lazy-zeroed: space is allocated first, but blocks are zeroed out on first write
Thin VMDK:
• Same first-write performance hit as thick lazy-zeroed
• Once fully inflated and zeroed, same as thick eager-zeroed
NO real performance difference in steady state; VAAI will offload the zeroing anyway
http://www.vmware.com/pdf/vsp_4_thinprov_perf.pdf
Multipathing policy
Round-Robin policy:
• Utilizes ALL available paths by load-balancing
• Best performance in most cases
ALWAYS USE THE POLICY RECOMMENDED BY YOUR ARRAY VENDOR to avoid issues (e.g. LUN thrashing) that impact performance
Extents vs no extents?
Virtual Storage Adapters
• BusLogic Parallel
• LSI Logic Parallel
• LSI Logic SAS
• PVSCSI (Paravirtual)
– Reduces CPU utilization
– Increased throughput
– Not supported as a boot device for most guest OSes
Throttling I/O per VM
Provide higher share values for I/O-intensive disks.
[Diagram: the ESX server's device queue divided by shares, e.g. 25% / 75%, against 100% of the storage device queue]
With SIOC, latency is controlled.
“Common containers? Why?”
VMDK Workload Consolidation
Sizing Storage
Per-disk throughput and IOPS by RAID level (* 100% sequential write, 15k disks):

RAID level | IOPS* | Write MB/s | Read MB/s
RAID 0     | 175   | 44         | 110
RAID 5     | 40    | 31         | 110
RAID 6     | 30    | 30         | 110
RAID 10    | 85    | 39         | 110

Rules of thumb:
• 50 - 150 IOPs per VM
• FC 4Gb (15k) drive: 100 MB/sec, 200 IOPS, 5.5 ms latency; use case: high-performance transactional
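The table above feeds a classic back-of-envelope sizing: front-end writes are amplified by the RAID write penalty before they hit the spindles. A sketch using the commonly quoted penalties (RAID 0: 1, RAID 10: 2, RAID 5: 4, RAID 6: 6) and the 200 IOPS/15k-disk rule of thumb; the workload numbers are made up:

```python
WRITE_PENALTY = {"RAID 0": 1, "RAID 10": 2, "RAID 5": 4, "RAID 6": 6}

def spindles_needed(read_iops, write_iops, raid="RAID 5", iops_per_disk=200):
    """Ceiling of back-end IOPS over per-disk IOPS."""
    backend = read_iops + write_iops * WRITE_PENALTY[raid]
    return -(-backend // iops_per_disk)  # ceiling division

# 1500 read + 500 write IOPS on RAID 5: 1500 + 500*4 = 3500 back-end IOPS
print(spindles_needed(1500, 500, "RAID 5"))   # 18 disks
print(spindles_needed(1500, 500, "RAID 10"))  # 13 disks
```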
vSphere 5.x
New Storage Features
vSphere 5.x - Storage Performance Features / Studies
VAAI: vSphere Storage APIs for Array Integration primitives for Thin
Provisioning
vFlash in vSphere 5.5
I/O path with vFlash:
• Virtual SCSI
• vFlash Read Cache
• Paths
• Front-end
• Processor
• Array cache
• Back-end
• Spindles
Common causes of Storage Performance issues
Storage Optimization in VMware (Best practices)
• Ensure disks are correctly distributed
• Change Queue Depth values only
Troubleshooting Process revisited:
ESXtop
• DAVG high? -> Check SAN IOPS / latency.
– If SAN IOPS / latency is low, check the fabric / network.
• KAVG high? -> Check QAVG (queue stats).
• Look out for I/O throttling implied by SIOC / shares.
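The decision flow above can be written down as a tiny triage function. The thresholds are illustrative (sustained DAVG above roughly 20-25 ms is a common rule of thumb, not a VMware-mandated limit):

```python
def triage(davg_ms, kavg_ms, qavg_ms, davg_bad=20.0, kavg_bad=2.0):
    """Map esxtop latency readings to the next troubleshooting step."""
    if davg_ms > davg_bad:
        return "check SAN IOPS/latency; if the SAN looks healthy, check fabric/network"
    if kavg_ms > kavg_bad:
        if qavg_ms > 0:
            return "check queue stats (QAVG) and SIOC/shares throttling"
        return "check VMkernel processing (KAVG high without queuing)"
    return "host-side latencies look normal"

print(triage(davg_ms=30, kavg_ms=0.1, qavg_ms=0))
```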
Thank you!
Email: ssiddicky@vmware.com
Presented by,
Sajjad Siddicky , GSS Escalation