
Troubleshooting Storage Performance

Agenda

VMware ESX Architecture & Storage Stack


Troubleshooting Storage Using ESXTOP

vSphere 5.0 New Storage Features


Storage Load Generators & Tools

Miscellaneous Best Practices and Study Results

VMware ESX Architecture

[Diagram: the ESX/ESXi stack]
- Virtual Machine: Guest OS with its file system, TCP/IP stack and I/O drivers
- Monitor (BT, HW, PV)
- ESXi VMkernel: scheduler, memory allocator, virtual NIC, virtual SCSI, virtual switch, file system, NIC drivers, I/O drivers
- Physical hardware

Disk I/O Latencies

[Diagram: where each latency counter is measured along the I/O path]
- Application / Guest OS
- VMM -> GAVG (latency as seen by the guest)
- vSCSI / ESX storage stack -> KAVG, of which QAVG is the queuing portion
- Driver / HBA / Fabric / Array SP -> DAVG

Time spent in the ESX storage stack is minimal; for all practical purposes KAVG ~= QAVG.
In a well-configured system QAVG should be zero.

KAVG = GAVG - DAVG
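The relationship between the counters can be sanity-checked with a few lines; the numbers below are illustrative, not taken from any real esxtop capture:

```python
def kavg(gavg_ms: float, davg_ms: float) -> float:
    """Kernel latency: guest-observed latency minus device latency."""
    return gavg_ms - davg_ms

# Illustrative values: the guest sees 25.0 ms while the device reports
# 24.5 ms, so only 0.5 ms was spent in the ESX storage stack (KAVG ~= QAVG).
assert kavg(25.0, 24.5) == 0.5
```

In a healthy system this difference stays near zero; a persistently large gap means I/Os are queuing inside the VMkernel rather than at the array.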

Disk I/O Queuing

[Diagram: queues along the I/O path]
- Application / Guest OS -> GQLEN
- VMM / vSCSI / ESX storage stack / Driver -> WQLEN, AQLEN, DQLEN (reported in esxtop)
- HBA / Fabric / Array SP -> SQLEN

- GQLEN: Guest queue
- WQLEN: World queue
- AQLEN: Adapter queue
- DQLEN: Device / LUN queue (can change dynamically when SIOC is enabled)
- SQLEN: Array SP queue
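These queues obey Little's Law: the average number of I/Os in flight equals the I/O rate times the average latency. A quick sketch (the 3200 IOPS / 10 ms figures are illustrative, not from the slides):

```python
def outstanding_ios(iops: float, latency_s: float) -> float:
    """Little's Law: mean I/Os in the system = arrival rate x mean time in system."""
    return iops * latency_s

# Illustrative: a LUN sustaining 3200 IOPS at 10 ms average latency keeps
# 32 I/Os outstanding - exactly filling a DQLEN of 32. Any additional
# demand has to wait in the queues above the device.
assert outstanding_ios(3200, 0.010) == 32.0
```

The same arithmetic works in reverse: given a queue depth and a target latency, it bounds the IOPS one LUN can absorb before queuing begins.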

vSphere Performance Management Tools - vCenter

vCenter Alarms & Charts
- Alarms on static thresholds; an alarm trigger may not always indicate an actual performance problem
- Can show historical trends
- Rough granularity (minimum 20 seconds)

vCenter Operations
- Aggregates metrics into workload, capacity and health scores
- Relies on dynamic thresholds

vSphere Performance Management Tools - ESXTOP

esxtop/resxtop
- For live troubleshooting and root-cause analysis; finer granularity (2-second minimum)
- Lots of metrics reported
- Available in the ESXi shell (esxtop) or via the vSphere Management Assistant (vMA) (resxtop)

ESXTOP screens, grouped by the VMkernel component they cover:
- CPU scheduler: c (cpu, the default), i (interrupts), p (power management)
- Memory scheduler: m (memory)
- Virtual switch: n (network)
- vSCSI: d (disk adapter), u (disk device), v (disk VM)

esxtop disk adapter screen (d)

- Host bus adapters (HBAs): includes SCSI, iSCSI, RAID, and FC HBA adapters
- Latency stats from the device, the kernel and the guest:
  - DAVG/cmd: average latency (ms) from the device (LUN)
  - KAVG/cmd: average latency (ms) in the VMkernel
  - GAVG/cmd: average latency (ms) in the guest

Kernel Latency Average (KAVG)
- The amount of time an I/O spends in the VMkernel (mostly made up of kernel queue time)
- Investigation threshold: > 2 ms; should typically be 0 ms

Device Latency Average (DAVG)
- The latency seen at the device-driver level
- Investigation threshold: > 20 ms; lower is better, some spikes are okay
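The two investigation thresholds can be encoded as a simple triage helper; the function name and messages are our own, not esxtop output:

```python
DAVG_MS = 20.0  # device latency investigation threshold from the slides
KAVG_MS = 2.0   # kernel latency investigation threshold from the slides

def triage(davg_ms: float, kavg_ms: float) -> list:
    """Flag which latency averages exceed their investigation thresholds."""
    flags = []
    if davg_ms > DAVG_MS:
        flags.append("DAVG high: investigate device/fabric/array")
    if kavg_ms > KAVG_MS:
        flags.append("KAVG high: investigate VMkernel queuing")
    return flags

# High device latency with a quiet kernel points below ESX, at the array side.
assert triage(25.0, 0.1) == ["DAVG high: investigate device/fabric/array"]
```

The split matters for root-cause analysis: high DAVG sends you to the SAN team, high KAVG sends you to queue depths and SIOC settings on the host.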

Identifying Queue Bottlenecks

Disk I/O Queuing - Device Queue

[Screenshot: esxtop device screen showing I/O commands in flight and I/O commands waiting in the queue]
- Device queue length, modifiable via a driver parameter

Disk I/O Queuing - World Queue

[Screenshot: esxtop, per-world (World ID) view]
- World queue length, modifiable via Disk.SchedNumReqOutstanding

Disk I/O Queuing - Adapter Queue

- Different adapters have different queue sizes
- The adapter queue can come into play if the total outstanding I/Os exceed the adapter queue depth

Device Queue Full!

[Screenshot: esxtop]
- KAVG is non-zero
- LUN queue depth is 32
- 32 I/Os in flight and 32 queued: a queuing issue
How to Identify Storage Connectivity Issues

CPU State Times

[Diagram: elapsed time breaks down into WAIT (IDLE, SWPWT, MLMTD, and VMWAIT - guest blocked on I/O), RDY, CSTP and RUN]

- Chargeback: %SYS time
- CPU frequency scaling:
  - Turbo Boost: USED > (RUN + SYS)
  - Power management: USED < (RUN + SYS)
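The USED-versus-(RUN + SYS) comparison can be sketched as below; this simplification ignores %OVRLP, which also shifts USED relative to RUN + SYS in real esxtop accounting:

```python
def frequency_state(used_pct: float, run_pct: float, sys_pct: float) -> str:
    """Infer CPU frequency scaling from esxtop state times (simplified)."""
    base = run_pct + sys_pct
    if used_pct > base:
        return "turbo boost"       # effective clock above nominal
    if used_pct < base:
        return "power management"  # effective clock below nominal
    return "nominal"

assert frequency_state(110.0, 90.0, 10.0) == "turbo boost"
assert frequency_state(80.0, 90.0, 10.0) == "power management"
```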

Identifying Storage Connectivity Issues - NFS Connectivity Issue (1 of 2)

[Screenshot: esxtop]
- I/O activity to the NFS datastore
- System time charged for NFS activity

Identifying Storage Connectivity Issues - NFS Connectivity Issue (2 of 2)

[Screenshot: esxtop]
- No I/O activity on the NFS datastore
- VM is not using CPU
- VM is blocked; connectivity to the NFS datastore was lost

New counters in ESX 5.0


Failed Disk I/Os

[Screenshot: esxtop]
- Failed I/Os are now accounted separately from successful I/Os

VAAI: Block Deletion Operations

[Screenshot: esxtop]
- New set of VAAI stats for tracking block deletion
- VAAI: vStorage API for Array Integration

Low-Latency Swap (Host Cache)

[Screenshot: esxtop showing the low-latency (SSD) swap counters]

Performance Impact of Swapping

[Screenshots: esxtop]
- Some swapping activity
- Time spent in blocked state due to swapping

vSphere 5.0
New Storage Features


vSphere 5.0 - Storage Performance Features / Studies

- SIOC for NFS: cluster-wide virtual machine storage I/O prioritization
- SDRS: intelligent placement and ongoing space & load balancing of virtual machines across datastores in the same datastore cluster
- VAAI: vSphere Storage APIs for Array Integration primitives for thin provisioning
- 1 million IOPS: vSphere 5.0 can support astonishingly high levels of I/O operations per second, enough to support today's largest and most consolidated cloud environments
- FCoE performance: new vSphere 5.0 ability to use built-in software-based FCoE virtual adapters to connect to your FCoE-based storage infrastructure

VMware Storage Appliance - VSA

[Diagram: a VSA cluster with 2 members, managed by vCenter Server running the VSA Manager and VSA Cluster Service. Each member exports one datastore: VSA Datastore 1 (Volume 1, plus a replica of Volume 2) and VSA Datastore 2 (Volume 2, plus a replica of Volume 1)]

Affordable shared storage
- Turns internal server storage into fully virtualized, clustered, shared and highly available datastores at a dramatically lower cost than networked, external shared storage

VMware Storage Appliance VSA - Failover

[Diagram: when a VSA member fails, the surviving member serves both NFS exports (VSA NFS IP #1 and VSA NFS IP #2), continuing to present VSA Datastore 1 and VSA Datastore 2 from its local volumes and replicas]

VMware Storage Appliance VSA - Performance

- Ability to support 1000s of IOPS at < 12 ms latency
- http://www.vmware.com/files/pdf/techpaper/vsa-perf-vsphere5.pdf

Storage - I/O Tools


Storage - Iometer (I/O workload generator tool)

Iometer is an I/O subsystem measurement and characterization tool for single and clustered systems.

- Windows and Linux
- Free (open source)
- Single- or multi-server capable
- Multi-threaded

Metrics collected:
- Total I/Os per second
- Throughput (MB/s)
- CPU utilization
- Latency (avg. & max)

Iometer Tips and Tricks

For common Iometer profiles (database, web, etc.) see:
http://blogs.msdn.com/b/tvoellm/archive/2009/05/07/useful-io-profiles-for-simulating-various-workloads.aspx

Make sure to check / try:
- Load balancing / multi-pathing
- Queue depth & outstanding I/Os
- pvSCSI device driver

Look out for:
- I/O contention
- Disk shares
- SIOC & SDRS
- IOPS limits

vscsiStats Details

- vscsiStats characterizes I/O for each virtual disk
- Allows us to separate each type of workload into its own container and observe trends
- Histograms are only collected if enabled; no overhead otherwise

Metrics:
- I/O size
- Seek distance
- Outstanding I/Os
- I/O interarrival times
- Latency

Miscellaneous
Storage Tips and Tricks


Sizing Storage

Rules of thumb:
- 50 - 150 IOPS per VM
- Also weigh useable storage space per RAID level

Throughput and IOPS by RAID level (MB/s is sequential write on 15k disks; IOPS figures are 100% random):

  RAID level   Throughput MB/s   IOPS (write)   IOPS (read)
  RAID 0       175               44             110
  RAID 5       40                31             110
  RAID 6       30                30             110
  RAID 10      85                39             110

Per-drive characteristics (~typical workload: 8K I/O size, 45% write, 80% random; "High Perf. Trans" targets < 15 ms latencies):

  Drive type     MB/sec                      IOPS                          Latency    Use case
  FC 4Gb (15k)   100                         200                           5.5 ms     High Perf. Trans
  FC 4Gb (10k)   75                          165                           6.8 ms     High Perf. Trans
  SAS (10k)      150                         185                           12.7 ms    Streaming
  SATA (7200)    140                         38                            12.7 ms    Streaming / Nearline
  SATA (7200)    68                          38                            12.7 ms    Nearline
  SSD            230 (read) / 180 (write)    25000 (read) / 6000 (write)   < 1 ms     High Perf. Trans / tiered storage / cache
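The rules of thumb above can be combined into a rough capacity estimate. The RAID write penalties below are the standard textbook values (RAID 10: 2, RAID 5: 4, RAID 6: 6), not figures from the table, so treat this as a sketch:

```python
WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 6": 6}

def backend_iops(vm_count: int, iops_per_vm: int, write_fraction: float,
                 raid: str) -> float:
    """Back-end IOPS the spindles must serve for a given front-end demand."""
    front_end = vm_count * iops_per_vm
    writes = front_end * write_fraction
    reads = front_end - writes
    return reads + writes * WRITE_PENALTY[raid]

# 20 VMs x 100 IOPS each with the 'typical' 45% write mix on RAID 5:
# 1100 reads + 900 writes x 4 = 4700 back-end IOPS.
assert abs(backend_iops(20, 100, 0.45, "RAID 5") - 4700.0) < 1e-6
```

The same demand on RAID 10 needs only 2900 back-end IOPS, which is the "count the cost of protection" trade-off in numbers.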

VMDK Workload Consolidation

- Too many sequential threads on a LUN will appear as a random workload to the storage (negative impact on sequential performance)
- Mixing sequential with random can hurt sequential workload throughput (negative impact on sequential performance)
- Group similar workloads together (random with random, sequential with sequential)

Storage Consolidation Changes Access Profile

[Chart: aggregate storage throughput (MBps, 0 - 200) vs. number of hosts (1, 2, 16, 32, 64) for sequential read, sequential write, random read and random write]

- Consolidation may cause multiple simultaneous sequential I/O streams to resemble random I/O

SSD vs. HDD - Benchmarking Storage with VMmark 2.0

VMmark is used to simulate production-level datacenter workloads. Moving to SSD gave:
- An average VMmark 2.1 score improvement of 25.4%
- An average bandwidth increase of 9.6%
- A combined average read latency reduction of 76%

http://blogs.vmware.com/performance/2011/07/analysis-of-storage-technologies-on-clusters-using-vmmark-21.html

Thin Provisioning Performance / Block Zeroing

[Chart: I/O throughput in MB/s by disk type]

- Thin (fully inflated and zeroed) disk: performance equal to a thick eager-zeroed disk
- The performance impact is due to zeroing, not a result of allocating new blocks
- To get maximum performance from the start, you must use thick eager-zeroed disks
- With lazy zeroing, maximum performance happens eventually, but zeroing needs to occur before you can get it

http://www.vmware.com/pdf/vsp_4_thinprov_perf.pdf

Use VMFS

- Always use VMFS: negligible performance cost and superior functionality
- Align VMFS on 64K boundaries (automatic with vCenter):
  www.vmware.com/pdf/esx3_partition_align.pdf
- VMFS is a distributed file system:
  - Be aware of the overhead of excessive metadata updates
  - If possible, schedule maintenance for off-peak hours

[Chart: VMFS scalability - IOPS (0 - 8000) for VMFS, RDM (virtual) and RDM (physical) at 4K, 16K and 64K I/O sizes]

pvscsi and VMDirectPath: To Use or Not To Use?

pvscsi
- Why not use?
  - Could not boot off of pvscsi before U1
  - Not optimized for low-I/O workloads (< 2000 IOPS): http://kb.vmware.com/kb/1017652
- Why use?
  - Up to 50% less CPU usage at high I/O rates
  - Needed to exceed 30K IOPS on a single VM

VMDirectPath
- Why not use?
  - Monopolizes access to the physical hardware by a single VM
  - Pins the VM to a host, disabling VMotion, DRS, FT, etc.
- Why use?
  - Added 10% to extreme network workloads (20 Gbps): http://communities.vmware.com/docs/DOC-12103

Storage Optimization

[Diagram: VMware ESX host with HBA1 - HBA4 connected through an FC switch to a storage array with SP1 (passive/standby) and SP2]

- Over 80% of storage-related performance problems stem from misconfigured storage hardware
- Consult SAN configuration best-practice guides
- Ensure disks are correctly distributed
- Ensure the appropriate controller cache is enabled (caching algorithms)
- Consider tuning the layout of LUNs across RAID sets; count the cost in choosing a level of protection
- Spread I/O requests across available paths

Summary

- Know your workload:
  - Performance: IOPS
  - Throughput: MB/sec
  - Response time: investigate > 30 ms disk latency or > 2 ms kernel latency
- Avoid negatively impacting sequential performance when sharing and consolidating; understand the queuing implications of mapping HBAs to LUNs
- Choose the storage protocol best fitting your requirements and needs
- Use VMFS: no overhead compared to RDM (physical or virtual)
- Thin vs. thick provisioning
- Utilize vSphere storage features: SIOC, VAAI, SDRS, SvMotion
- Follow storage-vendor best practices for block size and alignment

Thank You !!