TROUBLESHOOTING:
USE METHOD
Chad Dorton
06/27/2018
▪ Does the problem affect other people or applications (or is it just you)?
▪ What is the environment? What software and hardware are used? (Versions? Configuration? etc.)
▪ While busy, a device may still be able to accept more work (until saturation)
▪ When load is added to a system past saturation, latency increases
[Figure: Latency vs. Throughput — latency (ms) vs. ops/sec for 1–64 threads on an 8-core VM]
[Diagram: application tiers — UI and Interface, Cache (memcache), DB-RO and DB-RW (logs/config), dispatch]
[Diagram: mail cluster — vip fronting supermx/filterque MX nodes, dispatch, Engine hosts (mdac) with Storage (quarantine), pgsql (maint), farmd on each host, plus Metrics (pulse), Admin (rsyslog), and DNS (mdlocal)]
▪ Per-process tools: ps, top, pmap, strace, gdb, blktrace
Type                          Source
Per-process counters          /proc
System-wide counters          /proc, /sys
Device driver and debug info  /sys
Per-process tracing           ptrace, uprobes
Network tracing               libpcap
System-wide tracing           tracepoints, kprobes, ftrace
▪ Important switches/parameters
▪ None
▪ Example:
root@m0131372:~# uptime
00:53:59 up 2:34, 4 users, load average: 15.90, 10.27, 4.51
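A quick way to put numbers like these in context (a rough heuristic, not a rule from the deck) is to compare the load averages against the core count:

```python
import os

# os.getloadavg() returns the same 1-, 5-, and 15-minute averages
# that uptime prints (Unix only).
one, five, fifteen = os.getloadavg()
cores = os.cpu_count()

print(f"load {one:.2f}/{five:.2f}/{fifteen:.2f} on {cores} cores")

# Heuristic: a 1-minute average well above the core count, as in the
# example above (15.90 on 8 cores), suggests CPU/run-queue saturation;
# well below it, the CPUs have headroom.
if one > cores:
    print("possible saturation")
```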
▪ An experiment:
▪ Conclusion:
[Figure: experiment result — value (0–1) vs. time (0–660 s)]
root@m0131372:~# vmstat -w 1
procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu--------
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 32567528 14864 195140 0 0 5 3 55 37 7 0 93 0 0
1 0 0 32567512 14864 195140 0 0 0 0 552 353 13 0 88 0 0
1 0 0 32567512 14864 195140 0 0 0 0 561 367 13 0 88 0 0
1 0 0 32567512 14864 195140 0 0 0 0 557 371 13 0 88 0 0
1 0 0 32567512 14864 195140 0 0 0 0 588 398 13 0 88 0 0
02:57:37 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
02:57:38 AM all 12.52 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 87.48
02:57:38 AM 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
02:57:38 AM 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
02:57:38 AM 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
02:57:38 AM 3 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
02:57:38 AM 4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
02:57:38 AM 5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
02:57:38 AM 6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
02:57:38 AM 7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
▪ %nice: user time for nice’d procs
▪ %irq: hardware interrupt CPU usage
▪ %soft: software interrupt CPU usage
▪ %gnice: CPU time spent running nice’d guests
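The mpstat output above also shows why system-wide averages hide single-CPU saturation: the "all" row is just the mean of the per-CPU rows, so one pegged core on an 8-CPU box reads as only ~12.5% overall.

```python
# Per-CPU %usr from the mpstat example: CPU 3 pegged, others idle.
per_cpu_usr = [0.0, 0.0, 0.0, 100.0, 0.0, 0.0, 0.0, 0.0]

# The "all" row averages across CPUs, hiding the hot core.
overall = sum(per_cpu_usr) / len(per_cpu_usr)
print(overall)  # 12.5 -- compare the 12.52 in the "all" row
```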
root@m0131372:~# sar -q 1
Linux 4.4.0-83-generic (m0131372.ppops.net) 06/05/2018 _x86_64_ (8 CPU)
Virtual Memory
[Diagram: anonymous pages paged between main (physical) memory and the swap device]
▪ Filesystem Paging
▪ Reading and writing of memory-mapped files (mmap())
▪ Code execution
▪ Reading from/writing to Filesystem Page Cache
▪ The good kind of paging!
▪ Anonymous Paging
▪ Writing out (and reading back in) private process pages
▪ Pages live in anonymous swap space (swapon())
▪ The bad kind of paging!
▪ Sometimes referred to as “swapping.”
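A minimal illustration of the filesystem ("good") kind of paging, using Python's mmap (not from the deck): reads through a memory mapping are served from the filesystem page cache.

```python
import mmap
import os
import tempfile

# Write a small file, then map it read-only; the first access
# faults the page in through the filesystem page cache.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello page cache")
    path = f.name

with open(path, "rb") as fh, \
        mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as m:
    head = m[:5]   # served via the page cache
    print(head)    # b'hello'

os.unlink(path)
```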
▪ vm.swappiness:
▪ Parameter between 0 and 100
▪ Higher value
- Favors freeing memory by paging applications (anonymous paging)
▪ Lower value
- Favors freeing memory by reclaiming page cache
▪ My view:
▪ Set to 0 (1 on older kernels)
▪ Is it really ever ok for your production app to page anonymously?
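Checking and applying the setting is standard sysctl usage (the value 0 follows the recommendation above; the drop-in file name is arbitrary):

```shell
# Check the current value
sysctl vm.swappiness

# Apply to the running kernel
sudo sysctl -w vm.swappiness=0

# Persist across reboots
echo 'vm.swappiness = 0' | sudo tee /etc/sysctl.d/99-swappiness.conf
```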
▪ Important switches/parameters
▪ Specify units (-m megabytes, -g gigabytes)
▪ Example:
root@m0131372:~# free -m
total used free shared buff/cache available
Mem: 32158 163 31503 49 491 31571
Swap: 7629 0 7629
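A note on reading this output: "available" is the number to watch, not "free", because most of buff/cache is reclaimable on demand. Using the values from the free -m example above:

```python
# Values from the free -m example above (MB).
total, used, free, shared, buff_cache, available = (
    32158, 163, 31503, 49, 491, 31571)

# total is (roughly) used + free + buff/cache:
print(used + free + buff_cache)   # 32157, within rounding of total

# "available" exceeds "free" because reclaimable cache counts
# toward memory obtainable without swapping:
print(available - free)           # 68 MB of extra headroom
```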
root@m0131372:~# vmstat -w 1
procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu--------
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 32567528 14864 195140 0 0 5 3 55 37 7 0 93 0 0
1 0 0 32567512 14864 195140 0 0 0 0 552 353 13 0 88 0 0
1 0 0 32567512 14864 195140 0 0 0 0 561 367 13 0 88 0 0
1 0 0 32567512 14864 195140 0 0 0 0 557 371 13 0 88 0 0
1 0 0 32567512 14864 195140 0 0 0 0 588 398 13 0 88 0 0
12:30:57 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
12:30:58 AM 32257328 672660 2.04 56292 358788 314860 0.77 310584 172492 8
12:30:59 AM 32257328 672660 2.04 56292 358788 314860 0.77 310584 172492 8
12:31:00 AM 32257328 672660 2.04 56292 358788 314860 0.77 310584 172492 8
12:31:01 AM 32257328 672660 2.04 56292 358788 314860 0.77 310584 172492 8
12:31:02 AM 32257328 672660 2.04 56292 358788 314860 0.77 310584 172492 8
12:44:13 AM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
12:44:14 AM 0.00 0.00 18.00 0.00 38.00 0.00 0.00 0.00 0.00
12:44:15 AM 0.00 0.00 0.00 0.00 29.00 0.00 0.00 0.00 0.00
12:44:16 AM 0.00 0.00 0.00 0.00 29.00 0.00 0.00 0.00 0.00
12:44:17 AM 0.00 0.00 0.00 0.00 45.00 0.00 0.00 0.00 0.00
12:44:18 AM 0.00 0.00 0.00 0.00 29.00 0.00 0.00 0.00 0.00
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
599172 599172 100% 0.57K 21399 28 342384K radix_tree_node
2003118 1503603 75% 0.10K 51362 39 205448K buffer_head
10110 5559 54% 1.05K 337 30 10784K ext4_inode_cache
35238 23029 65% 0.19K 1678 21 6712K dentry
10528 10528 100% 0.55K 376 28 6016K inode_cache
root@m0131372:~# vmstat -w 1
procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu--------
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 32567528 14864 195140 0 0 5 3 55 37 7 0 93 0 0
1 0 0 32567512 14864 195140 0 0 0 0 552 353 13 0 88 0 0
1 0 0 32567512 14864 195140 0 0 0 0 561 367 13 0 88 0 0
1 0 0 32567512 14864 195140 0 0 0 0 557 371 13 0 88 0 0
1 0 0 32567512 14864 195140 0 0 0 0 588 398 13 0 88 0 0
root@m0131372:~# df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
udev 4114281 386 4113895 1% /dev
tmpfs 4116262 2566 4113696 1% /run
/dev/sda3 6053888 101781 5952107 2% /
tmpfs 4116262 1 4116261 1% /dev/shm
tmpfs 4116262 3 4116259 1% /run/lock
tmpfs 4116262 16 4116246 1% /sys/fs/cgroup
/dev/sda1 60960 307 60653 1% /boot
tmpfs 4116262 4 4116258 1% /run/user/14480
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
48824 48824 100% 0.12K 718 68 5744K kernfs_node_cache
26502 26502 100% 0.19K 631 42 5048K dentry
18582 18582 100% 0.55K 326 57 10432K inode_cache
11466 11466 100% 0.10K 294 39 1176K buffer_head
10864 10864 100% 0.07K 194 56 776K Acpi-Operand
▪ Real curves are complex: often reflect multiple inflection points before saturation
▪ Might mean something is broken/misconfigured
[Figure: 4K Writes — latency (ms) vs. IOPS for 1–256 threads, with the saturation point marked]
L = λW
Avg. # of requests in system = arrival rate × avg. time in system
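As a quick sanity check with made-up numbers: a device completing 200 requests/sec, each spending 5 ms in the system, holds one request on average.

```python
# Little's Law: L = lambda * W
arrival_rate = 200       # requests/sec (hypothetical)
time_in_system = 0.005   # 5 ms average time in system (wait + service)

L = arrival_rate * time_in_system
print(L)  # 1.0 -> on average one request resident in the system
```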
[Diagram: queueing system — arrivals → queue → service center → departures]
[Diagram: disk system — arrivals → queue → disk device → departures]
[Diagram: network system — arrivals → queue → on the wire / in transmission → departures]
Throughput = Requests in Flight / Latency   (assuming constant latency)
Latency = Requests in Flight / Throughput   (assuming constant throughput)
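Plugging hypothetical numbers into the throughput = in-flight / latency relation:

```python
# Little's Law rearranged: throughput = requests in flight / latency.
requests_in_flight = 32   # hypothetical concurrency (queue + in service)
latency_s = 0.004         # 4 ms average time in system

throughput = requests_in_flight / latency_s
print(throughput)  # 8000.0 ops/sec
```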
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 1.99 0.01 1.62 0.18 19.66 24.39 0.00 0.36 2.39 0.34 0.21 0.03
sdb 0.00 173.01 0.00 48.78 0.02 3603.72 147.76 0.05 1.05 1.19 1.05 0.49 2.39
dm-0 0.00 0.00 0.00 90.72 0.01 1704.58 37.58 0.06 0.67 1.36 0.67 0.13 1.13
dm-1 0.00 0.00 0.00 129.11 0.01 1899.14 29.42 0.09 0.69 2.22 0.69 0.10 1.31
▪ rrqm/s: read requests merged/queued per second
▪ wrqm/s: write requests merged/queued per second
▪ r/s: read requests completed per second
▪ w/s: write requests completed per second
▪ rkB/s: KB read per second
▪ wkB/s: KB written per second
▪ avgrq-sz: average request size (in 512b blocks)
▪ avgqu-sz: average IO queue depth
▪ await: average request time (includes wait, ms)
▪ r_await: average read request time (ms)
▪ w_await: average write request time (ms)
▪ svctm: here be dragons*
▪ %util: here also be dragons*
© 2016 Proofpoint, Inc.
iostat Caveats
▪ svctm
▪ man page: “The average service time (svctm field) value is meaningless”
▪ Why? Kernel measures stats at block/request level (not the device)
▪ Service time is inferred from %util and total IOPS
▪ %util
▪ Percentage of time that I/O requests were issued to the device
▪ This doesn’t work for:
- RAID Arrays
- SSDs
- Why? Parallelism
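To see why svctm is derived rather than measured, a sketch using the sdb numbers from the iostat example above:

```python
# iostat does not measure svctm; it infers it from utilization and
# IOPS: svctm ~= busy_fraction / completions_per_sec.
util = 0.0239      # 2.39% busy, from the sdb row above
iops = 48.78       # completed requests/sec on sdb

svctm_ms = util / iops * 1000
print(round(svctm_ms, 2))  # 0.49 -- matches iostat's sdb svctm

# On SSDs and RAID arrays the device serves many requests in
# parallel while "busy", so this inferred per-request time (and
# %util itself) stops meaning anything useful.
```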
11:17:06 PM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
11:17:07 PM dev8-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
11:17:07 PM dev8-16 53.00 0.00 8880.00 167.55 0.03 0.60 0.38 2.00
11:17:07 PM dev252-0 117.00 0.00 7512.00 64.21 0.09 0.79 0.10 1.20
11:17:07 PM dev252-1 116.00 0.00 1368.00 11.79 0.05 0.45 0.07 0.80
Total DISK READ : 0.00 B/s | Total DISK WRITE : 2.71 M/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 3.41 M/s
TID PRIO USER DISK READ DISK WRITE> SWAPIN IO COMMAND
17966 be/4 postfix 0.00 B/s 1693.11 K/s 0.00 % 0.34 % cleanup -z -t unix -u -c
19865 be/4 postfix 0.00 B/s 479.33 K/s 0.00 % 0.32 % cleanup -z -t unix -u -c
20054 be/4 postfix 0.00 B/s 405.88 K/s 0.00 % 0.24 % cleanup -z -t unix -u -c
20050 be/4 postfix 0.00 B/s 81.18 K/s 0.00 % 0.10 % cleanup -z -t unix -u -c
20053 be/4 postfix 0.00 B/s 46.39 K/s 0.00 % 0.07 % cleanup -z -t unix -u -c
RWBS
▪ R - Read
▪ W - Write
▪ B - Barrier
▪ S - Synchronous
▪ D - Discard
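The RWBS field of a trace line can be decoded with a tiny lookup (a hypothetical helper for illustration, not part of the blktrace tooling):

```python
# Decode a blktrace RWBS field, e.g. "WS" = synchronous write.
FLAGS = {"R": "Read", "W": "Write", "B": "Barrier",
         "S": "Synchronous", "D": "Discard"}

def decode_rwbs(rwbs):
    return [FLAGS[c] for c in rwbs if c in FLAGS]

print(decode_rwbs("WS"))  # ['Write', 'Synchronous']
```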
▪ A - Remap
▪ Q - Intent to queue
▪ G - Get request struct
▪ I - Insert into queue
▪ D - Submitted to driver
▪ C - IO complete
▪ P - Plug queue; T/U - Unplug queue
▪ F/M - Merged
10:08:14 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
10:08:15 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
10:08:15 PM eth0 78.00 0.00 4.55 0.00 0.00 0.00 0.00 0.00
10:08:15 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
10:08:16 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
10:08:16 PM eth0 56.00 1.00 3.27 0.39 0.00 0.00 0.00 0.00
09:52:40 PM IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s rxfram/s rxfifo/s txfifo/s
09:52:41 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09:52:41 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09:52:41 PM IFACE rxerr/s txerr/s coll/s rxdrop/s txdrop/s txcarr/s rxfram/s rxfifo/s txfifo/s
09:52:42 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09:52:42 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
▪ rxdrop/s, txdrop/s: packets dropped per second because of a lack of space in Linux buffers
▪ rxfifo/s, txfifo/s: FIFO overrun errors per second