
Linux Performance & Tools:

Common System Metrics:

Tools: BASIC
# uptime
23:27:29 up 4 min, 2 users, load average: 0.07, 0.26, 0.13

• The load average counts threads that are running on a CPU or runnable and waiting for one. On Linux it also includes tasks blocked in uninterruptible disk I/O.
• These are exponentially damped moving averages, with time constants of 1, 5 and 15 minutes.
• If the load is greater than the CPU count, the CPUs may be 100% utilized and threads are suffering scheduler latency. Because tasks blocked on I/O are counted too, disk I/O can also be a factor.
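
The comparison of load against CPU count can be sketched as a small helper. This is only an illustration: the function name load_per_cpu is hypothetical, and the CPU count of 2 is an assumed value, not something taken from the uptime output.

```shell
#!/bin/sh
# Hypothetical helper: divide a load average by the number of CPUs.
# A result above 1.00 suggests more runnable (or I/O-blocked) tasks than CPUs.
load_per_cpu() {
    # $1 = load average, $2 = number of CPUs
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# 5-minute load from the uptime sample (0.26), assuming a 2-CPU machine.
load_per_cpu 0.26 2

# On a live Linux system the inputs could come from /proc/loadavg and nproc:
#   load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

A value well below 1.00, as here, means the CPUs are not saturated.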

# top

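%CPU is summed across all CPUs over the refresh interval, so a multithreaded process can show more than 100%.

The top command itself can consume CPU.
It can miss short-lived processes and kernel threads (unless these are included through top options).
Allows analysing high-CPU processes: what they are and why they are busy.
High %CPU may actually be memory-stall cycles rather than useful work, in which case upgrading to faster CPUs does not help.
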
# mpstat

Checks for hot-threads and unbalanced workloads.


Columns are summarized system wide in top’s header.

# iostat

%util depends on the target: virtual devices backed by multiple disks may accept more work even at 100% utilization.
Can also calculate I/O controller stats by summing their devices.
It would be useful to see disk errors too (for example through a “-e” option), but Linux iostat does not report them.

# vmstat

The first line of the output includes some summary-since-boot values.


“r” = total number of runnable threads, including those running.
Swapping (also known as paging) hurts performance.
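
A minimal sketch of spotting swap activity in vmstat output follows. It assumes the common procps column order (r b swpd free buff cache si so ...), so that si and so are fields 7 and 8. The function name swap_activity and the sample data lines are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read vmstat output on stdin and report samples
# where swap-in (si, field 7) or swap-out (so, field 8) is non-zero.
# Header lines are skipped because their first field is not a number.
swap_activity() {
    awk '$1 ~ /^[0-9]+$/ && ($7 + 0 > 0 || $8 + 0 > 0) {
        print "swapping: si=" $7 " so=" $8
    }'
}

# Illustrative input (not real measurements): one quiet sample, one swapping sample.
printf '%s\n' \
    ' r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st' \
    ' 1  0      0 752844  10880 522116    0    0    12     3   50   90  1  1 98  0  0' \
    ' 3  1  20480  10240   1024  20480  120  340   500   800  300  700 10  5 60 25  0' |
    swap_activity

# Live usage might look like:  vmstat 1 5 | swap_activity
```
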
# ps
$ ps -ef f
$ ps -eo user,sz,rss,minflt,majflt,pcpu,args (to get custom fields data)

# swapon
Shows which device is used for swap.
$ swapon -s

# lsof

$ lsof /path/to/file (Which process is using the file)


$ lsof +D /path/to/dir (List opened files in a directory)
$ lsof -u <username> (List of files opened by a user)
$ lsof -p <PID> (to view files opened by a process)

$ lsof -i (all network connections opened)


$ lsof -i -a -p <PID> (To list all network files opened by the process)
$ lsof -i :<port> (processes using the given port)
$ lsof -i <tcp or udp> (network files for one protocol)
$ lsof -N -u <username> -a (to list all NFS files used by the user)
$ lsof -iTCP -sTCP:ESTABLISHED (to list all the established network tcp connections)

# free
[root@rhel7 ~] # free
              total        used        free      shared  buff/cache   available
Mem:        1867292      592332      752844       10880      522116     1036852
Swap:       2097148           0     2097148

Buffer: block device I/O cache


Cached: virtual page cache.
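
Because buffers and page cache can be reclaimed, the "available" column is usually a better guide than "free". The sketch below computes available memory as a percentage of total from free output. It assumes a free version that prints the "available" column as the seventh field of the Mem: line. The function name mem_available_pct is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `free` output on stdin and print
# available memory as a percentage of total memory.
mem_available_pct() {
    # On the "Mem:" line, field 2 is total and field 7 is available.
    awk '$1 == "Mem:" { printf "%.1f\n", $7 * 100 / $2 }'
}

# The sample output from the free section above.
mem_available_pct <<'EOF'
              total        used        free      shared  buff/cache   available
Mem:        1867292      592332      752844       10880      522116     1036852
Swap:       2097148           0     2097148
EOF

# Live usage might look like:  free | mem_available_pct
```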

# ping
Simple ICMP test.

# nicstat

Calculates network controller stats by summing the interfaces

# dstat
The dstat output columns indicate:

• CPU stats: percentage of CPU time spent in user (usr) and system (sys) code, idle (idl), waiting on I/O (wai), and servicing hard (hiq) and soft (siq) interrupts.
• Disk stats: total amount of data read (read) and written (writ) on disks.
• Network stats: total number of bytes received (recv) and sent (send) on network interfaces.
• Paging stats: amount of information copied into (in) and moved out of (out) memory.
• System stats: number of interrupts (int) and context switches (csw).

Tools: INTERMEDIATE

# sar (System Activity Reporter)


Examples:
$ sar -u ALL 1 3 (used for viewing CPU every 1 second 3 times)
$ sar -P <cpu_number> <interval_sec> <count> (for viewing individual CPU perf)
$ sar -r (for memory stats) (focus on “kbmemfree” and “kbmemused” for free and used memory)
$ sar -S (for SWAP space) (If the “kbswpused” and “%swpused” are at 0, then your system is not swapping)

$ sar -b (for overall I/O stats)

• tps – Transfers (I/O requests) per second (this includes both read and write)
• rtps – Read requests per second
• wtps – Write requests per second
• bread/s – Blocks read per second (one block is 512 bytes)
• bwrtn/s – Blocks written per second

$ sar -d (for individual block device statistics) (add -p for readable device names: sar -p -d)
$ sar -w (for the total number of processes created per second, and total number of context switches per second)
$ sar -q (for run queue size and load average)
$ sar -n <KEYWORD> (for network statistics)

• DEV – Displays network device vital statistics for eth0, eth1, etc.
• EDEV – Displays network device failure statistics
• NFS – Displays NFS client activities
• NFSD – Displays NFS server activities
• SOCK – Displays sockets in use for IPv4
• IP – Displays IPv4 network traffic
• EIP – Displays IPv4 network errors
• ICMP – Displays ICMPv4 network traffic
• EICMP – Displays ICMPv4 network errors
• TCP – Displays TCPv4 network traffic
• ETCP – Displays TCPv4 network errors
• UDP – Displays UDPv4 network traffic
• SOCK6, IP6, EIP6, ICMP6, UDP6 – The same for IPv6
• ALL – Displays all of the above information. The output will be very long.

$ sar -A > "$(hostname)-$(date +%d-%m-%y-%H%M).log" (to save the output to a file)

# netstat
Gets network protocol statistics
$ netstat -na (to list all the connections and ports)
$ netstat -s (for per-protocol statistics)

and so on.

# pidstat

Breaks activity down by process. The pidstat command monitors individual tasks currently being managed by the Linux kernel.
$ pidstat 1
$ pidstat -drl
-d: I/O statistics
-r: page faults and memory utilization
-l: command name and all its arguments
-p <PID>: report data for a specific process
-u: CPU utilization
$ pidstat -r -p xxxx 2 5
$ pidstat -C "fox|bird" -r -p ALL
$ pidstat -T CHILD -r 2 5

minflt/s: Total number of minor faults the task has made per second (faults that did not require loading a memory page from disk).
majflt/s: Total number of major faults the task has made per second (faults that required loading a memory page from disk).
VSZ: Virtual Size: the virtual memory used by the task
RSS: Resident Set Size: the non-swapped physical memory used by the task
StkSize: memory reserved for the task as stack
StkRef: memory used as stack

# strace

to debug a process
strace /path/executable-name <arguments-if-any>

to debug an already running process


strace -p PID
strace -tttT -p PID (-ttt prints timestamps with microseconds; -T shows the time spent in each system call)

to print a summary of system calls
strace -c dd if=/dev/zero of=/dev/null bs=512 count=1024k

to gauge the overhead strace adds, compare the run time without and with it
time dd if=/dev/zero of=/dev/null bs=512 count=1024k
time strace -c dd if=/dev/zero of=/dev/null bs=512 count=1024k

# gdb

to analyse the core file


gdb /path/to/application /path/to/corefile

# tcpdump
$ tcpdump -n (do not resolve addresses to names)
-vv (very verbose)
-i <int_name> (to specify an interface)
-i any (will listen on all interfaces)
-w /path/to/file (to save the captured packets to a file)
-r /path/to/file (to read packets from a saved file)
-s <number> (snapshot length: bytes captured from each packet)
-c <number> (number of packets to capture)
# tcpdump -w /var/tmp/tcpdata.pcap -i any -c 10 -vvv

Tcpdump filter examples:


# tcpdump -nvvv -i any -c 3 port 22 and port 60738 (packets where one port is 22 and the other is 60738)
# tcpdump -nvvv -i any -c 3 src host 10.0.3.1 (traffic whose source is 10.0.3.1)
# tcpdump -nvvv -i any -c 3 host 10.0.3.1 (all traffic to or from host 10.0.3.1)
# tcpdump -nvvv -i any -c 20 'port 80 or port 443' (traffic on one port or the other)
# tcpdump -nvvv -i any -c 20 '(port 80 or port 443) and host 10.0.3.169' (traffic on either port to or from a specific host)
# tcpdump -nvvv -i any -c 20 '((port 80 or port 443) and (host 10.0.3.169 or host 10.0.3.1)) and dst host 10.0.3.246' (as above for two hosts, restricted to destination 10.0.3.246)

# tcpdump -nvvv -i any -c <n> <protocol>


<protocol> = icmp, udp, tcp, etc.
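
When the port list grows, the "port A or port B" part of a filter can be generated rather than typed. The helper below is a sketch only; the name port_filter is hypothetical, and it simply builds the same parenthesised expression used in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: build a "(port A or port B ...)" filter
# expression from a list of port numbers.
port_filter() {
    expr_text=""
    for p in "$@"; do
        if [ -z "$expr_text" ]; then
            expr_text="port $p"
        else
            expr_text="$expr_text or port $p"
        fi
    done
    printf '(%s)\n' "$expr_text"
}

# Prints: (port 80 or port 443)
port_filter 80 443

# The result could then be passed to tcpdump as a single quoted argument:
#   tcpdump -nvvv -i any -c 20 "$(port_filter 80 443) and host 10.0.3.169"
```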

# blktrace
Allows block device event tracing and investigating I/O latency. Need to mount debugfs on /sys/kernel/debug
$ mount -t debugfs none /sys/kernel/debug
$ btrace /dev/sdb -w N_seconds -n N_buffer -b buff_size

# iotop
Shows disk I/O by process.
The IO> column is the percentage of time the thread spent waiting on I/O.
CONFIG_TASK_IO_ACCOUNTING needs to be enabled in the kernel for this to work.

$ iotop -bod 5 (batch mode, only threads doing I/O, 5-second interval)

# slabtop
Shows where kernel memory is being utilized (check the CACHE SIZE column)
$ slabtop -sc
# sysctl
The /sbin/sysctl command is used to view, set, and automate kernel settings in the /proc/sys/ directory.

$ sysctl -a (display all currently available values)


-w <setting_name>=<value> (to change a setting)
-p /etc/sysctl.conf (to load settings from the specified file; the default is /etc/sysctl.conf)
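
Since sysctl settings live under /proc/sys/, a setting name maps to a file path by replacing the dots with slashes. The sketch below shows that mapping. The function name sysctl_path is hypothetical, net.ipv4.ip_forward is used only as a familiar example key, and names that themselves contain dots (such as some interface names) are not handled.

```shell
#!/bin/sh
# Hypothetical helper: convert a sysctl setting name to its /proc/sys path.
sysctl_path() {
    # Replace every "." in the name with "/".
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Prints: /proc/sys/net/ipv4/ip_forward
sysctl_path net.ipv4.ip_forward

# Reading the value directly might then look like:
#   cat "$(sysctl_path net.ipv4.ip_forward)"
```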

/proc

• /proc/cmdline – Kernel command line
• /proc/cpuinfo – Information about the processors.
• /proc/devices – List of device drivers configured into the currently running kernel.
• /proc/dma – Shows which DMA channels are being used at the moment.
• /proc/fb – Frame buffer devices.
• /proc/filesystems – File systems supported by the kernel.
• /proc/interrupts – Number of interrupts per IRQ, per CPU.
• /proc/iomem – Shows the current map of the system’s memory for its various devices
• /proc/ioports – Provides a list of currently registered port regions used for input or output communication with a device
• /proc/loadavg – Contains the load average of the system
  The first three columns are the load averages over the last 1, 5, and 15 minutes.
  The fourth column shows the number of currently runnable tasks and the total number of tasks.
  The last column displays the most recently used process ID.
• /proc/locks – Displays the files currently locked by the kernel
  Sample line:
  1: POSIX ADVISORY WRITE 14375 08:03:114727 0 EOF
• /proc/meminfo – Current utilization of primary memory on the system
• /proc/misc – Lists miscellaneous drivers registered on the miscellaneous major device, which is number 10
• /proc/modules – Displays a list of all modules that have been loaded by the system
• /proc/mounts – Provides a quick list of all mounts in use by the system
• /proc/partitions – Information on the various partitions currently available to the system
• /proc/pci – Full listing of every PCI device on your system (older kernels only; use lspci on current systems)
• /proc/stat – Keeps track of a variety of different statistics about the system since it was last restarted
• /proc/swaps – Measures swap space and its utilization
• /proc/uptime – Contains information about the uptime of the system
• /proc/version – Version of the Linux kernel and of the gcc used to build it, plus the name of the Linux flavor installed.
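
As an example of reading one of these files, the sketch below splits a line in /proc/loadavg format into its fields. The function name parse_loadavg is hypothetical. The three load values in the sample are taken from the uptime output near the top of these notes, while the task counts and PID are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: split one /proc/loadavg-style line into named fields.
parse_loadavg() {
    # $1 = a line such as "0.07 0.26 0.13 1/123 4567"
    printf '%s\n' "$1" | {
        read -r load_1m load_5m load_15m tasks last_pid
        runnable=${tasks%/*}   # part before the slash
        total=${tasks#*/}      # part after the slash
        echo "1m=$load_1m 5m=$load_5m 15m=$load_15m runnable=$runnable total=$total last_pid=$last_pid"
    }
}

# Sample line; "1/123" and "4567" are made-up values.
parse_loadavg "0.07 0.26 0.13 1/123 4567"

# Live usage might look like:  parse_loadavg "$(cat /proc/loadavg)"
```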

Debug:
Obtain the stack trace, dynamic object dependencies, address map and open file descriptors of a process, then use the gcore utility to get a core file of the process:

# pstack PID > pstack.PID


# pldd PID > pldd.process
# pmap PID > pmap.process
# pfiles PID > pfiles.process
# gcore -o /path/app.core PID

Equivalents for Linux:


ptree -> pstree
pfiles -> lsof -a -p <pid>
pstack -> pstack
truss -> strace
