Memory and swapping

Two indicators of a RAM shortage are the scan rate and swap device activity.

In both cases, the high activity rate can be due to a process that does not
have a consistent large impact on performance. The processes running on
the system have to be examined to see how frequently they are run and
what their impact is. It may be possible to re-work the program or run the
process differently to reduce the amount of new data being read into
memory. See "Process Memory Usage" below.

Whether or not to provide additional RAM for infrequent processes is a
classic money/performance tradeoff. If cost is more important than
performance, additional virtual memory space must be provided to allow
enough space for the application to run. The cheapest way to do this is to
provide additional swap space. If adequate total virtual memory space is not
provided, new processes will not be able to start. (The system may report
"Not enough space" or "WARNING: /tmp: File system full, swap space limit
exceeded.")

If inadequate physical memory is provided, the system will be so busy
paging to swap that it will be unable to keep up with demand. This state is
known as "thrashing" and is characterized by heavy I/O on the swap device
and horrendous performance. In this state, the scanner can use up to 80% of
CPU. (For a more thorough discussion of paging, see "Paging" below.)

Scan Rate

The page scanning rate is the main tipoff that a system does not have
enough physical memory. Use sar -g or vmstat to look at the scan rate.

With vmstat, use vmstat 30 to check memory usage every 30 seconds. Ignore
the summary statistics on the first line. If page/sr exceeds 200 pages per
second for an extended time, your system may be running short of physical
memory. (Shorter sampling periods may be used to get a feel for what is
happening on a smaller time scale.)
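
The 200 pages-per-second rule of thumb can be checked mechanically. The following is a minimal sketch rather than a finished tool: the function name high_scan_samples is hypothetical, and it assumes the vmstat column header line starts with "r b" and contains a field named sr, as in the Solaris output format.

```shell
# Read `vmstat <interval>` output on stdin and print every sample whose
# scan rate (sr) exceeds the threshold given as $1 (default 200).
high_scan_samples() {
    hs_threshold="${1:-200}"
    awk -v limit="$hs_threshold" '
        # Column header line: remember which field holds "sr".
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) if ($i == "sr") col = i
            next
        }
        # Data lines start with a number. The first one is the
        # since-boot summary, so it is skipped.
        col && $1 ~ /^[0-9]+$/ {
            if (++n == 1) next
            if ($col + 0 > limit) print "sample " n ": sr=" $col
        }
    '
}
```

A typical invocation would be vmstat 30 | high_scan_samples 200. A single flagged sample may be transient; the warning sign is a long run of them.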

A very low scan rate is a sure indicator that the system is not running short
of physical memory. On the other hand, a high scan rate can be caused by
transient issues, such as a process reading large amounts of uncached data.
The processes on the system should be examined to see how much of a
long-term impact they have on performance.

A nonzero scan rate is not necessarily an indication of a problem. Over time,
memory is allocated for caching and other activities. Eventually, the amount
of free memory will fall to the lotsfree level, and the pageout scanner will
be invoked. For a more thorough discussion of the paging algorithm, see
"Paging" below.

Swap Device Activity

The amount of disk activity on the swap device can be measured using iostat.
For Solaris 2.6 and higher, iostat -xPnce provides information on disk activity
on a partition-by-partition basis. For Solaris 2.5.1, iostat -xc provides
information on a disk-by-disk basis, which may be of limited use unless swap
has its own physical disk. sar -d provides similar information, and vmstat
provides some usage information as well.

If there are I/Os queued for the swap device, application paging is occurring.
If there is sustained, heavy I/O to the swap device, a RAM upgrade may be
in order.

Process Memory Usage

The /usr/proc/bin/pmap command is available in Solaris 2.6 and above. It can
help pin down which process is the memory hog. /usr/proc/bin/pmap -x PID prints
out details of memory use by a process.

Summary statistics regarding process size can be found in the RSS column of
ps -ly or top.
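
To find the largest processes quickly, the RSS column can be sorted. This is a rough sketch: top_rss is a hypothetical helper name, and it assumes the ps header names the columns RSS and PID and that the last field of each row is the command name.

```shell
# Read `ps -ly` style output on stdin and print "RSS PID CMD" for the
# N largest processes by resident set size (N defaults to 5).
top_rss() {
    top_rss_count="${1:-5}"
    awk '
        # Header line: locate the RSS and PID columns by name.
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "RSS") rss = i
                if ($i == "PID") pid = i
            }
            next
        }
        # Both columns sit left of any field that can be blank,
        # so their positions are stable across rows.
        rss && pid { print $rss, $pid, $NF }
    ' | sort -k1,1nr | head -n "$top_rss_count"
}
```

For example, ps -ely | top_rss 10 would list the ten largest processes system-wide (the -e flag is added here to select all processes).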

dbx, the debugging utility in the SunPro package, has extensive memory leak
detection built in. The source code will need to be compiled with the -g flag
by the appropriate SunPro compiler.

ipcs -mb shows memory statistics for shared memory. This may be useful
when attempting to size memory to fit expected traffic.

Segmentation Violations

A "segmentation violation fault" results when a process overflows its stack.
The kernel recognizes the violation and can extend the stack size, up to a
configurable limit.

In a multithreaded environment, the kernel does not keep track of each user
thread's stack, so it cannot perform this function. The thread itself is
responsible for stack SIGSEGV (stack overflow signal) handling. (The
SIGSEGV signal is sent by the threads library when an attempt is made to
write to a write-protected page just beyond the end of the stack. This page is
allocated as part of the stack creation request.)

Swap Space

The Solaris virtual memory system combines physical memory with available
swap space via swapfs. If insufficient total virtual memory space is provided,
new processes will be unable to start.

Swap space can be added, deleted or examined with the swap command.
swap -l reports total and free space for each of the swap partitions or files that
are available to the system. Note that this number does not reflect total
available virtual memory space, since physical memory is not reflected in the
output. swap -s reports the total available amount of virtual memory, as does
sar -r.

If swap is mounted on /tmp via tmpfs, df -k /tmp will report on total available
virtual memory space, both swap and physical. As large memory allocations
are made, the amount of space available to tmpfs will decrease, meaning that
the utilization percentages reported by df will be of limited use.

Paging

Solaris uses both common types of paging in its virtual memory system.
These types are swapping (swaps out all memory associated with a user
process) and demand paging (swaps out the not recently used pages). Which
method is used is determined by comparing the amount of available memory
with several key parameters:

• physmem: physmem is the total page count of physical memory.
• lotsfree: The page scanner is woken up when available memory falls
below lotsfree. The default value for this is physmem/64; it can be tuned
in the /etc/system file if necessary. The page scanner runs in demand
paging mode by default. The initial scan rate is set by the kernel
parameter slowscan, which is fastscan/10 by default.
• minfree: Between lotsfree and minfree, the scan rate increases linearly
between slowscan and fastscan. (minfree is set to desfree/2 and fastscan is
set to physmem/4 by default.) If free memory falls below desfree
(lotsfree/2 by default), the page scanner is started 100 times per
second. Each run of the page scanner covers desscan pages. This parameter is
dynamically set based on the scan rate.
• maxpgio: maxpgio (default 40 or 60) limits the rate at which I/O is
queued to the swap devices. It is set to 40 for sun4c, sun4m and sun4u
architectures and 60 for sun4d architectures. If the disks are faster
than 7200rpm, maxpgio can safely be set to 100 times the number of
swap disks.
• throttlefree: When free memory falls below throttlefree (default
minfree), the page_create routines force the calling process to wait until
free pages are available.
• cachefree: If the kernel parameter priority_paging is set to 1 on a Solaris
7 system (or current patchlevels of 2.5.1 or 2.6), only data files will be
targeted by the page daemon until lotsfree is reached. By default,
cachefree is set to 2 x lotsfree. (Solaris 8 uses a different algorithm to
determine which pages are targeted by the page daemon. priority_paging
should not be set on a Solaris 8 machine.)
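
The default ratios in the list above can be turned into concrete numbers for a given machine. The sketch below uses only the ratios quoted in the text and integer arithmetic. The function names paging_defaults and scan_rate_for are hypothetical, and a real kernel may apply limits or /etc/system tuning that this ignores.

```shell
# Print the default thresholds (in pages) for a physmem given in pages.
paging_defaults() {
    pd_physmem="$1"
    pd_lotsfree=$((pd_physmem / 64))
    pd_desfree=$((pd_lotsfree / 2))
    pd_minfree=$((pd_desfree / 2))
    pd_fastscan=$((pd_physmem / 4))
    pd_slowscan=$((pd_fastscan / 10))
    echo "lotsfree=$pd_lotsfree"
    echo "desfree=$pd_desfree"
    echo "minfree=$pd_minfree"
    echo "throttlefree=$pd_minfree"           # defaults to minfree
    echo "cachefree=$((pd_lotsfree * 2))"     # only used with priority_paging
    echo "fastscan=$pd_fastscan"
    echo "slowscan=$pd_slowscan"
}

# Estimate the scan rate for FREE pages on a machine with PHYSMEM pages,
# following the text: linear from slowscan (at lotsfree) to fastscan
# (at minfree), and zero when free memory is at or above lotsfree.
scan_rate_for() {
    sr_free="$1"
    paging_defaults "$2" > /dev/null          # sets the pd_* variables
    if [ "$sr_free" -ge "$pd_lotsfree" ]; then
        echo 0
    elif [ "$sr_free" -le "$pd_minfree" ]; then
        echo "$pd_fastscan"
    else
        echo $((pd_slowscan + (pd_fastscan - pd_slowscan) * (pd_lotsfree - sr_free) / (pd_lotsfree - pd_minfree)))
    fi
}
```

With physmem of 262144 pages, paging_defaults gives lotsfree=4096, desfree=2048 and minfree=1024. Halfway between lotsfree and minfree (2560 free pages), scan_rate_for returns about half of fastscan.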

The page scanner operates by first clearing a usage flag on each page at a
rate reported as "scan rate" in vmstat and sar -g. After handspreadpages
additional pages have been scanned, the page scanner checks to see whether
the usage flag has been set again. If not, the page is paged out. (The default
for handspreadpages is physmem/4 up through Solaris 9. It is set dynamically in
Solaris 10.)

Solaris 8 Paging

Solaris 8 uses a different algorithm for removing pages from memory. This
new architecture is known as the cyclical page cache. It is designed to
remove most of the file system cache-induced problems with virtual
memory. The new system fills the same need as priority paging does for
Solaris 2.5.1-7.

The cyclical page cache uses a file system free list to cache filesystem data
only. Other memory objects are managed on a separate free list. (This
second list would include application binaries, shared libraries, applications
and uninitialized application data.)

With the new algorithm, filesystem cache only competes with itself for
memory. It does not force applications out of primary memory as sometimes
happened with the earlier OS versions.

As a result of these changes, vmstat under Solaris 8 will report different
statistics than would be expected under an earlier version of Solaris:

• Higher Page Reclaim rate.
• Higher reported Free Memory: A large component of the filesystem
cache is reported as free memory.
• Low Scan Rates: Scan rates will be near zero unless there is a
systemwide shortage of available memory.

vmstat -p reports paging activity details for applications (executables), data
(anonymous) and filesystem activity.

Swapping

If the system is consistently below desfree of free memory (over a 30 second
average), the memory scheduler will start to swap out processes. (That is, if
both avefree and avefree30 are less than desfree, the swapper begins to look at
processes.)

Initially, the scheduler will look for processes that have been idle for maxslp
seconds. (maxslp defaults to 20 seconds and can be tuned in /etc/system.) This
swapping mode is known as soft swapping.

Swapping priorities are calculated for an LWP by the following formula:

epri = swapin_time - rss/(maxpgio/2) - pri

where swapin_time is the time since the thread was last swapped, rss is the
amount of memory used by the LWP's process, and pri is the thread's priority.
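
As a worked example of the formula, the sketch below evaluates epri with shell integer arithmetic. The function name swap_epri is hypothetical and the input values are made up for illustration. The units of each term are not stated here, so the numbers only show how the terms interact.

```shell
# usage: swap_epri SWAPIN_TIME RSS MAXPGIO PRI
# Evaluates epri = swapin_time - rss/(maxpgio/2) - pri with integer math.
swap_epri() {
    if [ "$3" -lt 2 ]; then
        echo "maxpgio must be at least 2" >&2
        return 1
    fi
    echo $(( $1 - $2 / ($3 / 2) - $4 ))
}
```

With swapin_time 30, rss 400, maxpgio 40 and pri 5, this gives 30 - 400/20 - 5 = 5. Raising rss to 4000 drops the result to -175, which shows how strongly a large resident set weighs on the value.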

If, in addition to being below desfree of free memory, there are two processes
in the run queue and paging activity exceeds maxpgio, the system will
commence hard swapping. In this state, the kernel unloads all modules and
cache memory that is not currently active and starts swapping out processes
sequentially until desfree of free memory is available.
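
The conditions for soft and hard swapping can be summarized as a small decision function. This is an illustrative sketch only, not kernel logic: swap_mode is a hypothetical name, and "two processes in the run queue" is read here as "at least two", which is an assumption.

```shell
# usage: swap_mode AVEFREE AVEFREE30 DESFREE RUNQUEUE PAGING_IO MAXPGIO
# Prints "none", "soft" or "hard" according to the rules in the text.
swap_mode() {
    # No swapping unless both free-memory averages are below desfree.
    if [ "$1" -ge "$3" ] || [ "$2" -ge "$3" ]; then
        echo none
        return 0
    fi
    # Hard swapping additionally needs a busy run queue (assumed: two or
    # more processes) and paging I/O above maxpgio.
    if [ "$4" -ge 2 ] && [ "$5" -gt "$6" ]; then
        echo hard
    else
        echo soft
    fi
}
```

For instance, swap_mode 1000 1500 2048 3 90 40 prints hard, while the same memory figures with an idle run queue print soft.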

Processes are not eligible for swapping if they are:

• In the SYS or RT scheduling class.
• Being executed or stopped by a signal.
• Exiting.
• Zombie.
• A system thread.
• Blocking a higher priority thread.

Direct I/O

Large sequential I/O can cause performance problems due to excessive use
of the memory page cache. One way to avoid this problem is to use direct I/O
on filesystems where large sequential I/Os are common.

Managing swap in the Solaris OS

Installation of the Solaris OS creates swap space and allocates 512 Mbytes by
default. The Solaris OS supports applying swap to raw disk partitions and to
file systems, and it also uses physical RAM as a swap area. Usually physical
memory is more efficient, but we are always limited by the amount of
physical memory installed on the system.

It's always a good idea to apply swap to a raw partition, as compared to a file
system, because a raw partition doesn't involve the overhead of the file
system.

(Note: I've written this for Solaris versions 7, 8, 9, and 10. That said, I am
pretty sure this is applicable to all the versions.)

Adding Raw Partition swap Space

To add a raw swap partition you need to perform the following steps on your
system:

1. Identify a free disk partition on your system.

2. Add an entry to /etc/vfstab for the new raw partition as a swap partition:

/dev/dsk/c0t1d0s0 - - swap - no -

3. To enable this swap partition, issue the following command:

#swap -a /dev/dsk/c0t1d0s0

4. To view the current swap details, use the following command:

#swap -l

Adding File System swap

The Solaris OS supports applying swap to a file. To enable a file system swap
you need to perform the following tasks:

1. Create a file using mkfile:

#mkfile 250m /opt/myswapfile

This will create a 250 Meg file, which the Solaris OS can use for swap.

2. To use this swap file, enable it with the following command:

#swap -a /opt/myswapfile

3. Check your change:

#swap -l

Note: To enable the new swap file at the next system boot, add the following
entry to /etc/vfstab:

/opt/myswapfile - - swap - no -
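
Editing /etc/vfstab by hand makes it easy to add the same entry twice. The following sketch appends a swap entry only if one is not already present. The helper name add_swap_to_vfstab is hypothetical, and the optional second argument exists so it can be tried on a copy of the file first.

```shell
# usage: add_swap_to_vfstab SWAP_PATH [VFSTAB_FILE]
# Appends "SWAP_PATH - - swap - no -" unless the file already lists
# SWAP_PATH as a swap entry (field 1 = device, field 4 = FS type).
add_swap_to_vfstab() {
    as_path="$1"
    as_file="${2:-/etc/vfstab}"
    if awk -v dev="$as_path" '
            $1 == dev && $4 == "swap" { found = 1 }
            END { exit !found }
        ' "$as_file"; then
        echo "already present: $as_path"
        return 0
    fi
    printf '%s\t-\t-\tswap\t-\tno\t-\n' "$as_path" >> "$as_file"
    echo "added: $as_path"
}
```

Running add_swap_to_vfstab /opt/myswapfile /tmp/vfstab.copy against a copy is a safe way to see what would be written before touching the real file.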

Disabling swap Space


The Solaris OS provides the ability to disable a swap file while the system is running.
This is done with the -d option for swap. All allocated blocks are copied to other swap
areas.

solaris# swap -d /opt/myswapfile

To check your change, type this:

solaris# swap -l

Monitoring swap

It's always important to configure the right amount of swap space: Too little
will result in poor performance and too much will waste disk space.

The Solaris OS starts using swap if it's running out of physical memory. This is
called paging.

Here's how to get a summary of swap space:

solaris# swap -s
total: 3500744k bytes allocated + 3048720k reserved = 6549464k used, 23869824k available
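
The summary line is easier to track over time as a single percentage. The sketch below is one possible way to compute it. The function name swap_used_percent is hypothetical, and it assumes the summary arrives on one line with the "used" and "available" figures immediately before those words.

```shell
# Read `swap -s` output on stdin and print used / (used + available)
# as a whole-number percentage.
swap_used_percent() {
    awk '
        /used/ && /available/ {
            for (i = 2; i <= NF; i++) {
                # "6549464k" + 0 yields 6549464: awk keeps the numeric prefix.
                if ($i ~ /^used/)      used  = $(i - 1) + 0
                if ($i ~ /^available/) avail = $(i - 1) + 0
            }
            if (used + avail > 0) printf "%d\n", used * 100 / (used + avail)
        }
    '
}
```

For the figures shown above (6549464k used, 23869824k available) this prints 21, so roughly a fifth of virtual memory is allocated or reserved. Usage would be swap -s | swap_used_percent.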

And here's how to get details on the individual device or file that constitutes
swap space:

solaris# swap -l
swapfile             dev  swaplo   blocks     free
/dev/md/dsk/d1      85,1      16 41945456 41945456
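
The blocks and free columns are counted in 512-byte blocks, which makes the raw numbers hard to read. A small sketch to convert them to megabytes follows. The function name swap_list_mb is hypothetical, and it assumes the header names the columns blocks and free as shown above.

```shell
# Read `swap -l` output on stdin and print each swap area with its
# total and free size in MB (2048 blocks of 512 bytes = 1 MB).
swap_list_mb() {
    awk '
        # Header line: locate the size columns by name.
        $1 == "swapfile" {
            for (i = 1; i <= NF; i++) {
                if ($i == "blocks") b = i
                if ($i == "free")   f = i
            }
            next
        }
        b && f { printf "%s total=%dMB free=%dMB\n", $1, $b / 2048, $f / 2048 }
    '
}
```

For the device listed above, 41945456 blocks works out to 20481 MB, or about 20 Gbytes, all of it free.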

If your system is running out of swap space you will see the following errors:

Not enough space

or

WARNING: /tmp: File system full, swap space limit exceeded

To see if the system is running short of physical memory you can use vmstat
and iostat.

solaris# vmstat
 kthr      memory               page              disk        faults         cpu
 r b w     swap    free  re  mf pi po fr de sr m0 m1 m3 m4   in   sy   cs us sy  id
 0 0 0 24137360 6421168  70 179 21 14 14  0  0  0  0  0  0  472 3363 1776  4  2  94
 0 0 0 23869912 5953040  11  13  0  0  0  0  0  0  0  0  0  430 1071 1545  7  1  92
 0 0 0 23870896 5953904  58 313  0  2  2  0  0  0  0  0  0  578 2369 1798 20  1  78
 0 0 0 23874712 5957216  11  11  0  0  0  0  0  0  0  0  0  417 1325 1648  0  0 100
 0 0 0 23874744 5957248  22  64  0  3  3  0  0  0  0  0  0  423 1578 1629  1  2  97

Watch the column sr (Scan Rate) in the vmstat output.

solaris# iostat -Pxn
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.1    2.7    1.1    5.6  0.0  0.1    0.2   25.5   0   2 c1t0d0s0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    9.7   0   0 c1t0d0s1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    1.2   0   0 c1t0d0s2

Watch the r/s and w/s columns in the iostat output for the device that is
configured as the swap device. If the values are high, this means that a large
amount of I/O is being generated to free up pages.
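
Picking one device out of a long iostat listing can be scripted. The sketch below is a minimal example under the assumption that the column header ends with the word device, as in the output above. The function name swap_device_io is hypothetical, and the device name passed in must be whatever swap -l reports on your system.

```shell
# usage: iostat -Pxn | swap_device_io DEVICE
# Prints r/s, w/s and the wait queue length for the named device.
swap_device_io() {
    awk -v dev="$1" '
        # Column header line: map each column name to its position.
        $NF == "device" {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("r/s" in col) && $NF == dev {
            printf "%s r/s=%s w/s=%s wait=%s\n", dev,
                $(col["r/s"]), $(col["w/s"]), $(col["wait"])
        }
    '
}
```

A nonzero wait value means I/Os are queued for the device. For a swap device, that suggests application paging is occurring.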

If physical memory is too low, the system will be busy paging to the swap
device, with heavy I/O on the swap device. In this state the system's CPU
utilization will also increase.

Summary

For improved system performance it's important that you have allocated
sufficient swap space to the system. To start with, configure swap at 1.5 times
the physical memory installed on the system. If required, allocate more swap
space.
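
The 1.5 times starting point is simple arithmetic, but a helper avoids mistakes when sizing a new swap file. This is only a sketch of that rule of thumb. The function name swap_shortfall_mb is hypothetical, and both inputs are in Mbytes.

```shell
# usage: swap_shortfall_mb RAM_MB CURRENT_SWAP_MB
# Prints how many MB of swap are still missing to reach 1.5 x RAM
# (0 if the current swap already meets or exceeds that target).
swap_shortfall_mb() {
    ss_target=$(( $1 * 3 / 2 ))
    ss_missing=$(( ss_target - $2 ))
    if [ "$ss_missing" -lt 0 ]; then
        ss_missing=0
    fi
    echo "$ss_missing"
}
```

A machine with 4096 Mbytes of RAM and 2048 Mbytes of swap comes out 4096 Mbytes short, which is the size you would then pass to mkfile when creating an additional swap file.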