HP Global Technical Partner − Cadence

HP-UX Kernel Tuning Guide for Technical Computing
Getting The Best Performance On Your Hewlett-Packard HP 9000 Systems
Version 2.0
Introduction
This document describes why and how an HP-UX kernel is tuned and configured. The intent is to give customers, developers, application designers, and HP's technical consultants the information necessary to optimize the performance of existing hardware configurations and to make intelligent decisions when running applications on HP's UNIX platforms.

Hardware Considerations
HP and other hardware vendors offer a broad selection of products with a wide range of CPU performance, memory and disk options, varying greatly in price. The performance of a software application depends on the hardware selected to run it. So many different products are available so that customers can select the most cost effective solution for their particular software problem. A large, heavily configured system will not be used to its full potential if you only need to solve small, simple problems, while a less capable system may be overloaded trying to solve large, complex problems that exceed its capacity. Neither system is cost effective when used this way. Selecting the most cost effective system requires understanding your compute requirements as well as the hardware options.

There are five key hardware areas that directly affect the performance you will obtain from your application: CPU, Memory, Disk, Graphics, and Network. While all these hardware areas are important, it is equally important to configure a balanced system. It is counterproductive to buy the fastest CPU and then configure it with insufficient memory. You might get better performance and throughput from a slower, less expensive CPU with the difference in price invested in more memory.

There are a large number of variables to consider when deciding on the hardware for your compute infrastructure. The compute needs may vary from the very simple to the incredibly complex. The best way to select the appropriate hardware configurations is to resolve your compute needs:

• How many users need to be served?
• What are the data server needs?
• What are the compute server needs?
• What are the application software needs?


A few different system configurations should be enough to fully cover your environment. One, two or three base system configurations will usually handle your desktop computing needs: one hardware configuration for one type of user, a slightly different configuration for another, and yet another configuration for the user who has major memory and swap requirements. There may be a need to manage both small and large batch tasks under a compute server or task queuing methodology. A data server will be needed for storing large amounts of data, with a reliable backup system and revision control system. Add to this collection a software server dedicated to managing large software applications and licensing programs.

The best way to select your appropriate hardware configuration(s) is to perform benchmark tests that duplicate your intended use of the system. With relevant benchmark data in hand, you will have the information you need to make intelligent tradeoff decisions on the cost/performance benefits of the available hardware options for your site.

CPU
Many operations require a large number of integer and floating point calculations. Some applications use mostly integer calculations, while others rely heavily on floating point calculations. CPU performance is the single most important factor in executing a large number of calculations in the shortest possible time.

Selecting the CPU is a tradeoff between cost, the size of the problems you will be solving, and your perception of adequate performance. If an operation takes five seconds, is it worth an extra $10,000 to do the operation in three seconds? However, if the operation takes five hours and the time can be reduced to three hours or even one hour, it may be worth the added expense. If the operation is done several times a day, it is almost certainly worth it. If it is only done once a month, the benefit is questionable. When evaluating hardware performance, you must prioritize the tasks to be performed relative to their importance, frequency, and impact on overall productivity. The tasks most affected by CPU performance are those that involve more computation than disk access or graphics display.

Don't forget to consider investment protection. The CPU that seems adequate today may not meet your needs in the near future. The rapid pace of hardware development makes existing systems obsolete in a very short period of time. How easy will it be for you to upgrade your systems to increase MIPS capacity or take advantage of the latest compiler or hardware technology?

One standard benchmark that you can use to gauge CPU performance is SPECint.

Memory
One of the most commonly asked questions is "How much memory do I need?". Unfortunately, the real answers to this question are "Enough" and "It depends". The amount of memory you need is directly related to the size of the applications you are working with. While 'X' amount of memory may allow you to run your application, it may not be large enough to allow for optimal performance. Memory management is a complex topic. Memory, its relationship to swap space, and its effect on performance are discussed in more detail in the section "Understanding Memory and Swap" later in this document. Again, cost must be weighed versus benefits; certainly you can spend the money to configure a system with enough memory to allow your application to be run in memory, but depending on the application, the cycle time savings may not be worth it.

Disk
Sometimes data can be quite large. Disk I/O is often a performance bottleneck. Other than the obvious effects on data loading bandwidth, disk I/O can also be the limiting factor in overall performance if a system starts paging.

HP's philosophy is to design balanced systems in which no single component becomes a performance bottleneck. HP has made significant enhancements to I/O performance in order to keep pace with the speed of its CPUs. I/O performance depends on several parts of the system working together efficiently. The I/O subsystems have been redesigned so that they now offer the industry's fastest and most functional I/O as standard equipment.

To improve disk I/O performance:

• Distribute the work load across multiple disks. In many configurations, a single drive must handle operating system access, swap, and data file access simultaneously. If these tasks are distributed across multiple disks, the work is shared and performance improves. For example, a system might be configured with four logical volumes spread across more than one physical volume: the HP-UX operating system on one volume, the application on a second volume, swap space interleaved across all local disk drives, and data files on a fourth volume.

• Split swap space across two or more disk volumes. Device swap space can be distributed across disk volumes and interleaved. This will improve performance if your system starts paging. This is discussed in more detail in the section on Swap Space Configuration later in this document.

• Enable asynchronous I/O. By default, HP-UX uses synchronous disk I/O when writing file system "meta structures" (super block, directory blocks, inodes, etc.) to disk. This means that any file system activity of this type must complete to the disk before the program is allowed to continue; the process does not regain control until the physical I/O completes. When HP-UX writes to disk asynchronously, the I/O is scheduled for some later time and the process regains control immediately, without waiting.
Synchronous writes of the meta structures ensure file system integrity in case of a system crash, but they also impede system performance. Run-time performance increases significantly (up to roughly ten percent) on I/O intensive applications when all disk writes occur asynchronously; little effect is seen for compute-bound processes. Benchmarks have shown that load times for large files can be improved by as much as 20% using asynchronous I/O. However, if a system using asynchronous disk writes of meta structures crashes, recovery might require system administrator intervention using fsck, and data might be lost. You must determine whether the improved performance is worth the slight risk of data loss in the event of a system crash. A UPS device will help reduce the risk of lost data in the event of a power failure.

Asynchronous writing of the file system meta structures is enabled by setting the kernel parameter fs_async to 1 and disabled by setting it to 0, the default. For instructions on how to configure kernel parameters, see the section Kernel Configuration Parameters later in this document.

You may want to use a RAID (Redundant Array of Inexpensive Disks) configuration for reliability. Most RAID configurations do not perform as well as non-RAID configurations, but the reliability gains may be worth it.

Graphics and Color Mapping
Many tools use 2-D graphics and are X11 based. A platform's X11 performance is therefore key to maximizing the graphics performance of these applications. This can be measured with the standard benchmark xmark93.


Network
Many installations are client/server networks, primarily because of the need for shared data and massive amounts of on-line storage. The network configuration therefore can be, and usually is, critical to overall performance and throughput. Most current networks are Ethernet based, which, when combined with a 700 class machine, may create an unbalanced situation. For example, a single HP 735 can almost saturate a single Ethernet wire under the right conditions. See the section labeled Networking later in this document for tuning and configuration guidelines for Ethernet networks. You can, of course, upgrade to Fast Ethernet, FDDI, ATM, or another faster network technology if your budget allows.

Understanding Memory and Swap
There is a lot of confusion regarding cache memory, configuration of swap space, swap's relationship to physical memory, kernel parameters affecting memory allocation, and their performance implications. If there were a simple formula, this would be easy, but there is not. You need to understand how memory works in order to understand these settings and to determine the optimal settings for a given situation.

Memory Management
The HP-UX memory management system is composed of three basic elements: cache, memory and swap space. Swap space can be of two types: device swap space and file system swap space. Device swap space can be made up of primary swap space, which is defined on the root file system disk drive, and secondary swap space, which is defined on the remaining disk volumes. All of these memory elements can be optimized through HP-UX kernel parameter tuning or through the way the application is compiled.

The data and instructions of any process (a program in execution) must be available to the CPU by residing in physical memory at the time of execution. RAM, the actual physical memory (also called "main memory"), is shared by all processes. To execute a process, the HP-UX kernel executes through a per-process virtual address space that has been mapped into physical memory.

The term "memory management" refers to the rules that govern physical and virtual memory and allow for efficient sharing of the system's resources by user and system processes. Memory management allows the total size of user processes to exceed physical memory by using an approach termed demand-paged virtual memory. Demand-paged virtual memory enables you to execute a process by bringing parts of the process into main memory only as needed, that is, on demand, and pushing out to disk parts of a process that have not been recently used.

The HP-UX operating system uses paging to manage virtual memory. Paging involves moving small units (called pages) of a process between main memory and disk swap space.

One method for increasing the efficiency of memory allocation is to call the mallopt library function before the first malloc call within the EDA application code. On HP-UX, this call controls the memory allocation algorithm and other optimization options within the malloc library. Use of these options can improve application execution time by up to 10X depending on the data size. It is important that the maxfast and numlblks settings (the M_MXFAST and M_NLBLKS options, i.e. the first two options to mallopt) be chosen to reflect the sizes of the data blocks being allocated.


Physical Memory
Physical memory is composed of hardware known as RAM (packaged as SIMMs, DIMMs, and so on). For the CPU to execute a process, the relevant parts of the process must exist in the system's RAM. The more main memory in the system, the more data it can access and the more (or larger) processes it can execute without having to page. This is because the system can retain more processes in main memory, so the kernel has to page less frequently. Each time the system has to page there is a performance cost, since reading from or writing to disk is much slower than accessing memory.

Not all physical memory is available to user processes. The kernel occupies some main memory (that is, it is never paged). The amount of main memory not reserved for the kernel is termed available memory. Available memory is used by the system for executing processes.

Secondary Storage
Main memory stores the data required for program execution. During process execution, data also resides in two faster implementations of memory found in the processor subsystem: registers and cache. Program files are kept in secondary storage (also called secondary memory), typically disks accessible either via system buses or over the network. Data is also moved to secondary storage when it is no longer needed in main memory, to make room for active processes.

Swap
A temporary form of secondary data storage is termed swap, dating from early UNIX implementations that managed physical memory resources by moving, i.e. swapping, entire processes between main memory and secondary storage. HP-UX uses paging, a more efficient memory resource management mechanism. It should be noted that HP-UX does not "swap" any more; it pages and, as a "last resort", deactivates processes. Deactivation replaces what was formerly known as swapping entire processes out. While a program executes, data and instructions can be paged (copied) to and from secondary storage, or disk, if the system load warrants such behavior.

Swap space is initially allocated when the system is configured. HP-UX supports two types of swap space: device swap space and file system swap space. Device swap is allocated on the disk before a file system has been created and can take the following forms:

• an entire disk
• a designated area on a disk
• a software disk-striped partition on a disk

If the entire disk hasn't been designated as swap, the remaining space on the disk can be used for a file system. File system swap space is allocated from a mounted file system. If more swap space is required, either device swap or file system swap can be added dynamically to a running system.

Note that file system swap has significantly lower performance than device swap, because it must use separate read/write requests for each page block and has a smaller page swapping size than device swap. The I/O for file system swap will also contend with user I/O on that file system, which will cause performance to degrade. File system swap space usage should be avoided.

Either SAM or the swapon command can be used to enable disk space or a directory in a file system for swap.

NOTE: Once allocated, you cannot remove either type of swap without rebooting the system.

HP-UX also uses an early swap space reservation method to make sure it has space available, but it only allocates the space when it actually needs to write to it.

Virtual Address Space

Virtual memory uses a structure for mapping processes termed the virtual address space. The virtual address space contains information and pointers to the memory that the process can reference. One virtual address space (vas) exists per process and serves several purposes:

• It provides the overall description of each process.
• It contains pointers to another element in the memory management subsystem: per-process regions (pregions).
• It keeps track of pregions most recently involved in page faults.

Each HP-UX process executes within a 4 GB virtual address space (this may change in the near future). The virtual address space structure points to per-process regions, or pregions. Pregions are logical segments that point to specific segments of a process, including code (text, or process instructions), data, u_area and kernel stack, user stack, shared memory segments, and shared library code and data segments.

The size of various memory segments is controlled by the values assigned to certain configurable kernel parameters. It is beyond the scope of this paper to discuss all the process virtual memory segments. The following, however, is a description of the segments most relevant to this discussion.

Text - The text segment holds a process's executable object code and may be shared by multiple processes. The maximum size of the text segment is limited by the configurable operating system parameter maxtsiz.

Data - The data segment contains a process's initialized (data) and uninitialized (.bss) data structures, along with the heap, private "shared" data, "user" stack, etc. A process can dynamically grow its data space. The total allotment for initialized data, uninitialized data and dynamically allocated memory (heap) is governed by the configurable kernel parameter maxdsiz.

Stack - Space used for local variables, subroutine return addresses, kernel routines, etc. The u_area contains information about process characteristics. The kernel stack, which is in the u_area, contains a process's run-time stack while executing in kernel mode. Both the u_area and kernel stack are fixed in size. Space available for remaining stack use is determined by the configurable parameter maxssiz.

Shared Memory - Address space which is sharable among multiple processes.
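The per-process ceilings on the data and stack segments can usually be inspected from inside a running process through the POSIX getrlimit interface. The following is a minimal sketch in Python, not part of the original guide; it assumes a Python interpreter is available on the system and that the data and stack resource limits reported there reflect the maxdsiz and maxssiz settings described above. The helper names segment_limits and describe are hypothetical.

```python
import resource

def segment_limits():
    """Return {segment: (soft, hard)} limits in bytes for the data and stack segments."""
    limits = {}
    for name, which in (("data", resource.RLIMIT_DATA),
                        ("stack", resource.RLIMIT_STACK)):
        # getrlimit returns a (soft limit, hard limit) pair for the resource
        limits[name] = resource.getrlimit(which)
    return limits

def describe(value):
    """Format a limit in MB, or 'unlimited' when no ceiling is set."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**20:.0f} MB"

for segment, (soft, hard) in segment_limits().items():
    print(f"{segment}: soft={describe(soft)} hard={describe(hard)}")
```

A process that fails with a memory fault well below the amount of configured swap is a candidate for checking these limits first.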

Configurable Parameters
HP-UX configurable kernel parameters limit the size of the text, data, and stack segments for each individual process. These parameters have predefined defaults, but can be reconfigured in the kernel. Some may need to be adjusted when swap space is increased. This is discussed in more detail in the section on configuring the HP-UX kernel.

Parameter            Description
bufpages             Sets the number of buffer pages
create_fastlinks     Stores symbolic link data in the inode
fs_async             Sets asynchronous write to disk
hpux_aes_override    Controls directory creation on automounted disk drives
maxdsiz              Limits the size of the data segment
maxfiles             Limits the soft file limit per process
maxfiles_lim         Limits the hard file limit per process
maxssiz              Limits the size of the stack segment
maxswapchunks        Limits the maximum number of swap chunks
maxtsiz              Limits the size of the text (code) segment
maxuprc              Limits the maximum number of user processes
netmemmax            Sets the network dynamic memory limit
nfile                Limits the maximum number of "opens" in the system
ninode               Limits the maximum number of open inodes in memory
nproc                Limits the maximum number of concurrent processes
npty                 Sets the maximum number of pseudo ttys

The 4 GB virtual address space is divided into four 1 GB quadrants. Each quadrant has a typical use:

• The first quadrant always contains the process's text segment (code), and sometimes some of the data (EXEC_MAGIC).
• The second quadrant contains the data segment (static data, stack, heap, etc.).
• The third quadrant contains shared library code, shared memory-mapped files and sometimes shared memory.
• The fourth quadrant contains shared memory segments, shared memory-mapped files, shared library code, and I/O space.
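The quadrant layout above can be illustrated with a small Python sketch that maps a 32-bit virtual address to its quadrant. It is an illustration only: it assumes the quadrants are laid out in ascending address order starting at address 0, and the names QUADRANT_SIZE, QUADRANT_CONTENTS and quadrant_of are hypothetical.

```python
QUADRANT_SIZE = 1 << 30  # each quadrant covers 1 GB of the 4 GB address space

# Typical contents of each quadrant, as listed in the text above
QUADRANT_CONTENTS = {
    1: "text segment (code), sometimes some data (EXEC_MAGIC)",
    2: "data segment (static data, stack, heap)",
    3: "shared library code, shared memory-mapped files, sometimes shared memory",
    4: "shared memory segments, shared memory-mapped files, shared library code, I/O space",
}

def quadrant_of(address):
    """Return the quadrant number (1 to 4) that holds a 32-bit virtual address."""
    if not 0 <= address < 4 * QUADRANT_SIZE:
        raise ValueError("address outside the 4 GB virtual address space")
    return address // QUADRANT_SIZE + 1

for addr in (0x00001000, 0x40001000, 0x80001000, 0xC0001000):
    q = quadrant_of(addr)
    print(f"{addr:#010x} -> quadrant {q}: {QUADRANT_CONTENTS[q]}")
```

Because the data segment lives in a single 1 GB quadrant, this layout also suggests why the data segment limit discussed later stays below 2 GB unless special measures are taken.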

Physical Memory Versus Performance
The amount of memory available to applications is determined by the amount of swap configured plus physical memory. The size of physical memory determines how much paging will be done while applications are running. Paging imposes a performance penalty because pages are being moved between physical memory and secondary storage, or disk. The more time that is spent paging, the slower the performance.

There is a critical threshold for physical memory size below which the system spends almost all its CPU time paging. This is known as thrashing, and it is evident when system performance virtually comes to a standstill and even simple commands, like ls, take a long time to complete.

Optimally, all operations would be done in physical memory and paging would never occur. However, memory costs money, so there is usually a tradeoff between budgetary constraints and the minimum acceptable performance level. Understanding how memory size affects performance can help you get the most from your expenditure on memory. Keep in mind that memory needs are always changing, so the base system configuration will need to be revisited regularly. HP's Glance/GlancePlus is a good application for investigating and resolving memory versus performance issues.

Where Is The Memory Going?

To help you understand the minimum memory configuration you should consider, it helps to understand how memory is consumed. On a system, you will minimally have the following memory consuming resources:


• HP-UX Operating System: 10-12 MB
• Windowing System: 21 MB (X11), 25 MB (VUE), 32 MB (CDS)

Any other processes or services running on the system will consume additional memory resources. As you can see, if you add these up, before you even load the first part you are already consuming approximately 50 MB of memory.

This isn't quite as straightforward as it seems, however. HP-UX uses a paging algorithm to move data in and out of physical memory. The only data that isn't subject to paging is HP-UX itself. Out of the 25 MB of executable code in VUE, you will not be using all of it at any given time. Since code that isn't used will be overwritten in memory, and there are many functions in VUE that you may seldom or never use, some percentage of the executable code will never be paged in. The same behavior applies to applications, for example an application that performs significant disk I/O or LAN activity followed by intensive CPU activity.

Determining Appropriate Physical Memory Size

There are a couple of ways to determine whether the amount of physical memory in your system is adequate. The first is to run a series of timed benchmarks on systems with increasing levels of physical memory and determine the impact of additional memory on those operations. Another way is to use one of HP's performance tools to monitor the system in operation. It will tell you how much paging is occurring, if any.

If you plot memory size versus the time to perform a typical operation in an application, you will get a dog-leg shaped curve for most operations. Performance increases on a fairly steep curve as memory size is increased, up to a point. Beyond that point, the curve flattens out and adding memory will not significantly improve performance. The ideal memory configuration is one that falls on the breakpoint. If your memory is less than the breakpoint, you are not getting all the performance you could from your system. The performance breakpoint varies depending on the operation being performed in combination with the data set used. The only accurate way to determine the optimal memory size is to perform timed benchmarks using real data.
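The breakpoint on the dog-leg curve can be estimated from benchmark timings with a few lines of Python. This is a rough sketch under stated assumptions, not a method prescribed by this guide: the timings below are made-up illustrative numbers, the function name find_breakpoint is hypothetical, and the 5% threshold for "no significant improvement" is an arbitrary choice you would adjust for your own workload.

```python
def find_breakpoint(samples, min_gain=0.05):
    """Estimate the memory size at which adding more memory stops paying off.

    samples is a list of (memory_mb, elapsed_seconds) benchmark results.
    Returns the smallest memory size whose next step up improves the elapsed
    time by less than min_gain (as a fraction). If every step still helps,
    the largest measured size is returned, meaning the breakpoint may lie
    beyond the measured range.
    """
    if len(samples) < 2:
        raise ValueError("need at least two benchmark results")
    ordered = sorted(samples)
    for (mem, secs), (_, next_secs) in zip(ordered, ordered[1:]):
        gain = (secs - next_secs) / secs  # fractional improvement of the next step
        if gain < min_gain:
            return mem
    return ordered[-1][0]

# Illustrative (invented) timings for one operation at several memory sizes
samples = [(64, 900), (128, 420), (192, 250), (256, 240), (384, 236)]
print(find_breakpoint(samples))  # prints 192
```

With these example numbers, going from 192 MB to 256 MB improves the run time by only 4%, so 192 MB is reported as the breakpoint.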

HP−UX Configuration
This section explains HP−UX configurable software settings and parameters that affect system capacity and/or performance. Most of this section is common for HP−UX 9.X and HP−UX 10.X. Specific differences are noted.

Swap Configuration
How Much Swap Do I Have?

SAM, Glance/GlancePlus, top, and swapinfo all show swap information. To see how much swap space is configured on your system, and how much is in use, execute one of the following commands:

• top
• Glance/GlancePlus
• sam (requires root password)
• /etc/swapinfo -t (HP-UX 9.X systems; requires root login)
• /usr/sbin/swapinfo -t (HP-UX 10.X systems; requires root login)


Any user can execute top and Glance. The program sam and the command swapinfo both require root privilege, because they must open the kernel memory file /dev/kmem to read the swap usage information. Since this is a critical operating system file, access is usually restricted to root only.

How Much Swap Do I Need?

The amount of swap available determines the maximum address space, or virtual memory, available for applications. The minimum recommendation is twice as much swap space as physical memory. If swap is too small, and you try to load something that exceeds available swap, you will get an out of memory error. If you configure more swap than you will ever need, you are wasting valuable disk space. The correct swap size will vary considerably depending on the application(s) run on a system.

The optimal swap configuration may vary between individual users and/or systems. However, optimizing swap on a user by user basis is not advised. A common swap size for all systems should be chosen for ease of support and maximum long-term design flexibility.

The correct swap space configuration for your site can only be accurately determined by monitoring swap usage while working with real data. This can be done either with the swapinfo command or with a tool like HP's GlancePlus. GlancePlus allows you to monitor system resources on a per process basis and will track high water marks over a period of time. Configure a system with more swap than you expect to need, and then run GlancePlus while running an application in a real work environment. By monitoring the high water mark, you can determine the maximum swap space used and adjust the swap configuration accordingly. Obviously, if you experience out of memory errors, swap space is too small. In any case, swap space should not be less than the amount of physical memory in your system.

NOTE: For best performance, swap space should be distributed evenly across all disks at the same priority.

There are two types of swap space in HP-UX: device and file system. Device swap provides much better performance because it uses raw disk I/O. File system swap space should be avoided.

Configuring Swap Space

As mentioned previously, device swap is preferred over file system swap to achieve the best performance. The ideal swap configuration is device swap interleaved on two or more disks. When device swap is interleaved on two or more disks, the system alternates between the disks as paging requests occur, providing better performance than a single disk.

SAM is the easiest method for adding and configuring swap space. Swap configuration is under the Disks and File Systems area of SAM. For more information on configuring swap, please see the on-line Help section within SAM's Swap Configuration.
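The swap sizing guidance above (start at twice physical memory, measure the high water mark, then adjust without dropping below physical memory) can be expressed as a small Python helper. This is a sketch only: the function name recommended_swap_mb and the 25% headroom added on top of the measured high water mark are assumptions, not values from this guide.

```python
import math

def recommended_swap_mb(physical_mb, high_water_mb=None, headroom=0.25):
    """Suggest a swap size in MB.

    Before any measurement, use the minimum recommendation of twice physical
    memory. Once a swap high water mark has been observed under a real
    workload, size swap to that mark plus some headroom (an assumed margin),
    but never below the amount of physical memory.
    """
    if high_water_mb is None:
        return 2 * physical_mb
    measured = math.ceil(high_water_mb * (1 + headroom))
    return max(physical_mb, measured)

print(recommended_swap_mb(256))        # no measurement yet: 512
print(recommended_swap_mb(256, 400))   # measured 400 MB peak: 500
print(recommended_swap_mb(256, 100))   # light usage: floor of 256
```

The result is meant as a site-wide starting value; as noted above, tuning swap separately for each user is not advised.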

Kernel Configuration Parameters
Bufpages

Bufpages specifies how many 4096-byte memory pages are allocated for the file system buffer cache. These buffers are used for all file system I/O operations, as well as all other block I/O operations in the system (exec, mount, inode reading, and some device drivers).

In HP-UX 10.X, we highly recommend setting this kernel parameter to 0. This enables the dynamic buffer cache, which has been changed in the 10.X OS.


In HP-UX 9.X, we do NOT recommend using dynamic buffer cache. A fixed buffer cache can be specified by setting bufpages to a non-zero value, for example 4096, and nbuf to 0. This sets 2048 buffer headers and allocates 16 MB of buffer pool space (4096 pages of 4096 bytes each) at system boot time. If you wish to reserve 10% of physical memory for the file system buffer cache, the value can be calculated as: bufpages = 0.1 * (physical memory in bytes) / 4096. For example, on a system with 256 MB of physical memory, bufpages = 0.1 * 268435456 / 4096, or about 6554.

Create_Fastlinks

Create_fastlinks tells the system to store HFS symbolic link data in the symbolic link's inode. This reduces disk space usage and speeds up symbolic link handling. By default, this feature is disabled for backward compatibility. We recommend that all systems have create_fastlinks enabled by setting this kernel parameter to 1.

Dbc_Max_Pct

This parameter determines the percentage of main memory that the dynamically allocated buffer cache is allowed to grow to. Because the system will use as much memory as it can for buffer cache when performing intense block I/O, this becomes the size of the buffer cache on a system that is not under memory pressure from process invocations. The problem arises when memory pressure from process space requirements forces the system to start paging. At that point, the system tries to reclaim buffer cache pages to allocate them to running processes. But the system is also trying to allocate as much buffer cache as it can, causing a vicious cycle of allocating and deallocating memory between buffer cache and process memory space, which creates a large amount of overhead.

The idea then is to keep this number reasonably low, allowing you to have the cache space but also keeping the application space large enough to avoid high levels of conflict between them. The default value is 50%, but we recommend 25% to start. We have seen systems that need the buffer cache maximum to be as little as 5%, with a minimum of 2%. This requires careful attention, with appropriate modification. If this form of thrashing in main memory becomes an increasing problem, the only good fix is to purchase more physical memory.

Fs_Async

This kernel parameter controls the switch between synchronous and asynchronous writes of file system meta structures to disk. Asynchronous writes to disk can improve file system I/O performance significantly. However, synchronous writes to disk make it easier to restore file system integrity if a system crash occurs while file system meta structures are being updated. You will need to decide which is more important, based on the types of applications that are going to be run. You may value file system integrity more than I/O speed. If so, fs_async should be set to 0.

HPUX_AES_Override

This value is part of the OSF/AES compliance. It controls directory creation on automounted disk drives. We recommend hpux_aes_override be set to 1. If this value is not set, you may see the following error message:

mkdir: cannot create /design/ram: Read-only file system

This system parameter cannot be set using SAM. The kernel must be modified manually. It is best to modify the other parameters with SAM first and then change this parameter second; otherwise SAM will override your 'unsupported' value with the default.

Maxdsiz

Maxdsiz defines the maximum size of the data segment of an executing process. The default value of 64 MB is too small for most applications. We recommend this value be set to the maximum value of 1.9 GB. If a process exceeds maxdsiz, it will be terminated, usually with a SIGSEGV (segmentation violation), and you will probably see the following message:

HP Global Technical Partner − Cadence Memory fault(coredump) In this case, check out the values of maxdsiz, maxssiz and maxtsiz. For more information on these parameters, please see the on−line Help section within SAM's Kernel Configuration. If you need to exceed the specified maximum of 1.9Gb, there are a couple of ways (yet to be supported) to do so. Contact your Hewlwett Packard technical consultant for the details. It is important to note that the maxdsiz parameter must be modified in order for these procedures to work. Maxdsiz will need to be set to 2.75Gb or 3.6Gb depending on the method chosen and/or size required. Maxfiles This sets the soft limit for the number of files a process is allowed to have open . We recommend this value be set to 200. Maxfiles_Lim This sets the hard limit for number of files a process is allowed to have open . This parameter is limited by ninode. The default for this kernel parameter is 2048. Maxssiz Maxssiz defines the maximum size of the stack of a process. The default value is 8Mb. We recommend this value be set to a value of 79 Mb. Maxswapchunks This (in conjunction with some other parameters) sets the maximum amount of swap space configurable on the system. Maxswapchunks should be set to support sufficient swap space to accommodate all swap anticipated. Also remember, swap space, once configured, is made available for paging (at boot) by specifying it in the file /etc/fstab. The maximum swap space limit is calculated in bytes is: (maxswapchunks * swchunk * DEV_BSIZE). We recommend this parameter be set to 2048. Maxtsiz Maxtsiz defines the maximum size of the text segment of a process. We recommend 1024 MB. Maxuprc This restricts the number of concurrent processes that a user can run. A user is identified by the user ID number and not by the number of login instances. Maxuprc is used to keep a single user from monopolizing system resources. 
If maxuprc is too low, the system issues the following error message to the user when attempting to invoke too many processes:
    no more processes
We recommend maxuprc be set to 200.

Maxusers
This kernel parameter is used in various algorithms and formulae throughout the kernel. It is used to limit system resource allocation and not the actual number of users on the system. It is also used to define the system table size. The default values of nproc, ncallout, ninode and nfile are defined in terms of maxusers. We recommend fixed values for nproc, ninode and nfile. Set maxusers to 124.

Netmemmax
This specifies how much memory can be used for holding partial Internet Protocol (IP) messages in memory. They are typically held in memory for up to 30 seconds. The default of 0 allows up to 10% of total memory to be used for IP level reassembly of packet fragments. Values for netmemmax are specified as follows:

Value   Description
-1      No limit, 100% of memory is available for IP packet reassembly.
0       netmemmax limit is 10% of real memory.
>0      Specifies that X bytes of memory can be used for IP packet reassembly. The minimum is 200 Kb and the value is rounded up to the next multiple of pages (4096 bytes).

If system network performance is poor, it might be because the system is dropping fragments due to insufficient memory for the fragmentation queue. Setting this parameter to -1 will improve network performance, but at the risk of leaving less memory available for processes. We recommend it be set to -1 for systems acting as data servers only. For all other systems, we recommend a setting of 0.

Nfile
Nfile sizes the system file table. The table contains an entry for each open instance of a file, so nfile restricts the total number of concurrent "opens" on your system. We suggest that you set this to 2800. This parameter defaults to ((16 * (nproc + 16 + maxusers) / 10 ) + 32 + 2 * npty). If a process attempts to open one more file than nfile allows, the following message will appear on the console:
    file: table is full
When this happens, running processes may fail because they cannot open files, and no new processes can be started.

Ninode
Ninode sizes the incore inode table, also called the inode cache. For performance, the most recently accessed inodes are kept in memory. Each open file has an inode in the table. An entry is made in the table for each "login directory", each "current directory", each mount point directory, etc. It is recommended that ninode be set to 15,000.

Nproc
Nproc sizes the process table. It restricts the total number of concurrent processes in the system. When a user or process attempts to start one more process than nproc allows, the system issues these messages:
    at console window:    proc: table is full
    at user shell window: no more processes
Set nproc to 1024.

Npty
This parameter limits the number of master/slave pty data structures that can be opened. These are used by network programs like rlogin, telnet, xterm, etc. We recommend this parameter be set to 512.
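The sizing formulas quoted in the parameter descriptions above can be checked with a few lines of shell arithmetic. The following is a minimal sketch, not an HP-supplied tool. The function names are made up for illustration, integer division is assumed wherever the text does not say how fractions are handled, and the defaults used for swchunk (2048) and DEV_BSIZE (1024) are assumptions that should be verified against your own kernel configuration.

```shell
#!/bin/sh
# Sketch of the sizing arithmetic from the parameter descriptions.
# Function names are illustrative only; results are truncated integers.

PAGE_SIZE=4096   # bytes per page, as stated in the text

# bufpages for a fixed buffer cache.
# $1 = physical memory in megabytes, $2 = percent of memory to reserve
bufpages_for() {
    echo $(( $1 * (1024 * 1024 / PAGE_SIZE) * $2 / 100 ))
}

# Default nfile from the formula in the Nfile description.
# $1 = nproc, $2 = maxusers, $3 = npty
default_nfile() {
    echo $(( 16 * ($1 + 16 + $2) / 10 + 32 + 2 * $3 ))
}

# Maximum configurable swap space in bytes.
# $1 = maxswapchunks
# $2 = swchunk   (assumed default 2048; check your system)
# $3 = DEV_BSIZE (assumed default 1024; check your system)
max_swap_bytes() {
    echo $(( $1 * ${2:-2048} * ${3:-1024} ))
}
```

For example, bufpages_for 256 10 prints 6553 for a 256 Mb system with a 10% cache, default_nfile 1024 124 512 prints 2918 for the nproc, maxusers and npty values recommended here, and max_swap_bytes 2048 prints 4294967296 (4 Gb) under the assumed swchunk and DEV_BSIZE values.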

Configuring Kernel Parameters
The following are the suggested kernel parameter values.

# Parameter         Value
#
bufpages            0                   # on HP-UX 10.X
                    4096                # on HP-UX 9.X
create_fastlinks    1
dbc_max_pct         25
fs_async            1
maxdsiz             2063806464
maxfiles            200
maxfiles_lim        2048
maxssiz             (383*1024*1024)
maxswapchunks       4096
maxtsiz             (1024*1024*1024)
maxuprc             200
maxusers            124
netmemmax           0                   # on desktop systems
                    -1                  # on data servers
nfile               2800
ninode              15000
nproc               1024
npty                512

Configuring Kernel Parameters in 9.X
In HP-UX 9.X we recommend manual kernel configuration. All work related to creating a new kernel in 9.X takes place in the /etc directory. You will save a copy of the old kernel configuration file, dfile, under a new name, modify the dfile, and run make to build the new kernel. Then copy the new kernel file into place after saving the old kernel.
• cd /etc/
• cp dfile dfile.old
• vi dfile
• Modify the dfile to include the kernel parameters and values suggested above.
• config dfile
• make -f config.mk
• mv /hp-ux /hp-ux.old
• mv /etc//hp-ux /hp-ux
• cd / ; shutdown -h 0
Note: For more information on manual kernel configuration, please see the HP-UX System Administration "How To" Book.
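The 9.X steps can also be collected into a small script so that a failed config or make stops the sequence before the running kernel is moved aside. This is only a sketch under stated assumptions: the function names are hypothetical, the interactive vi step is replaced by copying in a dfile that you have already edited elsewhere (a convenience that is not part of the procedure above), and the commands and paths are taken as written in the step list. By default it only prints what it would do.

```shell
#!/bin/sh
# Dry-run wrapper around the manual 9.X kernel build steps.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = path to an already edited dfile (hypothetical replacement for "vi dfile")
build_kernel_9x() {
    if [ -z "$1" ]; then
        echo "usage: build_kernel_9x /path/to/edited/dfile" >&2
        return 2
    fi
    # Each step runs only if the previous one succeeded.
    run cd /etc &&
    run cp dfile dfile.old &&
    run cp "$1" dfile &&
    run config dfile &&
    run make -f config.mk &&
    run mv /hp-ux /hp-ux.old &&
    run mv /etc//hp-ux /hp-ux &&
    run cd / &&
    run shutdown -h 0
}
```

Running build_kernel_9x /tmp/dfile.new with the default DRY_RUN=1 lists the nine commands in order. Set DRY_RUN=0 only as root and only after reviewing that list, since the last step halts the system.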

Configuring Kernel Parameters in 10.X
In HP-UX 10.X we recommend first manually modifying the kernel parameter hpux_aes_override and then modifying the other kernel parameters in SAM by using a tuned parameter set. The hpux_aes_override kernel parameter is the only recommended parameter that must be modified manually. The other parameters can then be updated with SAM or modified manually along with hpux_aes_override. We recommend using SAM to take advantage of its built-in kernel parameter rule checker.


To configure a kernel manually, you must be root. All work related to creating a new kernel in 10.X takes place in the /stand/build directory. You will create a new kernel configuration file, run mk_kernel to build the new kernel, and move the new configuration file and kernel into place after saving the old ones under other names. Then reboot the system.
• cd /stand/build
• /usr/lbin/sysadm/system_prep -s system
• vi system
• Either add or modify the entries to match:
  ♦ hpux_aes_override 1
• mk_kernel -s system
• mv /stand/system /stand/system.prev
• mv /stand/build/system /stand/system
• mv /stand/vmunix /stand/vmunix.prev
• mv /stand/build/vmunix_test /stand/vmunix
• cd / ; shutdown -h 0
Note: For more information on manual kernel configuration, please see the HP-UX 10.X System Administration "How To" Book.

To configure the remaining kernel parameters with SAM, follow these steps:
• Log in to the system as root.
• Place the list of kernel parameter values above in the file /usr/sam/lib/kc/tuned/stuff.tune (The first line should be "STUFF Applications" in the format shown in the general "Configuring Kernel Parameters" section above.)
• Start SAM by typing the command: sam
• With the mouse, double-click on Kernel Configuration.
• On the next screen, double-click on Configurable Parameters.
• SAM will display a screen with a list of all configurable parameters and their current and pending values. Click on the Actions selection on the menu bar and select Apply Tuned Parameter Set ... on the pull-down menu. Select STUFF Applications from the list and click on the OK button.
• Click on the Actions selection on the menu bar and select Create A New Kernel. A confirmation window will be displayed warning you that a reboot is required. Click on YES to proceed.
• SAM will build the new kernel and then display a form with two options:
  ♦ Move Kernel Into Place and Reboot the System Now
  ♦ Exit Without Moving the Kernel Into Place
• If you select the first option and then click on OK, the new kernel will be moved into place and the system will be automatically rebooted.
• If you select the second option, move the kernel from the /stand/build directory into the /stand/vmunix

Networks


Network configuration can also have an impact on performance. Virtually all installations use some form of local area network to facilitate sharing of data files and to simplify system management. Most installations use NFS to mount remote file systems so they appear local to the user. This enables the user to access data from any disk on the network as easily as from a local disk. This imposes a performance penalty, however, because the I/O bandwidth for accessing data on an NFS mounted disk is less than that for a directly connected disk. There are a few system configuration recommendations that can be made to maximize the convenience that NFS and the local area network provide while minimizing the performance penalty.
• Patches. Always install the latest HP-UX NFS patch. HP periodically releases patches that correct problems associated with NFS, many of them performance related. If you are using NFS, you should make sure the latest patch is installed on both the client and server. See the PATCHES section for more details. General HP-UX patch information can be found on http://us-support.external.hp.com.
• Local vs. Remote. You will need to determine which things should be located remotely and which should be local. From a system administration viewpoint, the most convenient scenario is to have applications, data, home directories, and basically anything anyone cares about on a central NFS file server which is backed up regularly. That server is then accessed by multiple clients, which are typically workstations with a minimal amount of local disk for OS and swap, and are not backed up. At the other extreme, for maximum performance it is best to have no network access whatsoever and keep everything on local disks. Between those two extremes there is a continuum of options, all of which have associated tradeoffs.
• Subnetting. In general, it is a bad idea to have too many systems on a single wire. Implementation of a switched ethernet configuration with a multi host server or a server backbone configuration can preserve existing wiring while maximizing performance. If you are doing rewiring, seriously consider using fiber for future upgradability.
• Local paging. When applications are located remotely, one trick you can use is to set the "sticky bit" on the application binaries, using the chmod +t and find commands. This forces the system to page the text segment to the local disk, improving performance. Otherwise, it is paged across the network. Of course, this only applies when actual paging is occurring. More recently, there is a kernel parameter, remote_nfs_swap, which when set to 1 will accomplish the same.
• Demand loading. Previous versions of this document have recommended setting the demand loading bit on binaries using the chatr command. There has been some controversy over this; empirical data has shown that it does make a difference, while some information has been found stating that there is no difference between demand loadable binaries and shared ones. The current conclusion is that there is indeed a difference and that it may be beneficial to lessen startup times by setting the demand loading bit as described.
• File locking. Make sure the revisions of statd and lockd throughout the network are compatible; if they are out of synch, it can cause mysterious file locking errors. This particularly affects user mail files and Korn shell history files.
• NFS configuration. On NFS servers, a good first order approximation is to run two nfsd processes per disk. The default is four total, which is probably not enough on a server. On 9.x systems, too many nfsd processes can cause context switching bottlenecks, because all the nfsds are awakened any time a request comes in. On 10.x systems, this is not the case and you can safely have extra nfsd processes. Start with 30 or 40 nfsds. On NFS clients, run sixteen biod processes. In general, HP-UX 10.X has much better NFS performance than previous versions.
• Design the LAN configuration to minimize inter segment traffic. To accomplish this you will have to ensure that heavily used network services (NFS, licensing, etc.) are available on the same local segment as the clients being served. Avoid heavy cross segment automounting.
• Maximize the usage of the automounter. It allows you to centralize administration of the network and also gives greater flexibility in configuring the network. Avoid the use of specific machine names, which may change over time, in your mount scheme; force mount points that make sense. /net ties you to a particular server, which may change over time.
• You can watch the network performance with Glance, the netstat command, and the nfsstat command. There are other tools like NetMetrix or a LAN analyzer to watch LAN performance. Additionally, you can use the HP products PerfView Software/UX and HP MeasureWare/UX to collect data over time and analyze it. You may want to tune the timeo and retrans variables. For HP systems, small numbers (4 for retrans and 7 for timeo) are good. The default values for wsize and rsize, 8K, are almost always appropriate. Do NOT use 1024 unless talking to an Apollo system running NFS 2.3 on SR10.3. 8K is appropriate for 10.4 Apollos running NFS 4.1.
• Explore using dedicated servers for computing, file serving, and licensing. A good scenario has a group of dedicated servers connected with a fast "server backbone", which is then connected to an ethernet switch, which is itself connected to the desktop systems.
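The nfsd and mount option suggestions in the list above can be written down as two small helpers. This is a rough sketch with hypothetical function names: it simply encodes the rules of thumb stated in the text (two nfsds per disk on 9.x, the low end of the 30 to 40 range on 10.x, and 8K rsize/wsize with timeo 7 and retrans 4). It does not start any daemons or mount anything.

```shell
#!/bin/sh
# Rules of thumb from the Networks section, as shell helpers.

# Suggested number of nfsd processes on an NFS server.
# $1 = HP-UX major release (9 or 10), $2 = number of disks on the server
suggest_nfsd() {
    case "$1" in
        9)  echo $(( 2 * $2 )) ;;   # two per disk; more risks context switching
        10) echo 30 ;;              # low end of the "30 or 40" starting range
        *)  echo "unknown release: $1" >&2
            return 1 ;;
    esac
}

# NFS mount option string using the values suggested for HP clients.
# $1 = timeo (default 7), $2 = retrans (default 4)
nfs_mount_opts() {
    echo "rsize=8192,wsize=8192,timeo=${1:-7},retrans=${2:-4}"
}
```

For example, suggest_nfsd 9 6 prints 12 and nfs_mount_opts prints rsize=8192,wsize=8192,timeo=7,retrans=4, which can be used as the option list for an NFS mount.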

FlexLM Licensing
Some EDA applications use FlexLM, a commonly used UNIX licensing scheme. Some things you may want to be aware of:
• Licensing can generate significant network traffic. Some EDA applications perform a "breath of life" license check periodically. This varies from application to application; some intervals are as short as 40 seconds.
• In heavy usage mission critical situations, configure three machines to be your redundant license cluster, and make licensing the only thing running on those machines. They can be small workstations, for example, but don't bog them down with NFS or other services.
• You can mix license files from many vendors and use a single server or cluster to serve them. The vendors must support Flex 2.2 or above, and you must use the LM_LICENSE_FILE.
• There is NO FlexLM performance benefit in node-locked licenses; the server is still contacted for license checkin and checkout.
• Use the following order in the license file: node-lock multilicense lines, node-lock single license lines, floating multilicense lines, floating single license lines.
• You must call the vendor hotline and get a new license file if you want to either change the node associated with a node lock license or change servers.
• By default the device file /lan0 is overprotected for FlexLM usage; it is set to rw-------. This must be changed for FlexLM to work. rw-r--r-- is appropriate. This has been fixed at 10.x. The symptom here is that the user root can execute applications successfully, but an ordinary user cannot.

X Terminal Configuration
Many EDA sites are moving to X terminal (or "X station" in HP talk) configurations. Here are some guidelines regarding these configurations:
• Server memory. You will need 64Mb to start, and 24-48Mb for each X terminal to be served depending on the application. The more memory, the better. Swap space configuration should fall along the same lines as other systems, just all on the server.
• X terminal memory. 18Mb minimum. This allows efficient usage of fonts.
• Server kernel configuration. Set maxusers to 64. Set npty to 512.
• Networking. Try to keep X terminal traffic away from critical NFS traffic on the network.
• Use NFS to load the server files; it's faster than TFTP.
• Font paths. You may have to hardwire the paths to the EDA vendor specific fonts in the setup screen, or set up a font server.
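The server memory guideline above (64Mb to start, plus 24 to 48Mb per X terminal) works out as follows. The helper below is an illustrative sketch with a made-up function name; it prints the low and high estimates in megabytes and leaves the choice within that range to you, since the text says it depends on the application.

```shell
#!/bin/sh
# Server memory estimate for an X terminal configuration.
# $1 = number of X terminals to be served
# Prints "<low> <high>" in megabytes: 64 + 24*n and 64 + 48*n.
xserver_memory_mb() {
    base=64
    echo "$(( base + 24 * $1 )) $(( base + 48 * $1 ))"
}
```

For example, xserver_memory_mb 10 prints 304 544, so a server for ten X terminals needs roughly 304Mb to 544Mb of memory.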


Patches
Since patch numbers change frequently, it is recommended that you always check for the latest information. Here are some general recommendations:
• If you are using dynamic buffer cache on a 9.x system, load the latest kernel patch that mentions dynamic buffer cache. These patches limit the growth of the buffer cache to half of physical memory, and also modify cache management algorithms to be more efficient. These are not needed on Series 800 systems (in 9.X), or on systems not using dynamic buffer cache.
• Always load the latest kernel megapatch, ARPA transport patch, NFS/automounter patches, statd/lockd patches, and SCSI patch. Many performance and reliability improvements can be had.
• Load the latest C compiler and linker. The linker in particular is required for 9.01 systems.
• Load HP-VUE or CDE, and X/Motif patches at your discretion. Generally these are bug fixes.
• Almost always load the latest X server. Many display issues have been solved in the past by loading the latest X server. There have been isolated instances in the past of a new X server causing problems with EDA applications, though. When in doubt, call the hotline.

How to get patches. If you have WWW access, go to http://us-support.external.hp.com and follow the links to the patch list. This is also a good way to browse the latest patch list. You can also get patches by e-mail. If you know the name of the patch you want, send a message to support@support.mayfield.hp.com with the text "send patchname". Don't forget to substitute the name of the patch you want for "patchname". You can get a current list by sending the text "send patchlist". To get a complete guide on using the mail server, send the text "send guide". If the customer has HP SupportLine access, patches can be requested from the HP SupportLine at (800)633-3600, and are also available for FTP access.

How to tell what patches are loaded. Scan the directory /etc/filesets (9.x systems), or use the swlist command (10.x). Patches are named PHxx_nnnn, where xx can be KL, NE, CO, or SS. nnnn refers to the patch number, which is always unique no matter what PHxx category is specified. If a patch has been loaded on a 9.x system, a file will exist in /etc/filesets with the same name as the patch. If a patch has been loaded on a 10.x system, the patch should be listed in the output of swlist.

How to load patches. Patches are shipped as shell archives, named after the patch. To unpack the shell archive, enter sh filename, where filename is the path to the patch shell archive. You will end up with two files, a .text file and a .updt file. The .text file has detailed information about the patch. The .updt file is the actual patch source. You can install the patch with /etc/update on 9.X, either in command line mode or interactive mode. Use the following command line:
    /etc/update -s/pathname-to-updt-file -S700 -r \*
You must specify either -S700 or -S800. The -r allows a kernel rebuild and reboot if you are installing a kernel patch, so be prepared to reboot the system. Using interactive mode, point to the patch file as if it were a tape device in the "Change Source or Destination" menu, then proceed with the installation. Make sure you are in single user mode when installing any patch. To install a patch on a 10.X system, use the following command line:
    swinstall -x autoreboot=true -x match_target=true -s /pathname-to-depot-file
You can install multiple patches at a time by creating a netdist area that contains the patches using /etc/updist, or by specifying a list of patches in a file using the -f switch.

Patch management. Patch management can be a full-time job for a large site. HP recommends that large sites that don't want to tackle that particular task purchase the PSS support option. This service provides a consultant who, among other things, provides patch management. It's well worth the money.

How to make a patch tape. On a 9.x system, you can use dd to make a patch tape as follows:
    dd if=/pathname-to-updt-file of=/rmt/0m bs=2k
On a 10.x system, use the following command:
    swpackage -s /pathname-to-depot -x target_type=tape -d /rmt/0m patchname
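The 9.x check described under "How to tell what patches are loaded" can be scripted. The sketch below is an assumption-laden illustration, not an HP utility: the function names are hypothetical, the name check accepts only the four categories listed above followed by digits, and the directory argument exists only so the check can be tried against a scratch directory instead of /etc/filesets. On 10.x the equivalent check is to look for the patch name in the output of swlist, which is not scripted here.

```shell
#!/bin/sh
# Check patch names and look for loaded patches on a 9.x system.

# Succeeds if $1 has the form PHxx_nnnn with xx one of KL, NE, CO, SS
# and nnnn made up of digits only.
is_patch_name() {
    case "$1" in
        PHKL_*|PHNE_*|PHCO_*|PHSS_*) ;;
        *) return 1 ;;
    esac
    num=${1#PH??_}              # strip the "PHxx_" prefix
    case "$num" in
        ''|*[!0-9]*) return 1 ;;
    esac
    return 0
}

# Succeeds if patch $1 has a file of the same name in the filesets directory.
# $2 = directory to scan (defaults to /etc/filesets, as described in the text)
# Returns 2 if $1 is not a valid patch name, 1 if the patch is not found.
patch_loaded_9x() {
    is_patch_name "$1" || return 2
    [ -e "${2:-/etc/filesets}/$1" ]
}
```

For example, patch_loaded_9x PHKL_12901 && echo loaded reports whether a file named PHKL_12901 exists in /etc/filesets.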

Performance Tips
Kernel Parameters
Most, if not all, of the kernel parameter tuning has been covered in the preceding sections of this document. Any additional/future parameters will appear here.

File Systems
When using UFS (HFS) file systems, configure them with a block size of 64K and a fragment size of 8K. HFS file systems have historically preferred to perform I/O in 64K block sizes. I have improved performance by using a VxFS (JFS) file system when it is being used as a "scratch" file system...a file system that you do not care about when the application crashes, or when it completes successfully. When doing so, you need to mount this file system with three specific options in order to gain performance. They are:
• nolog
• mincache=tmpcache
• convosync=delay
The on-line (advanced) JFS product is required to use these options. In my experience, the block size is of no consequence when using JFS. JFS likes to perform I/O in 64K chunks, regardless of the block size. Supported block sizes are 1, 2, 4, and 8K. There is no fragment size on a JFS file system. When striping with LVM, make sure that the file system block size and the LVM stripe size are identical. This will aid performance. File systems should be mounted at mount points that are as close as possible to the "root" of the tree. This will help "shorten" directory search paths. It is very important that file systems that contain "tools" used by the application(s) be mounted as close to the top as possible. As of the latest revision (2.0) of this document, there is a JFS "mega patch" for performance. The patch number is PHKL_12901 for 700's and PHKL_12902 for the 800's.

Logical Volume Manager
The following are simply recommendations...you do not have to follow them. Obviously, there are pros and cons with everything. This is not the forum for that type of discussion, so here they are. Use as many physical disks as possible. Stripe them if you can. If you have followed the file system recommendation of using a 64K block size, use a 64K stripe size as well. I would suggest a 64K stripe size for LVM anyway. Hopefully, you will have identical disks (make, model, size, geometry, etc.). When you have control, place your logical volumes so that the "pieces" of a logical volume are located in the same place across the physical devices. For example, with four physical devices, you "stripe" a logical volume so that 25% of it appears on each of the four disks, and each piece appears at the "top" of the disk.

Startup Program
I have noticed a great many customers and ISVs using the C shell as a startup program. This might be OK on other "variations" of UNIX, but it does not fare as well on HP-UX (due to the implementation) as the K shell or POSIX shell. When a process forks many children, the .cshrc file is "fired up" and executed for each fork. I have seen some of these files that are extremely long AND that source files that source other files, and so on. This is very time consuming and degrades performance. If possible, do not use the C shell.

The PATH Variable
This is one of the most abused areas that causes performance problems. The usual culprits are PATH variables that are far too long and that place the directory containing the application's most frequently used tools at the end. This is of great concern.

NFS
Check your buffer cache size. Some say 128K for each 1000 IOPs a server expects to deliver. Check your disk and file system configurations:
• LVM configuration/layout
• Multiple disk striping?
• HFS? ...check your block/fragment sizes
• JFS? ...check your mount options
Reads and writes...server and client block sizes should match. Pay attention to the suggestions for file systems (above). nfsds...start with 30 or 40. Some say that 2 per spindle is adequate. Make sure that ninode is at least 15000 (on 10.X). Some people have seen performance degradation on multiprocessor systems when ninode is greater than 4000. Check it on your system. The details of this problem are much too detailed and complicated for this document. NFS file systems should be exported with the async option in /etc/exports. Some items that can be investigated...
nfsd invocations
• nfsstat -a
UDP buffer size
• netstat -an | grep -e Proto -e 2049
How often the UDP buffer overflows
• netstat -s | grep overflow
NFS timeouts...are they a result of packet loss? Do they correlate to errors reported by the links? Use lanadmin or netstat -i to check this.
IP fragment reassembly timeouts?
• netstat -p ip
UDP socket buffer overflows?
• ...see above
Mounting through routers?
• check to see if routers are dropping packets
Check for transport bad checksums
• netstat -s
Is the server dropping requests as duplicates?
• nfsstat
Is the client getting duplicate replies? (badxid)
• nfsstat on CLIENT
Some people have mentioned that they have had serious problems because of too many levels of hierarchy within the netgroup file. It seems that this file is re-read a great many times, and the more hierarchy, the longer it takes to read.

(c) Copyright 1996 Hewlett-Packard Company.
December 1, 1997
