
UXIE-SUPPORT

January 2004

Copyright 2004 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice. The only warranties
for HP products and services are set forth in the express warranty statements
accompanying such products and services. Nothing herein should be construed as
constituting an additional warranty. HP shall not be liable for technical or editorial errors or
omissions contained herein.
This is an HP copyrighted work that may not be reproduced without the written permission
of HP. You may not use these materials to deliver training to any person outside of your
organization without the written permission of HP.
Printed in France
HP-UX 11i Support
Student guide
January 2004
HP Restricted — Contact HP Education for customer training materials.
UXIE-SUPPORT STUDENT HANDOUT

UXIE-SUPPORT
8 modules in virtual class: Module 1 Boot
Module 2 Recovery
Module 3 LVM
Module 4 Mirroring
Module 5 SD-UX Patch
Module 6 Swap Dump
Module 7 file-systems
Module 8 Ignite

Boot-1 UXIE-SUPPORTvB

January 2004 BOOT PAGE: 1



OBJECTIVES
Students will follow step by step the complete
boot process up to the Unix prompt in order to
analyse and fix any problem during this critical
phase.
They will be prepared to use the right utility to
recover the system.


Boot process: checkpoints & breakpoints

Boot sequence (diagram): Power on, PDC, ISL, AUTO, HPUX, LABEL, BDRA, lvol1, vmunix, init, shell.

It is possible to interrupt the automatic boot process at any breakpoint.


PDC Processor Dependent Code


ISL Initial System Loader
AUTO Autoboot file
HPUX hp-ux operating system loader
BDRA Boot Data Reserved Area
LABEL Pointers to the starting point and size of boot-relevant
logical volumes

Some stages of the boot process serve as checkpoints: if a checkpoint is verified, you can be
sure the process is OK up to that point.
The boot process can also be stopped at a breakpoint, where it is possible to launch commands
and carry out checks. Three breakpoints exist:

-PDC
-ISL
-Single user mode (just before the init process)


PDC Processor Dependent Code

Main Functions:

•Performs a self-test and initialises the SPU

•Stops at the BCH point if autoboot not set

•Finds the ISL loader (defaults to the primary boot PATH)

•Loads and transfers control to the ISL


PDC Processor Dependent Code


SPU System Processor Unit
BCH Boot Command Handler
ISL Initial System Loader

The following screen capture shows the PDC main menu with the available commands.
At the prompt you can start the boot from the primary path; if you answer yes to the
"Interact with IPL" question, you force the system to stop at the ISL breakpoint.

---- Main Menu ---------------------------------------------------------------

     Command                           Description
     -------                           -----------
     BOot [PRI|ALT|<path>]             Boot from specified path
     PAth [PRI|ALT] [<path>]           Display or modify a path
     SEArch [DIsplay|IPL] [<path>]     Search for boot devices

     COnfiguration menu                Displays or sets boot values
     INformation menu                  Displays hardware information
     SERvice menu                      Displays service commands

     Display                           Redisplay the current menu
     HElp [<menu>|<command>]           Display help for menu or command
     RESET                             Restart the system
----
Main Menu: Enter command or menu >bo pri
Interact with IPL (Y, N, or Cancel)?> y


QUESTION
What is the PDC?

How do we know that its job is complete?


The PDC is the firmware that implements all processor-dependent functionality,
including initialization and self-test of the processor. Upon completion, it loads
and transfers control to the initial system loader (isl(1M)).

When the processor is reset, and once initialization and self-test are complete, the PDC
reads the Console Path from Stable Storage and attempts to initialise the console
device.

Its job is complete when you reach the BCH (boot command handler) prompt:

For example:

Main Menu: Enter command or menu >


ISL Initial System Loader


Main Functions:

•Opens the AUTO file, which defines the secondary loader options and behaviour

•Loads and transfers control to the secondary loader utility

Content of the AUTO file

•disk hardware path, disk section (default 0)

•kernel name and path in lvol1 (default vmunix)

•Options used when executing the secondary loader


ISL> help

   HELP         Help Facility
   LS           List ISL utilities
   AUTOBOOT     Set or clear autoboot flag in stable storage
   AUTOSEARCH   Set or clear autosearch flag in stable storage
   PRIMPATH     Modify primary boot path in stable storage
   ALTPATH      Modify alternate boot path in stable storage
   CONSPATH     Modify system console path in stable storage
   DISPLAY      Display boot and console paths in stable storage
   LSAUTOFL     List contents of autoboot file
   FASTSIZE     Sets or displays FASTSIZE
   800SUPPORT   Boots the s800 Support Kernel from the boot device
   700SUPPORT   Boot the s700 Support Kernel from the boot device
   READNVM      Displays contents of one word of NVM
   READSS       Displays contents of one word of stable storage
   LSBATCH      List contents of batch file
   BATCH        Execute commands in batch file
   LSEST        List contents of EST (Extended Self Test) file
   EST          Execute commands in EST (Extended Self Test) file
   RESET        Reboot the system

ISL> ls
Utilities on this system are:
filename   type     start   size   created
ODE        -12960   584     848    03/03/03 16:01:22
HPUX       -12928   3848    848    00/11/08 20:50:00


HPUX secondary system loader


Main Functions:
•Locates the boot, root and swap logical volumes
(using by default the LABEL file)

•Mounts the boot file-system (/stand)

•Locates the kernel file (default vmunix)

•Loads the kernel file into memory

•Transfers control to the kernel


hpux disk(<HW path of boot disk>;<section>)<path of kernel> <options>


HPUX System bootstrap utility

The hpux command has three parameters:

- The hardware location of the boot file-system.
- The path in the file-system and the name of the kernel.
- The kernel options (see later).

Example:

ISL> hpux (2/0/1.5.0;0)/stand/vmunix.prev -is
Boot :
disk(2/0/1.5.0;0)/stand/vmunix.prev
5615348 + 425984 + 406432 start 0x1c2868

Where:

2/0/1.5.0      is the disk hardware path
;0             is the disk section number (0 for LVM)
/stand         is the kernel directory
vmunix.prev    is the kernel name
-is            is the command option (single user mode)
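The breakdown of the hpux arguments shown above can be sketched in shell. This is only an illustration that runs on any POSIX shell, assuming the argument has the form "(<hw path>;<section>)<kernel path> <options>" used in the example; split_boot_string is a hypothetical helper name, not an HP-UX command.

```shell
#!/bin/sh
# split_boot_string: hypothetical helper, not part of HP-UX.
# Splits "(<hw path>;<section>)<kernel path> [options]" into its parts.
split_boot_string() {
    str=$1
    dev=${str%%")"*}        # text before the closing parenthesis
    dev=${dev#"("}          # drop the opening parenthesis
    rest=${str#*")"}        # kernel path and options
    hw_path=${dev%%;*}      # part before the semicolon
    section=${dev#*;}       # part after the semicolon
    kernel=${rest%% *}      # first word: full kernel path
    case $rest in
        *" "*) options=${rest#* } ;;   # everything after the first space
        *)     options= ;;
    esac
    echo "hw_path=$hw_path"
    echo "section=$section"
    echo "kernel_dir=${kernel%/*}"
    echo "kernel_name=${kernel##*/}"
    echo "options=$options"
}

split_boot_string "(2/0/1.5.0;0)/stand/vmunix.prev -is"
```

The sketch does no validation: a string without a semicolon or without parentheses is not handled.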

QUESTION
How is the HPUX loader able to locate the boot file-system?

How is the HPUX loader able to mount the boot file-system?


The LABEL file is used by the HPUX loader to locate the kernel and to get the
kernel startup parameters (root file-system and primary swap).

Once the boot file-system is located, the HPUX loader needs to copy its super
block into memory and mount the file-system. At this point, as LVM is not
running, this logical volume MUST be contiguous so that the whole file-system
can be accessed. If it were not contiguous, the loader could not know the size of
any gap between extents; keeping track of this is normally the work of the LVM VGDA.

The BDRA (Boot Data Reserved Area) is used by the LVM subsystem to initialize
itself and activate the root volume group. This area contains the hardware paths
of the boot disks and the locations of the root, primary swap, and dump logical
volumes.

The remaining volume groups are activated later during the boot process;
information about them is contained in the /etc/lvmtab file, accessible once the
root volume group is activated and the root file-system is mounted.


HPUX options

•Options are read from the AUTO file

•Options can be changed manually at the ISL prompt or
using the mkboot command from a running system

hpux main options

ls or ll    list files in /stand
-lq         boot in quorum mode (LVM quorum check overridden)
-is         boot in single user mode
-lm         boot in LVM maintenance mode


Some examples of possible options:

ISL> hpux show autofile [devicefile]
ISL> hpux set autofile devicefile string
ISL> hpux ls disk(0/0/1/1.2.0.0.0.0.0;0)
ISL> hpux ll [devicefile]
ISL> hpux ls [-aFiln] [devicefile]
ISL> hpux
ISL> hpux -is
ISL> hpux /stand/vmunix.prev

system_sh # mkboot -a "hpux" /dev/rdsk/<boot_disk>


/stand/vmunix
Main Functions:

•Kernel file

•Locates and mounts the root file-system

•Executes the /sbin/pre_init_rc script

•Starts the SWAPPER

•Starts the initial /sbin/init process


/stand/vmunix locates and configures the hardware devices, locates the root
file-system, and starts a shell to read commands from /sbin/pre_init_rc.
It then starts the first process, /sbin/init, which reads the /etc/inittab
initialisation file to define the environment for normal working conditions.

/sbin/pre_init_rc checks the root file-system.

When the system begins thrashing, or when free memory falls below a
threshold known as "desfree", the swapper becomes active. The swapper
deactivates processes, which prevents them from running, and thus reduces the
rate at which new pages are accessed. Pages belonging to a process that is
deactivated will not be referred to and will become good candidates to be freed
by the paging daemon. When the swapper detects that available memory has risen
above the minfree threshold and the system is no longer thrashing, it will
reactivate the deactivated processes.


QUIZ 1
The secondary loader (hpux) is OK, but is unable to load vmunix.
What could be the cause of the problem in the boot process, in
chronological order?

[ ] lvol1 is corrupt
[ ] /stand/vmunix has been deleted
[ ] BDRA is defective
[ ] LABEL is defective


Any of these four answers can cause a boot failure.

Boot sequence (diagram): PDC, ISL, AUTO, HPUX, LABEL, BDRA, lvol1, vmunix, init, shell.

But following the boot process in order, the first error is produced by the LABEL.


QUESTION
How is the kernel able to locate the root file-system?

How is the kernel able to mount the root file-system?


The BDRA (Boot Data Reserved Area) is used by the LVM subsystem to initialize
itself and activate the root volume group. This area contains the hardware paths
of the boot disks and the locations of the root, primary swap, and dump logical
volumes.

The kernel needs to copy the root file-system super block into memory and mount
it on the / mount point. At this point, as LVM is not running, this logical
volume MUST be contiguous so that the whole file-system can be accessed. If
it were not contiguous, it would not be possible to know the size of any gap between
extents; keeping track of this is normally the work of the LVM VGDA.


init process

Main Functions:

•Reads the content of the /etc/inittab file

•Sets the default running level

•Defines the environment for normal working conditions

•Starts all other processes


/sbin/init
Executable that starts the system and reads inittab to determine what
steps to take.
/etc/inittab
Configuration file containing instructions for the init program.
/sbin/rc
Script invoked when a new run level is reached; calls rc.config (among
other things) to execute startup scripts.
/etc/rc.config
Sources the configuration files in /etc/rc.config.d/; the scripts in the
/sbin/rc(level).d/ directories are then run in alphanumeric order.
/etc/rc.log
Output of scripts run at shutdown and startup.

HP-UX starts the init process, /sbin/init. The init process has process ID one (1) and no parent
process. The init process is ultimately responsible for starting all other user processes.
The init process reads the /etc/inittab initialization file to define the environment for normal
working conditions.

System Initialisation File -- /etc/inittab


The init process reads the /etc/inittab file one line at a time, each line containing an entry that
describes an action to take.

Default Run Levels -- initdefault


The /etc/inittab file sets up system run-levels. Run-levels are defined as collections of processes
that allow the system to operate with certain properties. The entry marked initdefault sets the
default run-level (typically 3 or 4):
init:3:initdefault:
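
As a rough illustration of how the initdefault entry can be read, the sketch below extracts the default run-level from a file in inittab format (fields id:rstate:action:process). The function name default_runlevel and the temporary sample file are assumptions made for this example; on a real system the file would be /etc/inittab.

```shell
#!/bin/sh
# default_runlevel: hypothetical helper that prints the run-level of the
# first initdefault entry found in an inittab-format file.
default_runlevel() {
    awk -F: '!/^#/ && $3 == "initdefault" { print $2; exit }' "$1"
}

# Sample file for illustration only, built from entries quoted in this module.
sample=${TMPDIR:-/tmp}/inittab.sample.$$
cat > "$sample" <<'EOF'
init:3:initdefault:
sqnc::wait:/sbin/rc >/dev/console 2>&1 # system initialization
cons:123456:respawn:/usr/sbin/getty console console
EOF

default_runlevel "$sample"
rm -f "$sample"
```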


Which scripts are executed by the init process?

• /sbin/ioinitrc

• /sbin/bcheckrc
• /sbin/rc
• /usr/sbin/getty


The init process will start these four scripts in this order.


/sbin/ioinitrc

•Checks the I/O configuration
•Checks if the ioconfig configuration files exist
•Invokes /sbin/ioinit on errors
•Compares the 2 copies and sets instance numbers:
  •/stand/ioconfig
  •/etc/ioconfig


What is an ioconfig file?

It is a binary file that describes the hardware configuration of your system.
The notes have a decoded example from an 11.00 system.

/usr/sbin/ioinit
Summary: tests and maintains consistency between the kernel I/O data structures and /etc/ioconfig
(called by ioinitrc during boot up).

/sbin/ioinitrc
This script first checks if the system is running from a local root or is running as an NFSD client.
For local root systems, it checks if the boot filesystem must be mounted to /stand, as is the case
for separate boot/root filesystem; for NFSD clients, it starts the nfs daemons.

It then checks to see if /etc/ioconfig exists. ioinitrc will invoke a shell at the console if
/etc/ioconfig is absent. Otherwise ioinitrc invokes ioinit with the -i and -r options.

NOTE: This script is not configurable! Any changes made to this script will be overwritten when
you upgrade to the next release of HP-UX.


What are Instance Numbers ?

• Numbers associated with a class of devices
• Uniquely identifying a device within its class
• Allocated by the /sbin/ioinitrc script
• Kept in 2 binary files
  • /stand/ioconfig
  • /etc/ioconfig


What is an instance number?

An instance number is a number associated with a class of device that uniquely identifies a
device to the system.
For some devices (disk drives) the instance number has no effect on how the device is used; almost
every other device depends on its instance number in some important way.

That covers the what; now the when.

Instance numbers are allocated by /sbin/ioinitrc during system startup.
A copy of the ioconfig file (where instance numbers are kept on a permanent basis) is kept in the
boot and root filesystems (if separate). If the copies differ, new device files are created and the
files are synchronised.

Class
A device class groups devices of the same kind. Device classes can be listed with the
lsdev command. They are defined in the files in the directory /usr/conf/master.d. The special class
pseudo includes all pseudo-drivers.


How can we modify them ?

• Instance numbers can be changed through ioinit:

– create a file describing the changes (/tmp/newinst)


– invoke the ioinit command

#/sbin/ioinit -f /tmp/newinst

– reboot


How can you change it?

The only supported way to change an instance number is via the ioinit command. For example,
suppose you changed the SCSI address of the tape drive to 1 in the decoded example and you want
to reset the tape instance to 0 instead of its current 1.

The three steps:

- create a file containing the following (in this example the file is
/tmp/iochanges):

8/16/5.1.0 tape 0
8/16/5.0.0 tape 1

- run the ioinit command:

# ioinit -f /tmp/iochanges -r

- reboot the system

The -r option reboots the system; a reboot is mandatory for the new instance
numbers to take effect.
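
Before handing a mapping file to ioinit, it can be worth checking its layout by hand. The sketch below is a hypothetical pre-check, not an HP-UX tool: it assumes each line reads "<hw_path> <class> <instance>", as in the example above, and it rejects a file in which the same instance number appears twice within one class. The file name used here is a temporary stand-in for /tmp/iochanges.

```shell
#!/bin/sh
# check_ioinit_map: hypothetical pre-check for a mapping file.
# Returns 0 when every line has three fields, a numeric instance,
# and no (class, instance) pair is repeated; returns 1 otherwise.
check_ioinit_map() {
    awk '
        NF != 3          { printf "line %d: expected 3 fields\n", NR; bad = 1; next }
        $3 !~ /^[0-9]+$/ { printf "line %d: instance is not a number\n", NR; bad = 1; next }
        seen[$2, $3]++   { printf "line %d: %s instance %s used twice\n", NR, $2, $3; bad = 1 }
        END              { exit (bad ? 1 : 0) }
    ' "$1"
}

# Illustration with the two lines from the example above.
map=${TMPDIR:-/tmp}/iochanges.$$
printf '8/16/5.1.0 tape 0\n8/16/5.0.0 tape 1\n' > "$map"
if check_ioinit_map "$map"; then
    echo "mapping file looks consistent"
fi
rm -f "$map"
```

The check says nothing about whether the hardware paths exist on the system; only ioinit itself can confirm that.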


QUESTION

How are instance numbers distributed?


As the system discovers the hardware, it assigns instance numbers within each
device class, starting from 0 and incrementing the number for each new device found.

Each device receives a number that is unique within its class.
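
The numbering rule just described can be mimicked with a few lines of awk. This is a simplified sketch of the rule only, under the assumption of a fresh system with no existing ioconfig file; assign_instances is a hypothetical name, and the hardware paths in the sample input are reused from elsewhere in this module purely for illustration.

```shell
#!/bin/sh
# assign_instances: reads "<hw_path> <class>" lines in discovery order on
# standard input and prints "<hw_path> <class> <instance>", numbering the
# devices of each class from 0.
assign_instances() {
    awk '{ n = count[$2]++; print $1, $2, n }'
}

# Illustrative discovery order (not taken from a real ioscan listing).
printf '%s\n' \
    '2/0/1.5.0 disk' \
    '8/16/5.0.0 tape' \
    '8/16/5.1.0 tape' | assign_instances
```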


/sbin/bcheckrc

•Activates LVM volume groups (performs the quorum check)
 using the /sbin/lvmrc script file and the /etc/lvmtab file
•Runs eisa_config
•Checks and mounts file-systems using the /etc/fstab file


If you are implementing the Logical Volume Manager (LVM), bcheckrc calls /sbin/lvmrc to
activate LVM volume groups.

On all systems, /sbin/bcheckrc verifies that the system was properly shut down, and that file-
systems were consistently saved on the disk. Depending on the type of the file-system, bcheckrc
may invoke a series of operations such as fsck that verify file-system integrity, and correct it if
necessary. If a file-system has become damaged and fsck cannot repair it automatically without
loss of data, then bcheckrc will start a shell at the system console with the prompt "(in
bcheckrc)#", instructing you to run fsck interactively. If this occurs, you must run fsck to correct
the integrity of your file-system.

Some file-system problems must be fixed in this way to minimize risk of data loss. After running
fsck interactively, you may be instructed to reboot the system. If so, reboot the system using
"reboot -n" to bring down the system without writing the in-core filesystem map out to disk; this
maintains the disk file-system's integrity. If fsck does not tell you to reboot, exit the shell by
typing CTRL-D. This returns control to bcheckrc.
When fsck has verified the consistency of the file-systems, it exits.

/etc/lvmrc
# This file is sourced by /sbin/lvmrc. This file contains the flags
# AUTO_VG_ACTIVATE and RESYNC which are required by the script in /sbin/lvmrc.
# These flags must be set to valid values.

/sbin/lvmrc
This file controls the automatic activation of LVM volume groups
during boot up.
The activation of volume groups can be customised by editing /etc/lvmrc.
The script in /sbin/lvmrc (this file) depends on the variables
AUTO_VG_ACTIVATE and RESYNC, both of which are set in /etc/lvmrc.


/sbin/rc
•Initialises the system up to the default run level (/etc/inittab)
•Each level has a specific set of scripts
•They are stored in /sbin/rcx.d (x=level#)

Starting script       Level     Killing script

/sbin/rc3.d/Sxxx      LEVEL3    /sbin/rc2.d/Kxxx
/sbin/rc2.d/Sxxx      LEVEL2    /sbin/rc1.d/Kxxx
/sbin/rc1.d/Sxxx      LEVEL1

Extract from the /etc/inittab file:

sqnc::wait:/sbin/rc >/dev/console 2>&1 # system initialization


Initial Customisation Script -- /sbin/rc
The following entry invokes /sbin/rc:

sqnc::wait:/sbin/rc >/dev/console 2>&1 # system initialization

The /sbin/rc script is a general purpose sequencer program that runs whenever there is a change in
the system run-level (such as a change in the run-level from 2 to 3). The system executes /sbin/rc
at startup as this is a change from run-level 0 (halted) to the initdefault level.

/sbin/rc invokes startup scripts appropriate for the run-level. When entering state 0, /sbin/rc starts
scripts in /sbin/rc0.d. When a system is booted to a particular run-level, it will execute startup
scripts for all run-levels up to and including the specified level. For example, if you are booting to
run-level 4, the /sbin/rc sequencer script executes the start scripts in this order: /sbin/rc1.d,
/sbin/rc2.d, /sbin/rc3.d, and /sbin/rc4.d.

Current start scripts and sequence numbers are listed in an accompanying file. Note that the
entries on your system may vary depending on your configuration. The scripts are run in
alphanumeric sequence. For a description of the script, see the ASCII-readable script file on your
system.

Also note that the kill scripts for the start scripts in directory /sbin/rcN.d reside in
/sbin/rc(N-1).d.
The init process waits until /sbin/rc exits before processing the next entry in /etc/inittab.
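
The directory rules described in these notes (start scripts run for levels 1 up to the target level, kill scripts sit one level below their start scripts) can be written down as a short sketch. The function names start_dirs and kill_dir are hypothetical; the code only prints directory names and does not run any script.

```shell
#!/bin/sh
# start_dirs N: print, in order, the directories whose start scripts are
# executed when booting to run-level N (levels 1 through N).
start_dirs() {
    level=1
    while [ "$level" -le "$1" ]; do
        echo "/sbin/rc${level}.d"
        level=$((level + 1))
    done
}

# kill_dir N: print the directory holding the kill scripts that match the
# start scripts of /sbin/rcN.d, that is /sbin/rc(N-1).d.
kill_dir() {
    echo "/sbin/rc$(( $1 - 1 )).d"
}

start_dirs 4
kill_dir 3
```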


HP-UX shutdown & reboot

(Diagram: transitions between Multi-User Mode, Single-User Mode, Halt State and Power Off,
labelled with the commands shutdown, shutdown -h, shutdown -r, reboot -h and reboot, and with
the power switch.)


The shutdown command is a multi-user mode command. It brings the system down to
single user mode, stopping all processes according to the kill scripts.

The reboot command is a single user mode command, used to take the
system from single user mode back to multi-user mode through the power-on
state. It does not stop multi-user processes cleanly: if any are running, they are
immediately killed. If a database application is running, this will certainly corrupt
the database.

NOTE: Never use the reboot command in multi-user mode at the customer site.

Some options are available, such as -r to force a reboot or -h to halt the system.

If you need to halt a system quickly and properly in order to power it off, use the
command:
# shutdown -hy 0


QUESTION
What is the difference between the reboot
and the shutdown command?


Reboot is a single user mode command.

Shutdown is a multi-user mode command.


/usr/sbin/getty

• Terminal processes start up
• Starts the login process

Extract from the /etc/inittab file:

cons:123456:respawn:/usr/sbin/getty console console
ttp1:234:respawn:/usr/sbin/getty -h tty0p1 9600


Terminal Processes Startup -- /usr/sbin/getty
Once /sbin/rc has finished, control returns to init, which runs the commands
from the process field of the /etc/inittab entries that apply to the current run-level.
Typically, entries in /etc/inittab for a given run-level invoke the /usr/sbin/getty
command, one for each terminal on which users log in.
(When you add a new terminal with the SAM utility, it automatically adds an
appropriate /usr/sbin/getty entry to /etc/inittab.) The /usr/sbin/getty command
runs the login process on the specified terminal, allowing users to log in on the
terminal.

For example, the following /etc/inittab entry runs a getty at the system console:

cons:123456:respawn:/usr/sbin/getty console console # system console

The respawn action field tells init to restart the getty process after it exits.
This means that each time you log off the system console, a new "login:"
prompt is displayed, so you can log in the next time. The 123456 run-levels
field indicates that init runs getty in run-levels 1 through 6.
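
To see which entries apply at a given run-level, the run-levels field can be tested for that level. The sketch below is a simplified illustration: entries_for_level is a hypothetical helper, entries with an empty run-levels field are ignored, and a process field containing a colon would be truncated.

```shell
#!/bin/sh
# entries_for_level FILE LEVEL: print "id action process" for each entry of
# an inittab-format file whose run-levels field contains LEVEL.
entries_for_level() {
    awk -F: -v lvl="$2" '
        /^#/ || NF < 4      { next }
        index($2, lvl) > 0  { print $1, $3, $4 }
    ' "$1"
}

# Sample built from the two getty entries quoted in this module.
sample=${TMPDIR:-/tmp}/inittab.getty.$$
cat > "$sample" <<'EOF'
cons:123456:respawn:/usr/sbin/getty console console
ttp1:234:respawn:/usr/sbin/getty -h tty0p1 9600
EOF

echo "run-level 1:"; entries_for_level "$sample" 1
echo "run-level 3:"; entries_for_level "$sample" 3
rm -f "$sample"
```

With this sample, only the console getty is listed for run-level 1, while both entries are listed for run-level 3.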


QUESTION
How are terminals initialised for telnet connections?


They do not use the getty process. They are handled by the inetd daemon, which starts
the telnet server program (telnetd) after security and permission checks.


log files
• # dmesg command
• display console diagnostics messages

• /etc/rc.log
• Output of startup and shutdown scripts

• /var/adm/syslog/syslog.log

• /etc/shutdownlog


dmesg looks in a system buffer for recently printed diagnostic messages and prints them on the
standard output. The messages are those printed by the system when unusual events occur (such
as when system tables overflow or the system crashes).

rc.log records all events concerning subsystems during the init phase.

syslog.log records system activity since the last boot, in a file newly created at boot time. Its
configuration file is /etc/syslog.conf, and its daemon (syslogd) is started by the
/sbin/init.d/syslogd script.

The shutdownlog file is updated when the system reboots and also when the system
stops. It is very useful for reviewing the history of the system.


QUIZ 2
How can we recover an erased or corrupt /etc/inittab file?

[ ] Re-install the system
[ ] Boot in single user mode
[ ] Use a diagnostic recovery CD
[ ] Boot in quorum mode


The inittab file is corrupted. There is a way to boot without it:

Boot in single user mode. You reach the Unix prompt just before
the init process.

Boot sequence (diagram): PDC, ISL, AUTO, HPUX, LABEL, BDRA, lvol1, vmunix, init, shell.

At this point, you can edit the file, restore it or re-create it.


QUIZ 3
The root password has been changed, but nobody knows it.
How can we log in to the system?

[ ] Re-install the system
[ ] Boot in single user mode
[ ] Use the diagnostic recovery CD
[ ] Boot in quorum mode


The login process is started by the init process, so if you stop the boot just before the init
process, you will be able to change the root password. Boot in single user mode:
the system gives you a prompt just before starting the init process.
At that prompt you are connected as root with all capabilities,
and you can change the root password with the command:

# passwd root


UXIE-SUPPORT
8 modules in virtual class: Module 1 Boot
Module 2 Recovery
Module 3 LVM
Module 4 Mirroring
Module 5 SD-UX Patch
Module 6 Swap Dump
Module 7 File Systems
Module 8 Ignite

Recovery-1 UXIE-SUPPORTvB

January 2004 RECOVERY PAGE: 1



OBJECTIVES

Students will be able to discover the reasons for a
boot failure. They will know the commands and
utilities needed to fix the problem.
After a short analysis, students will be able to
take the right decision: Recover or Restore


What to do in case of failure ?

• Re-install HP-UX and restore all files      1 day
• Reload from a recovery tape (IGNITE)       2 hours
• Recover (HP-UX prompt)                     20 minutes


The first option, in case of system failure, is to re-install the HP-UX system from your original CDs.
You then have to install all applications and all current patches, and customise the
system with the help of your customer (groups, users, passwords).
This will take at least one complete day, perhaps more.

The second option is to use an IGNITE backup tape created beforehand.
You must have installed the product that corresponds to your HP-UX release and made correct
backups.
This could be an entire backup (see the Ignite module) or a core OS backup plus an fbackup of all
files.

The last option is to try to recover the system.
Sometimes it is better to spend some time trying to recover. The analysis will take less than 20
minutes and will give you an exact diagnosis of the trouble. At this point you have a real idea
of the time you need to fully recover the system.
Schedule an action plan with your customer and take the right decision with him. Even if you
need only one hour to recover, your customer may prefer to wait 2 hours for a reload and be sure
of a solid deadline. Never forget customer satisfaction and customer expectations.


How to recover ?

• Boot in degraded mode
  – Single User mode
  – LVM quorum mode
  – LVM maintenance mode

• Use a Recovery medium
  – automatic recovery
  – manual recovery


Recovery overview
System recovery is necessary when the system does not boot up properly.
If the system can be booted in single user mode, or in LVM maintenance mode, you can use a set
of HP-UX commands to recover the system.
However, if the system cannot be booted, you must use the recovery media to recover the system
either automatically or manually.
►Using automatic recovery, scripts are run to restore the system in a minimal
mode. Further recovery operations need to be done once the system is
rebooted.
►Using manual recovery, you have access to an HP-UX shell run from the
recovery media. This lets you use your own customised recovery procedure.
Lastly, Ignite/UX, introduced for version 11.00 but also supported in versions 10.x, provides the
ability to create a recovery tape, that is, a bootable backup tape.


QUESTION

Where can I find recovery programs?


The recovery programs are located on the Install/core&recovery CD.

These programs are also available on any Ignite server.


Bootable Disk LVM Structures

BOOT AREA (2.9 MB):
  LIF Header
  PVRA
  BDRA
  LIF Area (LIF volume): ISL, HPUX, AUTO, ODE, LABEL, ...
  VGRA

DATA AREA:
  lvol1
  lvol2
  lvol3
  ...


PVRA Physical Volume Reserved Area

BDRA Boot Data Reserved Area

VGRA Volume Group Reserved Area

LABEL File and BDRA

The boot process of HP-UX on an LVM boot disk involves both the LABEL file, used by the
kernel loader, and the BDRA, used by the kernel.
The LABEL file is read by the kernel loader to locate the kernel and to get the kernel startup
parameters (root and boot file systems and primary swap). If it is lost or corrupted, the kernel
loader is unable to locate the kernel.
The BDRA is used by the LVM subsystem to initialise itself and activate the root volume group.
If the BDRA is corrupted, the kernel cannot initialise LVM, even for the root disk.
The remaining volume groups are activated later during the boot process: their information is
contained in the /etc/lvmtab file, accessible once the root volume group is activated.


Boot and Root Logical volumes

Boot Area                 2912KB
LVOL1 (first address)     Boot     /stand (contains rootconf)
LVOL2                     Primary swap, default dump
LVOL3                     Root     /


Boot Logical Volume: The logical volume containing the HP-UX kernel.
Root Logical Volume: The logical volume containing the root file system.
The LVM boot disk contains four important logical volumes:

►Boot filesystem. It is the filesystem containing the kernel, thus /stand in 10.20 and later. In
10.01 and 10.10, this logical volume is the same as the root filesystem. The boot filesystem
must be the first logical volume on the boot disk, and contiguous. Note that the filesystem must be
HFS (even in 11.00).
►Root filesystem. It is the filesystem /. In 10.20 and later, it is a separate logical volume from
the boot filesystem. It must be within the first 2 GB of the boot disk, and contiguous. It is usually
lvol3 in 10.20 and later.
►Primary swap. It must be a contiguous logical volume within the first 2 GB of the boot disk. It
is usually lvol2.
►Dump device. By default, it is the primary swap. If changed, it must be a contiguous logical
volume on the root volume group.
In addition to the structures of a regular LVM disk, the LVM boot disk has a BDRA (Boot Data
Reserved Area), used to initialise LVM. It contains pointers to the four important logical volumes
listed above.
An LVM boot disk also has a LIF file specific to LVM configurations, the LABEL
file. This file contains pointers to the boot and root filesystems.

/stand/rootconf contains the location of the root logical volume. It is used during
maintenance-mode boots to locate the root volume for volume groups with separate boot and
root volumes.


QUESTION

Which processes require the starting address of lvol1?


This starting address, which is the same for all HP-UX releases since 10.20, is
used when the LABEL file is corrupted or empty.

This is the case when we boot in LVM maintenance mode: the LABEL file is treated
as suspect, and its lvol1 entry is bypassed in favour of the address 2912, which is
hardcoded in the hpux program.

This address is also used by the offline recovery programs during the LIF
format. These programs rewrite the LABEL file and put 2912 in place of the
lvol1 address.


Boot process possible failures


Check points

PDC ISL AUTO HPUX LABEL BDRA lvol1 vmunix

ISL>ls

ISL>lsauto

ISL>

BCH>

ISL>hpux ll
Recovery-9 UXIE-SUPPORTvB

BCH Boot Command Handler


During a boot failure analysis, the check points can be verified with different
commands at one of the two break points (the BCH and ISL prompts):
PDC        If you reach the BCH prompt, the PDC and its selftests are OK.
ISL        If you reach the ISL prompt, you are sure that the ISL primary
           loader is OK and the ISL headers as well.
AUTO file  At the ISL prompt, you can list or rewrite the AUTO file.
HPUX       The secondary loader hpux can be verified at the ISL prompt using the ls
           command. This checks only that hpux exists, not that it is valid.
BDRA       At the ISL prompt, the ll command lets you check these four check
LABEL      points. If you are able to list files in the lvol1 boot file
lvol1      system, you are sure that BDRA and LABEL are OK for lvol1.
vmunix     If you are able to mount the boot file system, you are sure its
           super block is not corrupted. By listing the files in the boot file
           system, you can then verify that the kernel file exists and that its
           format is valid, as hpux puts a star at the end of every file name it
           recognises as a valid kernel file.


Boot process possible failures


If you can’t check this point

PDC ISL AUTO HPUX LABEL BDRA lvol1 vmunix

Hardware problem

Recovery-10 UXIE-SUPPORTvB

PDC Processor Dependent Code


If you are unable to reach the BCH prompt, there is a hardware failure (usually
denoted by an IODC tombstone).


Boot process possible failures


If you can’t check this point

PDC ISL AUTO HPUX LABEL BDRA lvol1 vmunix

                                  Fix the problem

Hardware problem (disk, iodc…)
Path not specified                Check and modify the PATH
ISL header corrupted              Reformat the LIF volume
ISL corrupted                     with the recovery CD
                                  or from an IGNITE server.

Recovery-11 UXIE-SUPPORTvB

ISL Initial System Loader

If you cannot boot from the primary disk, either the disk has failed, or the primary
path specified in the stable storage EEPROM does not point to the right disk,
or the LIF area is lost or corrupted. To recover the LIF, you have to use the
recovery media (recovery CD or an IGNITE server). Use the automatic recovery
procedure to rebuild only the LIF. You then need to reboot in LVM maintenance
mode (from the boot disk) and update the BDRA and LABEL file information
(using the lvlnboot command).


Boot process possible failures


If you can’t check this point

PDC ISL AUTO HPUX LABEL BDRA lvol1 vmunix

The file is empty Put the right info with the command

The file has bad info ISL>hpux set autofile “string”

Recovery-12 UXIE-SUPPORTvB

AUTO Autoboot file

For whatever reason, the AUTO file does not contain the correct information to boot the
system in multi-user mode.

To check this, boot manually by typing the boot command directly at the ISL
prompt. If that works, you can then set the contents of the AUTO file with the
command:

ISL> hpux set autofile "contents of the file"


QUIZ 4
The AUTO file is missing. How can you recreate it?

○ Boot in single user mode
○ Use the command “hpux set autofile”
○ Boot in quorum mode
○ Reformat the LIF
○ Re-install

Recovery-13 UXIE-SUPPORTvB

The AUTO file has been removed from the LIF volume.
Your system is unable to boot as it doesn’t know which secondary loader to use.

This file can be recreated under HP-UX with the #mkboot -a command, after a
manual boot.

Under ISL you can, as well, recreate it with the hpux set autofile command.


Boot process possible failures


If you can’t check this point

PDC ISL AUTO HPUX LABEL BDRA lvol1 vmunix

The file is missing          Reload the HPUX file
The file is corrupted        using the recovery CD
                             or an IGNITE server.

Recovery-14 UXIE-SUPPORTvB

HPUX vmunix operating system loader


NOTE - The automatic recovery procedure configures the AUTO file to start the kernel in LVM maintenance
mode. Thus, once the system is recovered, you must not forget to modify the AUTO file, either from the ISL
prompt or using the mkboot command from the HP-UX prompt.
The recovery main menu contains a number of options. To make a system whose LIF area is corrupted
bootable, you should first try to restore only the Bootlif. If the kernel is still good and you overwrite it with a
minimal kernel, you will lose time recovering the kernel. Thus the right option in this case is c.
The first screen asks for the boot path. It will be used to locate the device file (from the recovery media
mini-kernel).
The second screen asks for confirmation of the device file used to access the root filesystem. It should be
OK.
The third screen asks for confirmation of the content of the AUTO file. It should be OK.
The recovery procedure will run the mkboot command twice:
mkboot -b /dev/dsk/c1t2d0 -l -i ISL -i HPUX /dev/rdsk/c0t6d0
to download a fresh copy of the ISL and HPUX LIF files, and
mkboot -a "hpux -lm /stand/vmunix" /dev/rdsk/c0t6d0
to create the AUTO file (booting the kernel in LVM maintenance mode).
The fourth screen prompts for rebooting the system. Once you have checked the filesystems
and the system is recovered, do not forget to recreate an AUTO file, for example using the following HP-UX command:
mkboot -a "hpux" /dev/rdsk/c0t6d0


QUIZ 5
When you reformat the LIF volume using recovery media,
which files are restored?

☐ /stand/vmunix
☐ ISL
☐ /vmunix
☐ hpux

Recovery-15 UXIE-SUPPORTvB

When the LIF volume is reformatted by the offline recovery programs, they
rewrite the ISL headers, the ISL program and the hpux secondary loader, and the LABEL
file is rewritten with the lvol3 entry.
The AUTO file is modified to force the system to boot in LVM maintenance
mode, skipping the LABEL lvol1 entry.

This information must be set at the next reboot using the #lvlnboot command.


Different Boot Modes

ISL> hpux -is Boots HP-UX in single user mode

Fails if problem on root VG

ISL> hpux -lq Boots HP-UX ignoring quorum requirement

Can be used along with -is

ISL> hpux -lm Boots HP-UX in LVM maintenance mode

Fails if the BDRA is corrupt

ISL> hpux ll List files in the boot logical volume (lvol1)

Recovery-16 UXIE-SUPPORTvB

ISL Boot Commands


The hpux -is command boots up in single user mode. Only the root volume group is activated.
However, if the root volume group cannot be activated, or if the kernel cannot be located, this
command will fail.
The hpux -lq command (usually used together with the previous, hpux -is -lq) boots up ignoring
LVM quorum requirements. This command can be used if the root volume group has quorum
problems. However, it will fail if the boot disk itself is corrupted (bad LIF or BDRA).
The hpux -lm command boots up in LVM maintenance mode. This mode does even less than single user
mode: no volume group is activated. It ignores the LABEL file, using the fixed offset between the
beginning of the disk and lvol1 (the boot filesystem). However, if the BDRA is corrupted, this
command will fail in most cases.


Boot process possible failures


If you can’t check this point

PDC ISL AUTO HPUX LABEL BDRA lvol1 vmunix

LABEL file is corrupted      Try “hpux -lm”
LABEL contains wrong
information                  Reformat the LIF volume
                             using the recovery CD
                             or an IGNITE server.

Recovery-17 UXIE-SUPPORTvB

LABEL Pointers to the starting point and size of boot-relevant logical volumes

If only the entry for lvol1 is corrupted, you can boot while skipping this entry by using
LVM maintenance mode.

If the entry for lvol3 is corrupted, you must reset it using the offline recovery
programs, which will use the rootconf entry.


What is the LABEL file ?

• The LABEL file stores addresses of the:

– boot logical volume (lvol1)
– root logical volume (lvol3)
– primary swap (lvol2)

Recovery-18 UXIE-SUPPORTvB

The LABEL file contains pointers to the beginning of the main logical volumes:
the boot logical volume (lvol1), the root logical volume (lvol3) and the primary swap
device (lvol2).


LABEL file values after modification with lvlnboot

# xd LABEL
0000000 3c57 d2ff 8b7f 6a3c 0000 0001 0000 0000
0000010 0000 0000 0000 0b60 0001 0000 0000 0000 lvol1
0000020 0000 0000 0005 ab60 0001 e000 0000 0000 lvol3
0000030 0000 0000 0001 0b60 0004 a000 0000 0000 lvol2

First Address size

Recovery-19 UXIE-SUPPORTvB

The LABEL file contains information on the starting address and the size of the
main logical volumes.
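
As an illustration, the xd dump on the slide can be decoded with a short Python sketch. It assumes (the handout does not state the record layout) that each 16-byte record holds four big-endian 32-bit words, with the starting address in the second word and the size in the third, as the "First Address" and "size" captions suggest. The names LABEL_DUMP and decode_label are hypothetical, and the units of the values are not given in the handout.

```python
import struct

# Records copied from the xd output on the slide:
# offset, eight 16-bit hex words, then the volume name from the slide caption.
LABEL_DUMP = """\
0000010 0000 0000 0000 0b60 0001 0000 0000 0000 lvol1
0000020 0000 0000 0005 ab60 0001 e000 0000 0000 lvol3
0000030 0000 0000 0001 0b60 0004 a000 0000 0000 lvol2
"""


def decode_label(dump):
    """Return {name: (start, size)} for each record in the dump.

    Assumed layout (not confirmed by the handout): four big-endian
    32-bit words per record; word 1 is the start, word 2 is the size.
    """
    entries = {}
    for line in dump.splitlines():
        fields = line.split()
        name = fields[-1]
        raw = bytes.fromhex("".join(fields[1:9]))  # 8 words = 16 bytes
        _, start, size, _ = struct.unpack(">4I", raw)
        entries[name] = (start, size)
    return entries


entries = decode_label(LABEL_DUMP)
for name in ("lvol1", "lvol2", "lvol3"):
    start, size = entries[name]
    print(f"{name}: start={start} size={size}")
```

Under these assumptions lvol1 starts at 0x0b60, which is 2912 in decimal, the same value as the hardcoded lvol1 address discussed earlier in this module. Each volume also begins exactly where the previous one ends (lvol1, then lvol2, then lvol3), which is what one would expect for contiguous logical volumes at the start of the boot disk.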


Boot process possible failures


If you can’t check this point

PDC ISL AUTO HPUX LABEL BDRA lvol1 vmunix

BDRA is corrupted            Reformat the LIF volume
                             using the recovery CD
                             or an IGNITE server.

Recovery-20 UXIE-SUPPORTvB

BDRA Boot Data Reserved Area

The BDRA contains the physical disk addresses. Under HP-UX it can be recreated
using the #lvlnboot command or by doing a #vgcfgrestore.

If you have no access to a running HP-UX system, you must recreate it from the
information you provide, using the “Rebuild the Bootlif” option of the offline
Recovery program.

The system then reboots in LVM maintenance mode, and you use the #lvlnboot
command to set the BDRA to the customised values.


What is the BDRA ?

• BDRA stores addresses of the:

– physical disk
– root logical volume
– swap devices
– dump devices

Recovery-21 UXIE-SUPPORTvB

This is an example of the BDRA header:

Version of this BDRA                 3
Number of PVs in root VG             1
Root VGID (0 means null BDRA)        2002204383 1012320329
Root VG number                       64 0x0
Root LV[0]                           64 0x3
Swap LV[0]                           64 0x2
Dump LV[0]                           64 0x2
Start root VG PV list                0x82
Size root VG PV list                 20
Checksum root VG PV list             25

Root maint device minor[0]           0x1
Swap maint device minor[0]           0x2
Dump maint device minor[0]           0x3
BDRA flags
Checksum of Boot Data Record         39555
Printing Primary PVlist              8.8.5.0.


Boot process possible failures


If you can’t check this point

PDC ISL AUTO HPUX LABEL BDRA lvol1 vmunix

Lvol1 is corrupted           Fsck the lvol1
                             using the recovery CD
                             or an IGNITE server.

Recovery-22 UXIE-SUPPORTvB

If the boot logical volume is corrupted, you can recover it using the fsck
command from the offline recovery programs.

At the prompt, you can run the fsck command with an alternate super block
(16).


QUESTION

How are the recovery programs able to recreate the original BDRA?

Recovery-23 UXIE-SUPPORTvB

The recovery programs recreate the BDRA with just enough information for the
system to boot. In practice, this means we MUST boot in LVM maintenance mode.

The default hardcoded address is used to find the starting address of lvol1.
The /stand/rootconf file is used to find the starting address of lvol3.

After the boot, at the HP-UX prompt, you MUST recreate the entries in the
BDRA and in the LABEL file.


Boot process possible failures


If you can’t check this point

PDC ISL AUTO HPUX LABEL BDRA lvol1 vmunix

The file is not present      Use a copy from lvol1
(or corrupted)
                             Reload it from a backup
                             using the recovery CD

                             Regenerate it
                             using the recovery CD
                             or an IGNITE server.

Recovery-24 UXIE-SUPPORTvB

If the loss of the kernel (and its backup copy) occurred after a reconfiguration,
you can try to restore a kernel from a system backup.
This can be done using manual recovery. Once booted from the recovery media,
mount the boot filesystem, and use the backup tool (such as frecover) to restore
the kernel file.
Finally, the kernel can be rebuilt using the usual (manual) method. Once booted
from the recovery media, import and activate the root volume group, and mount the
root and boot filesystems.
You also need to mount the /usr filesystem. You can then use the mk_kernel
command, with the previous system file, a new (handwritten) file, or a copy
retrieved from a system backup.
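
The checkpoint-by-checkpoint analysis in this module can be summarised as a small lookup table. The Python sketch below is only a study aid: the checkpoint order and the remedies are taken from the slides, while the names CHECKPOINTS and first_failure are hypothetical.

```python
# Checkpoints in boot order, each paired with the remedy shown on the slides.
CHECKPOINTS = [
    ("PDC", "Hardware problem"),
    ("ISL", "Check and modify the PATH, or reformat the LIF volume"),
    ("AUTO", 'ISL> hpux set autofile "string"'),
    ("HPUX", "Reload the HPUX file from the recovery CD or an IGNITE server"),
    ("LABEL", "Try hpux -lm, or reformat the LIF volume"),
    ("BDRA", "Reformat the LIF volume"),
    ("lvol1", "fsck lvol1 from the recovery CD or an IGNITE server"),
    ("vmunix", "Use a copy from lvol1, reload it from a backup, or regenerate it"),
]


def first_failure(passed):
    """Return (checkpoint, remedy) for the first checkpoint that is not
    in `passed`, or None when every checkpoint has been verified."""
    for name, remedy in CHECKPOINTS:
        if name not in passed:
            return name, remedy
    return None


# Example: the ISL prompt is reached, but the AUTO file cannot be verified.
print(first_failure({"PDC", "ISL"}))
```

Because each checkpoint depends on the ones before it, only the first failing checkpoint is reported.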


QUESTION

Which kernel copy could we find and what is its quality?

Recovery-25 UXIE-SUPPORTvB

Most of the time we can find a vmunix.prev file, which is a backup copy made
before a kernel modification. This previous kernel file could be used to recover and
restore a good quality kernel.


Load recovery programs

• Boot from a Recovery medium


– Select No ISL interaction

• Select Run a recovery shell

• Start the recovery shell

• Use the right option

Recovery-26 UXIE-SUPPORTvB

You can boot from the recovery CD or issue a BO LAN.<ip address> command
at the BCH prompt.


Booting the Install/Core/Recovery source


(11.00) (11i)

Don’t stop the boot process at the ISL level

INSTALL
RECOVERY

Recovery-27 UXIE-SUPPORTvB

Select the second option in the main screen.


Starting a Recovery Shell

Recovery-28 UXIE-SUPPORTvB

Then select the first option: Run a recovery shell.


Automatic vs. Manual Recovery

AUTOMATIC

MANUAL (script)

Recovery-29 UXIE-SUPPORTvB

The r option provides access to automatic recovery options.
The x option provides access to a shell, which allows manual recovery.
The b option reboots the system. Always use this option after manual recovery
operations! It is the only way to ensure the result of recovery operations is not
lost.
The s and l options allow you, respectively, to search for a file and to load a file into the
memory-based filesystem. These options are useful for manual recovery, especially when
using recovery media on tape.
The c option provides some help to mount and chroot to the root filesystem on an
LVM boot disk.


QUESTION

What does manual recovery mean?

Recovery-30 UXIE-SUPPORTvB

Using the x option from the main menu, we reach a shell prompt.

Manual recovery means taking control of the disk drive and
using the programs and files stored on it.

To do this, launch the chroot_lvmdisk script and take control with the chroot
command. This is possible ONLY if the failure is located in the boot area and
if the root and boot file systems are OK.


QUIZ 6
We need to restore the HPUX secondary loader file.
Which option from the recovery main menu should you use?

○ s (search for a file)
○ b (reboot)
○ l (load a file)
○ r (recover)
○ x (exit to shell)

Recovery-31 UXIE-SUPPORTvB

The hpux secondary loader is missing or corrupted.

To restore it from the recovery media, use the r option in the main screen.
It will reformat the LIF volume and restore the hpux file from the recovery
media.
Your system will boot in LVM maintenance mode, and you need to rewrite the
BDRA, which has been reset in the same operation.


QUIZ 7
The LVM structure (PVRA or VGRA) has to be reloaded.
What is the best action to take from the recovery program?

○ Take the control of the disk
○ Reformat the LIF volume
○ Regenerate the kernel
○ Reboot
○ Don’t know

Recovery-32 UXIE-SUPPORTvB

If you need to restore an LVM area (PVRA or VGRA), its backup is stored on your
system disk. There is no way to find it from the memory-based HP-UX that is running.

You must take control of your disk drive and, from it, restore the
configuration using the vgcfgrestore command.


UXIE-SUPPORT
8 modules in virtual class: Module 1 Boot
Module 2 Recovery
Module 3 LVM
Module 4 Mirroring
Module 5 SD-UX Patch
Module 6 Swap Dump
Module 7 File Systems
Module 8 Ignite

LVM-1 UXIE-SUPPORTvB


OBJECTIVES

Reinforce the basic concepts of the Logical Volume
Manager. There will be LVM-related exercises in
all labs. Students will be able to define and deal with
all LVM files and areas.

LVM-2 UXIE-SUPPORTvB


What’s LVM?

• OSF product
– bundled with HP-UX from version 9.0
• Kernel-based disk management system
• Independent of disk use

LVM-3 UXIE-SUPPORTvB

• Logical Volume Manager (LVM) is not a filesystem, but rather a disk
management subsystem that offers access to file systems. It allows the system
manager much greater flexibility in controlling the size of the various "disk
partitions" or logical volumes. As with traditional disk sections, logical
volumes can be used for file systems, raw data, swap locations or dump
locations. A logical volume can be used in place of a disk section or a whole
disk in an HP-UX administration command.
• LVM was originally designed by the Open Software Foundation (OSF) in
conjunction with HP and other system vendors. The LVM code was ported to
HP-UX and introduced as standard on HP 9000 servers with the release of
HP-UX version 9.0. Beginning with the release of HP-UX version 10.0, LVM was
also implemented on the HP 9000 workstations.

Features of LVM
• Flexibility: Add disks to a system and add their space to existing file systems.
• Disk Spanning: File system size is no longer limited to a single disk size.
• Stability: LVM provides a mirroring capability to protect against disk failure.

Not all of these features are available by default or for every disk type.
However, the advantages of LVM overcome some classic problems that UNIX
systems have in a commercial environment.
NOTE: Be aware that mirroring (MirrorDisk/UX) is an add-on product to
HP-UX.


LVM terminology?

• Physical volume Volume group

• Extent Logical volume

PVRA VGRA
– smallest unit allocated.
– Physical volumes are broken down into
Physical extents.

Physical and logical extent sizes must be the same.

LVM-4 UXIE-SUPPORTvB

Physical Volume
You MUST use the entire disk for LVM. A disk must be prepared and converted to an LVM Physical
Volume (PV) with pvcreate (1m). The PV can then be added to an existing Volume Group (VG) or a new
VG can be created. A disk must be entirely under LVM control, so the device file is always used to
represent the entire disk when using pvcreate(1m) and other commands which refer to physical volumes.
Always use the 'raw' device file when using pvcreate(1m).
Volume Group
The Volume Group (VG) is the most fundamental concept in LVM. It comprises all the space on all the
disks made available to it, minus a small management overhead section on each disk. Logical Volumes
are allocated from this pool of space, generally without concern for the underlying disk(s) and their
specific attributes, sizes, locations, etc. A directory in /dev must exist for each volume group built. The
name can be any meaningful name. VGnn is the standard naming convention. Each disk in a volume
group has a record on it containing the volume group id, so the system can easily tell which disks belong
to which volume group.
Logical Volume
The Logical Volume (LV) is the unit of space allocated that behaves like a disk section. The important
difference is that the size of the LV is chosen by the System Administrator. The physical location of the
disk that is being used is managed by LVM; the System Administrator does not need to know exactly
where the disk and its data are located. Logical Volumes have a default name of lvolx where x is the
number of the logical volume in the volume group. However, any name may be chosen for the LV. It is
recommended that the default naming conventions be maintained. This is especially helpful when
exporting/importing volume groups.
Physical and Logical Extents
The smallest unit of space LVM can allocate is an extent. A physical volume is broken down into
Physical Extents (PE), and all its physical extents are made available when the physical volume is added
to the volume group. When a logical volume is created, the size is specified. LVM will allocate enough
sequentially numbered Logical Extents (LE) to make up the space requested. However, each logical
extent may map onto a physical extent from any physical volume in the whole volume group. Extent
sizes can range from 1 to 256MB, in powers of 2. The choice of extent size is important because it
is the smallest unit that can be added to a logical volume. Also, the disk overhead increases with
smaller extents.
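
The effect of the extent size on allocation can be worked through with a few lines of Python. This is a minimal sketch that assumes a requested size is rounded up to a whole number of extents, which follows from the extent being the smallest unit LVM can allocate. The function name logical_extents_needed is hypothetical.

```python
def logical_extents_needed(requested_mb, extent_size_mb):
    """Number of logical extents needed to hold `requested_mb` megabytes.

    The extent size must be a power of 2 between 1 and 256 MB, as stated
    in the text above. The request is rounded up to whole extents.
    """
    is_power_of_two = extent_size_mb > 0 and extent_size_mb & (extent_size_mb - 1) == 0
    if not (is_power_of_two and 1 <= extent_size_mb <= 256):
        raise ValueError("extent size must be a power of 2 from 1 to 256 MB")
    if requested_mb <= 0:
        raise ValueError("requested size must be positive")
    return -(-requested_mb // extent_size_mb)  # ceiling division


# A 100 MB request with two different extent sizes.
for pe_size in (4, 32):
    count = logical_extents_needed(100, pe_size)
    print(f"PE size {pe_size} MB: {count} extents, {count * pe_size} MB allocated")
```

With 4 MB extents the 100 MB request fits exactly in 25 extents. With 32 MB extents it needs 4 extents, so 128 MB are allocated. This illustrates why the choice of extent size matters.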


LVM Areas

Bootable disk                Non Bootable disk
LIF Volume header            PVRA
PVRA                         VGRA
BDRA                         User Data Area
LIF Volume                   Bad Block Pool
VGRA
User Data Area
Bad Block Pool

PVRA:
• PV ID number
• VG ID number
• PE size
• PV size
• Bad Block Directory
• Pointers to other disk areas

VGRA:
• VGDA (description)
• VGSA (PE status)
• MCR

LVM-5 UXIE-SUPPORTvB

A disk is made bootable with pvcreate -B. Converting a non-bootable disk to a bootable disk
destroys all the data on it. If you created a mirror disk without the -B option, all the data is
mirrored but you cannot boot from this PV.
To convert it, remove all the mirror copies of the lvols, remove the PV from the VG, recreate it as a
bootable disk with pvcreate -B, extend the VG and recreate all the mirror lvols (boot, swap, root, ...).
The Data Structures area consists of the Physical Volume Reserved Area (PVRA) and the Volume
Group Reserved Area (VGRA). The PVRA data remains static. It is populated by the pvcreate
command. The VGRA data contains static and dynamic areas. It is populated by the vgcreate and/or
vgextend commands. For a boot disk, about 2.9MB are required in the Data Structures area. For a
non-boot disk, about 400KB are required.
The data structures in use on the disk hold the whole configuration together. They are vital and it is
often disastrous if they are lost. So never write directly to the physical volume once it is under LVM control.
The bad block pool, which is created by pvcreate, is optionally reserved by the
administrator to provide alternate locations for blocks which go bad on the disk. In advance of using
the disk, blocks which are known to be bad can be recorded here.
PVRA is created with the pvcreate (1m) command and written to by other LVM commands. It must
be built BEFORE the disk can be used by LVM. The PVRA also holds a key to 'prove' easily that this
is an LVM disk. If the PVRA is lost, then it is often impossible to recover the data on the disk.
The VGRA is created when the volume group is built or extended. When the VGRA is built, the
remaining fields in the PVRA are filled in as well. Every disk in the volume group has a map of the
logical to physical extents for every logical volume defined, even if none of the physical extents touch
this physical volume. There is also space reserved to hold this map even if the maximum number of
physical volumes is added to the volume group sometime later.
The VGSA is dynamic, recording which physical volumes are currently available; that is, which are
powered up and working.


LVM command
• Commands are in /usr/sbin or /sbin (for single mode)
• Structure [object][action]
• object
– pv Physical disk
– vg Volume group
– lv Logical volume
• action
– change, display, create, remove
– extend, reduce, sync, etc.

LVM-6 UXIE-SUPPORTvB

Every LVM command name combines one object and one action.
The object can be a physical disk, a volume group or a logical volume.
The action can be any of the actions listed on the slide, from change to sync.
This produces commands such as:

pvchange
vgdisplay
lvcreate
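
The [object][action] naming scheme can be illustrated by splitting a command name back into its two parts. The Python sketch below uses only the object prefixes listed on the slide. The function name split_lvm_command is hypothetical, and the sketch does not check that a given object and action combination exists as a real command.

```python
# Object prefixes and their meaning, as listed on the slide.
OBJECTS = {
    "pv": "Physical disk",
    "vg": "Volume group",
    "lv": "Logical volume",
}


def split_lvm_command(name):
    """Split an LVM command name into (object description, action).

    Raises ValueError when the name does not start with one of the
    three object prefixes or has no action part.
    """
    prefix, action = name[:2], name[2:]
    if prefix not in OBJECTS or not action:
        raise ValueError(f"{name!r} does not follow the [object][action] pattern")
    return OBJECTS[prefix], action


for command in ("pvchange", "vgdisplay", "lvcreate"):
    print(command, "->", split_lvm_command(command))
```

A name such as newfs is rejected, because it is a file system command and does not follow the LVM naming pattern.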


Logical Volume Characteristics


• Block of allocated extents.

• Extents could be contiguous or non-contiguous.

• Lvol1 lvol2 and lvol3 MUST be contiguous.

• A Bad Block Pool is associated for any defective block.
(except for lvol1 lvol2 and lvol3)

• The default naming convention is Lvolx
(with x as the logical volume number)

LVM-7 UXIE-SUPPORTvB

Logical Volume
The Logical Volume (LV) is the unit of space allocated that behaves like a disk section. The
important difference is that the size of the LV is chosen by the System Administrator. The
physical location of the disk that is being used is managed by LVM; the System Administrator
does not need to know exactly where the disk and its data are located. Logical Volumes have a
default name of lvolx where x is the number of the logical volume in the volume group.
However, any name may be chosen for the LV. It is recommended that the default naming
conventions be maintained. This is especially helpful when exporting/importing volume
groups.


LVM commands

[Diagram: LVM commands positioned against the System Areas and System Files they modify.
Commands shown: VGREDUCE, VGEXTEND, VGCREATE, LVREMOVE, LVCREATE, VGIMPORT, VGEXPORT,
VGREMOVE, NEWFS, PVCREATE, LVEXTEND]

LVM-8 UXIE-SUPPORTvB

All LVM commands act on disk areas (such as PVRA or VGRA) or HP-UX files
in the system.
They can modify only disk areas, or only files or the two at the same time.
It is very important to know their action in order to prevent any mistakes.


LVM files

/etc/lvmconf/vg00.conf /etc/lvmconf/vg00.conf.old

/etc/lvmtab
System
Files
/dev/vgxx/group

/dev/vgxx/lvoly
/dev/vgxx/rlvoly

LVM-9 UXIE-SUPPORTvB

The LVM Control File is /etc/lvmtab. This file holds the LVM structure together by recording
the device file(s) associated with each disk in a volume group. This file must exist at boot time so
that /sbin/lvmrc can determine which volume group(s) should be activated and copy the relevant
data into memory. This data is also used to complete sanity checks between tables in the kernel
and the reserved areas on disk. This information is updated any time a volume group is created,
extended, or reduced.
/etc/lvmtab is a binary file. It can only be viewed using the strings command (e.g., # strings
/etc/lvmtab). The above slide shows a sample output of this command. Each volume group ID is
listed, followed by the device files contained in that volume group.
In the event that this file is destroyed, it can be recreated with information available in the
reserved areas on each physical volume (PVRA & VGRA).
The command, vgscan, can be used to rebuild the /etc/lvmtab file, using the information stored in
the PVRA and VGRA on each disk. This command should be used with caution.
Backing up LVM Structures
Normal backup utilities back up files in a file system, but they do not back up the LVM structures.
If the disk must be replaced, the LVM configuration information must be restored before the files
can be restored.


Recover LVM files?

• /etc/lvmtab          #vgscan

• /dev/vgxx/group      #mknod

• structure            #vgcfgrestore

• device files         #vgimport
                       #mknod

LVM-10 UXIE-SUPPORTvB

The LVM Control File is /etc/lvmtab. This file holds the LVM structure together by
recording the device file(s) associated with each disk in a volume group. This file must
exist at boot time so that /sbin/lvmrc can determine which volume group(s) should be
activated and copy the relevant data into memory. This data is also used to complete sanity
checks between tables in the kernel and the reserved areas on disk. This information is
updated any time a volume group is created, extended, or reduced.
/etc/lvmtab is a binary file. It can only be viewed using the strings command (e.g., #
strings /etc/lvmtab). The above slide shows a sample output of this command. Each
volume group ID is listed, followed by the device files contained in that volume group.

In the event that this file is destroyed, it can be recreated with information available in the
reserved areas on each physical volume (PVRA & VGRA). The command, vgscan, can be
used to rebuild the /etc/lvmtab file, using the information stored in the PVRA and VGRA
on each disk. This command should be used with caution.


VGSCAN command
# vgscan -v

c0t1d0 c0t2d0 c0t3d0

PV1 PV2 PV3

vg01
/etc/lvmtab

# strings /etc/lvmtab
vg01
/dev/dsk/c0t1d0
/dev/dsk/c0t2d0
/dev/dsk/c0t3d0

LVM-11 UXIE-SUPPORTvB

If lvmtab is corrupted, no volume group can be activated.

VGSCAN will recreate LVMTAB.

Recovering the LVM Control File
/etc/lvmtab holds the LVM structure together by recording the
device file associated with each Physical Volume in a Volume Group. This must exist at boot time
so that the script, /sbin/lvmrc, can start up each volume group,
copying the relevant data into memory. If /etc/lvmtab is lost, it can be rebuilt using the
vgscan(1m) command.
/etc/lvmtab is not an ASCII file. It can be read using the strings command. It is used primarily at
boot time, but it is also used to sanity-check commands.
This information needs to be kept in non-volatile storage so that when the system boots and tries
to load LVM information into memory, it knows which disks to look at. But when the system is
up, should this file get lost, this data can be reconstructed. However, to do this would require
reading the VGRA of every disk on the system and correlating this input with /dev/*/group and
kernel memory information.

Preview mode (the -p option of vgscan) is used to debug or diagnose what is wrong with the LVM
configuration. The vgscan command should be reserved for use in catastrophic failures (such as
the loss of /etc/lvmtab).
If an /etc/lvmtab file already exists, it should be moved to /etc/lvmtab.old before running this
command to keep it safe.
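
The structure of the strings /etc/lvmtab output shown on the slide (a volume group name followed by its device files) can be made explicit with a small parser. This Python sketch assumes the output looks like the slide sample, with device files under /dev/dsk/ and any other non-empty line treated as a volume group name. Real output may contain other strings, and the function name parse_lvmtab_strings is hypothetical.

```python
# Sample copied from the "# strings /etc/lvmtab" output on the slide.
SAMPLE = """\
vg01
/dev/dsk/c0t1d0
/dev/dsk/c0t2d0
/dev/dsk/c0t3d0
"""


def parse_lvmtab_strings(text):
    """Group device files under the volume group name that precedes them.

    Assumption: lines starting with /dev/dsk/ are physical volume device
    files; any other non-empty line starts a new volume group.
    """
    groups = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("/dev/dsk/"):
            if current is None:
                raise ValueError("device file listed before any volume group")
            groups[current].append(line)
        else:
            current = line
            groups[current] = []
    return groups


print(parse_lvmtab_strings(SAMPLE))
```

The resulting mapping shows at a glance which disks are recorded for each volume group, which is the information /sbin/lvmrc relies on at boot time.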


Recover the structure?

• Commands (HP-UX)

#vgcfgbackup

#vgcfgrestore

LVM-12 UXIE-SUPPORTvB

Commands that modify the LVM configuration automatically invoke vgcfgbackup(1m) in HP-UX
version 10.0 or later. The automatic backup is always saved in /etc/lvmconf/vgname.conf. If
chosen, vgcfgbackup(1m) can be run at the command line and the backup file can be saved to any
file specified.
It is strongly recommended that any alternate configuration backup file be created in the root file
system (as is the case with the default path name). This facilitates easy volume group recovery
during maintenance mode, such as after a system crash.


Create a Bootable disk from a blank disk


pvcreate -B
Creates the PVRA, Bad Block Pool and allocates
space for the LIF Area and the BDRA.

mkboot
Creates a standard LIF area from a file image

vgcfgrestore
Restores PVRA, BDRA, VGRA
from a file image

lvlnboot
Updates LABEL file and BDRA

[Diagram: disk layout showing LIF header, PVRA, BDRA, LIF (LABEL), VGRA, Bad Block Pool]

LVM-13 UXIE-SUPPORTvB

A disk is made bootable with pvcreate -B. Converting a non-bootable disk to a
bootable disk destroys all the data on it.
If you created a mirror disk without the -B option, all the data is mirrored but you cannot boot
from this PV.
To convert it, remove all the mirror copies of the lvols, remove the PV from the VG, recreate it as a
bootable disk with pvcreate -B, extend the VG and recreate all the mirror lvols (boot, swap, root, ...).


LVM Data Structure Backup


# vgcfgbackup vg01

vg01.conf

PVRA PVRA PVRA


VGRA VGRA VGRA

lvol 1 lvol 1 lvol 1

lvol 2 lvol 3 lvol 4


Bad Block Pool Bad Block Pool Bad Block Pool

Disk 1 Disk 2 Disk 3


VG01

LVM-14 UXIE-SUPPORTvB

The vgcfgbackup command, which runs automatically each time an LVM
structure modification occurs, backs up the PVRA and the VGRA of all disk
drives in the volume group.

The backup file is stored as /etc/lvmconf/vg01.conf.

The previous backup file is renamed vg01.conf.old.

Any area can be restored from this file using the vgcfgrestore command.


LVM Data Structure Restore (Disk 2 LVM structures corrupted)

        disk1            disk2            disk3

        PVRA             PVRA             PVRA
        VGRA             VGRA             VGRA
        LVOL2            LVOL3            LVOL4
        LVOL1            LVOL1            LVOL1
        Bad Block Pool   Bad Block Pool   Bad Block Pool

                         VG01

# vgcfgrestore -n vg01 /dev/rdsk/c0t2d0        (reads /etc/lvmconf/vg01.conf)


For example:
disk2 in vg01 fails.
You replace it with a new blank disk. To restore the backed-up LVM structure to
disk2, use the command:

# vgcfgrestore -n vg01 /dev/rdsk/c0t2d0


QUESTION

Is it possible to activate the volume group VG01?


To be activated, a volume group must have quorum:
this means more than 50% of the configured disks must be present.

In the previous example, 2 of the 3 configured disks were present.

This means our quorum was 66%. As this is greater than 50%, the volume group
could be activated.

It is possible to activate a volume group while skipping the quorum check.

To do so, use the -q n option of the vgchange command.
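
The quorum rule can be sketched with integer arithmetic. The function names has_quorum and quorum_pct are hypothetical helpers, and the script only prints the vgchange command it would suggest for vg01.

```shell
#!/bin/sh
# Quorum sketch using integer arithmetic only.
# has_quorum PRESENT CONFIGURED succeeds when more than 50% of the disks are present.
has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

# quorum_pct PRESENT CONFIGURED prints the integer percentage, rounded down.
quorum_pct() {
    echo $(( $1 * 100 / $2 ))
}

present=2; configured=3
echo "quorum: $(quorum_pct $present $configured)%"
if has_quorum $present $configured; then
    echo "vgchange -a y /dev/vg01"         # normal activation
else
    echo "vgchange -a y -q n /dev/vg01"    # activation without the quorum check
fi
```

Comparing 2 x present against configured avoids fractions, so exactly half of the disks (for example 1 of 2) is correctly treated as no quorum.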


QUORUM?

• LVM rejects activation of the volume group if the quorum is not reached.

Quorum: > 50% of physical volumes in volume group

• /sbin/lvmrc performs the check for quorum during the boot process.


To help preserve the consistency of the disk data structures, LVM will reject changes
to a volume group structure (e.g., adding a new logical volume) if more than half of
the expected physical volumes are missing for any reason.
This check is called a check for quorum.
In order to activate a volume group, more than half of the disks in the volume group
must be available. Once a volume group is active, it passes the quorum check if at
least half the disks are present. If during run time a disk fails and causes quorum to be
lost, LVM alerts with a message to the console, but keeps the volume group active.
Until quorum is re-established, no changes to the volume group will be allowed.

The vgchange command is used to change the status of a volume group (active or inactive).


QUIZ 8

Is the data of lvol1 on disk1 still valid?

[ ] YES
[ ] NO
[ ] MAYBE
[ ] DON’T KNOW

[Diagram: VG01 with Disk 1, Disk 2 and Disk 3. Each disk holds a PVRA, a VGRA and a Bad Block Pool. lvol 1 spans all three disks; lvol 2 is on Disk 1, lvol 3 on Disk 2 and lvol 4 on Disk 3.]


Disk2 in the volume group vg01 fails.

As lvol1 is spread over the three disks, its data is no longer valid, because you cannot
mount the file system.


QUIZ 9

Is the data of lvol4 on disk3 still valid?

[ ] YES
[ ] NO
[ ] MAYBE
[ ] DON’T KNOW

[Diagram: VG01 layout, as on the LVM Data Structure Backup slide]


lvol4 is located only on disk3, so it is not affected by the failure of disk2.
Its data is still valid.


QUIZ 10

What is the quorum value of VG01?

[ ] 100%
[ ] 66%
[ ] 50%
[ ] 33%

[Diagram: VG01 layout, as on the LVM Data Structure Backup slide]


The quorum of this volume group is:

    number of disks present        2
    ---------------------------- = --- = 66%
    number of disks configured     3


QUIZ 11

Could you activate VG01?

[ ] YES
[ ] NO
[ ] MAYBE
[ ] DON’T KNOW

[Diagram: VG01 layout, as on the LVM Data Structure Backup slide]


The quorum of the volume group vg01 is 66%, which is greater than 50%,
so the volume group can be activated.


QUIZ 12

The LVM structure has been restored on disk2. What must be done next to
recover the system?

[ ] Re-create lvol1, lvol2, lvol3 and lvol4.
[ ] Re-create all file systems.
[ ] Re-create only file systems from Disk2 (lvol1 & lvol3).

[Diagram: VG01 layout, as on the LVM Data Structure Backup slide]


The LVM structure of disk2 has been restored with the vgcfgrestore command, so the
PVRA and VGRA have been restored.
The logical volume layout is restored as well, since it is part of the VGRA (VGDA).

You now need to re-create all file systems on disk2 and restore all data files in
these file systems from a backup tape.
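
The whole recovery of disk2 can be sketched as a dry run. The names are assumptions for illustration: vg01, disk c0t2d0, VxFS file systems, mount points under /mnt, and a tape written with fbackup on the default tape drive. The function only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the recovery of disk2; prints commands only.
# Assumptions: vg01, disk c0t2d0, VxFS file systems, mount points under /mnt,
# and a tape backup made with fbackup.
plan_disk_recovery() {
    vg=$1; disk=$2; shift 2
    echo "vgcfgrestore -n /dev/$vg /dev/rdsk/$disk"    # restore the PVRA and VGRA
    echo "vgchange -a y /dev/$vg"                      # reactivate the volume group
    for lv in "$@"; do
        echo "newfs -F vxfs /dev/$vg/r$lv"             # re-create the file system
        echo "mount /dev/$vg/$lv /mnt/$lv"             # hypothetical mount point
    done
    echo "frecover -x -f /dev/rmt/0m"                  # restore the files from tape
}

plan_disk_recovery vg01 c0t2d0 lvol1 lvol3
```

Only lvol1 and lvol3 are passed in, because those are the logical volumes with extents on disk2 in the example.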


What are LVM’s limits?


                              Minimum   Default   Maximum

Volume Group                     1        10        256
Physical Volume                  1        16        255
Logical Volume (per VG)                             255
Physical Extent Size (MB)        1         4        256
Max Extents per PV               1      5120      65535

Boot area limit      PV number * Extent number * Extent size
2.9MB                16 * 5120 * 4MB

Maximum disk size is 25GB


The table shows the minimum, maximum and default values configured
in the kernel.

These default values correspond to the maximum allowed parameters for the
VGRA in the boot area. As the boot area has a limited size, you can calculate
the maximum disk size supported by the defaults, which is around 25GB.

If you have a bigger disk, you must modify your default kernel parameters
in order to cover the whole disk size.

During an HP-UX installation, the system automatically calculates and
proposes correct values. But if you later install a new disk larger
than the one used at installation time, you need to take this into account.


Moving Disk
• Disks need to be moved to a different system or to
a different interface.
The process is:
• Deactivate the volume group
• export the volume group
– /etc/lvmtab updated
– kernel memory updated
– the volume group device files deleted
• import the volume group

vgexport does not destroy the data on the disk.



Disks may need to be moved to different hardware locations on the same system or to a different
system. In both cases, the volume group definition in /etc/lvmtab must change.

vgexport removes the volume group definition from the system completely by updating /etc/lvmtab
and kernel memory and deleting the volume group device files.
• The logical volumes must be unmounted before using vgchange and vgexport.
• The volume group must first be deactivated with vgchange(1m).
• There is an option to create a map file (an ASCII file) containing the names and numbers of the
logical volumes, for use when re-importing the volume group.

vgimport(1m) builds a volume group in the same way as vgcreate(1m). As with vgcreate(1m),
the group special file in the /dev/vgxx directory must already exist. The VG number does not have
to be the same as it was on the exporting system, as long as it is unique on the system. If a map file
is available, vgimport creates all the logical volumes with their correct names.
Otherwise, default names are assigned (i.e. lvol1, lvol2, etc.).
This is a good method of moving disks to another machine, or just to another
controller within the same machine (for performance enhancements).
NOTE: vgexport does not destroy the data on the disk.

The disks can be moved and vgimport(1m) can be used to re-add the volume group definition.
The root volume group (i.e., vg00) cannot be exported.
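
The export and import steps can be sketched as a dry run that prints the commands for both systems. The map file path, the minor number 0x010000 and the new device path are assumptions for illustration; the minor number only has to be unique on the importing system.

```shell
#!/bin/sh
# Dry-run sketch of moving vg01 to another system; prints commands only.
# Assumptions: map file path, minor number 0x010000 and the new device path.
plan_vg_move() {
    vg=$1; map=$2; minor=$3; shift 3
    echo "# on the old system (after unmounting the file systems)"
    echo "vgchange -a n /dev/$vg"              # deactivate the volume group
    echo "vgexport -m $map /dev/$vg"           # -m: write the lvol names to a map file
    echo "# on the new system"
    echo "mkdir /dev/$vg"
    echo "mknod /dev/$vg/group c 64 $minor"    # the group special file must exist first
    echo "vgimport -m $map /dev/$vg $*"        # remaining arguments are the PV paths
    echo "vgchange -a y /dev/$vg"              # activate the imported volume group
}

plan_vg_move vg01 /tmp/vg01.map 0x010000 /dev/dsk/c1t2d0
```

The map file has to be copied to the new system together with the disks; otherwise vgimport falls back to the default names lvol1, lvol2, and so on.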


QUESTION

What must be done to change these values on VG01?

• PE size
• Number of PE
• Max number of PV


To manage a bigger disk, you have to change some of these parameters.
To do so, you must re-create the volume group with the vgcreate command and the new
parameters.

Before doing so, back up all files to tape; then re-create the volume group and
restore your files.
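
As a rough sketch of how the new parameters might be chosen, the helper below computes how many extents are needed to cover a disk and prints a matching vgcreate command. The disk size (36GB), the PE size (8MB), the device path and the helper name extents_needed are assumptions for illustration; preparatory steps such as pvcreate and creating the group file are left out.

```shell
#!/bin/sh
# Sketch: number of extents needed to cover a disk, rounded up.
# Usage: extents_needed DISK_SIZE_MB PE_SIZE_MB
extents_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

disk_mb=36864      # assumed disk size in MB
pe_mb=8            # assumed PE size in MB
max_pe=$(extents_needed $disk_mb $pe_mb)
# -s: PE size in MB, -e: max extents per PV, -p: max number of PVs
echo "vgcreate -s $pe_mb -e $max_pe -p 16 /dev/vg01 /dev/dsk/c0t2d0"
```

Rounding up matters: a disk that is not an exact multiple of the PE size still needs one more extent slot than the integer quotient to be fully described.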


UXIE-SUPPORT
8 modules in virtual class: Module 1 Boot
Module 2 Recovery
Module 3 LVM
Module 4 Mirroring
Module 5 SD-UX Patch
Module 6 Swap Dump
Module 7 File Systems
Module 8 Ignite


OBJECTIVES

Students will understand the main mirroring features, so that they can
configure mirroring to meet customer needs.
Students will be able to verify and modify the
current configuration.


Why mirror Data?

•Safety
•Availability
•Performance
•Less Downtime
• Shared LVM


Why Mirror Data?

Mirroring provides the ability to store identical copies of data on separate disks. The
advantages of mirroring are as follows:

•Increased safety of data. Data is not lost if a single disk fails or because of media errors
local to one part of a disk.
•Increased system availability. The system can be kept running even if either the root or
primary swap volumes fail when these Logical Volumes are mirrored.
•Improved performance. Hardware can read data from the most convenient mirror copy.
However, writes may take longer.
•Less administrative downtime. Backups can be done on one copy of the data while the
other copy is still in use.

Prior to HP-UX 11i, LVM mirroring supported the non-SLVM environment only. In other
words, the disks were only accessible by a single system and could not be shared by multiple
hosts.
Beginning with HP-UX 11i, LVM mirroring automatically enables SLVM for a two-node
environment supporting both non-SLVM and SLVM environments. All LVM systems can
mirror their data on disk, and the mirrored copy of the data is also accessible from a two-node
cluster.

SLVM mirroring is NOT supported for striped logical volumes and is ONLY supported in a
two-node environment. SLVM mirroring does not support spared disks in a shared volume
group. You should disable sparing by using the pvchange -z n <path> command on shared
volume disks.


What is MirrorDisk/UX?

•Unbundled data mirroring package.

•Only Logical Volumes are mirrored.

•Up to two mirrored copies.


What is MirrorDisk/UX?

Data mirroring is provided by an additional purchasable software product called
MirrorDisk/UX. Systems where the data held on disk is critical can benefit from
the protection gained by data mirroring. Data mirroring (with strict allocation) allows
many of the system’s resources to be replicated, so that the system can survive the
failure of one of these replicated resources.
MirrorDisk/UX software prevents data loss due to disk failures by maintaining up to
three copies of data on separate disks. Applications can continue to access data even
after a single disk failure. In addition, on-line backups can be performed to avoid user
and application disruption.
Examples:
A system with two disk interfaces, two sets of cabling, and two sets of disks that is set
up in a mirrored configuration would be able to survive the failure of half the system.
If a disk is lost temporarily, such as through a power failure, the system would
continue to operate. When the disk is restored to the system, it would automatically
catch up with any data written to the other parts of the mirror while it was down.
Also, if a bad block appears on a disk, LVM would be able to spare it out, and
reproduce the data held on that block by reading it from the mirror copy.
For on-line backup solutions, the mirrored configuration can be split and one half can
then continue to work while the other half is backed up. Once the backup is complete,
the part of the mirror that was split off for the backup can be resynchronised with the
main system.
NOTE: Mirroring will not protect against writing bad data to the disk. If the data
written to the disk is bad, it will still go to all of the mirror copies.


Mirror Policies (disk space allocation)

ONE-WAY MIRRORING (option -m 1)
    Logical Extent -> Physical Extent (original) + Physical Extent (copy)

TWO-WAY MIRRORING (option -m 2)
    Logical Extent -> Physical Extent (original) + two Physical Extents (copies)

Implementing Mirroring
The LVM driver already has to keep track of I/O requests to a Logical Extent and map the request
onto a Physical Extent on the right Physical Volume.
When mirroring is required, LVM just maintains two mappings. Every Logical Extent now points
to two Physical Extents so the I/O will be scheduled to both places. In fact, LVM extends the
functionality of traditional mirroring products by allowing each Logical Extent to map onto a
maximum of three Physical Extents, providing three copies of each I/O and enhanced protection
against data loss.
The number of mirror copies of data that will be kept, in addition to the original, has the possible
value of 0 (no mirroring), 1 (one mirror copy), or 2 (two mirror copies). The actual mirroring is
achieved by allocating a number (1, 2, or 3) of Physical Extents from the Volume Group to the
Logical Extent from the Logical Volume. Each Physical Extent in the Logical Extent will contain
identical data.
Three-way mirroring provides even higher availability than single mirroring, but it requires more
disk space. For example, with one mirror, when backing up the Logical Volume, the mirror copy
will not be available during the backup. With doubly-mirrored data, one copy can be taken
off-line for the backup while the other copy remains on-line in the event something happens to the
original data.
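
The disk-space cost of the -m value can be sketched with a small calculation: each Logical Extent uses one Physical Extent for the original plus one per mirror copy. The helper name mirrored_space_mb and the 500MB volume size are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: physical space consumed by a mirrored logical volume.
# Usage: mirrored_space_mb LV_SIZE_MB MIRROR_COPIES   (copies = 0, 1 or 2)
mirrored_space_mb() {
    case $2 in
        0|1|2) echo $(( $1 * ($2 + 1) )) ;;    # original + mirror copies
        *) echo "mirror copies must be 0, 1 or 2" >&2; return 1 ;;
    esac
}

mirrored_space_mb 500 1    # -m 1: two Physical Extents per Logical Extent
mirrored_space_mb 500 2    # -m 2: three Physical Extents per Logical Extent
```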


MirrorDisk/UX Policies

These options apply only to mirrored logical volumes.

•Allocation        Where to store it?    Strict, Non-strict, Group
•Synchronisation   How to update?        MWC, NOMWC, NONE
•Scheduling        How to write?         Parallel (default), Sequential


What Causes Inconsistencies in Mirrors?

Disk writes can’t always complete simultaneously. Writes to mirror copies that are in progress when the
system crashes raise concerns about inconsistency. These questions could be asked:
•Have all the writes actually been sent to the disk, or are they only queued by the disk driver?
•Have the writes completed, but LVM hasn’t been notified?
•Did a partial write of the data to the disk occur?
Unfortunately, there are no clear answers available to these questions. If writes were in progress to
several mirrors, the answers could be different for each Physical Volume (i.e., one disk has received no
writes, another received a partial write, and a third received a complete write). At this point, it can’t be
assumed that the disks are consistent. After rebooting, LVM will know that the Volume Group was not
cleanly closed (a crash occurred). At this point, LVM must guarantee that the mirrors are consistent
copies of one another (excluding any stale extents). There are three options to guarantee the
consistency of the mirror copies once again.
The first Recovery Method option is to maintain a list of extents that are currently being written to
each mirror copy. This list has to be kept on disk. Having it in memory when the system crashes would
be pointless. The list is updated before any writing to mirror copies begins. Writing to an extent cannot
begin unless that extent is already on the list and the list has already been written to the disk. After a
system crash, every extent on the list is considered “inconsistent”; therefore, mirror recovery begins.
This process involves selecting one fresh extent (it does not matter to LVM which one is selected) and
marking all the remaining extents “stale”. The fresh extent is read from and its data is written to the
other extents. This recovery method is very efficient, but there is runtime overhead to maintain the list.
This list is referred to as the Mirror Write Cache (MWC). When the System Administrator decides
which crash recovery method should be used to recover mirror copies, he/she must decide whether or
not to use the MWC and accept the overhead it incurs in order to get fast mirror recovery.


Mirror Policies (disk space allocation)

Strict (default)   Original and copy of the logical volume in the same VG.
Group              Original and copy in the same VG, but in different PVGs.
                   (PVG: subgroup of disk drives associated by hardware link)
Non-strict         Original and copy of the logical volume on the same disk.

Keeping mirror copies on separate disks is known as strict allocation. There are three options for
strictness: strict, non-strict, and PVG-strict. Strict allocation is the default when setting up
mirroring. With strict allocation, LVM will allocate each Physical Extent in a Logical Extent from
a different Physical Volume. HP recommends this because non-strict allocation will not provide
continuous availability in the event of a disk failure.
With non-strict allocation, it is possible for LVM to allocate more than one Physical Extent from
a Physical Volume to the same Logical Extent. This option does not provide any protection from
disk failure, but should allow smooth bad block sparing and online backups. This option will very
likely have a serious impact on overall system performance since during writes the disk head will
be forced to move from one Physical Extent to another in quick succession as it writes to all
mirrored copies.
The third strictness option, PVG-strict, demands that not only must all Physical Extents within a
Logical Extent come from different disk drives, they must also come from different Physical
Volume Groups (PVGs). This allows the System Administrator to set up disks within PVGs to
reflect the disk interfaces, power supplies, and cabling so as to maximise the system’s ability to
survive the failure of any of these components. This is the option that makes it possible to force
the Physical Extents of each mirror copy onto a Physical Volume on another controller.
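
A PVG-strict setup can be sketched as follows: one PVG per controller is declared in /etc/lvmpvg, and the logical volume is then created with the allocation policy set to g. The volume group vg01, the PVG names PVG0 and PVG1, the disk paths and the 100MB size are assumptions for illustration. The script only prints the file layout and the command.

```shell
#!/bin/sh
# Sketch: prints an /etc/lvmpvg layout with one PVG per controller,
# then the lvcreate command that requests PVG-strict allocation.
# Assumptions: vg01, PVG names PVG0/PVG1, disks on controllers c0 and c1.
print_lvmpvg() {
    cat <<EOF
VG /dev/vg01
PVG PVG0
/dev/dsk/c0t5d0
PVG PVG1
/dev/dsk/c1t5d0
EOF
}

print_lvmpvg
echo "lvcreate -m 1 -s g -L 100 -n lvol1 /dev/vg01"    # -s g: PVG-strict allocation
```

With this layout, the original and the mirror copy of every extent end up behind different controllers.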


Mirror Resynchronisation Policies


In case of failure:

MWC (mirror write cache)    Fast recovery                   Runtime overhead
                            Memory cache is used (7.5MB)
NOMWC                       No runtime overhead             Slow recovery
NONE                        Applications can do recovery.   No system managed recovery.


LVM’s standard method of recording updates to mirrored Logical Volumes is via Mirror Write Cache (MWC). When
writing to a mirrored Logical Volume, it is necessary to write to all mirror copies. If this cannot be completed
simultaneously, Physical Extents not updated are marked as stale so that the system will be able to later complete writing
to the previously unavailable parts of the disk mirror. This catch-up process is called synchronising the Logical Volume.
The VG Status Area (VGSA) maintains the list of current and stale extents (within the VGRA). Once the VGSA has been
updated, mirror consistency is no longer an issue. Mirror consistency is only a concern while the data writes are in
progress. As this synchronisation process requires system resources, it can slightly affect performance.

Two other techniques for maintaining mirrored volume consistency are available. The first is to just mark the entire disk
that failed to be written to as stale, then later recover it by scanning all the Physical Extents and writing back any that
require updating. The second technique is to use no consistency recovery method. During normal operation, the system will
behave in the same way as the previous case. But when enabling the Volume Group, the system would make no effort to
recover the mirrored disks. This option should only be used when the application is able to recover the data itself. Examples
might include a database file that will get re-created from the log files or a swap area where no data is held over during a
system reboot.

In order to determine which crash recovery method should be used to recover mirror copies, we first need to understand
exactly what we mean by “Mirror Consistency”. A mirrored Logical Extent is consistent if every Physical Extent (PE) that
it’s mapped to has the same data, excluding any stale extents. The stale extents are of no concern, because they have been
recorded on the Physical Volume as “stale”. If the system crashes, the Volume Group Status Area (VGSA – located in the
VGRA of the PV) is read from the disk upon reboot and LVM knows not to use any data from the stale extents. The VGSA
will track the state of each Physical Extent as long as necessary.

The second Recovery Method option is NOT to maintain a list of extents that are currently being written to each mirror
copy. With this option, a mirror recovery will be performed on every single extent after a system crash. This is VERY time
consuming, but it is only done in an emergency. During normal operation, there is no overhead at all because no list of
extents being written to is maintained.

The last Recovery Method option is to NOT perform any mirror recovery at all. With this option, it is assumed that the
application using the Logical Volume is able to repair the mirrors. For example, a Relational Database Management
System (RDBMS) might be able to read its transaction log file, figure out what transactions were in process during a
system crash, and clean up accordingly (e.g., roll-forward or roll-back). In that case, LVM is not needed for mirror
consistency.


Disk Write scheduling Policies

Parallel (default)

Sequential: slower, but secured



While accessing a mirrored Logical Volume, the LVM layer will split a request to write to a
Logical Extent into writes to all of the Physical Extents that it contains. These requests can be
scheduled to occur either in parallel (the default) or in sequence, where one request must complete
before the system attempts to write the next copy of the data. When using parallel
scheduling, any read requests will be sent to the drive with the shortest read request queue. When
using sequential scheduling, read requests are sent to the drives in strict order.
Parallel scheduling should be used where performance is an issue and sequential scheduling
should be reserved for situations where extreme caution in maintaining consistency of mirrors is
required. Sequential scheduling will almost always seriously impact the performance of writes to
LVM Volumes.

The I/O scheduling policy can be chosen. The table shows the effects of each policy. Note that
both lvcreate and lvchange have the option “-d” to address the Scheduling Policy.
Parallel scheduling generally provides the best throughput and is the default. Sequential
scheduling provides the highest level of data integrity. However, it requires additional system
resources resulting in degraded system performance. Sequential policy should only be used when
maintaining mirror consistency is of extreme importance.


How to mirror a LV?

Only Logical Volumes are mirrored.

•Policies are set with lvcreate
•Modify them with lvchange
•To mirror an lvol, use lvextend
•To break a mirror, use lvreduce


LVM Commands for Mirroring


The following LVM commands are used to establish mirroring:

• lvcreate
• lvextend
• lvreduce
• lvchange

All these commands have similar options. If a Logical Volume was not
configured properly at creation time, or the configuration needs have changed,
it can be adjusted later.

Certain options for these commands are only available after the
MirrorDisk/UX product is installed on the system.
The following slides will review each command in detail.


Mirror commands
# lvcreate -m 1/2 -M y -c n -s y -d p/s -n lvol1 /dev/vg01

    -M, -c   Synchronisation policy
    -s       Allocation policy
    -d       Scheduling policy
    (1/2 and p/s denote alternative values)

# lvextend -m 1 /dev/vg01/lvol1 /dev/dsk/c0t8d0

[Diagram: vg01 with lvol1 on c0t5d0 and its mirror copy on c0t8d0]


MWC on/MCR off (lvchange -M y -c n) Note: When -M is “y”, -c is overridden to “n”. This
default method ensures minimal loss of data on recovery using the mirror consistency record (not
to be confused with Mirror Consistency Recovery) to sync disks as all mirrored writes are
recorded in cache. A potential negative impact is that since each disk contains a mirror
consistency record, inconsistencies can occur between mirrored disks because records can only be
written to one disk at a time. On recovery, the system will choose the mirror consistency record
with the most recent timestamp and sync all the extents of the mirror’s copies with this record.

MWC off/MCR on (lvchange -M n -c y): This option reduces the overhead of the MWC and its
required mirror consistency record update to disk. This consistency recovery process involves
selecting one non-stale extent, marking the remaining extents as stale, then
copying data from that non-stale extent to the stale extents. This recovery process can be very
slow, but provides the highest level of mirror consistency.

MWC off/MCR off (lvchange -M n -c n): Improves performance because there is no mirror
consistency record update. However, no consistency recovery of data on the mirror copy occurs
after a system crash.
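
The three combinations can be sketched as a small lookup that turns a policy name into the lvchange options quoted in the notes above. The helper name policy_flags and the volume /dev/vg01/lvol1 are assumptions for illustration; the script only prints the resulting commands.

```shell
#!/bin/sh
# Sketch: map a recovery policy name to the lvchange options quoted in the notes.
policy_flags() {
    case $1 in
        MWC)   echo "-M y -c n" ;;   # fast recovery, runtime overhead
        NOMWC) echo "-M n -c y" ;;   # no runtime overhead, slow recovery
        NONE)  echo "-M n -c n" ;;   # no recovery performed by LVM
        *) echo "unknown policy: $1" >&2; return 1 ;;
    esac
}

for p in MWC NOMWC NONE; do
    echo "lvchange $(policy_flags $p) /dev/vg01/lvol1"
done
```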


lvsplit & lvmerge (added by HP)

lvsplit: splits the mirror into independent Logical Volumes, so that we can do backups.
         All writes are kept in memory.
lvmerge: merges the Logical Volumes back together and resynchronises them.


Using Mirrored Logical Volumes for Online Backups

Disk mirroring can also be used to minimize the downtime required for system backups. A
mirrored Logical Volume can be split, allowing one part to still be used by the system while the
other part is being backed up. An lvsplit operation will freeze the mirror copy and create a
“backup” Logical Volume which points to it. The “backup” is used for backups while activity
continues on the other Physical Volume.

When this happens, lvsplit assigns the “split off” part of the mirrored volume a new Logical
Volume number. This can then be accessed just like any other Logical Volume. Later, the Logical
Volumes can be merged back into one.

Making the mirror available as its own Logical Volume allows a backup of data on the Logical
Volume to occur while the system is still running. By using three-way mirroring, it is possible to
keep the “high availability” feature of disk mirroring while still allowing mirrors to be split for
backup.

When splitting a Logical Volume, it is desirable to close it first, or at least flush any data being
stored to it. With a file system, unmounting it would be the best course of action as this ensures
that all data belonging on the disk is written out, and can therefore be backed up. If this is not
possible, sync should be executed prior to splitting the volume. It will then be necessary to fsck
the split-off volume. For raw disk areas, the procedure will be application-dependent.
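
The online backup procedure can be sketched as a dry run. The names are assumptions for illustration: vg01/lvol1 holds a VxFS file system, /backup is a free mount point, and fbackup writes to the default tape drive. The function only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of an online backup from a split mirror; prints commands only.
# Assumptions: vg01/lvol1 holds a VxFS file system, /backup is a free mount
# point, and fbackup writes to the default tape drive.
plan_split_backup() {
    vg=$1; lv=$2; mnt=$3
    echo "sync"                                    # flush pending writes first
    echo "lvsplit /dev/$vg/$lv"                    # split-off copy gets the suffix b
    echo "fsck -F vxfs /dev/$vg/r${lv}b"           # check the split-off copy
    echo "mount -o ro /dev/$vg/${lv}b $mnt"        # read-only mount for the backup
    echo "fbackup -f /dev/rmt/0m -i $mnt"
    echo "umount $mnt"
    echo "lvmerge /dev/$vg/${lv}b /dev/$vg/$lv"    # merge back and resynchronise
}

plan_split_backup vg01 lvol1 /backup
```

The sync step is the fallback described above for a file system that cannot be unmounted before the split.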


What to do in case of failure

master mirror

Boot in quorum mode


Restore the LVM structure (vgcfgrestore)
Activate the volume group


If a disk fails (e.g., failure of disk or power supply), then scheduled I/O will be timed out by the
disk driver. LVM will attempt retries over a period of time, but will eventually give up. The disk
is marked unavailable, and this status is posted to the other VGRAs via a high priority write. If
mirroring is in use, then I/O can continue to the other Physical Extents if they are on a different,
available Physical Volume.
On a reboot, if the stale disk is now available, the VGRAs are updated. The Physical Extents are
resynchronised from a good copy automatically (regardless of the mirror recovery settings). Two
mechanisms are at work to resolve this issue:

•Mirror recovery of dirty LTGs after a system crash
•Resynchronization of stale extents.

The disk’s VGRA holds a copy of the MWC. It is called the Mirror Consistency Record (MCR).
When a write I/O is requested, the LVM driver looks in the MWC. If an I/O to
this Logical Track Group (LTG) is already in progress, the MWC already shows dirty for the
LTG, and the new I/O is scheduled. If there is not an entry in the MWC for this LTG, then it is
marked dirty and the updated MWC is scheduled as a “high priority” write to the disk where it
becomes a Mirror Consistency Record (MCR). Once this is done, the temporarily-blocked write
I/O can be scheduled.
When the scheduled I/O completes, the I/O is marked clean in the MWC and
the update is completed to the MCR in the VGSA.


How to create a bootable mirror

Break the mirror
Reduce the volume group
Prepare the new disk for LVM (vg01)
Create all lvols in vg01
Extend all lvols according to the PE distribution
Activate vg01
Check all file systems
Copy important files to vg01 (fstab, lvmtab)
Copy all device files
Create the BDRA of vg01


QUIZ 13

You want to create a bootable mirrored disk. What must be done first?

[ ] lvlnboot -a <PV dev file>
[ ] pvcreate -Bf <PV dev file>
[ ] vgcfgrestore <PV dev file>
[ ] pvcreate <PV dev file>


As you want to create a bootable mirrored disk, you must first prepare the disk to
have a boot area, which means it must have an ISL header.
So you need to prepare the disk with pvcreate and its -B option.
The -f option forces the command when a previous PVRA exists and
must be overwritten.


QUIZ 14

Is it possible to mirror the swap area?

[ ] YES
[ ] NO


Mirroring works on logical volumes. The swap area is a logical volume, so
it can be mirrored.
If you plan to create a bootable mirrored disk, you must mirror the swap area as
well.


QUIZ 15

You split a logical volume that has a file system on it. How do you access it?

[ ] No way, you cannot access it.
[ ] Mount it on a mount point.
[ ] Use SAM.
[ ] Do lvchange first.


After the split, the split-off logical volume is automatically given the
original name with a b suffix (e.g. lvol1b).
To access the files in this file system, you only need to mount it on a mount point.


QUIZ 16

In a root volume group made of 2 disks, if the master fails, how do you boot from the mirror?

[ ] hpux -lm
[ ] hpux -lq -lm
[ ] hpux -is -lm
[ ] hpux -lq


When the master disk fails, the root volume group no longer has quorum.
To boot from the mirror, force the system to boot from the second disk and,
at the ISL prompt, use the hpux -lq command.
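
The quorum rule behind this answer can be sketched in a few lines of Python. This is a simplified model, assuming the usual LVM rule that more than half of the disks in a volume group must be available; the function name is illustrative and is not an HP-UX interface.

```python
def has_quorum(available_disks: int, total_disks: int) -> bool:
    """Return True when more than half of the volume group's disks are available."""
    if total_disks <= 0 or not 0 <= available_disks <= total_disks:
        raise ValueError("invalid disk counts")
    return 2 * available_disks > total_disks


# Two-disk root volume group with one failed disk: 1 of 2 is not more than
# half, so the quorum check has to be overridden at boot (hpux -lq).
print(has_quorum(1, 2))  # False
print(has_quorum(2, 3))  # True
```

With exactly half of the disks left, the check fails, which is why a two-disk mirror needs the -lq override when one disk is lost.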


UXIE-SUPPORT
8 modules in virtual class: Module 1 Boot
Module 2 Recovery
Module 3 LVM
Module 4 Mirroring
Module 5 SD-UX Patch
Module 6 Swap Dump
Module 7 File Systems
Module 8 Ignite

SD-UX-1 UXIE-SUPPORTvB


OBJECTIVES

Students will understand how SD-UX works.
They will be able to install and remove products.
They will be able to check a product installation and manage
a patch depot.

SD-UX-2 UXIE-SUPPORTvB


What’s SD-UX?

• Software management
• Included with HP-UX
• Client/server technology
• Local or remote operations

SD-UX-3 UXIE-SUPPORTvB

Software Distributor provides you with a powerful set of tools for centralised HP-UX software
management, sometimes referred to as SD-UX (Software Distributor for HP-UX).
Software Distributor commands are included with the HP-UX operating system and, by default,
manage software on a local host only. You can also enable remote operations, which let you
install and manage software simultaneously on multiple remote hosts from a central controller and
perform other multi-site software distribution capabilities.

Software Distributor lets you manage software on multiple computers connected via Local Area
Networks (LANs) and Wide Area Networks (WANs) from a single central site. The network
systems must support the TCP/IP protocol. Each network computer can act as a server, allowing its
resources to be managed or accessed by other machines, and as a client, managing or using the
resources of other machines.

Because Software Distributor is based on distributed, client/server technology, it requires some
networking functionality on the host system for proper execution. These networking services are
only available in UNIX Run Level 2 (Multi-User mode) and above. Software Distributor cannot
run in Single-User mode.

Software Distributor running under HP-UX 11.00 and higher versions does not support NFS
diskless clusters.


QUIZZ 17

At which run level does SD-UX start?

• Single user
• Multi user level 2
• Multi user level 3
• Multi user level 4

SD-UX-4 UXIE-SUPPORTvB

Because Software Distributor is based on distributed, client/server technology, it requires some
networking functionality on the host system for proper execution. These networking services are
only available in UNIX Run Level 2 (Multi-User mode) and above. Software Distributor cannot
run in Single-User mode unless you start the necessary scripts yourself.


How to start SD-UX in single user mode?

Which subsystems need to start?
• hostname   /sbin/init.d/hostname
• network    /sbin/init.d/net.init (@)
             /sbin/init.d/net
• swagent    /sbin/rc2.d/S100swagentd
             /sbin/rc2.d/S870swagentd (*)
(*) only 11.0
(@) only 11.x

SD-UX-5 UXIE-SUPPORTvB

To run SD-UX in single user mode, three subsystems must first be started
(run each script with the start argument):
hostname
network
swagent
The slide lists the scripts for all releases from 10.20 to 11.X.
Some scripts are specific to a release.

Use the stop and start scripts to restart swagentd:
/sbin/rc1.d/K900swagentd stop
/sbin/rc2.d/S100swagentd start
If the network is down, touch /var/adm/sw/standalone.

Always use /var/depot/tmp/<Your Path> for your temporary depots (do not use
/tmp).
Remove your depots when you no longer need them.
Use swremove to remove depots.
Unregister your removed depots.


Software Distributor Tasks

command [Options] [software selection] [@ Target]

options
  i  interactive
  s  Source [host][:/directory]
  x  option=value

commands
• swlist     Display information about software
• swinstall  Install software
• swcopy     Copy software (to a depot)
• swremove   Remove software
• swpackage  Package software (create depot)
• swreg      Register depots

User interfaces: GUI (default), TUI
SD-UX-6 UXIE-SUPPORTvB

In 11.0, swlist shows all installed patches including superseded ones. The 11i default behavior is
to not show superseded patches. This can be overcome, returning to 11.0 standard behavior, by
setting -x show_superseded_patches=true on the swlist command line or in the defaults files.

In 11.0, GUI software selection using "Match What Target Has" or "Automatically select patches
for software installed on target" could be confusing because the bundles did not get marked
automatically for selection in the GUI.

The new 11i behavior provides a product-level view where you see which software is matched.
After inspecting the results of the automatic selection, you can continue with the installation or
change the view back to a bundle level.

11i also brings automatic handling of superseded patch dependencies, which reduces the need for
manual analysis. The size of patch bundles will decrease, and swlist ignores superseded patches.

swremove [Options] [software selection] [@ Target]
Options  d  Operate on depots
         i  interactive
         p  preview
         x  option=value
Remove a particular version of HP Omniback:    swremove Omniback,r=2.0
Remove the entire contents of a local depot:   swremove -d \* @ /var/spool/sw
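
The common shape "command [Options] [software selection] [@ Target]" can be illustrated with a small Python helper that only assembles an argument list. It is a sketch, not part of SD-UX: the helper name and its parameters are hypothetical, and only the options shown on the slides (-d, -p, -s, -x) are modelled.

```python
def sd_command(command, selections=(), target=None, source=None,
               preview=False, depot=False, options=None):
    """Assemble an SD-UX style argument list: command, options, selections, @ target."""
    args = [command]
    if depot:
        args.append("-d")            # operate on a depot
    if preview:
        args.append("-p")            # preview only
    if source is not None:
        args += ["-s", source]       # software source
    for name, value in (options or {}).items():
        args += ["-x", f"{name}={value}"]
    args += list(selections)
    if target is not None:
        args += ["@", target]
    return args


print(sd_command("swremove", ["Omniback,r=2.0"], preview=True))
# ['swremove', '-p', 'Omniback,r=2.0']
print(sd_command("swremove", ["*"], target="/var/spool/sw", depot=True))
# ['swremove', '-d', '*', '@', '/var/spool/sw']
```

Because the result is a list and not a shell string, the wildcard needs no backslash escaping here; on an interactive shell it must still be written as \*.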


Software Distributor options

allow_incompatible        Don’t do it!
auto_kernel_build         swremove only
autoselect_dependencies
enforce_dependencies
match_target              Used on 10.x for software and patches
                          Only for base product installations on 11.x
                          (but can use with autoselect_patches)
patch_match_target        New at 11.00, only applies to patches
patch_save_files          Useful when disk space is tight
patch_commit              Removes backup files (/var/adm/sw/save)
                          (11.x swmodify only) Marks patch_state attribute as committed.
target_type=tape

swinstall   autoreboot=false
            match_target=false
            reinstall=false
swcopy      register_new_depot=true
            layout_version=1.0    0.8 for HP-UX 10.x
                                  1.0 for HP-UX 11.x

SD-UX-7 UXIE-SUPPORTvB

Useful Command-line Flags

-p    Preview Mode. The task quits after the analysis phase.
      Nothing is written to disk.
-v    Verbose Mode.
-vv   Very Verbose Mode.

-x autoreboot=true
      Allows software or patches that require a reboot of
      the target system to be installed.

-x mount_all_filesystems=false
      Allows software or patch installations even if a
      filesystem, like a cdrom drive, is not mounted.

-x allow_incompatible=true
      Allows the installation of software or patches that are
      incompatible (wrong HP-UX revision or machine
      architecture) with the target system.

-x enforce_dependencies=false
      Allows the removal of software that has other products
      or filesets dependent on it. For example, some
      internally-developed tools require Perl5. If I want to
      remove it so that I can replace it with a later version, I
      can use this option.


New features

CD auto-mounting

Patch management
  Support from patch bundle
  Patch auto selection from target
  Product patch auto selection    Patches installed with the product
  Patch rollback                  Remove a patch
  Update patches                  During product update
  Commit patches                  Remove saved files for unused patches

SD-UX-8 UXIE-SUPPORTvB

SD introduced the automatic discovery and mounting of a CD with the release of 11.0. However,
SD always looked for the CD even if that was not what was wanted. That made the start-up of the
GUI slower than necessary.

Patch Management
SD becomes “Patch aware”.
Automated Patch install/removal.

Patch match target - automatic selection of patches based on what’s on the target already (only
on 11.0).
Autoselect patches - automatic associated patch selection when swinstalling a product.
Patch Filtering - ability to filter out patches with a user-defined string.
Rollback - ability to remove a patch after it has been installed.
Update - remove old patches when updating to a new software release.
Commit - ability to free up disk space when rollback is no longer needed.

Either commit at installation of the patch:  swinstall -x patch_save_files=false PHCO_1163
Or commit after installation of the patch:   swmodify -x patch_commit=true PHCO_1163


SD-UX terms

[Diagram: SWINSTALL takes software from the sources (local depot, network depot, media),
installs it on the target, and records it in the Installed Product Database (IPD),
/var/adm/sw/products.]

SD-UX-9 UXIE-SUPPORTvB

swinstall [Options] [software selection] [@ Target]
Options  i  interactive
         p  Preview
         s  Source [host][:/directory]
         x  option=value

A controller is the program or command (swinstall, swcopy, etc.) that you invoke on your
system. Controller programs are the front ends in the Software Distributor process, providing the
user interface for the management tasks. The controller collects and validates the data it needs to
start a task, and to display information on the task’s status.

A depot is a directory, specially formatted for use by Software Distributor commands, and used as
a “gathering place” for software products. You can pull software from depots for installation. This
book also refers to target, which is the destination host or directory (either a root directory or a
software depot) on which the operation is to be performed. For most operations, the target consists
of the local host or depots on the local host. If remote operations are enabled, the target may be
one or more remote systems.

A source or software source is a physical medium (CD-ROM, tape, DVD, etc) or directory
location (depot) that contains software to be installed.
The controller programs communicate with hosts and depots through the agent called swagent.
The agent performs the basic software management tasks. The daemon that executes the agent is
called swagentd. The controller and agent both run on the local host unless remote operations are
enabled. Then the agent may run on a remote host.


What’s a depot?

• Directory
• CD
• Tape
• Network

SD-UX-10 UXIE-SUPPORTvB

The internal depot format changed at 11.0: 10.x depots use layout_version 0.8 and 11.0 depots use
layout_version 1.0 (new patch attributes, POSIX compatibility).
10.x patch depots should be separate from product depots. 11.0 patches may be in the same depot
as the products being patched.
10.x depots can reside on an 11.0 server, but 11.0 depots cannot be on a 10.x server. 10.x SD-UX
cannot load from an 11.0 depot. The layout_version option must be used to keep a 10.x depot at
the 0.8 format.
Copy a depot from a 10.x system to an 11.0 server:
swcopy -s<source> -x layout_version=0.8 \* @ <dest>
Package a depot for a 10.x product on an 11.0 system:
swpackage -s<psf or depot> -x layout_version=0.8 @ <depot>
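
The layout_version rule can be captured in a short Python sketch. It assumes only what is stated above (0.8 for depots that 10.x systems must read, 1.0 for 11.x); the function names and the way the release string is parsed are illustrative assumptions.

```python
def layout_version_for(release: str) -> str:
    """Return the depot layout_version that a given HP-UX release can read."""
    major = int(release.split(".")[0])
    if major == 10:
        return "0.8"
    if major >= 11:
        return "1.0"
    raise ValueError(f"release not covered by this sketch: {release}")


def swcopy_args(source: str, dest: str, reader_release: str) -> list:
    """Build a swcopy argument list that keeps the depot readable by reader_release."""
    option = f"layout_version={layout_version_for(reader_release)}"
    return ["swcopy", "-s", source, "-x", option, "*", "@", dest]


print(layout_version_for("10.20"))  # 0.8
print(layout_version_for("11.00"))  # 1.0
```

The point of keying on the reading system and not on the server is that an 11.0 server can hold both kinds of depot, while a 10.x client can only load from a 0.8 depot.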
A depot is a directory, specially formatted for use by Software Distributor commands, and used as a
“gathering place” for software products. A depot is created by copying software directly to it from
physical media using swcopy or by packaging software into the depot using swpackage. Depots
become available for use by other SD-UX commands when they are registered. Depots are
automatically registered by swcopy or manually registered by the swreg command.
A depot usually exists as a directory location or directory depot. Therefore, a host may contain
several depots. For example, a software distribution server on your network might contain a depot of
word processing software, a depot of CAD software, and a spreadsheet software depot. Software may
also reside in a tape depot, which is formatted as a tar archive. If a depot resides on a system
connected to a network, then that system can act as a network source for software. Other systems on
the network can install software products from that server instead of installing them each time from
media.
A network depot offers these advantages over installing directly from tape or CD-ROM:
• Several users can pull software down to their systems (over the network) without having to
  transport media to each user.
• Installation from a network server is faster than from media.
• You can combine different software products from multiple media or network servers into a
  single depot.


Installed Products Database (IPD)

Keeps track of installed software:
depot
  Bundle
    Products
      Sub-Products
        Filesets
          Files

swlist [Options] [Product] [@ Target]
Options  a         Selects an attribute to display
         l level   [depot | bundle | product | subproduct | fileset | file]
         s source  Software source to list
SD-UX-11 UXIE-SUPPORTvB

Bundles:
• Provide a way to group related software.
• Allow easy identification of installed products (swlist -l bundle).
• Have definitions that can be packaged independently of the software
  included in the bundle.
• Can be used as operands for swinstall, swremove, swcopy, etc.
List the state attribute (must return configured):
swlist -a state PHCO_15206


Software structure

Bundles

Products

Sub-Products

Filesets

files Scripts

SD-UX-12 UXIE-SUPPORTvB

Syntax
  bundle[.product[.subproduct][.fileset]][,version]
  product[.subproduct][.fileset][,version]
Version:   ,r <operator> revision
Operator:  =, >=, <, >, !=
Example:   r=A.01.02   r>=A.12
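
To make the selection syntax concrete, here is a minimal Python sketch that splits a software selection into its dotted name path and its revision test. It is a simplification, assuming a single ",r<operator>revision" part and only the operators listed above; the function name is hypothetical and this is not how SD-UX itself parses selections.

```python
import re

# Operators from the syntax summary; ">=" is tried before ">" so it wins.
_VERSION_RE = re.compile(r"^r\s*(>=|!=|=|<|>)\s*(\S+)$")


def parse_selection(selection: str) -> dict:
    """Split 'name[.name...][,r<op>revision]' into path, operator and revision."""
    name_part, _, version_part = selection.partition(",")
    result = {"path": name_part.split("."), "operator": None, "revision": None}
    if version_part:
        match = _VERSION_RE.match(version_part.strip())
        if match is None:
            raise ValueError(f"unsupported version spec: {version_part!r}")
        result["operator"], result["revision"] = match.groups()
    return result


print(parse_selection("Omniback,r=2.0"))
# {'path': ['Omniback'], 'operator': '=', 'revision': '2.0'}
```

The path list has one element for a bare product or bundle and more elements when a subproduct or fileset is named.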
Bundles: Collections of filesets, possibly from several different products, “encapsulated” by HP
for a specific purpose. Bundles can reside in software depots and be copied, installed, removed,
listed, configured and verified as single entities. All HP-UX OS software is packaged in bundles.
Because bundles are groups of filesets, they are not necessarily supersets of products. Customer
creation of bundles is not supported.

Products: Collections of subproducts (optional) and filesets. The Software Distributor commands
maintain a product focus but still allow you to specify subproducts and filesets. Different versions
of software can be defined for different platforms and operating systems, as well as different
revisions (releases) of the product itself. Several different versions could be included on one
distribution medium or depot.

Subproducts: If a product contains several filesets, subproducts can be used to group logically
related filesets.

Filesets: Filesets include all the files and control scripts that make up a product. A fileset can
only be part of a single product, but it can be included in several different HP-UX bundles. You
can also specify filesets to be used by different platforms and OSs.


UXIE-SUPPORT
8 modules in virtual class: Module 1 Boot
Module 2 Recovery
Module 3 LVM
Module 4 Mirroring
Module 5 SD-UX Patch
Module 6 Swap Dump
Module 7 File Systems
Module 8 Ignite

Patch-1 UXIE-SUPPORTvB


OBJECTIVES

Students will understand how a patch works.
They will be able to find the right patch and
to verify its validity. They will be able to
handle a patch depot and check a patch
installation.

Patch-2 UXIE-SUPPORTvB


Reasons for Installing Patches

• New functionality.
• New hardware support.
• Bug fixes.

Patch-3 UXIE-SUPPORTvB

New functionality: HP is constantly incorporating new functionality in the HP-UX operating
system. In order to make this functionality available to the installed customer base without
requiring a full operating system update, HP sometimes provides these new features in the form of
a patch. For example, a patch might deliver a new version of "sendmail" that includes
anti-spam functionality.

New hardware support: HP-UX may require a patch to support new types of interface cards and
devices. For example, a patch might provide support for a SCSI 160/m interface card.

Bug fixes: HP has a policy of fixing defects and releasing the fixes as soon as the defects are
found, instead of waiting for a periodic new operating system revision. This requires patches.
Patches can be put together with dependencies. These are attributes that define the target system
on which the patch should be installed.

This can include a machine type, such as S800 server or S700 workstation.
It can specify other patches that must be installed on the system and can specify products that are
patched by the patch.
If a patch patches product "xyz" and a user attempts to install the patch on a system without the
product being present SD-UX will complain.
If the patch is "forced" on the system, unnecessary files will be installed.
If the product "xyz" is installed at a later time it will appear to be patched but the patch files will
have been overwritten by the product.
Patches, like any SD-UX packaged product, can contain scripts which are executed at different
phases during the installation.


Patch Format & Structure

Patch Format:     tar or SD format
Patch Structure:  shell archive, unpacked with  # sh Patch
  Patch.text      Patch documentation (instructions, characteristics)
  Patch.depot     Source depot file (binary data)

Patch-4 UXIE-SUPPORTvB

The standard patch structure for HP-UX 11.x is a shell archive (shar file) that includes a
text file and an SD (Software Distributor) depot.
The text file is named after the patch, followed by .text (e.g., PHSS_4014.text). It
includes information about the patch and any special instructions.
The SD depot is named after the patch, followed by .depot (e.g.,
PHSS_4014.depot). The depot contains either a single tar file or a hierarchical directory of files.
SD understands both formats without special designation. The files contained in the tar format or
the hierarchical directory will replace the original files on a system.
Patch tapes can come in two formats:
tar - Contains the .text and .depot files, which must be extracted with the tar
command and then installed with the standard installation commands.
SD - An actual SD depot that can be installed with swinstall.


Patch Naming Conventions

PHXX_YYYYY
  P     = Patch
  H     = HP-UX
  XX    = Patch type
            CO => Commands
            KL => Kernel
            NE => Network
            SS => Subsystem
  YYYYY = Sequence number

The depot patch file is unique and never modified.

Patch-5 UXIE-SUPPORTvB

The Hewlett Packard Patch naming convention begins with a P for Patch, H for
HP-UX, and is followed by the type of patch and a sequence number.
The type of patch is the area to be patched. These include:
CO - general HP-UX commands
KL - kernel patch
NE - network specific patch
SS - subsystem patch (e.g. X11, Strobes, etc.)
The sequence number is a unique number assigned to a patch. This is to ensure
there is no overlap in assigned patch names in order to avoid obtaining the wrong
patch.
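
As an illustration of the convention, the following Python sketch decodes a patch name into its type and sequence number. It is a minimal example that covers only the four type codes listed above; the function and dictionary names are made up for the example.

```python
import re

# Patch type codes from the naming convention above.
PATCH_TYPES = {
    "CO": "commands",
    "KL": "kernel",
    "NE": "network",
    "SS": "subsystem",
}

_PATCH_RE = re.compile(r"^PH(CO|KL|NE|SS)_(\d+)$")


def parse_patch_name(name: str) -> dict:
    """Decode 'PHXX_YYYYY' into its type code, patched area and sequence number."""
    match = _PATCH_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised HP-UX patch name: {name!r}")
    type_code, sequence = match.groups()
    return {
        "type_code": type_code,
        "area": PATCH_TYPES[type_code],
        "sequence": int(sequence),
    }


print(parse_patch_name("PHCO_27780"))
# {'type_code': 'CO', 'area': 'commands', 'sequence': 27780}
```

A name that does not fit the pattern raises an error, which is a cheap way to catch typing mistakes before a patch is requested.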
Some patches require the operating system to be rebooted after installation. All
kernel patches require this. For the other areas, it depends on
whether the kernel needs to be rebuilt.

Since SD can only be run in multi-user mode, ensure all users are logged off
before loading a patch that requires a reboot.


Main Patch characteristics

• Critical patch
• Needs a reboot
• Superseded (replaced by) patch
• Dependent patch
• Bad patch, Warning patch

Patch-6 UXIE-SUPPORTvB

Patches have several characteristics:

They can be critical: they solve a critical problem, which HP defines
as a problem that can corrupt customer data.
They may need a reboot to take effect.
They may be superseded by one patch or by a collection of patches.
They can be dependent. A single problem is sometimes solved by more than
one patch, of different types. In this case, these patches MUST all be installed
together: they are dependent.

For various reasons, a new patch can cause more trouble than expected.
In combination with certain customer environments, it can become dangerous.
Such a patch is given the characteristic Warning. This means that, before installing
it, you must verify that your customer's environment is not the affected one.

After a while these patches are flagged as bad: they must not be installed.
Later they are replaced and become superseded.


Patch characteristics

It depends on 7 other patches

Patch-7 UXIE-SUPPORTvB

On the web, you can consult the patch catalog web site.
It shows the characteristics of each patch. The site is updated every day, and
you must consult it before any patch installation.


Where to find Patches?

• Support Plus CD
• Web site
• Response Centers

Patch-8 UXIE-SUPPORTvB

Patches can be found:
• on the Support Plus CD (one every 2 months)
• on the patch catalog web site (see later)
• at the response center, which is updated during the night.


The patch catalog web site

http://wtec.cup.hp.com/~patches/catalog/
Patch-9 UXIE-SUPPORTvB

This is the patch catalog web site.
You can browse it by selecting the HP-UX release and the HP-UX platform (700 or
800), and by searching for a pattern in all text files.
A list of matching patches is then displayed.


The patch catalog web site results

Patch-10 UXIE-SUPPORTvB

This is the on-line patch catalog web site.
You can see the current status of any patch.
The text file is displayed with hyperlinks, so it is very easy to navigate through
it.


Atlanta ftp (15.51.240.6) site


#ftp hpatlse.atl.hp.com

Patch-11 UXIE-SUPPORTvB

When you have finished your selection, you can download your patches from the
Atlanta FTP server.
First select hpux_patches, then your platform, and then your release.
At this point never run a list command (dir), as it could take hours to finish.


http://banba.fc.hp.com/tools/deps.html
Patch-12 UXIE-SUPPORTvB

The most difficult task is to build a valid patch list that includes all dependent
patches. Some web sites offer tools that can do this for you.
Even so, you must verify the validity of all patches at this point, as it can change
every hour.


When to Install Patches?

• Install Operating System
• Install Applications
• Install Diagnostics
• Install Patches
• Verify the installation

Patch-13 UXIE-SUPPORTvB

The order in which patches are installed on HP-UX 11.X is very important. The
operating system has to be installed first. Before installing patches,
make sure that all applications and diagnostics are installed.
Failure to do so could cause the patch not to install or the system to
crash.
If a patch has been installed prior to the installation of an application and there
are conflicts, the patch or patches must be removed prior to the installation of the
application.


Superseded and Dependencies

[Diagram: Patch C supersedes Patch A. Patch B has a dependency on Patch A; the dependency
shifts automatically to the new (superseding) Patch C.]

Patch C supersedes Patch A.
Patch A is superseded.

Patch-14 UXIE-SUPPORTvB

The advantage of having an IPD is that patch dependencies and supersessions are
updated automatically.
In the example, Patch B had a dependency on Patch A. Because Patch A is
superseded by Patch C, Patch B is automatically linked to Patch C.
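
The dependency shift can be modelled in a few lines of Python. This sketch only illustrates the idea with plain dictionaries; it does not read the IPD, and the patch names are the placeholders from the slide.

```python
def resolve_superseded(patch: str, superseded_by: dict) -> str:
    """Follow the supersession chain and return the newest replacement patch."""
    seen = set()
    while patch in superseded_by:
        if patch in seen:
            raise ValueError(f"supersession cycle at {patch}")
        seen.add(patch)
        patch = superseded_by[patch]
    return patch


# Patch C supersedes Patch A; Patch B depends on Patch A.
superseded_by = {"PatchA": "PatchC"}
dependencies = {"PatchB": ["PatchA"]}

effective = {
    patch: [resolve_superseded(dep, superseded_by) for dep in deps]
    for patch, deps in dependencies.items()
}
print(effective)  # {'PatchB': ['PatchC']}
```

The loop also handles longer chains, for example a patch that has been superseded two or three times.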


Manage Patch Depot

[Diagram: depot media, depot directory, swcopy, swpackage, swinstall, system.]
Patch-15 UXIE-SUPPORTvB

The patch_match_target and match_target options cannot both be set to true in the same swinstall
command.

Use the match_target option to update from HP-UX 10.x. Use the patch_match_target option to
install new patches on systems that are already running HP-UX 11i. This option selects patches
from a depot that apply to software already installed on an 11i system.

The 11i autoselect_patches option (true by default) automatically selects patches to match
software selected for installation. It lets you install patches at the same time you install base
software. In addition to the base software selected by the match_target option, the
autoselect_patches option provides the means for selecting appropriate patches during the update
process.


Creating a Patch Depot

[Diagram: Patch_A.depot and Patch_B.depot under /tmp/patches are each copied with swcopy
into the depot /depot/patches.]

When the depot is selected, the 2 patches are installed at the same time.

Patch-16 UXIE-SUPPORTvB

Creating a patch depot can be very useful if there are a number of systems to
patch. Another reason to create a depot is when several patches would each
cause a reboot: installing them together from one depot causes only one reboot.
The first step is to create a depot directory, place the patches in the directory
using swcopy, and register the directory with swreg.
To unregister a depot, use swreg with the -u option.
Example:
# swreg -l depot -u /depot/patches


Post Install & Clean-up

Patch verification
# swlist -l fileset -a state | grep PH | grep -v \# | grep -v configured

# cleanup (new command): remove superseded patches
  PHCO_27780 (other features)
  save /var/adm/sw/save

# cleanup -c [1 2 3]   superseded
# cleanup -s           IPD clean
# cleanup -d           depot clean

Patch-17 UXIE-SUPPORTvB

One possible problem is that a patch has been installed but the required reboot
did not occur, so the patch is left in a state other than configured.
The first thing to do is to check whether any patch is in this situation.
The swlist command on the slide gives this list.
If the command prints nothing, no patch is in a state other than
configured.
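
The filter applied by the grep pipeline can be reproduced in Python for use in a script. This is a sketch that works on text already captured from swlist; the sample lines are invented for the example and are not real swlist output.

```python
def not_configured(swlist_output: str) -> list:
    """Keep patch lines (containing 'PH') that are neither comments nor 'configured'."""
    return [
        line.strip()
        for line in swlist_output.splitlines()
        if "PH" in line and "#" not in line and "configured" not in line
    ]


# Hypothetical sample resembling 'swlist -l fileset -a state' output.
SAMPLE = """\
# Initializing...
  PHCO_11111.FILESET-A    configured
  PHKL_22222.FILESET-B    installed
  Product.FILESET-C       configured
"""

print(not_configured(SAMPLE))
# ['PHKL_22222.FILESET-B    installed']
```

Like the grep pipeline, the function matches on substrings only, so an empty result means that every listed patch fileset is in the configured state.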

The cleanup command uses the show_superseded_patches option of swlist.
One patch (PHCO_24347) must be installed for the result to be displayed correctly.
The cleanup command can remove patches that have been superseded
once, twice or three times.
With the option -c 1, all superseded patches in the IPD are removed.
After that, there is no way to remove a patch from the system if it supersedes one
that has been removed.
We can also use the command swremove with the option commit_patches=true
(compare with the swmodify -x patch_commit=true form shown on SD-UX page 8).
This command will remove all superseded patches in the IPD.


UXIE-SUPPORT
8 modules in virtual class: Module 1 Boot
Module 2 Recovery
Module 3 LVM
Module 4 Mirroring
Module 5 SD-UX Patch
Module 6 Swap Dump
Module 7 File Systems
Module 8 Ignite

Swap-1 UXIE-SUPPORTvB


OBJECTIVES

Students will understand the main features
of the swap. They will be able to check and
verify the current swap.
They will be able to modify it in order to
meet customer needs.

Swap-2 UXIE-SUPPORTvB


What’s the SWAP?

• The term "swap" dates back to early UNIX
  implementations.
• Complete processes were removed from memory
  to secondary storage.
• Nowadays a "deactivation" scheme (paging)
  replaces the swap process. It is managed by
  the swapper process.

Swap-3 UXIE-SUPPORTvB

The kernel always tries to maintain a threshold of free pages in order to keep the system
running efficiently. As long as this threshold, referred to as "lotsfree", is maintained, no
paging occurs. When the number of free pages drops below this threshold, a daemon
known as "vhand" is started. The daemon will select pages that have not been recently
referenced. If necessary, the page will be copied to the swap area before being put on
the free list. This is referred to as a "page out". A "page fault" occurs when a process tries
to access an address that is not currently in memory. The page will then be copied into
RAM, either from the swap space or from the executable on disk.

On systems with very demanding memory needs (for example, systems that run many
large processes), the paging daemons can become so busy swapping pages in and out
that the system spends too much time paging and not enough time running processes.
When this happens, system performance degrades rapidly, sometimes to such a degree
that nothing seems to be happening. At this point, the system is said to be "thrashing"
because it is doing more overhead than productive work.
Process deactivations begin when free memory falls below minfree.

When memory falls below minfree, the swapper daemon ‘intelligently’ chooses a process
to deactivate biased toward non-interactive, sleeping and old processes.
When a process is deactivated it is flagged as deactivated, all pages (including the uarea
or working set) are swapped out, all threads are taken off the run queue.
A deactivated process will get swapped back into physical memory once the free memory
rises above minfree. At this point only the uarea is paged in and the process is put back
on the run queue.
Deactivated processes will not reactivate unless there is enough free memory, even if an
external source is waiting / communicating with that process. For example a getty or a
network daemon.
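
The two thresholds can be summarised with a small decision function in Python. It is a deliberately simplified model of the behaviour described above: the real kernel uses more tunables and acts on pages continuously, and the function name and the example values are illustrative only.

```python
def memory_action(free_pages: int, lotsfree: int, minfree: int) -> str:
    """Return what the kernel does for a given number of free pages (simplified)."""
    if minfree > lotsfree:
        raise ValueError("minfree must not exceed lotsfree")
    if free_pages < minfree:
        return "deactivate"   # swapper deactivates whole processes
    if free_pages < lotsfree:
        return "page_out"     # vhand pages out pages not recently referenced
    return "none"             # enough free memory, no paging


# Illustrative thresholds, not real tunable values.
print(memory_action(5000, lotsfree=2000, minfree=500))  # none
print(memory_action(1500, lotsfree=2000, minfree=500))  # page_out
print(memory_action(300, lotsfree=2000, minfree=500))   # deactivate
```

Reading the function from the bottom up gives the normal sequence when memory pressure increases: no paging first, then page-outs, then process deactivation.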


How does it work?

• Physical memory is a finite resource
  on a computer.
• Only so many processes can fit into physical
  memory at any one moment in time.
• New processes need some free memory space.
• Therefore, pages must be removed from
  RAM to free up space for other pages.
  A page must first be copied to the swap space.

Swap-4 UXIE-SUPPORTvB

Prior to 10.00, all virtual memory was handled using swapping, that is, a low priority program
was completely swapped out to make room for another process. When processes were a few
megs, this made sense. But today, processes grow to hundreds of megabytes and to swap one of
these monsters to allow a login or a shell to run doesn't make any sense.
So starting with 10.00, swap is replaced by deactivation and paging. This means that a low
priority process will be deactivated when memory pressure is high. Once deactivated, the kernel
will begin paging deactivated processes to the swap area. The key is that not the entire process is
moved--just a few pages, enough to allow the higher priority process to run. When the deactivated
process needs to run, only a few pages are needed from the swap area.
swapmem_on=1 introduces the concept of over-allocation or additive virtual memory. When this
setting is not 0, virtual memory is calculated as:
Virtual Memory = Total Swap + (RAM * 0.75)
In other words, rather than requiring every process to be mapped or reserved in the swap area, up
to 75% of RAM is assumed to be occupied with processes that won't need to be mapped directly. If
the system runs out of swap space, the processes already in RAM will remain there as if locked
into RAM. Once some swap space is free, normal paging continues.
Now you don't need swap of at least 1 x RAM in order to run the system. Virtual memory becomes
larger by the addition of 75% of RAM. For example, with 4000 MB of RAM and 100 MB of
swap, virtual memory is 4000 * 0.75 + 100 = 3100 MB. Most large
RAM systems are designed to keep all the processes in RAM to avoid paging (which really
impacts performance).
With enough RAM, there is no need to allocate a large amount of swap space. On the other hand,
some processes like Oracle, Sybase, SAP, I2 and other large database systems may require 10-30
GBytes of RAM when fully operational.
In those cases, add swap space to your available RAM*0.75 until you have enough. Then add
more RAM if performance is impacted by excessive paging.
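
The sizing rule above is easy to check with a short Python calculation. The function below simply applies the formula from the text; its name and parameters are chosen for the example, and sizes are in megabytes.

```python
def virtual_memory_mb(swap_mb: float, ram_mb: float, swapmem_on: int = 1) -> float:
    """Virtual memory = total swap + 75% of RAM when swapmem_on is not 0."""
    if swapmem_on:
        return swap_mb + ram_mb * 0.75
    return float(swap_mb)


# Example from the text: 4000 MB of RAM and 100 MB of swap.
print(virtual_memory_mb(100, 4000))                # 3100.0
print(virtual_memory_mb(100, 4000, swapmem_on=0))  # 100.0
```

Comparing the two results shows how much of the usable virtual memory comes from RAM when swapmem_on is enabled on a system with little device swap.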


What’s the SWAPPER?

• Manages physical memory resources and swap areas.
• Maintains a minimum of free memory.
• Activates and deactivates processes.
• Allocates swap area at process creation (demand paging).

When available RAM is very low (below the "minfree" threshold), the swapper begins
deactivating processes. When a process is targeted for deactivation, that process is removed from
the run queue. Since the process can no longer execute, the pages associated with
the process are less likely to be referenced. The normal paging mechanism (vhand) will then be
more likely to push a deactivated process's pages out to the "swap" device.

HP considered changing the commands and documentation to match, but decided against it.
The term "swapper" is deeply ingrained in everyone's minds, and the swapper still exists even
though it no longer does what its name implies.

HP-UX, like other versions of Unix, uses virtual memory to load processes into memory. In
simple terms, virtual memory consists of two parts: physical memory (RAM) and swap. The
physical memory is where the programs are running, and swap is the "spill over". Swap devices
are usually parts of a physical hard drive. Swap allows the total size of all processes to exceed
the amount of physical RAM, and can be allocated as needed.


QUESTION

When does SWAPPER start?


The swapper is the first process to start. It must initialize
memory before the init process can start. Its process ID is 0 and it is always
running in the background.


What can be used as SWAP?

• Logical volume
–Raw lvol (reserved for swapping)
–Local only
–Fast (direct access with large I/O)
• File system swap
–Slower
–Local or remote
–Dynamic
• Pseudo swap
–Memory swap
–Can be turned off (swapmem_on=0)

At least one device MUST be configured.

The logical volume used as swap must NOT contain a file system.
Swap cannot be modified or removed without rebooting.
If any paging area becomes unavailable while the system is running, for example if a network failure
occurs while paging to a remote system, the system will immediately halt.
Pseudo swap is the exception to the rule. Pseudo swap was designed to allow a system administrator
to take advantage of systems with large amounts of physical RAM without configuring large swap areas.
Pseudo swap is not a substitute for device swap, but an enhancement to swap. When the system boots,
the amount of pseudo swap is calculated.
It is 75% of physical memory, and this percentage is not a tuneable kernel parameter.
The kernel sees this enhancement as additional swap area that it can allocate when spawning new
processes. The system uses pseudo swap only for reserving space and does not page processes in
and out of pseudo swap. If processes need to be paged out of physical memory, the
kernel pages to device or file system swap. Pseudo swap is turned on by default and can be turned
off by setting the kernel parameter swapmem_on to 0.
File system swap, unlike device swap, is a file system that not only supports files and their data
structures, but also has space available for swapping.
File system swap is a form of secondary swap. It can be configured dynamically. File system
swap allows a process to use an existing file system if it needs more than the designated device swap
space. File system swap is used only when device swap space is insufficient to meet demand-paging
needs. File system swap consumes a variable amount of space because it only uses that portion of a
file system that it needs. You can limit file system swap to a fixed size to prevent it from consuming
too much space.
When file system swap is enabled in a file system, a directory called /paging is created under the root
directory of the file system. A file is created for each swapchunk used in the file system. By default, a
swapchunk is 2 MB. Note that once a file system has been enabled for file system swap, it isn't
possible to unmount that file system until the swap is disabled at the next system reboot.


How is SWAP configured?

• Default SWAP
–lvol2 is the default SWAP
–Memory is configured
• Auxiliary swap
–Logical volume
• File System swap
–configured with /etc/fstab file during boot
• runtime configuration
–with swapon command


Pseudo swap is enabled through the tunable kernel parameter called swapmem_on. If the value
for swapmem_on is 1, then pseudo swap is turned on or enabled. The percentage of physical
memory that pseudo swap adds to swap_avail is not a tunable kernel parameter and is always
75%. This information is valid for all versions of HP-UX 10.X and 11.X.
All HP-UX systems require that a process using "N" pages of memory reserve "N" pages of swap
space.
With very large memory systems, it becomes less desirable to have enormous amounts of disk
space set aside for swap. Using pseudo swap, this requirement is relaxed.
Pseudo swap in essence deceives the system into allowing more processes to run without the need
to reserve more swap space on disk after the physical swap space is exhausted.


How does swapon work?

• Enables swap devices dynamically.
• All entries in /etc/fstab are automatically set up at boot.
• New entries in /etc/fstab are activated by:
#swapon -a
• A swap device cannot be removed without a reboot.
• A file system used for swap cannot be unmounted.

Both file system and device swap can be enabled from the command line using the "swapon(1M)" command.
The examples below show several common uses of "swapon".
This example enables the /dev/vg01/myswap logical volume for use as device swap. The entire logical volume
will be claimed as swap, so it will no longer be available for use as a file system. If the logical
volume contained a file system in the past, you may need to use the -f option to "force" an overwrite of the
old file system structures.
$ swapon /dev/vg01/myswap

This example enables device swap on the whole disk /dev/dsk/c0t2d0. If the disk contains a file system that
was created with "newfs -R 200 /dev/rdsk/c0t2d0", you can preserve the file system and
simply enable swap on the space reserved at the end of the disk by including the "-e" option on
"swapon". If you wish to overwrite the file system on the disk, use the "-f" force option instead.
$ swapon /dev/dsk/c0t2d0

This example enables the file system mounted on /myfs2 for use as file system swap. The "-p" option sets the
priority for the swap area to 4, and the "-l" option ensures that "vhand" can take no more than 80 MB from the file
system for use as swap.
Note: You cannot unmount a file system while the file system is activated for use as file system swap.
$ swapon -p 4 -l 80M /myfs2

All swap areas are automatically deactivated at shutdown. To ensure that a swap area is automatically
enabled at the next system boot, it must be added to the /etc/fstab file. The syntax for swap entries in
/etc/fstab is described later. Issuing "swapon -a" immediately activates all swap entries in /etc/fstab.
$ swapon -a


QUESTION

#swapon -p 4 -l 10M /home


This enables a swap area (max 10MB) in the /home file system
with a priority of 4.

After a reboot is this SWAP device still active?


This command creates a dynamic swap area.

No. After a reboot, this swap area is no longer available unless it has also been added to /etc/fstab.


How to verify the SWAP?


Use the swapinfo command

# swapinfo -t

You will find:

•dev       device swap
•reserve   swap allocated by processes
•memory    pseudo swap
•localfs   local file system

After configuring one or more swap areas on your system, monitor the use of those swap areas over time
using "swapinfo(1M)". The "swapinfo" command lists the configured swap areas and reports what percentage
of each is currently in use. If you are running low on swap space, you may need to configure an additional
swap area to ensure that your users are able to run the applications they need.
# swapinfo -tam
             Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev         128      10     118    8%       0       -    1  /dev/vg00/lvol2
localfs      60       0      60    0%      60       0    4  /var/paging
reserve       -      52     -52
memory       91      68      23   75%
total       279     130     149   47%       -       0    -

The "memory" line shows pseudo swap usage, and tends to be the most confusing part of the swapinfo
output. Pseudo swap is only used for reserving space for processes. In this example, there is a total of 91 megs of
pseudo swap configured, and of that 91 megs, 68 megs are being used by processes running
in memory. The remaining 23 megs of pseudo swap are unused. One of the reasons why pseudo
swap is so confusing is that pseudo swap usage does not degrade system performance the way device swap or
filesystem swap usage does. In other words, system performance would be the same on a system that had 3% or 99%
of pseudo swap used. It is usually recommended that the "memory" line simply be ignored when
looking at swapinfo.
The "reserve" line only deals with the amount of swap being used for reserving space for processes in device and
filesystem swap areas. In this example, we have a total of 188 megs of combined device and filesystem
swap (128 + 60) and of that, only 52 megs are being used for reservations. If you add the megs
used on the memory and reserve lines, we have 120 (52 + 68) megs of swap allocated to reserving space for active
processes. So these two lines account for about 92% of the 130 megs of swap in use, and that space is only
reserved for running processes.
The "localfs" line shows how much space the system will use for the filesystem swap that was
configured on /var. An interesting thing about that line is the priority of the swap, which is 4. This means that
all of device swap (/dev/vg00/lvol2), which is set to priority 1, will be used before this filesystem swap area
is used.
The "dev" line is one of the most important pieces of information that the swapinfo command can show. If
the percent used value is greater than 0, then the system is paging. This is a clear sign that there is not
enough physical RAM installed on the system. There are only two ways to make a system stop paging:
install more physical memory, or reduce the number of processes running on the system. The
dev line in this example was altered to show what the output would look like if the system was
paging. It is unlikely that a system would page processes to the swap area if it still had 149 megs of
free swap space left.
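
The reading of the swapinfo lines described above can be sketched in Python. The sample text is the "swapinfo -tam" output from this page; the parser, its function names, and the list of line types are assumptions made for illustration and only cover the line types shown in this handout.

```python
# Line types that appear in the handout's swapinfo examples (assumed list).
KNOWN_TYPES = {"dev", "localfs", "reserve", "memory", "total"}

SAMPLE = """\
             Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev         128      10     118    8%       0       -    1  /dev/vg00/lvol2
localfs      60       0      60    0%      60       0    4  /var/paging
reserve       -      52     -52
memory       91      68      23   75%
total       279     130     149   47%       -       0    -
"""


def _to_int(field):
    """Convert a numeric column to int; swapinfo prints '-' for 'not applicable'."""
    return None if field == "-" else int(field)


def parse_swapinfo(text):
    """Return one dict (type, avail, used, free) per data line; header lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] not in KNOWN_TYPES:
            continue
        rows.append({
            "type": fields[0],
            "avail": _to_int(fields[1]),
            "used": _to_int(fields[2]),
            "free": _to_int(fields[3]),
        })
    return rows


def is_paging(rows):
    """True if any device or file system swap area actually holds paged-out data."""
    return any(r["type"] in ("dev", "localfs") and r["used"] > 0 for r in rows)


rows = parse_swapinfo(SAMPLE)
# Space that is only reserved: the "reserve" and "memory" lines together.
reserved_mb = sum(r["used"] for r in rows if r["type"] in ("reserve", "memory"))
print(is_paging(rows), reserved_mb)
```

The check deliberately ignores the "memory" line when deciding whether the system is paging, following the recommendation in the notes.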


SWAPINFO example
# swapinfo -t
             Kb      Kb      Kb   PCT  START/      Kb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev      303104   16184  286920    5%       0       -    1  /dev/vg00/lvol2
reserve       -   85856  -85856
memory    88952   54808   34144   62%
total    392056  156848  235208   40%       -       0    -
#

Reserve: swap space still allocated by SWAPPER over device swap.

# swapinfo -t
             Kb      Kb      Kb   PCT  START/      Kb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev      303104   16184  286920    5%       0       -    1  /dev/vg00/lvol2
localfs   10240       0   10240    0%   10240       0    4  /home/paging
reserve       -   85428  -85428
memory    88952   60116   28836   68%
total    402296  161728  240568   40%       -       0    -
#

The "memory" line shows all available memory on the system. It reflects not only memory that is
used for pseudo swap but also all other uses of memory, including buffer cache. Therefore you
will see values for "Mb USED" and "PCT USED" even when no memory has been reserved for pseudo swap.
It is important to note that swap space is allocated from the swap area with the lowest priority number first, and
pseudo swap is always assigned the highest priority number. Therefore, it will always be used last, after all other
swap areas are full.


Swap size limitations

Max SWAP size = maxswapchunks * swchunk * DEV_BSIZE

                     default     MAX
nswapdev                  10      25
nswapfs                   10      25
maxswapchunks            256   16384
swchunk                 2048   16384
DEV_BSIZE (bytes)       1024

So the default maximum SWAP is 512 MB.

Several kernel tuneable parameters limit the amount of swap that can be made
available.

The default maximum amount of swap space you can configure, for both device
swap and file system swap combined, is approximately 512 MB. The tuneable
system parameter, "maxswapchunks", controls this maximum.

This parameter (default value of 256) limits the number of swap space chunks.
The default size of each chunk of swap space is 2 MB. The size of a swap chunk
can be modified with the "swchunk" kernel tuneable parameter.

The system parameter "nswapdev" in /stand/system sets the maximum number of
dynamically configured swap devices at 10 (default). The maximum is 25. Using more
than "nswapdev" swap devices requires a kernel regeneration.

The system parameter "nswapfs" determines the maximum number of file systems you can
enable for swap. The default is 10 and the maximum is 25.
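
The limit from the slide formula can be worked out as follows. This is a sketch of the arithmetic only, using the default values from the slide; the function names are hypothetical.

```python
DEV_BSIZE = 1024  # bytes per block, as given on the slide


def max_swap_bytes(maxswapchunks=256, swchunk=2048):
    """Maximum configurable swap: maxswapchunks * swchunk * DEV_BSIZE."""
    return maxswapchunks * swchunk * DEV_BSIZE


def chunks_needed(total_swap_mb, swchunk=2048):
    """Smallest maxswapchunks value able to cover total_swap_mb of swap."""
    chunk_bytes = swchunk * DEV_BSIZE
    total_bytes = total_swap_mb * 1024 * 1024
    return -(-total_bytes // chunk_bytes)  # ceiling division


print(max_swap_bytes() // 2**20)   # default limit in MB
print(chunks_needed(20 * 1024))    # chunks required for 20 GB of swap
```

A result from chunks_needed() that is larger than the current maxswapchunks value indicates that the kernel parameter would have to be raised before that much swap could be used.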


QUESTION

What happens if your swap is fully used?

Are you able to start a new process?

If the swap is fully used, the swapper can NOT find any area for new processes.
Your system appears to hang, waiting for swap space to become free.
Before reaching this situation, you will probably have seen very bad performance.


What should be the SWAP size?


SWAPPER will allocate the needed swap for your process.
If there is NOT enough free swap, the process is
NOT started.

You need to know:

• How much swap is recommended per application?
• How many applications will run simultaneously?
• Are lots of system resources used (e.g. NFS)?

The rule of thumb of twice the physical memory is generally enough.

Keep in mind that, if you're using applications like SAP, your swap requirements could increase
considerably; for example, the latest SAP version requires up to 20 GB of swap space.
Depending on how much additional swap you add, you may have to increase 'maxswapchunks'
in the kernel.
If you use 'sam' to add a secondary swap, the program will automatically do that for you and build
a new kernel. If your system has insufficient main memory for all the information it needs to
work with, it will move pages of information to your swap area or swap entire processes to your
swap area. Pages that were most recently used are kept in main memory while those not recently
used will be the first moved out of main memory.

Many system administrators spend an inordinate amount of time trying to determine what is the
right amount of swap space for their system. This is not a parameter you want to leave to a rule
of thumb. You can get a good estimate of the amount of swap you require by considering the
following three factors:

1/ How much swap is recommended by the applications you run?
Use the swap size recommended by your applications.

2/ How many applications will you run simultaneously? Sum the swap space
recommended for each application.

3/ Will you be using substantial system resources on peripheral
functionality such as NFS? The nature of NFS is to provide access to file systems,
some of which may be very large, so this may have an impact on your swap space
requirements.
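
The first two factors can be turned into a simple estimate. The sketch below sums per-application recommendations and compares the total with the configured swap plus pseudo swap (75% of RAM, as described earlier in this module). The application names and figures are invented for illustration, and the helper names are hypothetical.

```python
def required_swap_mb(app_swap_mb):
    """Sum the swap recommended for every application that runs simultaneously."""
    return sum(app_swap_mb.values())


def swap_shortfall_mb(app_swap_mb, device_swap_mb, ram_mb=0, swapmem_on=0):
    """Return how many MB of swap are still missing (0 if there is enough)."""
    pseudo_mb = ram_mb * 0.75 if swapmem_on else 0
    return max(0, required_swap_mb(app_swap_mb) - (device_swap_mb + pseudo_mb))


# Invented example figures, in MB.
apps = {"database": 4096, "app_server": 2048, "batch": 1024}

print(required_swap_mb(apps))
print(swap_shortfall_mb(apps, device_swap_mb=4096))
print(swap_shortfall_mb(apps, device_swap_mb=4096, ram_mb=2048, swapmem_on=1))
```

The third factor (system resources such as NFS) is not modelled here; any allowance for it would have to be added to the total by hand.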


QUIZ 18

Is it possible to define a swap area that is NOT in vg00?

[ ] YES
[ ] MAYBE
[ ] NO
[ ] DON'T KNOW

YES
But you MUST configure it in the /etc/fstab file.


QUIZ 19

Where is the primary swap value stored?

[ ] /stand/vmunix
[ ] BDRA
[ ] /etc/fstab
[ ] /etc/lvmtab

The primary swap information is stored in the kernel and in the BDRA.
There are two right answers.


QUIZ 20

Swap areas with the same priority are used:

[ ] In the order they have been created
[ ] In an interleaved fashion
[ ] In the order they appear in /etc/fstab
[ ] It is only used for checking

The system uses swap areas of the same priority in an interleaved fashion.
This means it takes a little from one, then a little from the next, as on a striped disk.


QUIZ 21

In this swapinfo example, is the system paging?

[ ] Yes
[ ] No

NO.
One dev line with 128 MB total swap space.
50 MB reserved, so 50 MB of processes running and 50 MB of swap
reserved, leaving 78 MB free.
0% used on the dev line, so there is no paging.

No memory problems here!


QUIZ 22

In this swapinfo example, is the system in trouble?

[ ] Yes
[ ] No

YES.
The dev line shows a 128 MB swap device. 44 MB has been paged to this device.
The reserve line shows 80 MB of memory has been used and therefore reserved on
the device swap.
This system is in trouble! 97% of the virtual memory is gone. If they try to run
more processes they will probably get out-of-space errors.
44 MB actually written to by paging + 80 MB reserved = 124 MB of 128 MB
total available.

What could you recommend to this customer?

If they don't want to page, they should install more RAM.

If they don't mind paging and just want to be able to run
more processes, they should turn on pseudo swap and/or add more
swap devices.


QUIZ 23

In this swapinfo example, is the system able to use more virtual memory?

[ ] Yes
[ ] No

YES.
The system is out of swap, but it can use more virtual memory.
The screen shows 177 MB worth of processes:

83 MB paged
45 MB reserved on device
49 MB used on pseudo swap

But the memory line shows usage other than pseudo swap (NFS, system
resources) so it is not a good indicator.

The important thing to understand is that with pseudo swap turned on, the device
swap can be used for paging rather than for reservation.


UXIE-SUPPORT
8 modules in virtual class: Module 1 Boot
Module 2 Recovery
Module 3 LVM
Module 4 Mirroring
Module 5 SD-UX Patch
Module 6 Swap Dump
Module 7 File Systems
Module 8 Ignite


OBJECTIVES

Customer Engineers are responsible for configuring
the dump process so as to minimize customer downtime
while still retrieving enough information to fix the
problem.

Reasons for having a DUMP?

• To have a snapshot of the computer's memory
at the time of a failure.

• To analyse it later in order to determine the cause
of the crash.

When your system crashes, it is important to know why, so that you can take
actions to hopefully prevent it from happening again. Sometimes, it is easy to
determine why: for example, if somebody trips over the cable connecting your
computer to the disk containing your root file system (disconnecting the disk).
At other times, the cause of the crash might not be so obvious. In extreme cases,
you might want or need to analyse a snapshot of the computer's memory at the
time of the crash, in order to determine the cause of the crash.


Who starts a (core) DUMP?

• HP-UX after a crash:
–HPMC (automatic dump)
–PANIC (automatic dump)

• USER (operator)
–Hang mode (TOC, transfer of control)

HPMC (High Priority Machine Check)   Hardware problem
PANIC                                Software bug
HANG                                 No response from the system

After the panic, HPMC, or TOC, the system will start writing out its memory to the dump area
reserved on the disk for this purpose.
This is performed by a low-level routine in the kernel.
Once the kernel has finished dumping, the system is rebooted.
After the system reboots, the savecore (1m) command is called by the /sbin/rc1.d/S440savecore
startup script using the /etc/rc.config.d/savecore configuration file.
This table shows the various structures that control saving the core dump:

                                           10.20            11.X
The memory image directory                 /var/adm/crash   /var/adm/crash
Contains kernel file that was running      vmunix           vmunix
Contains dump information                  INDEX            INDEX
Dump count, incremental                    bounds           bounds
Core chunks                                core.x.y         image.x.y
Directory that contains the memory
chunks and the vmunix file                 core.x           crash.x

NOTE These are large files. Every time the system dumps, another pair will be
created in 10.20 (core.1, vmunix, and so on) and in 11.X (crash.1, vmunix, and so on).


What is a (core) DUMP?

• Image of the physical memory.

• Full memory image or portions of it (selective dump).

• Copied to predefined dump devices.

• Early dump

• Partial dump

When the system crashes, HP-UX tries to save the image of physical memory, or certain portions of it, to
predefined locations called dump devices. Then, when you next reboot the system, a special utility copies
the memory image from the dump devices to the HP-UX file system area. Once there, you can analyse the
memory image with a debugger or save it to tape for shipment to someone else for analysis.
Prior to HP-UX Release 11.0, devices to be used as dump devices had to be defined in the kernel
configuration, and as of Release 11.0 they still can be. However, beginning with Release 11.0, a new, more
flexible method for defining dump devices is available. Beginning with Release 11.0, there are three places
where dump devices are configured:

• In the kernel (same as releases prior to Release 11.0)
• During system initialization when the initialization script for crashconf runs (and reads entries
from the /etc/fstab file)
• During runtime, by an operator or administrator manually running the /sbin/crashconf
command.

An early dump corresponds to a crash that occurs in the ISL part of the boot process. The system is
unable to manage the dump process as it cannot initialize the dump devices at this point. To troubleshoot,
it is possible to force the system to create a dump on a different disk drive on the same bus. To use disk
c0t4d0 on hardware path 10/0.4.0:

Use the command: ISL>hpux -aD (10/0.4.0)/stand/vmunix

The dump is saved with the command: #savecrash -D /dev/dsk/c0t4d0 -O 0

A partial dump is a dump for which there was not enough dump device space to hold the full memory dump. Only
part of the memory dump has been saved. You can still analyze it and find some information if you
are lucky enough that the data describing the problem reached the dump device.


Core Dump Save Cycle


[Diagram: System crash → (1) Memory is written to the dump areas → (2) the dump areas are saved to files]

1 Initiated by HP-UX or TOC (button or command)

2 The savecrash script is configured by the /etc/rc.config.d/savecrash file.
The crashutil command will complete a partial savecrash backup.

An HP-UX system crash is an unusual event. When a system panic occurs it means that HP-UX
encountered a condition that it did not know how to handle (or could not handle). Sometimes you
know right away what caused the crash (for example, a power failure, or a forklift backed into the
disk array). Other times the cause is not readily apparent. It is for this reason that HP-UX
is equipped with a dump procedure to capture the contents of memory at the time of the crash
for later analysis.
HP-UX will dump as much of your computer's physical memory contents to the dump devices as
dump space permits. A panic message will also be written to the system console and logged in the
file /var/adm/shutdownlog (or /etc/shutdownlog), if shutdownlog exists.

Savecrash can be disabled at boot (SAVECRASH=0) and run later from the shell,
but any dump held on swap devices will be overwritten by HP-UX during the boot.

#savecrash -p saves only the part of the dump held on swap devices, so that dedicated dump devices
can be backed up later (equivalent to SAVE_PART=1).


Where to configure dump devices?

•In the KERNEL             SAM, /stand/system file, #crashconf

dump default      dump lvol

dump none         dump hardwarepath <options>

•At initialisation time    the initialisation script for crashconf
reads entries in /etc/fstab

•During runtime            #crashconf command

Beginning with Release 11.0, there are three places where dump devices are configured:
1/ In the kernel (same as releases prior to Release 11.0), known as easy
configuration.
dump default The default (obviously!), which tells the kernel to use the
primary swap as its dump area.
dump lvol Tells the kernel to look in the BDRA for dump logical
volumes.
dump hardwarepath <option> Tells the kernel to dump on a specific
disk device. The <option> field allows a disk section to be specified for
backwards compatibility. Otherwise the whole disk is used.

NOTE If the kernel file needs to be modified, the system will have to be rebooted
in order for the change to take effect.

2/ During system initialisation when the initialisation script for crashconf
runs (and reads entries from the /etc/fstab file).
3/ During runtime, by an operator or administrator manually running the
/sbin/crashconf command.
SAM can configure crashconf with 3 parameters:
• Enable crashconf
• Allow crashconf to scan fstab
• Allow crashconf to override the kernel


Main utilities

•savecrash   Script used to save the dump in files,
configured by /etc/rc.config.d/savecrash

•crashconf   Script used to change the dump configuration,
configured by /etc/rc.config.d/crashconf

•crashutil   Utility used to save the dump in a different format

Savecrash defines some variables used to configure the script:
SAVECRASH        0 disable, 1 enable
SAVECRASH_DIR    directory where the dump files are stored
SWAP_LEVEL       SWAPEACH  swapon called after each swap device is backed up
                 NOSWAP    swapon never called
                 SWAPEND   swapon called at the end of the backup
SAVE_PART        1 save only the swap dump devices
Crashconf defines some variables used to configure the script:
CRASHCONF_ENABLED      1 enable
CRASHCONF_READ_FSTAB   1 enable
CRASHCONF_REPLACE      1 override the kernel
Savecore reads the dump areas to create the core.N directory and vmunix.N file in the filesystem.
If the primary swap area is not used as a dump area, then savecore clears the dump areas in background. If
primary swap is used as a dump area then it is cleared first. Remaining dump areas are cleared in background.
If the dump areas are also used for swap, then as they are cleared they are reenabled for swap automatically.

Savecore (10.20) is controlled by the variables found in /etc/rc.config.d/savecore:
SAVECORE=1                      Enable savecore
SAVECORE_DIR=/var/adm/crash

Savecrash (11.X) is controlled by the variables found in /etc/rc.config.d/savecrash:
SAVECRASH=1                     Enable savecrash
SAVECRASH_DIR=/var/adm/crash

Change these variables if the core files should be created elsewhere.
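
To check how these variables are set on a given system, the NAME=VALUE lines of the configuration file can be read with a short script. The sketch below assumes the file contains simple shell-style assignments; the sample contents are hypothetical and use only variable names mentioned in these notes.

```python
def parse_rc_config(text):
    """Return a dict of NAME=VALUE assignments, ignoring comments and blank lines."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings


# Hypothetical file contents for illustration only.
SAMPLE = """\
# savecrash configuration
SAVECRASH=1
SAVECRASH_DIR=/var/adm/crash
SAVE_PART=1   # save only the swap dump devices
"""

config = parse_rc_config(SAMPLE)
savecrash_enabled = config.get("SAVECRASH") == "1"
print(savecrash_enabled, config["SAVECRASH_DIR"])
```

On a real system the text would be read from /etc/rc.config.d/savecrash (or /etc/rc.config.d/savecore on 10.20) instead of the sample string.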


How to configure a DUMP?


3 MAIN CRITERIA

• System recovery time.
Wait for the end of the process before starting the boot
(non-swap devices can be backed up later in the background)
• Crash information integrity.
Full dump
• Disk space needs.
Selective dump, compressed dump

The new capabilities give you a lot more flexibility, but you need to make some important decisions
regarding how you will configure your system dumps. There are three main criteria to consider:
System recovery time
Crash information integrity
Disk space needs

What’s a selective dump:


Physical memory is broken down into 8 different page classifications, displayed using the
crashconf command:

UNUSED
KCODE (kernel code)
KSDATA (kernel static data)
FSDATA (file system metadata)
BCACHE (other buffer cache)
KDDATA (dynamic kernel data)
USTACK (user stack)
USERPG (other user pages)

Default configuration is represented in Bold characters.


Dump time user options (10s delay)

Default dump type can be overridden (at the reboot)

(Full or selective dump)

Dump already in progress can be interrupted (aborted)

(using the escape key)

Enhanced status display

(dump progress shown)


If you are running HP-UX Release 11.0 or later, an operator at the system console at the time of the system
crash will see the panic message and a message similar to the following:

*** A system crash has occurred. (See the above messages for details.)
*** The system is now preparing to dump physical memory to disk, for use
*** in debugging the crash.
*** The dump will be a SELECTIVE dump: 21 of 128 megabytes.
*** To change this dump type, press any key within 10 seconds.
[ A Key is Pressed ]
*** Select one of the following dump types, by pressing the corresponding key:
N) There will be NO DUMP performed.
S) The dump will be a SELECTIVE dump: 21 of 128 megabytes.
F) The dump will be a FULL dump of 128 megabytes.
*** Enter your selection now.

If the reason for the system crash is known, and a dump is not needed, the operator can override any dump
device definitions by entering N (for no dump) at the system console within the 10-second override period. If
disk space is limited, but the operator feels that a dump is important, the operator can enter S (for selective
dump) regardless of the currently defined dump level.


Dump devices

Primary Swap (lvol2)

Auxiliary Swap (secondary swap)

file system (define in /etc/fstab)

Dedicated Logical Volume (any lvol)

Hardware path

I/O card firmware limitations:
first 2GB or the first 4GB

A dump device cannot be larger than 2 GB, unless patch PHCO_17275 is installed.

The I/O card firmware limitations that restrict dumps to the first 2GB or 4GB of a disk are being
removed over time. New cards do not have these limits. Look for firmware upgrades for your
older cards.
In the case of logical volumes, it is not necessary to define each volume that you want to use as a
dump device. If you want to dump to logical volumes, the logical volumes must meet all of the
following requirements:

• Each logical volume to be used as a dump device must be part of the root
volume group (vg00). For details on configuring logical volumes as
kernel dump devices, see the lvlnboot(1M) manpage.

• The logical volumes to be used as dump devices must be contiguous (no
disk striping or bad-block reallocation is permitted for dump logical
volumes).

• The logical volume cannot be used for file system storage, because the
whole logical volume will be used.

To use logical volumes for dump
devices (regardless of how many logical volumes you want to use),
include the following dump statement in the system file:

dump lvol

You can define entries in the fstab file to activate dump devices during the HP-UX initialization
(boot) process, or when crashconf reads the file.

NOTE Unlike dump device definitions built into the kernel, with run time dump
definitions you can use logical volumes from volume groups other than the root volume group.


Default dump process

[Diagram: crash → (1) memory is written to the dump devices: primary swap (lvol2), auxiliary swap, and a dedicated dump device → (2) SAVECRASH copies the dump to /var/adm/crash/crash.x (INDEX, vmunix, image.1.1, image.1.2). Legend: default (foreground), default (background).]

Swapon is called once lvol2 has been backed up.

Savecrash is a script, which is configured by the file /etc/rc.config.d/savecrash, and can be disabled.
After the operator is given a chance to override the current dump level, or the 10-second override
period expires, HP-UX will write the physical memory contents to the dump devices until one of
the following conditions is true:
- The entire contents of memory are dumped (if a full dump was configured or requested by the
  operator).
- The entire contents of selected memory pages are dumped (if a selective dump was configured or
  requested by the operator).
- Configured dump device space is exhausted.
Depending on the amount of memory being dumped, this process can take from a few seconds to
hours.

NOTE While the dump is occurring, status messages on the system console will
indicate the dump’s progress.


Speed up the boot process

The primary swap is not used.
savecrash has been disabled.

[Figure: dump process with savecrash disabled. Diagram labels: crash; steps 1 and 2; savecrash disabled; step 3: # savecrash -p &; step 4: # crashutil. The dump devices shown are auxiliary swap and a dedicated device, each saved to /var/adm/crash.x. Legend: Default (foreground), Default (background).]

The risk is that a dump held on secondary swap is erased, so it is better not to use secondary swap as a dump device if there is enough disk space. If a new
crash occurs, the new dump will overwrite the previous one.
The benefit is a faster reboot and earlier availability of the system after a crash.
If the primary swap does not contain a core dump, the system can use primary swap immediately. If
additional dump devices are configured, the dump is retrieved from them in the background. Dump logical
volumes and disk sections can coexist; however, the easiest way to do this is with logical volumes. Total
dump area space must be physical memory plus kernel size. Dump logical volumes must be in the root volume
group: this means that root must be on a logical volume too. A dump logical volume cannot be greater than 2
GB in size and must lie in the first 2GB of the physical volume (ISL>hpux loader restriction).
Dump volumes must use contiguous extent allocation and no bad block relocation (-C y -r n). Don’t use
primary swap for dump if a fast reboot is needed.

Using crashutil to Complete the Saving of a Dump


If you are using devices for both paging (swapping) and dumping, it is very important not to disable
savecrash processing at boot time. If you do, there is a chance that the memory image in your dump area will
be overwritten by normal paging activity. If, however, you have separate dump and paging devices (no single
device is used for both purposes), you can delay the copying of the memory image to the HP-UX file system
area in order to speed up the boot process, to get your system up and running as soon as possible. You do this
by editing the file /etc/rc.config.d/savecrash and setting the environment variable called SAVECRASH=0.
If you have delayed the copying of the physical memory image from the dump devices to the HP-UX file
system area in this way, run savecrash manually to do the copy when your system is running and when you
have made enough room to hold the copy in your HP-UX file system area. If you chose to do a partial save
by leaving the SAVECRASH environment variable set to 1 and by setting the environment variable
SAVE_PART=1 (in the file /etc/rc.config.d/savecrash), the only pages that were copied to your HP-UX file
system area during the boot process are those that were on paging devices. Pages residing on dedicated dump
devices are still there. To copy the remaining pages to the HP-UX file system area when your system is
running again, use the crashutil command. See the crashutil (1M) manpage for details.


What’s the purpose of crashconf?

- Runtime (dynamic) configuration, no reboot
- Logical volumes from non-root VGs can be used
- Dump devices can be configured in /etc/fstab
- Configured in /etc/rc.config.d/crashconf
- Displays the current configuration
  (last displayed device is the first used)
- Determines the amount of dump space needed

Crashconf options:
CRASHCONF_ENABLED=1        run crashconf at startup
CRASH_INCLUDED_PAGES=""    memory page classes to include
CRASH_EXCLUDED_PAGES=""    memory page classes to exclude
CRASHCONF_READ_FSTAB=1     read entries from /etc/fstab
CRASHCONF_REPLACE=0        add to initial (kernel) config

In some circumstances, such as when you are using the primary paging device along with other
devices, as a dump device, you care about what order they are dumped to following a system
crash. In this way you can minimize the chances that important dump information will be
overwritten by paging activity during the subsequent reboot of your computer.
The rule is simple to remember:
No matter how the list of currently active dump devices is built (from a kernel build, from the
/etc/fstab file, from use of the crashconf command, or any combination of these) dump devices are
used (dumped to) in the reverse order from which they were defined. In other words, the last
dump device in the list is the first one used, and the first device in the list is the last one used.
Therefore, if you have to use a device for both paging and dumping, it is best to put it early in the
list of dump devices so that other dump devices are used first.
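
The ordering rule can be illustrated with a short Python sketch. It only models the rule as stated above and does not call crashconf; the function name `dump_order` and the two dedicated volume names are assumptions made for illustration.

```python
def dump_order(defined_devices):
    """Return the devices in the order the dump is written to them.

    Models the rule stated above: dump devices are used in the reverse
    of the order in which they were defined.
    """
    return list(reversed(defined_devices))


# Hypothetical configuration: the primary swap volume is defined first,
# two dedicated dump volumes (illustrative names) are defined after it.
defined = ["/dev/vg00/lvol2", "/dev/vg00/lvdump1", "/dev/vg00/lvdump2"]

for position, device in enumerate(dump_order(defined), start=1):
    print(position, device)
```

In this model the primary swap volume, defined first, is written to last, so it is only touched if the dedicated volumes fill up.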

To have crashconf replace any existing dump device definitions with the logical volume
/dev/vg00/lvol3 and the device represented by block device file /dev/dsk/c0t1d0:
/sbin/crashconf -r /dev/vg00/lvol3 /dev/dsk/c0t1d0


QUIZ 24

Where is the dump configuration stored?

- /etc/rc.config.d/savecrash
- /stand/vmunix
- /etc/rc.config.d/crashconf
- /dev/vg00/lvol2

The dump configuration is stored in the kernel, that is, in the vmunix file.

The /etc/rc.config.d/savecrash file is only the configuration file used to manage the behavior of the savecrash utility.

The /etc/rc.config.d/crashconf file is only the configuration file used to manage the behavior of the crashconf utility.

/dev/vg00/lvol2 is only the primary swap device.


QUIZ 25

Can the operator change the dump process at reboot, before the dump occurs?

- Yes
- No
- Maybe
- Don’t know

YES. The operator has the choice between full dump, selective dump and no
dump.


QUIZ 26

Do modifications made by crashconf need a reboot to take effect?

- Yes
- No
- Maybe
- Don’t know

NO. Crashconf modifications are dynamic. No reboot is needed for them to take effect.


QUIZ 27

Modifications made by crashconf do NOT need a reboot because they never override the kernel?

- Yes
- No
- Maybe
- Don’t know

NO. Although no reboot is needed, the reason given is wrong.

If crashconf cannot override the kernel configuration, its modifications are valid only
until the next reboot. To make a modification permanent, the operator must either
change the crashconf configuration file (in /etc/rc.config.d) to allow the utility
to override the kernel configuration, or use SAM to do it.

Module 7 File Systems

OBJECTIVES

Students will understand how a file system works. They will create and manage a file
system and its basic layout. They will know the commands and utilities needed to fix a
file system problem.


What is a File System?

- Hierarchical structure
- File expansion
- Structureless files
- File & device independence
- Security
- File sharing

[Figure: a directory maps file names to inode numbers (Results 0221, Orders 0412, forecast 4397, personnel 0123); the files themselves are identified by number (8014, 0123, 4397, 0412, 0221, 4935).]

One of the most powerful and attractive features of UNIX is the file system, which manages data
stored on the computer’s mass storage devices. The file system’s facilities make it easy to organize
stored information in a very natural way and to retrieve and modify it as necessary. Many of the
file system’s features were unique when UNIX was first created, and they have been so popular that
they have since been duplicated in other commercial operating systems.
UNIX keeps track of files internally by assigning each one a unique identifying number. These numbers are called
inode numbers and are used only within the UNIX system.
UNIX provides users with a way of organizing files by grouping them into directories. A directory performs
the same function as a file drawer in a filing cabinet. Internally, a directory is just a special file, which
contains a list of file names and their corresponding inode numbers.
The main file system features are:

- Hierarchical structures: Users can group together related information and efficiently
  manipulate a group of files as a unit. The resulting organization resembles the operation of
  manual filing systems.
- File expansion: Files grow dynamically, as needed, taking up the amount of mass
  storage space required to store their current content. The user is not forced to decide in
  advance how a file will grow.
- Structureless files: UNIX imposes no internal structure on a file’s contents. The
  user is free to structure and interpret the contents of a file in whatever way is appropriate.
- File & device independence: UNIX treats files and input/output devices
  identically. The same procedures and programs used to process information stored in files
  can be used to read data from a terminal, print it on a printer, or pass it to another program.
- Security: Files can be protected against unauthorized use by different users
  of the UNIX system.
- File sharing: Several application programs can access a file concurrently to read
  or update its contents. UNIX provides methods to coordinate file access and maintain
  data integrity.


Different File systems

- NFS   Network File System
- LOFS  Loopback File System
- CDFS  CD-ROM File System
- HFS   Unix File System
- VxFS  Veritas File System

NFS: NFS allows many systems to share the same files by using a client/server approach.
Since access techniques are transparent, remote file access appears similar to local file access. Both HFS
and JFS file systems can be shared with other systems using NFS. NFS provides transparent access to files anywhere
on the network. An NFS server makes a directory available to other hosts by “exporting” the directory.
An NFS client provides access to the NFS server’s directory by “mounting” the directory. To users on the
NFS client, the directory looks like part of the local file system.
LOFS: LOFS allows mounting an existing directory onto another directory. The effect is
similar to that of a symbolic link.
CDFS: Compact Disc Read-Only Memory file system. The information is virtually
permanent: you can read data from a CD, but you cannot write to one. The format of this file system
conforms to the ISO-9660 standard.
HFS: High Performance File System (or Hierarchical File System) represents the standard
implementation of the UNIX file system (UFS). Prior to HP-UX release 10.01 this was HP’s only disk-based
file system. HP’s long-term strategy is for VxFS to become the default one.
HFS continues to exist for compatibility.
JFS: The HP-UX Journaled File System, also known as the Veritas File System (VxFS), is an extent-based
journaling file system which offers fast file system recovery and on-line features such as on-line
backup, on-line resizing and on-line reorganisation. An intent log contains recent file system data structure
updates. After a failure the system checks the intent log and performs the required rollback or roll
forward.
There are two JFS products, base and online. The base JFS supports the fast recovery feature and is
included in all 10.01 and later system releases. Online JFS, also referred to as Advanced VxFS, is an
optional product which adds extensions to JFS (on-line defragmentation and reorganisation, expansion and
contraction, and backup).

As compared to HFS, VxFS is particularly useful in environments that require high performance or deal
with large volumes of data. This is because the unit of file storage, called an extent, can be multiple
blocks, allowing considerably faster I/O than with HFS.


How it works?

The real user’s data

Metadata structure: how to find the data in the file system?
- Superblock
- Inodes
- Directories

Disk space allocated to a file system is subdivided into multiple blocks. A block may be used
for one of two purposes:
Some blocks store the actual data contained in files. These data blocks account
for the majority of the blocks in most systems.
However, some blocks in every file system store the file system’s metadata. A file system’s
metadata describes the structure of the file system. The metadata structures that are common to most file
system types are described below:

Superblock: Every file system has a superblock that contains general information about the
file system. The superblock identifies the file system size and status and contains pointers to all of
the other file system metadata structures. Since the superblock contains such critical information,
HP-UX maintains multiple redundant copies of it.

Inodes: Every file has an associated inode containing the attributes of the file. The
inode identifies the file’s type, permissions, owner, group and size. A file’s inode also contains
pointers to the data blocks associated with the file. Each inode is identified by a unique inode
number within the file system.

Directories: Users & applications generally reference files by name, not by inode number.
The purpose of a directory is to make a connection between names and their associated inode
number.
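
On any POSIX system the inode attributes described above can be observed through the stat interface. The following Python sketch creates its own temporary file, so it makes no assumption about existing paths, and prints the fields that correspond to the inode contents. The helper name `describe_inode` is an assumption of this example, not part of HP-UX.

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Return a dict of inode attributes for path, read with lstat()."""
    st = os.lstat(path)
    return {
        "inode": st.st_ino,                        # inode number within the file system
        "type": stat.S_IFMT(st.st_mode),           # file type bits
        "permissions": stat.filemode(st.st_mode),  # e.g. -rw-r--r--
        "links": st.st_nlink,                      # number of names (directory entries)
        "owner": st.st_uid,
        "group": st.st_gid,
        "size": st.st_size,                        # length in bytes
    }


with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "sample.txt")
    with open(sample, "w") as handle:
        handle.write("hello\n")
    info = describe_inode(sample)
    for field, value in info.items():
        print(field, value)
```

Note that the file name itself is not among the attributes: it lives in the directory, which maps the name to the inode number.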


Physical disk drive

[Figure: a physical disk drive, showing one track, one cylinder and one cylinder group.]

/usr/sbin/mkfs size nsect ntrack blksize fragsize ncpg minfree rps nbpi

The control structures are then spread across these cylinder groups. This has the effect of
localizing the control structures with the data and thus reducing the size of most of the head
movements, thus increasing the performance.
Understanding the mechanical nature of disk drives

Another area where dramatic improvements in performance were possible was in the layout of
data. The Bell filesystem tended to lay out its data in the simplest of fashions. The UFS, on the
other hand, calculates which blocks would be the fastest to use in any given situation. To
this end, many details about the disk need to be programmed into the filesystem at creation time.
New filesystems are created by the command mkfs, which takes as its arguments many
characteristics of the disk device.

/usr/sbin/mkfs size nsect ntrack blksize fragsize ncpg minfree rps nbpi

size      The size of the filesystem, needed to set up the superblock and the rest of the filesystem correctly.
nsect     The number of sectors per track. These are actually counted in DEV_BSIZE units: all I/O to block-style
          devices such as disks is performed in DEV_BSIZE units.
ntrack    The number of tracks per cylinder, that is, the number of data surfaces in the disk drive.
blksize   The block size you have chosen for this filesystem.
fragsize  The size of the fragments to split blocks into.
ncpg      The number of cylinders per group, typically 16, but there are several limiting factors that can affect this.
minfree   A tuning parameter used to limit the fragmentation of the filesystem.
rps       In order to choose the optimum location for data, the system needs to know how fast the disk spins. This
          value is used in conjunction with another tuning parameter, rot_delay, which cannot be set explicitly
          from the mkfs command and needs to be set afterwards using tunefs.
nbpi      The size of the inode table is set as a ratio of bytes to inodes. For HP-UX 10 and 11 this defaults to a
          value of 6144; again, there are other limiting factors that will affect the actual number of inodes created.
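
As a rough illustration of the nbpi ratio, the Python sketch below estimates how many inodes a file system of a given size would receive. It deliberately ignores the other limiting factors mentioned above (such as per-cylinder-group limits), so the result is only an estimate and not what mkfs would actually create. The function name is an assumption of this example.

```python
def estimated_inodes(fs_size_bytes, nbpi=6144):
    """Estimate the inode count as file system size divided by nbpi.

    nbpi is the number of bytes of file system space per inode; 6144 is
    the default quoted above for HP-UX 10 and 11. Real mkfs results are
    further constrained by cylinder-group limits, which are ignored here.
    """
    return fs_size_bytes // nbpi


one_gb = 1024 ** 3
print(estimated_inodes(one_gb))         # default ratio
print(estimated_inodes(one_gb, 65536))  # larger ratio: fewer inodes
```

A larger nbpi suits a file system that will hold a few large files, since less space is set aside for inodes that would never be used.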


Blocks and Fragments

Fragments (default 1K)
Blocks (default 8K)

Name   Size
file1  200
file2  1K
file3  3K

Total space used: 24K. Space wasted: 19.8K.


Another change that McKusick made was to break up the relationship whereby the block size
governs both the minimum unit of disk space that can be allocated and the size used for data
transfers.
For the UFS a larger block size is used, and a block can then be divided into smaller pieces to allow
small files to be stored more efficiently. Unfortunately these smaller pieces were given the name
fragments. If the block size of a Bell filesystem were raised to 8K (the default size on HP-UX),
it could waste a large amount of disk space when handling small files.
Since Unix tends to have large numbers of small files, this sort of arrangement would be
highly wasteful of disk space.
On the other hand, a larger block size allows more data to be transferred between the disk and the
system for a given number of disk seeks.
The UFS solution allows a block to be broken into 1, 2, 4 or 8 fragments; with an 8K block these are
8K, 4K, 2K or 1K in size respectively.
Small files are then able to use these fragments, with the following restrictions:
- The fragments are only used at the end of a file.
- All the fragments are contiguous.
- All the fragments come from the same block.
- Only small files that use direct pointers (see the section on UFS disk inodes) can use fragments.
Since increasing the block size is now no longer wasteful of disk space, the block size can be
increased to allow for more efficient transfers of larger amounts of data. Of course if the size of
the pieces of data to be transferred is small then this too could be wasteful. The block size needs
to be chosen to match the application. Generally speaking where large or sequential transfers are
to be made then the larger the block size the higher the transfer speed will be.
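
The figures on the slide (three small files, 24K used, 19.8K wasted) can be reproduced with a few lines of Python. This is a simplified model in which every file occupies a whole number of allocation units; the function name `space_used` is an assumption of this example.

```python
import math

KB = 1024


def space_used(file_size, unit):
    """Disk space consumed by one file when space is allocated in
    multiples of `unit` bytes (a block, or a fragment)."""
    return max(1, math.ceil(file_size / unit)) * unit


# File sizes taken from the slide: 200 bytes, 1K and 3K.
files = {"file1": 200, "file2": 1 * KB, "file3": 3 * KB}
data_bytes = sum(files.values())

# Allocation in whole 8K blocks only (no fragments).
whole_blocks = sum(space_used(size, 8 * KB) for size in files.values())

# Allocation in 1K fragments of an 8K block.
with_fragments = sum(space_used(size, 1 * KB) for size in files.values())

print("whole blocks:", whole_blocks // KB, "K used,",
      round((whole_blocks - data_bytes) / KB, 1), "K wasted")
print("fragments:   ", with_fragments // KB, "K used,",
      round((with_fragments - data_bytes) / KB, 1), "K wasted")
```

With whole 8K blocks the three files consume 24K for about 4.2K of data; with 1K fragments they consume 5K.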


HFS file system lay-out

[Figure: HFS layout. The boot block and the main superblock are followed by the cylinder groups summary and the cylinder groups. Each cylinder group (0, 1, 2, ...) holds a superblock copy, the cylinder group block, inodes and data blocks.]

As part of the development of the 4.2BSD version of Unix a redesign of the filesystem was
undertaken by one of the developers, Kirk McKusick. Minor changes were made to this design for
4.3BSD and it is this design that has been until recently the main filesystem used in HP-UX.
HP-UX refers to this filesystem as HFS, but it has many names:

- The High Performance Filesystem (HFS in HP-UX)
- The Berkeley filesystem
- The Fast Filesystem (FFS in SYS V.4)
- The McKusick filesystem
- The Unix Filesystem (UFS)

Berkeley simply refers to the filesystem as the UFS, the Unix filesystem, and since the code to
handle it comes from them, our source code refers to it as UFS, and so this is the name that will be
used in the rest of the module. The UFS sets out to tackle many of the issues that we earlier
identified with the original Bell filesystem. The overall filesystem is subdivided into smaller
pieces known as cylinder groups.


The Boot Block

[Figure: HFS layout diagram: boot block, main superblock, superblock copy 0, cylinder group block, inodes, cylinder groups summary, data blocks.]

At the start of the disk, in block zero, is an area that technically does not belong to the filesystem, so it
can be used for other purposes. Typically it was used as a boot block where the disk was
configured as the system’s boot device.


The Super Block

[Figure: HFS layout diagram: boot block, main superblock, superblock copy 0, cylinder group block, inodes, cylinder groups summary, data blocks.]

/usr/sbin/mkfs size nsect ntrack blksize fragsize ncpg minfree rps nbpi

Static data (with the mkfs argument that sets it, where the slide shows one):
- filesystem type
- filesystem size (size)
- cylinder group size
- block size (blksize)
- fragment size (fragsize)
- tracks per cylinder (ntrack)
- sectors per track (nsect)
- inodes per cylinder group
- rotational delay
- rotation speed (rps)

Dynamic data:
- clean flag
- last mount point
- free inode count
- free block count
- free fragment count
- directory count

After this comes the "Superblock", which contains the fundamental description of the filesystem
as a whole. In the Bell superblock the free maps were also held.
Superblocks
The cylinder groups provide a convenient place to store the redundant copies of the superblock.
Since loss of the superblock data can make the recovery of data from a filesystem extremely
difficult it is useful to have spare copies of it on the disk for use by fsck or fsdb in an emergency.
When the filesystem is created mkfs attempts to spread these superblock copies around the disk,
aiming to get copies onto different disk surfaces and into different radial positions. This improves
the chance of being able to successfully locate an intact superblock copy after a mechanical head
crash. These days it is rare to attempt recovery under such circumstances but there are companies
who specialize in this sort of operation.
The redundant superblock copies are not accessed or updated during normal operation. Although
they contain a full copy of the superblock, the dynamic fields are not kept up to date. These blocks
are only updated when the size of the filesystem is changed with the extendfs command and when
the -A option is used with tunefs.
The superblock gives an overall description of the disk and includes the physical
characteristics of the device. Also included is current summary information such as the amount of
free space.


The Cylinder Group

[Figure: HFS layout diagram: boot block, main superblock, superblock copy 0, cylinder group block, inodes, cylinder groups summary, data blocks.]

- pointer in the cylinder group (offset)
- pointers to first inode, block
- layout summary (# dir, blocks, inodes, fragments)
- free inode map (fixed structure)
- free fragment map (fixed structure)

Localizing the control structures and the data


By holding the free map for the cylinder group within the group itself and also a portion of the
inode table, filesystem updates can often happen without the need for large head movements
between the data and its controlling structures.
The reduction in the number of these large and frequent head movements is good for
performance.
Unfortunately not all of these movements can be eliminated as we shall see, but UFS does a good
job in reducing them in most circumstances.
Cylinder groups: Each cylinder group contains an area of control structures in addition to user
data. This includes the redundant superblock copy, the free maps for both fragments and inodes, and
local summary information.


The Cylinder Groups Summary

[Figure: HFS layout diagram: boot block, main superblock, superblock copy 0, cylinder group block, inodes, cylinder groups summary, data blocks.]

Fields of each entry, left to right: number of directories, number of free blocks,
number of free inodes, number of free fragments.

0000000 0000 0002 0000 0124 0000 017c 0000 000c   Cylinder group 0
0000010 0000 0000 0000 012c 0000 0180 0000 0000   Cylinder group 1
0000020 0000 0000 0000 0128 0000 0180 0000 0000   Cylinder group 2
0000030 0000 0000 0000 0000 0000 0000 0000 0000
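
The hexadecimal dump on this slide can be decoded with a short Python sketch. It assumes that each cylinder group has one 16-byte entry made of four big-endian 32-bit integers, in the order suggested by the slide labels (directories, free blocks, free inodes, free fragments). The names `FIELDS` and `decode_entry` are assumptions of this example.

```python
import struct

# Hex words copied from the dump on the slide, one entry per cylinder group.
DUMP_LINES = [
    "0000 0002 0000 0124 0000 017c 0000 000c",  # cylinder group 0
    "0000 0000 0000 012c 0000 0180 0000 0000",  # cylinder group 1
    "0000 0000 0000 0128 0000 0180 0000 0000",  # cylinder group 2
]

# Assumed field order, following the labels on the slide.
FIELDS = ("directories", "free_blocks", "free_inodes", "free_fragments")


def decode_entry(hex_words):
    """Decode one 16-byte summary entry into a dict of four counters."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    return dict(zip(FIELDS, struct.unpack(">4i", raw)))


summary = [decode_entry(line) for line in DUMP_LINES]
for group, entry in enumerate(summary):
    print("cylinder group", group, entry)
```

Under these assumptions cylinder group 0 holds 2 directories and has 292 free blocks, 380 free inodes and 12 free fragments.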

Inodes: The inode table for UFS is split up so that each cylinder group contains a part of the
overall table. This way the inode for a file can be stored close to its data.
Directories: The view of the Unix filesystem that users see, with its hierarchical structure, is
formed by storing filenames in special files called directories.
The directory files hold the filenames and the inode numbers.


The Inodes

[Figure: HFS layout diagram: boot block, main superblock, superblock copy 0, cylinder group block, inodes, cylinder groups summary, data blocks.]

Inode content (128 bytes):
- type
- permission
- number of names
- ownership
- size (length)
- size (# fragments)
- usage time
- plus one of: pointers to data blocks, major & minor number, link to target name, pipe information

Inode type:
- Regular file
- Directory
- Special file
- Symbolic link
- Pipe
- Continuation inode

Max 2048 inodes per cylinder group (limited by the free inode map)

Each file on the filesystem is described by an inode, whether it is a regular file, a device file or a
fast symbolic link.
Type: Specifies the type of the file associated with this inode.
Permissions: Gives the read, write and execute permissions for the user, the group and others.
Number of names: Gives the link count of the file.
Ownership: Stores the UID and the GID.
Size (length): Stores the offset of the last byte in the file.
Size (# fragments): The number of fragments held by the file.
Usage time: Stores the time stamps of the file.
Generation number: Each time an inode is re-allocated the generation
number is incremented. This is very useful for NFS operations.
Continuation inode: This is how HP-UX implements access control lists for extended
file permissions.
Pointers: A UFS inode contains 12 direct pointers, which store data block addresses.
If the file grows beyond this point, then a system of indirection is used. The thirteenth pointer is a
single indirect pointer. With an 8K block size, the indirect block can contain 2048 pointers, enough to map
16 Mbytes of file space.
The fourteenth pointer is for double indirection and the last one for triple indirection.
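
The pointer arithmetic can be checked with a short Python sketch. The 4-byte pointer size is an assumption, chosen because it yields the 2048 pointers per 8K block quoted above. The totals show only what each level of pointers can address, not the maximum file size HP-UX actually supports.

```python
def addressable_bytes(block_size=8192, pointer_size=4, direct_pointers=12):
    """Bytes of file space reachable through each class of inode pointer.

    pointer_size=4 is an assumption that matches the 2048 pointers per
    8K block given in the text.
    """
    pointers_per_block = block_size // pointer_size
    return {
        "pointers_per_block": pointers_per_block,
        "direct": direct_pointers * block_size,
        "single_indirect": pointers_per_block * block_size,
        "double_indirect": pointers_per_block ** 2 * block_size,
        "triple_indirect": pointers_per_block ** 3 * block_size,
    }


reach = addressable_bytes()
for level, size in reach.items():
    print(level, size)
```

With these values the direct pointers cover 96K, the single indirect pointer 16 Mbytes, and the double indirect pointer 32 Gbytes.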

[Figure: indirect pointers. The inode’s indirect pointers refer to blocks of pointers, which refer to data blocks either directly or through further levels of pointer blocks.]

Hard links

# ln /friends/tom/janice /friends/harry/jan

- Cannot cross FS boundaries
- Cannot link directories

Directory tom           Directory harry
    4  .                    6  .
    2  ..                   2  ..
    7  jane                10  hammer
    8  john                 9  jan
    9  janice

[Figure: the directory entries janice (in tom) and jan (in harry) both point to inode 9, which points to data block 518. Directory tree: friends contains tom (jane, john, janice), harry (jan) and brutus.]

    # fsdb -F vxfs /dev/vg00/friends
    > im
    > 9i
    inode structure at 0x00000132.0100
    type IFREG mode 100666 nlink 2 uid 0 gid 3 size 20
    atime 1035132206 0 (Sun Oct 20 18:43:26 2002 MET)
    mtime 1035132700 620000 (Sun Oct 20 18:51:40 2002 MET)
    ctime 1035132700 620000 (Sun Oct 20 18:51:40 2002 MET)
    aflags 0 orgtype 1 eopflags 0 eopdata 0
    fixextsize/fsindex 0 rdev/reserve/dotdot/matchino 0
    blocks 1 gen 0 version 0 8 iattrino 0
    de: 518 0 0 0 0 0 0 0 0 0
    des: 1 0 0 0 0 0 0 0 0 0
    ie: 0 0
    ies: 0
    > 518b;p 96c
    00000206.0000: t h i s i s j a n i c e d
    00000206.0010: a t a 0a 00 00 00 00 00 00 00 00 00 00 00 00
    00000206.0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000206.0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000206.0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000206.0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Directory listings with inode numbers:

    9 -rw-rw-rw- 2 root sys 20 Oct 20 18:51 jan
    9 -rw-rw-rw- 2 root sys 20 Oct 20 18:51 janice

Hard links: Although most inodes are associated with exactly one
directory entry, links make it possible to associate multiple directory entries with a single inode.
This, in effect, allows your users to reference a single file via several different file names.
The example above shows a file /friends/tom/janice that has been hard linked with
/friends/harry/jan. Both names now reference the same inode (9), and thus share the same
permissions, owner and time stamps. Since both file names reference the same inode, they also
share the same data blocks. Changes made through the first name will be reflected in the second
and vice versa.
A hard link may be created with the “ln” command. The first argument identifies the name of the
existing file, and the second identifies the name of the new link.
Creating a hard link creates a new directory entry for the new link and increments the link count field
in the inode.
Be aware of the two hard link limitations:

- Hard links cannot cross file system boundaries.
- Hard links cannot link directories.
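
The behaviour of a hard link can be reproduced on any POSIX system with the Python sketch below. It recreates the tom/janice and harry/jan names inside a temporary directory (not under /friends) and uses os.link, which does the same job as the ln command.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "tom"))
    os.makedirs(os.path.join(root, "harry"))
    janice = os.path.join(root, "tom", "janice")
    jan = os.path.join(root, "harry", "jan")

    with open(janice, "w") as handle:
        handle.write("this is janice data\n")

    # Same effect as: ln tom/janice harry/jan
    os.link(janice, jan)

    # Both names resolve to the same inode, whose link count is now 2.
    same_inode = os.stat(janice).st_ino == os.stat(jan).st_ino
    link_count = os.stat(janice).st_nlink

    # Data appended through one name is visible through the other.
    with open(jan, "a") as handle:
        handle.write("more\n")
    size_via_janice = os.stat(janice).st_size

    print("same inode:", same_inode)
    print("link count:", link_count)
    print("size seen through janice:", size_via_janice)
```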


Symbolic (soft) links

# ln -s /friends/tom/janice /friends/brutus/jan

- Can cross FS boundaries
- Can link directories

Directory tom           Directory brutus
    4  .                    5  .
    2  ..                   2  ..
    7  jane                11  jan
    8  john
    9  janice

[Figure: janice (in tom) points to inode 9, which points to data block 518. jan (in brutus) points to inode 11, which holds the path /friends/tom/janice. Directory tree: friends contains tom (jane, john, janice), harry and brutus (jan).]

    # fsdb -F vxfs /dev/vg00/friends
    > im
    > 9i
    inode structure at 0x00000132.0100
    type IFREG mode 100666 nlink 2 uid 0 gid 3 size 20
    atime 1035132206 0 (Sun Oct 20 18:43:26 2002 MET)
    mtime 1035132700 620000 (Sun Oct 20 18:51:40 2002 MET)
    ctime 1035132700 620000 (Sun Oct 20 18:51:40 2002 MET)
    aflags 0 orgtype 1 eopflags 0 eopdata 0
    fixextsize/fsindex 0 rdev/reserve/dotdot/matchino 0
    blocks 1 gen 0 version 0 8 iattrino 0
    de: 518 0 0 0 0 0 0 0 0 0
    des: 1 0 0 0 0 0 0 0 0 0
    ie: 0 0
    ies: 0
    > 11i
    inode structure at 0x00000132.0300
    type IFLNK mode 120777 nlink 1 uid 0 gid 3 size 19
    atime 1035132461 590001 (Sun Oct 20 18:47:41 2002 MET)
    mtime 1035132382 590001 (Sun Oct 20 18:46:22 2002 MET)
    ctime 1035132382 590001 (Sun Oct 20 18:46:22 2002 MET)
    aflags 0 orgtype 2 eopflags 0 eopdata 0
    fixextsize/fsindex 0 rdev/reserve/dotdot/matchino 0
    blocks 0 gen 0 version 0 2 iattrino 0
    > p 96c
    00000132.0300: 00 00 a1 ff 00 00 00 01 00 00 00 00 00 00 00 03
    00000132.0310: 00 00 00 00 00 00 00 13 = b2 de - 00 09 00 b1
    00000132.0320: = b2 dd de 00 09 00 b1 = b2 dd de 00 09 00 b1
    00000132.0330: 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000132.0340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02
    00000132.0350: / f r i e n d s / t o m / j a n
    00000132.0360: i c e 00 00 00 00 00 00 00 00 00

Directory listings with inode numbers:

    11 lrwxrwxrwx 1 root sys 19 Oct 20 18:46 jan -> /friends/tom/janice
    9 -rw-rw-rw- 2 root sys 20 Oct 20 18:51 janice

Symbolic (soft) links: Like hard links, symbolic links make it possible to associate multiple
file names with a single file. Unlike hard links, however, symbolic links:

- Can cross file system boundaries.
- Can link directories.

In the above example, /friends/brutus/jan is a symbolic link to /friends/tom/janice.
The two files have distinct directory entries and inodes. However, as shown, jan is nothing more than a
pointer to janice.
Symbolic links are particularly useful when you must move files from one file system to another
but still wish to be able to use the file’s original path name.
At version 9.X of HP-UX, system executables were stored in the /bin directory. At HP-UX 10.X,
many operating system executables were moved to /usr/bin. However, a symbolic link exists from
/bin to /usr/bin, so users and applications can still use the old path names.
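
The same kind of experiment shows how a symbolic link differs from a hard link. The Python sketch below works in a temporary directory (not under /friends) and uses os.symlink, which does the same job as ln -s. The final step, removing the target to leave a dangling link, is an addition of this example to show that the link is only a stored path name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "tom"))
    os.makedirs(os.path.join(root, "brutus"))
    janice = os.path.join(root, "tom", "janice")
    jan = os.path.join(root, "brutus", "jan")

    with open(janice, "w") as handle:
        handle.write("this is janice data\n")

    # Same effect as: ln -s <path to janice> brutus/jan
    os.symlink(janice, jan)

    # lstat() examines the link itself; stat() follows it to the target.
    distinct_inodes = os.lstat(jan).st_ino != os.stat(janice).st_ino
    follows_to_target = os.stat(jan).st_ino == os.stat(janice).st_ino
    stored_target = os.readlink(jan)

    # The link is only a stored path: removing the target leaves it dangling.
    os.remove(janice)
    dangling = os.path.islink(jan) and not os.path.exists(jan)

    print("distinct inodes:", distinct_inodes)
    print("stored target:", stored_target)
    print("dangling after target removal:", dangling)
```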


The Data Blocks

[Figure: HFS layout diagram: boot block, main superblock, superblock copy 0, cylinder group block, inodes, cylinder groups summary, data blocks.]

[Figure: an inode with size=20,000 and blocks held=20 fragments. Its direct pointers reference blocks 1696, 1704 and 1737; the inode also has indirect pointers. Legend: used fragment, free fragment.]

Fragments are:
- at the end of the file
- contiguous
- in the same block

One of the problems identified with the original Bell filesystem was its small block size (1KB),
which led to inefficient transfers.
UFS tackles this issue by using a larger block size (8KB) and then subdividing these blocks
into fragments. Unfortunately, the term “block” is frequently misapplied: it is often used when
referring to fragments and occasionally even when referring to sectors.
A file that uses only direct pointers can use fragments to avoid allocating a whole block when
not all of the space is currently required. There are restrictions on
the use of fragments:

- Fragments are only ever used at the end of files.
- All fragments must be contiguous.
- All fragments used must come from the same block.
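
The arithmetic behind the slide's example inode (size=20,000, blocks held=20) can be sketched as follows. This is an illustration only: it assumes 8KB blocks and 1KB fragments, counts "blocks held" in fragment-sized units, and ignores the restriction that fragments apply only to files using direct pointers. The function name hfs_allocation is hypothetical.

```python
BLOCK_SIZE = 8192   # UFS block size used in this handout (8KB)
FRAG_SIZE = 1024    # assumed fragment size (1KB)


def hfs_allocation(file_size):
    """Return (full_blocks, tail_fragments, held) for a file of file_size bytes.

    Full blocks are used for the body of the file; only the tail of the
    file may be stored in fragments, all taken from one block.
    'held' is the total space expressed in fragment-sized units.
    """
    frags_per_block = BLOCK_SIZE // FRAG_SIZE
    full_blocks, tail_bytes = divmod(file_size, BLOCK_SIZE)
    tail_frags = -(-tail_bytes // FRAG_SIZE)      # ceiling division
    if tail_frags == frags_per_block:             # tail fills a whole block
        full_blocks += 1
        tail_frags = 0
    held = full_blocks * frags_per_block + tail_frags
    return full_blocks, tail_frags, held


if __name__ == "__main__":
    print(hfs_allocation(20000))
```

For 20,000 bytes this gives two full blocks plus four fragments, that is 20 units of 1KB, which matches the three direct pointers and "blocks held=20" on the slide.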


File system lay-out example

[Figure: file system structure, showing the inode table and the data blocks.]

Inode table:
  inode 2   directory (/)      rwxr-xr-x  uid gid   data block 56
  inode 5   directory (dir1)   rwxr-xr-x  uid gid   data block 29
  inode 9   regular file       rwxr-xr-x  uid gid   data block 72

Data blocks:
  block 56 (directory /):      .  2     ..  2     dir1   5
  block 29 (directory dir1):   .  5     ..  2     file1  9
  block 72 (file1):            bla bla bla bla bla ...

This example shows how to find the file file1 in the file system layout:

1. In the inode table, read the root directory inode (inode 2) to find the data block that
   holds the root directory's entries (data block 56).
2. In data block 56, find the inode that describes the directory dir1 (inode 5).
3. In inode 5, find the data block that lists all files in the dir1 directory (data block 29).
4. In data block 29, find file1’s inode (inode 9).
5. In inode 9, find the data block of file1 (data block 72).
6. In data block 72, find the content of file1.
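
The lookup steps above can be sketched as a small simulation. The dictionaries below are a toy model of the slide (inode and block numbers are taken from it); the names INODES, BLOCKS, resolve and read_file are made up for this illustration and do not correspond to any HP-UX interface.

```python
# Toy model of the slide: inode number -> (type, data block)
INODES = {
    2: {"type": "dir", "block": 56},    # root directory
    5: {"type": "dir", "block": 29},    # dir1
    9: {"type": "file", "block": 72},   # file1
}

# Data block number -> directory entries (name -> inode) or file content
BLOCKS = {
    56: {".": 2, "..": 2, "dir1": 5},
    29: {".": 5, "..": 2, "file1": 9},
    72: "bla bla bla bla bla\n",
}

ROOT_INODE = 2


def resolve(path):
    """Walk the path from the root inode; return (inode number, trail)."""
    inum = ROOT_INODE
    trail = [("inode", inum)]
    for name in [part for part in path.split("/") if part]:
        inode = INODES[inum]
        if inode["type"] != "dir":
            raise NotADirectoryError(name)
        block = inode["block"]
        trail.append(("block", block))
        inum = BLOCKS[block][name]       # directory entry: name -> inode
        trail.append(("inode", inum))
    return inum, trail


def read_file(path):
    """Return (content, trail) for a regular file."""
    inum, trail = resolve(path)
    block = INODES[inum]["block"]
    return BLOCKS[block], trail + [("block", block)]


if __name__ == "__main__":
    content, trail = read_file("/dir1/file1")
    print(trail)
    print(content, end="")
```

The trail alternates between inodes and data blocks in exactly the order of the six steps: inode 2, block 56, inode 5, block 29, inode 9, block 72.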


HFS disadvantages

- Long “fsck” time after an improper shutdown.
- Layout rules optimized for small to medium sized files.
- Loss of 10% min free area.
- Modern disk drives do not have constant geometries.
- Inodes are still held in a table.

Whilst UFS remains an extremely successful filesystem design, it is not without its problems. The
most apparent to most system administrators is the long time taken to check the consistency of the
filesystem after an improper shutdown.
The layout rules used by the filesystem group the data for small files around the inode,
deliberately scattering larger files to increase the probability of being able to do this. These rules
have resulted in a filesystem that is highly resistant to the fragmentation issues that cause
problems for so many filesystem designs when used with active file populations.
When the file population is expected to remain static and the files are large, these layout rules
do not result in an optimum arrangement.
Part of the mechanism for avoiding fragmentation problems is ensuring there is always an
adequate supply of free space. To this end UFS has the min free parameter, typically surrendering
10% of its capacity to the goal of avoiding fragmentation. Again, where the file population is
expected to remain static, fragmentation is unlikely to occur and this reserved space brings little benefit.
Modern disk drives do not have constant geometries. When the filesystem was designed, disk
drives had a fixed number of sectors per track. Modern disk drives instead record at a roughly
constant density, so outer tracks hold more sectors than inner ones. This gives much higher
storage capacity but defeats many of the layout optimizations of the UFS filesystem.
Lastly, the number of inodes is fixed when the filesystem is created. More can only be created
when the filesystem grows or when it is recreated. Even growing the filesystem is of only
limited use, since the number of new inodes is only a function of the new space
allocated to the filesystem.


VxFS or JFS
Veritas File System or Journaled File System

Main features:
- Extent based allocation
- Online features
- Online Backup
- Fast Filesystem Recovery

Under HP-UX 11i: only VxFS version 3.3 is supported.

Disk lay-out versions 2, 3 & 4 bring:
- Access Control Lists
- dynamic inode allocation
- shrink enhancement
- extended attributes
- large files
- clone file system
- large UIDs

UFS has been the only mounted read/write file system officially supported by HP-UX since its
first release. HP-UX Release 10.01 introduced a new file system type from Veritas Software known as the
Veritas File System (VxFS). This file system is also known as the Journaled File System (JFS).
Currently VxFS is still an optional non-root file system while UFS remains the default format.
The long-term file system strategy is for VxFS to become the default file system and for UFS to exist in
maintenance mode only (lvol1).
Extent based allocation: JFS allocates space to files in the form of extents: adjacent disk
blocks that are treated as a unit. An extent can vary in size from one block to many megabytes and is
identified by its starting block address and its length in filesystem blocks. This allows JFS to issue large I/O
requests, which is more efficient than reading or writing a single block at a time.
Fast Filesystem Recovery: VxFS provides fast recovery after a system failure by
using a tracking feature called intent logging, a scheme that records pending changes to
the file system.
Large Files & File systems: VxFS version 3.3 supports files and file systems of up to 2TB.
Online System Administration: OnlineJFS (Advanced JFS) is an optional product that
can be purchased separately. It allows many administration tasks to be performed online, including
filesystem reorganisation and resizing with the fsadm command.
Online Backup: The OnlineJFS product also provides a method for performing
online backups of data using the “snapshot” feature of JFS.
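
To illustrate why extent based allocation is compact, the sketch below converts a list of individual block numbers into (starting block, length) pairs, which is how the notes above describe an extent. It is a teaching aid only, assuming unique block numbers; blocks_to_extents is a hypothetical helper and not part of VxFS.

```python
def blocks_to_extents(blocks):
    """Group block numbers into (start, length) extents of adjacent blocks."""
    extents = []
    for block in sorted(blocks):
        if extents and extents[-1][0] + extents[-1][1] == block:
            start, length = extents[-1]
            extents[-1] = (start, length + 1)   # block extends the last extent
        else:
            extents.append((block, 1))          # start a new extent
    return extents


if __name__ == "__main__":
    print(blocks_to_extents([100, 101, 102, 103, 200, 201, 500]))
```

Seven block addresses collapse into three extents here, and each extent can be read or written with a single large I/O request.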


Disk version 4 file system lay-out

[Figure: the file system is divided into allocation units (AU 0, AU 1, ... AU x, ... AU n-1, AU n)
of 32K blocks each. AU 0 holds the structural files followed by data blocks; the other
allocation units hold data blocks.]

Main features:
- Better scalability
- Larger files
- Larger filesystems
- Efficiency
- Flexibility
- Less fragmentation

The version 1 disk layout of VxFS was relatively straightforward and closely resembled UFS. This
disk layout version is not supported on HP-UX. The version 2 disk layout introduced the concept
of structural files, whereby many filesystem structures are themselves encapsulated in files.
This allows better scalability for large files and filesystems. The basic idea is that expanding
or reducing filesystem resources, such as the number of inodes, simply involves extending
or truncating one or more structural files.
The version 2 layout consists of the following key structural elements at fixed locations:
- Superblock
- OLT (Object Location Table)
- Intent Log
- Replica of OLT
- One or more AUs (Allocation Units)

Versions 3 & 4 also contain the above structural elements. The key difference is that these structures
are now part of the first AU. Because they are placed in the first AU instead of before it,
their space must be accounted for by allocating a structural file for each structure. These new structural files are
the Superblock file (IFLAB), the OLT file (IFOLT) and the Intent Log file (IFLOG).

Starting from version , the AU information has been moved to structural files, thus Allocation Units will
only contain inode extents and data extents. Storing the AU information in files improves scalability as
well as making access to the structural data more localized. The AU header information has been moved
to the Allocation Unit State File (IFEAU), the Allocation Unit Extent Summary File (IFAUS), and the
Extent Map File (IFEMP).

There is no significant difference between version 3 and version 4.


Disk structural files lay-out

Name          Block    Description
Boot block    0
Super Block   8
FSH           9, 10    Fileset header
IAU           11       Inode allocation unit (fileset 1), structural files
CUT           15       Current usage table
ILT           16       Inode list table (fileset 1)
EAU           32       Extent allocation unit (state)
OLT           33       Object location table
DEV           34       Device configuration inodes for EAU, AUS & EMP (extent map file)
AUS           35       Allocation unit summary
IAU           36       Inode allocation unit (fileset 999), user’s files
Intent log    40 73

A VxFS filesystem begins with a superblock at a fixed location 8KB from the start of the
filesystem. Similar to the UFS superblock, the VxFS superblock contains information that describes
the filesystem and offsets used to access other structures in the filesystem.

A VxFS filesystem consists of multiple filesets. The filesystem is a collection of blocks/extents,
while a fileset is a collection of inodes/files built upon those blocks/extents.
Each fileset has an index number associated with it. Two filesets exist in every VxFS
file system: the Structural Fileset (index 1) and the Unnamed Fileset (index 999). The
first contains structural files, which encapsulate structural elements. The second, also
known as the Primary fileset, contains the files visible and accessible to the user and is the default
fileset being mounted.
The OLT contains information used at mount time to locate filesystem structures.
Prior to disk layout version 4, the CUT (Current Usage Table) was used to store information about
a fileset that changes frequently, such as the number of blocks currently used and the fileset version
number.
All inodes of a fileset are kept in a file called the ILT (Inode List Table). There is one ILT for the
structural fileset and two for the non-structural fileset: one contains the general inodes for the user
files and the other the attribute inodes, which store attribute data associated with the user files.
There is one Inode Allocation Unit table (IAU) for each Inode List Table (ILT); it
stores the information used to allocate inodes for that ILT.
Each IAU contains a 1-block header, a 1-block summary, a 1-block free inode bitmap (one bit per
inode) and a 1-block inode extended operation bitmap (the inode is in use).


VxFS structural tables summary

[Figure: how the structural tables reference each other. On the slide, data block addresses
are shown in black and inode numbers (from the OLT) in bold red.]

Super Block (8):   oltext[0] 33

OLT (33):
  olt_fsino[3 35]    olt_iext[16 1288]   olt_cutino 6     olt_devino[8 40]
  olt_sbino 33       olt_logino[9 41]    olt_oltino[7 39]

ILT (16), each entry is 256B long:
  Inode   File   Block
  3       fs     9
  4       iau    11
  5       ilt    16
  6       cut    15
  7       olt    33
  8       dev    34
  9       log    32
  64      iau    36

FSH (9), fileset 1:      Fileset Header 1, fsh_iauino 4
FSH (10), fileset 999:   Fileset Header 999, fsh_iauino 64

Free space management:   DEV (34), device configuration
Structural changes log:  Intent Log (32)
Inode management:        IAU (11), inode allocation unit for fileset 1
                         CUT (15), current usage table
                         IAU (36), inode allocation unit for fileset 999

Intent log: With VxFS, an intent log is normally kept. Before starting a filesystem update
such as creating a new file, the intended metadata changes are logged. After the changes have
been made successfully, a record is made of this successful completion.
DEV: Since disk space is shared by the filesets within the filesystem, the free space
maps are not a per-fileset resource but a filesystem-wide one. DEV points to the extent allocation
unit state file (EAU), the extent allocation unit summary file (AUS) and the extent allocation unit
free map file (EMP).
EAU: gives the status of each allocation unit (2 bits): whether it is
expanded or not, clean or dirty, free or allocated.
AUS: gives the number of free extents per allocation unit.
EMP: divides the filesystem into pieces of 2MB and gives, within each
piece, the number of free blocks.
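
The intent log principle described above (log the intended change, apply it, then record completion) can be sketched in a few lines. This is a conceptual model only, not the VxFS on-disk format; the class IntentLog, the function recover and the metadata dictionary are assumptions made for the illustration.

```python
class IntentLog:
    """Minimal model of an intent log: intent records and completion records."""

    def __init__(self):
        self.records = []

    def log_intent(self, txid, changes):
        # Written BEFORE the metadata is touched.
        self.records.append(("intent", txid, dict(changes)))

    def log_done(self, txid):
        # Written AFTER the metadata changes have succeeded.
        self.records.append(("done", txid, None))

    def pending(self):
        """Transactions that were logged but never marked as completed."""
        done = {txid for kind, txid, _ in self.records if kind == "done"}
        return [(txid, changes) for kind, txid, changes in self.records
                if kind == "intent" and txid not in done]


def recover(metadata, log):
    """Replay only the pending transactions instead of scanning everything."""
    for txid, changes in log.pending():
        metadata.update(changes)
        log.log_done(txid)
    return metadata


if __name__ == "__main__":
    metadata, log = {}, IntentLog()
    log.log_intent(1, {"inode 9": "allocated"})
    metadata.update({"inode 9": "allocated"})
    log.log_done(1)
    log.log_intent(2, {"dir1/file2": "inode 10"})   # crash before applying
    print(recover(metadata, log))
```

Recovery only has to look at the log rather than at every structure of the filesystem, which is the reason the notes give for the fast recovery of VxFS compared with a full fsck.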

The VxFS filesystem is divided into two filesets: the structural fileset (index 1) and the unnamed
fileset (index 999). Each fileset has an inode list (ILT) and an inode allocation unit (IAU).
FSH: gives the inodes that describe the files holding the fileset's own inodes
(ILT) and the inode allocation unit (IAU).
ILT: contains all the inodes for the fileset (256 bytes each).
IAU: is divided into 4 parts: first a header structure, followed by a summary block
(number of free inodes and number of free inode blocks). The third part is the free inode
map (1 bit per inode). The last part is the extended inode operation map (used for long operations,
longer than what the intent log can process).


The VxFS Super Block

#fsdb -F vxfs /dev/vg00/lvol3
> 8b;p S
super-block at 00000008.0000
magic a501fcf5 version 4
ctime 1033745578 189381 (Fri Oct 4 17:32:58 2002 MET)
log_version 9 logstart 0 logend 0
bsize 1024 size 204800 dsize 204800 ninode 0 nau 0
defiextsize 0 oilbsize 0 immedlen 96 ndaddr 10
aufirst 0 emap 0 imap 0 iextop 0 istart 0
bstart 0 femap 0 fimap 0 fiextop 0 fistart 0 fbstart 0
nindir 2048 aulen 32768 auimlen 0 auemlen 8
auilen 0 aupad 0 aublocks 32768 maxtier 15
inopb 4 inopau 0 ndiripau 0 iaddrlen 8 bshift 10
inoshift 2 bmask fffffc00 boffmask 3ff checksum e2a9e200
free 142169 ifree 1024
efree 15 15 21 17 15 9 13 6 8 1 2 0 1 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
flags 0 mod 0 clean 3c
time 1035103993 550000 (Sun Oct 20 10:53:13 2002 MET)
oltext[0] 33 oltext[1] 1282 oltsize 1
iauimlen 1 iausize 4 dinosize 256
checksum2 62b
checksum3 0

Slide callouts: version, file system size (size), block size (bsize), number of blocks per AU
(aublocks), AU length (aulen), number of inode per dir, OLT block address (oltext),
free inode (ifree), free extent (efree).

The Veritas filesystem shares many of the same concepts as the earlier Unix filesystems. It starts
with a superblock, has inodes to describe the files and has structures to describe free space. Also like
UFS, it divides the filesystem into smaller units for ease of management. Here, however, no attempt
is made to match these to the physical layout of the disk. Modern disk drives don’t have constant
geometry, so the idea of cylinder groups no longer works. Instead, VxFS uses an allocation unit of
32K blocks. The block size can be chosen when the filesystem is created, or the default can be used.
The default for small filesystems (less than 8GB) is 1KB; larger filesystems use larger defaults.

The role of the superblock is to provide an overall description of the filesystem. As with UFS, the
superblock contains both static and dynamic data. Unlike UFS, the different types of information
are held separately in substructures.

The allocation policies for disk space within VxFS work by trying to allocate large contiguous areas
to files. To make this allocation easier, the superblock keeps track not only of the total amount
of free space but also of the number of free extents of each size: 1K, 2K, 4K, 8K and so on. This filesystem
has a 32MB allocation unit size (32K blocks of 1KB).
The “oltext” fields give the addresses of the “Object Location Table”. There are two of these because the
filesystem keeps redundant copies of key information. Loss of the OLT data would be as bad as losing
the superblock.
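
The figures in the superblock dump can be cross-checked with a little arithmetic. The sketch below uses the values shown on the slide (bsize, aulen, free, efree). It assumes, following the notes above, that entry i of efree counts the free extents of 2^i blocks; the constant and function names are chosen for this illustration only.

```python
# Values copied from the fsdb superblock dump on the slide.
BSIZE = 1024          # block size in bytes
AULEN = 32768         # allocation unit length in blocks
FREE = 142169         # total free blocks
EFREE = [15, 15, 21, 17, 15, 9, 13, 6, 8, 1, 2, 0, 1, 0, 0, 0, 2] + [0] * 15


def au_size_bytes(aulen=AULEN, bsize=BSIZE):
    """Size of one allocation unit in bytes."""
    return aulen * bsize


def free_blocks_from_efree(efree):
    """Sum the free extents, assuming efree[i] counts extents of 2**i blocks."""
    return sum(count << power for power, count in enumerate(efree))


if __name__ == "__main__":
    print(au_size_bytes() // (1024 * 1024), "MB per allocation unit")
    print(free_blocks_from_efree(EFREE), "free blocks according to efree")
```

With these values the allocation unit is 32MB, and the per-size extent counts add up to 142169 blocks, the same number reported in the free field.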


The OLT table

> 33b;p oltext
OLT at 0x00000021.0000
OLT head entry:
olt_magic 0xa504fcf5 olt_size 56 olt_totfree 872
olt_time 1033745578 189381 (Fri Oct 4 17:32:58 2002 MET)
olt_checksum 0x3da0aacd
olt_esize 1 olt_extents[33 1282]
olt_nsize 0 olt_next[0 0]
OLT fshead entry:
olt_type 2 olt_size 16 olt_fsino[3 35]
OLT initial iext entry:
olt_type 4 olt_size 16 olt_iext[16 1288]
OLT cut entry:
olt_type 3 olt_size 16 olt_cutino 6
OLT device entry:
olt_type 5 olt_size 16 olt_devino[8 40]
OLT super-block entry:
olt_type 6 olt_size 32 olt_sbino 33
olt_logino[9 41] olt_oltino[7 39]
OLT free entry:
olt_type 1 olt_fsize 872
>

Slide callouts: fileset header inode (olt_fsino), ILT table block address (olt_iext),
CUT table inode (olt_cutino), DEV table inode (olt_devino), super block inode (olt_sbino),
OLT inode (olt_oltino), intent log inode (olt_logino).

The object location table is used by the VxFS filesystem to locate most of its disk-based data
structures. The header uses a magic number to confirm that it is an OLT and a checksum to ensure
its integrity. The header also holds the disk addresses of the two OLT copies. The other entries are:
- The OLT fshead entry references the inodes for the fileset headers within the
  “ATTRIBUTE” fileset (again there are two copies).

- The OLT initial iext entry points to the disk blocks that hold the initial portions of the
  “ATTRIBUTE” fileset (again there are two copies). The rest of the blocks can be found from
  the ILT for this fileset. Most inodes are replicated, and you will notice that the
  second copy has an inode number 32 greater than the primary copy.

- The OLT cut entry gives the inode number for the current usage table, which holds
  data about the fileset that changes regularly, such as the number of blocks in use.

- The OLT device entry references the inodes for the device configuration record, which
  provides information about the size of the device (filesystem) and pointers to the free
  space summaries and maps.

- The OLT superblock entry: the OLT even keeps track of the superblock and its second
  copy (there are only 2!) by using an inode in the ATTRIBUTE fileset. This entry also
  keeps track of the log and the OLT itself.

- The OLT free space entry describes the unused space in the OLT.
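
The "replica inode = primary inode + 32" observation can be checked directly against the dump on the slide. The sketch below parses the bracketed inode pairs out of the fsdb text with a regular expression. The text is copied from the slide; the function names and the regular expression are assumptions made for the illustration and are not part of fsdb.

```python
import re

# Relevant lines copied from the OLT dump on the slide.
OLT_DUMP = """\
olt_type 2 olt_size 16 olt_fsino[3 35]
olt_type 4 olt_size 16 olt_iext[16 1288]
olt_type 3 olt_size 16 olt_cutino 6
olt_type 5 olt_size 16 olt_devino[8 40]
olt_type 6 olt_size 32 olt_sbino 33
olt_logino[9 41] olt_oltino[7 39]
"""

REPLICA_OFFSET = 32


def inode_pairs(dump):
    """Return {field: (primary, replica)} for fields printed as name[a b].

    Only fields whose name ends in 'ino' are inode pairs; olt_iext holds
    block addresses and is deliberately not matched.
    """
    pattern = re.compile(r"(olt_[a-z]+ino)\[(\d+) (\d+)\]")
    return {name: (int(a), int(b)) for name, a, b in pattern.findall(dump)}


def replica_mismatches(pairs, offset=REPLICA_OFFSET):
    """Return the pairs whose replica is not exactly 'offset' above the primary."""
    return {name: pair for name, pair in pairs.items()
            if pair[1] - pair[0] != offset}


if __name__ == "__main__":
    pairs = inode_pairs(OLT_DUMP)
    print(pairs)
    print("mismatches:", replica_mismatches(pairs))
```

All four replicated inodes in this dump (fileset header, device, log and OLT) follow the +32 rule, so the list of mismatches is empty.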


INODES & DIRECTORIES

Inode types:
- orgtype 1 (same as UFS but extents): 10 direct pointers & 2 indirect pointers
- orgtype 2 or immediate: within the inode (up to 96 bytes)
- orgtype 3 or extent: 6 extents (used after the 10 direct pointers)

Directory:
- inode number (4 bytes)
- record length (2 bytes)
- name length (2 bytes)
- filename
- padding (up to the next four byte boundary)

VxFS has several arrangements for its inodes, referred to as “org types”.
Orgtype 1: This inode arrangement references data and is similar to the UFS
arrangement, which uses direct and indirect pointers. There are 10 direct pointers, but since these
point to extents rather than single blocks, there is an extent size value as well as the pointer to the
data. Should the file need more than 10 extents, there are 2 indirect pointers (single and
double).
Orgtype 2: There is room to store 96 bytes of information in an inode. So for small
directories and symbolic links, the information is stored directly inside the inode itself
rather than in a data block, in the same way that fast symbolic links work for UFS.
Orgtype 3: Here only 6 extents can be described from the inode,
but each entry has 4 fields:
- type (DATA, NULL, INDIR)
- position within the file (offset)
- block number
- number of blocks (length)

Directory: VxFS directories do not store the entries “.” and “..”. Since it is known that all
directories must have these, it would be a waste to use disk space to store them.
The inode stores the “..” information in its “dotdot” field. Directories use padding to cope with
deleted files leaving holes in the list of directory entries. The arrangement is the same as that
used by UFS.
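
The directory entry layout listed on the slide (4-byte inode number, 2-byte record length, 2-byte name length, file name, padding to the next four-byte boundary) can be sketched with Python's struct module. This is an illustration of the field layout only: big-endian byte order is an assumption, the real on-disk structure may carry further fields, and pack_entry and unpack_entries are hypothetical helpers.

```python
import struct

# inode number (4 bytes), record length (2 bytes), name length (2 bytes).
# Big-endian byte order is an assumption made for this sketch.
HEADER = struct.Struct(">IHH")


def pack_entry(inode, name):
    """Build one directory entry, padded to a four-byte boundary."""
    raw = name.encode("ascii")
    unpadded = HEADER.size + len(raw)
    reclen = (unpadded + 3) & ~3                 # round up to a multiple of 4
    padding = b"\0" * (reclen - unpadded)
    return HEADER.pack(inode, reclen, len(raw)) + raw + padding


def unpack_entries(buf):
    """Walk a directory block, using the record length to find the next entry."""
    entries, offset = [], 0
    while offset < len(buf):
        inode, reclen, namelen = HEADER.unpack_from(buf, offset)
        start = offset + HEADER.size
        entries.append((inode, buf[start:start + namelen].decode("ascii")))
        offset += reclen
    return entries


if __name__ == "__main__":
    block = pack_entry(5, "dir1") + pack_entry(9, "file1")
    print(len(block), unpack_entries(block))
```

Because each entry carries its own record length, a deleted entry can be absorbed by enlarging the record length of its neighbour, which is one way padding can cover the holes mentioned above.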


Mount command

[Figure: File System 2 (/ with bin, lib, user) is attached to the directory tree of
File System 1 (/ with bin, dev, etc) at a directory called the “mountpoint”.]


QUIZ 28

What does the mount command do?

[ ] List all file systems.
[ ] List all mounted file systems.
[ ] Display the content of mnttab file.
[ ] Display the content of fstab file.

The mount command only displays the content of the file /etc/mnttab.
For performance reasons, it does not actually check which file systems are mounted.

In release 11.00, the system can hang if this file is corrupted.
In that case you must purge the file. Because the file no longer exists, HP-UX will then really
check all mounted file systems.

The content of mnttab is sometimes inaccurate, especially after a forced reboot
or a failure. When troubleshooting, treat the output of the mount command with care:
it may not reflect reality.
So the first thing to do is to purge the mnttab file. HP-UX will recreate it.


fsck command

- Preen check: used by bcheckrc, or with the -p option
- Full check
- intent log (default): only for VxFS
- nolog: only for VxFS

#fsck -F hfs /dev/vg00/lvol1
** /dev/vg00/lvol1
** Last Mounted on /stand
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups

#fsck -F vxfs -o full,nolog /dev/vg00/lvol7
pass0 - checking structural files
pass1 - checking inode sanity and blocks
pass2 - checking directory linkage
pass3 - checking reference counts
pass4 - checking resource maps

lost+found

When a machine loses power without being properly shut down, data can be lost. On a multi-user
machine, data loss is almost assured. The reason is that, for efficiency's sake, not every
change to a file is written directly to the file itself. Instead, the changes are recorded in memory
buffers that temporarily hold part of the file's contents. That way small changes can be collected in
the buffer and written to disk all at once, side-stepping the extreme slowness of disk I/O. The act
of writing a buffer's contents to disk is called flushing or syncing. When a machine goes down
without syncing, the changes in memory are lost.
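
The buffering described above is visible from any program that writes files. The following sketch, which is generic UNIX behaviour rather than anything HP-UX specific, shows the two steps an application can take to push its data towards the disk: flushing its own buffer to the kernel, then asking the kernel to sync the file. The function name write_durably and the demo file name are made up for the example.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data and ask for it to be flushed all the way to disk."""
    with open(path, "wb") as f:
        f.write(data)            # data sits in the program's own buffer
        f.flush()                # hand it to the kernel buffer cache
        os.fsync(f.fileno())     # ask the kernel to write it to the disk


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        path = os.path.join(top, "demo.dat")
        write_durably(path, b"important change\n")
        with open(path, "rb") as f:
            print(f.read())
```

Without the last two calls the data may remain in memory buffers for some time, which is exactly the window in which a power loss causes the damage that fsck has to repair.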

fsck can:

- check several filesystems in parallel. This is faster than checking the
  filesystems in succession. It is generally used for the initial check and repair of all the
  filesystems on a machine after it loses power. If fsck finds problems that it cannot correct
  automatically, it is necessary to repair the filesystem by hand.

- repair a filesystem by hand, which involves saving the contents of unlinked inodes in a
  top-level directory of the filesystem called lost+found/. This action implies a loss of data,
  which is the reason fsck will not do it automatically. Normally this interactive mode and
  the parallel mode described above cannot be used in conjunction.

- repair a filesystem automatically, which is equivalent to repairing it by hand and
  answering "yes" to every question about whether to place an inode’s contents in
  lost+found/. This option is usually mutually exclusive with the parallel-checking option.

fsck performs low-level operations on a disk, treating it as a raw device rather than a file
system. For that reason, it is necessary to unmount a file system before checking it. Otherwise
high-level filesystem activity would cause the memory buffers to drift out of sync with the disk,
effectively recreating the very problem that required you to run fsck in the first place.

The root (/) filesystem cannot be unmounted, since it contains the fsck executable itself.


UXIE-SUPPORT
8 modules in virtual class: Module 1 Boot
Module 2 Recovery
Module 3 LVM
Module 4 Mirroring
Module 5 SD-UX Patch
Module 6 Swap Dump
Module 7 File Systems
Module 8 Ignite


OBJECTIVES

Students will be able to understand how Ignite-UX
works. They will create and manage an Ignite server
and its basic configuration files. They will know all the
commands and utilities needed to create or recover a
system from a golden image.

                                                  Version B               Version A
IUX Revision Number                               B.3.0                   A.3.0
HP-UX Releases Supported                          10.20, 11.0 or higher   10.01, 10.10 and 10.20
HP-UX Releases Supporting IUX Servers & Systems   11.0 or higher          10.01, 10.10 and 10.20
Minimum Memory Size for IUX Servers & Systems     64 MB                   32 MB


What is Ignite-UX?

- process for initial system deployment
- client/server model:
    - can install multiple target machines simultaneously
    - allows target customization and status monitoring
- ability to build and reuse standard configurations
- ability to do site-specific customization
- ability to automate installation process
- extensive system manifest capability
- ability to install software from multiple sources in a single session

[Figure: one IUX server installing three IUX targets.]

Ignite-UX supports all HP-UX releases starting with 10.01. Ignite-UX is intended to address the need
of end-user customers to perform system installation and deployment. It provides the means for
creating and reusing standard system configurations. It provides the ability to archive a standard
system configuration and to use that archive to replicate systems, with the added benefit of
speeding up the process. It also permits post-installation customization and is capable of both
interactive and unattended operating modes.
True Client/Server Model: The install sessions for multiple targets can be controlled from
a single server. A user interface is provided to run on the server and manage multiple
simultaneous install sessions. Alternatively, a single install session can be controlled from the
target machine if that is more convenient for the user. The install server itself can be running any
HP-UX release 10.01 or later.
Enhanced User Interface: The user interface has a Windows 95-like “tabbed dialog” feel.
This allows the tool to present more configuration capabilities.
Support for Multiple Software Sources: Loads can occur from multiple software
sources (different depots) in a single install session.
Archive Installation: Ignite-UX supports tar and cpio archives.
Easy Customizability: Many tasks that are typically done as separate steps after an
install have been incorporated into the installation process.
Save Customized Configurations for Future Use: It is possible to create a configuration for
your particular needs, save it, and then quickly apply it to multiple targets.
Non-interactive Installs: Ignite-UX allows you to set up a configuration and then
install it on a target machine with no further user interaction.
Generate a Manifest: Ignite-UX provides a tool to scan a system and produce a
report detailing hardware, software and kernel configurations.


Ignite-UX versus SD-UX

                 Ignite-UX                     Software Distributor

Purpose          Complete installs of          Manages software on an
                 system software.              existing system.

Disk space       Performs disk and file        Cannot modify file system if
considerations   system layout based on        there is insufficient space.
                 software selected.

Objects          Handles SD depots and         Knows about standard
handled          archives in tar or cpio       bundles, products and file
                 format.                       sets.

SD-UX: Software Distributor for HP-UX was released in version 10.00 as the method for
distributing both the operating system software and all application software. It provides many
options for flexible software management.
SD-UX has both GUI and terminal interfaces. It provides a portable and standard distribution and
administration toolset. It simplifies system administration by making software distribution faster,
more flexible, traceable and configurable. You can look at the features, status and attributes of
your installed software, and verify the correctness of an installed product.

Ignite-UX: One of the major tasks faced by system administrators at large UNIX sites is that
of software configuration and distribution. Most large users have a defined process by which they
select, test and customize a vendor’s operating system (a process which often results in a
prototype system or “golden image”). This image is then installed on new machines by a variety of
means, one of the most common being raw disk copies (dd). After the new system is installed, a
local administrator is then needed to customize the machine’s site-specific and CPU-specific information.
When systems are upgraded or reinstalled, this process must be repeated.
Ignite-UX is a tool targeted at the new-system installation process. It helps to create a
golden disk, distribute it, customize it, and reinstall it on local or remote systems with a minimum
of administrator intervention.


Basic Use Model for Ignite-UX

- Cold Installation from Media
- Cold Installation over the Network from Target
- Cold Installation over the Network from Server
- Redeploy over the Network from Server

[Figure: a target machine with CD and tape media, connected over the network to the server.]

Cold Installation from Media: The most basic use model is the installation of a single new
machine from some sort of media. No server needs to be involved in this scenario. The user is given the
choice of either a terminal version of the tabbed dialog or the novice task wizard
interface.
Cold Installation over the Network from the Target: Do a network boot on the
target machine, and control the installation from the target machine via a TUI.

Cold Installation over the Network from the Server: It is often convenient to
control and monitor remote installations from a central point. After you do a network boot on the
target machine, all further interactions can take place on the server. Information about each
installation is kept on the server in a per-target directory identified by the link-level address of the
target machine. Another advantage is that you can use the GUI instead of the TUI. You may also
choose to run a non-interactive install.

Redeploy over the Network from the Server: For a variety of reasons, it is sometimes
necessary to re-install an existing HP-UX machine:
- To switch a large number of machines to a new release.
- To fix a hard problem that occurs on a machine.
- To set up a machine for a different use (other software with a different configuration).

The main difference between these cases and the cold installation use models is that these
scenarios already have a running system. Ignite-UX can take advantage of this fact and allow
re-installs without having to physically go to the target machine and boot it.


Minimum Ignite-UX Requirements

Ignite-UX Server
- A Series 700/800 system running HP-UX 10.X or 11.X
- A display for TUI or an X11 display for GUI
- Ignite-UX software: http://www.software.hp.com
- Product media: Application CD / vendor software
- Computer: HP 9000 with PA 1.1 or later processor
- NFS server

Ignite-UX Client
- A Series 700/800 system
- Computer: HP 9000 with PA 1.1 or later processor
- Memory: 32 MB minimum
- Disk space: 2 GB or more

It is recommended that the Ignite-UX server run at least HP-UX 10.20, although it is
supported on any HP-UX 10.X or 11.X version.
Ignite-UX is loaded under the directory /opt/ignite. The data files that Ignite-UX creates are
placed under /var/opt/ignite. The Ignite-UX installation requires about 75MB of disk space.

The Ignite-UX server requires NFS to be configured and working. It will add lines to the
/etc/exports file and run exportfs. It will transfer some of its files using tftp. The minimum
directories needed by tftp are set up in the /etc/inetd.conf file.

A display for the TUI or an X11 display for the GUI is required.
The display can be redirected to another X Window system by setting the DISPLAY
environment variable. For example, in the Korn shell or POSIX shell, you would type the
following, using your system_name:
export DISPLAY=system_name:0.0

Product media are needed to load Ignite-UX, and any software depots you plan to
distribute to target systems, onto the server.
Clients and server must be on the same subnet if you plan to do the initial boot of the client over
the network. A boot “helper” system can be used to get between subnets. The bootsys command
also works between subnets.


Ignite-UX Client configuration

[Figure: the client configuration screen with its tabs (Basic, Software, System, File System,
Advanced) and the GO button.]

This is the first screen you see once you launch an install on a particular target machine. Notice
that there are five tabs across the top of the screen. Selecting each tab brings up a different screen.
Basic: This screen lets you select the starting configuration for your install. These
configurations include the default shipped by HP as well as the configurations you have customized
and saved via the Save As button in the user interface. This screen also lets you do some basic
layout of your system, including which disk to use for root, how much swap space to allocate, which
language to use, and so on.
Software: This screen allows you to select which software packages you wish to install.
System: This screen allows you to specify system-specific parameters normally set
during the first boot, including hostname, IP address, time zone, root password and other
networking information.
File System: This screen lets you lay out disks and file systems. It supports a rich set of
configuration options.
Advanced: This screen lets you select which scripts you want to run as part of the
installation process.

You do not need to visit all the screens. You may hit the GO button in any screen at any time.


Ignite-UX configuration files

[Figure: a configuration file groups system attributes, command and script hooks, disk and file
system layout, software source selections, system identity and network, and IUX process control.]


Ignite-UX is controlled by a group of configuration files. A configuration file can be thought of
as a recipe for constructing a target system. Some of the elements included are:

• Disk and file system layout.
• Software to be installed.
• Networking information.
• Special Ignite-UX processing options.

Configuration files can come from several sources. They can be:

• Shipped as part of Ignite-UX.
• Created by Ignite-UX tools.
• Created when a target system is installed.
• Manually edited.


Client-initiated Archive
• Compressed archive of an entire installed and configured system
• Eases the installation process for multiple installs
• Multiple golden images (OS archives), each specific to an environment, can be created

[Figure: archive + config files & scripts = golden system (OS + patches + software). On the
client, make_tape_recovery writes the archive to tape and make_net_recovery sends it over the
network to the server.]


In addition to supporting standard OS installations from an SD depot, Ignite-UX supports a
new customer use model called Golden Image Installation. This model recognizes that many, if
not all, client nodes in a network may be identical (or mostly identical) to each other. It is
possible to take advantage of this fact by building an archive that contains all of the files you
want installed on each of the clients and then using Ignite-UX to install them.
This approach has many advantages:

• Because the compressed system image is unpacked directly to disk over the network,
the installation process can be much faster than an equivalent process using SD-UX. The
time saved depends on the size of the installation being done, the extent of customization
on each installed system, and the capacity of the network. A typical system image can be
extracted in about 20 minutes, compared to about an hour for an SD-UX install.

• Instead of troubleshooting a client, it is often more cost-effective to completely reinstall
it with a known-good system image. Customers who follow this approach work by the
rule "Don't troubleshoot, just reinstall".

• When coupled with dataless nodes (all volatile data is on a separate file server), system
replacement time or move time is drastically reduced.

• Once a system image has been created, it is simple to apply it to multiple client
machines. Very little or no user interaction is required during these subsequent installs,
reducing the chance of error.


INDEX file content


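[Figure: the INDEX file lists named configurations (cfg1, cfg2, ...). Each entry has a name, a
description and the config files it combines: system config, control config, archive config and
local config. The BASIC entry is the default. The per-client file
/var/opt/ignite/clients/0x{lla}/config is used to repeat the install.
Labels on the slide: CONFIG (LVM or whole disk, file system type, file system size, hardware,
networking, scripts, patches, apps, options), CONTROL (how to recover), ARCHIVE (source,
impact), language, keyboard.]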


Execution hooks

[Figure: scripts or commands can be hooked into the installation process.]


Ignite-UX allows user-supplied commands and scripts to be inserted at various stages of the
installation process:
• pre_load_script & pre_load_cmd are executed before any software is loaded, but
after the file systems have been created.
• post_load_script & post_load_cmd are executed after all the software has been
loaded, but before the system reboot and software configuration steps. This is the
correct hook to use for customisations.
• post_config_script & post_config_cmd are executed after all software has been
configured, and after the final system kernel has been built and the system has booted from it.

Scripts will be loaded from the Ignite-UX server using tftp.
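
To make the post_load hook more concrete, here is a minimal sketch of the kind of customisation
such a script might perform, assuming the hook is an ordinary POSIX shell script. The function
name stamp_install, the ROOT variable and the message text are all hypothetical. ROOT is not an
Ignite-UX variable; it exists only so the function can be tried against a scratch directory.

```shell
# Hypothetical post_load customisation: leave a note on the freshly loaded
# system saying how it was installed.
# ROOT is an assumption of this sketch (empty means the real /).
stamp_install() {
    root=${ROOT:-}
    mkdir -p "$root/etc" || return 1
    echo "Installed by Ignite-UX on $(date '+%Y-%m-%d')" >> "$root/etc/motd"
}
```

A real hook script would end with a call to stamp_install. To try it safely first, set ROOT to a
temporary directory and inspect the etc/motd file created under it.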


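Pull an Install

[Figure: message exchange between target and server.
1. On the target: boot lan.1.2.3.123 install
2. Target to server (instl_bootp): "I am 0x0060B0189D0. I want to install."
3. Server to target: "Here's an IP address you can borrow and the name of your boot file."
   The IP address and boot_lif name come from the instl_boottab file.
4. Target fetches by tftp: boot_lif (ISL, AUTO, HPUX).
5. Target fetches by tftp: INSTALL (Ignite kernel) and INSTALLFS (Ignite file system).
6. The INSTALL process starts.]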


A "pull" install is an installation that is initiated on the target machine. From the target's
boot menu, issue the command: boot lan.<server's_ip_address> install
This command causes the target machine to send an instl_bootp request over the network. The
Internet Services Daemon (inetd) listens for these requests and starts instl_bootd on the
server.
instl_bootd is the boot protocol server for Ignite-UX clients. It responds to boot requests from
clients wishing to install an operating system, and it can do so without the server having any
prior knowledge of the client. When instl_bootd receives a boot request, it allocates an IP address
from the list of available addresses held in the file /etc/opt/ignite/instl_boottab. It then
responds to the client by returning the IP address and the name of the boot file.
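
The allocation step can be illustrated with a small sketch. It assumes a simplified table layout
of colon-separated fields, IP_address:LLA:date_last_used, where an empty LLA field means the
address is free. The function name pick_address and the sample addresses are invented for this
example, and the real instl_bootd logic may differ (for instance in how it reuses old entries).

```shell
# Choose an address for a client from an instl_boottab-style file.
# $1 = table file, $2 = client LLA (for example 0x080009AAAAAA).
# Prefer the entry already tied to this LLA; otherwise take the first
# entry whose LLA field is empty. Prints nothing if no address is available.
pick_address() {
    awk -F: -v lla="$2" '
        /^#/ || NF == 0 { next }                  # skip comments and blank lines
        $2 == lla { print $1; found = 1; exit }   # address used by this client before
        $2 == "" && free == "" { free = $1 }      # remember the first free address
        END { if (!found && free != "") print free }
    ' "$1"
}
```

With a table containing "10.0.0.10:0x080009AAAAAA:20040101120000" and "10.0.0.11::", a call such
as pick_address table 0x080009CCCCCC would select 10.0.0.11, the first free entry.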

The target machine gets the boot_lif file from the server. The boot_lif file contains:
• ISL: The initial system loader.
• AUTO: This file gives instructions to ISL. It tells ISL to run the secondary loader, HPUX.
• HPUX: The secondary loader. It pulls the Ignite-UX kernel (INSTALL) from the server.

Once the Ignite-UX kernel has been pulled from the server, the kernel pulls the Ignite-UX file
system (INSTALLFS). This file system is used by the Ignite-UX kernel running on the install
client. It contains the default parameters used when a client connects to the server to install the
HP-UX operating system.
Ignite-UX gets configuration information from many different configuration files. You should
consider INSTALLFS to be one of them. Configuration information that is needed early in
the install process should be placed in INSTALLFS. Its first 8 KB includes:
• The IP address of the IUX server
• Routing information needed to communicate with the server
• Instructions on whether or not DHCP should be used to obtain an IP address for the remainder
of the install
• Where the Ignite-UX UI should run (or whether it should run at all)
• Time zone information


Push an Install
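[Figure: the server runs bootsys, which copies INSTALL and a customized INSTALLFS into /stand
on the target, updates the AUTO file in the target's LIF area, and reboots the target.]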


The bootsys command can be used to start a system installation on one or more clients without the
need to interact on the console of the client system. Each client must be currently booted under
HP-UX version 9.0 or newer, be accessible on the network, and have enough disk space in the
/stand directory to hold the two files /opt/ignite/boot/INSTALL and
/opt/ignite/boot/INSTALLFS.
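
The disk space requirement can be checked before a push with a sketch like the one below. It
assumes that df accepts the -P (POSIX output) and -k (1 KB units) options; the function names
are made up for this example.

```shell
# Total size of the given files in KB, rounded up.
kb_needed() {
    total=0
    for f in "$@"; do
        bytes=$(wc -c < "$f") || return 1
        total=$((total + bytes))
    done
    echo $(( (total + 1023) / 1024 ))
}

# Free space in KB on the file system holding directory $1.
kb_free() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds if directory $1 has room for the files named after it.
has_room() {
    dir=$1; shift
    [ "$(kb_free "$dir")" -ge "$(kb_needed "$@")" ]
}
```

On a machine where both the directory and the files are visible, the check would read
has_room /stand /opt/ignite/boot/INSTALL /opt/ignite/boot/INSTALLFS. In practice the files live
on the server and /stand on the client, so kb_needed would be run on the server and kb_free on
the client, and the two numbers compared.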

bootsys copies the Ignite-UX kernel (INSTALL) and RAM file system (INSTALLFS) to each
client and then sets the system AUTO file in the LIF area of the root disk to automatically boot
from this kernel at the next system reboot.

When the target machine boots, it runs the IUX kernel (INSTALL). The Ignite-UX kernel
finds the IUX server using the information stored in INSTALLFS.

Since the target machine is booting from its own hard disk, it does not use instl_bootd and does
not require a spare IP address. This type of install can be done on machines that are not capable
of booting over the network. It can also be used to install across gateways, since routing
information is preserved in INSTALLFS.

bootsys requires remsh permission for root on the client. If it cannot remsh as root, it
prompts the administrator for the root password.

Client systems can block the use of the bootsys command by the existence of the file
/.bootsys_block in the client's root directory.


QUIZ 29

You push an install from the server, but it fails.
How do you recover the previous system?
☐ No way, must re-install.
☐ You have only to reboot.
☐ Break at ISL and type hpux.
☐ Push the installation again.


If a push installation fails, the system will keep trying to install forever.
Why: the AUTO file no longer points to vmunix. It now points to the INSTALL kernel
that was copied into /stand from the Ignite server.

To recover the previous system, either modify the AUTO file or interrupt the boot at ISL
and boot with the hpux command.
If you need to load an image from the Ignite server, the best approach is to boot from the
LAN (BO LAN) at the BCH prompt.
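
Once the previous system is running again (after interrupting at ISL and typing hpux), the AUTO
file can be put back so that later reboots no longer start the installer. The sketch below only
prints the commands for review rather than running them. The device file /dev/rdsk/c0t6d0 is a
placeholder, and the plain "hpux" AUTO string is an assumption; a system may have used a longer
string before the push, so check what was there originally if you can.

```shell
# Print (do not run) the commands that would display and then reset the
# AUTO file in the LIF area of a boot disk.
# $1 = raw device file of the boot disk (placeholder value used below).
show_auto_reset_cmds() {
    disk=$1
    printf 'lifcp %s:AUTO -\n' "$disk"        # display the current AUTO string
    printf 'mkboot -a "hpux" %s\n' "$disk"    # write "hpux" back as the AUTO string
}

show_auto_reset_cmds /dev/rdsk/c0t6d0
```

Review the printed commands, substitute the real boot disk, and run them as root on the
recovered system.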
