Using the Shell Prompt
Running Commands from the Shell
Using Virtual Terminals
Choosing Your Shell
Checking Your Login Session
Checking Directories and Permissions
Checking System Activity
Exiting the Shell
Using the Shell in Linux
Locating Commands
Starting Background Processes
Using Foreground and Background Commands
Working with the Linux File System
Using File-Redirection Metacharacters
Listing Files
Copying Files
Moving and Renaming Files
Deleting Files and Directories
Changing Directories
Making Directories
Removing Directories
Making Links to Files or Directories
Concatenating Files
Viewing Files with more and less
Viewing the Start or End of Files
Searching Files with grep
Finding Files with find and locate
Basic User and Group Concepts
Creating Users and Groups
Working with File Ownership and Permissions
Mounting and Unmounting Filesystems
System Information Commands
Memory Reporting with the free Command
Virtual Memory Reporting with vmstat
Reclaiming Memory with the kill Command
Determining How Long Linux Has Been Running
Runlevels
Using the vi Text Editor
Automated Tasks
Cron
NFS
Setting Up an NFS Server
Getting the Services Started
The Daemons
Verifying That NFS Is Running
Setting Up an NFS Client
Mounting Remote Directories
Getting NFS File Systems to Be Mounted at Boot Time
Mount Options
NIS
How NIS Works
How NIS+ Works
Managing System Logs
Logrotate
The Difference Between Hard and Soft Links
File Compression and Archiving
Package Management with RPM
Compiling from the Original Source
yum
sysctl
Linux Partitions
Partition Types
LVM
UNIX Summary
    Introduction
    The UNIX Operating System: The Kernel; The Shell
    Files and Processes
    The Directory Structure
    Starting an X Terminal Session
Part One
    1.1 Listing files and directories: ls (list)
    1.2 Making directories: mkdir (make directory)
    1.3 Changing to a different directory: cd (change directory); Exercise 1a
    1.4 The directories . and ..
    1.5 Pathnames: pwd (print working directory); Exercise 1b
    1.6 More about home directories and pathnames: understanding pathnames; ~ (your home directory)
    Summary
Part Two
    2.1 Copying files: cp (copy); Exercise 2a
    2.2 Moving files: mv (move)
    2.3 Removing files and directories: rm (remove), rmdir (remove directory); Exercise 2b
    2.4 Displaying the contents of a file on the screen: clear, cat (concatenate), less, head, tail
    2.5 Searching the contents of a file: simple searching using less; grep; wc (word count)
    Summary
Part Three
    3.1 Redirection
    3.2 Redirecting the output; Exercise 3a
    3.3 Redirecting the input
    3.4 Pipes; Exercise 3b
    Summary
Part Four
    4.1 Wildcards: the characters * and ?
    4.2 Filename conventions
    4.3 Getting help: on-line manuals; apropos
    Summary
Part Five
    5.1 File system security (access rights): access rights on files; access rights on directories; some examples
    5.2 Changing access rights: chmod (changing a file mode); Exercise 5a
    5.3 Processes and jobs: running background processes; backgrounding a current foreground process
    5.4 Listing suspended and background processes
    5.5 Killing a process: kill (terminate or signal a process); ps (process status)
    Summary
Part Six
    Other useful UNIX commands: quota, df, du, compress, gzip, file, history
Part Seven
    7.1 Compiling UNIX software packages: compiling source code; make and the Makefile; configure
    7.2 Downloading source code
    7.3 Extracting the source code
    7.4 Configuring and creating the Makefile
    7.5 Building the package
    7.6 Running the software
    7.7 Stripping unnecessary code
Part Eight
    8.1 UNIX variables
    8.2 Environment variables
    8.3 Shell variables; the difference between PATH and path
    8.4 Using and setting variables
    8.5 Setting shell variables in the .cshrc file
    8.6 Setting the path
Unix - Frequently Asked Questions (1) [Frequent posting]
Unix - Frequently Asked Questions (2) [Frequent posting]
Unix - Frequently Asked Questions (3) [Frequent posting]
Unix - Frequently Asked Questions (4) [Frequent posting]
Unix - Frequently Asked Questions (5) [Frequent posting]

Using the Shell Prompt
If your Linux system has no graphical user interface (or one that isn't working at the moment), you will most likely see a shell prompt after you log in. Typing commands from the shell will probably be your primary means of using the Linux system. The default prompt for a regular user is simply a dollar sign:

$

The default prompt for the root user is a pound sign (also called a hash mark):

#

Running Commands from the Shell
In most Linux systems, the $ and # prompts are preceded by your username, system name, and current directory name. For example, a login prompt for the user named jake on a computer named pine with /tmp as the current directory would appear as:

[jake@pine tmp]$

You can change the prompt to display any characters you like—you can use the current directory, the date, the local computer name, or any string of characters as your prompt. Although there are a tremendous number of features available with the shell, it's easy to begin by just typing a few commands. Try some of the commands shown in the remainder of this section to become familiar with your current shell environment. In the examples that follow, the $ and # symbols indicate a prompt. The prompt is followed by the command that you type (and then you press Enter or Return, depending on your keyboard). The lines that follow show the output resulting from the command.
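In bash, the prompt is controlled by the PS1 shell variable. As a small sketch (assuming bash), the standard prompt escapes \u, \h, \W, and \d stand for the username, hostname, current directory, and date:

```shell
# Two example prompt settings for bash; \u, \h, \W, and \d are
# standard bash prompt escapes (user, host, working directory, date).
PS1='[\u@\h \W]\$ '   # the familiar [user@host dir]$ form
PS1='\d \$ '          # a prompt showing the current date instead
```

Setting PS1 at the prompt changes it only for the current shell session; to make a change permanent, set it in your ~/.bashrc file.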

Using Virtual Terminals
Many Linux systems, including Fedora and Red Hat Enterprise Linux, start multiple virtual terminals running on the computer. Virtual terminals are a way to have multiple shell sessions open at once without having a GUI running. You can switch between virtual terminals much the same way that you would switch between workspaces on a GUI. Press Ctrl+Alt+F1 (or F2, F3, F4, and so on up to F6 on Fedora and other Linux systems) to display one of six virtual terminals. The next virtual workspace after the virtual terminals is where the GUI is, so if there are six virtual terminals, you can return to the GUI (if one is running) by pressing Ctrl+Alt+F7. (For a system with four virtual terminals, you’d return to the GUI by pressing Ctrl+Alt+F5.)


Choosing Your Shell
In most Linux systems, your default shell is the bash shell. To find out what your current login shell is, type the following command:

$ echo $SHELL
/bin/bash

In this example, it's the bash shell. There are many other shells, and you can activate a different one by simply typing the new shell's command (ksh, tcsh, csh, sh, bash, and so forth) from the current shell. Most full Linux systems include all of the shells described in this section, but some smaller Linux distributions may include only one or two. The best way to find out whether a particular shell is available is to type its command and see whether the shell starts. You might want to choose a different shell because:

✦ You are used to UNIX System V systems (often ksh by default) or Sun Microsystems and other Berkeley UNIX-based systems (frequently csh by default), and you are more comfortable using the default shells from those environments.

✦ You want to run shell scripts that were created for a particular shell environment, and you need to run that shell to test or use those scripts.

✦ You might simply prefer the features of one shell over another. For example, a member of my Linux Users Group prefers ksh over bash because he doesn't like the way aliases are set up in bash.

If you don't like your default shell, simply type the name of the shell you want to try out temporarily. To change your shell permanently, use the usermod command. For example, to change the shell to csh for the user named chris, type the following as root user from a shell:
# usermod -s /bin/csh chris
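Besides echo $SHELL, you can read your assigned login shell straight out of /etc/passwd; a minimal sketch (this assumes your account is stored in the local passwd file rather than a purely network-based user database):

```shell
# Print the login-shell field (the 7th colon-separated field)
# of the current user's entry in /etc/passwd.
grep "^$(id -un):" /etc/passwd | cut -d: -f7
```

On most systems you can also change this field for your own account (without root) using chsh -s /path/to/shell.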

Checking Your Login Session
When you log in to a Linux system, Linux views you as having a particular identity, which includes your username, group name, user ID, and group ID. Linux also keeps track of your login session: it knows when you logged in, how long you have been idle, and where you logged in from. To find out information about your identity, use the id command as follows:
$ id
uid=501(chris) gid=105(sales) groups=105(sales),4(adm),7(lp)

In this example, the username is chris, which is represented by the numeric user ID (uid) 501. The primary group for chris is called sales, which has a group ID (gid) of 105. The user chris also belongs to other groups called adm (gid 4) and lp (gid 7). These names and numbers represent the permissions that chris has to access computer resources. (Permissions are described in the "Understanding File Permissions" section later in this chapter.)

You can see information about your current login session by using the who command. In the following example, the -u option says to add information about idle time and the process ID, and -H asks that a header be printed:
$ who -uH
NAME   LINE   TIME          IDLE   PID    COMMENT
chris  tty1   Jan 13 20:57  .      2013

The output from this who command shows that the user chris is logged in on tty1 (which is the monitor connected to the computer), and his login session began at 20:57 on January 13. The IDLE time shows how long the shell has been open without any command being typed (the dot indicates that it is currently active). PID shows the process ID of the user’s login shell. COMMENT would show the name of the remote computer the user had logged in from, if that user had logged in from another computer on the network, or the name of the local X display if you were using a Terminal window (such as :0.0).
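The individual identity fields can also be printed one at a time with standard options to the id command; a quick sketch (the example values in the comments echo the chris account shown above):

```shell
id -un   # username only (e.g. chris)
id -u    # numeric user ID only (e.g. 501)
id -gn   # primary group name (e.g. sales)
id -Gn   # all group names, space-separated (e.g. sales adm lp)
```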

Checking Directories and Permissions
Associated with each shell is a location in the Linux file system known as the current or working directory. Each user has a directory that is identified as the user’s home directory. When you first log in to Linux, you begin with your home directory as the current directory. When you request to open or save a file, your shell uses the current directory as the point of reference. Simply provide a filename when you save a file, and it is placed in the current directory. Alternatively, you can identify a file by its relation to the current directory (relative path), or you can ignore the current directory and identify a file by the full directory hierarchy that locates it (absolute path). The structure and use of the file system is described in detail later in this chapter. To find out what your current directory is, type the pwd command:
$ pwd
/usr/bin

In this example, the current/working directory is /usr/bin. To find out the name of your home directory, type the echo command, followed by the $HOME variable:
$ echo $HOME
/home/chris

Here the home directory is /home/chris. To get back to your home directory, just type the change directory (cd) command. (Although cd followed by a directory name changes the current directory to the directory that you choose, typing cd with no directory name takes you to your home directory):
$ cd

Instead of typing $HOME, you can use the tilde (~) to refer to your home directory. So, to return to your home directory, you could simply type:
cd ~
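Putting cd, pwd, and ~ together, here is a short sketch you can paste into a shell:

```shell
cd /tmp && pwd   # prints /tmp
cd && pwd        # cd alone returns home; prints your home directory
cd ~ && pwd      # ~ expands to $HOME, so this prints the same path
```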

To list the contents of your home directory, either type the full path to your home directory, or use the ls command without a directory name. Using the -a option to ls enables you to view the hidden files (dot files) as well as all other files. With the -l option, you can see a long, detailed list of information on each file. (You can put multiple single-letter options together after a single dash, for example, -la.)
$ ls -la /home/chris
total 158
drwxrwxrwx  2 chris sales   4096 May 12 13:55 .
drwxr-xr-x  3 root  root    4096 May 10 01:49 ..
-rw-------  1 chris sales   2204 May 18 21:30 .bash_history
-rw-r--r--  1 chris sales     24 May 10 01:50 .bash_logout
-rw-r--r--  1 chris sales    230 May 10 01:50 .bash_profile
-rw-r--r--  1 chris sales    124 May 10 01:50 .bashrc
drw-r--r--  1 chris sales   4096 May 10 01:50 .kde
-rw-rw-r--  1 chris sales 149872 May 11 22:49 letter

Displaying a long list (-l option) of the contents of your home directory shows you more about file sizes and directories. The total line shows the total amount of disk space used by the files in the list (158 kilobytes in this example). Directories such as the current directory (.) and the parent directory (..)—the directory above the current directory—are noted as directories by the letter d at the beginning of each entry (each directory begins with a d and each file begins with a -). The file and directory names are shown in column 7. In this example, a dot (.) represents /home/chris and two dots (..) represents /home. Most of the files in this example are dot (.) files that are used to store GUI properties (.kde directory) or shell properties (.bash files). The only non-dot file in this list is the one named letter. The number of characters shown for a directory (4096 bytes in these examples) reflects the size of the file containing information about the directory. While this number can grow above 4096 bytes for a directory that contains a lot of files, this number doesn’t reflect the size of files contained in that directory.
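You can see the effect of the -a and -l options for yourself with a throwaway directory; a small sketch (the /tmp/lsdemo path is just an example name):

```shell
mkdir -p /tmp/lsdemo && cd /tmp/lsdemo
touch .hidden visible
ls       # shows only: visible
ls -a    # adds the dot files: . .. .hidden visible
ls -la   # long form: permissions, links, owner, group, size, date, name
```

Because .hidden starts with a dot, plain ls skips it; only ls -a (or ls -la) reveals it.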

Checking System Activity
In addition to being a multiuser operating system, Linux is also a multitasking system. Multitasking means that many programs can be running at the same time. An instance of a running program is referred to as a process. Linux provides tools for listing running processes, monitoring system usage, and stopping (or killing) processes when necessary.


The most common utility for checking running processes is the ps command. Use it to see which programs are running, the resources they are using, and who is running them. Here’s an example of the ps command:
$ ps -au
USER  PID   %CPU %MEM  VSZ  RSS  TTY    STAT START TIME COMMAND
root  2146  0.0  0.8  1908  1100 ttyp0  S    14:50 0:00 login -- jake
jake  2147  0.0  0.7  1836  1020 ttyp0  S    14:50 0:00 -bash
jake  2310  0.0  0.7  2592   912 ttyp0  R    18:22 0:00 ps -au

In this example, the -a option asks to show processes of all users who are associated with your current terminal, and the -u option asks that usernames be shown, as well as other information such as the time the process started and memory and CPU usage. On this shell session, there isn't much happening. The first process shows that the user named jake logged in to the login process (which is controlled by the root user). The next process shows that jake is using a bash shell and has just run the ps -au command. The terminal device ttyp0 is being used for the login session.

The STAT column represents the state of the process, with R indicating a currently running process and S representing a sleeping process. The USER column shows the name of the user who started the process. Each process is represented by a unique ID number referred to as a process ID (PID); you can use the PID if you ever need to kill a runaway process. The %CPU and %MEM columns show the percentage of the processor and random access memory, respectively, that the process is consuming. VSZ (virtual set size) shows the size of the process image (in kilobytes), and RSS (resident set size) shows the size of the program in memory. START shows the time the process began running, and TIME shows the cumulative system time used. Also try the top, free, and vmstat commands.
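For example, assuming the procps tools (the usual providers of ps, free, and vmstat on Linux) are installed, a quick system check might look like this:

```shell
# A few process and memory snapshots; assumes the procps tools,
# which provide ps, free, and vmstat on most Linux distributions.
ps -eo pid,user,stat,comm | head -5   # first few processes, system-wide
free -m                               # memory usage, in megabytes
vmstat                                # one-shot virtual-memory summary
```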

Exiting the Shell
To exit the shell when you are done, type exit or press Ctrl+D. You’ve just seen a few commands that can help you quickly familiarize yourself with your Linux system. There are hundreds of other commands that you can try. You’ll find many in the /bin and /usr/bin directories, and you can use ls to see a directory’s command list: ls /bin, for example, results in a list of commands in the /bin directory. Then use the man command (for example, man hostname) to see what each command does. There are also administrative commands in the /sbin and /usr/sbin directories.

Using the Shell in Linux
When you type a command in a shell, you can include other characters that change or add to how the command works. In addition to the command itself, these are some of the other items that you can type on a shell command line:

✦ Options—Most commands have one or more options you can add to change their behavior. Options typically consist of a single letter, preceded by a dash. You can also often combine several options after a single dash. For example, the command ls -la lists the contents of the current directory. The -l asks for a detailed (long) list of information, and the -a asks that files beginning with a dot (.) also be listed. When a single option consists of a word, it is usually preceded by a double dash (--). For example, to use the help option on many commands, you enter --help on the command line. You can use the --help option with most commands to see the options and arguments that they support. For example, hostname --help.
✦ Arguments—Many commands also accept arguments after certain options are entered or at the end of the entire command line. An argument is an extra piece of information, such as a filename, that can be used by the command. For example, cat /etc/passwd displays the contents of the /etc/passwd file on your screen. In this case, /etc/passwd is the argument.
✦ Environment variables—The shell itself stores information that may be useful to the user’s shell session in what are called environment variables. Examples of environment variables include $SHELL (which identifies the shell you are using), $PS1 (which defines your shell prompt), and $MAIL (which identifies the location of your mailbox). See the “Using Shell Environment Variables” section later in this chapter for more information. You can check your environment variables at any time. Type declare to list the current environment variables. Or you can type echo $VALUE, where VALUE is replaced by the name of a particular environment variable you want to list.
✦ Metacharacters—These are characters that have special meaning to the shell. They can be used to direct the output of a command to a file (>), pipe the output to another command (|), and run a command in the background (&), to name a few. Metacharacters are discussed later in this chapter.
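A short session pulling these four elements together might look like this (a sketch; the file name /tmp/etc-list is arbitrary):

```shell
# An option (-l) and an argument (/etc) on one command line,
# piped (the | metacharacter) into another command
ls -l /etc | head -3

# Environment variables, expanded by the shell before the command runs
echo "My shell is $SHELL and my home directory is $HOME"

# The > metacharacter redirects output to a file;
# < feeds a file back in as a command's input
ls /etc > /tmp/etc-list
wc -l < /tmp/etc-list
```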

Locating Commands
If you know the directory that contains the command you want to run, one way to run it is to type the full path to that command. For example, you run the date command from the /bin directory by typing:
$ /bin/date

Of course, this can be inconvenient, especially if the command resides in a directory with a long path name. The better way is to have commands stored in well-known directories, and then add those directories to your shell’s PATH environment variable. The path consists of a list of directories that are checked sequentially for the commands you enter. To see your current path, type the following:
$ echo $PATH
/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:/home/chris/bin

Here are some places you can look to supplement what you learn in this chapter:


✦ Check the PATH—Type echo $PATH. You see a list of the directories containing commands that are immediately accessible to you. Listing the contents of those directories displays most standard Linux commands.
✦ Use the help command—Some commands are built into the shell, so they do not appear in a directory. The help command lists those commands and shows options available with each of them. (Type help | less to page through the list.) For help with a particular built-in command, type help command, replacing command with the name that interests you. The help command works with the bash shell only.
✦ Use --help with the command—Many commands include a --help option that you can use to get information about how the command is used. For example, type date --help | less. The output shows not only options, but also time formats you can use with the date command.
✦ Use the man command—To learn more about a particular command, type man command. (Replace command with the command name you want.) A description of the command and its options appears on the screen.
To find out where in your path a particular command is located, you can use the type command (in the bash shell):

$ type bash
bash is /bin/bash

To try out a bit of command-line editing, type the following:
$ ls /usr/bin | sort -f | less

This command lists the contents of the /usr/bin directory, sorts the contents in alphabetical order (regardless of case), and pipes the output to less. The less command displays the first page of output, after which you can go through the rest of the output a line (press Enter) or a page (press space bar) at a time (press Q when you are done). To view your history list, use the history command. Type the command without options or followed by a number to list that many of the most recent commands. For example:
$ history 8
382 date
383 ls /usr/bin | sort -a | more
384 man sort
385 cd /usr/local/bin
386 man more
387 useradd -m /home/chris -u 101 chris
389 history 8

A number precedes each command line in the list. There are several ways to run a command immediately from this list, including:

✦ !n—Run command number n. Replace the n with the number of the command line, and that line is run. For example, here’s how to repeat the date command shown as command number 382 in the preceding history listing:

$ !382
date
Thu Apr 13 21:30:06 PDT 2006

✦ !!—Run previous command. Runs the previous command line. Here’s how you’d immediately run that same date command:

$ !!
date
Thu Apr 13 21:30:39 PDT 2006

Starting Background Processes
If you have programs that you want to run while you continue to work in the shell, you can place the programs in the background. To place a program in the background at the time you run the program, type an ampersand (&) at the end of the command line, like this:
$ find /usr > /tmp/allusrfiles &

This example command finds all files under the /usr directory and places the list of filenames in the file /tmp/allusrfiles. The ampersand (&) runs that command line in the background. To check which commands you have running in the background, use the jobs command, as follows:
$ jobs
[1]  Stopped (tty output)  vi /tmp/myfile
[2]  Running               find /usr -print > /tmp/allusrfiles &
[3]  Running               nroff -man /usr/man2/* >/tmp/man2 &
[4]- Running               nroff -man /usr/man3/* >/tmp/man3 &
[5]+ Stopped               nroff -man /usr/man4/* >/tmp/man4

The first job shows a text-editing command (vi) that I placed in the background and stopped by pressing Ctrl+Z while I was editing. Job 2 shows the find command I just ran. Jobs 3 and 4 show nroff commands currently running in the background.

Job 5 had been running in the shell (foreground) until I decided too many processes were running and pressed Ctrl+Z to stop job 5 until a few processes had completed.
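In a script you cannot press Ctrl+Z, but the same pieces can be sketched non-interactively; $! holds the PID of the most recent background command (the sleep command here just stands in for a long-running job):

```shell
# Start a long-running command in the background; $! is its PID
sleep 60 &
bgpid=$!

# jobs lists it (in an interactive shell, fg %1 or bg %1 would act on it)
jobs

# kill stops the background job; wait collects its exit status
kill "$bgpid"
wait "$bgpid" 2>/dev/null
echo "background job finished"
```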

Using Foreground and Background Commands
Continuing with the example, you can bring any of the commands on the jobs list to the foreground. For example, to edit myfile again, type:
$ fg %1

As a result, the vi command opens again, with all text as it was when you stopped the vi job.

✦ %—Refers to the most recent command put into the background (indicated by the plus sign when you type the jobs command). This action brings the command to the foreground.


✦ %string—Refers to a job where the command begins with a particular string of characters. The string must be unambiguous. (In other words, typing %vi when there are two vi commands in the background results in an error message.)
✦ %?string—Refers to a job where the command line contains a string at any point. The string must be unambiguous or the match will fail.
✦ %-—Refers to the previous job stopped before the one most recently stopped.

If a command is stopped, you can start it running again in the background using the bg command. For example, take job 5 from the jobs list in the previous example:

[5]+ Stopped nroff -man man4/* >/tmp/man4

Type the following:
$ bg %5

After that, the job runs in the background. Its jobs entry appears as follows:

[5] Running nroff -man man4/* >/tmp/man4 &

Working with the Linux File System
The Linux file system is the structure in which all the information on your computer is stored. Files are organized within a hierarchy of directories. Each directory can contain files, as well as other directories. If you were to map out the files and directories in Linux, it would look like an upside-down tree. At the top is the root directory, which is represented by a single slash (/). Below that is a set of common directories in the Linux system, such as bin, dev, home, lib, and tmp, to name a few. Each of those directories, as well as directories added to the root, can contain subdirectories. Figure 2-1 illustrates how the Linux file system is organized as a hierarchy. To demonstrate how directories are connected, the figure shows a /home directory that contains subdirectories for three users: chris, mary, and tom. Within the chris directory are subdirectories: briefs, memos, and personal. To refer to a file called inventory in the chris/memos directory, you can type the full path of /home/chris/memos/inventory. If your current directory is /home/chris/memos, you can refer to the file as simply inventory. Some of the Linux directories that may interest you include the following:

✦ /bin—Contains common Linux user commands, such as ls, sort, date, and chmod.
✦ /boot—Has the bootable Linux kernel and boot loader configuration files (GRUB).
✦ /dev—Contains files representing access points to devices on your systems. These include terminal devices (tty*), floppy disks (fd*), hard disks (hd*), RAM (ram*), and CD-ROM (cd*). (Users normally access these devices directly through the device files.)

✦ /etc—Contains administrative configuration files.
✦ /home—Contains directories assigned to each user with a login account.
✦ /media—Provides a standard location for mounting and automounting devices, such as remote file systems and removable media (with directory names of cdrecorder, floppy, and so on).
✦ /mnt—A common mount point for many devices before it was supplanted by the standard /media directory. Some bootable Linux systems still use this directory to mount hard disk partitions and remote file systems.
✦ /proc—Contains information about system resources.
✦ /root—Represents the root user’s home directory.
✦ /sbin—Contains administrative commands and daemon processes.
✦ /sys—A /proc-like file system, new in the Linux 2.6 kernel, intended to contain files for getting hardware status and reflecting the system’s device tree as it is seen by the kernel. It pulls many of its functions from /proc.
✦ /tmp—Contains temporary files used by applications.
✦ /usr—Contains user documentation, games, graphical files (X11), libraries (lib), and a variety of other user and administrative commands and files.
✦ /var—Contains directories of data used by various applications. In particular, this is where you would place files that you share as an FTP server (/var/ftp) or a Web server (/var/www). It also contains all system log files (/var/log).
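You can take a quick look at several of these directories at once; the -d option makes ls describe each directory itself rather than its contents (the exact ownerships and dates will vary from system to system):

```shell
# Compare ownership and permissions at the top of the tree
ls -ld / /etc /tmp /usr /var

# /proc is a window onto the running kernel, not real files on disk
cat /proc/version
```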

Using File-Matching Metacharacters

✦ *—Matches any number of characters.
✦ ?—Matches any one character.
✦ [...]—Matches any one of the characters between the brackets, which can include a dash-separated range of letters or numbers.

Try out some of these file-matching metacharacters by first going to an empty directory (such as the test directory described in the previous section) and creating some empty files:
$ touch apple banana grape grapefruit watermelon

The touch command creates empty files. The next few commands show you how to use shell metacharacters with the ls command to match filenames. Try the following commands to see if you get the same responses:
$ ls a*
apple
$ ls g*
grape grapefruit
$ ls g*t
grapefruit
$ ls *e*
apple grape grapefruit watermelon
$ ls *n*
banana watermelon

The first example matches any file that begins with an a (apple). The next example matches any files that begin with g (grape, grapefruit). Next, files beginning with g and ending in t are matched (grapefruit). Next, any file that contains an e in the name is matched (apple, grape, grapefruit, watermelon). Finally, any file that contains an n is matched (banana, watermelon). Here are a few examples of pattern matching with the question mark (?):
$ ls ????e
apple grape
$ ls g???e*
grape grapefruit

The first example matches any five-character file that ends in e (apple, grape). The second matches any file that begins with g and has e as its fifth character (grape, grapefruit). Here are a couple of examples using brackets to do pattern matching:
$ ls [abw]*
apple banana watermelon
$ ls [agw]*[ne]
apple grape watermelon

In the first example, any file beginning with a, b, or w is matched. In the second, any file that begins with a, g, or w and also ends with either n or e is matched. You can also include ranges within brackets. For example:
$ ls [a-g]*
apple banana grape grapefruit

Here, any filename beginning with a letter from a through g is matched.


Using File-Redirection Metacharacters
Commands receive data from standard input and send it to standard output. Using pipes (described earlier), you can direct standard output from one command to the standard input of another. With files, you can use less-than (<) and greater-than (>) signs to direct data to and from files. Here are the file-redirection characters:

✦ <—Directs the contents of a file to the command.
✦ >—Directs the output of a command to a file, deleting the existing file.
✦ >>—Directs the output of a command to a file, adding the output to the end of the existing file.

Here are some examples of command lines where information is directed to and from files:
$ mail root < ~/.bashrc
$ man chmod | col -b > /tmp/chmod
$ echo "I finished the project on $(date)" >> ~/projects

In the first example, the contents of the .bashrc file in the home directory are sent in a mail message to the computer’s root user. The second command line formats the chmod man page (using the man command), removes extra backspaces (col -b), and sends the output to the file /tmp/chmod (erasing the previous /tmp/chmod file, if it exists). The final command results in the following text’s being added to the user’s projects file: I finished the project on Sat Jan 25 13:46:49 PST 2006

Listing Files
The ls (list) command lists files in the current directory. The ls command has a very large number of options, but what you really need to know is that ls -l gives a “long” listing showing the file sizes and permissions, and that the -a option shows even “hidden” files—those with a dot at the start of their names. The shell expands the * character to mean “any string of characters not starting with ‘.’.” (See the discussion of wildcards in the “Advanced Shell Features” section earlier in this chapter for more information about how and why this works.) Therefore, *.doc is interpreted as any filename ending with .doc that does not start with a dot, and a* means “any filename starting with the letter a.” For example:

✦ ls -la—Gives a long listing of all files in the current directory, including “hidden” files with names starting with a dot
✦ ls a*—Lists all files in the current directory whose names start with a
✦ ls -l *.doc—Gives a long listing of all files in the current directory whose names end with .doc

Copying Files
The cp (copy) command copies a file, files, or a directory to another location. The option -R allows you to copy directories recursively (in general, -R or -r in commands often has the meaning of “recursive”). If the last argument to the cp command is a directory, the files mentioned will be copied into that directory. Note that by default, cp will “clobber” existing files, so in the second example that follows, if there is already a file called afile in the directory /home/bible, it will be overwritten without asking for any confirmation. Consider the following examples:

✦ cp afile afile.bak—Copies the file afile to a new file afile.bak.
✦ cp afile /home/bible/—Copies the file afile from the current directory to the directory /home/bible/.
✦ cp * /tmp—Copies all nonhidden files in the current directory to /tmp/.
✦ cp -a docs docs.bak—Recursively copies the directory docs beneath the current directory to a new directory docs.bak, while preserving file attributes and copying all files, including hidden files whose names start with a dot. The -a option implies the -R option, as a convenience.
✦ cp -i—By default, if you copy a file to a location where a file of the same name already exists, the old file will be silently overwritten. The -i option makes the command interactive; in other words, it asks before overwriting.
✦ cp -v—With the -v (verbose) option, the cp command will tell you what it is doing. A great many Linux commands have a -v option with the same meaning.

Moving and Renaming Files
The mv (move) command has the meaning both of “move” and of “rename.” In the first example that follows, the file afile will be renamed to the name bfile. In the second example, the file afile in the current directory will be moved to the directory /tmp/.

✦ mv afile bfile—Renames the existing file afile with the new name bfile
✦ mv afile /tmp—Moves the file afile in the current directory to the directory /tmp

Deleting Files and Directories
The rm (remove) command enables you to delete files and directories. Be warned: rm is a dangerous command. It doesn’t really offer you a second chance. When files are deleted, they’re gone. You can use rm -i as in the last example that follows. That at least gives you a second chance to think about it, but as soon as you agree, once again, the file is gone. Some people like to create an alias (see Chapter 14) that makes the rm command act like rm -i. We would advise at least to be careful about this: It can lull you into a false sense of security, and when you’re working on a system where this change has not been made, you may regret it. Doug Gwyn, a well-known Internet personality, once said, “Unix was never designed to keep people from doing stupid things because that policy would also keep them from doing clever things.” You can, of course, use rm to delete every file on your system as simply as this: rm -rf /. (You have to be logged in as a user, such as the root user, who has the privileges to do this, but you get the idea.) Some better examples of using the rm command in daily use are:

✦ rm afile—Removes the file afile.
✦ rm *—Removes all (nonhidden) files in the current directory. The rm command will not remove directories unless you also specify the -r (recursive) option.
✦ rm -rf doomed—Removes the directory doomed and everything in it.
✦ rm -i a*—Removes all files with names beginning with a in the current directory, asking for confirmation each time.
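The alias mentioned above can be sketched as follows; note that aliases are expanded only in interactive shells, which is exactly why you should not rely on this safety net being present in a script or on another machine:

```shell
# Make rm prompt before every deletion in interactive shells
# (typically this line would go in ~/.bashrc)
alias rm='rm -i'

# To bypass the alias for a single command, prefix a backslash
# or use the command builtin:
#   \rm -f scratchfile
#   command rm -f scratchfile

# Remove the alias from the current shell again
unalias rm
```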

Changing Directories
You use the cd (change directory) command to change directories:

✦ cd ~—Changes to your home directory
✦ cd /tmp—Changes to the directory /tmp

On most Linux systems, your prompt will tell you what directory you’re in (depending on the setting you’ve used for the PS1 environment variable). However, if you ever explicitly need to know what directory you’re in, you can use the pwd (print working directory) command.

Making Directories
You can use the mkdir (make directory) command to make directories. For example:

✦ mkdir photos—Makes a directory called photos within the current directory.
✦ mkdir -p this/that/theother—Makes the nested subdirectories named within the current directory.

Removing Directories
The command rmdir will remove a directory that is empty.

Making Links to Files or Directories
In Linux, you can use the ln (link) command to make links to a file or directory. A file can have any number of so-called “hard” links to it. Effectively, these are alternative names for the file. So if you create a file called afile, and make a link to it called bfile, there are now two names for the same file. If you edit afile, the changes you’ve made will be in bfile. But if you delete afile, bfile will still exist; it will disappear only when there are no links left to it. Hard links can be made only on the same filesystem—you can’t create a hard link to a file on another partition because the link operates at the filesystem level, referring to the actual filesystem data structure that holds information about the file. You can create a hard link only to a file, not to a directory.
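You can watch this behavior directly in a scratch directory; the filenames afile and bfile follow the example in the text:

```shell
cd "$(mktemp -d)"              # work somewhere disposable

echo "first version" > afile
ln afile bfile                 # bfile is a second name for the same file

ls -l afile bfile              # note the link count of 2 in column two
echo "second version" > afile
cat bfile                      # prints: second version

rm afile
cat bfile                      # still prints: second version
```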


You can also create a symbolic link to a file. A symbolic link is a special kind of file that redirects any usage of the link to the original file. This is somewhat similar to the use of “shortcuts” in Windows. You can also create symbolic links to directories, which can be very useful if you frequently use a subdirectory that is hidden several levels deep below your home directory. In the last example that follows, you will end up with a symbolic link called useful in the current directory. Thus, the command cd useful will have the same effect as cd docs/linux/suse/useful.

✦ ln afile bfile—Makes a “hard” link to afile called bfile
✦ ln -s afile linkfile—Makes a symbolic link to afile called linkfile
✦ ln -s docs/linux/suse/useful—Makes a symbolic link to the named directory in the current directory

Concatenating Files
The command cat (concatenate) displays files to standard output. If you want to view the contents of a short text file, the easiest thing to do is to cat it, which sends its contents to the shell’s standard output, which is the shell in which you typed the cat command. If you cat two files, you will see the contents of each flying past on the screen. But if you want to combine those two files into one, all you need to do is cat them and redirect the output of the cat command to a file using >. Linux has a sense of humor. The cat command displays files to standard output, starting with the first line and ending with the last. The tac command (cat spelled backward) displays files in reverse order, beginning with the last line and ending with the first. The command tac is amusing: Try it!

✦ cat /etc/passwd—Prints /etc/passwd to the screen
✦ cat afile bfile—Prints the contents of afile to the screen followed by the contents of bfile
✦ cat afile bfile > cfile—Combines the contents of afile and bfile and writes them to a new file, cfile

Viewing Files with more and less
The more and less commands are known as pagers because they allow you to view the contents of a text file one screen at a time and to page forward and backward through the file (without editing it). The name of the more command is derived from the fact that it allows you to see a file one screen at a time, thereby seeing “more” of it. The name of the less command comes from the fact that it originally began as an open source version of the more command (before more itself became an open source command) and because it originally did less than the more command (the author had a sense of humor). Nowadays, the less command has many added features, including the fact that you can use keyboard shortcuts such as pressing the letter b when viewing a file to move backward through the file. The man page of less lists all the other hot keys that can be used for navigating through a file while reading it using less. Both more and less use the hot key q to exit.

✦ more /etc/passwd—Views the contents of /etc/passwd


✦ less /etc/passwd—Views the contents of /etc/passwd

Viewing the Start or End of Files
The head and tail commands allow you to see a specified number of lines from the top or bottom of a file. The tail command has the very useful feature that you can use tail -f to keep an eye on a file as it grows. This is particularly useful for watching what is being written to a log file while you make changes in the system. Consider the following examples:

✦ head -n5 /etc/passwd—Prints the first five lines of the file /etc/passwd to the screen
✦ tail -n5 /etc/passwd—Prints the last five lines of /etc/passwd to the screen
✦ tail -f /var/log/messages—Views the last few lines of /var/log/messages and continues to display changes to the end of the file in real time

Searching Files with grep
The grep (global regular expression print) command is a very useful tool for finding text in files. It can do much more than even the examples that follow this paragraph indicate. Beyond simply searching for fixed strings, it can search for regular expressions. It’s a regular expression parser, and regular expressions are a subject for a book in themselves.

✦ grep bible /etc/exports—Looks for all lines in the file /etc/exports that include the string bible
✦ tail -100 /var/log/apache/access.log | grep 404—Looks for the string 404, the web server’s “file not found” code, in the last hundred lines of the web server log
✦ tail -100 /var/log/apache/access.log | grep -v googlebot—Looks in the last 100 lines of the web server log for lines that don’t indicate accesses by the Google search robot
✦ grep -v ^# /etc/apache2/httpd.conf—Looks for all lines that are not commented out in the main Apache configuration file

Finding Files with find and locate
The find command searches the filesystem for files that match a specified pattern. The locate command provides a faster way of finding files but depends on a database that it creates and refreshes at regular intervals. The locate command is fast and convenient, but the information it displays may not always be up-to-date—this depends on whether its database is up-to-date. To use the locate command, you need to have the package findutils-locate installed. find is a powerful command with many options, including the ability to search for files with date stamps in a particular range (useful for backups) and to search for files with particular permissions, owners, and other attributes. The documentation for find can be found in its info pages: info find.

✦ find . -name "*.rpm"—Finds RPM packages in the current directory
✦ find . | grep page—Finds files in the current directory and its subdirectories with the string page in their names
✦ locate traceroute—Finds files with names including the string traceroute anywhere on the system
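A few of the date-stamp and permission searches mentioned above can be sketched like this (GNU find syntax; the paths are just examples):

```shell
# Files under /etc modified within the last day
# (-mtime -1 means "less than 24 hours ago")
find /etc -type f -mtime -1 2>/dev/null

# Files in your home directory larger than 1MB (GNU size suffix)
find "$HOME" -type f -size +1M 2>/dev/null

# World-writable files under /tmp (-perm -0002 tests the o+w bit)
find /tmp -type f -perm -0002 2>/dev/null
```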

Basic User and Group Concepts
Linux is a truly multiuser operating system. The concept of users and groups in Linux is inherited from the Unix tradition, and among other things provides a very clear and precise distinction between what normal users can do and what a privileged user can do (such as the root user, the superuser and ultimate administrator on a Linux system, who can do anything). The fact that the system of users and groups and the associated system of permissions is built into the system at the deepest level is one of the reasons why Linux (and Unix in general) is fundamentally secure in a way that Microsoft Windows is not. Although modern versions of Windows have a similar concept of users and groups, the associated concept of the permissions with which a process can be run leaves a lot to be desired. This is why there are so many Windows vulnerabilities that are based on exploiting the scripting capabilities of programs that are run with user privileges but that turn out to be capable of subverting the system. If you’re interested in the differences between the major operating systems, Eric Raymond, noted open source guru and philosopher, offers some interesting comparisons and discussion at www.catb.org/~esr/writings/taoup/html/ch03s02.html.

Every Linux system has a number of user accounts: Some of these are human users, and some of them are system users, which are user identities that the system uses to perform certain tasks. The users on a system (provided it does authentication locally) are listed in the file /etc/passwd. Look at your own entry in /etc/passwd; it will look something like this:

roger:x:1000:100:Roger Whittaker:/home/roger:/bin/bash

This shows, among other things, that the user with username roger has the real name Roger Whittaker, that his home directory is /home/roger, and that his default shell is /bin/bash (the bash shell).
There will almost certainly also be an entry for the system user postfix, looking something like this:

postfix:x:51:51:Postfix Daemon:/var/spool/postfix:/bin/false

This is the postfix daemon, which looks after mail. This user can’t log in because its shell is /bin/false, but its home directory is /var/spool/postfix, and it owns the spool directories in which mail being sent and delivered is held. The fact that these directories are owned by the user postfix rather than by root is a security feature—it means that any possible vulnerability in postfix is less likely to lead to


a subversion of the whole system. Similar system users exist for the web server (the user wwwrun) and various other services. You won’t often need to consider these, but it is important to understand that they exist and that the correct ownership of certain files and directories by these users is part of the overall security model of the system as a whole. Each user belongs to one or more groups. The groups on the system are listed in the file /etc/group. To find out what groups you belong to, you can simply type the command groups (or look at the file /etc/group and search for your username). By default, on a SUSE system, you will find that you belong to the group users and also to a few system groups, including the groups dialout and audio. This is to give normal human users the right to use the modem and sound devices (which is arranged through file permissions, as you shall see later in this chapter).

Creating Users and Groups
The useradd command has options that allow you to specify the groups to which the new user will belong:
useradd -c "Guest User" -u 5555 -g 500 -G 501 -m -d /home/guest -s /bin/bash -p password guest

We would not recommend adding users directly to the /etc/passwd file unless you have some experience with Linux; if you do edit it by hand, check that the /etc/group and /etc/shadow files are consistent with your changes. (Note also that the -p option of useradd expects an already encrypted password, not a plain-text one.) To delete a user, the command is userdel. Other useful commands are groupadd and groupdel, which create and delete groups. To see who has logged in to the system, you can try:

$ last

You might also want to see what commands such as who, whoami, and id do.
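A quick sketch of those session-information commands (last reads the login records in /var/log/wtmp, so it may print nothing on a freshly installed system):

```shell
whoami            # your effective username
id                # numeric UID and GID, plus all group memberships
who               # everyone currently logged in, with terminal and time
last -n 5         # the five most recent logins recorded on the system
```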

Working with File Ownership and Permissions
The users and groups discussed in the previous section are useful only because each file on the system is owned by a certain user and group and because the system of file permissions can be used to restrict or control access to the files based on the user who is trying to access them. The section that follows is a crash course in file permissions; we go into greater detail in Chapter 13. If you look at a variety of files and directories from across the system and list them with the ls -l command, you can see different patterns of ownership and permissions. In each case the output from the ls command is giving you several pieces of information: the permissions on the file expressed as a ten-place string, the number of links to the file, the ownership of the file (user and group), the size of the file in bytes, the modification time, and the filename. Of the ten places in the permissions string, the first differs from the others: The last nine can be broken up into three groups of three, representing what the user can do with the file, what members of the group can do with the file, and what others can do with the file, respectively. In most cases, these permissions are represented by the presence or absence of the letters r (read), w (write), and x (execute) in the three positions. So:

✦ rwx means permission to read, write, and execute
✦ r-- means permission to read but not to write or execute
✦ r-x means permission to read and execute but not to write

Permission to write to a file includes the right to overwrite or delete it. So for example:
ls -l screenshot1.png
-rw-r--r-- 1 roger users 432686 2004-05-17 20:33 screenshot1.png

This file can be read and written by its owner (roger), can be read by members of the group users, and can be read by others.
ls -l /home/roger/afile
-r-------- 1 roger users 0 2004-05-17 21:07 afile

This file is not executable or writable, and can be read only by its owner (roger). Even roger would have to change the permissions on this file to be able to write it.
ls -l /etc/passwd
-rw-r--r-- 1 root root 1598 2004-05-17 19:36 /etc/passwd

This is the password file—it is owned by root (and the group root, to which only root belongs), is readable by anyone, but is writable only by root.
ls -l /etc/shadow
-rw-r----- 1 root shadow 796 2004-05-17 19:36 /etc/shadow

This is the shadow file, which holds the encrypted passwords for users. It can be read only by root and the system group shadow and can be written only by root.
ls -l /usr/sbin/traceroute
-rwxr-xr-x 1 root root 14228 2004-04-06 02:27 /usr/sbin/traceroute

This is an executable file that can be read and executed by anyone, but written only by root.
ls -ld /home
drwxr-xr-x 6 root root 4096 2004-05-17 19:36 /home

This is a directory (note the use of the -d flag to the ls command and the d in the first position of the permissions). It can be read and written by the root user, and read and executed by everyone. When used in directory permissions, the x (execute) permission translates into the ability to search or examine the directory—you cannot execute a directory.


ls -ld /root
drwx------ 18 root root 584 2004-05-14 08:29 /root

In the preceding code, /root is the root user’s home directory. No user apart from root can access it in any way.
ls -l /bin/mount
-rwsr-xr-x 1 root root 87296 2004-04-06 14:17 /bin/mount

This is a more interesting example: notice the letter s where until now we saw an x. This indicates that the file runs with the permissions of its owner (root) even when it is executed by another user: Such a file is known as being suid root (set user ID upon execution). There are a small number of executables on the system that need to have these permissions. This number is kept as small as possible because there is a potential for security problems if ever a way could be found to make such a file perform a task other than what it was written for.
ls -l alink
lrwxrwxrwx 1 roger users 8 2004-05-17 22:19 alink -> file.bz2

Note the l in the first position: this is a symbolic link to file.bz2 in the same directory.

Numerical Permissions
On many occasions when permissions are discussed, you will see them described in a three-digit numerical form (sometimes more digits for exceptional cases), such as 644. If a file has permissions 644, it has read and write permissions for the owner and read permissions for the group and for others. This works because Linux actually stores file permissions as sequences of octal numbers. This is easiest to see by example:
421 421 421
-rw-r--r--  644
-rwxr-xr-x  755
-r--r--r--  444
-r--------  400

So for each of owner, group, and others, a read permission is represented by 4 (the high bit of a 3-bit octal value), a write permission is represented by 2 (the middle bit), and an execute permission is represented by 1 (the low bit).

Changing Ownership and Permissions
You can change the ownership of a file with the command chown. If you are logged in as root, you can issue a command like this:

chown harpo:users file.txt

This changes the ownership of the file file.txt to the user harpo and the group users. To change the ownership of a directory and everything in it, you can use the command with the -R (recursive) option, like this:

chown -R harpo:users /home/harpo/some_directory/

The chmod command is used to change file permissions. You can use chmod with both the numerical and the rwx notation we discussed earlier in the chapter. Again, this is easiest to follow by looking at a few examples:
✦ chmod u+x afile—Adds execute permission for the owner of the file
✦ chmod g+r afile—Adds read permission for the group owning the file
✦ chmod o-r afile—Removes read permission for others
✦ chmod a+w afile—Adds write permission for all
✦ chmod 644 afile—Changes the permissions to 644 (owner can read and write; group members and others can only read)
✦ chmod 755 afile—Changes the permissions to 755 (owner can read, write, and execute; group members and others can only read and execute)
If you use chmod with the rwx notation, u means the owner, g means the group, o means others, and a means all. In addition, + means add permissions and - means remove permissions, while r, w, and x still represent read, write, and execute, respectively. When setting permissions, you can see the translation between the two notations by executing the chmod command with the -v (verbose) option. For example:
# chmod -v 755 afile
mode of 'afile' changed to 0755 (rwxr-xr-x)
# chmod -v 200 afile
mode of 'afile' changed to 0200 (-w-------)
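You can experiment with both notations safely on a scratch file. This sketch assumes GNU coreutils (for mktemp and stat -c):

```shell
# Create a scratch file and set its permissions both ways.
f=$(mktemp)

chmod 644 "$f"                   # numeric: rw-r--r--
perms_644=$(stat -c '%a' "$f")   # read back the octal mode

chmod u+x "$f"                   # symbolic: add execute for the owner
perms_744=$(stat -c '%a' "$f")   # 644 plus owner execute = 744

echo "$perms_644 $perms_744"
rm -f "$f"
```

Adding u+x to a 644 file yields 744, which shows how the symbolic and octal forms describe the same bits.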

Mounting and Unmounting Filesystems
Mounting a filesystem is what you need to do to make the files it contains available, and the mount command is what you use to do that. In Linux, everything that can be seen is part of one big tree of files and directories. Those that are on physically different partitions, disks, or remote machines are “grafted” onto the system at a particular place—a mount point, which is usually an empty directory. To find out what is currently mounted, simply type the command mount on its own. We discuss the mount command further in Chapters 14 and 22. SUSE Linux now uses subfs to mount removable devices such as CD-ROMs and floppy disks. This means that you no longer have to mount them explicitly; for example, if you simply change to the directory /media/cdrom, the contents of the CD will be visible.
✦ mount 192.168.1.1:/home/bible/ /mnt—Mounts the remote network filesystem /home/bible/ from the machine 192.168.1.1 on the mount point /mnt
✦ mount /dev/hda3 /usr/local—Mounts the disk partition /dev/hda3 on the mount point /usr/local
✦ umount /mnt—Unmounts whatever is mounted on the mount point /mnt
Tip: For more interesting information, see the manual page for /etc/fstab. Ask for details if something is “foggy.” ;)
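If you just want to inspect what is mounted without root privileges, the kernel's own mount table can be read directly. This sketch assumes a Linux /proc filesystem:

```shell
# List currently mounted filesystems without running mount(8).
# On Linux the kernel exposes the mount table in /proc/mounts;
# each line is "device mountpoint fstype options dump pass".
root_line=$(awk '$2 == "/" { print; exit }' /proc/mounts)

echo "root filesystem: $root_line"
```

The mount command with no arguments prints essentially the same information in a slightly friendlier format.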


System Information Commands
Here are some commands that help you find information about the system status.

Memory Reporting with the free Command
The free command shows breakdowns of the amounts and totals of free and used memory, including your swapfile usage. This command has several command-line options, but is easy to run and understand, for example:
# free
             total       used       free     shared    buffers     cached
Mem:         30892      28004       2888      14132       3104      10444
-/+ buffers:             14456      16436
Swap:        34268       7964      26304

This shows a 32MB system with 34MB of swap space. Notice that nearly all the system memory is being used, and nearly 8MB of swap space has been used. By default, the free command displays memory in kilobytes (1024-byte units). You can use the -b option to display memory in bytes, or the -m option to display it in megabytes. You can also use the free command to constantly monitor how much memory is being used through the -s option, followed by an update interval in seconds. This is handy as a real-time monitor if you run the free command in a terminal window under X11.
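The free command itself reads /proc/meminfo. If you ever need the raw numbers in a script, you can pull them out directly (Linux-specific; values are in kilobytes):

```shell
# free(1) gets its numbers from /proc/meminfo; read them directly.
mem_total=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
mem_free=$(awk '/^MemFree:/ { print $2 }' /proc/meminfo)

echo "total=${mem_total}kB free=${mem_free}kB"
```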

Virtual Memory Reporting with the vmstat Command
The vmstat command is a general-purpose monitoring program that offers a real-time display of not only memory usage and virtual memory statistics, but also disk activity, system usage, and central processing unit (CPU) activity. If you call vmstat without any command-line options, you’ll get a one-time snapshot, for example:

# vmstat
 procs                 memory       swap         io    system         cpu
 r  b  w  swpd   free  buff  cache  si  so   bi  bo   in   cs  us  sy  id
 0  0  0  7468   1060  4288  10552   1   1   10   1  134   68   3   2  96

If you specify a time interval in seconds on the vmstat command line, you’ll get a continuously scrolling report. Having a constant display of what is going on with your computer can help you if you’re trying to find out why your computer suddenly slows down, or why there’s a lot of disk activity.


Reclaiming Memory with the kill Command
If you need to reclaim memory quickly, as a desperate measure you can stop running programs by using the kill command. To kill a specific program, use the ps command to list current running processes, and then stop any or all of them with the kill command. By default, the ps command lists processes that you own and that you can kill, for example:
# ps
  PID TTY STAT TIME COMMAND
  367 p0  S    0:00 bash
  581 p0  S    0:01 rxvt
  582 p1  S    0:00 (bash)
  747 p0  S    0:00 (applix)
  809 p0  S    0:18 netscape index.html
  810 p0  S    0:00 (dns helper)
  945 p0  R    0:00 ps

The ps command will list the currently running programs and the program’s process number, or PID. You can use this information to kill a process with
# kill -9 809

You should also try out the top command and see what it shows.
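The ps-then-kill cycle can be tried safely on a throwaway process; the sleep command below is just a stand-in for a real runaway program:

```shell
# Start a throwaway background process, then reclaim it with kill.
sleep 60 &                 # a process we can safely kill
pid=$!                     # PID of the last background job

kill -9 "$pid"             # SIGKILL: cannot be caught or ignored
wait "$pid" 2>/dev/null    # collect its exit status (128 + signal number)
status=$?

echo "killed pid $pid, wait status $status"
```

A status above 128 means the process died from a signal (137 = 128 + 9 for SIGKILL).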

Determining How Long Linux Has Been Running with the uptime and w Commands
The uptime command shows you how long Linux has been running, how many users are on, and three system load averages, for example:
# uptime
 12:44am up 8:16, 3 users, load average: 0.11, 0.10, 0.04

If this is too little information for you, try the w command, which first shows the same information as the uptime command, and then lists what currently logged-in users are doing:

# w
 12:48am up 8:20, 3 users, load average: 0.14, 0.09, 0.05
USER  TTY   FROM             LOGIN@   IDLE   JCPU  PCPU WHAT
bball ttyp0 localhost.locald 9:47pm   15.00s 0.38s 0.16s bash
bball ttyp2 localhost.locald 12:48am  0.00s  0.16s 0.08s w

The w command gives a little more information, and it is especially helpful if you would like to monitor a busy system with a number of users. The kernel version and other related information (such as the hostname, kernel build date, and architecture) can be found easily with:

# uname -a

You can get lots of hardware information with:

# dmidecode

Or
# lspci

for the PCI devices installed on the system; or, for other devices:

# lsdev

Disk usage can be printed with:

# df -h

Disk usage of individual files and directories can also be checked with:

# du -h
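A safe way to try both commands together (df reports on the filesystem, du on a directory tree such as /tmp):

```shell
# Disk usage of the current filesystem and of a directory tree.
df_out=$(df -h . | tail -n 1)       # the filesystem holding the current directory
du_out=$(du -sh /tmp 2>/dev/null)   # total size of /tmp (a directory we can read)

echo "df: $df_out"
echo "du: $du_out"
```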

Runlevels
Linux systems typically use seven different runlevels, which define what services should be running on the system. The init process uses these runlevels to start and stop the computer. Runlevel 0 signifies that the computer has completely shut down, and runlevel 1 (or S) represents single-user mode. Runlevels 2 through 5 are multiuser modes, and runlevel 6 is the "reboot" level. Different Linux variations may not use all runlevels, but typically, runlevel 2 is multiuser text without NFS, runlevel 3 is multiuser text, and runlevel 5 is multiuser GUI.

Each runlevel has its own directory that defines which services start and in what order. You'll typically find these directories at /etc/rc.d/rc?.d, where ? is a number from 0 through 6 that corresponds to the runlevel. Inside each directory are symlinks that point to master initscripts found in /etc/init.d or /etc/rc.d/init.d. These symlinks have a special format. For instance, S12syslog is a symlink that points to /etc/init.d/syslog, the initscript that handles the syslog service. The S in the name tells init to execute the script with the "start" parameter when starting that runlevel. Likewise, there may be another symlink pointing to the same initscript with the name K88syslog; init would execute this script with the "stop" parameter when exiting the runlevel.


The number following the S or K determines the order in which init should start or stop the service in relation to other services. You can see by the numbers associated with the syslog service that syslog starts fairly early in the boot process, but it stops late in the shutdown process. This is so syslog can log as much information about other services starting and stopping as possible. Because these are all symlinks, it's easy to manipulate the order in which init starts services by naming symlinks accordingly. It's also easy to add in new services by symlinking to the master initscript. To find out in which runlevel you are just type:
# runlevel

Or
# who -r

The configuration file for the runlevels is /etc/inittab. See the manual page for this file.
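You can see how the S/K naming scheme drives ordering by building a throwaway runlevel directory. The service names below are made up for illustration; nothing here touches the real /etc/rc.d:

```shell
# Sketch of how init orders services in a runlevel directory.
base=$(mktemp -d)
mkdir -p "$base/init.d" "$base/rc3.d"

# Master initscripts live in init.d; the runlevel directory holds symlinks.
touch "$base/init.d/syslog" "$base/init.d/network"
ln -s ../init.d/network "$base/rc3.d/S10network"   # start network early
ln -s ../init.d/syslog  "$base/rc3.d/S12syslog"    # then syslog
ln -s ../init.d/syslog  "$base/rc3.d/K88syslog"    # stop syslog late on exit

# init effectively processes the S* links in sorted (numeric) order:
start_order=$(cd "$base/rc3.d" && ls S* | sort)
echo "$start_order"
```

Because ordering comes purely from the sorted names, renaming a symlink is all it takes to move a service earlier or later in the boot sequence.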

Using the vi Text Editor
It’s almost impossible to use Linux for any period of time and not need to use a text editor. This is because most Linux configuration files are plain text files that you will almost certainly need to change manually at some point. If you are using a GUI, you can run gedit, which is fairly intuitive for editing text. There’s also a simple text editor you can run from the shell called nano. However, most Linux shell users will use either the vi or emacs command to edit text files. The advantage of vi or emacs over a graphical editor is that you can use it from any shell, a character terminal, or a character-based connection over a network (using telnet or ssh, for example)—no GUI is required. They also each contain tons of features, so you can continue to grow with them. This section provides a brief tutorial on the vi text editor, which you can use to manually edit a configuration file from any shell. (If vi doesn’t suit you, see the “Exploring Other Text Editors” sidebar for other options.) Most often, you start vi to open a particular file. For example, to open a file called /tmp/test, type the following command:
$ vi /tmp/test

The cursor appears at the top of the screen. The bottom line keeps you informed about what is going on with your editing (here you have just opened a new file). In between, there are tildes (~) as filler because there is no text in the file yet. Now here’s the intimidating part: there are no hints, menus, or icons to tell you what to do. On top of that, you can’t just start typing. If you do, the computer is


likely to beep at you. And some people complain that Linux isn’t friendly. The first things you need to know are the different operating modes: command and input. The vi editor always starts in command mode. Before you can add or change text in the file, you have to type a command (one or two letters and an optional number) to tell vi what you want to do. Case is important, so use uppercase and lowercase exactly as shown in the examples! To get into input mode, type an input command. To start out, type either of the following:
✦ a—The add command. After it, you can input text that starts to the right of the cursor.
✦ i—The insert command. After it, you can input text that starts to the left of the cursor.
When you are finished inputting text, press Esc to return to command mode. In command mode, you can move around the file with the following keys:
✦ Arrow keys—Move the cursor up, down, left, or right in the file one character at a time. To move left and right you can also use Backspace and the space bar, respectively. If you prefer to keep your fingers on the keyboard, move the cursor with h (left), l (right), j (down), or k (up).
✦ w—Moves the cursor to the beginning of the next word.
✦ b—Moves the cursor to the beginning of the previous word.
✦ 0 (zero)—Moves the cursor to the beginning of the current line.
✦ $—Moves the cursor to the end of the current line.
✦ H—Moves the cursor to the upper-left corner of the screen (first line on the screen).
✦ M—Moves the cursor to the first character of the middle line on the screen.
✦ L—Moves the cursor to the lower-left corner of the screen (last line on the screen).
The only other editing you need to know is how to delete text. Here are a few vi commands for deleting text:
✦ x—Deletes the character under the cursor.
✦ X—Deletes the character directly before the cursor.
✦ dw—Deletes from the current character to the end of the current word.
✦ d$—Deletes from the current character to the end of the current line.
✦ d0—Deletes from the previous character to the beginning of the current line.
To wrap things up, use the following keystrokes for saving and quitting the file:
✦ ZZ—Save the current changes to the file and exit from vi.
✦ :w—Save the current file but continue editing.
✦ :wq—Same as ZZ.
✦ :q—Quit the current file. This works only if you don’t have any unsaved changes.
✦ :q!—Quit the current file and don’t save the changes you just made to the file.
If you’ve really trashed the file by mistake, the :q! command is the best way to exit and abandon your changes. The file reverts to the most recently saved version. So, if you just did a :w, you are stuck with the changes up to that point. If you just want to undo a few bad edits, press u to back out of changes.

You have learned a few vi editing commands. I describe more commands in the following sections. First, however, here are a few tips to smooth out your first trials with vi:
✦ Esc—Remember that Esc gets you back to command mode. (I’ve watched people press every key on the keyboard trying to get out of a file.) Esc followed by ZZ gets you out of command mode, saves the file, and exits.
✦ u—Press u to undo the previous change you made. Continue to press u to undo the change before that, and the one before that.
✦ Ctrl+R—If you decide you didn’t want to undo the previous command, use Ctrl+R for Redo. Essentially, this command undoes your undo.
✦ Caps Lock—Beware of hitting Caps Lock by mistake. Everything you type in vi has a different meaning when the letters are capitalized. You don’t get a warning that you are typing capitals—things just start acting weird.
✦ Ctrl+F—Page ahead, one page at a time.
✦ Ctrl+B—Page back, one page at a time.
✦ Ctrl+D—Page ahead one-half page at a time.
✦ Ctrl+U—Page back one-half page at a time.
✦ G—Go to the last line of the file.
✦ 1G—Go to the first line of the file. (Use any number to go to that line in the file.)
To search for the next occurrence of text in the file, use either the slash (/) or the question mark (?) character. Follow the slash or question mark with a pattern (string of text) to search forward or backward, respectively, for that pattern. Within the search, you can also use metacharacters. Here are some examples:
✦ /hello—Searches forward for the word hello.
✦ ?goodbye—Searches backward for the word goodbye.
✦ /The.*foot—Searches forward for a line that has the word The in it and also, after that at some point, the word foot.
✦ ?[pP]rint—Searches backward for either print or Print. Remember that case matters in Linux, so make use of brackets to search for words that could have different capitalization.
You can precede most vi commands with numbers to have the command repeated that number of times.
This is a handy way to deal with several lines, words, or characters at a time. Here are some examples:
✦ 3dw—Deletes the next three words.
✦ 5cl—Changes the next five letters (that is, removes the letters and enters input mode).
✦ 12j—Moves down 12 lines.
Putting a number in front of most commands just repeats those commands. At this point, you should be fairly proficient at using the vi command.


Automated Tasks
In Linux, tasks can be configured to run automatically within a specified period of time, on a specified date, or when the system load average is below a specified number. Red Hat Enterprise Linux is pre-configured to run important system tasks to keep the system updated. For example, the slocate database used by the locate command is updated daily. A system administrator can use automated tasks to perform periodic backups, monitor the system, run custom scripts, and more. Red Hat Enterprise Linux comes with several automated tasks utilities: cron, at, and batch.

Cron
Cron is a daemon that can be used to schedule the execution of recurring tasks according to a combination of the time, day of the month, month, day of the week, and week. Cron assumes that the system is on continuously. If the system is not on when a task is scheduled, the task is not executed. To use the cron service, the vixie-cron RPM package must be installed and the crond service must be running. To determine if the package is installed, use the rpm -q vixie-cron command. To determine if the service is running, use the command /sbin/service crond status.

Configuring Cron Tasks
The main configuration file for cron, /etc/crontab, contains the following lines:

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly

The first four lines are variables used to configure the environment in which the cron tasks are run. The SHELL variable tells the system which shell environment to use (in this example the bash shell), while the PATH variable defines the path used to execute


commands. The output of the cron tasks is emailed to the username defined with the MAILTO variable. If the MAILTO variable is defined as an empty string (MAILTO=""), email is not sent. The HOME variable can be used to set the home directory to use when executing commands or scripts.

Each line in the /etc/crontab file represents a task and has the following format:

minute hour day-of-month month day-of-week command

#  +---------------- minute (0 - 59)
#  |  +------------- hour (0 - 23)
#  |  |  +---------- day of month (1 - 31)
#  |  |  |  +------- month (1 - 12)
#  |  |  |  |  +---- day of week (0 - 6) (Sunday=0 or 7)
#  |  |  |  |  |
   *  *  *  *  *  command to be executed

minute — any integer from 0 to 59
hour — any integer from 0 to 23
day of month — any integer from 1 to 31 (must be a valid day if a month is specified)
month — any integer from 1 to 12 (or the short name of the month such as jan or feb)
day of week — any integer from 0 to 7, where 0 or 7 represents Sunday (or the short name of the day such as sun or mon)
command — the command to execute (the command can either be a command such as ls /proc >> /tmp/proc or the command to execute a custom script)

For any of the above values, an asterisk (*) can be used to specify all valid values. For example, an asterisk for the month value means execute the command every month within the constraints of the other values. A hyphen (-) between integers specifies a range of integers. For example, 1-4 means the integers 1, 2, 3, and 4. A list of values separated by commas (,) specifies a list. For example, 3,4,6,8 indicates those four specific integers.

The forward slash (/) can be used to specify step values. The value of an integer can be skipped within a range by following the range with /<integer>. For example, 0-59/2 can be used to define every other minute in the minute field. Step values can also be used


with an asterisk. For instance, the value */3 can be used in the month field to run the task every third month. Any lines that begin with a hash mark (#) are comments and are not processed.

As shown in the /etc/crontab file, the run-parts script executes the scripts in the /etc/cron.hourly/, /etc/cron.daily/, /etc/cron.weekly/, and /etc/cron.monthly/ directories on an hourly, daily, weekly, or monthly basis respectively. The files in these directories should be shell scripts. If a cron task is required to be executed on a schedule other than hourly, daily, weekly, or monthly, it can be added to the /etc/cron.d/ directory. All files in this directory use the same syntax as /etc/crontab. For example:

# record the memory usage of the system every monday
# at 3:30AM in the file /tmp/meminfo
30 3 * * mon cat /proc/meminfo >> /tmp/meminfo

# run custom script the first day of every month at 4:10AM
10 4 1 * * /root/scripts/backup.sh
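The step-value syntax can be sanity-checked with a quick shell one-liner. This sketch (using seq and awk, with a step of 15 chosen just as an example) expands which minutes */15 matches:

```shell
# Expand which minutes a cron step value like */15 matches.
# cron matches any minute in 0-59 that is divisible by the step.
step=15
matched=$(seq 0 59 | awk -v s="$step" '$1 % s == 0' | xargs)

echo "*/$step in the minute field matches minutes: $matched"
```

So a task with */15 in the minute field runs at minutes 0, 15, 30, and 45 of every matching hour.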

Crontab Examples
Users other than root can configure cron tasks by using the crontab utility. All user-defined crontabs are stored in the /var/spool/cron/ directory and are executed using the usernames of the users that created them. To create a crontab as a user, log in as that user and type the command crontab -e to edit the user's crontab using the editor specified by the VISUAL or EDITOR environment variable. The file uses the same format as /etc/crontab. When the changes to the crontab are saved, the crontab is stored according to username and written to the file /var/spool/cron/username. The cron daemon checks the /etc/crontab file, the /etc/cron.d/ directory, and the /var/spool/cron/ directory every minute for any changes. If any changes are found, they are loaded into memory. Thus, the daemon does not need to be restarted if a crontab file is changed.

Controlling Access to Cron
The /etc/cron.allow and /etc/cron.deny files are used to restrict access to cron. The format of both access control files is one username on each line. Whitespace is not permitted in either file. The cron daemon (crond) does not have to be restarted if the access control files are modified. The access control files are read each time a user tries to add or delete a cron task. The root user can always use cron, regardless of the usernames listed in the access control files.


If the file cron.allow exists, only users listed in it are allowed to use cron, and the cron.deny file is ignored. If cron.allow does not exist, users listed in cron.deny are not allowed to use cron.

Starting and Stopping the Service
To start the cron service, use the command /sbin/service crond start. To stop the service, use the command /sbin/service crond stop. It is recommended that you start the service at boot time.

NFS
What is NFS? The Network File System (NFS) was developed to allow machines to mount a disk partition on a remote machine as if it were a local disk. It allows for fast, seamless sharing of files across a network. It also gives the potential for unwanted people to access your hard drive over the network (and thereby possibly read your email and delete all your files as well as break into your system) if you set it up incorrectly.

There are other systems that provide similar functionality to NFS. Samba (http://www.samba.org) provides file services to Windows clients. The Andrew File System, originally developed by IBM (http://www.openafs.org) and now open source, provides a file-sharing mechanism with some additional security and performance features. The Coda File System (http://www.coda.cs.cmu.edu/) combines file sharing with a specific focus on disconnected clients. Many of the features of the Andrew and Coda file systems are slated for inclusion in the next version of NFS (Version 4) (http://www.nfsv4.org). The advantage of NFS today is that it is mature, standard, well understood, and supported robustly across a variety of platforms.

Setting Up an NFS Server
Introduction to Server Setup
It is assumed that you will be setting up both a server and a client. Setting up the server will be done in two steps: setting up the configuration files for NFS, and then starting the NFS services.


Setting up the Configuration Files
There are three main configuration files you will need to edit to set up an NFS server: /etc/exports, /etc/hosts.allow, and /etc/hosts.deny. Strictly speaking, you only need to edit /etc/exports to get NFS to work, but you would be left with an extremely insecure setup. You may also need to edit your startup scripts.

/etc/exports

This file contains a list of entries; each entry indicates a volume that is shared and how it is shared. Check the man pages (man exports) for a complete description of all the setup options for the file, although the description here will probably satisfy most people's needs. An entry in /etc/exports will typically look like this:

directory machine1(option11,option12) machine2(option21,option22)

where:

directory — the directory that you want to share. It may be an entire volume, though it need not be. If you share a directory, then all directories under it within the same file system will be shared as well.

machine1 and machine2 — client machines that will have access to the directory. The machines may be listed by their DNS name or their IP address (e.g., machine.company.com or 192.168.0.8). Using IP addresses is more reliable and more secure, since DNS names may not always resolve correctly.

optionxx — the option listing for each machine describes what kind of access that machine will have. Important options are:

ro: The directory is shared read-only; the client machine will not be able to write to it. This is the default.

rw: The client machine will have read and write access to the directory.

no_root_squash: By default, any file request made by user root on the client machine is treated as if it were made by user nobody on the server. (Exactly which UID the request is mapped to depends on the UID of user "nobody" on the server, not the client.) If no_root_squash is selected, then root on the client machine will have the same level of access to the files on the system as root on the server.
This can have serious security implications, although it may be necessary if you want to perform any administrative work on the client machine that involves the exported directories. You should not specify this option without a good reason.

no_subtree_check: If only part of a volume is exported, a routine called subtree checking verifies that a file that is requested from the client is in the appropriate part of the volume. If the entire volume is exported, disabling this check will speed up transfers.

sync: By default, all but the most recent version (1.11) of the exportfs command will use async behavior, telling a client machine that a file write is complete—that is, has been written to stable storage—when NFS has finished handing the write over to the filesystem. This behavior may cause data corruption if the server reboots; the sync option prevents this.

Suppose we have two client machines, slave1 and slave2, that have IP addresses 192.168.0.1 and 192.168.0.2, respectively. We wish to share our software binaries and home directories with these machines. A typical setup for /etc/exports might look like this:

/usr/local 192.168.0.1(ro) 192.168.0.2(ro)
/home 192.168.0.1(rw) 192.168.0.2(rw)

Here we are sharing /usr/local read-only to slave1 and slave2, because it probably contains our software and there may not be benefits to allowing slave1 and slave2 to write to it that outweigh security concerns. On the other hand, home directories need to be exported read-write if users are to save their work on them.

If you have a large installation, you may find that you have a bunch of computers all on the same local network that require access to your server. There are a few ways of simplifying references to large numbers of machines. First, you can give access to a range of machines at once by specifying a network and a netmask.
For example, if you wanted to allow access to all the machines with IP addresses between 192.168.0.0 and 192.168.0.255, then you could have the entries:

/usr/local 192.168.0.0/255.255.255.0(ro)
/home 192.168.0.0/255.255.255.0(rw)

See the Networking-Overview HOWTO for further information on how netmasks work, and you may also wish to look at the man pages for init and hosts.allow.

Second, you can use NIS netgroups in your entry. To specify a netgroup in your exports file, simply prepend the name of the netgroup with an "@". See the NIS HOWTO for details on how netgroups work.

Third, you can use wildcards such as *.foo.com or 192.168. instead of hostnames. There were problems with wildcard implementation in the 2.2 kernel series that were fixed in kernel 2.2.19.


However, you should keep in mind that any of these simplifications could cause a security risk if there are machines in your netgroup or local network that you do not trust completely.

A few cautions are in order about what cannot (or should not) be exported. First, if a directory is exported, its parent and child directories cannot be exported if they are in the same filesystem. However, exporting both should not be necessary, because listing the parent directory in the /etc/exports file will cause all underlying directories within that file system to be exported. Second, it is a poor idea to export a FAT or VFAT (i.e., MS-DOS or Windows 95/98) filesystem with NFS. FAT is not designed for use on a multi-user machine, and as a result, operations that depend on permissions will not work well. Moreover, some of the underlying filesystem design is reported to work poorly with NFS's expectations. Third, device or other special files may not export correctly to non-Linux clients.

/etc/hosts.allow and /etc/hosts.deny

These two files specify which computers on the network can use services on your machine. Each line of the file contains a single entry listing a service and a set of machines. When the server gets a request from a machine, it does the following: It first checks hosts.allow to see if the machine matches a rule listed there; if it does, the machine is allowed access. If the machine does not match an entry in hosts.allow, the server then checks hosts.deny to see if the client matches a rule listed there; if it does, the machine is denied access. If the client matches no listings in either file, it is allowed access.

In addition to controlling access to services handled by inetd (such as telnet and FTP), these files can also control access to NFS by restricting connections to the daemons that provide NFS services. Restrictions are done on a per-service basis. The first daemon to restrict access to is the portmapper.
This daemon essentially just tells requesting clients how to find all the NFS services on the system. Restricting access to the portmapper is the best defense against someone breaking into your system through NFS because completely unauthorized clients won't know where to find the NFS daemons. However, there are two things to watch out for. First, restricting portmapper isn't enough if the intruder already knows for some reason how to find those daemons. And second, if you are running NIS, restricting portmapper will also restrict requests to NIS. That should usually be harmless since you usually want to restrict NFS and NIS in a similar way, but just be cautioned. (Running NIS is generally a good idea if you are running NFS, because the client machines need a way of knowing who owns what files on the exported volumes. Of course there are other ways of doing this such as syncing password files. See the NIS HOWTO for information on setting up NIS.)


In general, it is a good idea with NFS (as with most Internet services) to explicitly deny access to IP addresses that do not need access. The first step in doing this is to add the following entry to /etc/hosts.deny:

   portmap:ALL

Starting with nfs-utils 0.2.0, you can be a bit more careful by controlling access to individual daemons. This is a good precaution, since an intruder will often be able to weasel around the portmapper. If you have a newer version of nfs-utils, add entries for each of the NFS daemons:

   lockd:ALL
   mountd:ALL
   rquotad:ALL
   statd:ALL

Even if you have an older version of nfs-utils, adding these entries is at worst harmless (they will just be ignored) and at best will save you some trouble when you upgrade. Some system administrators choose to put the entry ALL:ALL in /etc/hosts.deny, which causes any service that looks at these files to deny access to all hosts unless it is explicitly allowed. While this is more secure behavior, it may also get you in trouble when you install new services, forget you put the entry there, and can't figure out for the life of you why they won't work.

Next, we need to add an entry to hosts.allow to give access to the hosts that should have it. (If we just leave the above lines in hosts.deny, then nobody will have access to NFS.) Entries in hosts.allow follow the format:

   service: host [or network/netmask] , host [or network/netmask]

Here, host is the IP address of a potential client; it may be possible in some versions to use the DNS name of the host, but this is strongly discouraged. Suppose we have the setup above and we just want to allow access to slave1.foo.com and slave2.foo.com, and suppose that the IP addresses of these machines are 192.168.0.1 and 192.168.0.2, respectively.
We could add the following entry to /etc/hosts.allow:

   portmap: 192.168.0.1 , 192.168.0.2

For recent nfs-utils versions, we would also add the following (again, these entries are harmless even if they are not supported):

   lockd: 192.168.0.1 , 192.168.0.2
   rquotad: 192.168.0.1 , 192.168.0.2
   mountd: 192.168.0.1 , 192.168.0.2
   statd: 192.168.0.1 , 192.168.0.2


If you intend to run NFS on a large number of machines in a local network, /etc/hosts.allow also allows for network/netmask style entries in the same manner as /etc/exports above.
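For example, under the same assumptions as above (the hypothetical 192.168.0.0/255.255.255.0 network from the /etc/exports example), the hosts.allow entries might read:

```
portmap: 192.168.0.0/255.255.255.0
lockd: 192.168.0.0/255.255.255.0
mountd: 192.168.0.0/255.255.255.0
rquotad: 192.168.0.0/255.255.255.0
statd: 192.168.0.0/255.255.255.0
```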

Getting the Services Started
Pre-requisites
The NFS server should now be configured, and we can start it running. First, you will need to have the appropriate packages installed. This consists mainly of a new enough kernel and a new enough version of the nfs-utils package. Next, before you can start NFS, you will need to have TCP/IP networking functioning correctly on your machine. If you can use telnet, FTP, and so on, then chances are your TCP networking is fine. That said, with most recent Linux distributions you may be able to get NFS up and running simply by rebooting your machine; the startup scripts should detect that you have set up your /etc/exports file and will start up NFS correctly. If this does not work, or if you are not in a position to reboot your machine, then the following section will tell you which daemons need to be started in order to run NFS services. If for some reason nfsd was already running when you edited your configuration files above, you will have to flush your configuration before the changes will take effect.

Starting the Portmapper
NFS depends on the portmapper daemon, called either portmap or rpc.portmap, which will need to be started first. It should be located in /sbin but is sometimes in /usr/sbin. Most recent Linux distributions start this daemon in the boot scripts, but it is worth making sure that it is running before you begin working with NFS (just type ps aux | grep portmap).

The Daemons
NFS serving is taken care of by five daemons: rpc.nfsd, which does most of the work; rpc.lockd and rpc.statd, which handle file locking; rpc.mountd, which handles the initial mount requests; and rpc.rquotad, which handles user file quotas on exported volumes. Starting with kernel 2.2.18, lockd is called by nfsd on demand, so you do not need to worry about starting it yourself. statd will need to be started separately. Most recent Linux distributions have startup scripts for these daemons.


The daemons are all part of the nfs-utils package and may be in either the /sbin directory or the /usr/sbin directory. If your distribution does not include them in the startup scripts, then you should add them, configured to start in the following order:

   rpc.portmap
   rpc.mountd, rpc.nfsd
   rpc.statd, rpc.lockd (if necessary), and rpc.rquotad

The nfs-utils package has sample startup scripts for Red Hat and Debian. If you are using a different distribution, you can in general just copy the Red Hat script, but you will probably have to take out the line that says:

   . ../init.d/functions

to avoid getting error messages.

Verifying that NFS is running
To do this, query the portmapper with the command rpcinfo -p localhost to find out what services it is providing. You should get something like this:

   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100011    1   udp    749  rquotad
    100011    2   udp    749  rquotad
    100005    1   udp    759  mountd
    100005    1   tcp    761  mountd
    100005    2   udp    764  mountd
    100005    2   tcp    766  mountd
    100005    3   udp    769  mountd
    100005    3   tcp    771  mountd
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    300019    1   tcp    830  amd
    300019    1   udp    831  amd
    100024    1   udp    944  status
    100024    1   tcp    946  status
    100021    1   udp   1042  nlockmgr
    100021    3   udp   1042  nlockmgr
    100021    4   udp   1042  nlockmgr
    100021    1   tcp   1629  nlockmgr
    100021    3   tcp   1629  nlockmgr
    100021    4   tcp   1629  nlockmgr

This says that we have NFS versions 2 and 3, rpc.statd version 1, and the network lock manager (the service name for rpc.lockd) versions 1, 3, and 4. There are also different service listings depending on whether NFS is travelling over TCP or UDP. Linux systems use UDP by default unless TCP is explicitly requested; however, other OSes such as Solaris default to TCP.

If you do not see at least a line that says portmapper, a line that says nfs, and a line that says mountd, then you will need to backtrack and try again to start up the daemons. If you do see these services listed, then you should be ready to set up NFS clients to access files from your server.

Making Changes to /etc/exports later on
If you come back and change your /etc/exports file, the changes you make may not take effect immediately. You should run the command exportfs -ra to force nfsd to re-read the /etc/exports file. If you can't find the exportfs command, then you can kill nfsd with the -HUP flag (see the man pages for kill for details). If that still doesn't work, don't forget to check hosts.allow to make sure you haven't forgotten to list any new client machines there. Also check the host listings on any firewalls you may have set up.

Setting up an NFS Client

Mounting Remote Directories
Before beginning, you should double-check to make sure your mount program is new enough (version 2.10m if you want to use Version 3 NFS), and that the client machine supports NFS mounting, though most standard distributions do. If you are using a 2.2 or later kernel with the /proc filesystem you can check the latter by reading the file /proc/filesystems and making sure there is a line containing nfs. If not, typing insmod nfs may make it magically appear if NFS has been compiled as a module; otherwise, you will need to build (or download) a kernel that has NFS support built in. In general, kernels that do not have NFS compiled in will give a very specific error when the mount command below is run.


To begin using a machine as an NFS client, you will need the portmapper running on that machine, and to use NFS file locking, you will also need rpc.statd and rpc.lockd running on both the client and the server. Most recent distributions start those services by default at boot time.

With portmap, lockd, and statd running, you should now be able to mount the remote directory from your server just the way you mount a local hard drive, with the mount command. Continuing our example from the previous section, suppose our server above is called master.foo.com, and we want to mount the /home directory on slave1.foo.com. Then, all we have to do, from the root prompt on slave1.foo.com, is type:

   # mount master.foo.com:/home /mnt/home

and the directory /home on master will appear as the directory /mnt/home on slave1. (Note that this assumes we have created the directory /mnt/home as an empty mount point beforehand.)

You can unmount the file system by typing:

   # umount /mnt/home

just as you would for a local file system.

Getting NFS File Systems to be Mounted at Boot Time
NFS file systems can be added to your /etc/fstab file the same way local file systems can, so that they mount when your system starts up. The only difference is that the file system type will be set to nfs and the dump and fsck order (the last two entries) will have to be set to zero. So for our example above, the entry in /etc/fstab would look like:

   # device              mountpoint  fs-type  options  dump  fsckorder
   master.foo.com:/home  /mnt/home   nfs      rw       0     0

See the man pages for fstab if you are unfamiliar with the syntax of this file. If you are using an automounter such as amd or autofs, the options in the corresponding fields of your mount listings should look very similar if not identical.

At this point you should have NFS working, though a few tweaks may still be necessary to get it to work well.

Mount Options
Soft versus Hard Mounting
There are some options you should consider adding at once. They govern the way the NFS client handles a server crash or network outage. One of the cool things about NFS is that it can handle this gracefully, if you set up the clients right. There are two distinct failure modes:

soft
   If a file request fails, the NFS client will report an error to the process on the client machine requesting the file access. Some programs can handle this with composure, but most won't. We do not recommend using this setting; it is a recipe for corrupted files and lost data. You should especially not use it for mail disks, if you value your mail.

hard
   The program accessing a file on an NFS-mounted file system will hang when the server crashes. The process cannot be interrupted or killed (except by a "sure kill") unless you also specify intr. When the NFS server is back online, the program will continue undisturbed from where it was. We recommend using hard,intr on all NFS-mounted file systems.

Picking up from the previous example, the fstab entry would now look like:

   # device              mountpoint  fs-type  options       dump  fsckorder
   master.foo.com:/home  /mnt/home   nfs      rw,hard,intr  0     0

The rsize and wsize mount options specify the size of the chunks of data that the client and server pass back and forth to each other. The defaults may be too big or too small; there is no size that works well on all or most setups. On the one hand, some combinations of Linux kernels and network cards (largely on older machines) cannot handle large blocks. On the other hand, if they can handle larger blocks, a bigger size might be faster. Getting the block size right is an important factor in performance and is a must if you are planning to use the NFS server in a production environment.
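For instance, an fstab entry that sets the block sizes explicitly might look like the following. The 8192-byte values are only an illustration, not a recommendation; you will need to experiment to find what performs best on your own hardware and network:

```
# device              mountpoint  fs-type  options                             dump  fsckorder
master.foo.com:/home  /mnt/home   nfs      rw,hard,intr,rsize=8192,wsize=8192  0     0
```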


NIS
Network Information Service (NIS) is a service that provides information, which has to be known throughout the network, to all machines on the network. There is support for NIS in Linux's standard libc library, which in the following text is referred to as "traditional NIS". The following is quoted from the Sun(tm) System & Network Administration Manual:

   "NIS was formerly known as Sun Yellow Pages (YP) but the name Yellow Pages(tm) is a registered trademark in the United Kingdom of British Telecom plc and may not be used without permission."

Information likely to be distributed by NIS includes:

• login names/passwords/home directories (/etc/passwd)
• group information (/etc/group)

If, for example, your password entry is recorded in the NIS passwd database, you will be able to log in on all machines on the network which have the NIS client programs running. (Sun is a trademark of Sun Microsystems, Inc. licensed to SunSoft, Inc.)

NIS+

Network Information Service Plus (NIS+) is essentially NIS on steroids. NIS+ was designed by Sun Microsystems Inc. as a replacement for NIS, with better security and better handling of large installations.

How NIS works
Within a network there must be at least one machine acting as a NIS server. You can have multiple NIS servers, each serving different NIS "domains"; or you can have cooperating NIS servers, where one is the master NIS server and all the others are so-called slave NIS servers (for a certain NIS "domain", that is!); or you can have a mix of the two. Slave servers hold only copies of the NIS databases and receive these copies from the master NIS server whenever changes are made to the master's databases. Depending on the number of machines in your network and the reliability of your network, you might decide to install one or more slave servers. Whenever a NIS server goes down or is too slow in responding to requests, a NIS client connected to that server will try to find one that is up or faster.

NIS databases are in so-called DBM format, derived from ASCII databases. For example, the files /etc/passwd and /etc/group can be directly converted to DBM format using ASCII-to-DBM translation software (makedbm, included with the server software). The master NIS server should have both the ASCII databases and the DBM databases. Slave servers will be notified of any change to the NIS maps (via the yppush program) and automatically retrieve the necessary changes in order to synchronize their databases. NIS clients do not need to do this, since they always talk to the NIS server to read the information stored in its DBM databases.

Old ypbind versions do a broadcast to find a running NIS server. This is insecure, due to the fact that anyone may install a NIS server and answer the broadcast queries. Newer versions of ypbind (ypbind-3.3 or ypbind-mt) are able to get the server from a configuration file, so there is no need to broadcast.

How NIS+ works
NIS+ is a new version of the network information nameservice from Sun. The biggest difference between NIS and NIS+ is that NIS+ has support for data encryption and authentication over secure RPC. The naming model of NIS+ is based upon a tree structure. Each node in the tree corresponds to an NIS+ object, of which there are six types: directory, entry, group, link, table, and private. The NIS+ directory that forms the root of the NIS+ namespace is called the root directory. There are two special NIS+ directories: org_dir and groups_dir. The org_dir directory consists of all administration tables, such as passwd, hosts, and mail_aliases. The groups_dir directory consists of NIS+ group objects, which are used for access control. The collection of org_dir, groups_dir, and their parent directory is referred to as an NIS+ domain.

Managing System Logs
The syslogd utility logs various kinds of system activity, such as debugging output from sendmail and warnings printed by the kernel. syslogd runs as a daemon and is usually started in one of the rc files at boot time. The file /etc/syslog.conf is used to control where syslogd records information. Such a file might look like the following:

   *.info;*.notice    /var/log/messages
   mail.debug         /var/log/maillog
   *.warn             /var/log/syslog
   kern.emerg         /dev/console

The first field of each line lists the kinds of messages that should be logged, and the second field lists the location where they should be logged. The first field is of the format:

   facility.level [; facility.level ... ]

where facility is the system application or facility generating the message, and level is the severity of the message. For example, facility can be mail (for the mail daemon), kern (for the kernel), user (for user programs), or auth (for authentication programs such as login or su). An asterisk in this field specifies all facilities. level can be (in increasing severity): debug, info, notice, warning, err, crit, alert, or emerg.

In the previous /etc/syslog.conf, we see that all messages of severity info and notice are logged to /var/log/messages, all debug messages from the mail daemon are logged to /var/log/maillog, and all warn messages are logged to /var/log/syslog. Also, any emerg warnings from the kernel are sent to the console (which is the current virtual console, or an xterm started with the -C option).

The messages logged by syslogd usually include the date, an indication of what process or facility delivered the message, and the message itself, all on one line. For example, a kernel error message indicating a problem with data on an ext2fs filesystem might appear in the log files as:

   Dec  1 21:03:35 loomer kernel: EXT2-fs error (device 3/2):
   ext2_check_blocks_bitmap: Wrong free blocks count in
   super block, stored = 27202, counted = 27853

Similarly, if an su to the root account succeeds, you might see a log message such as:

   Dec 11 15:31:51 loomer su: mdw on /dev/ttyp3

Log files can be important in tracking down system problems. If a log file grows too large, you can delete it using rm; it will be recreated when syslogd starts up again. Your system probably comes equipped with a running syslogd and an /etc/syslog.conf that does the right thing.
However, it's important to know where your log files are and what programs they represent. If you need to log many messages (say, debugging messages from the kernel, which can be very verbose), you can edit syslog.conf and then tell syslogd to reread its configuration file with the command:

   kill -HUP `cat /var/run/syslog.pid`

Note the use of backquotes to obtain the process ID of syslogd, contained in /var/run/syslog.pid.


Other system logs might be available as well. These include:

/var/log/wtmp
   This file contains binary data indicating the login times and duration for each user on the system; it is used by the last command to generate a listing of user logins. The output of last might look like:

   mdw      tty3    Sun Dec 11 15:25   still logged in
   mdw      tty3    Sun Dec 11 15:24 - 15:25  (00:00)
   mdw      tty1    Sun Dec 11 11:46   still logged in
   reboot   ~       Sun Dec 11 06:46

A record is also logged in /var/log/wtmp when the system is rebooted.

/var/run/utmp
   This is another binary file that contains information on users currently logged into the system. Commands such as who, w, and finger use this file to produce information on who is logged in. For example, the w command might print:

    3:58pm  up  4:12,  5 users,  load average: 0.01, 0.02, 0.00
   User     tty       login@   idle   JCPU   PCPU  what
   mdw      ttyp3     11:46am    14                -
   mdw      ttyp2     11:46am            1         w
   mdw      ttyp4     11:46am                      kermit
   mdw      ttyp0     11:46am    14                bash

   We see the login times for each user (in this case, one user logged in many times), as well as the command currently being used. The w manual page describes all of the fields displayed.

/var/log/lastlog
   This file is similar to wtmp but is used by different programs (such as finger, to determine when a user was last logged in).

Note that the format of the wtmp and utmp files differs from system to system. Some programs may be compiled to expect one format, and others another format. For this reason, commands that use the files may produce confusing or inaccurate information, especially if the files become corrupted by a program that writes information to them in the wrong format.


Logfiles can get quite large, and if you do not have the necessary hard disk space, you have to do something about your partitions being filled too fast. Of course, you can delete the log files from time to time, but you may not want to do this, since the log files also contain information that can be valuable in crisis situations.

One option is to copy the log files from time to time to another file and compress this file. The log file itself starts at 0 again. Here is a short shell script that does this for the log file /var/log/messages:

   mv /var/log/messages /var/log/messages-backup
   cp /dev/null /var/log/messages
   CURDATE=`date +"%m%d%y"`
   mv /var/log/messages-backup /var/log/messages-$CURDATE
   gzip /var/log/messages-$CURDATE

First, we move the log file to a different name and then truncate the original file to 0 bytes by copying to it from /dev/null. We do this so that further logging can proceed without problems while the next steps are done. Then, we compute a date string for the current date that is used as a suffix for the filename, rename the backup file, and finally compress it with gzip.

You might want to run this small script from cron, but as it is presented here, it should not be run more than once a day; otherwise the compressed backup copy will be overwritten, because the filename reflects the date but not the time of day. If you want to run this script more often, you must use additional numbers to distinguish between the various copies.

There are many more improvements that could be made here. For example, you might want to check the size of the log file first and only copy and compress it if this size exceeds a certain limit. Even though this is already an improvement, your partition containing the log files will eventually get filled. You can solve this problem by keeping only a certain number of compressed log files (say, 10) around.
When you have created as many log files as you want to have, you delete the oldest, and overwrite it with the next one to be copied. This principle is also called log rotation. Some distributions have scripts like savelog or logrotate that can do this automatically. To finish this discussion, it should be noted that most recent distributions like SuSE, Debian, and Red Hat already have built-in cron scripts that manage your log files and are much more sophisticated than the small one presented here.
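The improvements mentioned above can be sketched as follows. This is an illustrative rewrite of the small script, not a replacement for savelog or logrotate: it checks the size first, includes the time in the suffix so it can safely run more than once a day, and prunes all but a fixed number of compressed copies (the limits are made-up defaults; note that xargs -r is a GNU extension):

```shell
#!/bin/sh
# rotate_log FILE MAXSIZE KEEP
#   Rotates FILE only if it exceeds MAXSIZE bytes, keeping at most
#   KEEP compressed copies (the oldest copies are deleted).
rotate_log() {
    LOG=$1; MAXSIZE=$2; KEEP=$3

    SIZE=`wc -c < "$LOG"`
    [ "$SIZE" -gt "$MAXSIZE" ] || return 0      # nothing to do yet

    CURDATE=`date +"%m%d%y%H%M%S"`              # date *and* time suffix
    mv "$LOG" "$LOG-$CURDATE"
    cp /dev/null "$LOG"                         # truncate so logging continues
    gzip -f "$LOG-$CURDATE"

    # delete all but the $KEEP newest compressed copies
    ls -1t "$LOG"-*.gz | tail -n +`expr $KEEP + 1` | xargs -r rm -f
}

# Typical use, e.g. from a daily cron job:
#   rotate_log /var/log/messages 1048576 10
```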

Logrotate


One of the most useful tools for log management on UNIX is logrotate, which is part of just about any UNIX distribution. In short, it lets you automatically split, compress, and delete log files according to several policies, and it is usually employed to rotate common files like /var/log/messages, /var/log/secure, and /var/log/system.log. This HOWTO shows you how to set up log rotation not at the system level, but for a given user.

Filesystem Layout
Let's assume you're user, and that you've set up a daemon to run under your username and spit out the files to ~user/var/log/daemon.log. Your filesystem tree looks like this:

   /home/user --+-- etc            <- we're going to put logrotate.conf here
                |
                +-- Mail
                ...
                +-- var --+-- lib  <- the logrotate status file goes here
                          |
                          +-- log  <- the actual log files go here

Configuring logrotate
The first step is to create a configuration file. Here is a sample that rotates the log file on a weekly basis, compresses the old log, creates a new zero-byte file, and mails us a short report:

   $ cat ~/etc/logrotate.conf
   # see "man logrotate" for details
   # rotate log files weekly
   weekly

   # keep 4 weeks worth of backlogs
   rotate 4

   # create new (empty) log files after rotating old ones
   # (this is the default, and can be overridden for each log file)
   create

   # uncomment this if you want your log files compressed
   compress

   /home/user/var/log/daemon.log {
       create
       mail user@localhost
   }

You can, of course, check out man logrotate and add more options (or more files with different options).


Getting it to Run
Making logrotate actually work, however, requires invoking it from cron. To do that, add it to your crontab, specifying the status file with -s and the configuration file you created:

   $ crontab -l
   0 0 * * * /usr/sbin/logrotate -s /home/user/var/lib/logrotate.status \
       /home/user/etc/logrotate.conf > /dev/null 2>&1

(Take care: some systems do not allow "\" to continue onto the next line, which means you must enter the logrotate invocation on a single line.) The above invokes logrotate at midnight every day, dumping both standard output and standard error to /dev/null. logrotate will then look at its status file and decide whether or not it is time to actually rotate the log files.

The difference between hard and soft links
The data part of a file is associated with something called an 'inode'. The inode carries the map of where the data is, the file permissions, and so on, for the data.

                           .---------------> ! data ! ! data ! etc
                          /                  +------+ +------+
   ! permbits, etc ! data addresses !
   +------------inode---------------+

The filename part carries a name and an associated inode number.

                     .--------------> ! permbits, etc ! addresses !
                    /                 +---------inode-------------+
   ! filename ! inode # !
   +--------------------+

More than one filename can reference the same inode number; these files are said to be 'hard linked' together.

   ! filename ! inode # !
   +--------------------+
                    \
                     >--------------> ! permbits, etc ! addresses !
                    /                 +---------inode-------------+
   ! othername ! inode # !
   +---------------------+

On the other hand, there's a special file type whose data part carries a path to another file. Since it is a special file, the OS recognizes the data as a path, and redirects opens, reads, and writes so that, instead of accessing the data within the special file, they access the data in the file named by the data in the special file. This special file is called a 'soft link' or a 'symbolic link' (aka a 'symlink').

   ! filename ! inode # !
   +--------------------+
                    \
                     '-------> ! permbits, etc ! addresses !
                               +---------inode-------------+
                                          /
        .--------------------------------'
       (
        '--> !"/path/to/some/other/file"!
             +---------data-------------+
                          |
                          |  (redirected at open() time)
                          v
             ! filename ! inode # !
             +--------------------+
                              \
                               '-----> ! permbits, etc ! addresses !
                                       +---------inode-------------+
                                                  /
        .----------------------------------------'
       (
        '-> ! data !  ! data ! etc.
            +------+  +------+

Now, the filename part of the file is stored in a special file of its own, along with the filename parts of other files; this special file is called a directory. The directory, as a file, is just an array of filename parts of other files.

When a directory is built, it is initially populated with the filename parts of two special files: the '.' and '..' files. The filename part for the '.' file is populated with the inode# of the directory file in which the entry has been made; '.' is a hardlink to the file that implements the current directory.


The filename part for the '..' file is populated with the inode# of the directory file that contains the filename part of the current directory file. '..' is a hardlink to the file that implements the immediate parent of the current directory.

The 'ln' command knows how to build hardlinks and softlinks; the 'mkdir' command knows how to build directories (the OS takes care of the above hardlinks). There are restrictions on what can be hardlinked (both links must reside on the same filesystem, the source file must exist, etc.) that are not applicable to softlinks (source and target can be on separate filesystems, the source does not have to exist, etc.). On the other hand, softlinks have other restrictions not shared by hardlinks (additional I/O is necessary to complete file access, additional storage is taken up by the softlink file's data, etc.). In other words, there are tradeoffs with each.

Now, let's demonstrate some of this...

ln in action

Let's start off with an empty directory, and create a file in it:

   ~/directory $ ls -lia
   total 3
    73477 drwxr-xr-x   2 lpitcher users  1024 Mar 11 20:16 .
    91804 drwxr-xr-x  29 lpitcher users  2048 Mar 11 20:16 ..

   ~/directory $ echo "This is a file" >basic.file

   ~/directory $ ls -lia
   total 4
    73477 drwxr-xr-x   2 lpitcher users  1024 Mar 11 20:17 .
    91804 drwxr-xr-x  29 lpitcher users  2048 Mar 11 20:16 ..
    73478 -rw-r--r--   1 lpitcher users    15 Mar 11 20:17 basic.file

   ~/directory $ cat basic.file
   This is a file

Now, let's make a hardlink to the file:

   ~/directory $ ln basic.file hardlink.file

   ~/directory $ ls -lia
   total 5
    73477 drwxr-xr-x   2 lpitcher users  1024 Mar 11 20:20 .
    91804 drwxr-xr-x  29 lpitcher users  2048 Mar 11 20:18 ..
    73478 -rw-r--r--   2 lpitcher users    15 Mar 11 20:17 basic.file
    73478 -rw-r--r--   2 lpitcher users    15 Mar 11 20:17 hardlink.file

   ~/directory $ cat hardlink.file
   This is a file

We see that:

• hardlink.file shares the same inode (73478) as basic.file
• hardlink.file shares the same data as basic.file

If we change the permissions on basic.file:

   ~/directory $ chmod a+w basic.file

   ~/directory $ ls -lia
   total 5
    73477 drwxr-xr-x   2 lpitcher users  1024 Mar 11 20:20 .
    91804 drwxr-xr-x  29 lpitcher users  2048 Mar 11 20:18 ..
    73478 -rw-rw-rw-   2 lpitcher users    15 Mar 11 20:17 basic.file
    73478 -rw-rw-rw-   2 lpitcher users    15 Mar 11 20:17 hardlink.file

then the same permissions change on hardlink.file. The two files (basic.file and hardlink.file) share the same inode and data, but have different file names.

Let's now make a softlink to the original file:

   ~/directory $ ln -s basic.file softlink.file

   ~/directory $ ls -lia
   total 5
    73477 drwxr-xr-x   2 lpitcher users  1024 Mar 11 20:24 .
    91804 drwxr-xr-x  29 lpitcher users  2048 Mar 11 20:18 ..
    73478 -rw-rw-rw-   2 lpitcher users    15 Mar 11 20:17 basic.file
    73478 -rw-rw-rw-   2 lpitcher users    15 Mar 11 20:17 hardlink.file
    73479 lrwxrwxrwx   1 lpitcher users    10 Mar 11 20:24 softlink.file -> basic.file

   ~/directory $ cat softlink.file
   This is a file

Here, we see that although softlink.file accesses the same data as basic.file and hardlink.file, it does not share the same inode (73479 vs 73478), nor does it exhibit the same file permissions. It does show a new permission bit: the 'l' (softlink) bit.

If we delete basic.file:

   ~/directory $ rm basic.file

   ~/directory $ ls -lia
   total 4
    73477 drwxr-xr-x   2 lpitcher users  1024 Mar 11 20:27 .
    91804 drwxr-xr-x  29 lpitcher users  2048 Mar 11 20:18 ..
    73478 -rw-rw-rw-   1 lpitcher users    15 Mar 11 20:17 hardlink.file
    73479 lrwxrwxrwx   1 lpitcher users    10 Mar 11 20:24 softlink.file -> basic.file

then we lose the ability to access the linked data through the softlink:

   ~/directory $ cat softlink.file
   cat: softlink.file: No such file or directory

However, we still have access to the original data through the hardlink:

   ~/directory $ cat hardlink.file
   This is a file

You will notice that when we deleted the original file, the hardlink didn't vanish. Similarly, if we had deleted the softlink, the original file wouldn't have vanished.

A further note with respect to hardlink files

When deleting files, the data part isn't disposed of until all the filename parts have been deleted. There's a count in the inode that indicates how many filenames point to this file, and that count is decremented by 1 each time one of those filenames is deleted. When the count makes it to zero, the inode and its associated data are deleted. By the way, the count also reflects how many times the file has been opened without being closed (in other words, how many references to the file are still active).

This has some ramifications which aren't obvious at first: you can delete a file so that no "filename" part points to the inode, without releasing the space for the data part of the file, because the file is still open. Have you ever found yourself in this position: you notice that /var/log/messages (or some other syslog-owned file) has grown too big, and you run

   rm /var/log/messages
   touch /var/log/messages

to reclaim the space, but the used space doesn't reappear? This is because, although you've deleted the filename part, there's a process that still has the data part open (syslogd), and the OS won't release the space for the data until the process closes it. In order to complete your space reclamation, you have to run

   kill -SIGHUP `cat /var/run/syslogd.pid`

to get syslogd to close and reopen the file.
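The link count described above can also be watched directly with stat. The sketch below uses stat -c, the GNU coreutils form (BSD stat takes different flags), and runs in a throwaway scratch directory:

```shell
#!/bin/sh
# Watch the inode link count rise and fall as hardlinks come and go.
cd "`mktemp -d`"

echo "This is a file" > basic.file
ln basic.file hardlink.file
stat -c '%h' basic.file      # two filenames now point at the inode

rm basic.file
stat -c '%h' hardlink.file   # only one filename is left; the data survives
cat hardlink.file
```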


File Compression and Archiving
Sometimes it is useful to store a group of files in one file so that they can be backed up, easily transferred to another directory, or even transferred to a different computer. It is also sometimes useful to compress files so that they use less disk space and download faster via the Internet.

It is important to understand the distinction between an archive file and a compressed file. An archive file is a collection of files and directories stored in one file. The archive file is not compressed — it uses the same amount of disk space as all the individual files and directories combined. A compressed file is also a collection of files and directories stored in one file, but stored in a way that uses less disk space than the individual files and directories combined. If you do not have enough disk space on your computer, you can compress files that you do not use very often or files that you want to save but no longer use. You can even create an archive file and then compress it to save disk space.

Compressing Files at the Shell Prompt
Compressed files use less disk space and download faster than large, uncompressed files. In Red Hat Linux you can compress files with the compression tools gzip, bzip2, or zip. The bzip2 compression tool is recommended because it provides the most compression and is found on most UNIX-like operating systems. The gzip compression tool can also be found on most UNIX-like operating systems. If you need to transfer files between Linux and other operating systems, such as MS Windows, you should use zip because it is more compatible with the compression utilities available for Windows.

Compression Tool    File Extension    Uncompression Tool
gzip                .gz               gunzip
bzip2               .bz2              bunzip2
zip                 .zip              unzip

By convention, files compressed with gzip are given the extension .gz, files compressed with bzip2 are given the extension .bz2, and files compressed with zip are given the extension .zip. Files compressed with gzip are uncompressed with gunzip, files compressed with bzip2 are uncompressed with bunzip2, and files compressed with zip are uncompressed with unzip.


Bzip2 and Bunzip2
To use bzip2 to compress a file, type the following command at a shell prompt:

bzip2 filename

The file is compressed and saved as filename.bz2. To expand the compressed file, type:

bunzip2 filename.bz2

The filename.bz2 file is deleted and replaced with filename. You can give bzip2 several files to compress at the same time by listing them with a space between each one:

bzip2 file1 file2 file3

Note that bzip2 compresses each file individually, producing file1.bz2, file2.bz2, and file3.bz2; it does not combine multiple files or directories into a single archive. To store several files or a directory tree in one compressed file, create a tar archive first and then compress it (see Archiving Files at the Shell Prompt). For more information, type man bzip2 and man bunzip2 at a shell prompt to read the man pages for bzip2 and bunzip2.

Gzip and Gunzip
To use gzip to compress a file, type the following command at a shell prompt:

gzip filename

The file is compressed and saved as filename.gz. To expand the compressed file, type:

gunzip filename.gz

The filename.gz file is deleted and replaced with filename. You can give gzip several files to compress at the same time by listing them with a space between each one, and the -r option makes gzip descend into directories:

gzip -r file1 file2 file3 /usr/work/school

Each file is compressed individually (producing file1.gz, file2.gz, file3.gz, and a .gz file for every file under /usr/work/school); gzip does not combine them into a single archive. For more information, type man gzip and man gunzip at a shell prompt to read the man pages for gzip and gunzip.
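The one-file-per-.gz behavior is quick to see in a scratch directory; the file names below are arbitrary.

```shell
# gzip compresses each named file separately; it never bundles them.
cd "$(mktemp -d)"
echo one > file1
echo two > file2
gzip file1 file2        # produces file1.gz and file2.gz; originals are removed
ls                      # shows: file1.gz  file2.gz
gunzip file1.gz file2.gz
cat file1               # prints: one
```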

Zip and Unzip
To compress a file with zip, type the following command:

zip -r filename.zip filesdir

In this example, filename.zip represents the file you are creating and filesdir represents the directory you want to put in the new zip file. The -r option specifies that you want to include all files contained in the filesdir directory recursively. To extract the contents of a zip file, type:

unzip filename.zip

You can use zip to compress multiple files and directories at the same time by listing them with a space between each one:

zip -r filename.zip file1 file2 file3 /usr/work/school

The above command compresses file1, file2, file3, and the contents of the /usr/work/school directory (assuming this directory exists) and places them in a file named filename.zip. For more information, type man zip and man unzip at a shell prompt to read the man pages for zip and unzip.

Archiving Files at the Shell Prompt
A tar file is a collection of several files and/or directories in one file. This is a good way to create backups and archives. Some of the options used with tar are:

-c — create a new archive.
-f — when used with the -c option, use the filename specified for the creation of the tar file; when used with the -x option, unarchive the specified file.
-t — show the list of files in the tar file.
-v — show the progress of the files being archived.
-x — extract files from an archive.
-z — compress the tar file with gzip.
-j — compress the tar file with bzip2.

To create a tar file, type:

tar -cvf filename.tar directory/file

In this example, filename.tar represents the file you are creating and directory/file represents the directory and file you want to put in the archived file. You can tar multiple files and directories at the same time by listing them with a space between each one:

tar -cvf filename.tar /home/mine/work /home/mine/school

The above command places all the files in the work and school subdirectories of /home/mine in a new file called filename.tar in the current directory. To list the contents of a tar file, type:

tar -tvf filename.tar

To extract the contents of a tar file, type:

tar -xvf filename.tar

This command does not remove the tar file; it places copies of its unarchived contents in the current working directory, preserving any directory structure that the archive file used. For example, if the tar file contains a file called bar.txt within a directory called foo/, then extracting the archive results in the creation of the directory foo/ in your current working directory with the file bar.txt inside it.
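The create/list/extract cycle above can be exercised end to end in a couple of scratch directories; all paths and file contents here are invented for the demo.

```shell
# Archive two directories, list the archive, then extract it elsewhere.
top=$(mktemp -d); cd "$top"
mkdir work school
echo draft > work/a.txt
echo notes > school/b.txt
tar -cvf backup.tar work school   # create the archive
tar -tvf backup.tar               # list its contents
mkdir restore
cd restore
tar -xvf ../backup.tar            # extract, preserving work/ and school/
cat work/a.txt                    # prints: draft
```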


Remember, the tar command does not compress the files by default. To create a tarred and bzipped compressed file, use the -j option:

tar -cjvf filename.tbz file

tar files compressed with bzip2 are conventionally given the extension .tbz; however, sometimes users archive their files using the .tar.bz2 extension. The above command creates an archive file and then compresses it as the file filename.tbz. If you uncompress the filename.tbz file with the bunzip2 command, the filename.tbz file is removed and replaced with filename.tar. You can also expand and unarchive a bzip tar file in one command:

tar -xjvf filename.tbz

To create a tarred and gzipped compressed file, use the -z option:

tar -czvf filename.tgz file

tar files compressed with gzip are conventionally given the extension .tgz. This command creates the archive file and compresses it as the file filename.tgz. (An intermediate filename.tar is not saved.) If you uncompress the filename.tgz file with the gunzip command, the filename.tgz file is removed and replaced with filename.tar. You can expand a gzip tar file in one command:

tar -xzvf filename.tgz

Type the command man tar for more information about the tar command.
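A sketch of the compressed round trip, using an invented file name:

```shell
# Create a gzipped tar archive in one step, then expand and unarchive it.
cd "$(mktemp -d)"
echo data > file.txt
tar -czvf archive.tgz file.txt   # tar + gzip in a single command
rm file.txt
tar -xzvf archive.tgz            # gunzip + untar in a single command
cat file.txt                     # prints: data
```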

Package Management with RPM
The Red Hat Package Manager (RPM) is an open packaging system, available for anyone to use, which runs on Red Hat Linux as well as other Linux and UNIX systems. Red Hat, Inc. encourages other vendors to use RPM for their own products. RPM is distributable under the terms of the GPL. For the end user, RPM makes system updates easy. Installing, uninstalling, and upgrading RPM packages can be accomplished with short commands. RPM maintains a database of installed packages and their files, so you can invoke powerful queries and verifications on your system. If you prefer a graphical interface, you can use Gnome-RPM to perform many RPM commands.


During upgrades, RPM handles configuration files carefully, so that you never lose your customizations — something that you cannot accomplish with regular .tar.gz files.

For the developer, RPM allows you to take software source code and package it into source and binary packages for end users. This process is quite simple and is driven from a single file and the optional patches that you create. This clear delineation of "pristine" sources and your patches and build instructions eases the maintenance of the package as new versions of the software are released.

Note: Run RPM Commands as Root
Because RPM makes changes to your system, you must be root in order to install, remove, or upgrade an RPM package.

RPM Design Goals
In order to understand how to use RPM, it can be helpful to understand RPM's design goals:

Upgradability
Using RPM, you can upgrade individual components of your system without completely reinstalling. When you get a new release of an operating system based on RPM (such as Red Hat Linux), you don't need to reinstall on your machine (as you do with operating systems based on other packaging systems). RPM allows intelligent, fully-automated, in-place upgrades of your system. Configuration files in packages are preserved across upgrades, so you won't lose your customizations. No special upgrade files are needed to upgrade a package, because the same RPM file is used to both install and upgrade the package on your system.

Powerful Querying
RPM is designed to provide powerful querying options. You can search through your entire database for packages or just for certain files. You can also easily find out what package a file belongs to and where the package came from. The files an RPM package contains are in a compressed archive, with a custom binary header containing useful information about the package and its contents, allowing you to query individual packages quickly and easily.

System Verification
Another powerful feature is the ability to verify packages. If you are worried that you deleted an important file belonging to some package, simply verify the package. You will be notified of any anomalies. At that point, you can reinstall the package if necessary. Any configuration files that you modified are preserved during reinstallation.


Pristine Sources
A crucial design goal was to allow the use of "pristine" software sources, as distributed by the original authors of the software. With RPM, you have the pristine sources along with any patches that were used, plus complete build instructions. This is an important advantage for several reasons. For instance, if a new version of a program comes out, you do not necessarily have to start from scratch to get it to compile: you can look at the patch to see what you might need to do. All the compiled-in defaults, and all of the changes that were made to get the software to build properly, are easily visible using this technique. The goal of keeping sources pristine may seem important only for developers, but it results in higher-quality software for end users, too. We would like to thank the folks from the BOGUS distribution for originating the pristine source concept.

Using RPM
RPM has five basic modes of operation (not counting package building): installing, uninstalling, upgrading, querying, and verifying. This section contains an overview of each mode. For complete details and options, try rpm --help, or turn to the section called Additional Resources for more information on RPM.

Finding RPMs
Before using an RPM, you must know where to find them. An Internet search will return many RPM repositories, but if you are looking for RPM packages built by Red Hat, they can be found at the following locations:

• The official Red Hat Linux CD-ROMs
• The Red Hat Errata Page, available at http://www.redhat.com/support/errata
• A Red Hat FTP Mirror Site, available at http://www.redhat.com/mirrors.html
• Red Hat Network

RPM packages typically have file names like foo-1.0-1.i386.rpm. The file name includes the package name (foo), version (1.0), release (1), and architecture (i386). Installing a package is as simple as typing the following command at a shell prompt:


# rpm -ivh foo-1.0-1.i386.rpm
foo ####################################
#

As you can see, RPM prints out the name of the package and then prints a succession of hash marks as a progress meter while the package is installed.

Note: Although a command like rpm -ivh foo-1.0-1.i386.rpm is commonly used to install an RPM package, you may want to consider using rpm -Uvh foo-1.0-1.i386.rpm instead. -U is commonly used for upgrading a package, but it will also install new packages.

Installing packages is designed to be simple, but you may sometimes see errors:

Package Already Installed
If a package of the same version is already installed, you will see:

# rpm -ivh foo-1.0-1.i386.rpm
foo package foo-1.0-1 is already installed
#

If the same version you are trying to install is already installed and you want to install the package anyway, you can use the --replacepkgs option, which tells RPM to ignore the error:

# rpm -ivh --replacepkgs foo-1.0-1.i386.rpm
foo ####################################
#

This option is helpful if files installed from the RPM were deleted or if you want the original configuration files from the RPM to be installed.

Conflicting Files
If you attempt to install a package that contains a file which has already been installed by another package or an earlier version of the same package, you will see:

# rpm -ivh foo-1.0-1.i386.rpm
foo /usr/bin/foo conflicts with file from bar-1.0-1
#


To make RPM ignore this error, use the --replacefiles option:

# rpm -ivh --replacefiles foo-1.0-1.i386.rpm
foo ####################################
#

Unresolved Dependency
RPM packages can "depend" on other packages, which means that they require other packages to be installed in order to run properly. If you try to install a package which has an unresolved dependency, you will see:

# rpm -ivh foo-1.0-1.i386.rpm
failed dependencies:
    bar is needed by foo-1.0-1
#

To handle this error you should install the requested packages. If you want to force the installation anyway (a bad idea, since the package probably will not run correctly), use the --nodeps option.

Uninstalling
Uninstalling a package is just as simple as installing one. Type the following command at a shell prompt:

# rpm -e foo
#

Note: Notice that we used the package name foo, not the name of the original package file foo-1.0-1.i386.rpm. To uninstall a package, replace foo with the actual package name of the original package.

You can encounter a dependency error when uninstalling a package if another installed package depends on the one you are trying to remove. For example:

# rpm -e foo
removing these packages would break dependencies:
    foo is needed by bar-1.0-1
#


To cause RPM to ignore this error and uninstall the package anyway (also a bad idea, since the package that depends on it will probably fail to work properly), use the --nodeps option.

Upgrading
Upgrading a package is similar to installing one. Type the following command at a shell prompt:

# rpm -Uvh foo-2.0-1.i386.rpm
foo ####################################
#

What you do not see above is that RPM automatically uninstalled any old versions of the foo package. In fact, you may want to always use -U to install packages, since it works even when there are no previous versions of the package installed.

Since RPM performs intelligent upgrading of packages with configuration files, you may see a message like the following:

saving /etc/foo.conf as /etc/foo.conf.rpmsave

This message means that your changes to the configuration file may not be "forward compatible" with the new configuration file in the package, so RPM saved your original file and installed a new one. You should investigate the differences between the two configuration files and resolve them as soon as possible, to ensure that your system continues to function properly.

Upgrading is really a combination of uninstalling and installing, so during an RPM upgrade you can encounter uninstalling and installing errors, plus one more. If RPM thinks you are trying to upgrade to a package with an older version number, you will see:

# rpm -Uvh foo-1.0-1.i386.rpm
foo package foo-2.0-1 (which is newer) is already installed
#

To cause RPM to "upgrade" anyway, use the --oldpackage option:

# rpm -Uvh --oldpackage foo-1.0-1.i386.rpm
foo ####################################
#

Freshening


Freshening a package is similar to upgrading one. Type the following command at a shell prompt:

# rpm -Fvh foo-1.2-1.i386.rpm
foo ####################################
#

RPM's freshen option checks the versions of the packages specified on the command line against the versions of the packages already installed on your system. When RPM's freshen option processes a newer version of an already-installed package, the package is upgraded to the newer version. However, the freshen option will not install a package if no previously-installed package of the same name exists. This differs from RPM's upgrade option, as an upgrade installs packages whether or not an older version of the package is already installed.

RPM's freshen option works for single packages or a group of packages. If you have just downloaded a large number of different packages and you only want to upgrade those packages that are already installed on your system, freshening will do the job, and you will not have to delete any unwanted packages from the group you downloaded. In this case, simply issue the following command:

# rpm -Fvh *.rpm

RPM automatically upgrades only those packages that are already installed.

Querying
Use the rpm -q command to query the database of installed packages. The rpm -q foo command prints the package name, version, and release number of the installed package foo:

# rpm -q foo
foo-2.0-1
#

Note: Notice that we used the package name foo. To query a package, replace foo with the actual package name.


Instead of specifying the package name, you can use the following options with -q to specify the package(s) you want to query. These are called Package Specification Options:

-a queries all currently installed packages.
-f <file> queries the package which owns <file>. When specifying a file, you must specify the full path of the file (for example, /usr/bin/ls).
-p <packagefile> queries the package <packagefile>.

There are a number of ways to specify what information to display about queried packages. The following options select the type of information you are searching for. These are called Information Selection Options:

-i displays package information including name, description, release, size, build date, install date, vendor, and other miscellaneous information.
-l displays the list of files that the package contains.
-s displays the state of all the files in the package.
-d displays a list of files marked as documentation (man pages, info pages, READMEs, etc.).
-c displays a list of files marked as configuration files. These are the files you change after installation to adapt the package to your system (for example, sendmail.cf, passwd, inittab, etc.).

For the options that display lists of files, you can add -v to the command to display the lists in a familiar ls -l format.

Verifying
Verifying a package compares information about the files installed from a package with the same information from the original package. Among other things, verifying compares the size, MD5 sum, permissions, type, owner, and group of each file. The command rpm -V verifies a package. You can use any of the Package Specification Options listed for querying to specify the packages you wish to verify. A simple use is rpm -V foo, which verifies that all the files in the foo package are as they were when they were originally installed. For example:


To verify a package containing a particular file:

rpm -Vf /bin/vi

To verify ALL installed packages:

rpm -Va

To verify an installed package against an RPM package file:

rpm -Vp foo-1.0-1.i386.rpm

This command can be useful if you suspect that your RPM database is corrupt. If everything verified properly, there is no output. If there are any discrepancies, they are displayed. The format of the output is a string of eight characters (a c denotes a configuration file) followed by the file name. Each of the eight characters denotes the result of comparing one attribute of the file to the value of that attribute recorded in the RPM database. A single . (period) means the test passed. The following characters denote failure of certain tests:

5 — MD5 checksum
S — file size
L — symbolic link
T — file modification time
D — device
U — user
G — group
M — mode (includes permissions and file type)
? — unreadable file

If you see any output, use your best judgment to determine whether you should remove or reinstall the package, or fix the problem in another way.


Compiling from the original source
Read documentation
Look for files called: INSTALL, README, SETUP, or similar. Read with less docfile, or zless docfile.gz for .gz files.

The procedure
The installation procedure for software that comes in tar.gz and tar.bz2 packages isn't always the same, but usually it goes like this:

# tar xvzf package.tar.gz   (or tar xvjf package.tar.bz2)
# cd package
# ./configure
# make
# make install

If you're lucky, by issuing these simple commands you unpack, configure, compile, and install the software package, and you don't even have to know what you're doing. However, it's healthy to take a closer look at the installation procedure and see what these steps mean.

Unpacking
Maybe you've already noticed that the package containing the source code of the program has a tar.gz or a tar.bz2 extension. This means that the package is a compressed tar archive, also known as a tarball. When making the package, the source code and the other needed files were piled together in a single tar archive, hence the tar extension. After piling them all together in the tar archive, the archive was compressed with gzip, hence the gz extension.

Some people prefer to compress the tar archive with bzip2 instead of gzip. In that case the package has a tar.bz2 extension. You install these packages exactly the same way as tar.gz packages, but you use a slightly different command when unpacking.

It doesn't matter where you put the tarballs you download from the internet, but I suggest creating a special directory for downloaded tarballs. In this tutorial I assume you keep tarballs in a directory called dls that you've created under your home directory. However, the dls directory is just an example; you can put your downloaded tar.gz or tar.bz2 software packages into any directory you want. In this example I assume your username is me and you've downloaded a package called pkg.tar.gz into the dls directory you've created (/home/me/dls).


Ok, finally on to unpacking the tarball. After downloading the package, you unpack it with this command:

me@puter: ~/dls$ tar xvzf pkg.tar.gz

As you can see, you use the tar command with the appropriate options (xvzf) for unpacking the tarball. If you have a package with the tar.bz2 extension instead, you must tell tar that this isn't a gzipped tar archive. You do so by using the j option instead of z, like this:

me@puter: ~/dls$ tar xvjf pkg.tar.bz2

What happens after unpacking depends on the package, but in most cases a directory with the package's name is created. The newly created directory goes under the directory where you are right now. To be sure, you can give the ls command:

me@puter: ~/dls$ ls
pkg pkg.tar.gz
me@puter: ~/dls$

In our example, unpacking our package pkg.tar.gz did what was expected and created a directory with the package's name. Now you must cd into that newly created directory:

me@puter: ~/dls$ cd pkg
me@puter: ~/dls/pkg$

Read any documentation you find in this directory, like README or INSTALL files, before continuing!

Configuring
Now, after we've changed into the package's directory (and done a little RTFM'ing), it's time to configure the package. Usually, but not always (that's why you need to check out the README and INSTALL files), this is done by running the configure script:

me@puter: ~/dls/pkg$ ./configure

When you run the configure script, you don't actually compile anything yet. configure just checks your system and assigns values for system-dependent variables. These values are used to generate a Makefile. The Makefile in turn is used to generate the actual binary.


When you run the configure script, you'll see a bunch of weird messages scrolling on your screen. This is normal and you shouldn't worry about it. If configure finds an error, it complains about it and exits. However, if everything works like it should, configure doesn't complain about anything, exits, and shuts up. If configure exited without errors, it's time to move on to the next step.

Building
It's finally time to actually build the binary, the executable program, from the source code. This is done by running the make command:

me@puter: ~/dls/pkg$ make

Note that make needs the Makefile for building the program; otherwise it doesn't know what to do. This is why it's so important to run the configure script successfully, or to generate the Makefile some other way.

When you run make, you'll again see a bunch of strange messages filling your screen. This is also perfectly normal and nothing you should worry about. This step may take some time, depending on how big the program is and how fast your computer is. If you're doing this on an old, decrepit rig with a snail of a processor, go grab yourself some coffee. At this point I usually lose my patience completely.

If all goes as it should, your executable is finished and ready to run after make has done its job. Now, the final step is to install the program.

Installing
Now it's finally time to install the program. When doing this you must be root. If you've done things as a normal user, you can become root with the su command. It'll ask you for the root password and then you're ready for the final step!

me@puter: ~/dls/pkg$ su
Password:
root@puter: /home/me/dls/pkg#

Now that you're root, you can install the program with the make install command:

root@puter: /home/me/dls/pkg# make install

Again, you'll get some weird messages scrolling on the screen. After they stop, congrats: you've installed the software and you're ready to run it!

Because in this example we didn't change the behavior of the configure script, the program was installed in the default place. In many cases it's /usr/local/bin. If


/usr/local/bin (or whatever place your program was installed in) is already in your PATH, you can just run the program by typing its name.

And one more thing: if you became root with su, you'd better drop back to your normal user privileges before you do something stupid. Type exit to become a normal user again:

root@puter: /home/me/dls/pkg# exit
exit
me@puter: ~/dls/pkg$

Cleaning up the mess
I bet you want to save some disk space. If so, you'll want to get rid of some files you don't need. When you ran make, it created all sorts of files that were needed during the build process but are useless now and just take up disk space. This is why you'll want to make clean:

me@puter: ~/dls/pkg$ make clean

However, make sure you keep your Makefile. It's needed if you later decide to uninstall the program and want to do it as painlessly as possible!

Uninstalling
So, you decided you didn't like the program after all? Uninstalling programs you've compiled yourself isn't as easy as uninstalling programs you've installed with a package manager, like rpm. If you want to uninstall software you've compiled yourself, do the obvious: some old-fashioned RTFM'ing. Read the documentation that came with your software package and see if it says anything about uninstalling. If it doesn't, you can start pulling your hair out.

If you didn't delete your Makefile, you may be able to remove the program by doing a make uninstall:

root@puter: /home/me/dls/pkg# make uninstall

If you see weird text scrolling on your screen (but at this point you've probably gotten used to weird text filling the screen? :-) that's a good sign. If make starts complaining at you, that's a bad sign: then you'll have to remove the program files manually. If you know where the program was installed, you'll have to delete the installed files or the directory where your program is. If you have no idea where all the files are, you'll have to read the Makefile and see where all the files got installed, and then delete them.


yum
About Repositories
A repository is a prepared directory or Web site that contains software packages and index files. Software management utilities such as yum automatically locate and obtain the correct RPM packages from these repositories. This method frees you from having to manually find and install new applications or updates. You may use a single command to update all system software, or search for new software by specifying criteria.

A network of servers provides several repositories for each version of Red Hat. The package management utilities in Red Hat are already configured to use three of these repositories:

• Base — the packages that make up a Red Hat release, as it is on disc
• Updates — updated versions of packages that are provided in Base
• Extras — packages for a large selection of additional software

Red Hat Development Repositories
Red Hat also includes settings for several alternative repositories. These provide packages for various types of test system, and replace one or more of the standard repositories. Third-party software developers also provide repositories for their Red Hat compatible packages. You may also use the package groups provided by the Red Hat repositories to manage related packages as sets. Some third-party repositories add packages to these groups, or provide their packages as additional groups.


Available Package Groups
To view a list of all of the available package groups for your Red Hat system, run the command su -c 'yum grouplist'. Use repositories to ensure that you always receive current versions of software. If several versions of the same package are available, your management utility automatically selects the latest version.

About Dependencies
Some of the files installed on a Red Hat distribution are libraries, which may provide functions to multiple applications. When an application requires a specific library, the package which contains that library is a dependency. To properly install a package, Red Hat must first satisfy its dependencies. The dependency information for an RPM package is stored within the RPM file. The yum utility uses package dependency data to ensure that all of the requirements for an application are met during installation. It automatically installs the packages for any dependencies not already present on your system. If a new application has requirements that conflict with existing software, yum aborts without making any changes to your system.

Understanding Package Names
Each package file has a long name that indicates several key pieces of information. For example, this is the full name of a tsclient package:

tsclient-0.132-6.i386.rpm

Management utilities commonly refer to packages with one of three formats:

Package name: tsclient
Package name with version and release numbers: tsclient-0.132-6
Package name with hardware architecture: tsclient.i386

For clarity, yum lists packages in the format name.architecture. Repositories also commonly store packages in separate directories by architecture. In each case, the hardware architecture specified for the package is the minimum type of machine required to use the package.

i386     Suitable for any current Intel-compatible computer
noarch   Compatible with all computer architectures
ppc      Suitable for PowerPC systems, such as the Apple Power Macintosh
x86_64   Suitable for 64-bit Intel-compatible processors, such as the Opteron

Some software may be optimized for particular types of Intel-compatible machine. Separate packages may be provided for i386, i586, i686 and x86_64 computers. A machine with at least an Intel Pentium, VIA C3 or compatible CPU may use i586 packages. Computers with an Intel Pentium Pro and above, or a current model of AMD chip, may use i686 packages.

Use the short name of the package for yum commands. This causes yum to automatically select the most recent package in the repositories that matches the hardware architecture of your computer. Specify a package with other name formats to override the default behavior and force yum to use the package that matches that version or architecture. Only override yum when you know that the default package selection has a bug or other fault that makes it unsuitable for installation.
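The name-version-release-architecture layout described above can be pulled apart with ordinary shell parameter expansion. This is an illustrative sketch, not part of any yum tooling; the variable names are my own.

```shell
# Split a full package file name into its fields, working inward
# from the ends: strip .rpm, peel off the architecture, then the
# release, then the version; what remains is the package name.
pkg="tsclient-0.132-6.i386.rpm"
base=${pkg%.rpm}          # tsclient-0.132-6.i386
arch=${base##*.}          # i386
nvr=${base%.*}            # tsclient-0.132-6
release=${nvr##*-}        # 6
nv=${nvr%-*}              # tsclient-0.132
version=${nv##*-}         # 0.132
name=${nv%-*}             # tsclient
echo "$name $version $release $arch"   # tsclient 0.132 6 i386
```

Note that this simple split assumes the version and release themselves contain no hyphens, which holds for the naming scheme shown above.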

Managing Software with yum
Use the yum utility to modify the software on your system in four ways:

To install new software from package repositories
To install new software from an individual package file
To update existing software on your system
To remove unwanted software from your system

To use yum, specify a function and one or more packages or package groups. Each section below gives some examples. For each operation, yum downloads the latest package information from the configured repositories. If your system uses a slow network connection, yum may require several seconds to download the repository indexes and the header files for each package. The yum utility searches these data files to determine the best set of actions to produce the required result, and displays the transaction for you to approve. The transaction may


include the installation, update, or removal of additional packages, in order to resolve software dependencies. This is an example of the transaction for installing tsclient:

=============================================================================
 Package                 Arch       Version          Repository        Size
=============================================================================
Installing:
 tsclient                i386       0.132-6          base              247 k
Installing for dependencies:
 rdesktop                i386       1.4.0-2          base              107 k

Transaction Summary
=============================================================================
Install      2 Package(s)
Update       0 Package(s)
Remove       0 Package(s)
Total download size: 355 k
Is this ok [y/N]:

Example 1. Format of yum Transaction Reports

Review the list of changes, and then press y to accept and begin the process. If you press N or Enter, yum does not download or change any packages.

Package Versions
The yum utility only displays and uses the newest version of each package, unless you specify an older version. The yum utility also imports the repository public key if it is not already installed on the rpm keyring. This is an example of the public key import:

warning: rpmts_HdrFromFdno: Header V3 DSA signature: NOKEY, key ID 4f2a6fd2
public key not available for tsclient-0.132-6.i386.rpm
Retrieving GPG key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora
Importing GPG key 0x4F2A6FD2 "Fedora Project <fedora@redhat.com>"
Is this ok [y/N]:

Example 2. Format of yum Public Key Import


Check the public key, and then press y to import the key and authorize the key for use. If you press N or Enter, yum stops without installing any packages. To ensure that downloaded packages are genuine, yum verifies the digital signature of each package against the public key of the provider. Once all of the packages required for the transaction are successfully downloaded and verified, yum applies them to your system.

Transaction Log
Every completed transaction records the affected packages in the log file /var/log/yum.log. You need root access to read this file.

Installing New Software with yum
To install the package tsclient, enter the command:

su -c 'yum install tsclient'

Enter the password for the root account when prompted.

To install the package group MySQL Database, enter the command:

su -c 'yum groupinstall "MySQL Database"'

Enter the password for the root account when prompted.

New Services Require Activation
When you install a service, Red Hat does not activate or start it. To configure a new service to run on bootup, choose Desktop → System Settings → Server Settings → Services, or use the chkconfig and service command-line utilities.

Updating Software with yum
To update the tsclient package to the latest version, type:

su -c 'yum update tsclient'

Enter the password for the root account when prompted.

New Software Versions Require Reloading
If a piece of software is in use when you update it, the old version remains active until the application or service is restarted. Kernel updates take effect when you reboot the system.

Kernel Packages


Kernel packages remain on the system after they have been superseded by newer versions. This enables you to boot your system with an older kernel if an error occurs with the current kernel. To minimize maintenance, yum automatically removes obsolete kernel packages from your system, retaining only the current kernel and the previous version.

To update all of the packages in the package group MySQL Database, enter the command:

su -c 'yum groupupdate "MySQL Database"'

Enter the password for the root account when prompted.

Updating the Entire System

Removing Software with yum
To remove software, yum examines your system for both the specified software, and any software which claims it as a dependency. The transaction to remove the software deletes both the software and the dependencies. To remove the tsclient package from your system, use the command:

su -c 'yum remove tsclient'

Enter the password for the root account when prompted.

To remove all of the packages in the package group MySQL Database, enter the command:

su -c 'yum groupremove "MySQL Database"'

Enter the password for the root account when prompted.

sysctl
Sysctl is an interface for examining and dynamically changing parameters in a BSD Unix (or Linux) operating system kernel. Generally, these parameters (identified as objects in a Management Information Base) describe tunable limits such as the size of a shared memory segment, the number of threads the operating system will use as an NFS client, or the maximum number of processes on the system; or describe, enable or disable behaviors such as IP forwarding, security restrictions on the superuser (the "securelevel"), or debugging output. Generally, a system call or system call wrapper is provided for use by programs, as well as an administrative program and a configuration file (for setting the tunable parameters when the system boots).


This feature appeared in the 4.4BSD version of Unix, and is also used in the Linux kernel. It has the advantage over hardcoded constants that changes to the parameters can be made dynamically without recompiling the kernel.

Examples

When IP forwarding is enabled, the operating system kernel acts as a router. For the Linux kernel, the parameter net.ipv4.ip_forward can be set to 1 to enable this behavior. In FreeBSD, NetBSD and OpenBSD the parameter is net.inet.ip.forwarding. In most systems, the command sysctl -w parameter=1 will enable the desired behavior. This persists until the next reboot. If the behavior should be enabled whenever the system boots, the line parameter=1 can be added to the file /etc/sysctl.conf. Additionally, some sysctl variables cannot be modified after the system is booted. Depending on the variable and the version and flavor of BSD, these need to either be set statically in the kernel at compile time or set in /boot/loader.conf.

The proc filesystem

Under the Linux kernel, the proc filesystem also provides an interface to the sysctl parameters. For example, the parameter net.ipv4.ip_forward corresponds to the file /proc/sys/net/ipv4/ip_forward. Reading or changing this file is equivalent to changing the parameter using the sysctl command.

Oracle parameters

kernel.shmmax=2313682943
kernel.msgmni=1024
kernel.sem=1250 256000 100 1024
vm.max_map_count=300000
net.ipv4.ip_local_port_range = 1024 65000
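The parameter-to-file correspondence described above is mechanical: each dot in the sysctl name becomes a slash under /proc/sys. This small sketch just demonstrates the name mapping; it does not modify any kernel parameter.

```shell
# A sysctl parameter name maps to a /proc/sys path by replacing
# each dot with a slash and prefixing /proc/sys/.
param="net.ipv4.ip_forward"
path="/proc/sys/$(echo "$param" | tr '.' '/')"
echo "$path"   # /proc/sys/net/ipv4/ip_forward
```

On a live Linux system, `cat "$path"` would then read the parameter's current value, and (as root) writing a 1 to it would be equivalent to `sysctl -w net.ipv4.ip_forward=1`.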

Linux Partitions
Devices
There is a special nomenclature that Linux uses to refer to hard drive partitions that must be understood in order to follow the discussion on the following pages. In Linux, partitions are represented by device files. These are special files located in /dev. Here are a few entries:

brw-rw----   1 root     disk       3,   0 May  5  1998 hda
brw-rw----   1 root     disk       8,   0 May  5  1998 sda
crw-------   1 root     tty        4,  64 May  5  1998 ttyS0

A device file is a file with type c ( for "character" devices, devices that do not use the buffer cache) or b (for "block" devices, which go through the buffer cache). In Linux, all disks are represented as block devices only.

Device names Naming Convention
By convention, IDE drives are given device names /dev/hda to /dev/hdd. Hard Drive A (/dev/hda) is the first drive and Hard Drive C (/dev/hdc) is the third.

Table 2. IDE controller naming convention

drive name   drive controller   drive number
/dev/hda     1                  1
/dev/hdb     1                  2
/dev/hdc     2                  1
/dev/hdd     2                  2

A typical PC has two IDE controllers, each of which can have two drives connected to it. For example, /dev/hda is the first drive (master) on the first IDE controller and /dev/hdd is the second (slave) drive on the second controller (the fourth IDE drive in the computer). You can write to these devices directly (using cat or dd). However, since these devices represent the entire disk, starting at the first block, you can mistakenly overwrite the master boot record and the partition table, which will render the drive unusable.

Table 3. partition names

drive name   drive controller   drive number   partition type   partition number
/dev/hda1    1                  1              primary          1
/dev/hda2    1                  1              primary          2
/dev/hda3    1                  1              primary          3
/dev/hda4    1                  1              swap             NA
/dev/hdb1    1                  2              primary          1
/dev/hdb2    1                  2              primary          2
/dev/hdb3    1                  2              primary          3
/dev/hdb4    1                  2              primary          4
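The letter in an IDE device name encodes the controller and master/slave position deterministically, so the mapping can be written down as a simple lookup. This is an illustrative sketch of the convention; the helper name ide_info is my own, not a real utility.

```shell
# Decode an IDE drive name into its controller and cable position:
# hda/hdb sit on controller 1, hdc/hdd on controller 2; the first
# drive on each controller is the master, the second the slave.
ide_info() {
  case "$1" in
    hda) echo "controller 1, master" ;;
    hdb) echo "controller 1, slave" ;;
    hdc) echo "controller 2, master" ;;
    hdd) echo "controller 2, slave" ;;
    *)   echo "not an IDE drive name" ;;
  esac
}
ide_info hda   # controller 1, master
ide_info hdd   # controller 2, slave
```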

80

Once a drive has been partitioned, the partitions are represented as numbers on the end of the names. For example, the second partition on the second drive will be /dev/hdb2. The partition type (primary) is listed in the table above for clarity.

Table 4. SCSI Drives

drive name   drive controller   drive number   partition type   partition number
/dev/sda1    1                  6              primary          1
/dev/sda2    1                  6              primary          2
/dev/sda3    1                  6              primary          3

SCSI drives follow a similar pattern; they are represented by 'sd' instead of 'hd'. The first partition of the second SCSI drive would therefore be /dev/sdb1. In the table above, the drive number is arbitrarily chosen to be 6 to introduce the idea that SCSI ID numbers do not map onto device names under Linux.

Name Assignment
Under (Sun) Solaris and (SGI) IRIX, the device name given to a SCSI drive has some relationship to where you plug it in. Under Linux, there is only wailing and gnashing of teeth.

Before                     After
SCSI ID #2   /dev/sda      SCSI ID #2   /dev/sda
SCSI ID #5   /dev/sdb      SCSI ID #7   /dev/sdb
SCSI ID #7   /dev/sdc      SCSI ID #8   /dev/sdc
SCSI ID #8   /dev/sdd

SCSI drives have ID numbers which go from 0 through 15. Lower SCSI ID numbers are assigned lower-order letters. For example, if you have two drives numbered 2 and 5, then #2 will be /dev/sda and #5 will be /dev/sdb. If you remove either, all the higher numbered drives will be renamed the next time you boot up. If you have two SCSI controllers in your Linux box, you will need to examine the output of /bin/dmesg in order to see what name each drive was assigned. If you remove one of two controllers, the remaining controller might have all its drives renamed. Grrr... There are two work-arounds; both involve using a program to put a label on each partition. The label is persistent even when the device is physically moved. You then refer to the partition directly or indirectly by label.
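The renaming hazard above follows from one rule: letters are handed out in scan order, not by SCSI ID. This sketch simulates that rule (the assign helper is hypothetical, purely for illustration); note how every drive after the removed one shifts down a letter.

```shell
# Simulate the kernel's sd-letter assignment: drives get sda, sdb,
# sdc... strictly in the order they are found, regardless of SCSI ID.
assign() {
  letters="a b c d e f g h"
  i=1
  for id in "$@"; do
    l=$(echo "$letters" | cut -d' ' -f"$i")
    echo "SCSI ID #$id -> /dev/sd$l"
    i=$((i + 1))
  done
}
assign 2 5 7 8
echo "--- after removing ID 5 and rebooting ---"
assign 2 7 8    # IDs 7 and 8 are silently renamed sdb and sdc
```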

81

Logical Partitions
Table 5. Logical Partitions

drive name   drive controller   drive number   partition type   partition number
/dev/hdb1    1                  2              primary          1
/dev/hdb2    1                  2              extended         NA
/dev/hdb5    1                  2              logical          2
/dev/hdb6    1                  2              logical          3

The table above illustrates a mysterious jump in the name assignments. This is due to the use of logical partitions. This is all you have to know to deal with linux disk devices. For the sake of completeness, see Kristian's discussion of device numbers below.

Device numbers
The only important things about a device file are its major and minor device numbers, which are shown instead of the file size:

$ ls -l /dev/hda
brw-rw----   1 root     disk    3,   0  Jul 18  1994 /dev/hda

Table 6. Device file attributes

brw-rw----    permissions
root          owner
disk          group
3             major device number
0             minor device number
Jul 18 1994   date
/dev/hda      device name

When accessing a device file, the major number selects which device driver is called to perform the input/output operation. This call is made with the minor number as a parameter, and it is entirely up to the driver how the minor number is interpreted. The driver documentation usually describes how the driver uses minor numbers. For IDE disks, this documentation is in /usr/src/linux/Documentation/ide.txt. For SCSI disks, one would expect such documentation in /usr/src/linux/Documentation/scsi.txt, but it isn't there. One has to look at the driver source to be sure (/usr/src/linux/drivers/scsi/sd.c:184-196). Fortunately, there is Peter Anvin's list of device numbers and names in /usr/src/linux/Documentation/devices.txt; see the entries for block devices, major 3, 22, 33, 34 for IDE and major 8 for SCSI disks.

The major and minor numbers are a byte each and that is why the number of partitions per disk is limited.
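The partition limit follows from arithmetic on that one minor byte: each driver reserves a fixed number of minor numbers per disk, and minor 0 of each disk refers to the whole device. A sketch of the calculation (the per-disk minor counts are the conventional values for the sd and hd drivers):

```shell
# With an 8-bit minor number, partitions per disk = minors reserved
# per disk, minus one for the whole-disk device (e.g. /dev/sda itself).
scsi_minors_per_disk=16   # sd driver: 4 bits of the minor per disk
ide_minors_per_disk=64    # hd driver: 6 bits of the minor per disk
scsi_parts=$((scsi_minors_per_disk - 1))
ide_parts=$((ide_minors_per_disk - 1))
echo "SCSI: $scsi_parts partitions per disk, IDE: $ide_parts"
```

These figures match the limits quoted later in the Logical Partitions section: at most 15 partitions on a SCSI disk and 63 on an IDE disk.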

Partition Types
A partition is labeled to host a certain kind of file system (not to be confused with a volume label). Such a file system could be the Linux standard ext2 file system or Linux swap space, or even foreign file systems like (Microsoft) NTFS or (Sun) UFS. There is a numerical code associated with each partition type. For example, the code for ext2 is 0x83 and Linux swap is 0x82. To see a list of partition types and their codes, execute:

/sbin/sfdisk -T
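A few of the codes from this chapter, gathered into a tiny lookup. This is an illustrative sketch (the ptype helper is my own, and covers only codes mentioned in the text), not a substitute for the full table printed by sfdisk -T.

```shell
# Map a hex partition type code (without the 0x prefix, as fdisk
# prints it) to its name, for the codes used in this chapter.
ptype() {
  case "$1" in
    05) echo "Extended" ;;
    82) echo "Linux swap" ;;
    83) echo "Linux" ;;
    8e) echo "Linux LVM" ;;
    *)  echo "unknown" ;;
  esac
}
echo "0x83 -> $(ptype 83)"
echo "0x82 -> $(ptype 82)"
```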

Foreign Partition Types
The partition type codes have been arbitrarily chosen (you can't figure out what they should be) and they are particular to a given operating system. Therefore, it is theoretically possible that if you use two operating systems with the same hard drive, the same code might be used to designate two different partition types. OS/2 marks its partitions with a 0x07 type and so does Windows NT's NTFS. MS-DOS allocates several type codes for its various flavors of FAT file systems: 0x01, 0x04 and 0x06 are known. DR-DOS used 0x81 to indicate protected FAT partitions, creating a type clash with Linux/Minix at that time, but neither Linux/Minix nor DR-DOS are widely used any more.

Primary Partitions
The number of partitions on an Intel-based system was limited from the very beginning: The original partition table was installed as part of the boot sector and held space for only four partition entries. These partitions are now called primary partitions.

Logical Partitions
One primary partition of a hard drive may be subpartitioned. These are logical partitions. This effectively allows us to skirt the historical four-partition limitation. The primary partition used to house the logical partitions is called an extended partition and it has its own file system type (0x05). Unlike primary partitions, logical partitions must be contiguous. Each logical partition contains a pointer to the next logical partition, which implies that the number of logical partitions is unlimited. However, Linux imposes limits on the total number of any type of partition on a drive, so this effectively limits the number of logical partitions. This is at most 15 partitions total on a SCSI disk and 63 total on an IDE disk.

Swap Partitions
Every process running on your computer is allocated a number of blocks of RAM. These blocks are called pages. The set of in-memory pages which will be referenced by the processor in the very near future is called a "working set." Linux tries to predict these memory accesses (assuming that recently used pages will be used again in the near future) and keeps these pages in RAM if possible. If you have too many processes running on a machine, the kernel will try to free up RAM by writing pages to disk. This is what swap space is for. It effectively increases the amount of memory you have available. However, disk I/O is about a hundred times slower than reading from and writing to RAM. Consider this emergency memory and not extra memory. If memory becomes so scarce that the kernel pages out from the working set of one process in order to page in for another, the machine is said to be thrashing. Some readers might have inadvertently experienced this: the hard drive is grinding away like crazy, but the computer is slow to the point of being unusable. Swap space is something you need to have, but it is no substitute for sufficient RAM.

Partitioning with fdisk
This section shows you how to actually partition your hard drive with the fdisk utility. Linux allows only 4 primary partitions. You can have a much larger number of logical partitions by sub-dividing one of the primary partitions. Only one of the primary partitions can be sub-divided. Examples:

Four primary partitions
Mixed primary and logical partitions

fdisk usage
fdisk is started by typing (as root) fdisk device at the command prompt. device might be something like /dev/hda or /dev/sda. The basic fdisk commands you need are:

p   print the partition table
n   create a new partition
d   delete a partition
q   quit without saving changes
w   write the new partition table and exit

Changes you make to the partition table do not take effect until you issue the write (w) command. Here is a sample partition table:

Disk /dev/hdb: 64 heads, 63 sectors, 621 cylinders
Units = cylinders of 4032 * 512 bytes

   Device Boot   Start   End   Blocks   Id  System
/dev/hdb1    *       1   184   370912+  83  Linux
/dev/hdb2          185   368   370944   83  Linux
/dev/hdb3          369   552   370944   83  Linux
/dev/hdb4          553   621   139104   82  Linux swap

The first line shows the geometry of your hard drive. It may not be physically accurate, but you can accept it as though it were. The hard drive in this example is made of 32 double-sided platters with one head on each side (probably not true). Each platter has 621 concentric tracks. A 3-dimensional track (the same track on all disks) is called a cylinder. Each track is divided into 63 sectors. Each sector contains 512 bytes of data. Therefore the block size in the partition table is 64 heads * 63 sectors * 512 bytes er...divided by 1024. (See 4 for discussion on problems with this calculation.) The start and end values are cylinders.
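The geometry arithmetic above can be checked directly. This sketch recomputes the figures for the 64-head, 63-sector, 621-cylinder example drive; the variable names are mine.

```shell
# Cylinder size = heads * sectors * bytes per sector; dividing by
# 1024 gives the 1K "blocks" unit that fdisk reports.
heads=64; sectors=63; cylinders=621; sector_bytes=512
cyl_bytes=$((heads * sectors * sector_bytes))   # 4032 sectors/cylinder
disk_bytes=$((cyl_bytes * cylinders))
cyl_blocks=$((cyl_bytes / 1024))
echo "disk: $disk_bytes bytes, $cyl_blocks blocks per cylinder"
```

So the example drive holds 1281982464 bytes (about 1.2Gb), and each cylinder contributes 2016 blocks, which is why a 184-cylinder partition shows roughly 184 * 2016 blocks in the table.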

Four primary partitions
The overview: Decide on the size of your swap space and where it ought to go. Divide up the remaining space for the three other partitions.

Example: I start fdisk from the shell prompt:

# fdisk /dev/hdb

which indicates that I am using the second drive on my IDE controller. When I print the (empty) partition table, I just get configuration information.

Command (m for help): p

Disk /dev/hdb: 64 heads, 63 sectors, 621 cylinders
Units = cylinders of 4032 * 512 bytes

I knew that I had a 1.2Gb drive, but now I really know: 64 * 63 * 512 * 621 = 1281982464 bytes. I decide to reserve 128Mb of that space for swap, leaving 1153982464. If I use one of my primary partitions for swap, that means I have three left for ext2 partitions. Divided equally, that makes for 384Mb per partition. Now I get to work.

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-621, default 1): <RETURN>
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-621, default 621): +384M

Next, I set up the partition I want to use for swap:

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (197-621, default 197): <RETURN>
Using default value 197
Last cylinder or +size or +sizeM or +sizeK (197-621, default 621): +128M

Now the partition table looks like this:

   Device Boot   Start   End   Blocks   Id  System
/dev/hdb1            1   196   395104   83  Linux
/dev/hdb2          197   262   133056   83  Linux

I set up the remaining two partitions the same way I did the first. Finally, I make the first partition bootable:

Command (m for help): a
Partition number (1-4): 1

And I make the second partition of type swap:

Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): 82
Changed system type of partition 2 to 82 (Linux swap)
Command (m for help): p

The end result:

Disk /dev/hdb: 64 heads, 63 sectors, 621 cylinders
Units = cylinders of 4032 * 512 bytes

   Device Boot   Start   End   Blocks   Id  System
/dev/hdb1    *       1   196   395104+  83  Linux
/dev/hdb2          197   262   133056   82  Linux swap
/dev/hdb3          263   458   395136   83  Linux
/dev/hdb4          459   621   328608   83  Linux

Finally, I issue the write command (w) to write the table on the disk.

Mixed primary and logical partitions
The overview: Use one of the primary partitions to house all the extra partitions, then create logical partitions within it. Create the other primary partitions before or after creating the logical partitions.

Example: I start fdisk from the shell prompt:

# fdisk /dev/sda

which indicates that I am using the first drive on my SCSI chain. First I figure out how many partitions I want. I know my drive has a 183Gb capacity and I want 26Gb partitions (because I happen to have back-up tapes that are about that size). 183Gb / 26Gb = ~7, so I will need 7 partitions. Even though fdisk accepts partition sizes expressed in Mb and Kb, I decide to calculate the number of cylinders that will end up in each partition, because fdisk reports start and stop points in cylinders. I see when I enter fdisk that I have 22800 cylinders.

> The number of cylinders for this disk is set to 22800. There is
> nothing wrong with that, but this is larger than 1024, and could in
> certain setups cause problems with: 1) software that runs at boot
> time (e.g., LILO) 2) booting and partitioning software from other
> OSs (e.g., DOS FDISK, OS/2 FDISK)

So, 22800 total cylinders divided by seven partitions is 3258 cylinders. Each partition will be about 3258 cylinders long. I ignore the warning message because this is not my boot drive. Since I have 4 primary partitions, 3 of them can be 3258 long. The extended partition will have to be (4 * 3258), or 13032, cylinders long in order to contain the 4 logical partitions.

I enter the following commands to set up the first of the 3 primary partitions (stuff I type is bold):

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-22800, default 1): <RETURN>
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-22800, default 22800): 3258

The last partition is the extended partition:

Partition number (1-4): 4
First cylinder (9775-22800, default 9775): <RETURN>
Using default value 9775
Last cylinder or +size or +sizeM or +sizeK (9775-22800, default 22800): <RETURN>
Using default value 22800

The result, when I issue the print table command, is:

/dev/sda1            1    3258    26169853+  83  Linux
/dev/sda2         3259    6516    26169885   83  Linux
/dev/sda3         6517    9774    26169885   83  Linux
/dev/sda4         9775   22800   104631345    5  Extended

Next I segment the extended partition into 4 logical partitions, starting with the first logical partition, in 3258-cylinder segments. The logical partitions automatically start from /dev/sda5.

Command (m for help): n
First cylinder (9775-22800, default 9775): <RETURN>
Using default value 9775
Last cylinder or +size or +sizeM or +sizeK (9775-22800, default 22800): 13032

The end result is:

   Device Boot   Start     End      Blocks     Id  System
/dev/sda1             1    3258    26169853+   83  Linux
/dev/sda2          3259    6516    26169885    83  Linux
/dev/sda3          6517    9774    26169885    83  Linux
/dev/sda4          9775   22800   104631345     5  Extended
/dev/sda5          9775   13032    26169853+   83  Linux
/dev/sda6         13033   16290    26169853+   83  Linux
/dev/sda7         16291   19584    26459023+   83  Linux
/dev/sda8         19585   22800    25832488+   83  Linux

Finally, I issue the write command (w) to write the table on the disk. To make the partitions usable, I will have to format each partition and then mount it.
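The formatting-and-mounting step can be scripted once the table is written. This is a dry-run sketch that only echoes the commands one would run as root; the /mnt/partN mount points are hypothetical, and sda4 is skipped because it is the extended container, not a usable filesystem.

```shell
# Echo (do not execute) the mke2fs + mount commands for each usable
# partition from the example table above. Logical partitions are
# numbered 5-8; 4 is the extended partition and holds no filesystem.
mk_cmds() {
  for part in 1 2 3 5 6 7 8; do
    echo "mke2fs /dev/sda$part && mount /dev/sda$part /mnt/part$part"
  done
}
mk_cmds
```

Dropping the echo (and creating the mount points first) would perform the real work, destroying any data on those partitions, so the dry run is worth inspecting before committing.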

Submitted Examples
I'd like to submit my partition layout, because it works well with any distribution of Linux (even big RPM-based ones). I have one hard drive that ... is 10 gigs, exactly. Windows can't see above 9.3 gigs of it, but Linux can see it all, and use it all. It also has many more than 1024 cylinders.

Table 7. Partition layout example

Partition    Mount point             Size
/dev/hda1    /boot                   (15 megs)
/dev/hda2    windows 98 partition    (2 gigs)
/dev/hda3    extended                (N/A)
/dev/hda5    swap space              (64 megs)
/dev/hda6    /tmp                    (50 megs)
/dev/hda7    /                       (150 megs)
/dev/hda8    /usr                    (1.5 gigs)
/dev/hda9    /home                   (rest of drive)

LVM
LVM is a logical volume manager for the Linux kernel. It was originally written in 1998 by Heinz Mauelshagen, who based its design on that of the LVM in HP-UX. The installers for the Red Hat, MontaVista Linux, SLED, Debian GNU/Linux, and Ubuntu distributions are LVM-aware and can install a bootable system with a root filesystem on a logical volume.

Features
The LVM can:

- Resize volume groups online by absorbing new physical volumes (PVs) or ejecting existing ones.
- Resize logical volumes online by concatenating extents onto them or truncating extents from them.
- Create read-only snapshots of logical volumes (LVM1).
- Create read-write snapshots of logical volumes (LVM2).
- Stripe whole or parts of logical volumes across multiple PVs, in a fashion similar to RAID 0.
- Mirror whole or parts of logical volumes, in a fashion similar to RAID 1.
- Move online logical volumes between PVs.
- Split or merge volume groups in situ (as long as no logical volumes span the split). This can be useful when migrating whole logical volumes to or from offline storage.

Missing features
LVM cannot provide parity based redundancy similar to RAID4, RAID5, or RAID6.

Implementation
LVM keeps a metadata header at the start of every PV, each of which is uniquely identified by a UUID. Each PV's header is a complete copy of the entire volume group's layout, including the UUIDs of all other PVs, the UUIDs of all logical volumes and an allocation map of PEs to LEs.

In the 2.6-series Linux kernels, the LVM is implemented in terms of the device mapper, a block-level scheme for creating virtual block devices and mapping their contents onto other block devices. This minimizes the amount of the relatively hard-to-debug kernel code needed to implement the LVM and also allows its I/O redirection services to be shared with other volume managers (such as EVMS). Any LVM-specific code is pushed out into its user-space tools.

To bring a volume group online, for example, the "vgchange" tool:

1. Searches for PVs in all available block devices.
2. Parses the metadata header in each PV found.
3. Computes the layouts of all visible volume groups.
4. Loops over each logical volume in the volume group to be brought online and:
   a. Checks if the logical volume to be brought online has all its PVs visible.
   b. Creates a new, empty device mapping.
   c. Maps it (with the "linear" target) onto the data areas of the PVs the logical volume belongs to.

To move an online logical volume between PVs, the "pvmove" tool:

1. Creates a new, empty device mapping for the destination.
2. Applies the "mirror" target to the original and destination maps. The kernel will start the mirror in "degraded" mode and begin copying data from the original to the destination to bring it into sync.
3. Replaces the original mapping with the destination when the mirror comes into sync, then destroys the original.

These device mapper operations take place transparently, without applications or filesystems being aware that their underlying storage is moving.
Example: A Basic File Server

A simple, practical example of LVM use is a traditional file server, which provides centralized backup, storage space for media files, and shared file space for several family


members' computers. Flexibility is a key requirement; who knows what storage challenges next year's technology will bring? For example, suppose your requirements are:

400G - Large media file storage
50G  - Online backups of two laptops and three desktops (10G each)
10G  - Shared files

Ultimately, these requirements may increase a great deal over the next year or two, but exactly how much and which partition will grow the most are still unknown.

Disk Hardware
Traditionally, a file server uses SCSI disks, but today SATA disks offer an attractive combination of speed and low cost. At the time of this writing, 250 GB SATA drives are commonly available for around $100; for a terabyte, the cost is around $400. SATA drives are not named like ATA drives (hda, hdb), but like SCSI (sda, sdb). Once the system has booted with SATA support, it has four physical devices to work with:

/dev/sda   251.0 GB
/dev/sdb   251.0 GB
/dev/sdc   251.0 GB
/dev/sdd   251.0 GB

Next, partition these for use with LVM. You can do this with fdisk by specifying the "Linux LVM" partition type 8e. The finished product looks like this:

# fdisk -l /dev/sdd

Disk /dev/sdd: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device    Start     End      Blocks   Id  System
/dev/sdd1        1   30515   245111706   8e  Linux LVM

Notice the partition type is 8e, or "Linux LVM."

Creating a Virtual Volume
Initialize each of the new partitions using the pvcreate command:

# pvcreate /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1


This sets up all the partitions on these drives for use under LVM, allowing creation of volume groups. To examine available PVs, use the pvdisplay command. This system will use a single volume group named datavg:

# vgcreate datavg /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

Use vgdisplay to see the newly created datavg VG with the four drives stitched together. Now create the logical volumes within it:

# lvcreate --name medialv --size 400G datavg
# lvcreate --name backuplv --size 50G datavg
# lvcreate --name sharelv --size 10G datavg

Without LVM, you might allocate all available disk space to the partitions you're creating, but with LVM, it is worthwhile to be conservative, allocating only half the available space to the current requirements. As a general rule, it's easier to grow a filesystem than to shrink it, so it's a good strategy to allocate exactly what you need today, and leave the remaining space unallocated until your needs become clearer. This method also gives you the option of creating new volumes when new needs arise (such as a separate encrypted file share for sensitive data). To examine these volumes, use the lvdisplay command. Now you have several nicely named logical volumes at your disposal:

/dev/datavg/backuplv   (also /dev/mapper/datavg-backuplv)
/dev/datavg/medialv    (also /dev/mapper/datavg-medialv)
/dev/datavg/sharelv    (also /dev/mapper/datavg-sharelv)
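The "grow later" strategy pays off when a volume fills up. This dry-run sketch echoes (rather than executes, since both commands require root and a real VG) the usual two-step growth: lvextend takes space from the volume group's free pool, then resize2fs grows the filesystem to fill it. The +100G figure is an arbitrary example, not from the text.

```shell
# Echo the commands that would later grow medialv by 100G.
vg=datavg; lv=medialv; grow=+100G
grow_cmds() {
  echo "lvextend --size $grow /dev/$vg/$lv"
  echo "resize2fs /dev/$vg/$lv"
}
grow_cmds
```

Doing it in this order matters: extend the logical volume first, then the filesystem; shrinking reverses the order and is the riskier, offline operation, which is exactly why conservative initial allocation is recommended above.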


UNIX Summary

Typographical conventions
In what follows, we shall use the following typographical conventions:
- Characters written in bold typewriter font are commands to be typed into the computer as they stand.
- Characters written in italic typewriter font indicate non-specific file or directory names.
- Words inserted within square brackets, such as [Ctrl], indicate keys to be pressed.

So, for example,

% ls anydirectory [Enter]

means "at the UNIX prompt %, type ls followed by the name of some directory, then press the key marked Enter". Don't forget to press the [Enter] key: commands are not sent to the computer until this is done. Note: UNIX is case-sensitive, so LS is not the same as ls. The same applies to filenames, so myfile.txt, MyFile.txt and MYFILE.TXT are three separate files. Beware if copying files to a PC, since DOS and Windows do not make this distinction.

Introduction
This session concerns UNIX, which is a common operating system. By operating system, we mean the suite of programs which make the computer work. UNIX is used by the workstations and multi-user servers within the school. On X terminals and the workstations, the X Window System provides a graphical interface between the user and UNIX. However, knowledge of UNIX is required for operations which aren't covered by a graphical program, or for when there is no X Window System, for example, in a telnet session.


The UNIX operating system
The UNIX operating system is made up of three parts: the kernel, the shell and the programs.

The kernel
The kernel of UNIX is the hub of the operating system: it allocates time and memory to programs and handles the filestore and communications in response to system calls. As an illustration of the way that the shell and the kernel work together, suppose a user types rm myfile (which has the effect of removing the file myfile). The shell searches the filestore for the file containing the program rm, and then requests the kernel, through system calls, to execute the program rm on myfile. When the process rm myfile has finished running, the shell then returns the UNIX prompt % to the user, indicating that it is waiting for further commands.

The shell
The shell acts as an interface between the user and the kernel. When a user logs in, the login program checks the username and password, and then starts another program called the shell. The shell is a command line interpreter (CLI). It interprets the commands the user types in and arranges for them to be carried out. The commands are themselves programs: when they terminate, the shell gives the user another prompt (% on our systems). The adept user can customise his/her own shell, and users can use different shells on the same machine. Staff and students in the school have the tcsh shell by default. The tcsh shell has certain features to help the user input commands.

Filename Completion - By typing part of the name of a command, filename or directory and pressing the [Tab] key, the tcsh shell will complete the rest of the name automatically. If the shell finds more than one name beginning with the letters you have typed, it will beep, prompting you to type a few more letters before pressing the tab key again.

History - The shell keeps a list of the commands you have typed in. If you need to repeat a command, use the cursor keys to scroll up and down the list, or type history for a list of previous commands.


Files and processes
Everything in UNIX is either a file or a process. A process is an executing program identified by a unique PID (process identifier). A file is a collection of data. They are created by users using text editors, running compilers etc. Examples of files:
• a document (report, essay, etc.)
• the text of a program written in some high-level programming language
• instructions comprehensible directly to the machine and incomprehensible to a casual user, for example, a collection of binary digits (an executable or binary file)
• a directory, containing information about its contents, which may be a mixture of other directories (subdirectories) and ordinary files

The Directory Structure
All the files are grouped together in the directory structure. The file-system is arranged in a hierarchical structure, like an inverted tree. The top of the hierarchy is traditionally called root.

In the diagram above, we see that the directory ee51ab contains the subdirectory unixstuff and a file proj.txt.

Starting an Xterminal session


To start an Xterm session, click on the Unix Terminal icon on your desktop, or use the drop-down menus.

An Xterminal window will appear with a Unix prompt, waiting for you to start entering commands.


Part One

1.1 Listing files and directories
ls (list)
When you first log in, your current working directory is your home directory. Your home directory has the same name as your user-name, for example, ee91ab, and it is where your personal files and subdirectories are saved. To find out what is in your home directory, type

% ls

(short for list). The ls command lists the contents of your current working directory. There may be no files visible in your home directory, in which case the UNIX prompt will be returned. Alternatively, there may already be some files inserted by the System Administrator when your account was created.

ls does not, in fact, cause all the files in your home directory to be listed, but only those whose names do not begin with a dot (.). Files beginning with a dot (.) are known as hidden files and usually contain important program configuration information. They are hidden because you should not change them unless you are very familiar with UNIX! To list all files in your home directory, including those whose names begin with a dot, type


% ls -a

ls is an example of a command which can take options: -a is an example of an option. The options change the behaviour of the command. There are online manual pages that tell you which options a particular command can take, and how each option modifies the behaviour of the command. (See later in this tutorial.)

1.2 Making Directories
mkdir (make directory)
We will now make a subdirectory in your home directory to hold the files you will be creating and using in the course of this tutorial. To make a subdirectory called unixstuff in your current working directory type % mkdir unixstuff To see the directory you have just created, type % ls
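The mkdir and ls steps can be rehearsed safely outside your home directory; this sketch uses mktemp -d to make a throwaway scratch directory first:

```shell
# Work in a throwaway directory so we don't touch the real home directory.
scratch=$(mktemp -d)
cd "$scratch"

mkdir unixstuff    # make the subdirectory
ls                 # lists: unixstuff
```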

1.3 Changing to a different directory
cd (change directory)
The command cd directory means change the current working directory to 'directory'. The current working directory may be thought of as the directory you are in, i.e. your current position in the file-system tree. To change to the directory you have just made, type % cd unixstuff Type ls to see the contents (which should be empty)

Exercise 1a
Make another directory inside the unixstuff directory called backups

1.4 The directories . and ..
Still in the unixstuff directory, type

% ls -a

As you can see, in the unixstuff directory (and in all other directories), there are two special directories called (.) and (..)

In UNIX, (.) means the current directory, so typing

% cd .

(NOTE: there is a space between cd and the dot) means stay where you are (the unixstuff directory). This may not seem very useful at first, but using (.) as the name of the current directory will save a lot of typing, as we shall see later in the tutorial.

(..) means the parent of the current directory, so typing % cd .. will take you one directory up the hierarchy (back to your home directory). Try it now. Note: typing cd with no argument always returns you to your home directory. This is very useful if you are lost in the file system.
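A quick way to convince yourself of what (.) and (..) do is to compare the output of pwd before and after each cd; a minimal sketch, again in a scratch directory:

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/unixstuff"
cd "$scratch/unixstuff"

before=$(pwd)
cd .                    # (.) is the current directory: we stay put
after_dot=$(pwd)
cd ..                   # (..) is the parent: one level up
after_dotdot=$(pwd)
echo "$before"; echo "$after_dot"; echo "$after_dotdot"
```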

1.5 Pathnames
pwd (print working directory)
Pathnames enable you to work out where you are in relation to the whole file-system. For example, to find out the absolute pathname of your home-directory, type cd to get back to your home-directory and then type % pwd The full pathname will look something like this /a/fservb/fservb/fservb22/eebeng99/ee91ab which means that ee91ab (your home directory) is in the directory eebeng99 (the group directory),which is located on the fservb file-server. Note:


/a/fservb/fservb/fservb22/eebeng99/ee91ab can be shortened to /user/eebeng99/ee91ab

Exercise 1b
Use the commands ls, pwd and cd to explore the file system. (Remember, if you get lost, type cd by itself to return to your home-directory)

1.6 More about home directories and pathnames
Understanding pathnames
First type cd to get back to your home-directory, then type % ls unixstuff to list the contents of your unixstuff directory.

Now type % ls backups You will get a message like this backups: No such file or directory The reason is, backups is not in your current working directory. To use a command on a file (or directory) not in the current working directory (the directory you are currently in), you must either cd to the correct directory, or specify its full pathname. To list the contents of your backups directory, you must type % ls unixstuff/backups


~ (your home directory)
Home directories can also be referred to by the tilde ~ character. It can be used to specify paths starting at your home directory. So typing % ls ~/unixstuff will list the contents of your unixstuff directory, no matter where you currently are in the file system. What do you think % ls ~ would list? What do you think % ls ~/.. would list?
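You can check what ~ expands to without changing directory at all; echo shows the expansion (a sketch, assuming an ordinary login environment where HOME is set):

```shell
cd /tmp                 # somewhere away from home
echo ~                  # the shell expands ~ to your home directory path
home=$(echo ~)
ls ~ > /dev/null        # list home contents from anywhere
ls ~/.. > /dev/null     # list the parent of your home directory
```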

Summary
ls              list files and directories
ls -a           list all files and directories
mkdir           make a directory
cd directory    change to named directory
cd              change to home-directory
cd ~            change to home-directory
cd ..           change to parent directory
pwd             display the path of the current directory


Part Two

2.1 Copying Files
cp (copy)
cp file1 file2 is the command which makes a copy of file1 in the current working directory and calls it file2. What we are going to do now is to take a file stored in an open access area of the file system, and use the cp command to copy it to your unixstuff directory. First, cd to your unixstuff directory.

% cd ~/unixstuff

Then at the UNIX prompt, type

% cp /vol/examples/tutorial/science.txt .

(Note: Don't forget the dot (.) at the end. Remember, in UNIX, the dot means the current directory.) The above command means copy the file science.txt to the current directory, keeping the name the same. (Note: The directory /vol/examples/tutorial/ is an area to which everyone in the department has read and copy access. If you are from outside the University, you can grab a copy of the file here. Use 'File/Save As..' from the menu bar to save it into your unixstuff directory.)
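If you don't have access to /vol/examples/tutorial, the same cp commands can be rehearsed with a stand-in file; the directory names below are made up for the exercise:

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir examples unixstuff
echo "a stand-in for science.txt" > examples/science.txt

cd unixstuff
cp ../examples/science.txt .    # copy into the current directory, same name
ls                              # lists: science.txt
```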

Exercise 2a
Create a backup of your science.txt file by copying it to a file called science.bak

2.2 Moving files
mv (move)
mv file1 file2 moves (or renames) file1 to file2


To move a file from one place to another, use the mv command. This has the effect of moving rather than copying the file, so you end up with only one file rather than two. It can also be used to rename a file, by moving the file to the same directory, but giving it a different name. We are now going to move the file science.bak to your backup directory. First, change directories to your unixstuff directory (can you remember how?). Then, inside the unixstuff directory, type % mv science.bak backups/. Type ls and ls backups to see if it has worked.

2.3 Removing files and directories
rm (remove), rmdir (remove directory)
To delete (remove) a file, use the rm command. As an example, we are going to create a copy of the science.txt file then delete it. Inside your unixstuff directory, type % cp science.txt tempfile.txt % ls (to check if it has created the file) % rm tempfile.txt % ls (to check if it has deleted the file) You can use the rmdir command to remove a directory (make sure it is empty first). Try to remove the backups directory. You will not be able to since UNIX will not let you remove a non-empty directory.
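The same sequence can be scripted end to end; note how rmdir succeeds on an empty directory but refuses a non-empty one (a sketch in a scratch directory):

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo hello > tempfile.txt
rm tempfile.txt                 # the file is gone

mkdir emptydir
rmdir emptydir                  # fine: the directory is empty

mkdir backups
touch backups/science.bak
if rmdir backups 2>/dev/null; then
    echo "removed"              # will not happen
else
    echo "rmdir refused: backups is not empty"
fi
```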

Exercise 2b
Create a directory called tempstuff using mkdir , then remove it using the rmdir command.


2.4 Displaying the contents of a file on the screen
clear (clear screen)
Before you start the next section, you may like to clear the terminal window of the previous commands so the output of the following commands can be clearly understood. At the prompt, type % clear This will clear all text and leave you with the % prompt at the top of the window.

cat (concatenate)
The command cat can be used to display the contents of a file on the screen. Type: % cat science.txt As you can see, the file is longer than the size of the window, so it scrolls past, making it unreadable.

less
The command less writes the contents of a file onto the screen a page at a time. Type % less science.txt Press the [space-bar] if you want to see another page, type [q] if you want to quit reading. As you can see, less is used in preference to cat for long files.

head
The head command writes the first ten lines of a file to the screen. First clear the screen then type % head science.txt

Then type % head -5 science.txt What difference did the -5 do to the head command?

tail
The tail command writes the last ten lines of a file to the screen. Clear the screen and type % tail science.txt How can you view the last 15 lines of the file?
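With a numbered sample file it is easy to see exactly which lines head and tail pick out; this sketch uses seq to generate the sample:

```shell
scratch=$(mktemp -d)
cd "$scratch"
seq 1 20 > numbers.txt     # a 20-line file: 1, 2, ..., 20

head numbers.txt           # lines 1-10
head -5 numbers.txt        # lines 1-5
tail numbers.txt           # lines 11-20
tail -15 numbers.txt       # lines 6-20: the answer to the question above
```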

2.5 Searching the contents of a file
Simple searching using less
Using less, you can search through a text file for a keyword (pattern). For example, to search through science.txt for the word 'science', type % less science.txt then, still in less (i.e. don't press [q] to quit), type a forward slash [/] followed by the word to search /science As you can see, less finds and highlights the keyword. Type [n] to search for the next occurrence of the word.

grep (don't ask why it is called grep)
grep is one of many standard UNIX utilities. It searches files for specified words or patterns. First clear the screen, then type % grep science science.txt


As you can see, grep has printed out each line containing the word science. Or has it? Try typing

% grep Science science.txt

The grep command is case sensitive; it distinguishes between Science and science. To ignore upper/lower case distinctions, use the -i option, i.e. type

% grep -i science science.txt

To search for a phrase or pattern, you must enclose it in single quotes (the apostrophe symbol). For example, to search for spinning top, type

% grep -i 'spinning top' science.txt

Some of the other options of grep are:

-v    display those lines that do NOT match
-n    precede each matching line with the line number
-c    print only the total count of matched lines

Try some of them and see the different results. Don't forget, you can use more than one option at a time. For example, to find the number of lines without the words science or Science, type

% grep -ivc science science.txt
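The case and count options are easiest to check against a small file whose contents you know; here a four-line stand-in for science.txt is created with a here-document:

```shell
scratch=$(mktemp -d)
cd "$scratch"
# A four-line stand-in for science.txt:
cat > science.txt <<'EOF'
Science is fun.
We like science.
Nothing to see here.
A spinning top keeps spinning.
EOF

grep science science.txt            # only line 2 (case matters)
grep -i science science.txt         # lines 1 and 2
grep -i 'spinning top' science.txt  # line 4
grep -ivc science science.txt       # prints 2: lines 3 and 4 match neither case
```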

wc (word count)
A handy little utility is the wc command, short for word count. To do a word count on science.txt, type % wc -w science.txt To find out how many lines the file has, type % wc -l science.txt
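A tiny two-line file makes the counts easy to verify by eye; a sketch:

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'one two three\nfour five\n' > sample.txt

wc -w sample.txt    # 5 words
wc -l sample.txt    # 2 lines
wc sample.txt       # lines, words and characters in one go
```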


Summary

cp file1 file2        copy file1 and call it file2
mv file1 file2        move or rename file1 to file2
rm file               remove a file
rmdir directory       remove a directory
cat file              display a file
less file             display a file a page at a time
head file             display the first few lines of a file
tail file             display the last few lines of a file
grep 'keyword' file   search a file for keywords
wc file               count number of lines/words/characters in file

Part Three

3.1 Redirection
Most processes initiated by UNIX commands write to the standard output (that is, they write to the terminal screen), and many take their input from the standard input (that is, they read it from the keyboard). There is also the standard error, where processes write their error messages; by default, these also go to the terminal screen. We have already seen one use of the cat command to write the contents of a file to the screen. Now type cat without specifying a file to read % cat Then type a few words on the keyboard and press the [Return] key. Finally hold the [Ctrl] key down and press [d] (written as ^D for short) to end the input.


What has happened? If you run the cat command without specifying a file to read, it reads the standard input (the keyboard), and on receiving the 'end of file' (^D), copies it to the standard output (the screen). In UNIX, we can redirect both the input and the output of commands.

3.2 Redirecting the Output
We use the > symbol to redirect the output of a command. For example, to create a file called list1 containing a list of fruit, type

% cat > list1

Then type in the names of some fruit. Press [Return] after each one.

pear
banana
apple
^D (Control D to stop)

What happens is the cat command reads the standard input (the keyboard) and the > redirects the output, which normally goes to the screen, into a file called list1. To read the contents of the file, type

% cat list1

Exercise 3a
Using the above method, create another file called list2 containing the following fruit: orange, plum, mango, grapefruit. Read the contents of list2.

The form >> appends standard output to a file. So to add more items to the file list1, type

% cat >> list1

Then type in the names of more fruit

peach
grape
orange
^D (Control D to stop)


To read the contents of the file, type % cat list1 You should now have two files. One contains six fruit, the other contains four fruit. We will now use the cat command to join (concatenate) list1 and list2 into a new file called biglist. Type % cat list1 list2 > biglist What this is doing is reading the contents of list1 and list2 in turn, then outputting the text to the file biglist To read the contents of the new file, type % cat biglist
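The same list1/list2/biglist session can be reproduced non-interactively, with printf standing in for the keyboard input (so no ^D is needed):

```shell
scratch=$(mktemp -d)
cd "$scratch"

printf 'pear\nbanana\napple\n' > list1            # > creates list1
printf 'peach\ngrape\norange\n' >> list1          # >> appends to it
printf 'orange\nplum\nmango\ngrapefruit\n' > list2

cat list1 list2 > biglist      # concatenate the two lists
cat biglist                    # six fruit, then four fruit
```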

3.3 Redirecting the Input
We use the < symbol to redirect the input of a command. The command sort alphabetically or numerically sorts a list. Type

% sort

Then type in the names of some vegetables. Press [Return] after each one.

carrot
beetroot
artichoke
^D (control d to stop)

The output will be

artichoke
beetroot
carrot

Using < you can redirect the input to come from a file rather than the keyboard. For example, to sort the list of fruit, type

% sort < biglist

and the sorted list will be output to the screen. To output the sorted list to a file, type

% sort < biglist > slist Use cat to read the contents of the file slist
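The vegetable example can be scripted: < feeds sort from a file, and > catches the sorted output in a new file:

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'carrot\nbeetroot\nartichoke\n' > veg

sort < veg            # artichoke, beetroot, carrot on the screen
sort < veg > sveg     # the same, captured in the file sveg
cat sveg
```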

3.4 Pipes
To see who is on the system with you, type % who One method to get a sorted list of names is to type, % who > names.txt % sort < names.txt This is a bit slow and you have to remember to remove the temporary file called names.txt when you have finished. What you really want to do is connect the output of the who command directly to the input of the sort command. This is exactly what pipes do. The symbol for a pipe is the vertical bar | For example, typing % who | sort will give the same result as above, but quicker and cleaner. To find out how many users are logged on, type % who | wc -l
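Since who needs a real login session, this sketch uses a small file of names to stand in for its output; the point is that the pipe gives the same result as the temporary file, in one line instead of two:

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'tom\nann\nsue\nbob\n' > names.txt   # stand-in for who's output

# the slow way: a temporary file
sort < names.txt > sorted.txt

# the pipe way: no temporary file to clean up
cat names.txt | sort

cat names.txt | wc -l     # like who | wc -l: prints 4
```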

Exercise 3b
a2ps -Phockney textfile is the command to print a postscript file to the printer hockney. Using pipes, print all lines of list1 and list2 containing the letter 'p', sort the result, and print to the printer hockney.


Summary
command > file            redirect standard output to a file
command >> file           append standard output to a file
command < file            redirect standard input from a file
command1 | command2       pipe the output of command1 to the input of command2
cat file1 file2 > file0   concatenate file1 and file2 to file0
sort                      sort data
who                       list users currently logged in
a2ps -Pprinter textfile   print text file to named printer
lpr -Pprinter psfile      print postscript file to named printer

Part Four

4.1 Wildcards
The characters * and ?
The character * is called a wildcard, and will match against zero or more characters in a file (or directory) name. For example, in your unixstuff directory, type % ls list* This will list all files in the current directory starting with list.... Try typing % ls *list This will list all files in the current directory ending with ....list The character ? will match exactly one character. So ls ?ouse will match files like house and mouse, but not grouse. Try typing % ls ?list
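To experiment with the patterns above without relying on files from earlier sections, create a few empty files first (touch creates them):

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch list1 list2 biglist slist house mouse grouse

ls list*     # list1 list2
ls *list     # biglist slist
ls ?ouse     # house mouse  (grouse has two characters before "ouse")
```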

4.2 Filename conventions

We should note here that a directory is merely a special type of file. So the rules and conventions for naming files apply also to directories. In naming files, characters with special meanings such as / * & % , should be avoided. Also, avoid using spaces within names. The safest way to name a file is to use only alphanumeric characters, that is, letters and numbers, together with _ (underscore) and . (dot). File names conventionally start with a lower-case letter, and may end with a dot followed by a group of letters indicating the contents of the file. For example, all files consisting of C code may be named with the ending .c, for example, prog1.c . Then in order to list all files containing C code in your home directory, you need only type ls *.c in that directory. Beware: some applications give the same name to all the output files they generate. For example, some compilers, unless given the appropriate option, produce compiled files named a.out. Should you forget to use that option, you are advised to rename the compiled file immediately, otherwise the next such file will overwrite it and it will be lost.

4.3 Getting Help
On-line Manuals
There are on-line manuals which give information about most commands. The manual pages tell you which options a particular command can take, and how each option modifies the behaviour of the command. Type man command to read the manual page for a particular command. For example, to find out more about the wc (word count) command, type % man wc Alternatively % whatis wc gives a one-line description of the command, but omits any information about options etc.

Apropos
When you are not sure of the exact name of a command,


% apropos keyword will give you the commands with keyword in their manual page header. For example, try typing % apropos copy

Summary
*                 match any number of characters
?                 match one character
man command       read the online manual page for a command
whatis command    brief description of a command
apropos keyword   match commands with keyword in their man pages

Part Five

5.1 File system security (access rights)
In your unixstuff directory, type % ls -l (l for long listing!) You will see that you now get lots of details about the contents of your directory, similar to the example below.


Each file (and directory) has associated access rights, which may be found by typing ls -l. Also, ls -lg gives additional information as to which group owns the file (beng95 in the following example):

-rwxrw-r-- 1 ee51ab beng95 2450 Sep 29 11:52 file1

In the left-hand column is a 10-symbol string consisting of the symbols d, r, w, x, -, and, occasionally, s or S. If d is present, it will be at the left-hand end of the string, and indicates a directory; otherwise - will be the starting symbol of the string. The 9 remaining symbols indicate the permissions, or access rights, and are taken as three groups of 3.
• The left group of 3 gives the file permissions for the user that owns the file (or directory) (ee51ab in the above example);
• the middle group gives the permissions for the group of people to whom the file (or directory) belongs (beng95 in the above example);
• the rightmost group gives the permissions for all others.

The symbols r, w, etc., have slightly different meanings depending on whether they refer to a simple file or to a directory.

Access rights on files.
• r (or -), indicates read permission (or otherwise), that is, the presence or absence of permission to read and copy the file
• w (or -), indicates write permission (or otherwise), that is, the permission (or otherwise) to change a file
• x (or -), indicates execution permission (or otherwise), that is, the permission to execute a file, where appropriate

Access rights on directories.
• r allows users to list files in the directory;
• w means that users may delete files from the directory or move files into it;
• x means the right to access files in the directory. This implies that you may read files in the directory provided you have read permission on the individual files.

So, in order to read a file, you must have execute permission on the directory containing that file, and hence on any directory containing that directory as a subdirectory, and so on, up the tree.


Some examples
-rwxrwxrwx   a file that everyone can read, write and execute (and delete).
-rw-------   a file that only the owner can read and write; no-one else can read or write it, and no-one has execution rights (e.g. your mailbox file).

5.2 Changing access rights
chmod (changing a file mode)
Only the owner of a file can use chmod to change the permissions of a file. The options of chmod are as follows:

Symbol   Meaning
u        user
g        group
o        other
a        all
r        read
w        write (and delete)
x        execute (and access directory)
+        add permission
-        take away permission

For example, to remove read write and execute permissions on the file biglist for the group and others, type % chmod go-rwx biglist This will leave the other permissions unaffected. To give read and write permissions on the file biglist to all, % chmod a+rw biglist
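You can watch the permission bits change with stat (the -c %a form prints the octal mode on Linux); the starting mode is set explicitly here so the result does not depend on your umask:

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch biglist
chmod u=rw,go=r biglist     # known starting point: rw-r--r-- (644)

chmod go-rwx biglist        # strip everything from group and others
stat -c %a biglist          # prints 600

chmod a+rw biglist          # read and write for everyone
stat -c %a biglist          # prints 666
```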


Exercise 5a
Try changing access permissions on the file science.txt and on the directory backups Use ls -l to check that the permissions have changed.

5.3 Processes and Jobs
A process is an executing program identified by a unique PID (process identifier). To see information about your processes, with their associated PID and status, type % ps A process may be in the foreground, in the background, or be suspended. In general the shell does not return the UNIX prompt until the current process has finished executing. Some processes take a long time to run and hold up the terminal. Backgrounding a long process has the effect that the UNIX prompt is returned immediately, and other tasks can be carried out while the original process continues executing.

Running background processes
To background a process, type an & at the end of the command line. For example, the command sleep waits a given number of seconds before continuing. Type

% sleep 10

This will wait 10 seconds before returning the command prompt %. Until the command prompt is returned, you can do nothing except wait. To run sleep in the background, type

% sleep 10 &
[1] 6259

The & runs the job in the background and returns the prompt straight away, allowing you to run other programs while waiting for that one to finish. The first line in the above example is typed in by the user; the next line, indicating job number and PID, is returned by the machine. The user is notified of a job number (numbered from 1) enclosed in square brackets, together with a PID, and is notified when a background process is finished. Backgrounding is useful for jobs which will take a long time to complete.


Backgrounding a current foreground process
At the prompt, type % sleep 100 You can suspend the process running in the foreground by holding down the [control] key and typing [z] (written as ^Z) Then to put it in the background, type % bg Note: do not background programs that require user interaction e.g. pine
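^Z and bg need an interactive terminal, but the & form can be shown in a plain script; $! (the PID of the most recent background job) and wait are the scripted counterparts of the job-control commands:

```shell
sleep 2 &                 # & backgrounds the job; the prompt returns at once
bgpid=$!                  # $! holds the background job's PID
echo "background job $bgpid is running"
date                      # we are free to run other commands meanwhile
wait "$bgpid"             # block until the background job finishes
echo "job $bgpid finished"
```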

5.4 Listing suspended and background processes
When a process is running, backgrounded or suspended, it will be entered onto a list along with a job number. To examine this list, type

% jobs

An example of a job list could be

[1] Suspended  sleep 100
[2] Running    netscape
[3] Running    nedit

To restart (foreground) a suspended process, type

% fg %jobnumber

For example, to restart sleep 100, type

% fg %1

Typing fg with no job number foregrounds the last suspended process.

5.5 Killing a process
kill (terminate or signal a process)
It is sometimes necessary to kill a process (for example, when an executing program is in an infinite loop). To kill a job running in the foreground, type ^C (control c). For example, run


% sleep 100 ^C To kill a suspended or background process, type % kill %jobnumber For example, run % sleep 100 & % jobs If it is job number 4, type % kill %4 To check whether this has worked, examine the job list again to see if the process has been removed.

ps (process status)
Alternatively, processes can be killed by finding their process numbers (PIDs) and using kill PID_number

% sleep 100 &
% ps

  PID TT    S  TIME COMMAND
20077 pts/5 S  0:05 sleep 100
21563 pts/5 T  0:00 netscape
21873 pts/5 S  0:25 nedit

To kill off the process sleep 100, type

% kill 20077

and then type ps again to see if it has been removed from the list. If a process refuses to be killed, use the -9 option, i.e. type

% kill -9 20077

Note: It is not possible to kill off other users' processes!
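In a script you can capture the PID with $! instead of reading it off the ps listing; a sketch of killing a background sleep:

```shell
sleep 100 &                        # a long-running background job
pid=$!
kill "$pid"                        # send the default TERM signal
wait "$pid" 2>/dev/null || true    # reap it; a killed job exits non-zero
if kill -0 "$pid" 2>/dev/null; then
    echo "still running"
else
    echo "process $pid is gone"
fi
```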

Summary

ls -lag                list access rights for all files
chmod [options] file   change access rights for named file
command &              run command in background
^C                     kill the job running in the foreground
^Z                     suspend the job running in the foreground
bg                     background the suspended job
jobs                   list current jobs
fg %1                  foreground job number 1
kill %1                kill job number 1
ps                     list current processes
kill 26152             kill process number 26152

Part Six

Other useful UNIX commands
quota
All students are allocated a certain amount of disk space on the file system for their personal files, usually about 100Mb. If you go over your quota, you are given 7 days to remove excess files. To check your current quota and how much of it you have used, type % quota -v

df
The df command reports on the space left on the file system. For example, to find out how much space is left on the fileserver, type % df .


du
The du command outputs the number of kilobytes used by each subdirectory. Useful if you have gone over quota and you want to find out which directory has the most files. In your home-directory, type % du

compress
This reduces the size of a file, thus freeing valuable disk space. For example, type % ls -l science.txt and note the size of the file. Then to compress science.txt, type % compress science.txt This will compress the file and place it in a file called science.txt.Z To see the change in size, type ls -l again. To uncompress the file, use the uncompress command. % uncompress science.txt.Z

gzip
This also compresses a file, and is more efficient than compress. For example, to zip science.txt, type % gzip science.txt This will zip the file and place it in a file called science.txt.gz To unzip the file, use the gunzip command. % gunzip science.txt.gz
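A quick round trip shows the size saving; here seq makes a repetitive (highly compressible) sample file, and gunzip restores it exactly:

```shell
scratch=$(mktemp -d)
cd "$scratch"
seq 1 1000 > science.txt          # a compressible stand-in file
wc -c science.txt                 # size before

gzip science.txt                  # replaces it with science.txt.gz
wc -c science.txt.gz              # noticeably smaller

gunzip science.txt.gz             # restores science.txt
```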

file
file classifies the named files according to the type of data they contain, for example ascii (text), pictures, compressed data, etc.. To report on all files in your home directory, type % file *


history
The C shell keeps an ordered list of all the commands that you have entered. Each command is given a number according to the order it was entered. % history (show command history list) If you are using the C shell, you can use the exclamation character (!) to recall commands easily. % !! (recall last command) % !-3 (recall third most recent command) % !5 (recall 5th command in list) % !grep (recall last command starting with grep) You can increase the size of the history buffer by typing % set history=100

Part Seven

7.1 Compiling UNIX software packages
We have many public domain and commercial software packages installed on our systems, which are available to all users. However, students are allowed to download and install small software packages in their own home directory, software usually only useful to them personally. There are a number of steps needed to install the software.
• Locate and download the source code (which is usually compressed)
• Unpack the source code
• Compile the code
• Install the resulting executable
• Set paths to the installation directory

Of the above steps, probably the most difficult is the compilation stage.


Compiling Source Code
All high-level language code must be converted into a form the computer understands. For example, C language source code is converted into a lower-level language called assembly language. The assembly language code made by the previous stage is then converted into object code: fragments of code which the computer understands directly. The final stage in compiling a program involves linking the object code to code libraries which contain certain built-in functions. This final stage produces an executable program. To do all these steps by hand is complicated and beyond the capability of the ordinary user. A number of utilities and tools have been developed for programmers and end-users to simplify these steps.

make and the Makefile
The make command allows programmers to manage large programs or groups of programs. It aids in developing large programs by keeping track of which portions of the entire program have been changed, compiling only those parts of the program which have changed since the last compile. The make program gets its set of compile rules from a text file called Makefile which resides in the same directory as the source files. It contains information on how to compile the software, e.g. the optimisation level, whether to include debugging info in the executable. It also contains information on where to install the finished compiled binaries (executables), manual pages, data files, dependent library files, configuration files, etc. Some packages require you to edit the Makefile by hand to set the final installation directory and any other parameters. However, many packages are now being distributed with the GNU configure utility.

configure
As the number of UNIX variants increased, it became harder to write programs which could run on all variants. Developers frequently did not have access to every system, and the characteristics of some systems changed from version to version. The GNU configure and build system simplifies the building of programs distributed as source code. All programs are built using a simple, standardised, two step process. The program builder need not install any special tools in order to build the program. The configure shell script attempts to guess correct values for various system-dependent variables used during compilation. It uses those values to create a Makefile in each directory of the package. The simplest way to compile a package is:

1. cd to the directory containing the package's source code.
2. Type ./configure to configure the package for your system.
3. Type make to compile the package.
4. Optionally, type make check to run any self-tests that come with the package.
5. Type make install to install the programs and any data files and documentation.
6. Optionally, type make clean to remove the program binaries and object files from the source code directory.

The configure utility supports a wide variety of options. You can usually use the --help option to get a list of interesting options for a particular configure script. The only generic options you are likely to use are the --prefix and --exec-prefix options. These options are used to specify the installation directories. The directory named by the --prefix option will hold machine-independent files such as documentation, data and configuration files. The directory named by the --exec-prefix option (which is normally a subdirectory of the --prefix directory) will hold machine-dependent files such as executables.

7.2 Downloading source code
For this example, we will download a piece of free software that converts between different units of measurement.

First create a download directory:

% mkdir download

Download the software and save it to your new download directory.

7.3 Extracting the source code
Go into your download directory and list the contents.

% cd download
% ls -l

As you can see, the filename ends in tar.gz. The tar command turns several files and directories into one single tar file. This is then compressed using the gzip command (to create a tar.gz file).
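The round trip can be sketched with throwaway files rather than the actual units package (/tmp/tardemo and pkg/main.c are made up for the demonstration):

```shell
# Bundle a directory into a .tar, compress it, then reverse both steps.
mkdir -p /tmp/tardemo/pkg && cd /tmp/tardemo
echo "some source" > pkg/main.c

tar -cvf pkg.tar pkg        # several files -> one single tar file
gzip pkg.tar                # compress it -> pkg.tar.gz

rm -r pkg                   # pretend we have only the downloaded pkg.tar.gz
gunzip pkg.tar.gz           # uncompress -> pkg.tar
tar -xvf pkg.tar            # extract the original directory tree
cat pkg/main.c
```

The same gunzip/tar -xvf pair is used on the real units-1.74.tar.gz in the next section.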


First unzip the file using the gunzip command. This will create a .tar file.

% gunzip units-1.74.tar.gz

Then extract the contents of the tar file.

% tar -xvf units-1.74.tar

Again, list the contents of the download directory, then go to the units-1.74 subdirectory.

% ls -l
% cd units-1.74

7.4 Configuring and creating the Makefile
The first thing to do is carefully read the README and INSTALL text files (use the less command). These contain important information on how to compile and run the software.

The units package uses the GNU configure system to compile the source code. We will need to specify the installation directory, since the default will be the main system area, which you will not have write permission for. We need to create an install directory in your home directory.

% mkdir ~/units174

Then run the configure utility, setting the installation path to this.

% ./configure --prefix=$HOME/units174

NOTE: The $HOME variable is an example of an environment variable. The value of $HOME is the path to your home directory. Just type

% echo $HOME

to show the contents of this variable. We will learn more about environment variables in a later chapter.

If configure has run correctly, it will have created a Makefile with all necessary options. You can view the Makefile if you wish (use the less command), but do not edit its contents.

7.5 Building the package

Now you can go ahead and build the package by running the make command.

% make

After a minute or two (depending on the speed of the computer), the executables will be created. You can check that everything compiled successfully by typing

% make check

If everything is okay, you can now install the package.

% make install

This will install the files into the ~/units174 directory you created earlier.

7.6 Running the software
You are now ready to run the software (assuming everything worked).

% cd ~/units174

If you list the contents of the units directory, you will see a number of subdirectories.

bin     The binary executables
info    GNU info formatted documentation
man     Man pages
share   Shared data files

To run the program, change to the bin directory and type

% ./units

As an example, convert 6 feet to metres.

You have: 6 feet
You want: metres
        * 1.8288

If you get the answer 1.8288, congratulations, it worked.


To view what units it can convert between, view the data file in the share directory (the list is quite comprehensive). To read the full documentation, change into the info directory and type

% info --file=units.info

7.7 Stripping unnecessary code
When a piece of software is being developed, it is useful for the programmer to include debugging information in the resulting executable. This way, if problems are encountered when running the executable, the programmer can load it into a debugger and track down any bugs. This is useful for the programmer, but unnecessary for the user. We can assume that the package, once finished and made available for download, has already been tested and debugged.

However, when we compiled the software above, debugging information was still compiled into the final executable. Since it is unlikely that we are going to need this debugging information, we can strip it out of the final executable. One of the advantages of this is a much smaller executable, which should also load slightly faster.

What we are going to do is look at the before and after size of the binary file. First change into the bin directory of the units installation directory.

% cd ~/units174/bin
% ls -l

As you can see, the file is over 100 kbytes in size. You can get more information on the type of file by using the file command.

% file units
units: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses shared libs), not stripped

To strip all the debug and line-numbering information out of the binary file, use the strip command.

% strip units
% ls -l

As you can see, the file is now 36 kbytes - roughly a third of its original size. Two thirds of the binary file was debug code!


Check the file information again.

% file units
units: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses shared libs), stripped

HINT: You can use the make command to install pre-stripped copies of all the binary files when you install the package. Instead of typing make install, simply type

% make install-strip

Part Eight

8.1 UNIX Variables
Variables are a way of passing information from the shell to programs when you run them. Programs look "in the environment" for particular variables and, if they are found, will use the values stored. Some are set by the system, others by you, yet others by the shell, or any program that loads another program.

Standard UNIX variables are split into two categories: environment variables and shell variables. In broad terms, shell variables apply only to the current instance of the shell and are used to set short-term working conditions; environment variables have a further-reaching significance, and those set at login are valid for the duration of the session. By convention, environment variables have UPPER CASE names and shell variables have lower case names.

8.2 Environment Variables
An example of an environment variable is the OSTYPE variable. The value of this is the current operating system you are using. Type

% echo $OSTYPE

More examples of environment variables are:

• USER (your login name)
• HOME (the path name of your home directory)
• HOST (the name of the computer you are using)
• ARCH (the architecture of the computer's processor)
• DISPLAY (the name of the computer screen on which to display X windows)
• PRINTER (the default printer to send print jobs to)
• PATH (the directories the shell should search to find a command)


Finding out the current values of these variables.
ENVIRONMENT variables are set using the setenv command, displayed using the printenv or env commands, and unset using the unsetenv command. To show all values of these variables, type

% printenv | less
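Note that setenv and unsetenv are C-shell commands. As a sketch of the same operations from a Bourne-style shell (sh/bash), where the equivalents are export and unset - MYEDITOR is a made-up variable name for the demonstration:

```shell
# Bourne-shell equivalents of the csh commands shown above.
export MYEDITOR=nedit        # like: setenv MYEDITOR nedit
printenv MYEDITOR            # display one variable's value
printenv | grep '^MYEDITOR=' # it appears in the full environment listing
unset MYEDITOR               # like: unsetenv MYEDITOR
```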

8.3 Shell Variables
An example of a shell variable is the history variable. The value of this is how many shell commands to save, allowing the user to scroll back through the commands they have previously entered. Type

% echo $history

More examples of shell variables are:

• cwd (your current working directory)
• home (the path name of your home directory)
• path (the directories the shell should search to find a command)
• prompt (the text string used to prompt for interactive commands)
• shell (your login shell)

Finding out the current values of these variables.
SHELL variables are both set and displayed using the set command. They can be unset by using the unset command. To show all values of these variables, type

% set | less

So what is the difference between PATH and path?
In general, environment and shell variables that have the same name (apart from the case) are distinct and independent, except for possibly having the same initial values. There are, however, exceptions. Each time the shell variables home, user and term are changed, the corresponding environment variables HOME, USER and TERM receive the same values. However, altering the environment variables has no effect on the corresponding shell variables.


PATH and path specify directories to search for commands and programs. Both variables always represent the same directory list, and altering either automatically causes the other to be changed.
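The shell-versus-environment distinction above is easy to see from a Bourne-style shell as well (a sketch in sh syntax, since set and setenv are csh commands; the variable names are made up): a plain shell variable stays in the current shell, while an exported environment variable is passed on to child processes.

```shell
mylocal="only in this shell"     # shell variable (like csh's set)
export MYGLOBAL="passed along"   # environment variable (like csh's setenv)

# A child process sees the environment variable...
sh -c 'echo "child sees: $MYGLOBAL"'
# ...but not the plain shell variable.
sh -c 'echo "child sees: ${mylocal:-nothing}"'
```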

8.4 Using and setting variables
Each time you login to a UNIX host, the system looks in your home directory for initialisation files. Information in these files is used to set up your working environment. The C and TC shells use two files called .login and .cshrc (note that both file names begin with a dot).

At login the C shell first reads .cshrc followed by .login.

.login is used to set conditions which will apply to the whole session and to perform actions that are relevant only at login. .cshrc is used to set conditions and perform actions specific to the shell and to each invocation of it.

The guidelines are to set ENVIRONMENT variables in the .login file and SHELL variables in the .cshrc file.

WARNING: NEVER put commands that run graphical displays (e.g. a web browser) in your .cshrc or .login file.

8.5 Setting shell variables in the .cshrc file
For example, to change the number of shell commands saved in the history list, you need to set the shell variable history. It is set to 100 by default, but you can increase this if you wish.

% set history = 200

Check this has worked by typing

% echo $history

However, this has only set the variable for the lifetime of the current shell. If you open a new xterm window, it will only have the default history value set. To PERMANENTLY set the value of history, you will need to add the set command to the .cshrc file.

First open the .cshrc file in a text editor. An easy, user-friendly editor to use is nedit.

% nedit ~/.cshrc


Add the following line AFTER the list of other commands.

set history = 200

Save the file and force the shell to reread its .cshrc file by using the shell source command.

% source .cshrc

Check this has worked by typing

% echo $history

8.6 Setting the path
When you type a command, your path (or PATH) variable defines in which directories the shell will look to find the command you typed. If the system returns a message saying "command: Command not found", this indicates that either the command doesn't exist at all on the system or it is simply not in your path.

For example, to run units, you either need to directly specify the units path (~/units174/bin/units), or you need to have the directory ~/units174/bin in your path. You can add it to the end of your existing path (the $path represents this) by issuing the command:

% set path = ($path ~/units174/bin)

Test that this worked by trying to run units in any directory other than where units is actually located.

% cd; units

HINT: You can run multiple commands on one line by separating them with a semicolon.

To add this path PERMANENTLY, add the following line to your .cshrc AFTER the list of other commands.

set path = ($path ~/units174/bin)
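The same mechanism can be sketched in Bourne-shell syntax (csh uses "set path = ($path dir)", sh/bash use the colon-separated PATH string). The directory /tmp/pathdemo/bin and the "hello" command are made up for the demonstration:

```shell
# Create a toy command in a directory that is not yet on the search path.
mkdir -p /tmp/pathdemo/bin
printf '#!/bin/sh\necho hello from my own bin\n' > /tmp/pathdemo/bin/hello
chmod +x /tmp/pathdemo/bin/hello

PATH="$PATH:/tmp/pathdemo/bin"   # append our directory to the search path
hello                            # now found without typing the full path
```

Before the PATH change, running "hello" would have produced the "Command not found" message described above.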


Unix - Frequently Asked Questions (1) [Frequent posting]
These articles are divided approximately as follows:

1.*) General questions.
2.*) Relatively basic questions, likely to be asked by beginners.
3.*) Intermediate questions.
4.*) Advanced questions, likely to be asked by people who thought they already knew all of the answers.
5.*) Questions pertaining to the various shells, and the differences.

This article includes answers to:

1.1) Who helped you put this list together?

1.2) When someone refers to ‘rn(1)’ or ‘ctime(3)’, what does the number in parentheses mean?
1.3) What does {some strange unix command name} stand for?
1.4) How does the gateway between “comp.unix.questions” and the “info-unix” mailing list work?
1.5) What are some useful Unix or C books?

1.6) What happened to the pronunciation list that used to be part of this document?
If you’re looking for the answer to, say, question 1.5, and want to skip everything else, you can search ahead for the regular expression “^1.5)”.

While these are all legitimate questions, they seem to crop up in comp.unix.questions or comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are correct) and then a period of griping about how the same questions keep coming up. You may also like to read the monthly article “Answers to Frequently Asked Questions” in the newsgroup “news.announce.newusers”, which will tell you what “UNIX” stands for.

With the variety of Unix systems in the world, it’s hard to guarantee that these answers will work everywhere. Read your local manual pages before trying anything suggested here. If you have suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.

1.2) When someone refers to ‘rn(1)’ or ‘ctime(3)’, what does the number in parentheses mean? It looks like some sort of function call, but it isn’t. These numbers refer to the section of the “Unix manual” where the appropriate documentation can be found. You could type “man 3 ctime” to look up the manual page for “ctime” in section 3 of the manual. The traditional manual sections are:

1   User-level commands
2   System calls
3   Library functions
4   Devices and device drivers
5   File formats
6   Games
7   Various miscellaneous stuff - macro packages etc.
8   System maintenance and operation commands

Some Unix versions use non-numeric section names. For instance, Xenix uses “C” for commands and “S” for functions. Some newer versions of Unix require “man -s# title” instead of “man # title”. Each section has an introduction, which you can read with “man # intro” where # is the section number.

Sometimes the number is necessary to differentiate between a command and a library routine or system call of the same name. For instance, your system may have “time(1)”, a manual page about the ‘time’ command for timing programs, and also “time(3)”, a manual page about the ‘time’ subroutine for determining the current time. You can use “man 1 time” or “man 3 time” to specify which “time” man page you’re interested in.

You’ll often find other sections for local programs or even subsections of the sections above - Ultrix has sections 3m, 3n, 3x and 3yp among others.

1.3) What does {some strange unix command name} stand for?

awk = “Aho Weinberger and Kernighan”

This language was named by its authors, Al Aho, Peter Weinberger and Brian Kernighan.

grep = “Global Regular Expression Print”

grep comes from the ed command to print all lines matching a certain pattern: g/re/p, where “re” is a “regular expression”.

fgrep = “Fixed GREP”

fgrep searches for fixed strings only. The “f” does not stand for “fast” - in fact, “fgrep foobar *.c” is usually slower than “egrep foobar *.c”. (Yes, this is kind of surprising. Try it.) fgrep still has its uses though, and may be useful when searching a file for a larger number of strings than egrep can handle.

egrep = “Extended GREP”

egrep uses fancier regular expressions than grep. Many people use egrep all the time, since it has some more sophisticated internal algorithms than grep or fgrep, and is usually the fastest of the three programs.

cat = “CATenate”


catenate is an obscure word meaning “to connect in a series”, which is what the “cat” command does to one or more files. Not to be confused with C/A/T, the Computer Aided Typesetter.

gecos = “General Electric Comprehensive Operating Supervisor”

When GE’s large systems division was sold to Honeywell, Honeywell dropped the “E” from “GECOS”. Unix’s password file has a “pw_gecos” field. The name is a real holdover from the early days. Dennis Ritchie has reported: “Sometimes we sent printer output or batch jobs to the GCOS machine. The gcos field in the password file was a place to stash the information for the $IDENT card. Not elegant.”

nroff = “New ROFF”
troff = “Typesetter new ROFF”

These are descendants of “roff”, which was a re-implementation of the Multics “runoff” program (a program that you’d use to “run off” a good copy of a document).

tee = T

From plumbing terminology for a T-shaped pipe splitter.

bss = “Block Started by Symbol”

Dennis Ritchie says: Actually the acronym (in the sense we took it up; it may have other credible etymologies) is “Block Started by Symbol.” It was a pseudo-op in FAP (Fortran Assembly [-er?] Program), an assembler for the IBM 704-709-7090-7094 machines. It defined its label and set aside space for a given number of words. There was another pseudo-op, BES, “Block Ended by Symbol” that did the same except that the label was defined by the last assigned word + 1. (On these machines Fortran arrays were stored backwards in storage and were 1-origin.) The usage is reasonably appropriate, because just as with standard Unix loaders, the space assigned didn’t have to be punched literally into the object deck but was represented by a count somewhere.

biff = “BIFF”

This command, which turns on asynchronous mail notification, was actually named after a dog at Berkeley.

I can confirm the origin of biff, if you’re interested. Biff was Heidi Stettner’s dog, back when Heidi (and I, and Bill Joy) were all grad students at U.C. Berkeley and the early versions of BSD were being developed. Biff was popular among the residents of Evans Hall, and was known for barking at the mailman, hence the name of the command. Confirmation courtesy of Eric Cooper, Carnegie Mellon University.

rc (as in “.cshrc” or “/etc/rc”) = “RunCom”

“rc” derives from “runcom”, from the MIT CTSS system, ca. 1965. ‘There was a facility that would execute a bunch of commands stored in a file; it was called “runcom” for “run commands”, and the file began to be called “a runcom.” “rc” in Unix is a fossil from that usage.’


Brian Kernighan & Dennis Ritchie, as told to Vicki Brown
“rc” is also the name of the shell from the new Plan 9 operating system.

Perl = “Practical Extraction and Report Language”
Perl = “Pathologically Eclectic Rubbish Lister”

The Perl language is Larry Wall’s highly popular freely-available completely portable text, process, and file manipulation tool that bridges the gap between shell and C programming (or between doing it on the command line and pulling your hair out). For further information, see the Usenet newsgroup comp.lang.perl.misc.

Don Libes’ book “Life with Unix” contains lots more of these tidbits.

1.4) How does the gateway between “comp.unix.questions” and the “info-unix” mailing list work?

“info-unix” and “unix-wizards” are mailing list versions of comp.unix.questions and comp.unix.wizards respectively. There should be no difference in content between the mailing list and the newsgroup. To get on or off either of these lists, send mail to info-unix-request@brl.mil or unix-wizards-request@brl.mil. Be sure to use the ‘-Request’. Don’t expect an immediate response.

Here are the gory details, courtesy of the list’s maintainer, Bob Reschly.

==== postings to info-UNIX and UNIX-wizards lists ====

Anything submitted to the list is posted; I do not moderate incoming traffic - BRL functions as a reflector. Postings submitted by Internet subscribers should be addressed to the list address (info-UNIX or UNIX-wizards); the ‘-request’ addresses are for correspondence with the list maintainer [me]. Postings submitted by USENET readers should be addressed to the appropriate news group (comp.unix.questions or comp.unix.wizards).

For Internet subscribers, received traffic will be of two types: individual messages, and digests. Traffic which comes to BRL from the Internet and BITNET (via the BITNET-Internet gateway) is immediately resent to all addressees on the mailing list. Traffic originating on USENET is gathered up into digests which are sent to all list members daily.

BITNET traffic is much like Internet traffic. The main difference is that I maintain only one address for traffic destined to all BITNET subscribers. That address points to a list exploder which then sends copies to individual BITNET subscribers. This way only one copy of a given message has to cross the BITNET-Internet gateway in either direction.

USENET subscribers see only individual messages. All messages originating on the Internet side are forwarded to our USENET machine. They are then posted to the appropriate newsgroup. Unfortunately, for gatewayed messages, the sender becomes “news@brl-adm”. This is currently an unavoidable side-effect of the software which performs the gateway function.

As for readership, USENET has an extremely large readership - I would guess several thousand hosts and tens of thousands of readers. The master list maintained here at BRL runs about two hundred fifty entries with roughly ten percent of those being local redistribution lists. I don’t have a good feel for the size of the BITNET redistribution, but I would guess it is roughly the same size and composition as the master list. Traffic runs 150K to 400K bytes per list per week on average.

1.5) What are some useful Unix or C books?

Mitch Wright (mitch@cirrus.com) maintains a useful list of Unix and C books, with descriptions and some mini-reviews. There are currently 167 titles on his list. You can obtain a copy of this list by anonymous ftp from ftp.rahul.net (192.160.13.1), where it’s “pub/mitch/YABL/yabl”. Send additions or suggestions to mitch@cirrus.com.

Samuel Ko (kko@sfu.ca) maintains another list of Unix books. This list contains only recommended books, and is therefore somewhat shorter. It is also a classified list, with books grouped into categories, which may be better if you are looking for a specific type of book. You can obtain a copy of this list by anonymous ftp from rtfm.mit.edu, where it’s “pub/usenet/news.answers/books/unix”. Send additions or suggestions to kko@sfu.ca.

If you can’t use anonymous ftp, email the line “help” to “ftpmail@decwrl.dec.com” for instructions on retrieving things via email.

1.6) What happened to the pronunciation list that used to be part of this document? From its inception in 1989, this FAQ document included a comprehensive pronunciation list maintained by Maarten Litmaath (thanks, Maarten!). It was originally created by Carl Paukstis <carlp@frigg.isc-br.com>. It has been retired, since it is not really relevant to the topic of “Unix questions”. You can still find it as part of the widely-distributed “Jargon” file (maintained by Eric S. Raymond, eric@snark.thyrsus.com) which seems like a much more appropriate forum for the topic of “How do you pronounce /* ?” If you’d like a copy, you can ftp one from ftp.wg.omron.co.jp (133.210.4.4), it’s “pub/unixfaq/docs/Pronunciation-Guide”.


Unix - Frequently Asked Questions (2) [Frequent posting]
This article includes answers to:

2.1) How do I remove a file whose name begins with a “-” ?
2.2) How do I remove a file with funny characters in the filename ?
2.3) How do I get a recursive directory listing?
2.4) How do I get the current directory into my prompt?
2.5) How do I read characters from the terminal in a shell script?
2.6) How do I rename “*.foo” to “*.bar”, or change file names to lowercase?
2.7) Why do I get [some strange error message] when I “rsh host command” ?
2.8) How do I {set an environment variable, change directory} inside a program or shell script and have that change affect my current shell?
2.9) How do I redirect stdout and stderr separately in csh?
2.10) How do I tell inside .cshrc if I’m a login shell?
2.11) How do I construct a shell glob-pattern that matches all files except “.” and “..” ?
2.12) How do I find the last argument in a Bourne shell script?
2.13) What’s wrong with having ‘.’ in your $PATH ?
2.14) How do I ring the terminal bell during a shell script?
2.15) Why can’t I use “talk” to talk with my friend on machine X?
2.16) Why does calendar produce the wrong output?

If you’re looking for the answer to, say, question 2.5, and want to skip everything else, you can search ahead for the regular expression “^2.5)”.

While these are all legitimate questions, they seem to crop up in comp.unix.questions or comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are correct) and then a period of griping about how the same questions keep coming up. You may also like to read the monthly article “Answers to Frequently Asked Questions” in the newsgroup “news.announce.newusers”, which will tell you what “UNIX” stands for.

With the variety of Unix systems in the world, it’s hard to guarantee that these answers will work everywhere. Read your local manual pages before trying anything suggested here. If you have suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.

2.1) How do I remove a file whose name begins with a “-“ ? Figure out some way to name the file so that it doesn’t begin with a dash. The simplest answer is to use rm ./-filename


(assuming “-filename” is in the current directory, of course.) This method of avoiding the interpretation of the “-” works with other commands too.

Many commands, particularly those that have been written to use the “getopt(3)” argument parsing routine, accept a “--” argument which means “this is the last option, anything after this is not an option”, so your version of rm might handle “rm -- -filename”. Some versions of rm that don’t use getopt() treat a single “-” in the same way, so you can also try “rm - -filename”.
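The “./” trick can be sketched with a throwaway file ("-testfile" is a made-up name for the demonstration):

```shell
cd /tmp
touch ./-testfile       # create a file whose name starts with a dash
ls | grep testfile      # confirm it exists
rm ./-testfile          # "./" stops rm interpreting the name as an option
# On an rm built with getopt(3), "rm -- -testfile" would also work.
```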

2.2) How do I remove a file with funny characters in the filename ? If the ‘funny character’ is a ‘/’, skip to the last part of this answer. If the funny character is something else, such as a ‘ ‘ or control character or character with the 8th bit set, keep reading.
The classic answers are

rm -i some*pattern*that*matches*only*the*file*you*want

which asks you whether you want to remove each file matching the indicated pattern; depending on your shell, this may not work if the filename has a character with the 8th bit set (the shell may strip that off); and

rm -ri .

which asks you whether to remove each file in the directory. Answer “y” to the problem file and “n” to everything else. Unfortunately this doesn’t work with many versions of rm. Also unfortunately, this will walk through every subdirectory of “.”, so you might want to “chmod a-x” those directories temporarily to make them unsearchable. Always take a deep breath and think about what you’re doing and double check what you typed when you use rm’s “-r” flag or a wildcard on the command line; and

find . -type f ... -ok rm '{}' \;

where “...” is a group of predicates that uniquely identify the file. One possibility is to figure out the inode number of the problem file (use “ls -i .”) and then use

find . -inum 12345 -ok rm '{}' \;

or

find . -inum 12345 -ok mv '{}' new-file-name \;

“-ok” is a safety check - it will prompt you for confirmation of the command it’s about to execute. You can use “-exec” instead to avoid the prompting, if you want to live dangerously, or if you suspect that the filename may contain a funny character sequence that will mess up your screen when printed.
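The inode approach can be sketched end to end with a throwaway file (the directory /tmp/funnydemo and the filename with embedded spaces are made up; -exec is used instead of -ok so the sketch runs without prompting):

```shell
mkdir -p /tmp/funnydemo && cd /tmp/funnydemo
touch 'bad  name'                              # awkward filename with spaces

inum=$(ls -i 'bad  name' | awk '{print $1}')   # first column of ls -i is the inode
find . -inum "$inum" -exec rm '{}' \;          # remove the file by inode, not name
ls -l                                          # the file is gone
```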
What if the filename has a ‘/’ in it?

These files really are special cases, and can only be created by buggy kernel code (typically by implementations of NFS that don’t filter out illegal characters in file names from remote machines.) The first thing to do is to try to understand exactly why this problem is so strange.


Recall that Unix directories are simply pairs of filenames and inode numbers. A directory essentially contains information like this:

filename    inode
file1       12345
file2.c     12349
file3       12347

Theoretically, ‘/’ and ‘\0’ are the only two characters that cannot appear in a filename - ‘/’ because it’s used to separate directories and files, and ‘\0’ because it terminates a filename. Unfortunately some implementations of NFS will blithely create filenames with embedded slashes in response to requests from remote machines. For instance, this could happen when someone on a Mac or other non-Unix machine decides to create a remote NFS file on your Unix machine with the date in the filename. Your Unix directory then has this in it:

filename    inode
91/02/07    12357

No amount of messing around with ‘find’ or ‘rm’ as described above will delete this file, since those utilities, and all other Unix programs, are forced to interpret the ‘/’ in the normal way. Any ordinary program will eventually try to do unlink(“91/02/07”), which as far as the kernel is concerned means “unlink the file 07 in the subdirectory 02 of directory 91”, but that’s not what we have - we have a FILE named “91/02/07” in the current directory. This is a subtle but crucial distinction.

What can you do in this case? The first thing to try is to return to the Mac that created this crummy entry, and see if you can convince it and your local NFS daemon to rename the file to something without slashes. If that doesn’t work or isn’t possible, you’ll need help from your system manager, who will have to try one of the following.

Use “ls -i” to find the inode number of this bogus file, then unmount the file system and use “clri” to clear the inode, and “fsck” the file system with your fingers crossed. This destroys the information in the file.

If you want to keep it, you can try: create a new directory in the same parent directory as the one containing the bad file name; move everything you can (i.e. everything but the file with the bad name) from the old directory to the new one; do “ls -id” on the directory containing the file with the bad name to get its inumber; umount the file system; “clri” the directory containing the file with the bad name; “fsck” the file system. Then, to find the file: remount the file system; rename the directory you created to have the name of the old directory (since the old directory should have been blown away by “fsck”); move the file out of “lost+found” into the directory with a better name.


Alternatively, you can patch the directory the hard way by crawling around in the raw file system. Use “fsdb”, if you have it.

2.3) How do I get a recursive directory listing? One of the following may do what you want:

ls -R           (not all versions of “ls” have -R)
find . -print   (should work everywhere)
du -a .         (shows you both the name and size)

If you’re looking for a wildcard pattern that will match all “.c” files in this directory and below, you won’t find one, but you can use

% some-command `find . -name '*.c' -print`

“find” is a powerful program. Learn about it.
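A small sketch of find walking a directory tree (the /tmp/finddemo layout and the file names are made up for the demonstration):

```shell
# Build a tiny tree, then ask find for every .c file below the current directory.
mkdir -p /tmp/finddemo/src/sub && cd /tmp/finddemo
touch src/main.c src/sub/util.c src/notes.txt

find . -name '*.c' -print    # lists the two .c files, but not notes.txt
```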

2.4) How do I get the current directory into my prompt? It depends which shell you are using. It’s easy with some shells, hard or impossible with others.

C Shell (csh): Put this in your .cshrc - customize the prompt variable the way you want.

alias setprompt 'set prompt="${cwd}% "'
setprompt           # to set the initial prompt
alias cd 'chdir \!* && setprompt'
If you use pushd and popd, you’ll also need

alias pushd ‘pushd \!* && setprompt’ alias popd ‘popd \!* && setprompt’ Some C shells don’t keep a $cwd variable - you can use ‘pwd‘ instead. If you just want the last component of the current directory in your prompt (“mail% “ instead of “/usr/spool/mail% “) you can use alias setprompt ‘set prompt=”$cwd:t% “’ Some older csh’s get the meaning of && and || reversed. Try doing: false && echo bug If it prints “bug”, you need to switch && and || (and get a better version of csh.) Bourne Shell (sh): If you have a newer version of the Bourne Shell (SVR2 or newer) you can use a shell function to make your own command, “xcd” say:

139

xcd() { cd $* ; PS1=”‘pwd‘ $ “; } If you have an older Bourne shell, it’s complicated but not impossible. Here’s one way. Add this to your .profile file: LOGIN_SHELL=$$ export LOGIN_SHELL CMDFILE=/tmp/cd.$$ export CMDFILE # 16 is SIGURG, pick a signal that’s not likely to be used PROMPTSIG=16 export PROMPTSIG trap ‘. $CMDFILE’ $PROMPTSIG and then put this executable script (without the indentation!), let’s call it “xcd”, somewhere in your PATH : xcd directory - change directory and set prompt : by signalling the login shell to read a command file cat >${CMDFILE?”not set”} <<EOF cd $1 PS1=”\‘pwd\‘$ “ EOF kill -${PROMPTSIG?”not set”} ${LOGIN_SHELL?”not set”} Now change directories with “xcd /some/dir”. Korn Shell (ksh): Put this in your .profile file:
PS1=’$PWD $ ‘

If you just want the last component of the directory, use
    PS1='${PWD##*/} $ '

T C shell (tcsh): Tcsh is a popular enhanced version of csh with some extra builtin variables (and many other features):

    %~          the current directory, using ~ for $HOME
    %/          the full pathname of the current directory
    %c or %.    the trailing component of the current directory

so you can do

    set prompt='%~ '

BASH (FSF's "Bourne Again SHell"): \w in $PS1 gives the full pathname of the current directory, with ~ expansion for $HOME; \W gives the basename of the current directory. So, in addition to the above sh and ksh solutions, you could use

    PS1='\w $ '

or

    PS1='\W $ '


2.5) How do I read characters from the terminal in a shell script?

In sh, use read. It is most common to use a loop like

    while read line
    do
        ...
    done

In csh, use $< like this:

    while ( 1 )
        set line = "$<"
        if ( "$line" == "" ) break
        ...
    end

Unfortunately csh has no way of distinguishing between a blank line and an end-of-file.

If you're using sh and want to read a single character from the terminal, you can try something like

    echo -n "Enter a character: "
    stty cbreak         # or stty raw
    readchar=`dd if=/dev/tty bs=1 count=1 2>/dev/null`
    stty -cbreak
    echo "Thank you for typing a $readchar ."

2.6) How do I rename "*.foo" to "*.bar", or change file names to lowercase?

Why doesn't "mv *.foo *.bar" work? Think about how the shell expands wildcards. "*.foo" and "*.bar" are expanded before the mv command ever sees the arguments. Depending on your shell, this can fail in a couple of ways. CSH prints "No match." because it can't match "*.bar". SH executes "mv a.foo b.foo c.foo *.bar", which will only succeed if you happen to have a single directory named "*.bar" - very unlikely, and almost certainly not what you had in mind.

Depending on your shell, you can do it with a loop to "mv" each file individually. If your system has "basename", you can use:

C Shell:

    foreach f ( *.foo )
        set base=`basename $f .foo`
        mv $f $base.bar
    end

Bourne Shell:

    for f in *.foo; do
        base=`basename $f .foo`
        mv $f $base.bar
    done
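As a quick sanity check, the Bourne-shell loop can be tried in a scratch directory before you turn it loose on real files (a sketch; the directory name is arbitrary):

```shell
# Make a scratch directory with two .foo files, run the rename
# loop, and confirm they became .bar files.
mkdir /tmp/renametest.$$ && cd /tmp/renametest.$$
touch a.foo b.foo

for f in *.foo; do
    base=`basename $f .foo`
    mv $f $base.bar
done

ls      # now shows a.bar and b.bar
cd / && rm -r /tmp/renametest.$$
```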


Some shells have their own variable substitution features, so instead of using "basename", you can use simpler loops like:

C Shell:

    foreach f ( *.foo )
        mv $f $f:r.bar
    end

Korn Shell:

    for f in *.foo; do
        mv $f ${f%foo}bar
    done

If you don't have "basename" or want to do something like renaming foo.* to bar.*, you can use something like "sed" to strip apart the original file name in other ways, but the general looping idea is the same. You can also convert file names into "mv" commands with "sed", and hand the commands off to "sh" for execution. Try

    ls -d *.foo | sed -e 's/.*/mv & &/' -e 's/foo$/bar/' | sh

A program by Vladimir Lanin called "mmv" that does this job nicely was posted to comp.sources.unix (Volume 21, issues 87 and 88) in April 1990. It lets you use

    mmv '*.foo' '=1.bar'

Shell loops like the above can also be used to translate file names from upper to lower case or vice versa. You could use something like this to rename uppercase files to lowercase:

C Shell:

    foreach f ( * )
        mv $f `echo $f | tr '[A-Z]' '[a-z]'`
    end

Bourne Shell:

    for f in *; do
        mv $f `echo $f | tr '[A-Z]' '[a-z]'`
    done

Korn Shell:

    typeset -l l
    for f in *; do
        l="$f"
        mv $f $l
    done

If you wanted to be really thorough and handle files with 'funny' names (embedded blanks or whatever) you'd need to use

Bourne Shell:

    for f in *; do
        g=`expr "xxx$f" : 'xxx\(.*\)' | tr '[A-Z]' '[a-z]'`
        mv "$f" "$g"
    done

The 'expr' command will always print the filename, even if it equals '-n' or if it contains a System V escape sequence like '\c'.


Some versions of "tr" require the [ and ], some don't. It happens to be harmless to include them in this particular example; versions of tr that don't want the [] will conveniently think they are supposed to translate '[' to '[' and ']' to ']'.

If you have the "perl" language installed, you may find this rename script by Larry Wall very useful. It can be used to accomplish a wide variety of filename changes.

    #!/usr/bin/perl
    #
    # rename script examples from lwall:
    #   rename 's/\.orig$//' *.orig
    #   rename 'y/A-Z/a-z/ unless /^Make/' *
    #   rename '$_ .= ".bad"' *.f
    #   rename 'print "$_: "; s/foo/bar/ if <stdin> =~ /^y/i' *

    $op = shift;
    for (@ARGV) {
        $was = $_;
        eval $op;
        die $@ if $@;
        rename($was,$_) unless $was eq $_;
    }

2.7) Why do I get [some strange error message] when I "rsh host command"?

(We're talking about the remote shell program "rsh" or sometimes "remsh" or "remote"; on some machines, there is a restricted shell called "rsh", which is a different thing.)

If your remote account uses the C shell, the remote host will fire up a C shell to execute 'command' for you, and that shell will read your remote .cshrc file. Perhaps your .cshrc contains a "stty", "biff" or some other command that isn't appropriate for a non-interactive shell. The unexpected output or error message from these commands can screw up your rsh in odd ways.

Here's an example. Suppose you have

    stty erase ^H
    biff y

in your .cshrc file. You'll get some odd messages like this:

    % rsh some-machine date
    stty: : Can't assign requested address
    Where are you?
    Tue Oct 1 09:24:45 EST 1991

You might also get similar errors when running certain "at" or "cron" jobs that also read your .cshrc file.

Fortunately, the fix is simple. There are, quite possibly, a whole bunch of operations in your ".cshrc" (e.g., "set history=N") that are simply not worth doing except in interactive shells. What you do is surround them in your ".cshrc" with:

    if ( $?prompt ) then
        operations....
    endif

and, since in a non-interactive shell "prompt" won't be set, the operations in question will only be done in interactive shells.

You may also wish to move some commands to your .login file; if those commands only need to be done when a login session starts up (checking for new mail, unread news and so on), it's better to have them in the .login file.

2.8) How do I {set an environment variable, change directory} inside a program or shell script and have that change affect my current shell?

In general, you can't, at least not without making special arrangements. When a child process is created, it inherits a copy of its parent's variables (and current directory). The child can change these values all it wants but the changes won't affect the parent shell, since the child is changing a copy of the original data.

Some special arrangements are possible. Your child process could write out the changed variables, if the parent was prepared to read the output and interpret it as commands to set its own variables. Also, shells can arrange to run other shell scripts in the context of the current shell, rather than in a child process, so that changes will affect the original shell.

For instance, if you have a C shell script named "myscript":

    cd /very/long/path
    setenv PATH /something:/something-else

or the equivalent Bourne or Korn shell script

    cd /very/long/path
    PATH=/something:/something-else export PATH

and try to run "myscript" from your shell, your shell will fork and run the shell script in a subprocess. The subprocess is also running the shell; when it sees the "cd" command it changes its current directory, and when it sees the "setenv" command it changes its environment, but neither has any effect on the current directory of the shell at which you're typing (your login shell, let's say).

In order to get your login shell to execute the script (without forking) you have to use the "." command (for the Bourne or Korn shells) or the "source" command (for the C shell). I.e. you type

    . myscript

to the Bourne or Korn shells, or

    source myscript

to the C shell.

If all you are trying to do is change directory or set an environment variable, it will probably be simpler to use a C shell alias or Bourne/Korn shell function. See the "how do I get the current directory into my prompt" section of this article for some examples.
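A small experiment makes the difference concrete (a sketch; the script and directory names are made up):

```shell
# A script that changes directory has no effect on the caller
# unless it is sourced with "." into the current shell.
mkdir /tmp/demo.$$
echo "cd /tmp/demo.$$" > /tmp/myscript.$$

cd /
sh /tmp/myscript.$$      # runs in a subprocess...
pwd                      # ...the current shell is still in /

. /tmp/myscript.$$       # executed by the current shell itself...
pwd                      # ...now the current directory has changed

cd /; rm -r /tmp/demo.$$ /tmp/myscript.$$
```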
A much more detailed answer prepared by xtm@telelogic.se (Thomas Michanek) can be found at ftp.wg.omron.co.jp in /pub/unix-faq/docs/script-vs-env.

2.9) How do I redirect stdout and stderr separately in csh?

In csh, you can redirect stdout with ">", or stdout and stderr together with ">&", but there is no direct way to redirect stderr only. The best you can do is

    ( command >stdout_file ) >&stderr_file

which runs "command" in a subshell; stdout is redirected inside the subshell to stdout_file, and both stdout and stderr from the subshell are redirected to stderr_file, but by this point stdout has already been redirected so only stderr actually winds up in stderr_file.

If what you want is to avoid redirecting stdout at all, let sh do it for you:

    sh -c 'command 2>stderr_file'
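In sh itself the separation is direct, since sh has independent redirection of each file descriptor. A sketch ("command" stands for whatever you are running):

```shell
# sh syntax: send stdout and stderr to different files.
#     command >stdout_file 2>stderr_file
# For example, list one directory that exists and one that doesn't:
ls . /no/such/dir >/tmp/out.$$ 2>/tmp/err.$$
cat /tmp/out.$$     # the listing of "."
cat /tmp/err.$$     # the complaint about /no/such/dir
rm /tmp/out.$$ /tmp/err.$$
```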

2.10) How do I tell inside .cshrc if I'm a login shell?

When people ask this, they usually mean either

    How can I tell if it's an interactive shell?  or
    How can I tell if it's a top-level shell?

You could perhaps determine if your shell truly is a login shell (i.e. is going to source ".login" after it is done with ".cshrc") by fooling around with "ps" and "$$". Login shells generally have names that begin with a '-'.

If you're really interested in the other two questions, here's one way you can organize your .cshrc to find out.

    if (! $?CSHLEVEL) then
        #
        # This is a "top-level" shell,
        # perhaps a login shell, perhaps a shell started up by
        # 'rsh machine some-command'
        # This is where we should set PATH and anything else we
        # want to apply to every one of our shells.
        #
        setenv CSHLEVEL 0
        set home = ~username            # just to be sure
        source ~/.env                   # environment stuff we always want
    else
        #
        # This shell is a child of one of our other shells so
        # we don't need to set all the environment variables again.
        #
        set tmp = $CSHLEVEL
        @ tmp++
        setenv CSHLEVEL $tmp
    endif

    # Exit from .cshrc if not interactive, e.g. under rsh
    if (! $?prompt) exit

    # Here we could set the prompt or aliases that would be useful
    # for interactive shells only.
    source ~/.aliases

2.11) How do I construct a shell glob-pattern that matches all files except "." and ".."?

You'd think this would be easy.

    *       Matches all files that don't begin with a ".";

    .*      Matches all files that do begin with a ".", but
            this includes the special entries "." and "..",
            which often you don't want;

    .[!.]*  (Newer shells only; some shells use a "^" instead of
            the "!"; POSIX shells must accept the "!", but may
            accept a "^" as well; all portable applications shall
            not use an unquoted "^" immediately following the "[")
            Matches all files that begin with a "." and are
            followed by a non-"."; unfortunately this will miss
            "..foo";

    .??*    Matches files that begin with a "." and which are at
            least 3 characters long. This neatly avoids "." and
            "..", but also misses ".a".

So to match all files except "." and ".." safely you have to use 3 patterns (if you don't have filenames like ".a" you can leave out the first):

    .[!.]* .??* *

Alternatively you could employ an external program or two and use backquote substitution. This is pretty good:

    `ls -a | sed -e '/^\.$/d' -e '/^\.\.$/d'`

(or `ls -A` in some Unix versions) but even it will mess up on files with newlines, IFS characters or wildcards in their names.

In ksh, you can use:

    .!(.|) *
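To see where each pattern falls short, try them in a directory seeded with awkward names (a sketch; the directory name is made up):

```shell
# Seed a scratch directory with the troublesome cases.
mkdir /tmp/globtest.$$ && cd /tmp/globtest.$$
touch .a ..foo .profile normal

echo .[!.]*     # .a .profile      (misses ..foo)
echo .??*       # ..foo .profile   (misses .a)
echo *          # normal
# Together the three patterns cover everything except "." and "..".

cd / && rm -r /tmp/globtest.$$
```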

2.12) How do I find the last argument in a Bourne shell script?

Answer by: Martin Weitzel <@mikros.systemware.de:martin@mwtech.uucp>
           Maarten Litmaath <maart@nat.vu.nl>

If you are sure the number of arguments is at most 9, you can use:

    eval last=\${$#}

In POSIX-compatible shells it works for ANY number of arguments. The following always works too:

    for last
    do
        :
    done

This can be generalized as follows:

    for i
    do
        third_last=$second_last
        second_last=$last
        last=$i
    done

Now suppose you want to REMOVE the last argument from the list, or REVERSE the argument list, or ACCESS the N-th argument directly, whatever N may be. Here is a basis of how to do it, using only built-in shell constructs, without creating subprocesses:

    t0= u0=
    rest='1 2 3 4 5 6 7 8 9'
    argv=

    for h in "" $rest
    do
        for t in "$t0" $rest
        do
            for u in $u0 $rest
            do
                case $# in
                0)
                    break 3
                esac
                eval argv$h$t$u=\$1
                argv="$argv \"\$argv$h$t$u\""   # (1)
                shift
            done
            u0=0
        done
        t0=0
    done

    # now restore the arguments
    eval set x "$argv"                          # (2)
    shift

This example works for the first 999 arguments. Enough? Take a good look at the lines marked (1) and (2) and convince yourself that the original arguments are restored indeed, no matter what funny characters they contain!

To find the N-th argument now you can use this:

    eval argN=\$argv$N

To reverse the arguments the line marked (1) must be changed to:

    argv="\"\$argv$h$t$u\" $argv"

How to remove the last argument is left as an exercise.

If you allow subprocesses as well, possibly executing non-built-in commands, the 'argvN' variables can be set up more easily:

    N=1
    for i
    do
        eval argv$N=\$i
        N=`expr $N + 1`
    done

To reverse the arguments there is a still simpler method, one that does not even create subprocesses. This approach can also be taken if you want to delete e.g. the last argument, but in that case you cannot refer directly to the N-th argument any more, because the 'argvN' variables are set up in reverse order:

    argv=
    for i
    do
        eval argv$#=\$i
        argv="\"\$argv$#\" $argv"
        shift
    done
    eval set x "$argv"
    shift

2.13) What's wrong with having '.' in your $PATH?

A bit of background: the PATH environment variable is a list of directories separated by colons. When you type a command name without giving an explicit path (e.g. you type "ls", rather than "/bin/ls") your shell searches each directory in the PATH list in order, looking for an executable file by that name, and the shell will run the first matching program it finds.

One of the directories in the PATH list can be the current directory "." . It is also permissible to use an empty directory name in the PATH list to indicate the current directory. Both of these are equivalent.

For csh users:

    setenv PATH :/usr/ucb:/bin:/usr/bin
    setenv PATH .:/usr/ucb:/bin:/usr/bin

For sh or ksh users:

    PATH=:/usr/ucb:/bin:/usr/bin export PATH
    PATH=.:/usr/ucb:/bin:/usr/bin export PATH


Having "." somewhere in the PATH is convenient - you can type "a.out" instead of "./a.out" to run programs in the current directory. But there's a catch.

Consider what happens in the case where "." is the first entry in the PATH. Suppose your current directory is a publically-writable one, such as "/tmp". If there just happens to be a program named "/tmp/ls" left there by some other user, and you type "ls" (intending, of course, to run the normal "/bin/ls" program), your shell will instead run "./ls", the other user's program. Needless to say, the results of running an unknown program like this might surprise you.

It's slightly better to have "." at the end of the PATH:

    setenv PATH /usr/ucb:/bin:/usr/bin:.

Now if you're in /tmp and you type "ls", the shell will search /usr/ucb, /bin and /usr/bin for a program named "ls" before it gets around to looking in ".", and there is less risk of inadvertently running some other user's "ls" program. This isn't 100% secure though - if you're a clumsy typist and some day type "sl -l" instead of "ls -l", you run the risk of running "./sl", if there is one. Some "clever" programmer could anticipate common typing mistakes and leave programs by those names scattered throughout public directories. Beware.

Many seasoned Unix users get by just fine without having "." in the PATH at all:

    setenv PATH /usr/ucb:/bin:/usr/bin

If you do this, you'll need to type "./program" instead of "program" to run programs in the current directory, but the increase in security is probably worth it.
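The danger is easy to demonstrate with a booby-trapped command in a scratch directory (a sketch; the directory name is made up):

```shell
# Plant an impostor "ls" in a directory, as a malicious user might
# in /tmp, then show that a leading "." in PATH runs it.
mkdir /tmp/trap.$$
printf '#!/bin/sh\necho "gotcha - not /bin/ls"\n' > /tmp/trap.$$/ls
chmod +x /tmp/trap.$$/ls

cd /tmp/trap.$$
PATH=.:/bin:/usr/bin sh -c ls   # "." searched first: runs ./ls, the impostor
PATH=/bin:/usr/bin sh -c ls     # no "." in PATH: runs the real /bin/ls

cd /; rm -r /tmp/trap.$$
```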

2.14) How do I ring the terminal bell during a shell script?

The answer depends on your Unix version (or rather on the kind of "echo" program that is available on your machine).

A BSD-like "echo" uses the "-n" option for suppressing the final newline and does not understand the octal \nnn notation. Thus the command is

    echo -n '^G'

where ^G means a literal BEL character (you can produce this in emacs using "Ctrl-Q Ctrl-G" and in vi using "Ctrl-V Ctrl-G").

A SysV-like "echo" understands the \nnn notation and uses \c to suppress the final newline, so the answer is:

    echo '\007\c'
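On systems that have the POSIX "printf" utility (most modern ones do), you can sidestep the echo portability mess entirely:

```shell
# printf understands \nnn octal escapes and appends no newline,
# regardless of which flavour of "echo" the system shipped with:
printf '\007'
```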

2.15) Why can't I use "talk" to talk with my friend on machine X?

Unix has three common "talk" programs, none of which can talk with any of the others. The "old" talk accounts for the first two types. This version (often called otalk) did not take "endian" order into account when talking to other machines. As a consequence, the Vax version of otalk cannot talk with the Sun version of otalk. These versions of talk use port 517.

Around 1987, most vendors (except Sun, who took 6 years longer than any of their competitors) standardized on a new talk (often called ntalk) which knows about network byte order. This talk works between all machines that have it. This version of talk uses port 518.

There are now a few talk programs that speak both ntalk and one version of otalk. The most common of these is called "ytalk".

2.16) Why does calendar produce the wrong output?

Frequently, people find that the Unix calendar program, "cal", produces output that they do not expect. The calendar for September 1752 is very odd:

       September 1752
     S  M Tu  W Th  F  S
           1  2 14 15 16
    17 18 19 20 21 22 23
    24 25 26 27 28 29 30

This is the month in which the US (the entire British Empire, actually) switched from the Julian to the Gregorian calendar.

The other common problem people have with the calendar program is that they pass it arguments like "cal 9 94". This gives the calendar for September of AD 94, NOT 1994.

Unix - Frequently Asked Questions (3) [Frequent posting]

This article includes answers to:

3.1)  How do I find the creation time of a file?
3.2)  How do I use "rsh" without having the rsh hang around until the remote command has completed?
3.3)  How do I truncate a file?
3.4)  Why doesn't find's "{}" symbol do what I want?
3.5)  How do I set the permissions on a symbolic link?
3.6)  How do I "undelete" a file?
3.7)  How can a process detect if it's running in the background?
3.8)  Why doesn't redirecting a loop work as intended? (Bourne shell)
3.9)  How do I run 'passwd', 'ftp', 'telnet', 'tip' and other interactive programs from a shell script or in the background?
3.10) How do I find the process ID of a program with a particular name from inside a shell script or C program?
3.11) How do I check the exit status of a remote command executed via "rsh"?
3.12) Is it possible to pass shell variable settings into an awk program?
3.13) How do I get rid of zombie processes that persevere?
3.14) How do I get lines from a pipe as they are written instead of only in larger blocks?
3.15) How do I get the date into a filename?
3.16) Why do some scripts start with #! ... ?

If you're looking for the answer to, say, question 3.5, and want to skip everything else, you can search ahead for the regular expression "^3.5)".

While these are all legitimate questions, they seem to crop up in comp.unix.questions or comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are correct) and then a period of griping about how the same questions keep coming up. You may also like to read the monthly article "Answers to Frequently Asked Questions" in the newsgroup "news.announce.newusers", which will tell you what "UNIX" stands for.

With the variety of Unix systems in the world, it's hard to guarantee that these answers will work everywhere. Read your local manual pages before trying anything suggested here. If you have suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.

3.1) How do I find the creation time of a file?

You can't - it isn't stored anywhere. Files have a last-modified time (shown by "ls -l"), a last-accessed time (shown by "ls -lu") and an inode change time (shown by "ls -lc"). The latter is often referred to as the "creation time" - even in some man pages - but that's wrong; it's also set by such operations as mv, ln, chmod, chown and chgrp. The man page for "stat(2)" discusses this.

3.2) How do I use "rsh" without having the rsh hang around until the remote command has completed?

(See the note in question 2.7 about which "rsh" we're talking about.)

The obvious answers fail:

    rsh machine command &
or
    rsh machine 'command &'

For instance, try doing

    rsh machine 'sleep 60 &'

and you'll see that the 'rsh' won't exit right away. It will wait 60 seconds until the remote 'sleep' command finishes, even though that command was started in the background on the remote machine. So how do you get the 'rsh' to exit immediately after the 'sleep' is started?

The solution - if you use csh on the remote machine:

    rsh machine -n 'command >&/dev/null </dev/null &'

If you use sh on the remote machine:

    rsh machine -n 'command >/dev/null 2>&1 </dev/null &'

Why? "-n" attaches rsh's stdin to /dev/null so you can run the complete rsh command in the background on the LOCAL machine. Thus "-n" is equivalent to another specific "</dev/null". Furthermore, the input/output redirections on the REMOTE machine (inside the single quotes) ensure that rsh thinks the session can be terminated (there's no data flow any more).

Note: The file that you redirect to/from on the remote machine doesn't have to be /dev/null; any ordinary file will do. In many cases, various parts of these complicated commands aren't necessary.

3.3) How do I truncate a file?

The BSD function ftruncate() sets the length of a file. (But not all versions behave identically.) Other Unix variants all seem to support some version of truncation as well.

For systems which support the ftruncate function, there are three known behaviours:

BSD 4.2 - Ultrix, SGI, LynxOS
    - truncation doesn't grow file
    - truncation doesn't move file pointer

BSD 4.3 - SunOS, Solaris, OSF/1, HP/UX, Amiga
    - truncation can grow file
    - truncation doesn't move file pointer

Cray - UniCOS 7, UniCOS 8
    - truncation doesn't grow file
    - truncation changes file pointer

Other systems come in four varieties:

F_CHSIZE - only SCO
    - some systems define F_CHSIZE but don't support it
    - behaves like BSD 4.3

F_FREESP - only Interactive Unix
    - some systems (e.g. Interactive Unix) define F_FREESP but don't support it
    - behaves like BSD 4.3

chsize() - QNX and SCO
    - some systems (e.g. Interactive Unix) have chsize() but don't support it
    - behaves like BSD 4.3

nothing - no known systems
    - there will be systems that don't support truncate at all
Moderator's Note: I grabbed the functions below a few years back. I can no longer identify the original author. S. Spencer Sun <spencer@ncd.com> has also contributed a version for F_FREESP.

Functions for each non-native ftruncate follow. (The original posting lost the header names in the #include lines; sys/types.h, fcntl.h, sys/stat.h and unistd.h are the obvious candidates and are restored here.)

    /* ftruncate emulations that work on some System V's.
       This file is in the public domain. */

    #include <sys/types.h>
    #include <fcntl.h>

    #ifdef F_CHSIZE
    int
    ftruncate (fd, length)
         int fd;
         off_t length;
    {
      return fcntl (fd, F_CHSIZE, length);
    }
    #else
    #ifdef F_FREESP
    /* The following function was written by
       kucharsk@Solbourne.com (William Kucharski) */

    #include <sys/stat.h>
    #include <unistd.h>

    int
    ftruncate (fd, length)
         int fd;
         off_t length;
    {
      struct flock fl;
      struct stat filebuf;

      if (fstat (fd, &filebuf) < 0)
        return -1;

      if (filebuf.st_size < length)
        {
          /* Extend file length. */
          if (lseek (fd, (length - 1), SEEK_SET) < 0)
            return -1;

          /* Write a "0" byte. */
          if (write (fd, "", 1) != 1)
            return -1;
        }
      else
        {
          /* Truncate length. */
          fl.l_whence = 0;
          fl.l_len = 0;
          fl.l_start = length;
          fl.l_type = F_WRLCK;    /* write lock on file space */

          /* This relies on the UNDOCUMENTED F_FREESP argument to
             fcntl, which truncates the file so that it ends at the
             position indicated by fl.l_start.
             Will minor miracles never cease? */
          if (fcntl (fd, F_FREESP, &fl) < 0)
            return -1;
        }

      return 0;
    }
    #else
    int
    ftruncate (fd, length)
         int fd;
         off_t length;
    {
      return chsize (fd, length);
    }
    #endif
    #endif
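As an aside, the special case of truncating to zero length needs no C at all; from a Bourne-compatible shell, plain output redirection does it (a sketch):

```shell
# Create a file with some content, then truncate it to zero length
# by redirecting the output of the no-op ":" command into it:
echo "some data" > /tmp/file.$$
: > /tmp/file.$$
wc -c < /tmp/file.$$        # reports 0
rm /tmp/file.$$
```

(cp /dev/null file achieves the same thing and works from csh too.)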

3.4) Why doesn't find's "{}" symbol do what I want?

"find" has a -exec option that will execute a particular command on all the selected files. Find will replace any "{}" it sees with the name of the file currently under consideration. So, some day you might try to use "find" to run a command on every file, one directory at a time. You might try this:

    find /path -type d -exec command {}/\* \;

hoping that find will execute, in turn,

    command directory1/*
    command directory2/*
    ...

Unfortunately, find only expands the "{}" token when it appears by itself. Find will leave anything else like "{}/*" alone, so instead of doing what you want, it will do

    command {}/*
    command {}/*
    ...

once for each directory. This might be a bug, it might be a feature, but we're stuck with the current behaviour.

So how do you get around this? One way would be to write a trivial little shell script, let's say "./doit", that consists of

    command "$1"/*

You could then use

    find /path -type d -exec ./doit {} \;

Or if you want to avoid the "./doit" shell script, you can use

    find /path -type d -exec sh -c 'command $0/*' {} \;

(This works because within the 'command' of "sh -c 'command' A B C ...", $0 expands to A, $1 to B, and so on.) Or you can use the construct-a-command-with-sed trick:

    find /path -type d -print | sed 's:.*:command &/*:' | sh

If all you're trying to do is cut down on the number of times that "command" is executed, you should see if your system has the "xargs" command. Xargs reads arguments one line at a time from the standard input and assembles as many of them as will fit into one command line. You could use

    find /path -print | xargs command

which would result in one or more executions of

    command file1 file2 file3 file4 dir1/file1 dir1/file2

Unfortunately this is not a perfectly robust or secure solution. Xargs expects its input lines to be terminated with newlines, so it will be confused by files with odd characters such as newlines in their names.
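You can watch xargs batching arguments with a harmless command like "echo" (a sketch):

```shell
# Five input lines collapse into a single echo invocation:
printf 'a\nb\nc\nd\ne\n' | xargs echo

# With -n you can cap the arguments per invocation and see
# the batching - here echo runs three times:
printf 'a\nb\nc\nd\ne\n' | xargs -n 2 echo
```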

3.5) How do I set the permissions on a symbolic link?

Permissions on a symbolic link don't really mean anything. The only permissions that count are the permissions on the file that the link points to.

3.6) How do I "undelete" a file?

Someday, you are going to accidentally type something like "rm * .foo" and find you just deleted "*" instead of "*.foo". Consider it a rite of passage.

Of course, any decent systems administrator should be doing regular backups. Check with your sysadmin to see if a recent backup copy of your file is available. But if it isn't, read on.

For all intents and purposes, when you delete a file with "rm" it is gone. Once you "rm" a file, the system totally forgets which blocks scattered around the disk were part of your file. Even worse, the blocks from the file you just deleted are going to be the first ones taken and scribbled upon when the system needs more disk space. However, never say never. It is theoretically possible, if you shut down the system immediately after the "rm", to recover portions of the data. However, you had better have a very wizardly type person at hand with hours or days to spare to get it all back.

Your first reaction when you "rm" a file by mistake is: why not make a shell alias or procedure which changes "rm" to move files into a trash bin rather than delete them? That way you can recover them if you make a mistake, and periodically clean out your trash bin. Two points: first, this is generally accepted as a bad idea. You will become dependent upon this behaviour of "rm", and you will find yourself someday on a normal system where "rm" is really "rm", and you will get yourself in trouble. Second, you will eventually find that the hassle of dealing with the disk space and time involved in maintaining the trash bin makes it easier just to be a bit more careful with "rm". For starters, you should look up the "-i" option to "rm" in your manual.

If you are still undaunted, then here is a possible simple answer. You can create yourself a "can" command which moves files into a trashcan directory. In csh(1) you can place the following commands in the ".login" file in your home directory:

    alias can   'mv \!* ~/.trashcan'            # junk file(s) to trashcan
    alias mtcan 'rm -f ~/.trashcan/*'           # irretrievably empty trash
    if ( ! -d ~/.trashcan ) mkdir ~/.trashcan   # ensure trashcan exists

You might also want to put a

    rm -f ~/.trashcan/*

in the ".logout" file in your home directory to automatically empty the trash when you log out. (sh and ksh versions are left as an exercise for the reader.)

MIT's Project Athena has produced a comprehensive delete/undelete/expunge/purge package, which can serve as a complete replacement for rm and which allows file recovery. This package was posted to comp.sources.misc (volume 17, issues 023-026).

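For sh/ksh users, one way to do the exercise might look like this (a sketch; the function names mirror the csh aliases above and "mkdir -p" assumes a reasonably modern system):

```shell
# sh/ksh versions of the trashcan commands; put them in .profile.
can () {
    mkdir -p "$HOME/.trashcan"      # ensure trashcan exists
    mv "$@" "$HOME/.trashcan"       # junk file(s) to trashcan
}

mtcan () {
    rm -f "$HOME/.trashcan"/*       # irretrievably empty trash
}
```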
3.7) How can a process detect if it's running in the background?

First of all: do you want to know if you're running in the background, or if you're running interactively? If you're deciding whether or not you should print prompts and the like, that's probably a better criterion. Check if standard input is a terminal:

    sh: if [ -t 0 ]; then ... fi
    C:  if (isatty(0)) { ... }

In general, you can't tell if you're running in the background. The fundamental problem is that different shells and different versions of UNIX have different notions of what "foreground" and "background" mean - and on the most common type of system with a better-defined notion of what they mean, programs can be moved arbitrarily between foreground and background!

UNIX systems without job control typically put a process into the background by ignoring SIGINT and SIGQUIT and redirecting the standard input to "/dev/null"; this is done by the shell. Shells that support job control, on UNIX systems that support job control, put a process into the background by giving it a process group ID different from the process group to which the terminal belongs. They move it back into the foreground by setting the terminal's process group ID to that of the process. Shells that do not support job control, on UNIX systems that support job control, typically do what shells do on systems that don't support job control.
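The "is stdin a terminal" test is easy to try out; with stdin redirected (the way cron or an old-style background job would see it), the test reliably fails (a sketch):

```shell
# Run the same check twice: once with whatever stdin the script
# inherited, once with stdin redirected from /dev/null.
sh -c 'test -t 0 && echo "stdin is a tty" || echo "not a tty"'
sh -c 'test -t 0 && echo "stdin is a tty" || echo "not a tty"' </dev/null
# The second invocation always prints "not a tty".
```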

3.8) Why doesn’t redirecting a loop work as intended? (Bourne shell) Take the following example: foo=bar while read line do # do something with $line foo=bletch done < /etc/passwd echo “foo is now: $foo” Despite the assignment “foo=bletch” this will print “foo is now: bar” in many implementations of the Bourne shell. Why? Because of the following, often undocumented, feature of historic Bourne shells: redirecting a control structure (such as a loop, or an “if” statement) causes a subshell to be created, in which the structure is executed; variables set in that subshell (like the “foo=bletch” assignment) don’t affect the current shell, of course. The POSIX 1003.2 Shell and Tools Interface standardization committee forbids the behaviour described above, i.e. in P1003.2 conformant Bourne shells the example will print “foo is now: bletch”. In historic (and P1003.2 conformant) implementations you can use the following ‘trick’ to get around the redirection problem: foo=bar # make file descriptor 9 a duplicate of file descriptor 0 (stdin); # then connect stdin to /etc/passwd; the original stdin is now # ‘remembered’ in file descriptor 9; see dup(2) and sh(1) exec 9<&0 < /etc/passwd while read line do # do something with $line foo=bletch done # make stdin a duplicate of file descriptor 9, i.e. reconnect # it to the original stdin; then close file descriptor 9 exec 0<&9 9<&-


    echo "foo is now: $foo"

This should always print "foo is now: bletch".

Right, take the next example:

    foo=bar
    echo bletch | read foo
    echo "foo is now: $foo"

This will print "foo is now: bar" in many implementations, "foo is now: bletch" in some others. Why? Generally each part of a pipeline is run in a different subshell; in some implementations though, the last command in the pipeline is made an exception: if it is a builtin command like "read", the current shell will execute it, else another subshell is created. POSIX 1003.2 allows both behaviours so portable scripts cannot depend on any of them.
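For the pipeline case, one workaround that does not depend on either behaviour is to feed "read" from a here-document instead of a pipe: redirecting a simple command does not create a subshell, so the assignment survives. A minimal sketch:

```shell
# "read" gets its input from a redirection on the command itself,
# so no subshell is involved and the assignment is kept:
foo=bar
read foo <<EOF
bletch
EOF
echo "foo is now: $foo"
```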

3.9) How do I run 'passwd', 'ftp', 'telnet', 'tip' and other interactive programs from a shell script or in the background?

These programs expect a terminal interface, and shells make no special provisions to provide one. Hence, such programs cannot be automated in shell scripts.

The 'expect' program provides a programmable terminal interface for automating interaction with such programs. The following expect script is an example of a non-interactive version of passwd(1):

    # username is passed as 1st arg, password as 2nd
    set password [index $argv 2]
    spawn passwd [index $argv 1]
    expect "*password:"
    send "$password\r"
    expect "*password:"
    send "$password\r"
    expect eof

expect can partially automate interaction, which is especially useful for telnet, rlogin, debuggers or other programs that have no built-in command language. The distribution provides an example script to rerun rogue until a good starting configuration appears; then control is given back to the user to enjoy the game.

Fortunately some programs have been written to manage the connection to a pseudo-tty so that you can run these sorts of programs in a script. To get expect, email "send pub/expect/expect.shar.Z" to library@cme.nist.gov or anonymous ftp same from ftp.cme.nist.gov.

Another solution is provided by the pty 4.0 program, which runs a program under a pseudo-tty session and was posted to comp.sources.unix, volume 25. A pty-based solution using named pipes to do the same as the above might look like this:

    #!/bin/sh
    /etc/mknod out.$$ p; exec 2>&1


    ( exec 4<out.$$; rm -f out.$$
    <&4 waitfor 'password:'
        echo "$2"
    <&4 waitfor 'password:'
        echo "$2"
    <&4 cat >/dev/null
    ) | ( pty passwd "$1" >out.$$ )

Here, 'waitfor' is a simple C program that searches for its argument in the input, character by character. A simpler pty solution (which has the drawback of not synchronizing properly with the passwd program) is:

    #!/bin/sh
    ( sleep 5; echo "$2"; sleep 5; echo "$2" ) | pty passwd "$1"

3.10) How do I find the process ID of a program with a particular name from inside a shell script or C program?

In a shell script: There is no utility specifically designed to map between program names and process IDs. Furthermore, such mappings are often unreliable, since it's possible for more than one process to have the same name, and since it's possible for a process to change its name once it starts running. However, a pipeline like this can often be used to get a list of processes (owned by you) with a particular name:

    ps ux | awk '/name/ && !/awk/ {print $2}'

You replace "name" with the name of the process for which you are searching. The general idea is to parse the output of ps, using awk or grep or other utilities, to search for the lines with the specified name on them, and print the PIDs for those lines. Note that the "!/awk/" above prevents the awk process itself from being listed. You may have to change the arguments to ps, depending on what kind of Unix you are using.

In a C program: Just as there is no utility specifically designed to map between program names and process IDs, there are no (portable) C library functions to do it either. However, some vendors provide functions for reading kernel memory; for example, Sun provides the "kvm_" functions, and Data General provides the "dg_" functions. It may be possible for any user to use these, or they may only be useable by the super-user (or a user in group "kmem") if read-access to kernel memory on your system is restricted. Furthermore, these functions are often not documented or documented badly, and might change from release to release.

Some vendors provide a "/proc" filesystem, which appears as a directory with a bunch of filenames in it. Each filename is a number, corresponding to a process ID, and you can open
the file and read it to get information about the process. Once again, access to this may be restricted, and the interface to it may change from system to system.

If you can't use vendor-specific library functions, and you don't have /proc, and you still want to do this completely in C, you are going to have to do the rummaging through kernel memory yourself. For a good example of how to do this on many systems, see the sources to "ofiles", available in the comp.sources.unix archives. (A package named "kstuff" to help with kernel rummaging was posted to alt.sources in May 1991 and is also available via anonymous ftp as usenet/alt.sources/articles/{329{6,7,8,9},330{0,1}}.Z from wuarchive.wustl.edu.)
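As a concrete illustration of the /proc approach from a script rather than C: on systems with a Linux-style /proc, where each process directory contains a "comm" file naming the command, a rough sketch might look like this (the function name is invented for this example, and the /proc layout varies between systems):

```shell
# pids_named: print the PIDs of all processes whose command name
# matches $1, by scanning a Linux-style /proc.
pids_named() {
    for d in /proc/[0-9]*; do
        [ -r "$d/comm" ] || continue
        # the comm file holds the process's command name
        if [ "$(cat "$d/comm" 2>/dev/null)" = "$1" ]; then
            echo "${d#/proc/}"
        fi
    done
}
```

Like the ps pipeline above, this can report several PIDs for one name, and it races with processes starting and exiting while the scan runs.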

3.11) How do I check the exit status of a remote command executed via "rsh"?

This doesn't work:

    rsh some-machine some-crummy-command || echo "Command failed"

The exit status of 'rsh' is 0 (success) if the rsh program itself completed successfully, which probably isn't what you wanted. If you want to check on the exit status of the remote program, you can try using Maarten Litmaath's 'ersh' script, which was posted to alt.sources in October 1994. ersh is a shell script that calls rsh, arranges for the remote machine to echo the status of the command after it completes, and exits with that status.
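ersh itself is not reproduced here, but the idea behind it can be sketched as follows. The function name and the MARKER string are invented for this illustration, and it assumes the remote account's shell is Bourne-compatible; a real implementation must pick a marker that cannot appear in the command's own output.

```shell
# remote_status: run a command via rsh and exit with the REMOTE
# command's status. The remote side appends a marker line carrying
# its exit status; the local side strips it back off.
remote_status() {
    host=$1; shift
    out=$(rsh "$host" "$*; echo MARKER:\$?")
    status=${out##*MARKER:}          # text after the final marker
    printf '%s' "${out%MARKER:*}"    # everything before the marker
    return "$status"
}
```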

3.12) Is it possible to pass shell variable settings into an awk program?

There are two different ways to do this. The first involves simply expanding the variable where it is needed in the program. For example, to get a list of all ttys you're using:

    who | awk '/^'"$USER"'/ { print $2 }'     (1)

Single quotes are usually used to enclose awk programs because the character '$' is often used in them, and '$' will be interpreted by the shell if enclosed inside double quotes, but not if enclosed inside single quotes. In this case, we want the '$' in "$USER" to be interpreted by the shell, so we close the single quotes and then put the "$USER" inside double quotes. Note that there are no spaces in any of that, so the shell will see it all as one argument. Note, further, that the double quotes probably aren't necessary in this particular case (i.e. we could have done

    who | awk '/^'$USER'/ { print $2 }'       (2)

), but they should be included nevertheless because they are necessary when the shell variable in question contains special characters or spaces.


The second way to pass variable settings into awk is to use an often undocumented feature of awk which allows variable settings to be specified as "fake file names" on the command line. For example:

    who | awk '$1 == user { print $2 }' user="$USER" -     (3)

Variable settings take effect when they are encountered on the command line, so, for example, you could instruct awk on how to behave for different files using this technique. For example:

    awk '{ program that depends on s }' s=1 file1 s=0 file2     (4)

Note that some versions of awk will cause variable settings encountered before any real filenames to take effect before the BEGIN block is executed, but some won't, so neither way should be relied upon. Note, further, that when you specify a variable setting, awk won't automatically read from stdin if no real files are specified, so you need to add a "-" argument to the end of your command, as I did at (3) above.

A third option is to use a newer version of awk (nawk), which allows direct access to environment variables. E.g.:

    nawk 'END { print "Your path variable is " ENVIRON["PATH"] }' /dev/null
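For completeness: later awks (nawk, gawk, and anything POSIX-conformant) also provide a documented way to do this, the -v option, which sets the variable before even the BEGIN block runs and so avoids the ordering ambiguity of the "fake file name" form:

```shell
# -v assignments take effect before BEGIN; the sample input here
# just stands in for the output of who:
printf 'alice tty1\nbob tty2\n' |
awk -v user=alice '$1 == user { print $2 }'
```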

3.13) How do I get rid of zombie processes that persevere?

Unfortunately, it's impossible to generalize how the death of child processes should be handled, because the exact mechanism varies over the various flavors of Unix.

First of all, by default, you have to do a wait() for child processes under ALL flavors of Unix. That is, there is no flavor of Unix that I know of that will automatically flush child processes that exit, even if you don't do anything to tell it to do so.

Second, under some SysV-derived systems, if you do "signal(SIGCHLD, SIG_IGN)" (well, actually, it may be SIGCLD instead of SIGCHLD, but most of the newer SysV systems have "#define SIGCHLD SIGCLD" in the header files), then child processes will be cleaned up automatically, with no further effort on your part. The best way to find out if it works at your site is to try it, although if you are trying to write portable code, it's a bad idea to rely on this in any case. Unfortunately, POSIX doesn't allow you to do this; the behavior of setting SIGCHLD to SIG_IGN under POSIX is undefined, so you can't do it if your program is supposed to be POSIX-compliant.

So, what's the POSIX way? As mentioned earlier, you must install a signal handler and wait. Under POSIX, signal handlers are installed with sigaction. Since you are not interested in "stopped" children, only in terminated children, add SA_NOCLDSTOP to sa_flags. Waiting without blocking is done with waitpid(). The first argument to waitpid should be -1 (wait for any pid), the third should be WNOHANG. This is the most portable way and is likely to become more portable in future.

If your system doesn't support POSIX, there are a number of ways.


The easiest way is signal(SIGCHLD, SIG_IGN), if it works. If SIG_IGN cannot be used to force automatic clean-up, then you've got to write a signal handler to do it. It isn't easy at all to write a signal handler that does things right on all flavors of Unix, because of the following inconsistencies:

On some flavors of Unix, the SIGCHLD signal handler is called if one or more children have died. This means that if your signal handler only does one wait() call, then it won't clean up all of the children. Fortunately, I believe that all Unix flavors for which this is the case have available to the programmer the wait3() or waitpid() call, which allows the WNOHANG option to check whether or not there are any children waiting to be cleaned up. Therefore, on any system that has wait3()/waitpid(), your signal handler should call wait3()/waitpid() over and over again with the WNOHANG option until there are no children left to clean up. Waitpid() is the preferred interface, as it is in POSIX.

On SysV-derived systems, SIGCHLD signals are regenerated if there are child processes still waiting to be cleaned up after you exit the SIGCHLD signal handler. Therefore, it's safe on most SysV systems to assume when the signal handler gets called that you only have to clean up one signal, and assume that the handler will get called again if there are more to clean up after it exits.

On older systems, there is no way to prevent signal handlers from being automatically reset to SIG_DFL when the signal handler gets called. On such systems, you have to put "signal(SIGCHLD, catcher_func)" (where "catcher_func" is the name of the handler function) as the last thing in the signal handler, so that it gets reset. Fortunately, newer implementations allow signal handlers to be installed without being reset to SIG_DFL when the handler function is called.

To get around this problem, on systems that do not have wait3()/waitpid() but do have SIGCLD, you need to reset the signal handler with a call to signal() after doing at least one wait() within the handler, each time it is called. For backward compatibility reasons, System V will keep the old semantics (reset handler on call) of signal(). Signal handlers that stick can be installed with sigaction() or sigset().

The summary of all this is that on systems that have waitpid() (POSIX) or wait3(), you should use that and your signal handler should loop, and on systems that don't, you should have one call to wait() per invocation of the signal handler.

One more thing: if you don't want to go through all of this trouble, there is a portable way to avoid this problem, although it is somewhat less efficient. Your parent process should fork, and then wait right there and then for the child process to terminate. The child process then forks again, giving you a child and a grandchild. The child exits immediately (and hence the parent waiting for it notices its death and continues to work), and the grandchild does whatever the child was originally supposed to. Since its parent died, it is inherited by init, which will do whatever waiting is needed. This method is inefficient because it requires an extra fork, but is pretty much completely portable.

3.14) How do I get lines from a pipe as they are written instead of only in larger blocks? The stdio library does buffering differently depending on whether it thinks it’s running on a tty. If it thinks it’s on a tty, it does buffering on a per-line basis; if not, it uses a larger buffer than one line.


If you have the source code to the client whose buffering you want to disable, you can use setbuf() or setvbuf() to change the buffering. If not, the best you can do is try to convince the program that it’s running on a tty by running it under a pty, e.g. by using the “pty” program mentioned in question 3.9.
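On systems with GNU coreutils there is a third option besides modifying the source and using a pty: the stdbuf utility adjusts the stdio buffering of the program it launches. This is not portable to all Unices, and it only helps programs that use stdio and don't set their own buffering, but where it exists it is the least effort:

```shell
# -oL forces line-buffered stdout on the middle command, so each
# line reaches the consumer as soon as it is written; tr and cat
# here are just placeholders for a real producer/consumer pair:
printf 'foo\nbar\n' | stdbuf -oL tr a-z A-Z | cat
```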

3.15) How do I get the date into a filename?

This isn't hard, but it is a bit cryptic at first sight. Let's begin with the date command itself: date can take a formatting string, to modify the way in which the date info is printed. The formatting string has to be enclosed in quotes, to stop the shell trying to interpret it before the date command itself gets it. Try this:

    date '+%d%m%y'

you should get back something like 130994. If you want to punctuate this, just put the characters you would like to use in the formatting string (NO SLASHES '/'):

    date '+%d.%m.%y'

There are lots of tokens you can use in the formatting string: have a look at the man page for date to find out about them.

Now, getting this into a file name. Let's say that we want to create files called report.130994 (or whatever the date is today):

    FILENAME=report.`date '+%d%m%y'`

Notice that we are using two sets of quotes here: the inner set are to protect the formatting string from premature interpretation; the outer set are to tell the shell to execute the enclosed command, and substitute the result into the expression (command substitution).
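In POSIX shells the same thing can be written with $(...) command substitution, which is easier to read than backquotes and nests cleanly (the file name is just an example):

```shell
# $(...) does the same command substitution as backquotes, without
# the fiddly quote pairing:
FILENAME="report.$(date '+%d%m%y')"
echo "$FILENAME"
```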

3.16) Why do some scripts start with #! ... ?

Chip Rosenthal has answered a closely related question in comp.unix.xenix in the past. I think what confuses people is that there exist two different mechanisms, both spelled with the letter '#'. They both solve the same problem over a very restricted set of cases, but they are none the less different.

Some background. When the UNIX kernel goes to run a program (one of the exec() family of system calls), it takes a peek at the first 16 bits of the file. Those 16 bits are called a 'magic number'. First, the magic number prevents the kernel from doing something silly like trying to execute your customer database file. If the kernel does not recognize the magic number then it complains with an ENOEXEC error. It will execute the program only if the magic number is recognizable.

Second, as time went on and different executable file formats were introduced, the magic number not only told the kernel if it could execute the file, but also how to execute the file. For example, if you compile a program on an SCO XENIX/386 system and carry the binary over to a SysV/386 UNIX system, the kernel will recognize the magic number and say 'Aha! This is an x.out binary!' and configure itself to run with XENIX compatible system calls.


Note that the kernel can only run binary executable images. So how, you might ask, do scripts get run? After all, I can type 'my.script' at a shell prompt and I don't get an ENOEXEC error. Script execution is done not by the kernel, but by the shell. The code in the shell might look something like:

    /* try to run the program */
    execl(program, basename(program), (char *)0);

    /* the exec failed - maybe it is a shell script? */
    if (errno == ENOEXEC)
        execl("/bin/sh", "sh", "-c", program, (char *)0);

    /* oh no mr bill!! */
    perror(program);
    return -1;

(This example is highly simplified. There is a lot more involved, but this illustrates the point I'm trying to make.) If execl() is successful in starting the program then the code beyond the execl() is never executed. In this example, if we can execl() the 'program' then none of the stuff beyond it is run. Instead the system is off running the binary 'program'. If, however, the first execl() failed then this hypothetical shell looks at why it failed. If the execl() failed because 'program' was not recognized as a binary executable, then the shell tries to run it as a shell script.

The Berkeley folks had a neat idea to extend how the kernel starts up programs. They hacked the kernel to recognize the magic number '#!'. (Magic numbers are 16-bits and two 8-bit characters makes 16 bits, right?) When the '#!' magic number was recognized, the kernel would read in the rest of the line and treat it as a command to run upon the contents of the file. With this hack you could now do things like:

    #! /bin/sh
    #! /bin/csh
    #! /bin/awk -F:

This hack has existed solely in the Berkeley world, and has migrated to USG kernels as part of System V Release 4. Prior to V.4, unless the vendor did some special value added, the kernel does not have the capability of doing anything other than loading and starting a binary executable image.
Now, let's rewind a few years, to the time when more and more folks running USG-based unices were saying '/bin/sh sucks as an interactive user interface! I want csh!'. Several vendors did some value added magic and put csh in their distribution, even though csh was not a part of the USG UNIX distribution. This, however, presented a problem. Let's say you switch your login shell to /bin/csh. Let's further suppose that you are a cretin and insist upon programming csh scripts. You'd certainly want to be able to type 'my.script' and get it run, even though it is a csh script. Instead of pumping it through /bin/sh, you want the script to be started by running:

    execl("/bin/csh", "csh", "-c", "my.script", (char *)0);


But what about all those existing scripts, some of which are part of the system distribution? If they started getting run by csh then things would break. So you needed a way to run some scripts through csh, and others through sh. The solution introduced was to hack csh to take a look at the first character of the script you are trying to run. If it was a '#' then csh would try to run the script through /bin/csh, otherwise it would run the script through /bin/sh. The example code from the above might now look something like:

    /* try to run the program */
    execl(program, basename(program), (char *)0);

    /* the exec failed - maybe it is a shell script? */
    if (errno == ENOEXEC && (fp = fopen(program, "r")) != NULL) {
        i = getc(fp);
        (void) fclose(fp);
        if (i == '#')
            execl("/bin/csh", "csh", "-c", program, (char *)0);
        else
            execl("/bin/sh", "sh", "-c", program, (char *)0);
    }

    /* oh no mr bill!! */
    perror(program);
    return -1;

Two important points. First, this is a 'csh' hack. Nothing has been changed in the kernel and nothing has been changed in the other shells. If you try to execl() a script, whether or not it begins with '#', you will still get an ENOEXEC failure. If you try to run a script beginning with '#' from something other than csh (e.g. /bin/sh), then it will be run by sh and not csh. Second, the magic is that either the script begins with '#' or it doesn't begin with '#'. What makes stuff like ':' and ': /bin/sh' at the front of a script magic is the simple fact that they are not '#'. Therefore, all of the following are identical at the start of a script:

    :
    : /bin/sh
                        <--- a blank line
    : /usr/games/rogue
    echo "Gee...I wonder what shell I am running under???"

In all these cases, all shells will try to run the script with /bin/sh. Similarly, all of the following are identical at the start of a script:

    #
    # /bin/csh
    #! /bin/csh
    #! /bin/sh
    # Gee...I wonder what shell I am running under???

All of these start with a '#'. This means that the script will be run by csh only if you try to start it from csh, otherwise it will be run by /bin/sh. (Note: if you are running ksh, substitute 'ksh' for 'sh' in the above. The Korn shell is theoretically compatible with the Bourne shell, so it tries to run these scripts itself. Your mileage may vary on some of the other available shells such as zsh, bash, etc.)

Obviously, if you've got support for '#!' in the kernel then the '#' hack becomes superfluous. In fact, it can be dangerous because it creates confusion over what should happen with '#! /bin/sh'.

The '#!' handling is becoming more and more prevalent. System V Release 4 picks up a number of the Berkeley features, including this. Some System V Release 3.2 vendors are hacking in some of the more visible V.4 features such as this and trying to convince you this is sufficient and you don't need things like real, working streams or dynamically adjustable kernel parameters.

XENIX does not support '#!'. The XENIX /bin/csh does have the '#' hack. Support for '#!' in XENIX would be nice, but I wouldn't hold my breath waiting for it.
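The kernel rearrangement described above is easy to watch in action. A throwaway demonstration, assuming a kernel with '#!' support and a writable /tmp (the file name is arbitrary):

```shell
# Create an "executable sed script"; when we run it, the kernel
# execs: /bin/sed -f /tmp/hashbang.$$ (plus any arguments).
cat > /tmp/hashbang.$$ <<'EOF'
#!/bin/sed -f
s/hello/goodbye/
EOF
chmod +x /tmp/hashbang.$$
echo hello | /tmp/hashbang.$$     # prints: goodbye
rm -f /tmp/hashbang.$$
```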

Unix - Frequently Asked Questions (4) [Frequent posting]
This article includes answers to:

4.1) How do I read characters from a terminal without requiring the user to hit RETURN?

4.2) How do I check to see if there are characters to be read without actually reading?
4.3) How do I find the name of an open file?
4.4) How can an executing program determine its own pathname?
4.5) How do I use popen() to open a process for reading AND writing?
4.6) How do I sleep() in a C program for less than one second?
4.7) How can I get setuid shell scripts to work?

4.8) How can I find out which user or process has a file open or is using a particular file system (so that I can unmount it?)
4.9) How do I keep track of people who are fingering me?
4.10) Is it possible to reconnect a process to a terminal after it has been disconnected, e.g. after starting a program in the background and logging out?

4.11) Is it possible to “spy” on a terminal, displaying the output that’s appearing on it on another terminal?


If you're looking for the answer to, say, question 4.5, and want to skip everything else, you can search ahead for the regular expression "^4.5)".

While these are all legitimate questions, they seem to crop up in comp.unix.questions or comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are correct) and then a period of griping about how the same questions keep coming up. You may also like to read the monthly article "Answers to Frequently Asked Questions" in the newsgroup "news.announce.newusers", which will tell you what "UNIX" stands for.

With the variety of Unix systems in the world, it's hard to guarantee that these answers will work everywhere. Read your local manual pages before trying anything suggested here. If you have suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.

4.1) How do I read characters from a terminal without requiring the user to hit RETURN?

Check out cbreak mode in BSD, ~ICANON mode in SysV. If you don't want to tackle setting the terminal parameters yourself (using the "ioctl(2)" system call) you can let the stty program do the work - but this is slow and inefficient, and you should change the code to do it right some time:

    #include <stdio.h>
    main()
    {
        int c;

        printf("Hit any character to continue\n");
        /*
         * ioctl() would be better here; only lazy
         * programmers do it this way:
         */
        system("/bin/stty cbreak");        /* or "stty raw" */
        c = getchar();
        system("/bin/stty -cbreak");
        printf("Thank you for typing %c.\n", c);
        exit(0);
    }

Several people have sent me various more correct solutions to this problem. I’m sorry that I’m not including any of them here, because they really are beyond the scope of this list. You might like to check out the documentation for the “curses” library of portable screen functions. Often if you’re interested in single-character I/O like this, you’re also interested in doing some sort of screen display control, and the curses library provides various portable routines for both functions.

4.2) How do I check to see if there are characters to be read without actually reading?


Certain versions of UNIX provide ways to check whether characters are currently available to be read from a file descriptor. In BSD, you can use select(2). You can also use the FIONREAD ioctl, which returns the number of characters waiting to be read, but only works on terminals, pipes and sockets. In System V Release 3, you can use poll(2), but that only works on streams. In Xenix (and therefore Unix SysV r3.2 and later) the rdchk() system call reports whether a read() call on a given file descriptor will block.

There is no way to check whether characters are available to be read from a FILE pointer. (You could poke around inside stdio data structures to see if the input buffer is nonempty, but that wouldn't work since you'd have no way of knowing what will happen the next time you try to fill the buffer.)

Sometimes people ask this question with the intention of writing

    if (characters available from fd)
        read(fd, buf, sizeof buf);

in order to get the effect of a nonblocking read. This is not the best way to do this, because it is possible that characters will be available when you test for availability, but will no longer be available when you call read. Instead, set the O_NDELAY flag (which is also called FNDELAY under BSD) using the F_SETFL option of fcntl(2). Older systems (Version 7, 4.1 BSD) don't have O_NDELAY; on these systems the closest you can get to a nonblocking read is to use alarm(2) to time out the read.

4.3) How do I find the name of an open file? In general, this is too difficult. The file descriptor may be attached to a pipe or pty, in which case it has no name. It may be attached to a file that has been removed. It may have multiple names, due to either hard or symbolic links. If you really need to do this, and be sure you think long and hard about it and have decided that you have no choice, you can use find with the -inum and possibly -xdev option, or you can use ncheck, or you can recreate the functionality of one of these within your program. Just realize that searching a 600 megabyte filesystem for a file that may not even exist is going to take some time.

4.4) How can an executing program determine its own pathname? Your program can look at argv[0]; if it begins with a “/”, it is probably the absolute pathname to your program, otherwise your program can look at every directory named in the environment variable PATH and try to find the first one that contains an executable file whose name matches your program’s argv[0] (which by convention is the name of the file being executed). By concatenating that directory and the value of argv[0] you’d probably have the right name. You can’t really be sure though, since it is quite legal for one program to exec() another with any value of argv[0] it desires. It is merely a convention that new programs are exec’d with the executable file name in argv[0].


For instance, purely a hypothetical example:

    #include <stdio.h>
    main()
    {
        execl("/usr/games/rogue", "vi Thesis", (char *)NULL);
    }

The executed program thinks its name (its argv[0] value) is "vi Thesis". (Certain other programs might also think that the name of the program you're currently running is "vi Thesis", but of course this is just a hypothetical example, don't try it yourself :-)
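From a shell script, the PATH-searching half of this technique is already built in: the POSIX 'command -v' prints the pathname that would be executed for a given name. It is, of course, still subject to the same caveat that argv[0] may be a lie:

```shell
# Resolve a command name through $PATH, the same search the answer
# above describes doing by hand against argv[0]:
command -v ls
```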

4.5) How do I use popen() to open a process for reading AND writing? The problem with trying to pipe both input and output to an arbitrary slave process is that deadlock can occur, if both processes are waiting for not-yet-generated input at the same time. Deadlock can be avoided only by having BOTH sides follow a strict deadlock-free protocol, but since that requires cooperation from the processes it is inappropriate for a popen()-like library function. The ‘expect’ distribution includes a library of functions that a C programmer can call directly. One of the functions does the equivalent of a popen for both reading and writing. It uses ptys rather than pipes, and has no deadlock problem. It’s portable to both BSD and SV. See question 3.9 for more about ‘expect’.

4.6) How do I sleep() in a C program for less than one second?

The first thing you need to be aware of is that all you can specify is a MINIMUM amount of delay; the actual delay will depend on scheduling issues such as system load, and could be arbitrarily large if you're unlucky. There is no standard library function that you can count on in all environments for "napping" (the usual name for short sleeps). Some environments supply a "usleep(n)" function which suspends execution for n microseconds. If your environment doesn't support usleep(), here are a couple of implementations for BSD and System V environments.

The following code is adapted from Doug Gwyn's System V emulation support for 4BSD and exploits the 4BSD select() system call. Doug originally called it 'nap()'; you probably want to call it "usleep()":

    /*
        usleep - support routine for 4.2BSD system call emulations
        last edit:  29-Oct-1984     D A Gwyn
    */

    extern int  select();

    int
    usleep( usec )              /* returns 0 if ok, else -1 */
        long    usec;           /* delay in microseconds */
    {
        static struct           /* 'timeval' */
        {
            long    tv_sec;     /* seconds */
            long    tv_usec;    /* microsecs */
        } delay;                /* _select() timeout */

        delay.tv_sec = usec / 1000000L;
        delay.tv_usec = usec % 1000000L;

        return select( 0, (long *)0, (long *)0, (long *)0, &delay );
    }

On System V you might do it this way:
    /*
        subseconds sleeps for System V - or anything that has poll()
        Don Libes, 4/1/1991

        The BSD analog to this function is defined in terms of
        microseconds while poll() is defined in terms of milliseconds.
        For compatibility, this function provides accuracy "over the
        long run" by truncating actual requests to milliseconds and
        accumulating microseconds across calls with the idea that you
        are probably calling it in a tight loop, and that over the long
        run, the error will even out.

        If you aren't calling it in a tight loop, then you almost
        certainly aren't making microsecond-resolution requests anyway,
        in which case you don't care about microseconds.  And if you
        did, you wouldn't be using UNIX anyway because random system
        indigestion (i.e., scheduling) can make mincemeat out of any
        timing code.

        Returns 0 if successful timeout, -1 if unsuccessful.
    */

    #include <poll.h>

    int
    usleep(usec)
    unsigned int usec;          /* microseconds */
    {
        static subtotal = 0;    /* microseconds */
        int msec;               /* milliseconds */

        /* 'foo' is only here because some versions of 5.3 have
         * a bug where the first argument to poll() is checked
         * for a valid memory address even if the second argument
         * is 0.
         */
        struct pollfd foo;

        subtotal += usec;
        /* if less than 1 msec request, do nothing but remember it */
        if (subtotal < 1000) return(0);
        msec = subtotal/1000;
        subtotal = subtotal%1000;
        return poll(&foo, (unsigned long)0, msec);
    }

Another possibility for nap()ing on System V, and probably other non-BSD Unices, is Jon Zeeff's s5nap package, posted to comp.sources.misc, volume 4. It does require installing a device driver, but works flawlessly once installed. (Its resolution is limited to the kernel HZ value, since it uses the kernel delay() routine.) Many newer versions of Unix have a nanosleep function.
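From a shell script, as opposed to a C program, note that many modern sleep(1) implementations (GNU coreutils and the BSDs among them) accept fractional seconds, although POSIX does not require this:

```shell
# Not portable to all systems, but widely supported:
sleep 0.25
```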

4.7) How can I get setuid shell scripts to work?

[ This is a long answer, but it's a complicated and frequently-asked question. Thanks to Maarten Litmaath for this answer, and for the "indir" program mentioned below. ]

Let us first assume you are on a UNIX variant (e.g. 4.3BSD or SunOS) that knows about so-called 'executable shell scripts'. Such a script must start with a line like:

    #!/bin/sh

The script is called 'executable' because just like a real (binary) executable it starts with a so-called 'magic number' indicating the type of the executable. In our case this number is '#!' and the OS takes the rest of the first line as the interpreter for the script, possibly followed by 1 initial option like:

    #!/bin/sed -f

Suppose this script is called 'foo' and is found in /bin, then if you type:

    foo arg1 arg2 arg3

the OS will rearrange things as though you had typed:

    /bin/sed -f /bin/foo arg1 arg2 arg3

There is one difference though: if the setuid permission bit for 'foo' is set, it will be honored in the first form of the command; if you really type the second form, the OS will honor the permission bits of /bin/sed, which is not setuid, of course.

OK, but what if my shell script does NOT start with such a '#!' line, or my OS does not know about it?

Well, if the shell (or anybody else) tries to execute it, the OS will return an error indication, as the file does not start with a valid magic number. Upon receiving this indication the shell ASSUMES the file to be a shell script and gives it another try:

    /bin/sh shell_script arguments


But we have already seen that a setuid bit on ‘shell_script’ will NOT be honored in this case!

Right, but what about the security risks of setuid shell scripts?

Well, suppose the script is called '/etc/setuid_script', starting with:

    #!/bin/sh

Now let us see what happens if we issue the following commands:

    $ cd /tmp
    $ ln /etc/setuid_script -i
    $ PATH=.
    $ -i

We know the last command will be rearranged to:

    /bin/sh -i

But this command will give us an interactive shell, setuid to the owner of the script! Fortunately this security hole can easily be closed by making the first line:

    #!/bin/sh -

The '-' signals the end of the option list: the next argument '-i' will be taken as the name of the file to read commands from, just like it should!
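As a sketch of why the trailing '-' matters (assuming a POSIX-conforming /bin/sh, where a lone '-' is treated like '--' and ends option processing):

```shell
# Work in a scratch directory and create a file literally named '-i'.
cd "$(mktemp -d)"
echo 'echo SAFE' > ./-i

# Because of the lone '-', the following '-i' is taken as a filename to
# read commands from, not as the request for an interactive shell.
/bin/sh - -i
```

Without the '-', the same invocation would have started an interactive shell instead of running the file.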

There are more serious problems though:

    $ cd /tmp
    $ ln /etc/setuid_script temp
    $ nice -20 temp &
    $ mv my_script temp

The third command will be rearranged to:

    nice -20 /bin/sh - temp

As this command runs so slowly, the fourth command might be able to replace the original 'temp' with 'my_script' BEFORE 'temp' is opened by the shell! There are four ways to fix this security hole:

1) Let the OS start setuid scripts in a different, secure way. System V R4 and 4.4BSD use the /dev/fd driver to pass the interpreter a file descriptor for the script.

2) Let the script be interpreted indirectly, through a frontend that makes sure everything is all right before starting the real interpreter. If you use the 'indir' program from comp.sources.unix, the setuid script will look like this:

    #!/bin/indir -u
    #?/bin/sh /etc/setuid_script

3) Make a 'binary wrapper': a real executable that is setuid and whose only task is to execute the interpreter with the name of the script as an argument.


4) Make a general 'setuid script server' that tries to locate the requested 'service' in a database of valid scripts and upon success will start the right interpreter with the right arguments.

Now that we have made sure the right file gets interpreted, are there any risks left? Certainly! For shell scripts you must not forget to set the PATH variable to a safe path explicitly. Can you figure out why? Also there is the IFS variable that might cause trouble if not set properly. Other environment variables might turn out to compromise security as well, e.g. SHELL... Furthermore you must make sure the commands in the script do not allow interactive shell escapes! Then there is the umask, which may have been set to something strange... Etcetera.

You should realise that a setuid script 'inherits' all the bugs and security risks of the commands that it calls! All in all we get the impression setuid shell scripts are quite a risky business! You may be better off writing a C program instead!
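A defensive preamble along these lines might look like the following sketch (the 'safe' PATH value is an assumption; adjust it for your system):

```shell
#!/bin/sh -
# Defensive preamble for a privileged script (a sketch, not a guarantee):
PATH=/bin:/usr/bin     # explicit, trusted search path
export PATH
unset IFS              # fall back to the default separators (space, tab, newline)
umask 077              # new files are not group- or world-accessible
```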

4.8) How can I find out which user or process has a file open or is using a particular file system (so that I can unmount it)?

Use fuser (System V), fstat (BSD), ofiles (public domain) or pff (public domain). These programs will tell you various things about processes using particular files. A port of the 4.3BSD fstat to Dynix, SunOS and Ultrix can be found in the archives of comp.sources.unix, volume 18. pff is part of the kstuff package, and works on quite a few systems. Instructions for obtaining kstuff are provided in question 3.10. I've been informed that there is also a program called lsof. I don't know where it can be obtained.
Michael Fink <Michael.Fink@uibk.ac.at> adds:

If you are unable to unmount a file system even though the above tools report no open files, make sure that the file system you are trying to unmount does not contain any active mount points (df(1)).

4.9) How do I keep track of people who are fingering me?

Generally, you can't find out the userid of someone who is fingering you from a remote machine. You may be able to find out which machine the remote request is coming from.

One possibility, if your system supports it and assuming the finger daemon doesn't object, is to make your .plan file a "named pipe" instead of a plain file. (Use 'mknod' to do this.) You can then start up a program that will open your .plan file for writing; the open will block until some other process (namely fingerd) opens the .plan for reading. Now you can feed whatever you want through this pipe, which lets you show different .plan information every time someone fingers you. One program for doing this is the "planner" package in volume 41 of the comp.sources.misc archives. Of course, this may not work at all if your system doesn't support named pipes or if your local fingerd insists on having plain .plan files.

Your program can also take the opportunity to look at the output of "netstat" and spot where an incoming finger connection is coming from, but this won't get you the remote user. Getting the remote userid would require that the remote site be running an identity service such as RFC 931. There are now three RFC 931 implementations for popular BSD machines, and several applications (such as the wuarchive ftpd) supporting the server. For more information, join the rfc931-users mailing list:

    rfc931-users-request@kramden.acf.nyu.edu

There are three caveats relating to this answer. The first is that many NFS systems won't recognize the named pipe correctly. This means that trying to read the pipe on another machine will either block until it times out, or see it as a zero-length file, and never print it.

The second problem is that on many systems, fingerd checks that the .plan file contains data (and is readable) before trying to read it. This will cause remote fingers to miss your .plan file entirely.

The third problem is that a system that supports named pipes usually has a fixed number of named pipes available at any given time; check the kernel config file and the FIFOCNT option. If the number of pipes on the system exceeds the FIFOCNT value, the system blocks new pipes until somebody frees the resources. The reason for this is that the buffers are allocated in non-paged memory.
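Here is a minimal sketch of the named-pipe trick, using mkfifo (the modern spelling of 'mknod name p') and a scratch path standing in for ~/.plan; reading the pipe once simulates a single finger request:

```shell
plan=$(mktemp -u)        # scratch pathname standing in for ~/.plan
mkfifo "$plan"           # make the .plan a named pipe instead of a file

# Feeder: each open-for-write blocks until a reader (e.g. fingerd)
# opens the pipe, so every finger sees freshly generated contents.
( while :; do
      echo "fingered at $(date)" > "$plan"
  done ) &
feeder=$!

line=$(head -n 1 "$plan")   # simulate one finger request
echo "$line"

kill "$feeder"
rm -f "$plan"
```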

4.10) Is it possible to reconnect a process to a terminal after it has been disconnected, e.g. after starting a program in the background and logging out?

Most variants of Unix do not support "detaching" and "attaching" processes, as operating systems such as VMS and Multics do. However, there are three freely redistributable packages which can be used to start processes in such a way that they can later be reattached to a terminal.

The first is "screen," which is described in the comp.sources.unix archives as "Screen, multiple windows on a CRT" (see the "screen-3.2" package in comp.sources.misc, volume 28). This package will run on at least BSD, System V r3.2 and SCO UNIX.

The second is "pty," which is described in the comp.sources.unix archives as a package to "Run a program under a pty session" (see "pty" in volume 23). pty is designed for use under BSD-like systems only.

The third is "dislocate," which is a script that comes with the expect distribution. Unlike the previous two, this should run on all UNIX versions. Details on getting expect can be found in question 3.9.

None of these packages is retroactive, i.e. you must have started a process under screen or pty in order to be able to detach and reattach it.


4.11) Is it possible to "spy" on a terminal, displaying the output that's appearing on it on another terminal?

There are a few different ways you can do this, although none of them is perfect:

• kibitz allows two (or more) people to interact with a shell (or any arbitrary program). Uses include:
  • watching or aiding another person's terminal session;
  • recording a conversation while retaining the ability to scroll backwards, save the conversation, or even edit it while in progress;
  • teaming up on games, document editing, or other cooperative tasks where each person has strengths and weaknesses that complement one another.

  kibitz comes as part of the expect distribution. See question 3.9. kibitz requires permission from the person to be spied upon. To spy without permission requires less pleasant approaches:

• You can write a program that rummages through kernel structures and watches the output buffer for the terminal in question, displaying characters as they are output. This, obviously, is not something that should be attempted by anyone who does not have experience working with the Unix kernel. Furthermore, whatever method you come up with will probably be quite non-portable.

• If you want to do this to a particular hard-wired terminal all the time (e.g. if you want operators to be able to check the console terminal of a machine from other machines), you can actually splice a monitor into the cable for the terminal. For example, plug the monitor output into another machine's serial port, and run a program on that port that stores its input somewhere and then transmits it out another port, this one really going to the physical terminal. If you do this, you have to make sure that any output from the terminal is transmitted back over the wire, although if you splice only into the computer->terminal wires, this isn't much of a problem. This is not something that should be attempted by anyone who is not very familiar with terminal wiring and such.
• The latest version of screen includes a multi-user mode. Some details about screen can be found in question 4.10.

• If the system being used has streams (SunOS, SVR4), the advise program that was posted in volume 28 of comp.sources.misc can be used. And it doesn't require that it be run first (though you do have to configure your system in advance to automatically push the advise module on the stream whenever a tty or pty is opened).

Unix - Frequently Asked Questions (5) [Frequent posting]
This article includes answers to:

5.1) Can shells be classified into categories?
5.2) How do I "include" one shell script from within another shell script?
5.3) Do all shells have aliases? Is there something else that can be used?
5.4) How are shell variables assigned?
5.5) How can I tell if I am running an interactive shell?
5.6) What "dot" files do the various shells use?
5.7) I would like to know more about the differences between the various shells. Is this information available some place?
If you're looking for the answer to, say, question 5.5, and want to skip everything else, you can search ahead for the regular expression "^5.5)".

While these are all legitimate questions, they seem to crop up in comp.unix.questions or comp.unix.shell on an annual basis, usually followed by plenty of replies (only some of which are correct) and then a period of griping about how the same questions keep coming up. You may also like to read the monthly article "Answers to Frequently Asked Questions" in the newsgroup "news.announce.newusers", which will tell you what "UNIX" stands for.

With the variety of Unix systems in the world, it's hard to guarantee that these answers will work everywhere. Read your local manual pages before trying anything suggested here. If you have suggestions or corrections for any of these answers, please send them to tmatimar@isgtec.com.

5.1) Can shells be classified into categories?

In general there are two main classes of shells. The first class comprises the shells derived from the Bourne shell: sh, ksh, bash, and zsh. The second class comprises the shells derived from the C shell: csh and tcsh. In addition there is rc, which most people consider to be in a "class by itself", although some people might argue that rc belongs in the Bourne shell class.

With the classification above, and using care, it is possible to write scripts that will work for all the shells in the Bourne shell category, and other scripts that will work for all the shells in the C shell category.

5.2) How do I "include" one shell script from within another shell script?

All of the shells from the Bourne shell category (including rc) use the "." command. All of the shells from the C shell category use "source".
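For example, in a Bourne-category shell (the file and the names defined in it are hypothetical):

```shell
# Write a small library script defining a variable and a function.
lib=$(mktemp)
cat > "$lib" <<'EOF'
greeting="hello"
greet() { echo "$greeting, $1"; }
EOF

# The '.' command reads and executes it in the current shell, so its
# definitions become available here.  (csh-family: source lib)
. "$lib"
greet world
```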

5.3) Do all shells have aliases? Is there something else that can be used?

All of the major shells other than sh have aliases, but they don't all work the same way. For example, some don't accept arguments. Although not strictly equivalent, shell functions (which exist in most shells from the Bourne shell category) have almost the same functionality as aliases, and shell functions can do things that aliases can't do. Shell functions did not exist in Bourne shells derived from Version 7 Unix, which includes System III and 4.2BSD; the 4.3BSD and System V shells do support shell functions. Use unalias to remove aliases and unset to remove functions.
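As a sketch of something a function can do that a csh-style alias cannot do portably, here is a hypothetical 'mkcd' that uses its argument in two places:

```shell
# Define a function that creates a directory and changes into it.
mkcd() { mkdir -p "$1" && cd "$1"; }

scratch=$(mktemp -u)   # a fresh, unused pathname
mkcd "$scratch"        # one argument, used twice inside the function

unset -f mkcd          # functions are removed with unset (-f per POSIX)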


5.4) How are shell variables assigned? The shells from the C shell category use “set variable=value” for variables local to the shell and “setenv variable value” for environment variables. To get rid of variables in these shells use unset and unsetenv. The shells from the Bourne shell category use “variable=value” and may require an “export VARIABLE_NAME” to place the variable into the environment. To get rid of the variables use unset.
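A Bourne-category sketch of the difference between a plain shell variable and an exported one (MY_DEMO_VAR is a hypothetical name):

```shell
MY_DEMO_VAR=value      # a shell variable; NOT yet in the environment
child1=$(sh -c 'echo "${MY_DEMO_VAR:-unset}"')   # child cannot see it

export MY_DEMO_VAR     # now placed into the environment
child2=$(sh -c 'echo "${MY_DEMO_VAR:-unset}"')   # child sees 'value'

echo "$child1 / $child2"
unset MY_DEMO_VAR      # and get rid of it again
```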

5.5) How can I tell if I am running an interactive shell?

In the C shell category, look for the variable $prompt. In the Bourne shell category, you can look for the variable $PS1; however, it is better to check the variable $-. If $- contains an 'i', the shell is interactive. Test like so:

    case $- in
    *i*)    # do things for interactive shell
            ;;
    *)      # do things for non-interactive shell
            ;;
    esac

5.6) What "dot" files do the various shells use?

Although this may not be a complete listing, this provides the majority of the information.

csh
    Some versions have system-wide .cshrc and .login files. Every version
    puts them in different places.

    Start-up (in this order):
        .cshrc   - always; unless the -f option is used.
        .login   - login shells.

    Upon termination:
        .logout  - login shells.

    Others:
        .history - saves the history (based on $savehist).

tcsh
    Start-up (in this order):
        /etc/csh.cshrc - always.
        /etc/csh.login - login shells.
        .tcshrc  - always.
        .cshrc   - if no .tcshrc was present.
        .login   - login shells.

    Upon termination:
        .logout  - login shells.

    Others:
        .history - saves the history (based on $savehist).
        .cshdirs - saves the directory stack.

sh
    Start-up (in this order):
        /etc/profile - login shells.
        .profile     - login shells.

    Upon termination:
        any command (or script) specified using the command:
            trap "command" 0

ksh
    Start-up (in this order):
        /etc/profile      - login shells.
        .profile          - login shells; unless the -p option is used.
        $ENV              - always, if it is set; unless the -p option is used.
        /etc/suid_profile - when the -p option is used.

    Upon termination:
        any command (or script) specified using the command:
            trap "command" 0

bash
    Start-up (in this order):
        /etc/profile  - login shells.
        .bash_profile - login shells.
        .profile      - login shells; if no .bash_profile is present.
        .bashrc       - interactive non-login shells.
        $ENV          - always, if it is set.

    Upon termination:
        .bash_logout  - login shells.

    Others:
        .inputrc      - Readline initialization.

zsh
    Start-up (in this order):
        .zshenv   - always; unless the -f option is used.
        .zprofile - login shells.
        .zshrc    - interactive shells; unless the -f option is used.
        .zlogin   - login shells.

    Upon termination:
        .zlogout  - login shells.

rc
    Start-up:
        .rcrc - login shells.
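The trap mechanism listed above for sh and ksh termination can be sketched in one line: the command registered for "signal" 0 runs when the shell exits.

```shell
# The inner shell prints 'hello', then runs its exit trap on termination.
out=$(sh -c 'trap "echo goodbye" 0; echo hello')
echo "$out"
```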

5.7) I would like to know more about the differences between the various shells. Is this information available some place?

A very detailed comparison of sh, csh, tcsh, ksh, bash, zsh, and rc is available via anonymous ftp in several places:

    ftp.uwp.edu (204.95.162.190):pub/vi/docs/shell-100.BetaA.Z
    utsun.s.u-tokyo.ac.jp:misc/vi-archive/docs/shell-100.BetaA.Z

This file compares the flags, the programming syntax, input/output redirection, and parameters/shell environment variables. It doesn't discuss which dot files are used or the inheritance of environment variables and functions.

