Unix Administration: David Malone March 20, 2001
Contents

1 The Function of the Unix Administrator
2 Basic Unix Objects
3 Useful Tools for the administrator
4 Startup, Shutdown and Rebooting
  4.1 Finding the Kernel
  4.2 After the Kernel
  4.3 Stopping and Restarting
  4.4 Special Situations
5 Shell Scripting
  5.1 Hupping a Process
  5.2 Zapping Certain Processes
  5.3 Rotating Log Files
  5.4 The Cron Facility
  5.5 A Final Note
6 Monitoring the System
  6.1 Monitoring Processes
  6.2 Monitoring Available Disk Space
  6.3 Disk, Network Performance
  6.4 Log files
7 System Security
  7.1 Basic Unix Security
  7.2 Security and People
  7.3 Security and Passwords
  7.4 Security and Files
  7.5 Some Things to Watch for
8
9 Disks, File Systems and Swap Space
  9.1 Formatting, Partitions and File Systems
  9.2 Mounting and Unmounting File Systems
  9.3 Fsck
10 Backup and Restore
  10.1 Tar
  10.2 Cpio
  10.3 Dump and Restore
  10.4 Bits and Bobs
11 References
However Unix utilities such as ls or ps have used user names instead of numbers as this is easier for people to deal with. The mapping of names to numbers is usually done in the /etc/passwd and /etc/group files. These files also provide other information which is useful to Unix utilities (e.g. home directory, shell) and to people (e.g. the user's real name). For the normal running of a Unix system an uncorrupted /etc/passwd is essential.

file system layout
The standard Unix file system layout is also not a fundamental Unix object, but is essential for the normal operation of a Unix system. This layout is not completely standard across all Unix systems, but there is usually a strong degree of agreement.

    /bin             Essential commands
    /dev             Devices
    /etc             Configuration files and boot scripts
    /sbin            Essential system commands
    /home            Home directories for users and projects
    /mnt             Temporary file system mount point
    /proc            Special process monitoring file system
    /tmp             Scratch/Scratch files
    /usr/bin         Non-essential user commands
    /usr/include     Header files for compilers
    /usr/lib         Shared and static libraries
    /usr/local       Like /usr, but for local software
    /usr/sbin        Non-essential system commands
    /usr/share       Files that can be shared between machines
    /usr/share/man   Manual pages (sometimes /usr/man)
    /var             Files specific to this machine's running
    /var/spool       Spool space for mail, printers, periodic jobs
On older systems /var and /usr/share may be merged into /usr, and the sbin directories merged into the etc and bin directories.

privilege
The privilege system in Unix is quite simple. Generally a process may only access a file if it is running as the user or is in the correct group. The only exception is if the uid is zero. This usually corresponds to the user name root. Root has other privileges: the ability to send a signal to any process, the ability to run certain system calls (e.g. reboot), the ability to change to any uid and the ability to make certain network connections.

Privilege is gained in Unix by running a program with the set user ID attribute (suid). These programs usually perform whatever privileged action is needed and then discard their privilege by switching to another uid. To get a shell which is running as root, the su program is usually used. This is considered better than logging straight in as root, as it logs a message saying who became root.
- Activating swap devices.
- Preening file systems with fsck.
- Setting permissions on files in /dev.
- Starting the most basic dæmons.

On System V machines there is a standard procedure for adding new jobs to the sequence for entering a new run level. Take, for example, the startup of run level 2. The script rc2 runs all files in /etc/rc2.d with names like S23cron with the argument start, sorted by the number in their name. These files will usually be links to files in /etc/init.d for easy management. This leads to the following standard procedure for adding new jobs to run level 2.

1. Write a script that, when passed the argument start, performs the required job.
2. Copy this to /etc/init.d.
3. Make a link to this file from /etc/init.d with a suitable name.

Sometimes it may be more appropriate to add a job directly to /etc/inittab - this will usually be when the job does a one-off task and does not provide a service.
5 Shell Scripting
Having a basic grasp of shell scripting is essential for effective administration. It provides a way of automating administration tasks and is also necessary for understanding parts of the system such as the boot time scripts. Shell scripting involves knowing how to string Unix commands together, and how to use a shell to connect these commands sensibly. It is probably best learned by example - Frisch provides several commented examples in Chapter 8 and Appendix A. However, examining boot scripts will provide examples and familiarity with the boot process.
5.2 Zapping Certain Processes
This script searches the process table for processes which match a certain description. It then asks if you would like to kill those which match.
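    #!/bin/sh
    # Usage: zap string
    grepstring=$1
    tmpfile=`mktemp /tmp/zap.XXXXXX` || exit 1

    # Save the lines of the process table which match the string.
    ps -aux | fgrep "$grepstring" > "$tmpfile"
    for pid in `awk '{ print $2 }' "$tmpfile"`
    do
        # Show the process, then ask before killing it.
        awk -v pid="$pid" '$2 == pid { print $0 }' "$tmpfile"
        printf "Kill this process(y/n)? "
        read ans
        if [ "$ans" = "y" ] ; then
            kill "$pid"
        fi
    done
    rm -f "$tmpfile"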
5.3 Rotating Log Files

This script moves on log files listed on the command line. It keeps all but the most recent one compressed, and removes the last one once there are more than CYCLES of them. For example,

    CYCLES=3 ; rotate messages

would have the following effects.

    messages       -> messages.1
    messages.1     -> messages.2.gz
    messages.2.gz  -> messages.3.gz
    messages.3.gz  -> removed

    #!/bin/sh
    # Rotate a log file and keep N copies
    # Mostly stolen from inn
    CYCLES=${CYCLES-5}
    COMPRESS=/usr/local/bin/gzip
    Z=.gz
    for F in "$@" ; do
        ## Compress yesterday's .1
        test -f "${F}.1" \
            && ${COMPRESS} <"${F}.1" >"${F}.1${Z}" \
            && rm -f "${F}.1" \
            && chmod 0440 "${F}.1${Z}"
        ## Do rotation.
        EXT=${CYCLES}
        rm -f "${F}.${CYCLES}${Z}"
        while [ ${EXT} -gt 0 ] ; do
            NEXT=${EXT}
            EXT=`expr ${EXT} - 1`
            test -f "${F}.${EXT}${Z}" \
                && rm -f "${F}.${NEXT}${Z}" \
                && mv "${F}.${EXT}${Z}" "${F}.${NEXT}${Z}"
        done
        mv "${F}" "${F}.1"
    done
5.4 The Cron Facility

The cron facility in Unix makes shell scripts very useful for completely automating tasks such as log file rotation, backups and regular system monitoring. Cron allows a command to be executed every hour, day, week, month or year at a given time. Often this command will be a shell script designed to perform some job and mail the results to the user who set it up.

Cron has taken several forms in different versions of Unix, and to find out which version you are using you will need to look at the manual pages for cron, crontab (the program) and crontab (the file). Files belonging to cron can often be found in /etc/crontab, /var/cron and /var/spool/cron.

When removing user accounts on the system it is a good idea to check if they have cron jobs (or at jobs) if they had the ability to set these up. It is often possible to restrict who may set up these jobs.
- Running processes.
- Available disk space.
- Disk and network performance.
- Log files and process accounting.

Some of these require very little effort to monitor, others require some degree of scripting to produce useful reports. The key to all is to become familiar with the normal state of the system.
If usual use of the system creates temporary files it is useful to have a periodic job which searches for old temporary files and removes them. You should warn your users about this! The best solution to a real disk space problem is to get more disk space. Some people advocate convincing users to clean up - this is rarely effective.
6.4 Log files

Log files provide lots of information, often too much. Again grep and friends will allow you to make sense of what is provided. Log files often contain early warnings of pending hardware failures. You should check the log files regularly so you know what to expect.

Log files are usually created by syslogd. This program can be configured to log different amounts to different places. It is configured through the file /etc/syslog.conf, and must be sent a HUP signal when a change is made to the file. Various versions of syslogd allow log messages to be sent to files, users' terminals, other Unix machines or other programs.

Usually syslogd will not create a new file for logging in to. The creation of the file should be performed whenever log files are rotated. Finally, it is sometimes useful to send all log messages to your terminal if you are trying to solve a problem.
7 System Security
Computer security involves protecting your computer system (including software, hardware, data and services) from theft, modification and destruction. This includes both accidental and malicious actions.

The most basic level of defence is a recent full backup of the system. Recent means recent enough that only an acceptable amount of work will be lost. Full means sufficient to rebuild the system and place it in the same state as it was in when the backup was taken. For security a copy of such a backup should be kept in a location remote from the actual machine.
Passwords can be guessed in several ways. The first is just to try to log in as the person, and try likely passwords. This scheme is rather inefficient as often a system will only allow one login attempt every few seconds.

The other scheme for password guessing is to obtain the user's encrypted password. This allows you to guess passwords at the full speed allowed by available CPU power. Using a modest network of PCs it would be possible to try 300,000,000 passwords per hour. This makes large dictionaries (with obvious modifications) viable guess lists.

To attempt to avoid this problem, more recent versions of Unix have moved the encrypted passwords from /etc/passwd to a file only readable by root, often called /etc/shadow. The other option is to choose difficult to guess passwords. These passwords can be difficult to remember. Most users type their password once or twice a day, which is often frequently enough for their fingers to learn the password within a week.

Note that services such as remote login will not always ask for a password. If you use services which use network based authentication without passwords, it is important that you trust the networks and hosts involved.

Many vendors advocate regular password changes. There is debate concerning the wisdom of this measure. It does not improve security in all situations.
7.5 Some Things to Watch for

- Users in the password file with user ID 0.
- Having . in your path somewhere before the end.
- People having physical access to the machine.

Users should not tell anyone their password. Remind them that the system administrator does not need their password, and no one else should know it.

If a security breach is suspected/detected then gather as much information as you can quickly. Check log files, the dot files of the user involved, the last command, and the lastcomm command. The information may be gone tomorrow.
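A read-only check for the first two items on the list above, as an added example that is not part of the original notes. The function names and the sample passwd entries are constructed for the illustration; an empty path entry is treated like . because the shell also reads it as the current directory.

```shell
#!/bin/sh
# Added example (not from the original notes): checks for two items on the
# "things to watch for" list. Function names and sample data are constructed.

# uid0_accounts [FILE]
#   Print the login name of every entry in a passwd-format FILE whose UID
#   field is zero. The pattern also catches zero-padded forms such as 00,
#   and, unlike the numeric test $3 == 0, it does not match short or
#   damaged lines that have no UID field at all.
uid0_accounts() {
    awk -F: '$3 ~ /^0+$/ { print $1 }' "${1:-/etc/passwd}"
}

# cwd_before_end_of_path [PATHSTRING]
#   Print the 1-based position of every search path entry that means
#   "current directory" (either . or an empty entry) and is not the last
#   entry. No output means nothing was found.
cwd_before_end_of_path() {
    printf '%s\n' "${1-$PATH}" |
        awk -F: '{ for (i = 1; i < NF; i++) if ($i == "." || $i == "") print i }'
}

# sample_passwd
#   Constructed passwd entries used to demonstrate uid0_accounts.
sample_passwd() {
    cat <<'EOF'
root:x:0:0:Super User:/root:/bin/sh
daemon:x:1:1::/:/sbin/nologin
toor:x:0:0:Second root account:/root:/bin/sh
alice:x:1001:100:Alice:/home/alice:/bin/sh
damaged:x
EOF
}

# audit_report FILE PATHSTRING
#   One line per finding; prints nothing when both checks are clean.
audit_report() {
    for name in $(uid0_accounts "$1") ; do
        [ "$name" = root ] || echo "uid 0 account other than root: $name"
    done
    for pos in $(cwd_before_end_of_path "$2") ; do
        echo "current directory is entry $pos of the search path"
    done
}

# Demonstration on the constructed data only.
tmp=$(mktemp /tmp/audit.XXXXXX) || exit 1
sample_passwd > "$tmp"
audit_report "$tmp" "/usr/local/bin:.:/bin:/usr/bin"
rm -f "$tmp"
```

Run from cron against the real /etc/passwd and root's own PATH, the report stays empty until one of the two conditions appears.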
4. If the system lists connected disks at boot time, check that the disk shows up.
5. Make the device files for the disk if necessary.
6. Now format, partition and create file systems on the disk.
7. Mount the file systems on the disk, and activate the swap space.
8. Populate file systems from tape or disk.
9.1 Formatting, Partitions and File Systems

The use of formatting has become somewhat confused by the advent of floppy disks. There are three stages to preparing a disk for use: formatting, partitioning and file system creation. On a floppy disk all these stages are rolled into one. To add to the confusion, hard disks do not need to be formatted any more (unless there is a serious failure), but do need to be partitioned and have file systems created.

A disk is usually split into regions called partitions. A partition is just a chunk of the disk starting at sector a and going to sector b. Under Unix partitions are usually allowed to overlap, and as long as two overlapping partitions are not used at the same time all should be well.

Partitions are also usually assigned some sort of type. Under DRS/NX a partition may be used for a file system, swap, dump, boot or nothing. This type is assigned at the time the disk is partitioned, but may be changed later if the partition is not in use.

The command for partitioning disks under Unix is both vendor and hardware dependent. Sometimes partitioning is called labeling. On the DRS/6000 the command for partitioning is fmthard - you can understand why people are confused.

Creating a file system on a partition involves creating the basic structure on the hard disk needed to represent an empty file system. There is a greater degree of standardization for this task, and it can usually be performed with the mkfs program. On some versions of Unix this has been replaced/complemented by the less obtuse newfs program.

DRS/NX offers you a choice of file systems. The old SysV file system is supported as s5. The new fast file system is supported as ufs, and a simple file system, used for booting, is provided as bfs. The file system you will want to use for all normal file systems is ufs. There are various parameters for each file system which can be changed, however for most purposes the defaults are fine.
Most file systems are mounted automatically at boot time. These file systems are listed in the file /etc/vfstab. Unmounted or new file systems may be mounted while the system is running with the mount command.

    mount -F bfs /dev/rdsk/c0d1s3 /mnt

A file system which is mounted may be unmounted using umount, however no file in the file system may be in use when you try to unmount it. In use includes directories which are the current directory of some running process - this means you may have to kill some processes to be able to unmount a file system. Check the manual pages for mount and umount - they often have useful options which are system specific.
9.3 Fsck

During the Unix boot process, the sanity of the file systems is checked. This is done with the fsck program. This program can correct many minor file system problems without interaction from the user. However, sometimes it encounters a problem which it cannot fix without potential loss of data. In this case the boot process will usually halt, and you will be asked to check the file systems by hand.

This involves running fsck, which will prompt you and ask if you wish to make changes. Unless you know a lot about the file system, you probably have no choice but to answer yes. In fact there is usually an option -y to fsck which answers yes to all questions. There are some alternatives to answering yes.

1. The system may allow you to mount the file system read only. If the file system is not too badly damaged you can now back it up to tape. You can then safely answer yes to all fsck's questions and restore the data from tape. However, if the file system is very badly damaged, mounting it read only may cause the system to panic.
2. If all the file systems on a disk seem to be damaged it may be the case that the cable to the disk has come loose or some such. In this case check all connections to the disk before answering yes to these questions.
3. You can go through the list of questions answering no, to see what fsck intends to do. You may find that it will try to remove various files, some of which you may not consider important. Say yes to the removal of these first, and then run fsck again to see if it can now clear up the mess.

Note that you may need to rerun fsck several times to get the file system in a sane state. When fsck runs with no complaints, then the damage has been cleared up.

In very rare cases fsck may get stuck on a very messed up inode. From the messages fsck produces you should know the number of this inode. In this case you can use clri and fsdb to clear this inode. This procedure is not for the weak of heart.

Remember, if the file system is badly enough damaged it may be easier to clear the file system with mkfs and restore from tape. Also, if fsck finds any files and cannot work out where they should be it will put them in the lost+found directory at the top of that file system.
10 Backup and Restore

Backup and restore essentially consists of the copying of large numbers of files from one place to another. DRS/NX provides a suite of tools for backup and restore which are described in some detail in the manual. Unix in general provides three tools for the large scale storage of files: tar, cpio and dump/restore. At least two of these systems should be available on any system.

Forming a schedule for backups can be simple if all your data can be backed up in one session. You just do full backups every night. If you end up with more data you may have to work with incremental backups. In an incremental backup system each backup has a level, which is a number between 0 and 9 (say). During a level 0 backup you back up everything. During a level 1 backup you back up everything that has changed since the last level 0. During a level 2 you back up everything that has changed since the last level 0 or 1 dump, and so on.

Using a tape drive on Unix is also an interesting direct encounter with a file in /dev. The tape device behaves like a series of files. When you read it you read the current file on the tape, and when you write it you write to the current file on the tape. Usually there are at least two different tape devices. One rewinds the tape when you finish reading or writing, and the other leaves the tape ready to read the next file. Using the mt command you can rewind, eject and fast forward the tape drive.
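The level rule above, carried one step further to the question that matters at restore time, as an added example that is not part of the original notes: reference_for finds the backup a new incremental would be relative to, and restore_chain lists the backups needed to rebuild the latest state. The history file format, the function names and the sample week are assumptions made for the illustration.

```shell
#!/bin/sh
# Added example (not from the original notes). A history file lists the
# completed backups, oldest first, one per line: "<level> <label>".

# reference_for LEVEL HISTORY
#   Print the label of the backup that a new backup at LEVEL is taken
#   relative to: the most recent one with a lower level. Prints "none"
#   when there is no such backup, which is always the case for level 0.
reference_for() {
    awk -v lvl="$1" '
        $1 + 0 < lvl + 0 { ref = $2 }
        END { print (ref == "" ? "none" : ref) }
    ' "$2"
}

# restore_chain HISTORY
#   Print, in the order they must be restored, the labels of the backups
#   needed to rebuild the state captured by the last backup in HISTORY.
restore_chain() {
    awk '
        { level[NR] = $1 + 0 ; label[NR] = $2 }
        END {
            i = NR
            while (i > 0) {
                out = (out == "" ? label[i] : label[i] " " out)
                # Step back to the backup this one was taken relative to.
                j = i - 1
                while (j > 0 && level[j] >= level[i]) j--
                i = j
            }
            print out
        }
    ' "$1"
}

# sample_history
#   Constructed history for part of one week.
sample_history() {
    cat <<'EOF'
0 sunday
3 monday
2 tuesday
5 wednesday
4 thursday
EOF
}

# Demonstration on the constructed history only.
hist=$(mktemp /tmp/dumps.XXXXXX) || exit 1
sample_history > "$hist"
echo "a level 5 backup now would be relative to: $(reference_for 5 "$hist")"
echo "restoring thursday needs: $(restore_chain "$hist")"
rm -f "$hist"
```

In the sample week the monday and wednesday tapes are not needed for a restore, because a later backup at a lower level covers everything they held.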
10.1 Tar
The tar command is for Tape Archiving. Today it is often used for the distribution of Unix software. What tar does is take a list of files and make one file from them - a tar file. This is much like the pkzip program for the PC, except no compression is done.

The tar program has two weaknesses. First, many vendors' implementations of it cannot deal with special files. Second, it does not understand the tape system well, and has trouble with multi-volume backups. These reasons combined make tar unsuitable for full backups, but quite suitable for distribution of software.

One other use of tar is the copying of directories from one place to another. The cp command is often not suitable for this as it may change the ownership of files. A way to copy directory /dir1 to /dir2 follows.

    cd /dir1 ; tar cf - . | ( cd /dir2 ; tar xpf - )
10.2 Cpio
All System V machines come with cpio, but some do not come with tar. Both programs do much the same job. The main difference is that tar takes lists of files for archive/extraction on the command line and cpio takes the list on stdin. Many versions of cpio also suffer from the same problems as tar has with multi-volume backups, however it doesn't seem to have as much trouble with device files. For this reason cpio can be used to back up small file systems quite successfully.
There are several different formats of archive which cpio can write and read. Not all versions of cpio support all the archive formats, so cpio is less often used as a way of distributing software in the Unix world. It is also possible to copy directories with cpio.

    cd /dir1 ; find . -depth -print | cpio -pdm /dir2
11 References
Essential System Administration, second edition, Æleen Frisch, O'Reilly & Associates.
Unix System Administration Handbook, second edition, Evi Nemeth et al., Prentice Hall.

The first book is readable and contains a suggested reading list. The second book is a useful reference. The O'Reilly & Associates nutshell handbooks are highly praised as no nonsense guides to computing.