The UNIX Shell Guide
by Norman J. Buchanan & Douglas M. Gingrich
Copyright ©1996 University of Alberta, Department of Physics. All rights reserved.
UNIX is a registered trademark of Novell.
Many of the designations used by manufacturers and sellers to distinguish their products
are claimed as trademarks. Where those designations appear in this book, we are aware of
a trademark claim.
While every precaution has been taken in the preparation of this book, the authors assume
no responsibility for errors or omissions, or for damages resulting from the use of the
information contained herein.
Preface
This book is intended to assist UNIX users in understanding and dealing with five of the
most popular UNIX shells: the Bourne shell (sh); the C shell (csh); the Korn shell (ksh);
the TC shell (tcsh); and the Z shell (zsh). The idea came mainly out of frustration in
trying to get understandable information on shell usage. Much has been written on UNIX
as a whole but rarely is more than a chapter on Bourne shell or C shell included, and
these usually prove to be little more than a loose translation of the online manual (man)
pages. Shells are certainly sophisticated and powerful enough to warrant a detailed
treatment on their own. The man pages, as any UNIX user is well aware, are fine for
listing commands and associated flags but that is where the usefulness ends. Examples
are seldom found and some of the features are covered in one or two lines of text, leaving
the user to overlook their importance. The man pages are, after all, meant to act as a
reference more than an actual instruction manual and should be treated as such.
Hopefully this book will achieve two goals. First, to give a clear and concise look at each
of five shells. Each shell will be examined from the point of view of interactive work
features, and then from a shell programming point of view. Examples will be used as
much as possible to lend illustration to the concepts presented in the text, especially in the
area of shell programming. Hopefully, the range of examples will be such that there will
be something that everyone can use. New UNIX users will then be able to pick up on
some of the powerful features of the particular shell(s) of interest and develop a feel for
shell programming, which is probably the most flexible and exciting feature of UNIX
shells in general. The second goal of the book is to allow users the opportunity to
examine the features of the various shells covered and determine which shell(s) might be
right for them. Each shell will be contrasted with the others to demonstrate the strengths
and weaknesses compared with the rest. This we hope will make this book truly unique.
As with most other things in life, choice can make things more difficult than expected.
While it is nice to have choice, making a choice can require a level of research that most
people just do not have the time to invest. Sometimes the number of choices is not even
known, further complicating the decision process - how can you make a choice if you do
not know what the choices are? In this book enough information will be presented so the
users can make educated decisions without spending the time trying to gather the
information on their own, or having to make sense of it afterwards.
Users new to UNIX will be able to become acquainted with the shell they are provided
with, learning the details of shell usage until they are able to decide if a different shell
may better suit their needs. Even users who have spent many years mastering a particular
shell may find that some of the more recent shells provide powerful features that they can
tailor to their specific needs. If any of the shells mentioned in this book are not available
on your machine, most system administrators can be persuaded to load them on to your
machine for use. To further assist in making the various shells available, FTP sites where
they can be found are included.
This book will be of great interest to users of personal computers who are considering
moving or have just moved to the new UNIX-compatible operating system for the
personal computer - LINUX. LINUX is distributed freely in electronic form and for low
cost from many vendors. The standard software distribution includes many of the shells
described in this book.
Norman J. Buchanan
Douglas M. Gingrich
What is a Shell?
What is a shell? A shell is a command interpreter. While this is certainly true, it likely
doesn't enlighten the reader much further. A shell is a program that takes input from the user
and deals with the computer on the user's behalf, rather than having the user deal directly
with the computer. If the user had to deal directly with the computer, he would not get much
done, as the computer only understands strings of 1's and 0's. While this is a bit of a
misrepresentation of what the shell actually does (the idea of an operating system is
neglected), it provides a rough idea that should make the reader grateful that there is such
a thing as a shell.
A good way to view a shell is as follows. When a person drives a car, that person doesn't
have to actually adjust every detail that goes along with making the engine run, or the
electronic system controlling all of the engine timing and so on. All the user (or driver in
this example) needs to know is that D means drive and that pressing the accelerator pedal
will make the car go faster. The dashboard would also be considered part of the shell, since
pertinent information relating to the user's involvement in operating the car is displayed
there. In fact, any part of the car that the user has control of during operation of the car
would be considered part of the shell. The idea of what a shell is should be coming clear
now. It is a program that allows the user to use the computer without having to deal
directly with it. It is, in a sense, a protective shell that prevents the user and the
computer from coming into contact with one another.
Redirection
Throughout this book the topic of redirection will be visited many times. Redirection
determines where data is sent to, or read from, when interacting with a command. For
instance, when a user
logs onto a terminal on the local network, a group of messages may be displayed, or
perhaps just a prompt. The messages or prompt have been sent to the terminal window
for the user to read, in the case of the messages, or interact with, as would be the case of
the prompt. This output stream, as it is called, is sent to what is called the standard output
(STDOUT). This is usually automatically set to be the screen of the workstation or
terminal, although this hasn't always been the case. In the very early days of UNIX,
STDOUT may well have been a teletype machine, long since extinct. When a program
(or equivalently, a command) is executed, its output can be redirected to a file by using the
> operator. The general syntax for all of the shells covered in this book (and any other
shells for that matter) is the following:
PROMPT> command name >filename
It would not be an error to have a space between the > and the filename; it is just a matter
of taste. When a program requires input, like the program which runs the login procedure,
it accepts data via the input stream called standard input (STDIN). This is almost always set
to the keyboard, for obvious reasons. A command will often take data from either a file or
STDIN, as is the case with the cat (concatenate) command:
PROMPT> cat filename
which would send the contents of the file filename to the screen. The following would
echo whatever the user had entered back to the screen, after each press of the <ENTER>
key, until the end-of-file (EOF) character (usually Control-D) had been entered:
PROMPT> cat
Hello There <ENTER>
Hello There
How are you? <ENTER>
How are you?
<CTRL D>
PROMPT>
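The > redirection described above can be sketched concretely; the filename here is
arbitrary:

```shell
# Send the output of echo to a file instead of the screen.
echo "Hello There" > /tmp/redir_demo.txt

# cat then sends the contents of the file back to the screen.
cat /tmp/redir_demo.txt
```

The file now holds exactly the text that would otherwise have appeared on STDOUT.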
The last important data stream is the standard error. When a program (or command) is
executed, it might encounter problems completing its task for whatever reason. When this
happens, the shell will, in most cases, echo an error message to STDERR. By default,
STDERR is directed to the screen along with STDOUT, but it doesn't have to be.
Situations may very well arise where the error messages might be sent to a file, or
somewhere else, while the standard output would be sent to the screen. The shells handle
this situation in different ways so examples demonstrating this procedure will be left until
the individual chapters.
In UNIX these data streams can also be referred to by numbers called file descriptors.
This provides another way to represent redirections but is again shell dependent and will
thus be left to the appropriate sections for examples. The following table associates each
file descriptor with the corresponding data stream:

File Descriptor    Data Stream
0                  Standard input (STDIN)
1                  Standard output (STDOUT)
2                  Standard error (STDERR)
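As a minimal Bourne-style sketch (the extended redirection syntax differs between shells,
as the later chapters show), a descriptor number can be placed directly in front of the >
operator; the filenames are arbitrary:

```shell
# Descriptor 1 (STDOUT) of echo is sent to one file ...
echo "all is well" 1>/tmp/fd_out.txt

# ... while descriptor 2 (STDERR) of a failing ls is sent to another.
# The || true keeps the failing ls from stopping the script.
ls /no/such/file 2>/tmp/fd_err.txt || true

cat /tmp/fd_out.txt
```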
• A Bit of History
• Getting to Know the Bourne Shell
• Multiple Commands Per Line
• Redirection and Pipelines
• Filename Substitution
• Special Characters
• Variables
o Variable Substitutions
o Parameters
o Environmental Variables
o Special Variables
• Bourne Shell Programming
o Testing and Branching
o String Comparisons
o Testing Files
o Loop Control
Exiting Loops Early
o Input and Output (I/O)
o Advanced Programming Topics
Functions
Trapping Errors
o Dot Files
A Bit of History
The Bourne shell is the obvious shell to examine first for two reasons. First and foremost,
it was the first shell written. It was written at Bell Laboratories by Stephen Bourne.
This makes the Bourne shell the foundation on which all other shells are (at least in part)
built. In fact, it will be seen that the shells covered in this book fall basically into two
categories, or families, the Bourne family and the C shell family. The second reason to
start with the Bourne shell is that it comes with all UNIX systems, making it a good
introductory shell purely on the basis of availability.
Special Characters
The previous section should clearly demonstrate that there are a number of characters that
are special to the shell, and hence some extra measures must be taken when attempting to
use them in a casual manner. For illustration of this point, imagine a directory containing
the following files:
Mail/ News/ a.out* prog.c utils/
and the user types, while in this directory,
$ echo * Hello *
Expecting the following output:
* Hello *
The user is shocked to see the actual output:
Mail News a.out prog.c utils Hello Mail News a.out prog.c utils
The shell has simply used its own definition of the * symbol, which differs greatly from the
user's definition of the same character. This is a common example of how care has to be
taken when using shell special characters in strings. There are a few ways to handle this
situation. The first solution, which is probably the easiest for this particular example, is to
escape the *. This means that a backslash should be placed directly in front of the *.
Escaping a character causes it to be treated by the shell as though it were simply a text
character. Therefore the above example could be quickly corrected by entering the
following:
$ echo \* Hello \*
which would give the desired result. The result could also be achieved using single or
double quotation marks. When single quotation marks enclose a string, all of the special
characters are treated as plain text characters; within double quotation marks, the $ and
backquote (`) retain their special meanings. An advantage of quotation marks over escape
sequences is that if there are many special characters in a string, quotes will save
keystrokes and thus time. Another advantage is that they preserve white space. An echo
statement without quotation marks will not preserve runs of spaces or tabs, so the following:
$ echo \*\*Warning\*\*    Disk   Space is Low
will give as output:
**Warning** Disk Space is Low
with the extra spaces collapsed. A quick and easy fix would be
$ echo '**Warning**    Disk   Space is Low'
and this would give the appropriate output, spacing included. While both single and double
quotes solve the
above problems, they are different. The difference comes in handling variables, which
will be covered shortly. The single quote will not substitute a variable into an expression
whereas the double quote will. When deciding which of the methods to use, the
complexity of the expression (i.e. how many special characters are used) should be
looked at, as well as variable usage.
Command substitution is a topic that ties up these last few sections. Command
substitution allows the output of a command to be substituted into another command. For
example,
$ echo "There are `who | wc -l` users on the system"
may give as output
There are 13 users on the system
The user now has the tools and flexibility to handle even the most specific or complicated
command in the Bourne shell.
Variables
This is where the shell gets interesting. Variables add a level of generality to the
environment. A variable is simply a name that acts as a placeholder for a value or set of
values (an array, which will be covered in later sections). In the Bourne shell there are
four different types of variables: user-defined variables, positional parameters,
environmental variables, and special (builtin) variables.
User defined variables are fairly straightforward. In the Bourne shell they take the
following form:
$ SIZE=1024
$ MY_ADDRESS=buchanan@phys.ualberta.ca
$ greeting='Welcome to the Bourne shell'
To access a variable, a $ must be placed in front of the variable name, or else the shell
will not realize that what follows is not a command. The shell will then attempt to
execute the command and return an error message. An example of accessing a user-defined
variable is then
$ echo You can e-mail me at $MY_ADDRESS
You can e-mail me at buchanan@phys.ualberta.ca
The above can be used to demonstrate the difference between single- and double-quote
handling of strings. If single quotes are used to enclose the string, the shell will not make
the variable substitution as it treats the $ as a text character rather than a signal that a
variable is coming:
$ echo 'You can e-mail me at $MY_ADDRESS'
You can e-mail me at $MY_ADDRESS
Double quotes however, will allow the shell to recognize the variable and pass its value
to the echo command:
$ echo "You can e-mail me at $MY_ADDRESS"
You can e-mail me at buchanan@phys.ualberta.ca
Variable names can be nested as well. This is another way to say that one variable can be
set equal to another variable which contains another, etcetera.
$ today=Tuesday
$ day=$today
$ echo The day of the week is $day
The day of the week is Tuesday
When assigning a value to a variable, it is important to leave no white space. This is
because the assignment is terminated by white space, unless the appropriate quotation
characters enclose the string value. This allows more than one variable assignment to be
made on a single line:
$ a=cat b=dog c=elephant
It is important however to realize that the assignments are processed from right to left and
therefore if the following assignments were made:
$ VAR1=$VAR2 VAR2=hello
the value of VAR1 would be hello, whereas if the order was reversed:
$ VAR1=hello VAR2=$VAR1
the value of VAR2 would be undefined. This is because VAR2 is assigned the value of
VAR1 before VAR1 has been given a value.
Variable assignments can also be removed using the unset command. For example,
$ VAR="Hello"
$ echo $VAR
Hello
$ unset VAR
$ echo $VAR
• Variable Substitutions
• Parameters
• Environmental Variables
• Special Variables
Variable Substitutions
Once a variable has been assigned a value, it can be easily referenced by the use of a
dollar sign in front of the variable name as shown above. A user may wish to append a
string directly to a string variable such as
$ VAR=Tues
$ echo $VARday
which will return no value, since no value has been assigned to VARday. The shell cannot
determine that the user meant to treat $VAR as a separate entity, and hence it treated the
string as one complete variable name. This is because the shell uses white space when
interpreting variable substitutions. However, if curly braces containing the variable
name are placed within a string, the shell will handle the entire string as desired:
$ VAR=Tues
$ echo ${VAR}day
Tuesday
The Bourne shell also provides variations on the above variable substitution that allow
alternative substitutions under certain conditions. There are four such conditional
substitutions in the Bourne shell which are listed in table 2.2:
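Table 2.2 is not reproduced here, but the four conditional forms are standard Bourne
syntax. The following sketch shows three of them; the fourth, ${VAR:?word}, prints word as
an error message and aborts the script if VAR is unset, so it is only described:

```shell
unset COLOR

# ${VAR:-word} substitutes word when VAR is unset or null,
# without changing VAR itself.
echo "${COLOR:-red}"

# ${VAR:=word} substitutes word AND assigns it to VAR.
echo "${COLOR:=blue}"
echo "$COLOR"

# ${VAR:+word} substitutes word only when VAR is set and non-null.
echo "${COLOR:+now set}"
```

The three echo commands after the unset print red, blue, and blue respectively, followed by
"now set" from the last form.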
Parameters
Parameters are another type of shell variable that carry information from the command-
line. Any command can be broken up into parts. The first element of the command-line is
always a program (either an executable or a shell script, which will be covered shortly).
The following elements can be looked at as special values to be passed to the program.
These values are called positional parameters and are stored in the variables $1 through
$9. The parameter $0 is a special variable that always holds the name of the program.
Positional parameters will be covered in more detail in the section on shell programming.
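The positional parameters can be seen in action without writing a script by using the set
builtin, which replaces the current shell's parameter list with its arguments (a sketch; in
a real script the parameters would come from the command line instead):

```shell
# set gives the current shell three positional parameters.
set alpha beta gamma

echo "$1"     # the first parameter:  alpha
echo "$2"     # the second parameter: beta
echo "$#"     # the parameter count:  3
```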
Environmental Variables
Environmental variables are variables that have special meaning to the shell. These are to
be used to customize the environment for a particular user. These variables should be
defined in the special program that is executed during login called the .profile (dot
profile) file. Any defined variables would then be set until the end of the session unless
explicitly unset or redefined. An example of a .profile file is included in the section on
programming. Table 2.3 contains an alphabetical list of the Bourne shell environmental
variables, a brief description of what each is used for, and the default setting.
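Table 2.3 is not reproduced here, but a hypothetical .profile fragment might look like the
following; the values are purely illustrative, not defaults:

```shell
# A sketch of a .profile fragment (illustrative values).
PATH=$PATH:$HOME/bin      # add a private bin directory to the search path
PS1='$ '                  # primary prompt string
TERM=vt100                # terminal type
export PATH PS1 TERM      # pass the settings on to child processes
```

Because .profile is read once at login, these settings remain in effect for the whole
session unless explicitly unset or redefined.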
Special Variables
Special or builtin variables are defined by the shell for use at any time. They are mostly
used for shell programming and are covered in more detail in the next section. Table 2.4
contains descriptions of the Bourne shell special variables:
Table 2.4: Bourne shell special variables.
String Comparisons
Strings are considered to be equal if each character in one is matched exactly in the other,
and they have the same length. For example, the strings "String" and "String " (note the
trailing space) are not equal. While the strings contain the same letters, the second string
contains an extra space, and they are therefore not equal. All of the string tests supported
by test
are listed in table 2.5.
Testing Files
A very important test is often the status of a file. Perhaps one test would be to test if a file
exists before creating another with the same name (which would of course destroy and
replace the pre-existing file). The test command will test files for such attributes as
existence, writability, readability and file type. Table 2.6 summarizes the file test formats
and return values.
Table 2.6: File test formats and return codes.
To illustrate the use of file testing, the following is a script that acts as a revised version
of the UNIX move (mv) command. Not only does this script check for overwrites, it also
deletes any file that is of zero size.
#! /bin/sh
#
# Move is a script that moves a file if no overwrite will occur
# and will delete the file to be moved if it is empty
#
# Usage: move [file] [destination]
#
if [ ! -s "$1" ]
then
if rm "$1" 2>/dev/null
then
echo "$1 was empty and thus removed"
else
echo "$1 does not exist"
fi
else
mv -i "$1" "$2"
fi
This example uses the if:then:else construct but also uses a few things not discussed in
much detail. The first new idea was briefly covered in the section on variables -
parameters. The $1 passes the name of the file being moved (the first name entered on
the command-line after move), and the $2 is the destination or place that the file is
moving to. The Bourne shell allows nine such parameters, each separated by white space,
and therefore nine pieces of information can be passed to a shell program from the
command-line.
While the Bourne shell only accommodates direct access to nine parameters (i.e. $1
through $9), nothing stops the user from entering more. The shell comes with a tool that
permits access to more than nine parameters. More generally, this tool permits the use of
an undetermined, at run time, number of parameters. This means that the script can let the
user enter as many parameters as he wishes on the command line and the program will
handle any or all of them as required. This tool is the shift command. The shift command
slides the parameter list by a specified number of places to the left and has the following
form:
shift n
where n is an integer that specifies how many places the list will be adjusted. If n is
not explicitly given, the parameter list is moved one place. The following illustrates how
the shift command works on a parameter list of size n. Before the shift command has
been used, the list would look like
$1 $2 $3 $4 ... $n
and after using shift, the old $2 is found in $1, the old $3 in $2, and so on, up to the
old $n in $(n-1). After the shift command has been used, the original first parameter is no
longer accessible. The
number of parameters in the parameter list drops by the number of places that the list was
shifted as well. For example, if a user input 5 parameters and then the script shifted the
list:
echo $#
5
shift
echo $#
4
where the 4 and 5 would be the output of the shell script. This presents a clever way of
accessing any number of parameters in a script. To ensure that all parameters have been
processed, a simple test command can be incorporated into the script as follows:
while [ $# -ge 1 ]
do
echo $1
.
.
.
shift
done
This script piece will print the parameters, carry out some unspecified tasks, and repeat
until all of the parameters passed have been exhausted.
If a script is written to expect a certain number of parameters, and the user places too
many on the command line, the program will only use as many as were required. This
means that if the script expects 2 parameters, and the user enters 3, only the first two
entered will be used. If, on the other hand, the user enters fewer parameters than the
script was written to expect, it will fill the un-entered parameters with a null value (i.e.
they will be empty).
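This behaviour can be sketched with the set builtin standing in for the command line:

```shell
# Give the shell only two parameters ...
set one two

# ... and reference a third: it is silently null, not an error.
echo "third=[$3]"
```

The echo prints third=[] because $3 was never given a value.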
Another new idea (from the previous example script) was that of having one if statement
within another, which is called a nested if statement. This allows tests and decisions to be
made as a result of earlier tests. Nested decisions allow very specific testing with less
coding than would otherwise be the case. Notice that the first if statement uses a !
character, which negates the test, or turns a TRUE value to a FALSE value and vice
versa.
The comments have been included to give some description of the task the script will
carry out. One commented line in particular gives the expected usage of the script which
allows the user to have knowledge of which order to enter the parameters on the
command-line. This is not required but is good practice - especially if the script is to be
used by people other than the programmer.
There is a third type of test that can be made. This is the integer comparison test. This
type of test is done when comparing the values of two integers. The first integer, the one
on the left, is compared with the one on the right. For instance, the following tests to see
whether Integer1 is greater than Integer2:
$ test Integer1 -gt Integer2
There are in total six comparisons that can be made between two integers: if they are
equal (-eq); if they are not equal (-ne); if Integer1 is greater than Integer2 (-gt); if
Integer1 is greater than or equal to Integer2 (-ge); if Integer1 is less than Integer2 (-lt);
and finally, if Integer1 is less than or equal to Integer2 (-le).
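A short sketch of these comparisons in use; test returns a TRUE (zero) exit status when
the comparison holds, and the bracket form [ ... ] is an equivalent spelling of test:

```shell
if test 10 -gt 5
then
echo "10 is greater than 5"
fi

if [ 5 -le 5 ]
then
echo "5 is less than or equal to 5"
fi
```

Both messages are printed, since both comparisons hold.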
In the Bourne shell, mathematical expressions cannot be simply assigned to a variable.
This means that the expression a=b+c is not valid. To get the result of an arithmetic
expression, the expr command must be used. This command evaluates the arithmetic
expression, and then returns the result. There are five arithmetic operators that can be
used in the Bourne shell: addition (+), subtraction (-), multiplication (\*), integer
division (/), and remainder (%).
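A sketch of expr in use; the result is captured with command substitution (backquotes),
and the * must be escaped so the shell does not treat it as a filename wildcard:

```shell
a=6
b=7

sum=`expr $a + $b`          # addition:       13
product=`expr $a \* $b`     # multiplication: 42
remainder=`expr $b % $a`    # remainder:      1

echo "$sum $product $remainder"
```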
Loop Control
While branching is an integral part of shell programming, it is only one part of program
control. Looping in a program allows a portion of a program to be repeated as long as the
programmer wishes. This can be for a specified number of iterations (or loops), or it can
be until a particular condition is met. For instance, a programmer might want to repeat a
particular operation on every file in a particular directory. Rather than rewrite the section
of the program that carries out the operation over and over for each file, the operation can
be written once and iterated as many times as required. Loop control also allows for
programs that are more general. Rather than having to pre-specify how many iterations
are required for a particular task, the programmer uses conditions to control the iterations.
The Bourne shell provides a rich variety of loop control constructs. Depending on what is
needed, the programmer can choose the construct that best suits his needs.
The for:in:do construct is used to repeat a group of commands once for each item in a
provided list. The construct has the following form:
for VARIABLE in LIST
do
COMMAND LIST
done
where VARIABLE is a variable name assigned each item in LIST during the execution of
COMMAND LIST. What happens is as follows: the variable takes the value of the first item
in the list and then executes the command list; after the command list has been passed
through, the variable is assigned the value of the second item in the list, and so on, until
the list has been exhausted. Searching for the occurrence of a string in a file could be
done as follows:
#! /bin/sh
#
# A script to look for the occurrence of a string in a file
# Usage: match [string] [file]
#
for word in `cat $2`
do
if [ "$word" = "$1" ]
then
echo "Found $1 in file $2"
else
:
fi
done
where the first parameter passed to the script is the string and the second is the file
thought to contain the string. Notice that a colon has been placed after the else
statement. This is the
null command which tells the computer to do nothing. This is clearly an unnecessary
section of the script and was only added to demonstrate the use of the null command.
If the list is omitted from the for statement, each parameter in the command line will be
passed to word.
A second type of loop control construct is the while loop. This construct, unlike the
for:in:do construct, checks the TRUE or FALSE value of a condition before proceeding.
It has the following form:
while condition
do
commands
done
where the condition can be obtained in the usual fashion using one of the forms of test.
The done statement signifies the end of the construct. The following script segment
counts backwards from 10 to 1:
number=10
while [ $number -ge 1 ]
do
echo $number
number=`expr $number - 1`
done
Another construct which is very similar to the while loop is the until construct. The
construct works in precisely the same manner with the one exception that it repeats a
series of commands until a condition is met. The until loop looks like
until condition
do
commands
done
To count backwards, as in the above example, the until loop would be used as follows:
number=10
until [ $number -lt 1 ]
do
echo $number
number=`expr $number - 1`
done
In the three looping constructs examined above, the loop would continue to execute until
a specific condition was met (a boolean condition in the while and until loops, or
completion of a certain number of loops in the for loop). There are times however when it
would be beneficial to exit a loop early. The Bourne shell provides two methods for
exiting from a loop early, the break command, and the continue command.
The break command will exit from the current loop and program control will be resumed
directly after the loop construct exited from. The break command takes an integer
parameter n which determines the number of levels to jump out of. For example,
until cond1
do
Command_A
until cond2
do
Command_B
if [ $? -ne 0 ]
then
break 2
fi
done
Command_C
done
Command_D
In this example, cond1 is evaluated and if it does not have a TRUE value Command_A is
executed. If cond2 has a FALSE value Command_B is then executed and the following if
statement checks to determine if the return value was TRUE. If the return value is TRUE
cond2 is again tested. If the return value of Command_B is not TRUE (i.e. non-zero), the
break command is executed. Since a parameter value of 2 has been passed to the
command, it jumps out two levels and Command_D is executed. Notice that
Command_C can only be executed if cond2 becomes TRUE.
The second method for exiting a loop early is the continue command. This command
behaves very much like the break command with the exception that it jumps out of the
current loop and places control at the next iteration of the same loop structure. The
continue command also takes an integer parameter which determines the number of
levels to jump back. Looking at the above example, with the continue command
replacing the break command.
until cond1
do
Command_A
until cond2
do
Command_B
if [ $? -ne 0 ]
then
continue 2
fi
done
Command_C
done
Command_D
In this example, Command_A is executed as before followed by the test of condition
cond2. If Command_B returns a FALSE value, the continue command jumps out of the
loop and condition cond1 is again evaluated, and if not TRUE, Command_A is again
executed. In this example Command_C will only be evaluated if a test of cond2 returns a
TRUE value, as above, but Command_D will only be executed if the test of cond1
returns a TRUE value. The continue command will not pass program control directly to
Command_D as it did in the first example.
When used in case structures, the break command is a pleasant alternative to the exit
command for handling unwanted choices as it allows for control to be passed to another
section of the program rather than exiting the program entirely.
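A sketch of this idea with an illustrative menu loop: the quit choice uses break to leave
the loop while the script carries on afterwards, which the trailing (done) marker records.
The choice names and the log variable are purely illustrative:

```shell
# Work through a list of menu choices, recording each one handled.
log=
for choice in add list quit add
do
case $choice in
add)  log="$log(add)" ;;
list) log="$log(list)" ;;
quit) break ;;          # leave the loop, not the script
*)    log="$log(?)" ;;
esac
done

# Control resumes here after break; the script has not exited.
log="$log(done)"
echo "$log"
```

The final echo prints (add)(list)(done): the second add is never reached, but the code
after the loop still runs, unlike with exit.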
• Functions
• Trapping Errors
Functions
Consider a pseudo-code program broken into small modules, each computing the area of a
shape:

circ_area ()
    area = Pi * rad * rad
    return area

squar_area ()
    area = side * side
    return area

tri_area ()
    area = 1/2 base * height
    return area
end
These modules are often called subroutines or functions. The difference between the
meanings is language dependent. In Bourne shell programming, these modules are called
functions. Another reason, even more valuable than maintenance, for using functions is
the ability to repeat a task many times without having to re-enter the same code whenever
it is needed. A good example of this is a sine function. A program which tracks a satellite
may have to calculate a sine function many times. If the list of commands to calculate the
sine function had to be written in to the program for each time it was to be used, the
length of the program would be ridiculous. Instead, a function which calculates the sine
of a number would be written once in a function and called when needed. In shell
programming it may become necessary to repeat a task many times in the context of a
larger task. For example, consider a function which might examine a file and print out
information pertaining to a particular string contained in the file. The actual program
might examine many directories containing many different files, but only want to
examine a certain type of file for this string. The main shell script could then call this
function only when it is needed rather than have it operate on all of the files examined by
the main program. This could save the user much time and make his entire program more
efficient. The following example of a function is an enhancement of the long listing in
UNIX (ls -l) in that it displays a title over each of the columns of information displayed:
$ list () {
> echo "Permission Ln Owner Group Size Modified Name"
> echo "---------- -- ----- ----- ---- -------- ----"
> ls -l $*;
>}
which could be used as follows:
$ list
Permission Ln Owner Group Size Modified Name
---------- -- ----- ----- ---- -------- ----
total 86
-rwx-r---- 1 normb users 60041 Dec 15 19:48 Bourne.tex
drwx------ 2 normb users 1024 Sep 18 3:13 mail/
drwxr-xr-x 2 normb users 1024 Sep 30 11:01 tmp/
$
Because of the use of the $* variable this long listing function will also take wildcards (or
filename meta-characters).
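The string-searching function described earlier in this section might be sketched as follows. The function name find_string, the .txt suffix and the sample usage are invented for illustration; they are not part of the original text:

```shell
# find_string: print lines containing a pattern, but only from .txt files.
# Usage: find_string pattern file...
find_string () {
    pattern=$1
    shift
    for f in "$@"
    do
        case $f in
            *.txt) grep "$pattern" "$f" /dev/null ;;  # /dev/null forces grep
                                                      # to show the file name
        esac
    done
}
```

A main script scanning many directories could then call find_string only on the files of interest, rather than running the search on every file it examines.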
Trapping Errors
This last topic on Bourne shell programming deals with the way programs can handle
interruptions. An interruption can be anything from a terminal being disconnected to the
computer running out of memory. What happens if an interruption occurs during
execution of a script? This is not an easy question to answer, as it depends on what kind
of interruption has occurred. If, for example, the power to the system was lost, there is
really nothing that can happen, since the computer will no longer be operational. If,
however, the user presses an interrupt (or break) key, the question is still open. Will the
program stop immediately, will it finish executing the current loop, or will nothing
happen at all? What actually happens depends on a
command called trap. The trap command allows the user to have the program carry out a
command string of one or more commands prior to exiting the script. If more than one
command is contained in the string, the commands should be contained in quotes as
described in the section on shell special characters, and command execution. The syntax
of the trap command is as follows:
trap command(s) signal(s)
One situation where this command is extremely useful is where information is being
written to a file (to be removed at normal exit of the program) and an interruption (or
using correct UNIX terminology, a signal) is sent which would by default cause the
program to terminate. If such a signal arrived prior to normal exit from the
program, the default action would terminate it immediately and leave the temporary files behind.
Adding the following to the script would allow the clean-up to occur before termination
due to a user break (control-C):
trap 'rm tmp/*; exit 1' 2
Care must be taken for a couple of reasons when trapping signals. First, the signals which
may be trapped vary from machine to machine, and second, the definite kill signal (9)
cannot be trapped on any machine. What follows is a typical table of signals and the
event which causes them:
Table 2.9: Signals and the event which causes them.
The details of this list are not really important as there are only a few which will be
trapped in ordinary day-to-day activities: 0, 1, 2 and 15. One should also note that the
default (i.e. not using trap) is always immediate termination of the script.
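The situation just described can be sketched as follows. The file name is invented, and the signal numbers are the commonly trapped ones (1 is hangup, 2 is the interrupt key, 15 is software termination, and 0 is the shell's exit pseudo-signal):

```shell
# Remove the temporary file whether the script ends normally
# or is interrupted by a signal.
tmpfile=/tmp/work.$$                  # $$ expands to the process id
trap 'rm -f $tmpfile; exit 1' 1 2 15  # clean up, then quit, on a signal
trap 'rm -f $tmpfile' 0               # clean up at normal exit as well

echo "scratch data" > $tmpfile
# ... the main work of the script would go here ...
```

Because signal 0 is trapped as well, the clean-up also runs when the script finishes on its own.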
Dot Files
Throughout this chapter there have been mentions of the environment, such as
environmental variables. For instance, it was shown that the shell variable PS1 allows the
user to change the appearance of his prompt during an interactive
session. What if, however, a user would like to have that prompt setting as the default for
every session? As was mentioned earlier, the prompt setting can be placed in a file called
.profile in his home directory which is executed at each login. This file is simply a script
that the shell looks for at login and will execute at that time. This is where any
environmental variables should be set, such as $PROMPT and $PATH. There will likely be
a file /etc/profile which will also execute during login. This is a file that the system
administrator will have set as a default so that users do not need a .profile in their own
directory. It also allows the system administrator to set some things like the path so that
the users on the system can have access to all of the executables without having to try and
figure out the paths for them. Depending on the user's level of access, the /etc/profile is
often a good file to copy into his directory as a skeleton to start with. As well as
environmental settings, function definitions can also be placed in this file. This means that
the list function which was described in the section on functions can be used during every
session without having to be re-entered. It is always good practice to keep the functions
in their own file to prevent cluttering up the .profile file. As will be seen in later chapters,
the newer shells can have several of these startup files which can be quite confusing if not
organized in some fashion. The functions can all be put into a file called .functions which
would then be executed by the .profile file. The .functions file can be read in with the .
command or, in the C shell and several newer shells, with the equivalent source
command. The .profile script would then contain one of the following two lines:
source .functions
or
. .functions
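The arrangement can be sketched as follows, using the list function from the section on functions. A temporary directory stands in for the home directory here, so that the sketch does not disturb any real startup files:

```shell
# Keep shell functions in their own file and read them in from .profile.
dir=`mktemp -d`            # stand-in for the home directory in this sketch
cat > "$dir/.functions" <<'EOF'
list () {
    echo "Permission Ln Owner Group Size Modified Name"
    echo "---------- -- ----- ----- ---- -------- ----"
    ls -l "$@"
}
EOF

# The .profile would contain the line:   . $HOME/.functions
. "$dir/.functions"
list "$dir/.functions"     # the function is now available in this shell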
The C Shell
• A Bit of History
• Getting Started with the C Shell
• Command Execution
• Redirection and Pipelines
• Filename Substitution
• Filename Completion
• Special Characters
• Command History
• Aliases
• Variables
o User Defined Variables
o Array Variables
o Global or Environment Variables
o Parameters
o Special Read Only Variables
o Variable Modifiers
• C Shell Programming
o Testing and Branching
o Signal Handling
• Job Control
• Special Files
o Introduction to UNIX Security
A Bit of History
The shells in this book are grouped into two main families, the Bourne shell family, and
the C shell family. Since the Bourne shell has already been covered, the logical next step
is to examine the C shell, the head of the other major family.
The C shell was developed by Bill Joy, the creator of the vi text editor, at the University
of California, Berkeley. He improved on the Bourne shell in many areas by adding some
new features, and altering some of the original features. He based parts of the syntax on
the C programming language, but the C shell and C programming languages are two very
different entities.
Getting Started with the C Shell
As with the Bourne shell, a good place to start is by checking that the current shell is in
fact the C Shell. The echo command can be used to see which shell is running:
echo $SHELL
which should give
/bin/csh
if the C shell is the current operating shell. If the C shell is not the login shell, the command
chsh /bin/csh
will change the login shell (the shell that is started every time you log in). Alternatively,
an interactive C shell can be invoked temporarily, lasting until the end of the session, by
typing csh. The default prompt in the C shell is the percent symbol (%), which
will be used as the default prompt for the rest of this chapter. The following is the typical
appearance and execution of a command line in the C shell:
% ls -F
Mail/ News/ bin/ message.txt slide.ps
%
This is essentially the same as the output from the same command in the Bourne shell
with the exception of the different prompt. All shells handle command lines in the same
manner. They first accept the user input and then they break it down into individual
components such as the command to be executed and any flags that go with it.
The C shell allows long command lines in the same manner that the Bourne shell did. If a
command line is too long for the screen (usually 80 characters on most monitors), simply
break the line with a backslash, or in other words, escape the return character. This
backslash character tells the shell to ignore the enter-key when it is pressed. A use of this
might be when copying a list of files into another directory:
% cp file1 file2 file3 file4 file5 file6 \
> file7 file8 file9 file10 file11 file12 \
> ~jones/data/storage/old_files
%
which would allow all of the files to be copied with a single command. The > character
signifies the continuation of a command line onto the next terminal line. The ~ character,
or tilde (pronounced til-duh), is a special character in the C shell that is understood by the
shell to mean the current user's home directory. This and other special characters will be
examined in the section on filename substitution.
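The effect of the escaped return is easy to verify: the backslash-newline pair simply vanishes when the line is parsed, so the fragment below (file names invented for the demonstration) is a single echo command in either the C shell or the Bourne shell:

```shell
echo alpha.txt beta.txt \
     gamma.txt \
     delta.txt
# prints: alpha.txt beta.txt gamma.txt delta.txt
```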
Command Execution
The C shell allows commands to be entered in different ways. Grouping multiple
commands into a single command line, running commands in the background, and
building complicated commands from commands inside of commands can all be handled
by the C shell.
To execute multiple commands on a single command line, simply place a semicolon
between each of the commands. The commands will be executed from left to right with
each successive command being executed after the previous one. For example, the
following command line changes to a directory containing documents which need to be
moved. A new directory is created for the documents, which are then all moved into it.
% cd docs; mkdir old_docs; mv *.* old_docs
This task could have been accomplished by executing each of the commands separately
on a separate line:
% cd docs
% mkdir old_docs
% mv *.* old_docs
If all of these commands execute quickly, this is a good method for carrying out tasks. If,
however, one or more of the tasks takes a significant amount of time to execute, the user
will be left waiting for its completion, wasting time. To get around this problem, time
consuming commands can be run in the background. This means that the shell handles
the execution and waits for its completion without the user having to wait to do other
things. For example, if the user needs to carry out a few routine disk management tasks,
including archiving all of his directories and subdirectories, he might wish to have the
archiving done in the background.
% rm ~/tmp/*; tar cfv my_dirs.tar ~ &
Sometimes a situation might arise where the user would like to have commands executed
such that one command is the argument (or input) of another command. For example, the
user may wish to count the number of users logged into the system at the present time.
One way to do this would be to type who and count the entries by hand, but a better way
would be the following:
% echo There are `who | wc -l` users logged on
In this example, the left single quotes (backquotes) tell the shell to execute what they
enclose first, and to substitute the result into the command line. The who command
generates one line for each user currently logged onto the system, and wc -l counts those
lines; the resulting number is then substituted into the echo command line, so only the
finished message appears on the screen.
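Backquote substitution works the same way with any pair of commands. The quick experiment below uses Bourne-style assignment so that it can be run as a plain script; the backquotes themselves behave identically in the C shell. The scratch directory and file names are invented for the demonstration:

```shell
# Embed the output of one command (a file count) inside another.
dir=`mktemp -d`
touch $dir/a.dat $dir/b.dat $dir/c.dat
echo "There are `ls $dir | wc -l` files in the scratch directory"
```

The ls listing never reaches the screen; it is consumed by wc -l, and only the finished sentence is printed.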
A group of commands can also be executed in a subshell. This means that a group of
commands can be executed in a shell other than the interactive shell currently being used.
To do this, the user would enter a list of commands separated by semicolons inside of
parentheses ( ). The following example will illustrate the syntax of this type of command
while also demonstrating the way in which variables are affected by use in subshells.
% set MY_NAME = 'N Buchanan'
% (set MY_NAME = 'Joe Blow'; echo $MY_NAME)
Joe Blow
% echo $MY_NAME
N Buchanan
It can be seen that the same variable is used in both the current interactive shell and in the
subshell. The values of this variable are different however. The good news is that the
subshell did not overwrite (or redefine) the variable in the current shell. The bad news is
however that this type of variable usage can become quite confusing and cause the user to
lose track of what a variable's purpose is. This type of variable usage is therefore
discouraged.
The C shell provides a mechanism for repeating a command a specified number of times. While
on the surface this may seem like a useless feature, it can come in handy. The format of
this repeat command is simply
% repeat n command
where n is the specified number of repetitions and command is the command to be
repeated. A particularly useful way to use this feature is to repeatedly echo something to
the screen:
repeat 2 echo ********************
echo ' Error'
repeat 2 echo ********************
which would then be executed as follows:
********************
********************
Error
********************
********************
which would certainly add a sense of urgency to any error message. The alert reader
might object that the output would be interrupted by the entry of each successive
command line. While this would be true in interactive mode, the omission of a prompt will be used in
this book to signify a shell script or shell script fragment (which the above example is). In
a shell script, the commands are executed in sequential order. This will be covered in
more detail in a later section. One thing to keep in mind when using the repeat command
is that the command to be repeated must be a simple command and not a compound
command using redirection or pipelines (which will be covered in the next section).
Filename Substitution
The C shell allows the user to specify a filename or group of filenames without having to
explicitly type in the entire filename. This is called filename substitution. For instance,
the user might wish to list all of the C source files in a directory, or perhaps list all but the
C source files in a directory. The most common type of filename substitution uses the
metacharacter (*). This character tells the shell to substitute any string of
characters, so that all possible matches are made. For example,
% ls *.c
prog.c prog2.c string_search.c
% rm p*.c
% ls *.c
string_search.c
This type of substitution is quite robust since any number of strings will be tried until the
appropriate ones have been chosen. There are however, more specific substitutions that
can be made. The (?) character tells the shell to substitute exactly one character when
making matches:
% rm pro?2.c
While this looks like a job for the metacharacter (*), it must be remembered that if a file
called program2.c existed, it would be destroyed if the (*) was used rather than the (?).
The [] substitution will try to insert each character in the brackets into the filename. If a
range is specified by use of a dash, each character in the range will be tried, as well as
any other characters in the brackets. For example, rm file[a-df].txt would delete all of the
following files: filea.txt, fileb.txt, filec.txt, filed.txt, and filef.txt if they existed. The
negation character (^) could be used to prevent substitution of characters in similar
fashion. All of the above were provided in the Bourne shell, but the C shell provides a
couple more types of filename substitution. The curly brace {} will tell the shell to insert
each provided string, separated by commas, into the filename:
% cp data{june,july,august}.95 old_data
would copy the files datajune.95, datajuly.95, and dataaugust.95 into the old_data
directory.
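The bracket range described above can be tried safely with ls rather than rm. The file names below are invented for the demonstration, and Bourne-style assignment is used so the fragment runs as a plain script; the glob pattern itself behaves the same way in the C shell:

```shell
# Try the [a-df] range against six candidate files.
dir=`mktemp -d`
cd $dir
touch filea.txt fileb.txt filec.txt filed.txt filee.txt filef.txt
ls file[a-df].txt    # matches filea..filed and filef, but not filee.txt
```

Only the five files whose distinguishing character falls in the range a-d or equals f are listed; filee.txt is untouched by the pattern, which is exactly why the range form is safer than a bare * when used with rm.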
While it may seem that every conceivable type of filename substitution has been covered,
the C shell recognizes one more special character. In the C shell, the tilde (~) character is
recognized as the current user's home directory. If the user
types cd ~ anywhere on the system he will be returned to his home directory. If the tilde
(~) is followed by the user name of another user on the system, this user's home directory
is the result. For example,
% cp file ~smith
will copy file to user Smith's home directory if the appropriate write permissions are set.
Filename Completion
It can clearly be seen that filename substitutions can be used to refer to files which are
long to type or possibly off the screen and thus difficult to refer to. For example, a
directory might contain several hundred files (take a look at /usr/bin) and a user might
want to examine a data file that he can only remember starts with the month it was
created, say june. He could pipe the listing to the more command:
ls -l | more
which would allow the user to examine all of the files a page at a time, or he could try:
head june*
which would display the first few lines of each file starting with june, thus allowing the
user to examine the files and find which one he wants to look closer at. There are many
ways to handle the problem using various commands and filename substitutions, but the C
shell provides an even better method, filename completion. If the variable filec is set
(refer to the section on variables) using the command set filec, a filename can be
completed at any time during command line input using the ESC-key. For example, if
there is a subdirectory called all_of_my_data_files in the current directory, the user could
type the following:
cd al[ESC]
at which point the rest of the name would be filled in by the shell. This is of course only
true if the rest of the filename is unique. If it is not, the shell will beep alerting the user
that the completion is ambiguous. This would occur if there were three files in the current
directory starting with ``th'', and the user tried filename completion after entering just a
``t''. One way to fix this problem is to enter the EOF (end-of-file) key combination
(usually Control-D, often written ^D) rather than the escape key, at which point a list of
choices will be displayed by the shell. The following session illustrates the above (the
text following the hash marks is comment only, included for clarity):
% ls
Mail/ News/ data_june95/ data_march95/ data_may95/ cleanup*
% cd d[ESC] # system beep sounds
% cd data_ # actually the same line as above
% cd data_[^D] # still the same line as above
data_june95 data_march95 data_may95
% cd data_j[ESC]
% pwd
/home/jblow/data_june95
This may look a bit confusing, but with a bit of practice, it will become an invaluable
tool. The example above illustrates how the shell will complete as much of the command
as it can before ambiguity sets in. This will allow command completion at any point of
the desired filename so that the user will not have to enter input to the point of unique
characters before using completion.
Special Characters
It might become apparent at this time that the C shell recognizes a good number of
characters that have special meanings (~, >, <, & and \, just to name a few). This
presents a bit of a difficulty if one is not careful. To see why, try this:
% echo ~ Hi! ~
/home/normb Hi! /home/normb
It is not what one would expect, but this is the way the shell interprets command lines.
After the command echo, the shell makes any substitutions of special characters, and
then proceeds to fill in the rest. In the above example, the tilde was meant as an artistic
touch, not the user's home directory. Unfortunately the shell has no way of interpreting a
user's intentions. There are however, ways to deal with special characters. One of these is
to escape the character in question. The reader might remember that escaping a character
was discussed when discussing command lines that were too long. In that case a
backslash (\) was used so that the shell did not interpret the enter-key as an end of
command line signal. Here escaping a character is precisely the same. To escape a
character simply place a backslash immediately to the left of the character - no spaces.
This will work for any special character. Here is a listing of the characters in the C shell
that are considered special: ; & ( ) | * ? [ ] ~ { } ! < > ^ " ' \ ` $ as well as whitespace
(space, tab or newline). To correct the previous example, the escaping method could be
used:
% echo \~ Hi! \~
~ Hi! ~
Notice that the ! character is listed as a special character above, and yet is not escaped in
the example. The reason for this will become clear in the section on command history,
but a brief explanation is that, alone the ! character is not special. It becomes special
when followed by almost anything else - so be careful.
The C shell, like the others, allows enclosing text in quotations to prevent interpreting
special characters. This is a bit nicer when the number of characters which could be
misinterpreted is more than one, and especially if white space is involved. Suppose for
example that a person wanted to have a word or phrase centered on the output device.
Maybe a title that acted like a heading for some output. Simply typing the title like the
following would not work
% echo Title
Title
since the shell treats any run of spaces and tabs as a single argument separator, regardless
of how many were entered. If the text (including whitespace) was placed inside of double quotation
marks however, the shell will leave the whitespace alone allowing the centering on the
display.
% echo "                               Title"
                               Title
The double quotes will allow all special characters to be taken as text rather than with
special meaning, with the following exceptions: " (the ending double quotation), $ (variable
substitution), ` (command substitution), \ (next character escape), ! (history character),
and NL (newline character). The double quote character is used to signify the termination
of the quoted text, and the $ and ! characters will be covered in the following sections.
Another option is to use the right single quotation character ('), which behaves like the
double quotations with the exception that only the history character (!), the newline (NL)
character, and the right single quote will be recognized as special characters by the shell.
The following example shows the difference between single and double quoting of text:
% ls *.gz
lotsOdata.gz
% echo "Current shell is $SHELL Last command was !!"
Current shell is /bin/csh Last command was ls *.gz
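The contrast between double and single quotes is easy to reproduce with any variable. $HOME is substituted the same way in the C shell and the Bourne shell, so the fragment below runs under either:

```shell
# Double quotes permit variable substitution; single quotes suppress it.
echo "My home directory is $HOME"    # the path is substituted
echo 'My home directory is $HOME'    # prints the literal text $HOME
```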
Command History
While there are indeed many ways described above to reduce the work involved in
command line input, it can still be a daunting task to repeat a command over and over.
There will often be times where a command or filename has been misspelled and requires
reentering, or other times where a command line will be purposely reentered, such as a
debugging session with multiple recompilations. The C shell introduces a mechanism for
making this much easier, the history mechanism. This is simply a record that is kept
detailing the previous commands that the user has entered. These commands can then be
accessed in a variety of ways. The simplest is with the up and down arrows on the
keyboard. If the keyboard is one with the arrows on the number pad as well as the arrow
pad, the current key mapping will determine if the number pad arrow keys will work.
They may not. As a simple example of the history mechanism, take the following session:
% emacs myprog.cc
% g++ -I /users/normb/my_include_files -o myprog myprog.cc
% myprog
various annoying run time errors
%
The user must now attempt to find the cause of the errors and fix them. Rather than
retype the lines again he uses the arrow keys:
% myprog (up once)
% g++ -I /users/normb/my_include_files -o myprog myprog.cc (up twice)
% emacs myprog.cc (up three times -- the desired result)
It should be noted that the text in parentheses is the author's notes, and that each
successive arrow keypress would display the command on the same line. For this
particular example only a few keystrokes are saved but as the process continued it is
rather obvious that the arrow keys are the preferred alternative. The command history has
a restricted size as it is stored in memory as opposed to on the disk. The actual number
will vary from machine to machine, but usually 20 to 40 is reasonable. The number can
be set with the history environment variable which will be covered later. When the
number of history commands exceeds the limit, the oldest are removed to make room for
the newest.
While the arrow keys prove very useful in accessing previous command lines, they are
not the only way to access the history list. The C shell provides different ways to access
the command lines depending on taste and situation. Before looking at the different
methods of accessing the list, it would probably be a good idea to see what the list looks
like. The history list for the previous example would look something like this:
% history
1 emacs myprog.cc
2 g++ -I /users/normb/my_include_files -o myprog myprog.cc
3 myprog
%
The actual numbering may vary depending on how many commands have been entered.
Now, looking at this command list it is easy to see that if the list gets quite long, say 10 or
more items, the number of keystrokes saved begins to diminish. The C shell alleviates
this problem by giving the user a variety of ways in which to access earlier commands.
The C shell uses the exclamation mark (!) as one quick method to make history
substitutions. By entering two consecutive !'s and the enter key the user will have the last
command in the history list executed. The last command would now appear twice as the
last two entries in the history list. Entering the ! followed directly by a number will
execute the history list entry with that number. For example, the command !1 would
execute the command emacs myprog.cc in the history list above. Another
method is to enter a string after the !, which would have the command starting with that
string in the history list executed. For example (again using the above history list), the
command !g++ would invoke the rather lengthy command:
g++ -I /users/normb/my_include_files -o myprog myprog.cc
Notice that the string was entered immediately after the ! character. If this was not the
case, an error would occur as the ! character alone is not a recognized command by the
shell. If there is more than one item in the history list that starts with the string, the one
closest to the end of the list (the one with the highest number) will be substituted. By
framing the string with question marks, like !?string?, the shell will execute the
command which contains the string anywhere in it. To execute the command in the last
example the user could have entered the command !?-o. The history list can also be
executed from the bottom up. For instance, !-2 would execute the second last command
added to the history list. By default, when the user logs out, the history list is purged from
memory and lost. The user can, however, set the variable savehist to a number, which will
save that many commands (or fewer, if the history list does not contain that many) to a
file called .history in the user's home directory. The larger the number of commands
saved from the history list, the longer it will take the C shell to start at login time. This is
because the saved history list has to be loaded into memory. Typically 20 to 40 is a good
range to set the
savehist variable to. To ensure that this variable is set for each session, it can be added to
the .cshrc file. At this point the reader may well be overwhelmed by the number of
different ways to save keystrokes. It might seem that there is an awful lot of remembering
involved in saving a few keystrokes. While this is definitely true, the old adage holds,
practice makes perfect. It will likely take many hours of UNIX use before many of these
tricks will be burned into a user's mind - but they will be. Every UNIX user has his own bag
of tricks to get the job done, and it will differ from one user to the next.
Aliases
Now it is time to start exploring the power of the C shell. This is the point where the C
shell really starts to stand out over the Bourne shell. The C shell enables the user to
define complicated, or not, commands in terms of easy to type aliases. These aliases can
save a heavy user literally hours of typing. As a simple example, take the command for
getting a long listing of files in a directory, ls -l. While this is not very difficult to enter, it
can be simplified using an alias:
% alias ll ls -l
After this, for the rest of the current session, typing ll would result in the long listing of
the current directory being displayed. The alias could even be made permanent for every
session by including it in a login file, but that will be covered later. A list of active
aliases is kept by the shell. To see a list of aliases the user would type alias with no
arguments. To check a particular alias, the alias command would be entered with the
name of the alias in question as the only argument. If the alias does not exist, the prompt
will simply reappear on the next line. The following session demonstrates the use of the
alias command:
% alias ll ls -l
% alias bye exit
% alias
ll ls -l
bye exit
% alias bye
exit
To remove an alias, one simply types unalias followed by the alias to be removed:
% alias ll
ls -l
% unalias ll
% alias ll
%
More complicated aliases can be designed using the techniques outlined in earlier
sections (like command grouping and escaping special characters). When defining an
alias, remember that the shell processes the definition at that time. This can be a bit of a
problem if one is not careful. The following example illustrates the problem:
% alias ll ls -l d*.*
% alias ll
ls -l data1.95 data2.95 doc.tex
% cd ~another_usr
% ll
/bin/ls: data1.95: No such file or directory
/bin/ls: data2.95: No such file or directory
/bin/ls: doc.tex: No such file or directory
The problem can be fixed by simply enclosing the definition in quotes. Here, single or
double quotes will work as no variable substitutions are made. When the definition is not
enclosed in quotes, the wildcard is expanded at the moment the alias is defined, so the
alias permanently contains the file names from the directory it was created in. When used
in another directory, the shell still looks for those original files. This may seem a bit confusing,
but it really comes down to the order in which expansions and substitutions are executed
during the aliasing process. It never hurts to enclose things in quotes (unless a particular
special character is required) so it is wise to use quotes in aliases as often as possible.
Aliases can also be used to redefine current commands. The C shell will not prevent a
user from naming an alias after, say, ls. This means that one could redefine ls to behave
similarly to, or completely differently from, the original. A good use of this property is as follows:
% alias rm rm -i
which would cause the shell to prompt the user on the deletion of each file given as an
argument. Never a bad idea. Redefining commands using alias may seem dangerous, but
keep in mind that these aliases can be removed as well using the unalias command.
Variables
A variable in the C shell can be any name containing up to 20 letters and digits which
starts with a letter or an underscore. A variable is a name or placeholder used to store one
or more values. Variables give a high level of generality to everyday interactive shell
usage, and are heavily used in shell programming (which will be discussed in a later
section). In interactive shell use, a variable can be treated very similarly to an alias, with the
exception that a variable must be referenced (used) with a dollar sign. This dollar sign
tells the shell that what follows is a variable, and that the shell should make the
appropriate substitution. In order to use a shell variable in the same manner that an alias
was used earlier, the long listing can be revisited:
% set ll="ls -l"
% $ll
Ignoring the set command for a moment, the variable ll is given the text string ls -l as a
value, and when the variable is referenced on the following line, the text string is
substituted. When the user enters the variable name (preceded by a dollar sign) followed
by a carriage return, the shell interprets the text string as though the user had typed it in
himself. The shell does not care how the command line gets entered, it parses it the same
whether it was a history substitution, variable substitution, alias or standard input. The C
shell variables can be categorized into five types. These types are as follows:
• User Defined - local to current shell (assigned values with command set),
• Parameters - command line arguments,
• Environmental - global (assigned using command setenv),
• Special - read only variable (assigned by the shell) and
• Array - multiple valued variable.
Before each of these types of variable is covered, the way in which variables are accessed
should be examined. While it is true that a variable must be accessed with a $ in front,
that is not the end of the story. It should be remembered from the discussion of quoting
special characters that a text string enclosed in double quotes will allow the shell to make
variable substitutions while a string in single quotes will not. Consider the following
example,
% set NAME=''John Smith''
% echo ``His name is $NAME''
His name is John Smith''
% echo 'His name is $NAME'
His name is $NAME
where the single quoted text string is echoed verbatim. Another complication that can
arise when accessing variables as part of a longer string is, for example, the variable DAY
is assigned the value ``Tues'', and then an attempt is made to make the string ``Tuesday''
containing the variable DAY the following problem arises:
% set DAY=Tues
% echo $DAYday
Dayday: Undefined Variable
%
What has happened here is that there is no whitespace between the variable name and the
rest of the string. When the shell parses the command line, it interprets everything
following the $ and up until whitespace as the variable name. Since DAYday was not
defined as a variable (given a value) the shell displays an error message. To alleviate this
problem, curly braces ({}) can be used. If a curly brace follows directly after a $,
everything inside of the closed curly brace pair will be considered a variable name.
% set DAY=Tues
% echo ${DAY}day
Tuesday
Variables names are case sensitive so DAY, Day and day are all different. It is not good
practice to define more than one variable with the same name but different case. In fact it
has become standard for shell variables to be given all upper case characters.
It is also important to note that when assigning a value to a variable, the shell considers
everything between the equals sign and the next whitespace to be the value of the
variable. It is thus important that no whitespace (other than what is meant to be used) is
introduced during a variable assignment. If whitespace is to be used, the appropriate
quotation characters must be used.
• User Defined Variables
• Array Variables
• Global or Environment Variables
• Parameters
• Special Read Only Variables
• Variable Modifiers
Variables
A variable in the C shell can be any name containing up to 20 letters and digits which
starts with a letter or an underscore. A variable is a name or placeholder used to store one
or more values. Variables give a high level of generality to everyday interactive shell
usage, and are heavily used in shell programming (which will be discussed in a later
section). In interactive shell use, a variable can be treated very similarly to an alias with the
exception that a variable must be referenced (used) with a dollar sign. This dollar sign
tells the shell that what follows is a variable, and that the shell should make the
appropriate substitution. In order to use a shell variable in the same manner that an alias
was used earlier, the long listing can be revisited:
% set ll="ls -l"
% $ll
Ignoring the set command for a moment, the variable ll is given the text string ls -l as a
value, and when the variable is referenced on the following line, the text string is
substituted. When the user enters the variable name (preceded by a dollar sign) followed
by a carriage return, the shell interprets the text string as though the user had typed it in
himself. The shell does not care how the command line gets entered; it parses it the same
whether it came from a history substitution, variable substitution, alias or standard input. The C
shell variables can be categorized into five types. These types are as follows:
• User Defined - local to current shell (assigned values with command set),
• Parameters - command line arguments,
• Environmental - global (assigned using command setenv),
• Special - read only variable (assigned by the shell) and
• Array - multiple valued variable.
Before each of these types of variable is covered, the way in which variables are accessed
should be examined. While it is true that a variable must be accessed with a $ in front,
that is not the end of the story. It should be remembered from the discussion of quoting
special characters that a text string enclosed in double quotes will allow the shell to make
variable substitutions while a string in single quotes will not. Consider the following
example,
% set NAME="John Smith"
% echo "His name is $NAME"
His name is John Smith
% echo 'His name is $NAME'
His name is $NAME
where the single quoted text string is echoed verbatim. Another complication can arise
when accessing a variable as part of a longer string. If, for example, the variable DAY
is assigned the value "Tues" and an attempt is then made to build the string "Tuesday"
around the variable DAY, the following problem arises:
% set DAY=Tues
% echo $DAYday
DAYday: Undefined variable
%
What has happened here is that there is no whitespace between the variable name and the
rest of the string. When the shell parses the command line, it interprets everything
following the $ and up until whitespace as the variable name. Since DAYday was not
defined as a variable (given a value) the shell displays an error message. To alleviate this
problem, curly braces ({}) can be used. If a curly brace follows directly after a $,
everything inside of the closed curly brace pair will be considered a variable name.
% set DAY=Tues
% echo ${DAY}day
Tuesday
Variable names are case sensitive so DAY, Day and day are all different. It is not good
practice to define more than one variable with the same name but different case. In fact it
has become standard for shell variables to be given all upper case characters.
It is also important to note that when assigning a value to a variable, the shell considers
everything between the equals sign and the next whitespace to be the value of the
variable. It is thus important that no whitespace (other than what is meant to be used) is
introduced during a variable assignment. If whitespace is to be used, the appropriate
quotation characters must be used.
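To illustrate (a minimal sketch; the variable name GREETING is made up for the example):

```csh
#!/bin/csh -f
# Unquoted whitespace splits one assignment into two
set GREETING=Good Morning     # sets GREETING to Good (and sets an empty variable Morning)
echo $GREETING                # prints: Good
set GREETING="Good Morning"   # the quotes preserve the space
echo $GREETING                # prints: Good Morning
```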
Array Variables
The Bourne shell restricted variables to single values. The C shell imposes no such limit.
It provides the user with array variables, which are variables that contain two
or more discrete values. Ignoring proper C shell notation for a moment, an example of an
array would be a variable called alphabet which contained each of the 26 letters A
through Z. To access a particular letter the variable might be followed by the number of
the sequence which contained the desired letter. For example, the letter C might be
referred to as alphabet(3), while Z would be referred to as alphabet(26). An array is a
nice way to store related elements which could be accessed under the same name. In the
C shell, array variables can be set as follows. All of the array elements can be set at one
time by enclosing the list with parentheses and separating the elements with white space
(usually spaces). For example, to set an array of pets:
% set PETS=(Cat Dog Goldfish Horse Boa Hamster)
To access an element of a variable the variable name is followed by square brackets
containing the element number:
% echo $PETS[3]
Goldfish
% echo $PETS[3-5]
Goldfish Horse Boa
The example illustrates how more than a single element can be accessed at once using the
- operator. Placed before a number, this operator accesses all of the elements up to and
including that one; placed after a number, it accesses from that element through to the
end of the list. For example,
% echo $PETS[-3]
Cat Dog Goldfish
The entire array variable list can be accessed by placing an * in the square brackets.
Table 4.2 summarizes the ways to access array elements.
Parameters
Parameters (or command line arguments) are shell variables which carry information
pertaining to a command line. To reference a parameter, the dollar sign is placed before a
number from 0 to 9. The variable $0 is special and refers to the name of the current
program which would be csh if $0 was accessed at the command line prompt. While the
number after the $ must be 9 or less, higher numbered arguments can be accessed using
the $argv[n], where n is a positive integer. To access the 11th command line argument
the following variable would be used: $argv[11]. The - operator can be used in the same
manner as it was for array variables. A shorthand version of the $argv[n] notation is to
simply enclose the number in curly braces, as in ${11}. Parameters may seem strange and
useless, but their importance will become evident in the sections on shell programming.
Variable Modifiers
Variable substitutions can be altered in a variety of ways using variable modifiers. A
modifier is incorporated into a substitution by appearing immediately after the variable
name. The general format is
% $Variable_Name:M
where M is one of the modifiers summarized in table 4.5.
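Assuming table 4.5 includes the common pathname modifiers :h (head), :t (tail), :r (root) and :e (extension), a brief sketch of their effect:

```csh
#!/bin/csh -f
# Pathname modifiers applied to a single variable
set FILE=/home/norm/report.tex
echo $FILE:h    # the head (directory part): /home/norm
echo $FILE:t    # the tail (filename): report.tex
echo $FILE:r    # the root (extension removed): /home/norm/report
echo $FILE:e    # the extension alone: tex
```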
C Shell Programming
Many of the sections leading up to this point in the chapter have hinted at the fact that the
material covered in those sections would be better utilized in shell programs (called shell
scripts). Indeed shell programming is the most powerful aspect of the C shell (or any
shell for that matter). The C shell provides a rich scripting language that, at best, has a
slight similarity to the programming language C. Shell scripting languages provide the
user with a great many tools for handling everyday tasks around the system and even
some less everyday tasks. A shell script can contain UNIX commands as well as the shell
commands discussed in the previous and following chapters. Unlike compiler based
languages, shell scripts are executed by the shell one line at a time. While this will
obviously make for slower performance, advantage is gained in the ease of modifying
programs without all of the hassle of compiling and linking. All that is required for a
shell script to be executed is that it be made executable with the following command:
% chmod u+x script_name
Shell scripts are often written to handle some of the more tedious tasks that a user
encounters on a regular basis. A simple C shell script could be a list of UNIX commands
that archives and compresses the user's home directory and copies it to a specified
mounted disk partition for storage:
#!/bin/csh
# backup tars and compresses ~/ and puts in storage on /dsk2/strg/
#
tar -cvf dec18_95.nbdat.tar ~/
compress dec18_95.nbdat.tar
cp dec18_95.nbdat.tar.Z /dsk2/strg/
With the exception of the lines that start with hash marks (#), the script is a list of simple
UNIX commands. This task could most certainly have been entered on a single command
line with use of a pipeline, but it illustrates the basic format of a C shell script. Almost
any line starting with a hash mark will be ignored by the shell and hence indicates a
programmer comment. The one exception to this rule is the hash bang (#!) sequence of
characters, which has special meaning to the shell. It tells the shell which environment to
start for execution of the script. This could be any shell or even other scripting
environments such as perl (Practical Extraction and Report Language) [1] or tcl (Tool
Command Language) [2] which are UNIX scripting languages but not shells (at least not
interactive shells like those discussed in this book). The shells are usually found in the
/bin directory, but this might differ from system to system. The powerful feature of shell
scripts over simply writing the commands on a command line is that scripts can contain
many types of safety, logging, and other features to provide a worry free and organized
working environment. As the scripts in this chapter begin to become more complex, this
point should become clear.
This example could have been done relatively easily with nested if statements, but if
there were even one or two more choices the if-then-else construct begins to lose its
appeal.
Loop Control
Along with control branching, loop control gives programs their power.
Branching gives a program the ability to determine the direction
program execution will take and looping allows the program to
repetitiously execute one or more commands, until some condition is or
is not met.
This is after all what gives computers their power.
The C shell provides two tools for controlling program looping, the while construct and
the foreach construct. A while loop can be best explained in plain English as follows:
while a certain condition is met, a group of commands will be repeatedly executed. As
soon as the condition is not met, the execution ceases. The actual form of a while
statement is:
while (condition)
    command_list
end
If the condition is never met, the command list is never executed, while on the other hand
if the condition is always met, the execution will never stop. For example, the while loop
in:
set A=1 B=10
while ($B < $A)
    echo $A
    @ B -= 1
end
will never execute, while in:
set A=1 B=2
while ($B > $A)
    echo $B
    @ B += 1
end
will execute forever (not really).
The while loop can be used for many things, but one very useful purpose is to handle
arguments given on the command line issuing the script. The following script fragment
will echo back to the screen the arguments given:
while ($#argv)
    echo $argv[1]
    shift
end
The shift statement works to move the elements of the array (argv in this case - the
default) one place to the left. The form for the shift statement is:
shift [variable]
The variable $var[2] becomes $var[1], and $var[3] becomes $var[2], et cetera. The
variable $var[1] (before shifting) is destroyed. An error will result if the variable
given is unset or has a null value.
The foreach construct is quite different from the while construct in that no actual
expression is evaluated. The foreach loop execution is determined instead by the number of
elements in a list. The form of a foreach construct is as follows:
foreach variable (list)
    command_list
end
Each element of the list is removed from the list and placed in variable during execution
of the command list. When there remain no elements in the list, the execution of the
command list ceases. This construct can also be used to manipulate arguments given at the
command line that initiates the script. The following script fragment will, as in the
above example, print out a list of the arguments given:
foreach ARG ($argv)
    echo $ARG
end
This version is shorter and probably a bit clearer than the first, but each user will have his
own preference when it comes to looping constructs for this type of situation. The
foreach list can also contain a list of words such as:
foreach color (red green blue yellow)
    something_interesting
end
This is a situation where the foreach construct is the only way to proceed. Likewise,
there are situations where only while statements will provide the control required, such as
arithmetic expressions resulting from the execution of the command list.
There is really one more loop control mechanism provided by the C shell: the overused
quick fix, the goto statement. Anyone who has programmed in BASIC is well aware of
the goto statement, as well as how quickly it can add bugs to a piece of code. Regardless
of its downside, the goto statement is provided for use in the C shell scripting language.
The goto statement has the form:
goto LABEL
where LABEL is a string that is placed within the script. The label cannot however reside
within a loop or branch construct, and any attempt to do so will result in an error. Almost
any code where a goto is used can be rewritten using the control constructs described
above. All that is required is a little bit of forethought and imagination. With the tools
outlined in this and earlier sections, complex scripts can be constructed, as will be seen
shortly.
Signal Handling
The C shell is not quite as sophisticated at handling signal interruptions during script
execution as the Bourne shell. The C shell can handle signals in a very general way with
the onintr command. By general it is meant that all signals are essentially handled the
same, which is unlike the way that trap works in the Bourne shell. There are three ways
in which to use the onintr command within a script. The first of these is simply
onintr
which will cause the script to terminate on any signal. The second use of onintr
onintr -
works in the opposite manner by ignoring all signals, which might be used to allow a
script to clean up any files written during execution. The third and last use of the onintr
command is
onintr LABEL
which sends the script to LABEL to continue execution of commands after that point.
Again, this could be used to remove any temporary files or write out any last minute
information to a file. The execution of commands after LABEL will continue until either
an exit command or the end of the script, so an exit command should be placed after the
desired commands.
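A sketch of this third form in use, cleaning up a temporary work file (the file name is made up; the sleep stands in for real work):

```csh
#!/bin/csh -f
# Remove the work file whether or not the script is interrupted
onintr CLEANUP
set TMPFILE=/tmp/work.$$
touch $TMPFILE
sleep 1                  # the real work would go here
rm -f $TMPFILE
exit 0
CLEANUP:
echo "Interrupted - removing $TMPFILE"
rm -f $TMPFILE
exit 1
```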
The C shell has another way of handling signal interruptions caused by turning off a
terminal or disconnecting a line to a remote terminal. The nohup command (do not
hangup) causes a command to continue execution after a hangup. Used on a command
line nohup has the following form:
nohup COMMAND
The C shell provides an automatic nohup in the sense that any commands or programs
run in the background using the (&) (see next section) will not be killed by a hangup
signal (hup). In a shell script, nohup will cause the remainder of the script to run to
completion, ignoring the hup signal.
Job Control
In the UNIX world, a job refers to a command or group of commands sent to the shell for
execution. The job exists so long as the shell is either executing it or keeping track of its
status. A job will be in one of two places after being given to the shell: the foreground or the
background. If a command is entered on the command line like the following:
% gcc -g -o myprog myprog.c
the user will be left to watch the compile take place without being able to use the terminal
(of course with xterms this isn't really a problem). This command was executed in the
foreground. For most commands, like cd or ls, this is not a real inconvenience, but many
tasks can take a great deal of time and leave the terminal unusable until they have
completed (or at least terminated). A job executed in the background will not use the
terminal; it will simply execute out of sight. The simplest way to put a job in the
background is to place an ampersand (&) following the command on the command line:
% find / -name june_report.ps -print >& find.log &
Jobs can be placed in the background in a couple of other ways as well. A job can be
placed in the background using the bg command followed by the name of the job. If a job
is started in the usual fashion in the foreground, it can be stopped using the Control Z
(^Z) sequence, and then moved to the background to continue execution using the bg
command. The following sample session fragment demonstrates moving a grep job into
the background in this fashion:
% grep "normb" one_huge_log.file >& grep.info
^Z
Child exited
% jobs
[1] Running xclock -d -update 1
[2] + Child exited grep "normb" one_huge_log.file
% bg %2
%
This example warrants a few explanations. First, jobs is a C shell command that gives a
list of background jobs. The number in brackets is the ID of the job and the plus sign
beside the grep job indicates it is a current job (a minus sign would indicate a previous
job). Second, jobs are referred to by their ID number preceded by a percent sign. The
references to %2 refer to the grep job. The first job [1] is just an xclock that had been
started in the background earlier in the session. It can also be seen that the shell issues a
notice that a child process (of the interactive shell) has been exited or stopped. It is also
important to keep in mind that both the grep and find commands used in the previous
examples send their output to stdout by default and so if the output is not redirected to a
file (or device) it will echo back to the terminal, defeating the purpose of putting the job
in the background. In both examples the stderr has also been redirected. Some situations
may arise when the user will want stderr to remain directed to the screen to monitor
errors that would otherwise go unnoticed until analysis of the log file or other output.
This will especially be the case when compiling large and complicated programs or
packages. To stop a job in the background, the stop command followed by the job ID
number can be used.
It may be the case that the user would like to bring a background job to the foreground.
An example of this is emacs. In emacs, the ^Z character will stop emacs and set it in the
background in its stopped state. To resume work in emacs it must be brought to the
foreground. This is done with the fg command. By itself, fg will bring the last job placed
in the background to the foreground. Usually fg will be given a job ID as an argument.
To kill a background job (i.e. terminate it) the kill command is used, typically with the -9
option, which sends the sure-kill signal (SIGKILL). To refer to the job, the same basic idea
as in the previous examples holds. The following script presents a user-friendly interactive
prompting
environment for killing processes:
#!/bin/csh
#
# BSD Version (SunOS)
# Interactive process killer
#
# Usage: killp [pattern]
unset notify
if ($#argv > 0) then
    set joblist="`ps | grep $argv[1] | grep -v grep`"
    set counter=1
else
    set joblist="`ps`"
    set counter=2
endif
Special Files
An appropriate way to end this chapter is with a discussion of the special files recognized
by the shell (often referred to as dot files because of their leading dots). One of these files,
the .history, was covered in the section on command history. It held the last specified
number of commands executed. The C shell has three more files that it recognizes, the
.login, the .cshrc, and the .logout files, which are all found in the user's home directory
(~/). The .login is read by the shell at the start of a session, or more precisely, at login.
This file should contain commands that are only to be executed at login, perhaps running
a welcome message or the date. Any environment variables, such as the path, should be
set here as well. An exception to the rule of the .login file only executing at login can
occur in X windows when using xterm. If an xterm is started with the -ls flag, the
resulting terminal window will be a login shell. This seems a harmless enough situation;
there is no real problem with redefining variables with the same value, or even with
displaying a welcome message in each terminal window (if one is displayed in the .login
file). But some subtle problems can arise. One problem that comes to mind is the
following. On most systems, the path will be set to some basic value (such as
/bin:/usr/bin:/usr/local/bin, etc.) so that if there is not a proper path set by the user, basic
commands like vi will still work. Most users define their path in their .login script as:
set path = ( $path /usr/public/bin ~/bin . )
Now clearly what will happen if the .login script executes on each invocation of a shell is
that the path will continue to grow. This is not a pretty thing to see when echoing the
$path variable, but even worse, it can begin to affect the performance of the shell.
The .cshrc file is executed each time a new shell is invoked and thus any variables
wanted for each shell should be defined here. Even though a global definition can be set
in the .login file, the definition can be changed at any time and it is always wise to define
important variables, like noclobber, in the .cshrc file. It is possible to define the path
variable in the .cshrc file, but any additions that have been appended to the path in any
other shell will then be overridden within the current shell. This does assure that the path
is safe from being unset, but setting it in the .login is usually preferred. It is up to each
user, how and where the path will be set. The .cshrc is an ideal place to define aliases.
The .logout file is a file not used by many. Most tasks are handled nicely by the shell
during logout and thus it is not necessary to handle them in the .logout script. There are a
couple of good applications for the .logout file, however. The first is for handling a safe
deletion system. A user can make a directory called ~/tmp.dump and then define a
command called del with the following script:
foreach file ( $argv )
    mv $file ~/tmp.dump
    shift
end
Instead of the files being deleted, they are stored in the tmp.dump directory, and
undeleting is as simple as copying them later from this directory. Since files will
accumulate and start to dig into the user's disk quota, they should be removed (officially
deleted) at some point. The .logout script can be the ideal method of handling this task.
The directory can either be cleaned completely at each logout, or a more sophisticated
script can be written to discard files after they have been in the directory for a certain
period of time. There is no reason that external scripts cannot be run from the .logout (or
any dot file for that matter) using the source command. Another reason to use the .logout file
comes from personal experience. There exists a program called plan, which is a date
planner with built-in alarms. In order to keep track of scheduled alarms, a daemon
process is initiated when X windows is started. When X is exited and the shell is exited,
this special process does not die. Fortunately, the package comes with a special program
to kill the process, which, when added to the .logout, produces the desired result of
killing the daemon at logout.