
THE UNIX SHELL GUIDE

Norman J. Buchanan
Douglas M. Gingrich
The UNIX Shell Guide
by Norman J. Buchanan & Douglas M. Gingrich
Copyright ©1996 University of Alberta, Department of Physics. All rights reserved.
UNIX is a registered trademark of Novell.
Many of the designations used by manufacturers and sellers to distinguish their products
are claimed as trademarks. Where those designations appear in this book, we are aware of
a trademark claim.
While every precaution has been taken in the preparation of this book, the authors assume
no responsibility for errors or omissions, or for damages resulting from the use of the
information contained herein.

THE UNIX SHELL GUIDE

Preface
What is a Shell?
UNIX Commands versus Built-in Shell Commands
Interactive and Sub Shells
Command Line Parsing
Redirection
The Bourne Shell
   A Bit of History
   Getting to Know the Bourne Shell
   Multiple Commands Per Line
   Redirection and Pipelines
   Filename Substitution
   Special Characters
   Variables
      Variable Substitutions
      Parameters
      Environmental Variables
      Special Variables
   Bourne Shell Programming
      Testing and Branching
      String Comparisons
      Testing Files
      Loop Control
         Exiting Loops Early
      Input and Output (I/O)
      Advanced Programming Topics
         Functions
         Trapping Errors
      Dot Files
The C Shell
   A Bit of History
   Getting Started with the C Shell
   Command Execution
   Redirection and Pipelines
   Filename Substitution
   Filename Completion
   Special Characters
   Command History
   Aliases
   Variables
      Array Variables
      Global or Environment Variables
      Parameters
      Special Read Only Variables
      Variable Modifiers
   C Shell Programming
      Testing and Branching
      Signal Handling
   Job Control
   Special Files
   Introduction to UNIX Security

Preface
This book is intended to assist UNIX users in understanding and dealing with five of the
most popular UNIX shells: the Bourne shell (sh); the C shell (csh); the Korn shell (ksh);
the TC shell (tcsh); and the Z shell (zsh). The idea came mainly out of frustration in
trying to get understandable information on shell usage. Much has been written on UNIX
as a whole but rarely is more than a chapter on Bourne shell or C shell included, and
these usually prove to be little more than a loose translation of the online manual (man)
pages. Shells are certainly sophisticated and powerful enough to warrant a detailed
treatment on their own. The man pages, as any UNIX user is well aware, are fine for
listing commands and associated flags but that is where the usefulness ends. Examples
are seldom found and some of the features are covered in one or two lines of text, leading
the user to overlook their importance. The man pages are, after all, meant to act as a
reference more than an actual instruction manual and should be treated as such.
Hopefully this book will achieve two goals. First, to give a clear and concise look at each
of the five shells. Each shell will be examined from the point of view of interactive work
features, and then from a shell programming point of view. Examples will be used as
much as possible to lend illustration to the concepts presented in the text, especially in the
area of shell programming. Hopefully, the range of examples will be such that there will
be something that everyone can use. New UNIX users will then be able to pick up on
some of the powerful features of the particular shell(s) of interest and develop a feel for
shell programming, which is probably the most flexible and exciting feature of UNIX
shells in general. The second goal of the book is to allow users the opportunity to
examine the features of the various shells covered and determine which shell(s) might be
right for them. Each shell will be contrasted with the others to demonstrate the strengths
and weaknesses compared with the rest. This we hope will make this book truly unique.
As with most other things in life, choice can make things more difficult than expected.
While it is nice to have choice, making a choice can require a level of research that most
people just do not have the time to invest. Sometimes the number of choices is not even
known, further complicating the decision process - how can you make a choice if you do
not know what the choices are? In this book enough information will be presented so the
users can make educated decisions without spending the time trying to gather the
information on their own, or having to make sense of it afterwards.
Users new to UNIX will be able to become acquainted with the shell they are provided
with, learning the details of shell usage until they are able to decide if a different shell
may better suit their needs. Even users who have spent many years mastering a particular
shell may find that some of the more recent shells provide powerful features that they can
tailor to their specific needs. If any of the shells mentioned in this book are not available
on your machine, most system administrators can be persuaded to load them on to your
machine for use. To further assist in making the various shells available, FTP sites where
they can be found are included.
This book will be of great interest to users of personal computers who are considering
moving or have just moved to the new UNIX-compatible operating system for the
personal computer - LINUX. LINUX is distributed freely in electronic form and for low
cost from many vendors. The standard software distribution includes many of the shells
described in this book.
Norman J. Buchanan
Douglas M. Gingrich

What is a Shell?
What is a shell? A shell is a command interpreter. While this is certainly true, it likely
doesn't enlighten the reader any further. A shell is an entity that takes input from the user
and deals with the computer rather than having the user deal directly with the computer. If
the user had to deal directly with the computer he would not get much done as the
computer only understands strings of 1's and 0's. While this is a bit of a misrepresentation
of what the shell actually does (the idea of an operating system is neglected) it provides a
rough idea that should cause the reader to be grateful that there is such a thing as a shell.
A good way to view a shell is as follows. When a person drives a car, that person doesn't
have to actually adjust every detail that goes along with making the engine run, or the
electronic system controlling all of the engine timing and so on. All the user (or driver in
this example) needs to know is that D means drive and that pressing the accelerator pedal
will make the car go faster or slower. The dashboard would also be considered part of
the shell since pertinent information relating to the user's involvement in operating the car
is displayed there. In fact any part of the car that the user has control of during operation
of the car would be considered part of the shell. The idea of what a shell is should now be
coming clear. It is a program that allows the user to use the computer without having to
deal directly with it. It is in a sense a protective shell that prevents the user and computer
from coming into contact with one another.

• Basic UNIX primer


o The Man Pages
• UNIX Commands versus Built-in Shell Commands
• Interactive and Sub Shells
• Command Line Parsing
• Redirection

UNIX Commands versus Built-in Shell Commands
The UNIX operating system comes with many commands that the user can use to interact
with the computer. UNIX commands are simply programs (usually written in the C
programming language) that are executed when called for. The usual place for the storage
of these commands is the /usr/bin directory. The commands that are available on a
particular machine will vary. There is a set number of standard commands that come with
a UNIX system, but there is no limit to the commands that may be available. An example
of this is the more command. Typing more followed by a filename (preferably a text file)
will cause the contents of the file to be presented on the default output device a page at a
time. Pressing the space-bar will display the next page, typing the enter-key will display the
next line, and pressing the q key will exit the program. This is how the man pages are
displayed. Many users felt that this was too inflexible and along came another command
found on many UNIX systems called less. The less command is essentially the same as
the more command with the exception that it allows the up and down arrow keys to be
used to scroll around a document. Many of the UNIX commands may be well known to
the user while others may not. Some examples are ls, cp, grep, find and chmod, to name
just a select few. It is important to realize that while these commands might vary in
syntax and usage somewhat from one platform of UNIX to another, they are shell
independent. Whether in the C shell or the Z shell, the grep command behaves the same.
Remember, to see how a particular command is used on a particular platform, the user can
use the man command. Each shell also comes with its own set of built-in commands.
These are commands that are local to the particular shell. Some examples of these are the
history command in the C shell, and the export command in the Bourne shell. Built-in
commands can be taken as platform independent. Having said this, it is important to keep
in mind that there is a possibility of slight variation between different versions of the shells
themselves. This is inevitable, but for the purpose of this book a standardized shell is again
assumed. If there is any discrepancy between this book and any particular shell, the man
pages can again be referenced.
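A quick way to tell the two kinds of commands apart is the type command, a built-in provided by most Bourne-compatible shells (the exact wording of its output varies between shells and systems):
$ type cd
cd is a shell builtin
$ type grep
grep is /usr/bin/grep
Here cd is handled by the shell itself, while grep is an ordinary program found on disk.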

Interactive and Sub Shells


When a user logs onto a UNIX account, a shell is immediately set running. The shell that
is started depends on some predetermined settings in special files that are executed during
the logon process. Ultimately then, the shell started during logon is determined by the
system administrator. As will be seen in later chapters, a different shell can be set to start
at logon using the chsh command. This shell is called a login (or root) shell. This is a
shell that runs at all times during a user's session. When the user exits the session the shell
is terminated by the operating system. The login shell is a special kind of interactive
shell. The purpose of an interactive shell is to let the user interact with the computer
during a session. The interactive shell (as discussed briefly in the beginning of this
chapter) examines each command line entered by the user and then executes the
command if it is syntactically correct (correct in usage). One must be careful when using the
term correct however, as by correct it is meant that the proper usage of the command
(outlined in the man page for that command) is followed. The user could still enter a
command that makes absolutely no sense in terms of what it accomplishes, yet is
syntactically correct. The shell has no way of checking for this. A new interactive shell
can be started within a login shell at any time.
A subshell is a shell that runs underneath an interactive shell. The user cannot interact
directly with a subshell as subshells only take input from commands or another shell. A
subshell is really just where a command is executed. When a command line is entered in
an interactive shell, the command(s) may be passed to a subshell to execute. This would
occur, for example, if a group of commands is enclosed in parentheses. Programs written in
the shells will be automatically executed in subshells. It is important to understand that
variables declared in an interactive shell are not automatically passed to a subshell. This
will be covered in more detail in the chapters pertaining to the shells themselves.
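A minimal sketch of this last point, using the fact that sh -c starts a new shell to run a single command (the variable name GREETING is invented for the example):
$ GREETING=hello
$ sh -c 'echo $GREETING'

$ export GREETING
$ sh -c 'echo $GREETING'
hello
The first subshell prints only a blank line because GREETING was never passed to it; the export command, covered in the section on environmental variables, is what makes a variable available to subshells.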

Command Line Parsing


To parse a command line means to look at each part of the command line and convert it
into something that the computer can execute. Since there are variations in how different
shells parse a command line, it can be assumed for the rest of this section that the shell in
question is generic. For the purpose of this book the small differences between shells as
well as the precise details of parsing can be ignored. When a user enters a
command line at the prompt, the shell begins by analyzing the command line. The shell
will break the command line down into small indivisible pieces called tokens (sometimes
they are referred to as atomic). Each token is then analyzed in terms of its relationship
with the other tokens. This is similar to the human examination of an English sentence. If
a noun is present, but no verb, the sentence is deemed incomplete. The shell behaves in
much the same manner. It doesn't only check for missing bits, it also makes sure that what
is there is in the correct order. The shell may have to examine a command line more than
once to collect all of the tokens. Each examination is called a pass. The reason for
multiple passes is that command lines can be quite complicated; there can be all kinds of
substitutions that need to be made. On each pass the shell will make a required
substitution and then collect the available tokens. Since the substitutions can be nested
(substitutions containing substitutions), the shell may require several passes to collect all
of the tokens. As stated above, if at this point in the process the shell determines that the
grammar of the command line is incorrect, an error is displayed to the user; otherwise the
command is executed. While the actual order in which the tokens are gathered is
interesting, it is beyond the scope of this book. Where required (such as with aliases) the
order of some of the parsing procedure will be presented.
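As an illustration of a nested substitution, consider the following command lines, which use command substitution (covered in the Bourne shell chapter); the date and directory name shown are invented for the example. In the Bourne shell, the inner backquotes must be escaped:
$ echo Today is `date`
Today is Tue Mar 12 10:15:02 MST 1996
$ echo This directory is called `basename \`pwd\``
This directory is called docs
In the second command the shell needs one pass to substitute the output of pwd and another for basename before the tokens of the final echo command can be collected.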

Redirection
Throughout this book the topic of redirection will be visited many times. Redirection
determines where data is sent to, or taken from, when interacting with a command. For
instance, when a user logs onto a terminal on the local network, a group of messages may
be displayed, or perhaps just a prompt. The messages or prompt have been sent to the
terminal window for the user to read, in the case of the messages, or interact with, as
would be the case of the prompt. This output stream, as it is called, is sent to what is
called the standard output (STDOUT). This is usually automatically set to be the screen
of the workstation or terminal, although this hasn't always been the case. In the very early
days of UNIX, STDOUT may well have been a teletype machine, long since extinct.
When a program (or, equivalently, a command) is executed, the output can be redirected
to a file by using the > operator. The general syntax for all of the shells covered in this
book (and any other shells for that matter) is the following:
PROMPT> command_name >filename
It would not be an error to have a space between the > and the filename; it is just a matter
of taste. When a program requires input, like the program which runs the login procedure,
it accepts data via the input stream called standard input (STDIN). This is almost always
set to the keyboard, for obvious reasons. A command will often take data from either a
file or STDIN, as is the case with the cat (concatenate) command:
PROMPT> cat filename
which would send the contents of the file filename to the screen. The following would
echo whatever the user had entered back to the screen, after the <ENTER> key is pressed,
until the end of file (EOF) character (usually Control-D) has been entered:
PROMPT> cat
Hello There <ENTER>
Hello There
How are you? <ENTER>
How are you?
<CTRL D>
PROMPT>
The last important data stream is the standard error (STDERR). When a program (or
command) is executed, it might encounter problems completing its task for whatever
reason. When this happens, the shell will, in most cases, echo an error message to
STDERR. By default, STDERR is directed to the screen along with STDOUT, but it
doesn't have to be. Situations may very well arise where the error messages might be sent
to a file, or somewhere else, while the standard output would be sent to the screen. The
shells handle this situation in different ways so examples demonstrating this procedure
will be left until the individual chapters.
In UNIX these data streams can also be referred to by numbers called file descriptors.
This provides another way to represent redirections but is again shell dependent and will
thus be left to the appropriate sections for examples. The following table associates each
file descriptor with the corresponding data stream:

Table 1.1: File descriptors.

    File Descriptor    Data Stream
    0                  standard input (STDIN)
    1                  standard output (STDOUT)
    2                  standard error (STDERR)

The Bourne Shell

• A Bit of History
• Getting to Know the Bourne Shell
• Multiple Commands Per Line
• Redirection and Pipelines
• Filename Substitution
• Special Characters
• Variables
o Variable Substitutions
o Parameters
o Environmental Variables
o Special Variables
• Bourne Shell Programming
o Testing and Branching
o String Comparisons
o Testing Files
o Loop Control
 Exiting Loops Early
o Input and Output (I/O)
o Advanced Programming Topics
 Functions
 Trapping Errors
o Dot Files
A Bit of History
The Bourne shell is the obvious shell to examine first for two reasons. First and foremost,
it was the first shell written. It was written at Bell Laboratories by Stephen Bourne.
This makes the Bourne shell the foundation on which all other shells are (at least in part)
built. In fact it will be seen that the shells covered in this book fall basically into two
categories, or families: the Bourne family and the C shell family. The second reason to
start with the Bourne shell is that it comes with all UNIX systems, making it a good
introductory shell purely on the basis of availability.

Getting to Know the Bourne Shell


If the user sits down and starts typing a few UNIX commands he will be utilizing the
shell that happens to be installed. A good place to start is to confirm that the resident shell
is, in this case, the Bourne shell. To check if this is true, type the following command:
echo $SHELL
and the output should be
/bin/sh
indicating that the shell being used is indeed the Bourne shell.
If this is not the case, and another shell is installed, the chsh command can be used to
change the shell, or for the purpose of learning the shell, an interactive shell can be
invoked using the sh command. While the default prompt for a non-root user in the
Bourne shell is a $, this is not really a good way to determine the current active shell
since in any shell the prompt can be changed. For the rest of the examples in this section
a dollar sign will lead the text to emulate the prompt.
Now that the current working shell is a Bourne shell, it can be examined in some detail. It
is instructive to look at what exactly happens when a command is entered. The user
types in any command on a line followed by the enter-key or return-key, for example,
$ cp *.txt text_files <enter>
This is a common enough command; it copies all of the files with a .txt extension from
the current working directory to a subdirectory called text_files. The shell is set into
action by the enter-key. It begins immediately by searching the command line for
anything foreign to a simple UNIX command. For example it looks for variables, pipelines,
other redirection characters and command separators - all of which will be covered in due
time. The above example contains nothing but a simple UNIX command so the shell passes
the command to the operating system which then processes it. If any of the special characters
mentioned above had been contained in the command, the shell would have handled them
before passing the command off to the operating system. What if the command was too
long for the terminal window? No problem. The input may look different to different
users since some terminals will wrap longer commands onto the next line, and others will
adjust in such a way that the command will keep going to the right past the terminal
boundary. In either case, the shell waits until the enter-key has been pressed before taking
any action. There is a way however to enter a long command such that it will be broken
at the end of the top line and continued on the next. This can be accomplished by typing a
backslash (\) character before pressing return at the breakpoint, as follows:
$ echo This is a long command so why not break it here \
> and start on the next line. <enter>
which gives as output:
This is a long command so why not break it here and start on the next line.
The > is the shell's way of letting the user know that the current line is a continuation of
the previous line. Note however that the output is printed as though it was never broken.

Multiple Commands Per Line


One of the things that the Bourne shell looks for when a command line is entered is
whether there is more than one command on the line. The most common way to enter
multiple commands on one line is to use the semicolon separator. This separator tells the
shell to send each command to be processed in the order in which they appear, like the following:
$ cd docs; mkdir old_docs; mv *.* old_docs <enter>
which is the same as
$ cd docs <enter>
$ mkdir old_docs <enter>
$ mv *.* old_docs <enter>
Now suppose there is a situation where a user wants to have some commands carried out
at the same time, without having to wait for each to finish before starting the next. A
particular situation that comes to mind is copying a several megabyte file (which can take
minutes) while trying to do anything else. The following could be entered:
$ cp big_file new_dir& rm *.o& who <enter>
which is equivalent to
$ cp big_file new_dir& <enter>
$ rm *.o& <enter>
$ who <enter>
where the shell puts the command (plus arguments) before an ampersand into the
background. In the above case it copies big_file (in the background), it deletes all of
the object files (in the background) and finally it runs the who command (in the
foreground).
The Bourne shell also allows command grouping. Command grouping treats a group of
commands as a unit. To group commands, enclose them in round parentheses () and
separate the commands by semicolons; the grouped commands are then executed together
in a subshell. For example,
$ MY_NAME='Norm Buchanan'
$ (MY_NAME='George Bush'; echo $MY_NAME)
George Bush
$ echo $MY_NAME
Norm Buchanan
This example was chosen to demonstrate how command grouping works, but also to give
a glimpse of variable scope or visibility (which will be covered in greater detail in
section 2.7, variables). User defined variables are local to the current shell, which means
that they cannot be accessed or altered in a subshell. This is why the variable MY_NAME
still holds its original value after the commands enclosed by the parentheses have been
executed: the assignment inside the parentheses took place in the subshell and never
affected the interactive shell's copy.
These features lead quite naturally into more powerful features of the Bourne shell, or
shells in general.

Redirection and Pipelines


The situation often arises where the user desires that the output of a command go to a file
rather than to the screen, or that the output of a command go directly to another
command. The Bourne shell provides a group of features to handle both situations. The
shell recognizes the following redirection characters: the output (>), the input (<), the
output append (>>), and the input from standard in (<<) (stdin is almost always the
keyboard). The output (>) symbol sends the output of a command to a file as follows:
$ ls -l >filename
which will send the long listing of the current directory to the file called filename. The
difference between the (>) and the (>>) is that the append (>>) symbol causes the
output of the command to be directed to the end of the file if it already exists or else it
creates the file. The standard output redirection operator (>) will create the file in order
to write the output and if this file exists it will overwrite it. The input operator (<) causes
the filename to the right of it to be used as input to the command. These can be mixed
and used together, such as in the following example which counts the number of lines in a
file called text_file and puts the result in a file called line_count:
$ wc -l <text_file >line_count
Another way to redirect output is with the use of file descriptors. File descriptors give
numeric values to the three types of input and output (I/O): standard input (stdin), the
keyboard by default; standard output (stdout), the screen by default; and standard error
(stderr), whose messages are also sent to the screen by default. Table 2.1 lists the file
descriptor for each case:

Table 2.1: File descriptors.

    File Descriptor    Data Stream
    0                  standard input (stdin)
    1                  standard output (stdout)
    2                  standard error (stderr)


To use these file descriptors to redirect output any of the following forms could be used:
$ command >&D redirects output to D,
$ command <&D get input for command from D,
$ command >&- closes standard output, and
$ command <&- closes standard input,
where D is one of the three file descriptors. Slightly more complicated I/O redirections
can be constructed using file descriptors such as
$ command 2>err.file
which sends the error messages from the command to a file called err.file. Probably the
most common use of file descriptors comes when the user desires to send both stdout and
stderr to a file. This would be accomplished by the following:
$ command >out.file 2>&1
This will send both the output and error messages of the command to a file called out.file.
While this is a slightly complicated command to remember, it is well worth the effort.
Pipelines are used to pass the output from a command and feed it into another command
to be processed. The symbol for a pipeline is |. A good illustration of using a pipeline
might be to count the number of users logged onto the user's system which could be
accomplished by
$ who | wc -l
Some command pipelines can be accomplished in more than one way. For example,
$ cat text_file | wc -l
gives the same result as
$ wc -l <text_file
Users can determine which method they prefer, but keystroke-conservation is usually the
motivating factor. For either pipes or redirection, as many items as desired can be linked
together, and pipes and redirection can be mixed on a command-line.
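As a final illustration that the two can be mixed (the filenames here are invented for the example), the following counts the users logged on, saves the count in a file, and discards any error messages from the count:
$ who | wc -l >user_count 2>/dev/null
The output of who is piped into wc -l, the resulting count is redirected into the file user_count, and file descriptor 2 (stderr) of wc is redirected to the null device, which is described in the section on Bourne shell programming.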

Special Characters
The previous section should clearly demonstrate that there are a number of characters that
are special to the shell, and hence some extra measures must be taken when attempting to
use them in a casual manner. For illustration of this point, imagine a directory containing
the following files:
Mail/ News/ a.out* prog.c utils/
and the user types, while in this directory,
$ echo * Hello *
expecting the following output:
* Hello *
The user is shocked to see the actual output:
Mail News a.out prog.c utils Hello Mail News a.out prog.c utils
The shell has simply used its definition of the * symbol, which differs greatly from the
user's definition of the same character. This is a common example of how care has to be
taken when using shell special characters in strings. There are a few ways to handle this
situation. The first solution, which is probably the easiest for this particular example, is to
escape the *. This means that a backslash should be placed directly in front of the *.
Escaping a character causes it to be treated by the shell as though it were simply a text
character. Therefore the above example could be quickly corrected by entering the
following:
$ echo \* Hello \*
which would give the desired result. The result could also be achieved using single
or double quotation marks. When single quotation marks enclose a string, all of the
special characters are taken to be string characters; when double quotation marks are
used, the same is true with the exception of the dollar sign ($) and the backquote (`).
An advantage of quotation marks over escape sequences is that if there are many special
characters in a string, quotes will save keystrokes and thus time. Another advantage is
that they preserve white space. An echo statement without quotation marks collapses
runs of spaces or tabs, causing the following (note the multiple spaces in the input):
$ echo \*\*Warning\*\*     Disk Space is Low
to give as output:
**Warning** Disk Space is Low
A quick and easy fix would be
$ echo '**Warning**     Disk Space is Low'
and this would give the appropriate output, spacing included. While both single and double
quotes solve the above problems, they are different. The difference comes in handling
variables, which will be covered shortly. The single quote will not substitute a variable
into an expression whereas the double quote will. When deciding which of the methods to
use, the complexity of the expression (i.e. how many special characters are used) should be
looked at, as well as variable usage.
Command substitution is a topic that ties up these last few sections. Command
substitution allows the output of a command to be substituted into another command. For
example,
$ echo "There are `who | wc -l` users on the system"
may give as output
There are 13 users on the system
The user now has the tools and flexibility to handle even the most specific or complicated
command in the Bourne shell.

Variables
This is where the shell gets interesting. Variables add a level of generality to the
environment. A variable is simply a name that acts as a placeholder for a value or set of
values (an array which will be covered in later sections). In the Bourne shell there are
four different types of variables:

• User Defined - local (only accessible by the current shell),


• Parameters - command-line arguments,
• Environmental - special variables for the shell environment and
• Special - defined by the shell.

User defined variables are fairly straightforward. In the Bourne shell they take the
following form:
$ SIZE=1024
$ MY_ADDRESS=buchanan@phys.ualberta.ca
$ greeting='Welcome to the Bourne shell'
To access a variable, a $ must be placed in front of the variable name or else the shell
will not realize that what follows is not a command. The shell will then attempt to
execute the command and return an error message. An example of accessing a user
defined variable is then
$ echo You can e-mail me at $MY_ADDRESS
You can e-mail me at buchanan@phys.ualberta.ca
The above can be used to demonstrate the difference between single- and double-quote
handling of strings. If single quotes are used to enclose the string, the shell will not make
the variable substitution as it treats the $ as a text character rather than a signal that a
variable is coming:
$ echo 'You can e-mail me at $MY_ADDRESS'
You can e-mail me at $MY_ADDRESS
Double quotes however, will allow the shell to recognize the variable and pass its value
to the echo command:
$ echo "You can e-mail me at $MY_ADDRESS"
You can e-mail me at buchanan@phys.ualberta.ca
Variable names can be nested as well. This is another way to say that one variable can be
set equal to another variable which contains another, etcetera.
$ today=Tuesday
$ day=$today
$ echo The day of the week is $day
The day of the week is Tuesday
When assigning a value to a variable, it is important to leave no white space. This is
because the assignment is terminated by white space, unless the appropriate quotation
characters enclose the string value. This allows more than one variable assignment to be
made on a single line:
$ a=cat b=dog c=elephant
It is important however to realize that the assignments are processed from right to left,
and therefore if the following assignments were made:
$ VAR1=$VAR2 VAR2=hello
the value of VAR1 would be hello, whereas if the order was reversed:
$ VAR1=hello VAR2=$VAR1
the value of VAR2 would be undefined. This is because VAR2 is assigned the value of
VAR1 before VAR1 has been given a value.
Variable assignments can also be removed using the unset command. For example,
$ VAR="Hello"
$ echo $VAR
Hello
$ unset VAR
$ echo $VAR

• Variable Substitutions
• Parameters
• Environmental Variables
• Special Variables

Variable Substitutions
Once a variable has been assigned a value, it can be easily referenced by the use of a
dollar sign in front of the variable name as shown above. A user may wish to append a
string directly to a string variable such as
$ VAR=Tues
$ echo $VARday
which will return no value since no value has been assigned to VARday. The shell cannot
determine that the user meant to treat $VAR as a separate entity and hence it treats the
string as one complete variable name. This is because the shell uses white space when
interpreting variable substitutions. However, if curly braces are placed around the variable
name within the string, the shell will handle the entire string as desired:
$ VAR=Tues
$ echo ${VAR}day
Tuesday
The Bourne shell also provides variations on the above variable substitution that allow
alternative substitutions under certain conditions. There are four such conditional
substitutions in the Bourne shell which are listed in table 2.2:

Table 2.2: Conditional variable substitutions.

    Form                 Meaning
    ${var:-string}       Use string if var is unset or null; var itself is unchanged.
    ${var:=string}       Use string if var is unset or null, and also assign string to var.
    ${var:?string}       If var is unset or null, print string as an error message and exit.
    ${var:+string}       Use string if var is set and non-null; otherwise use nothing.


For example, if the user wants to have an error message printed when a variable gets
assigned a value, he could do the following:
$ ERROR=TRUE
$ echo ${ERROR:+"An error has been detected"}
An error has been detected
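For comparison, here is a minimal sketch of the most common form, which supplies a default value without making an assignment (the use of the EDITOR variable is just a convenient illustration):
$ unset EDITOR
$ echo ${EDITOR:-vi}
vi
$ echo $EDITOR

$
The default vi is used in the echo statement, but EDITOR itself remains unset; the := form would also have made the assignment.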

Parameters
Parameters are another type of shell variable that carry information from the command-
line. Any command can be broken up into parts. The first element of the command-line is
always a program (either an executable or a shell script, which will be covered shortly).
The following elements can be looked at as special values to be passed to the program.
These values are called positional parameters and are stored in the variables $1 through
$9. The parameter $0 is a special variable that always holds the name of the program.
Positional parameters will be covered in more detail in the section on shell programming.
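As a small illustration (the script name show_args is invented for this example), consider a script containing:
#! /bin/sh
# show_args - display the program name and first two parameters
echo "Program: $0"
echo "First: $1 Second: $2"
Invoked as show_args apple orange, it would print the program name followed by apple and orange.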

Environmental Variables
Environmental variables are variables that have special meaning to the shell. These are to
be used to customize the environment for a particular user. These variables should be
defined in the special program that is executed during login called the .profile (dot
profile) file. Any defined variables would then be set until the end of the session unless
explicitly unset or redefined. An example of a .profile file is included in the section on
programming. Table 2.3 contains an alphabetical list of the Bourne shell environmental
variables, a brief description of what each is used for, and the default setting.

Table 2.3: Bourne shell environmental variables.

    Variable     Purpose                                          Default
    CDPATH       Search path used by the cd command               (none)
    HOME         The user's home directory                        set at login
    IFS          Internal field separators for word splitting     space, tab, newline
    MAIL         File checked for the arrival of new mail         system dependent
    MAILCHECK    Interval (seconds) between mail checks           600
    PATH         Directories searched for commands                :/bin:/usr/bin
    PS1          Primary prompt string                            $
    PS2          Secondary (continuation) prompt string           >
    TERM         The terminal type                                system dependent


To make these variables global (so that they maintain their value in any subshell) the
export command must be used. For example if you use ---> for a prompt in the current
shell and then start up a subshell, the prompt will be $ again. To remedy this situation use
the export command:
$ PS1="--->"
$ export PS1
To see a list of environmental variables currently set and their settings, one can use the
env command. A typical listing might be as follows:
$ env
COLUMNS=80
HOME=/users/smith
LINES=24
LOGNAME=smith
MAIL=/usr/mail/smith
PATH=/bin:/usr/bin/:/users/smith/bin
SHELL=/bin/sh
TERM=vt100

Special Variables
Special or builtin variables are defined by the shell for use at any time. They are mostly
used for shell programming and are covered in more detail in the next section. Table 2.4
contains descriptions of the Bourne shell special variables:

Table 2.4: Bourne shell special variables.

    Variable    Description
    $#          The number of positional parameters
    $?          The exit status of the last command executed
    $$          The process number (PID) of the current shell
    $!          The process number of the last background command
    $-          The flags (options) currently set in the shell
    $*          All of the positional parameters as a single string
    $@          All of the positional parameters, individually quoted

Bourne Shell Programming


Shell programming gives the shell a tremendous amount of power. Not only can groups
of commands be used together in a shell program (or script), but the Bourne shell comes
with a full programming language that allows such things as arithmetic calculations on
variables, decision making or branching, and controlled looping. Unlike languages such
as FORTRAN, PASCAL and C, shell scripts do not require compiling or linking. This
means that there is no conversion of the source code down into machine language before
execution. A script is really nothing more than a sequential list of instructions (some of
which are UNIX commands) that the shell interprets and executes. Each shell script is
written using the particular syntax, or language, given in the following sections. After the
script has been written to the user's satisfaction, it is made into an executable file using
the chmod command. If, for example, a user had a file called my_script that contained a
shell program he would do the following:
chmod u+x my_script
Notice that there is no special extension or any other feature to distinguish the script as a
special program. A shell script can be given any name with any extension, and therefore
it is convenient to give the scripts meaningful names to help the user remember their
purpose. A script can also be run in a subshell by using the command sh script_name.
The simplest shell script is one which contains a group of UNIX commands to carry
out a particular task. The following simple example is a script to back up all of the files in
and below a directory named data. It first archives all of the subdirectories
and files using the tar utility, it then compresses the archive file to conserve space, and
finally copies the compressed archive to a remote disk for storage:
#! /bin/sh
# Backup is a sh script to back up all of my data
tar cfv mydata.tar data/
compress mydata.tar
cp mydata.tar.Z /dsk2/storage/
All but the top two lines of the shell script are ordinary UNIX utilities. The second line
does nothing, it is just a comment statement. In UNIX shell scripts, the hash character
signifies a comment statement, allowing text documentation to be left for future
reference. This particular script is short and straightforward enough that documentation
may not be required, but in larger scripts it can be essential in order to understand what
the program does. Very few people can remember what exactly a program does more
than a week after writing it. A well documented script is much nicer to debug, or modify,
at a later date than a seemingly endless string of commands and other instructions. The
first line is a special line that should be present at the beginning of any shell script written
in any shell scripting language. The first two characters tell the shell that what follows is
the path of the shell that the script was written under. The current shell will then start a
subshell of the type contained in the first line (a Bourne shell in this example). After the
script has been executed, control is returned to the original shell. After a glance over the
above script, the reader may wonder what the difference is between a shell script and a
group of commands, say a pipeline. Well, in this simple example there is not much
difference, except for the fact that the same group of commands can be carried out by
typing the name of the script rather than re-entering them each time. This is the main
reason for putting a small group of commands into a script.

• Testing and Branching


• String Comparisons
• Testing Files
• Loop Control
o Exiting Loops Early
• Input and Output (I/O)
• Advanced Programming Topics
o Functions
o Trapping Errors
• Dot Files

Testing and Branching


The power of programming lies in the ability to have the computer make a decision based
on one or more choices (either embedded right in the program or provided by the user
when the program is executed). This decision-making process is called branching,
because of the obvious analog of a tree-branch system. The tool provided by the Bourne
shell to build these branching systems is the if:then:else construct, which has the
following form:
if statement_A
then
statement_B
statement_C
else
statement_D
fi
statement_E
What happens here is that statement_A (or more correctly condition_A) is attempted. If a
true value is returned, statement_B and then statement_C are carried out, followed by
statement_E. If a false value is returned from statement_A, statement_D is executed
followed by statement_E. The fi statement signals the end of the construct and allows
sequential execution of the following commands to resume.
The interesting part of the branching process is how the decision is made. The
programmer sets up a test (statement_A above). Statement_A can be a specific condition
(is A bigger than B) or it can be an expression. When the shell executes a command or an
expression it returns a value (called the exit status); if the command executes without
problems, a true value is returned, otherwise a false value is returned. It may seem
strange to have a command act as a test condition, but this is actually a very convenient
way to handle errors and other conditions. For example a user may want to automatically
print a file, and have an error returned if the file does not exist:
if ls plot.ps
then
lpr plot.ps
else
echo ":::: ERROR FILE DOESN'T EXIST ::::"
fi
If the file does not exist the shell will produce an error message, separate from the
message in the echo statement. This error message is what the shell returns automatically
if the argument (filename) for the ls command does not exist. The stderr can be
redirected from the screen to a file, which would keep the error messages from coming up
on the screen, but it would also leave a file containing the message in the user's account.
UNIX provides a tool called the null device to handle situations like this. The null device
acts like a file that is emptied as soon as anything is placed in it. Any unwanted output
can be redirected to the null device and it will then be immediately disposed of. Care
must be taken when sending anything to the null device as nothing can be recovered once
it has been sent there. The following modification to the line containing the ls command
will send stderr to the null device:
if ls plot.ps 2>/dev/null
The above demonstrates how a return value, or exit status, can be used in a test, but what
about other types of tests? Suppose two strings are to be compared, and the result of the
comparison determines the next step, or if the status of a file determines the branch taken;
what then? The Bourne shell comes with two ways to test conditions, the explicit method:
the test statement; and the shorthand method: square brackets [] . The full general test
expression works as follows:
$ string1="APPLE"
$ string2="ORANGE"
$ test "$string1" = "$string2"
$ echo $?
1
The 1 that is returned is a false value. In the Bourne shell an exit value of zero is
translated as a true value whereas a non-zero value is translated as a false value. It is
also important to note that the white space between the = sign and the values on either
side is required. In assignment expressions there would be no white space around the
equals sign. When used in a decision branch the above example is a lengthy way to get to
the result of the test. A more direct test would take the following form:
if [ "APPLE" = "ORANGE" ]
then
(A)
else
(B)
fi
where statement(s) B will be executed, as the strings are clearly not equivalent. Again, the
white space around the brackets is necessary. This is a much more elegant method of
testing and will be the preferred choice for the remainder of this chapter.
The three conditions that will need to be tested are: string comparisons, file status and
integer comparisons.

String Comparisons
Strings are considered to be equal if each character in one is matched exactly in the other,
and they have the same length. For example the string "String" is not equal to the string
"String " - while the two contain the same letters, the second contains an extra trailing
space, and they are therefore not equal. All of the string tests supported by test
are listed in table 2.5.

Table 2.5: String tests supported by test.

    Test                  Returns true if
    string1 = string2     the strings are identical
    string1 != string2    the strings are not identical
    -z string             the string has zero length
    -n string             the string has non-zero length
    string                the string is not the null string

Testing Files
A very important test is often the status of a file. Perhaps one test would be to test if a file
exists before creating another with the same name (which would of course destroy and
replace the pre-existing file). The test command will test files for such attributes as
existence, writability, readability and file type. Table 2.6 summarizes the file test formats
and return values.
Table 2.6: File test formats and return codes.

    Test       Returns true if
    -r file    the file exists and is readable
    -w file    the file exists and is writable
    -x file    the file exists and is executable
    -f file    the file exists and is an ordinary file
    -d file    the file exists and is a directory
    -s file    the file exists and has a size greater than zero
To illustrate the use of file testing, the following is a script that acts as a revised version
of the UNIX move (mv) command. Not only does this script check for overwrites, it also
deletes any file that is of zero size.
#! /bin/sh
#
# Move is a script that moves a file if no overwrite will occur
# and will delete the file to be moved if it is empty
#
# Usage: move [file] [destination]
#
if [ ! -s $1 ]
then
if rm $1 2>/dev/null
then
echo "$1 was empty and thus removed"
else
echo "$1 does not exist"
fi
else
mv -i $1 $2
fi
This example uses the if:then:else construct but also uses a few things not discussed in
much detail. The first new idea was briefly covered in the section on variables -
parameters. The $1 passes the name of the file being moved (the first name entered on
the command-line after move), and the $2 is the destination or place that the file is
moving to. The Bourne shell allows nine such parameters, each separated by white space,
and therefore nine pieces of information can be passed to a shell program from the
command-line.
While the Bourne shell only accommodates direct access to nine parameters (i.e. $1
through $9), nothing stops the user from entering more. The shell comes with a tool that
permits access to more than nine parameters. More generally, this tool permits the use of
a number of parameters that is not determined until run time. This means that the script
can let the user enter as many parameters as he wishes on the command line and the
program will handle any or all of them as required. This tool is the shift command. The
shift command
slides the parameter list by a specified number of places to the left and has the following
form:
shift n
where n is an integer that specifies how many places the list will be adjusted. If n is
not explicitly given, the parameter list is moved one place. The following illustrates how
the shift command works on a parameter list of size n. Before the shift command has
been used, the list would look like
$1 $2 $3 $4 ... $n
and after using shift, the value that was in $2 is found in $1, the value that was in $3 is
found in $2, and so on, with the old value of $n ending up in $(n-1). The first parameter's
original value is no longer accessible, and the number of parameters in the parameter list
drops by the number of places that the list was shifted. For example, if a user entered 5
parameters and then the script shifted the
list:
echo $#
5
shift
echo $#
4
where the 4 and 5 would be the output of the shell script. This presents a clever way of
accessing any number of parameters in a script. To ensure that all parameters have been
processed, a simple test command can be incorporated into the script as follows:
while [ $# -ge 1 ]
do
echo $1
.
.
.
shift
done
This script piece will print the parameters, carry out some unspecified tasks, and repeat
until all of the parameters passed have been exhausted.
If a script is written to expect a certain number of parameters, and the user places too
many on the command line, the program will only use as many as were required. This
means that if the script expects 2 parameters, and the user enters 3, only the first two
entered will be used. If, on the other hand, the user enters fewer parameters than the
script was written to expect, it will fill the un-entered parameters with a null value (i.e.
they will be empty).
Another new idea (from the previous example script) was that of having one if statement
within another, which is called a nested if statement. This allows tests and decisions to be
made as a result of earlier tests. Nested decisions allow very specific testing with less
coding than would otherwise be the case. Notice that the first if statement uses a !
character, which negates the test, or turns a TRUE value to a FALSE value and vice
versa.
The comments have been included to give some description of the task the script will
carry out. One commented line in particular gives the expected usage of the script which
allows the user to have knowledge of which order to enter the parameters on the
command-line. This is not required but is good practice - especially if the script is to be
used by people other than the programmer.
There is a third type of test that can be made. This is the integer comparison test. This
type of test is done when comparing the values of two integers. The first integer, the one
on the left, is compared with the one on the right. For instance, the following tests to see
whether Integer1 is greater than Integer2:
$ test Integer1 -gt Integer2
There are in total six comparisons that can be made between two integers: if they are
equal (-eq); if they are not equal (-ne); if Integer1 is greater than Integer2 (-gt); if
Integer1 is greater than or equal to Integer2 (-ge); if Integer1 is less than Integer2 (-lt);
and finally, if Integer1 is less than or equal to Integer2 (-le).
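As with the string and file tests, a comparison that is TRUE returns an exit status of 0, while a FALSE comparison returns 1. For example:
$ test 5 -gt 3
$ echo $?
0
$ test 2 -ge 7
$ echo $?
1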
In the Bourne shell, mathematical expressions cannot be simply assigned to a variable.
This means that the expression a=b+c is not valid. To get the result of an arithmetic
expression, the expr command must be used. This command evaluates the arithmetic
expression, and then returns the result. There are five arithmetic operators that can be
used in the Bourne shell:
Table 2.7: Arithmetic operators in the Bourne shell.
+     addition
-     subtraction
\*    multiplication (must be escaped from the shell)
/     division (integer)
%     modulus (remainder)
The multiplication and division operators have higher precedence than the addition and subtraction operators. The Bourne shell does not recognize parentheses for this purpose, but the order of evaluation can be rearranged by nesting one expr command inside another with back quotations (`). For example,
$ expr 3 + 2 \* 8
19
is different from
$ expr `expr 3 + 2` \* 8
40
Note also that the multiplication operator (*) was escaped in the expression. This is because the shell tries to use the character for filename expansion if it is not escaped. One last important note on arithmetic expressions is that the division operator performs integer division. This means that when one integer is divided by another integer and the result is not itself an integer, the remainder is dropped and only the integer portion of the division is returned. The modulus operator can, however, be used to return the remainder of the operation.
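For example:
$ expr 7 / 2
3
$ expr 7 % 2
1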
It may be necessary to use more sophisticated compound tests in a decision, and
therefore tests can be combined using logical AND and OR operators. A compound test
is a fancy way of saying that two simple tests are combined into a single test using logical
operators. The logical AND operator is written -a, while the logical OR operator is
written -o. To illustrate the use of a compound test, consider the situation where a file is
tested for readability as well as for writability:
if [ -r prog_name -a -w prog_name ]
then
command list
fi
A test can also be negated by placing the ! character before the test:
if [ ! -x prog_name ]
then
lpr prog_name
else
echo "Are you sure you want to print an executable file?"
fi
The if:then:else construct works well for handling one or more decisions in the flow of a
script, but an even better construct is available for handling decisions where many
choices are available - the case construct. In these situations, the case construct is
simpler to implement than the if:then:else construct. The case statement takes a value, or pattern, and compares it with a list of patterns. The number of patterns is up to the programmer's discretion, and there is no imposed limit. Along with each pattern in the
list is a list of commands. The first pattern that matches the initial value has its command
list executed. The case construct has the following format
case value in
pattern1)
command list;;
pattern2)
command list;;
.
.
.
patternN)
command list
esac
The double semicolons are used to signify the end of a command list. The last group of
commands does not require the double semicolons as the esac statement signifies the end
of the construct, which implies the end of the last group of commands. More than one
pattern can be entered on a line if they are separated by the logical OR symbol (|). When
multiple patterns are used in this way, the first line that has any one of its patterns
matched has its command list executed. Character substitutions, such as * and [], can also
be used in a pattern. Since the patterns in the comparison list are compared from top to
bottom it is always good practice to use the meta-character (*) as the last pattern choice to
handle any unexpected choices. The case construct is especially useful in menus of
choices such as the following:
echo "Do you wish to delete $FILE ?"
echo ' '
echo "q or quit Quits"
read ANSWER
case $ANSWER in
y|Y|yes|YES)
echo "Removing $FILE";
rm $FILE;;
n|N|no|NO)
echo "$FILE was not removed";;
q|Q|quit|QUIT)
exit 1;;
*)
echo "$ANSWER was not an understood option";
exit 2
esac
The only command used here which is yet to be covered is the read command which gets
input from the user. This is covered in the section on input-output.
Loop Control
While branching is an integral part of shell programming, it is only one part of program
control. Looping in a program allows a portion of a program to be repeated as long as the
programmer wishes. This can be for a specified number of iterations (or loops), or it can
be until a particular condition is met. For instance, a programmer might want to repeat a
particular operation on every file in a particular directory. Rather than rewrite the section
of the program that carries out the operation over and over for each file, the operation can
be written once and iterated as many times as required. Loop control also allows for
programs that are more general. Rather than having to pre-specify how many iterations
are required for a particular task, the programmer uses conditions to control the iterations.
The Bourne shell provides a rich variety of loop control constructs. Depending on what is needed, the programmer can choose the construct that best suits his needs.
The for:in:do construct is used to repeat a group of commands once for each item in a
provided list. The construct has the following form:
for VARIABLE in LIST
do
COMMAND LIST
done
where VARIABLE is a variable name assigned each item in LIST during the execution of
COMMAND LIST. What happens is as follows: the variable takes the value of the first item
in the list and then executes the command list; after the command list has been passed
through, the variable is assigned the value of the second item in the list, and so on, until
the list has been exhausted. Searching for the occurrence of a string in a file could be done
like the following:
#! /bin/sh
#
# A script to look for the occurrence of a string in a file
# Usage: match [string] [file]
#
for word in `cat $2`
do
if [ "$word" = "$1" ]
then
echo "Found $1 in file $2"
else
:
fi
done
where the first parameter passed to the script is string and the second is the file thought to
contain the string. Notice that after the else statement a colon has been placed. This is the
null command which tells the computer to do nothing. This is clearly an unnecessary
section of the script and was only added to demonstrate the use of the null command.
If the list (and the in keyword) is omitted from the for statement, each command-line parameter will be assigned to the loop variable in turn.
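For example, the following fragment would print every parameter passed to the script:
for word
do
echo $word
done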
A second type of loop control construct is the while loop. This construct, unlike the
for:in:do construct, checks the TRUE or FALSE value of a condition before proceeding.
It has the following form:
while condition
do
commands
done
where the condition can be obtained in the usual fashion using one of the forms of test.
The done statement signifies the end of the construct. The following script segment
counts backwards from 10 to 1:
number=10
while [ $number -ge 1 ]
do
echo $number
number=`expr $number - 1`
done
Another construct which is very similar to the while loop is the until construct. The
construct works in precisely the same manner with the one exception that it repeats a
series of commands until a condition is met. The until loop looks like
until condition
do
commands
done
To count backwards, as in the above example, the until loop would be used as follows:
number=10
until [ $number -lt 1 ]
do
echo $number
number=`expr $number - 1`
done
Exiting Loops Early
In the three looping constructs examined above, the loop would continue to execute until
a specific condition was met (a boolean condition in the while and until loops, or exhaustion of the list in the for loop). There are times, however, when it
would be beneficial to exit a loop early. The Bourne shell provides two methods for
exiting from a loop early, the break command, and the continue command.
The break command will exit from the current loop, and program control will resume directly after the loop construct that was exited. The break command takes an integer
parameter n which determines the number of levels to jump out of. For example,
until cond1
do
Command_A
until cond2
do
Command_B
if [ $? -ne 0 ]
then
break 2
fi
done
Command_C
done
Command_D
In this example, cond1 is evaluated and if it does not have a TRUE value Command_A is
executed. If cond2 has a FALSE value Command_B is then executed and the following if
statement checks to determine if the return value was TRUE. If the return value is TRUE
cond2 is again tested. If the return value of Command_B is not TRUE (i.e. non-zero), the
break command is executed. Since a parameter value of 2 has been passed to the
command, it jumps out two levels and Command_D is executed. Notice that
Command_C can only be executed if cond2 becomes TRUE.
The second method for exiting a loop early is the continue command. This command
behaves very much like the break command with the exception that it jumps out of the
current loop and places control at the next iteration of the same loop structure. The
continue command also takes an integer parameter which determines the number of
levels to jump back. Consider the above example, with the continue command replacing the break command:
until cond1
do
Command_A
until cond2
do
Command_B
if [ $? -ne 0 ]
then
continue 2
fi
done
Command_C
done
Command_D
In this example, Command_A is executed as before followed by the test of condition
cond2. If Command_B returns a FALSE value, the continue command jumps out of the
loop and condition cond1 is again evaluated, and if not TRUE, Command_A is again
executed. In this example Command_C will only be evaluated if a test of cond2 returns a
TRUE value, as above, but Command_D will only be executed if the test of cond1
returns a TRUE value. The continue command will not pass program control directly to
Command_D as it did in the first example.
When used in case structures inside a loop, the break command is a pleasant alternative to the exit command for handling unwanted choices, as it allows control to be passed to another section of the program rather than exiting the program entirely.
Input and Output (I/O)
Now that the programming structure has been covered, it would be nice to be able to
interact with the program while it runs. As has been shown above, the program can write
out to the screen, a file or a device, and can even read in information from the user. While
all of the above examples have used some form of output, in the form of an echo
statement, they have not utilized any of the special escape characters provided. The
following is a list of these special characters and their purpose.
Table 2.8: Escape characters in the Bourne shell.
\b     backspace
\c     print the line without a terminating newline
\f     form feed
\n     newline
\r     carriage return
\t     horizontal tab
\v     vertical tab
\\     backslash
\0n    the character whose octal value is n (e.g. \007, the bell)
A couple of these characters are extremely useful for interacting with the user. For
example, the \c character can be used when getting input from the user to prevent the
cursor from dropping to the next line:
$ echo "Do you wish to continue? \c"
Do you wish to continue? $
Until the newline character is entered, any text entered will be displayed on the same line
as the input prompt. Another useful special output character is the system, or alert, beep (\007) which can be used when alerting the user that a problem or error has arisen:
if [ "$password" = "" ]
then
echo "You must not have a null password \007"
exit 1
fi
Shell scripts can also take input from stdin. User input allows shell programs to be fully
interactive, which will add to their generality. The read statement is used for this
purpose. The read statement will cause the shell to accept an input string to be read in
until a newline character is encountered. The shell strips the white space that separates the words of the input but does not perform filename or character substitutions. The shell takes the input string and breaks it up into the substrings that were separated by white space. The read statement has the following form:
$ read Variable1 Variable2 Variable3 ... VariableN
where each variable is an expected substring. If, for example, the user is expected to input
two filenames, the read statement would be
$ read file1 file2
If the user accidentally enters three files rather than two, the second variable will be
assigned the last two file names. If on the other hand, there are more variables than
substrings entered, the unmatched variables will be null, or empty.
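Putting echo and read together gives a simple interactive exchange. The following fragment is a small sketch (the variable name NAME is only an example):
echo "What is your name? \c"
read NAME
echo "Hello, $NAME"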
Advanced Programming Topics
• Functions
• Trapping Errors
Functions
As a shell programmer becomes more experienced, he will undoubtedly want to write increasingly complex shell scripts to handle more interesting tasks. With more complex tasks come larger, more complex programs. As in any computer language, large
programs can become cumbersome and difficult to understand. This can be a real
problem when attempting to maintain scripts. One way that programmers have managed
to keep complicated programs clear and hence easily maintainable is to modularize the
programs. This means to write programs in blocks or modules, where each module
contains a group of commands specific to a certain subtask of the whole program. An
example of this would be a program written in any generic language to calculate the sum
of areas of a circle, square, and a triangle. While it would be possible to write the
program sequentially, using modules gives the program a certain clarity that cannot easily be achieved any other way:
begin areasum

circ_area ()
    area = Pi * rad * rad
    return area

squar_area ()
    area = side * side
    return area

tri_area ()
    area = 1/2 * base * height
    return area

sum = circ_area + squar_area + tri_area

end
These modules are often called subroutines or functions. The difference between the
meanings is language dependent. In Bourne shell programming, these modules are called
functions. Another reason, even more valuable than maintenance, for using functions is
the ability to repeat a task many times without having to re-enter the same code whenever
it is needed. A good example of this is a sine function. A program which tracks a satellite may have to calculate a sine function many times. If the list of commands to calculate the sine function had to be written into the program each time it was to be used, the length of the program would be ridiculous. Instead, a function which calculates the sine
of a number would be written once in a function and called when needed. In shell
programming it may become necessary to repeat a task many times in the context of a
larger task. For example, consider a function which might examine a file and print out
information pertaining to a particular string contained in the file. The actual program
might examine many directories containing many different files, but only want to
examine a certain type of file for this string. The main shell script could then call this
function only when it is needed rather than have it operate on all of the files examined by
the main program. This could save the user much time and make his entire program more
efficient. The following example of a function is an enhancement of the long listing in
UNIX (ls -l) in that it displays a title over each of the columns of information displayed:
$ list () {
> echo "Permission Ln Owner Group Size Modified Name"
> echo "---------- -- ----- ----- ---- -------- ----"
> ls -l $*;
> }
which could be used as follows:
$ list
Permission Ln Owner Group Size Modified Name
---------- -- ----- ----- ---- -------- ----
total 86
-rwxr----- 1 normb users 60041 Dec 15 19:48 Bourne.tex
drwx------ 2 normb users 1024 Sep 18 3:13 mail/
drwxr-xr-x 2 normb users 1024 Sep 30 11:01 tmp/
$
Because of the use of the $* variable this long listing function will also take wildcards (or
filename meta-characters).
Trapping Errors
This last topic on Bourne shell programming deals with the way programs can handle
interruptions. An interruption can be anything from a terminal being disconnected to the
computer running out of memory. What happens if an interruption occurs during
execution of a script? This is not an easy question to answer, as the answer basically depends on what kind of interruption has occurred. If, for example, the power to the system was lost, there is really nothing that can happen since the computer will no longer be operational. If, however, the user presses an interrupt (or break) key, the question is still not answered. Will the program stop immediately, will it finish executing the current loop, or will nothing happen at all? What actually happens depends on a
command called trap. The trap command allows the user to have the program carry out a
command string of one or more commands prior to exiting the script. If more than one
command is contained in the string, the commands should be contained in quotes as
described in the section on shell special characters, and command execution. The syntax
of the trap command is as follows:
trap command(s) signal(s)
One situation where this command is extremely useful is where information is being
written to a file (to be removed at normal exit of the program) and an interruption (or
using correct UNIX terminology, a signal) is sent which would by default cause the program to terminate. If the program terminated prior to its normal exit, the temporary file would be left behind.
Adding the following to the script would allow the clean-up to occur before termination
due to a user break (control-C):
trap 'rm tmp/*; exit 1' 2
Care must be taken for a couple of reasons when trapping signals. First, the signals which
may be trapped vary from machine to machine, and second, the definite kill signal (9)
cannot be trapped on any machine. What follows is a typical table of signals and the
event which causes them:
Table 2.9: Signals and the event which causes them.
0      exit from the shell
1      hangup (terminal line disconnected)
2      interrupt (break key or control-C)
3      quit
9      definite kill (cannot be trapped)
15     software termination (the default kill signal)
The details of this list are not really important as there are only a few which will be
trapped in ordinary day-to-day activities: 0, 1, 2 and 15. One should also note that the default (i.e. not using trap) is always immediate termination of the script.
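Putting the pieces together, a script that uses a temporary file might protect itself as follows. This is only a sketch; the file name and the processing commands are placeholders:
#! /bin/sh
# remove the work file and quit on hangup (1), interrupt (2) or termination (15)
trap 'rm -f /tmp/work$$; exit 1' 1 2 15
ls -l > /tmp/work$$
# ... commands that process the work file ...
rm -f /tmp/work$$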
Dot Files
Throughout this chapter there have been mentions of the environment, such as
environmental variables. For instance it was shown that the environmental variable $PS1 allowed the user to change the appearance of his prompt during an interactive session. What if, however, a user would like to have that prompt setting as the default for every session? As was mentioned earlier, the prompt setting can be placed in a file called .profile in his home directory which is executed at each login. This file is simply a script that the shell looks for at login and will execute at that time. This is where any environmental variables should be set, such as $PS1 and $PATH. There will likely be
a file /etc/profile which will also execute during login. This is a file that the system
administrator will have set as a default so that users do not need a .profile in their own
directory. It also allows the system administrator to set some things like the path so that
the users on the system can have access to all of the executables without having to try and
figure out the paths for them. Depending on the user's level of access, the /etc/profile is
often a good file to copy into his directory as a skeleton to start with. As well as environmental settings, function definitions can also be placed in this file. This means that
the list function which was described in the section on functions can be used during every
session without having to be re-entered. It is always good practice to keep the functions
in their own file to prevent cluttering up the .profile file. As will be seen in later chapters,
the newer shells can have several of these startup files which can be quite confusing if not
organized in some fashion. The functions can all be put into a file called .functions which
would then be executed by the .profile file. The .functions file can be read into the current shell using the . (dot) command (the C shell provides an equivalent command called source, which will be seen in the next chapter). The .profile script would then contain the following line:
. .functions
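A minimal .profile might therefore look something like the following. This is only a sketch; the particular settings shown are examples:
# set the search path and the prompt
PATH=$PATH:$HOME/bin
PS1="$ "
export PATH PS1
# read in the function definitions
. $HOME/.functions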
The C Shell
• A Bit of History
• Getting Started with the C Shell
• Command Execution
• Redirection and Pipelines
• Filename Substitution
• Filename Completion
• Special Characters
• Command History
• Aliases
• Variables
o User Defined Variables
o Array Variables
o Global or Environment Variables
o Parameters
o Special Read Only Variables
o Variable Modifiers
• C Shell Programming
o Testing and Branching
o Signal Handling
• Job Control
• Special Files
o Introduction to UNIX Security
A Bit of History
The shells in this book are grouped into two main families, the Bourne shell family, and
the C shell family. Since we have already covered the Bourne shell, it would thus be a
logical step to examine the C shell next to outline the other major family.
The C shell was developed by Bill Joy, the creator of the vi text editor, at the University
of California, Berkeley. He improved on the Bourne shell in many areas by adding some
new features, and altering some of the original features. He based parts of the syntax on
the C programming language, but the C shell and C programming languages are two very
different entities.
Getting Started with the C Shell
As with the Bourne shell, a good place to start is by checking that the current shell is in
fact the C Shell. The echo command can be used to see which shell is running:
echo $SHELL
which should give
/bin/csh
if the C shell is the current operating shell. If the C shell is not the current shell, the login shell (the shell that is started every time you log in) can be changed by typing the following command:
chsh /bin/csh
An interactive shell can also be invoked temporarily, for learning or using the C shell until the end of the session, by typing csh. The default prompt in the C shell is the percent symbol (%) which
will be used as the default prompt for the rest of this chapter. The following is the typical
appearance and execution of a command line in the C shell:
% ls -F
Mail/ News/ bin/ message.txt slide.ps
%
This is essentially the same as the output from the same command in the Bourne shell
with the exception of the different prompt. All shells handle command lines in the same
manner. They first accept the user input and then they break it down into individual
components such as the command to be executed and any flags that go with it.
The C shell allows long command lines in the same manner that the Bourne shell did. If a
command line is too long for the screen (usually 80 characters on most monitors), simply
break the line with a backslash, or in other words, escape the return character. This
backslash character tells the shell to ignore the enter-key when it is pressed. A use of this
might be when copying a list of files into another directory:
% cp file1 file2 file3 file4 file5 file6 \
> file7 file8 file9 file10 file11 file12 \
> ~jones/data/storage/old_files
%
which would allow all of the files to be copied in a single command. The > character
signifies the continuation of a command line onto the next terminal line. The ~ character,
or tilde (pronounced til-duh), is a special character in the C shell that is understood by the
shell to mean the current user's home directory. This and other special characters will be
examined in the section on filename substitution.
Command Execution
The C shell allows commands to be entered in different ways. Grouping multiple
commands into a single command line, running commands in the background, and
building complicated commands from commands inside of commands can all be handled
by the C shell.
To execute multiple commands on a single command line, simply place a semicolon
between each of the commands. The commands will be executed from left to right with
each successive command being executed after the previous one. For example, the
following command line changes to a directory containing documents which need to be
moved. A new directory is created for the documents, which are then all moved into it.
% cd docs; mkdir old_docs; mv *.* old_docs
This task could have been accomplished by executing each of the commands separately
on a separate line:
% cd docs
% mkdir old_docs
% mv *.* old_docs
If all of these commands execute quickly, this is a good method for carrying out tasks. If, however, one or more of the tasks takes a significant amount of time to execute, the user will be left waiting for its completion - wasting time. To get around this problem, time
consuming commands can be run in the background. This means that the shell handles
the execution and waits for its completion without the user having to wait to do other
things. For example, if the user needs to carry out a few routine disk management tasks,
including archiving all of his directories and subdirectories, he might wish to have the
archiving done in the background.
% rm ~/tmp/*; tar cfv my_dirs.tar ~ &
Sometimes a situation might arise where the user would like to have commands executed such that the output of one command becomes the arguments of another command. For example, the user may wish to count the number of lines in every file in the current directory. Rather than typing each filename by hand, a better way would be the following:
% wc -l `ls`
In this example, the left single quotes tell the shell to execute what is inside first (the ls command), and place the result on the command line as the arguments of the wc command. The ls command generates a list of the files in the current directory, this list becomes the arguments to wc, and therefore the line count of each file (along with a total) is output to the screen.
A group of commands can also be executed in a subshell. This means that a group of
commands can be executed in a shell other than the interactive shell currently being used.
To do this, the user would enter a list of commands separated by semicolons inside of
parentheses (). The following example will illustrate the syntax of this type of command
while also demonstrating the way in which variables are affected by use in subshells.
% set MY_NAME = 'N Buchanan'
% (set MY_NAME = 'Joe Blow'; echo $MY_NAME)
Joe Blow
% echo $MY_NAME
N Buchanan
It can be seen that the same variable is used in both the current interactive shell and in the
subshell. The values of this variable are different however. The good news is that the
subshell did not overwrite (or redefine) the variable in the current shell. The bad news is
however that this type of variable usage can become quite confusing and cause the user to
lose track of what a variable's purpose is. This type of variable usage is therefore
discouraged.
The C shell provides a mechanism for repeating a command a specified number of times. While on the surface this may seem like a useless feature, it can come in handy. The format of
this repeat command is simply
% repeat n command
where n is the specified number of repetitions and command is the command to be
repeated. A particularly useful way to use this feature is to repeatedly echo something to
the screen:
repeat 2 echo ********************
echo ' Error'
repeat 2 echo ********************
which would then be executed as follows:
********************
********************
Error
********************
********************
which would certainly add a sense of urgency to any error message. The alert reader
might be saying that the output would be interrupted by the entry of each successive line.
While this would be true in interactive mode, the omission of a prompt will be used in
this book to signify a shell script or shell script fragment (which the above example is). In
a shell script, the commands are executed in sequential order. This will be covered in
more detail in a later section. One thing to keep in mind when using the repeat command
is that the command to be repeated must be a simple command and not a compound
command using redirection or pipelines (which will be covered in the next section).
Redirection and Pipelines
The previous section demonstrated the power of passing the output of a command to
somewhere other than to the screen - to another command. This is a very useful feature,
and the C shell provides two ways other than command substitution to accomplish it. The
first of these is redirection. Redirection allows the output of a command, or more generally a group of commands, to be sent to a file or device. There are four basic types of redirection operations. The command > filename redirection sends the output from the command to a file called filename. If this file does not exist it is created and the output is put there. If the file does exist, any contents that might be contained in it are erased and then the output is placed into the empty file. The command < filename redirection takes the contents of filename and uses them as input to command. The command >> filename works precisely the same as > with the exception that if the file it is to write to exists, it appends the output to the end of the file rather than erasing the previous contents. The command << word redirection takes input from the keyboard until a line containing only word is entered. These four basic redirections make up the basis for table 4.1 which contains all of the redirections possible
in the C shell:
Table 4.1: Redirections possible in the C shell.
> file       redirect output to file
>! file      redirect output to file, overriding noclobber
>> file      append output to file
>>! file     append output to file, overriding noclobber
< file       take input from file
<< word      take input from the keyboard up to a line containing only word
>& file      redirect output and errors to file
>&! file     redirect output and errors, overriding noclobber
>>& file     append output and errors to file
>>&! file    append output and errors, overriding noclobber
Notice that all of the redirections with the ! symbol refer to noclobber having been set.
noclobber is a predefined shell variable (refer to the section on variables) that prevents
existing files from being overwritten by redirection. The ! symbol overrides this variable
if it is set. It is also important to realize that when the << redirection is used, the shell
will make command, filename, and variable substitutions unless the input string is
surrounded by quotes of some form. In each of these redirections, a device, such as the null device, or a file descriptor could be put in place of file.
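For example, with noclobber set, an attempt to overwrite an existing file with > is refused unless the ! form is used:
% set noclobber
% date > results
% ls > results
results: File exists.
% ls >! results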
Another way to pass the output of a command to another command is by using a pipe (|). The pipe is related to the left single quoted command substitution described above, except that the output of the first command becomes the standard input of the second command rather than its arguments:
command1 | command2
which executes command1 and passes its output to command2. For example, the number of files in the current directory could be counted with:
% ls | wc -l
Note the difference from the command substitution example earlier: ls | wc -l counts the lines of the listing itself, while wc -l `ls` counts the lines inside each of the listed files.
Pipelines, redirections, and command substitutions can be mixed in any way that the user desires.
Filename Substitution
The C shell allows the user to specify a filename or group of filenames without having to
explicitly type in the entire filename. This is called filename substitution. For instance,
the user might wish to list all of the C source files in a directory, or perhaps list all but the
C source files in a directory. The most common type of filename substitution is the metacharacter (*). This character tells the shell to substitute in an arbitrary list of
characters until all possible matches have been made. For example,
% ls *.c
prog.c prog2.c string_search.c
% rm p*.c
% ls *.c
string_search.c
This type of substitution is quite robust since any number of strings will be tried until the
appropriate ones have been chosen. There are however, more specific substitutions that
can be made. The (?) character tells the shell to substitute exactly one arbitrary character when making matches:
% rm pro?2.c
While this looks like a job for the metacharacter (*), it must be remembered that if a file called program2.c existed, it would be destroyed if the (*) were used rather than the (?).
The [] substitution will try to insert each character in the brackets into the filename. If a
range is specified by use of a dash, each character in the range will be tried, as well as
any other characters in the brackets. For example, rm file[a-df].txt would delete all of the
following files: filea.txt, fileb.txt, filec.txt, filed.txt, and filef.txt if they existed. The
negation character (^) could be used to prevent substitution of characters in similar
fashion. All of the above were provided in the Bourne shell, but the C shell provides a
couple more types of filename substitution. The curly brace {} will tell the shell to insert
each provided string, separated by commas, into the filename:
% cp data{june,july,august}.95 old_data
would copy the files datajune.95, datajuly.95, and dataaugust.95 into the old_data
directory.
While it may seem that every conceivable type of filename substitution has been covered,
the C shell recognizes a special character that can be best described as yet another. In the
C shell, the (~) character is recognized as the current user's home directory. If the user
types cd ~ anywhere on the system he will be returned to his home directory. If the tilde
(~) is followed by the user name of another user on the system, this user's home directory
is the result. For example,
% cp file ~smith
will copy file to user Smith's home directory if the appropriate write permissions are set.
Filename Completion
It can clearly be seen that filename substitution can be used to refer to filenames which are long to type or possibly off the screen and thus difficult to refer to. For example, a
directory might contain several hundred files (take a look at /usr/bin) and a user might
want to examine a data file that he can only remember starts with the month it was
created, say june. He could pipe the listing to the more command:
ls -l | more
which would allow the user to examine all of the files a page at a time, or he could try:
head june*
which would display the first few lines of each file starting with june, thus allowing the
user to examine the files and find which one he wants to look closer at. There are many
ways to handle the problem using various commands and filename substitutions, but the C shell provides an even better method, filename completion. If the variable filec is set
(refer to the section on variables) using the command set filec, a filename can be
completed at any time during command line input using the ESC-key. For example, if
there is a subdirectory called all_of_my_data_files in the current directory, the user could
type the following:
cd al[ESC]
at which point the rest of the name would be filled in by the shell. This is of course only
true if the rest of the filename is unique. If it is not, the shell will beep alerting the user
that the completion is ambiguous. This would occur if there were three files in the current
directory starting with ``th'', and the user tried filename completion after entering just a
``t''. One way to fix this problem is to enter the EOF (end-of-file) key combination
(usually Control-D, often written ^D) rather than the escape key, at which point a list of
choices will be displayed by the shell. The following session illustrates the above (the
text following the hash marks are comments only, for purpose of clarity):
% ls
Mail/ News/ data_june95/ data_march95/ data_may95/ cleanup*
% cd d[ESC] # system beep sounds
% cd data_ # actually the same line as above
% cd data_[^D] # still the same line as above
data_june95 data_march95 data_may95
% cd data_j[ESC]
% pwd
/home/jblow/data_june95
This may look a bit confusing, but with a bit of practice, it will become an invaluable
tool. The example above illustrates how the shell will complete as much of the command
as it can before ambiguity sets in. Filename completion can be used at any point of the desired filename, so the user does not have to enter input to the point of unique characters before using completion.
Special Characters
It might become apparent at this time that the C shell recognizes a good number of
characters that have special meanings (~, >, <, & and \, just to name a few). This
presents a bit of a difficulty if one is not careful. To see why, try this:
% echo ~ Hi! ~
/home/normb Hi! /home/normb
It is not what one would expect, but this is the way the shell interprets command lines.
After the command echo, the shell makes any substitutions of special characters, and
then proceeds to fill in the rest. In the above example, the tilde was meant as an artistic
touch, not the user's home directory. Unfortunately the shell has no way of interpreting a
user's intentions. There are however, ways to deal with special characters. One of these is
to escape the character in question. The reader might remember that escaping a character
was discussed when discussing command lines that were too long. In that case a
backslash (\) was used so that the shell did not interpret the enter-key as an end of
command line signal. Here escaping a character is precisely the same. To escape a
character simply place a backslash immediately to the left of the character - no spaces.
This will work for any special character. Here is a listing of the characters in the C shell
that are considered special: ; & ( ) | * ? [ ] ~ { } ! < > ^ `` ' \ ` $, as well as, whitespace
(space, tab or newline). To correct the previous example, the escaping method could be
used:
% echo \~ Hi! \~
~ Hi! ~
Notice that the ! character
is listed as a special character above, and yet is not escaped in
the example. The reason for this will become clear in the section on command history,
but a brief explanation is that, alone the ! character is not special. It becomes special
when followed by almost anything else - so be careful.
The C shell, like the others, allows enclosing text in quotations to prevent interpreting
special characters. This is a bit nicer when the number of characters which could be
misinterpreted is more than one, and especially if white space is involved. Suppose for
example that a person wanted to have a word or phrase centered on the output device.
Maybe a title that acted like a heading for some output. Simply typing the title like the
following would not work
% echo          Title
Title
since the shell treats the whitespace as a single unit regardless of the amount of spaces
and tabs entered. If the text (including whitespace) was placed inside of double quotation
marks however, the shell will leave the whitespace alone allowing the centering on the
display.
% echo "          Title"
          Title
The double quotes will allow all special characters to be taken as text rather than with special meaning, with the following exceptions: " (the ending double quotation), $ (variable substitution), ` (command substitution), \ (next character escape), ! (history character), and NL (newline character). The double quote character is used to signify the termination of the quoted text, and the $ and ! characters will be covered in the following sections.
Another option is to use the right single quotation character ('), which behaves like the
double quotations with the exception that only the history character (!), the newline (NL)
character, and the right single quote will be recognized as special characters by the shell.
The following example shows the difference between single and double quoting of text:
% ls *.gz
lotsOdata.gz
% echo "Current shell is $SHELL Last command was !!"
Current shell is /bin/csh Last command was ls *.gz
% echo 'Current shell is $SHELL Last command was !!'
Current shell is $SHELL Last command was ls *.gz
The difference between the two commands is that the double quoted expression allows
the shell to substitute for variables while the single quoted expression will not. Both
allow history substitution. The decision of which type to use is dependent on the
situation.
Command History
While there are indeed many ways described above to reduce the work involved in
command line input, it can still be a daunting task to repeat a command over and over.
There will often be times where a command or filename has been misspelled and requires
reentering, or other times where a command line will be purposely reentered, such as a
debugging session with multiple recompilations. The C shell introduces a mechanism for
making this much easier, the history mechanism. This is simply a record that is kept
detailing the previous commands that the user has entered. These commands can then be
accessed in a variety of ways. The simplest is with the up and down arrows on the
keyboard. If the keyboard is one with the arrows on the number pad as well as the arrow
pad, the current key mapping will determine if the number pad arrow keys will work.
They may not. As a simple example of the history mechanism, take the following session:
% emacs myprog.cc
% g++ -I /users/normb/my_include_files -o myprog myprog.cc
% myprog
various annoying run time errors
%
The user must now attempt to find the cause of the errors and fix them. Rather than
retype the lines again he uses the arrow keys:
% myprog (up once)
% g++ -I /users/normb/my_include_files -o myprog myprog.cc (up twice)
% emacs myprog.cc (up three times -- the desired result)
It should be noted that the text in parentheses is the author's notes, and that each
successive arrow keypress would display the command on the same line. For this
particular example only a few keystrokes are saved but as the process continued it is
rather obvious that the arrow keys are the preferred alternative. The command history has a restricted size as it is stored in memory as opposed to on the disk. The actual number will vary from machine to machine, but usually 20 to 40 is reasonable. The number can
be set with the history environment variable which will be covered later. When the
number of history commands exceeds the limit, the oldest are removed to make room for
the newest.
While the arrow keys prove very useful in accessing previous command lines, they are
not the only way to access the history list. The C shell provides different ways to access
the command lines depending on taste and situation. Before looking at the different methods of accessing the list, it would probably be a good idea to see what the list looks like. The history list for the previous example would look something like this:
% history
1 emacs myprog.cc
2 g++ -I /users/normb/my_include_files -o myprog myprog.cc
3 myprog
%
The actual numbering may vary depending on how many commands have been entered.
Now, looking at this command list it is easy to see that if the list gets quite long, say 10 or
more items, the number of keystrokes saved begins to diminish. The C shell alleviates
this problem by giving the user a variety of ways in which to access earlier commands.
The C shell uses the exclamation mark (!) as one quick method to make history
substitutions. By entering two consecutive !'s and the enter key the user will have the last
command in the history list executed. The last command would now appear twice as the
last two entries in the history list. Entering the ! followed directly by a number will execute the entry in the history list with that number. For example, the command !1 would execute the command emacs myprog.cc in the history list above. Another
method is to enter a string after the !, which would have the command starting with that
string in the history list executed. For example (again using the above history list), the
command !g++ would invoke the rather lengthy command:
g++ -I /users/normb/my_include_files -o myprog myprog.cc
Notice that the string was entered immediately after the ! character. If this was not the
case, an error would occur as the ! character alone is not a recognized command by the
shell. If there is more than one item in the history list that starts with the string, the one
closest to the end of the list (the one with the highest number) will be substituted. By
framing the string with question marks, like !?string?, the shell will execute the
command which contains the string anywhere in it. To execute the command in the last
example the user could have entered the command !?-o. The history list can also be
executed from the bottom up. For instance, !-2 would execute the second last command
added to the history list. By default, when the user logs out, the history list is purged from memory and lost. Setting the variable savehist to a number will, however, save that many commands (or fewer if not as many commands exist in the history list) to a file called .history in the user's home directory. The larger the number of commands saved from the history list, the longer it will take the C shell to start at login time. This is because the saved history list has to be loaded into memory. Typically 20 to 40 is a good range to set the
savehist variable to. To ensure that this variable is set for each session, it can be added to
the .cshrc file. At this point the reader may well be overwhelmed by the number of different ways to save keystrokes. It might seem that there is an awful lot of remembering involved in saving a few keystrokes. While this is definitely true, the old adage holds: practice makes perfect. It will likely take many hours of UNIX use before many of these tricks are burned into a user's mind - but they will be. Every UNIX user has his own bag of tricks to get the job done, and it will differ from one user to the next.
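The following short session illustrates a few of these history substitutions, using the history list shown earlier (the shell echoes each substituted command line before executing it):
% !!
myprog
various annoying run time errors
% !1
emacs myprog.cc
% !g++
g++ -I /users/normb/my_include_files -o myprog myprog.cc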
Aliases
Now it is time to start exploring the power of the C shell. This is the point where the C
shell really starts to stand out over the Bourne shell. The C shell enables the user to
define complicated (or simple) commands in terms of easy to type aliases. These aliases can save a heavy user literally hours of typing. As a simple example, take the command for
getting a long listing of files in a directory, ls -l. While this is not very difficult to enter, it
can be simplified using an alias:
% alias ll ls -l
After this, for the rest of the current session, typing ll would result in the long listing of
the current directory being displayed. The alias could even be made permanent for every
session by including it in a login file, but that will be covered later. A list of active
aliases is kept by the shell. To see a list of aliases the user would type alias with no
arguments. To check a particular alias, the alias command would be entered with the
name of the alias in question as the only argument. If the alias does not exist, the prompt
will simply reappear on the next line. The following session demonstrates the use of the
alias command:
% alias ll ls -l
% alias bye exit
% alias
ll ls -l
bye exit
% alias bye
exit
To remove an alias, one simply types unalias followed by the alias to be removed:
% alias ll
ls -l
% unalias ll
% alias ll
%
More complicated aliases can be designed using the techniques outlined in earlier
sections (like command grouping and escaping special characters). When defining an alias, remember that the shell makes its substitutions at that time. This can be a bit of a problem if one is not careful. The following example illustrates the problem:
% alias ll ls -l d*.*
% alias ll
ls -l data1.95 data2.95 doc.tex
% cd ~another_usr
% ll
/bin/ls: data1.95: No such file or directory
/bin/ls: data2.95: No such file or directory
/bin/ls: doc.tex: No such file or directory
The problem can be fixed by simply enclosing the definition in quotes, as shown below. Here, single or double quotes will work as no variable substitutions are made. When the definition is not enclosed in quotes, the shell performs the filename substitution immediately, so the alias permanently contains the files that were in the directory where the alias was created. When used in another directory, the shell still looks for those same files. This may seem a bit confusing, but it really comes down to the order in which expansions and substitutions are executed during the aliasing process. It never hurts to enclose things in quotes (unless a particular special character is required) so it is wise to use quotes in aliases as often as possible.
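For example, the alias above could be defined safely with the pattern protected by quotes, delaying the filename substitution until the alias is actually used:
% alias ll 'ls -l d*.*'
% alias ll
ls -l d*.*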
Aliases can also be used to redefine current commands. The C shell will not prevent a
user from naming an alias after an existing command, say ls. This means that one could redefine ls to behave similarly or completely differently. A good use of this property is as follows:
% alias rm rm -i
which would cause the shell to prompt the user on the deletion of each file given as an
argument. Never a bad idea. Redefining commands using alias may seem dangerous, but
keep in mind that these aliases can be removed as well using the unalias command.
Variables
A variable in the C shell can be any name containing up to 20 letters and digits which
starts with a letter or an underscore. A variable is a name or placeholder used to store one
or more values. Variables give a high level of generality to everyday interactive shell
usage, and are heavily used in shell programming (which will be discussed in a later
section). In interactive shell use, a variable can be treated very similarly to an alias, with the
exception that a variable must be referenced (used) with a dollar sign. This dollar sign
tells the shell that what follows is a variable, and that the shell should make the
appropriate substitution. In order to use a shell variable in the same manner that an alias
was used earlier, the long listing can be revisited:
% set ll="ls -l"
% $ll
Ignoring the set command for a moment, the variable ll is given the text string ls -l as a value, and when the variable is referenced on the following line, the text string is substituted. When the user enters the variable name (preceded by a dollar sign) followed by a carriage return, the shell interprets the text string as though the user had typed it in himself. The shell does not care how the command line gets entered; it parses it the same whether it came from a history substitution, variable substitution, alias or standard input. The C shell variables can be categorized into five types. These types are as follows:
• User Defined - local to current shell (assigned values with command set),
• Parameters - command line arguments,
• Environmental - global (assigned using command setenv),
• Special - read only variable (assigned by the shell) and
• Array - multiple valued variable.
Before each of these types of variable is covered, the way in which variables are accessed
should be examined. While it is true that a variable must be accessed with a $ in front,
that is not the end of the story. It should be remembered from the discussion of quoting
special characters that a text string enclosed in double quotes will allow the shell to make
variable substitutions while a string in single quotes will not. Consider the following
example,
% set NAME="John Smith"
% echo "His name is $NAME"
His name is John Smith
% echo 'His name is $NAME'
His name is $NAME
where the single quoted text string is echoed verbatim. Another complication can arise when accessing a variable as part of a longer string. If, for example, the variable DAY is assigned the value ``Tues'' and an attempt is made to build the string ``Tuesday'' from the variable DAY, the following problem arises:
% set DAY=Tues
% echo $DAYday
DAYday: Undefined variable.
%
What has happened here is that there is no whitespace between the variable name and the
rest of the string. When the shell parses the command line, it interprets everything
following the $ and up until whitespace as the variable name. Since DAYday was not
defined as a variable (given a value) the shell displays an error message. To alleviate this
problem, curly braces ({}) can be used. If a curly brace follows directly after a $,
everything inside of the closed curly brace pair will be considered a variable name.
% set DAY=Tues
% echo ${DAY}day
Tuesday
Variable names are case sensitive so DAY, Day and day are all different. It is not good
practice to define more than one variable with the same name but different case. In fact it
has become standard for shell variables to be given all upper case characters.
It is also important to note that when assigning a value to a variable, the shell considers
everything between the equals sign and the next whitespace to be the value of the
variable. It is thus important that no whitespace (other than what is meant to be used) is
introduced during a variable assignment. If whitespace is to be used, the appropriate
quotation characters must be used.
• User Defined Variables
• Array Variables
• Global or Environment Variables
• Parameters
• Special Read Only Variables
• Variable Modifiers
Array Variables
The Bourne shell restricted variables to single values. The C shell imposes no such limit.
It provides the user with array variables, which are variables that contain two or more discrete values. Ignoring proper C shell notation for a moment, an example of an
array would be a variable called alphabet which contained each of the 26 letters A
through Z. To access a particular letter the variable might be followed by the number of
the sequence which contained the desired letter. For example, the letter C might be
referred to as alphabet(3), while Z would be referred to as alphabet(26). An array is a
nice way to store related elements which could be accessed under the same name. In the
C shell, array variables can be set as follows. All of the array elements can be set at one
time by enclosing the list with parentheses and separating the elements with white space
(usually spaces). For example, to set an array of pets:
% set PETS=(Cat Dog Goldfish Horse Boa Hamster)
To access an element of a variable the variable name is followed by square brackets
containing the element number:
% echo $PETS[3]
Goldfish
% echo $PETS[3-5]
Goldfish Horse Boa
The example illustrates how more than a single element can be accessed at once using the
- operator. Placed before a number, this operator accesses all of the elements up to and
including that number; placed after a number, it accesses that element and all of the
elements through to the end of the list. For example,
% echo $PETS[-3]
Cat Dog Goldfish
The entire array variable list can be accessed by placing an * in the square brackets, and
the number of elements in the array can be read with $#VAR. Table 4.2 summarizes the
ways to access array elements:

Table 4.2: Ways to access array elements.

$VAR[n]      the nth element
$VAR[n-m]    elements n through m
$VAR[-n]     the first element through the nth
$VAR[n-]     the nth element through the last
$VAR[*]      all of the elements
$#VAR        the number of elements

Global or Environment Variables


A global variable is one that is visible to all shells. It is said to be global because it is not
restricted to use by the current shell. To set a global variable in the C shell, the setenv
command must be used. This command is used precisely the same as the set command
with the exception that it is used for setting global variables (often referred to as
environment variables). To see a listing of the current global variables in use, the setenv
command can be used without any arguments. To remove a global variable from use, the
unsetenv command can be used in the same manner that the unset command was used.
The C shell comes with a number of predefined environmental variables which can be
read and altered by the user. The lower case variables displayed in the set example are a
few of these. Predefined variables are generally lower case which is why it is a good idea
to use all upper case when a user defines his own variables (global or otherwise). Some
predefined variables are set by the shell during login while others need to be set by the
user if he so desires. Table 4.3 contains the predefined environmental variables for the C
shell, as well as a description and the default setting of each.
Table 4.3: Predefined environmental variables and their default settings.
Not all of these variables would be given an explicit value when activated. Some of these
variables are either set or unset. The nobeep variable, for example, would be set by
simply entering setenv nobeep. If the user wished to have any of these variables set
during each session, the variable assignments could be placed into one of the login
scripts.
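As a short sketch of the pattern (the variable name and value here are purely illustrative):
% setenv PRINTER hp_laser
% echo $PRINTER
hp_laser
% unsetenv PRINTER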

Parameters
Parameters (or command line arguments) are shell variables which carry information
pertaining to a command line. To reference a parameter, the dollar sign is placed before a
number from 0 to 9. The variable $0 is special and refers to the name of the current
program, which would be csh if $0 were accessed at the command line prompt. While the
number after the $ must be 9 or less, higher numbered arguments can be accessed using
$argv[n], where n is a positive integer. To access the 11th command line argument,
the variable $argv[11] would be used. The - operator can be used in the same
manner as it was for array variables. A shorthand version of the $argv[n] notation is to
simply enclose the number in curly braces, as in ${11}. Parameters may seem strange and
useless, but their importance will become evident in the sections on shell programming.
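As a hedged preview (the script name args is hypothetical), a tiny script makes the
numbering concrete; run as args red blue, it prints the program name and its first two
arguments:
#!/bin/csh
# args: echo the program name and the first two arguments
# (assumes at least two arguments are given)
echo Program name: $0
echo Arguments: $1 $2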

Special Read Only Variables


The C shell maintains a list of special variables that can be read by the user, but not
written to or changed explicitly. One of these was examined in the section on parameters:
$argv[n], which returns the value of the nth parameter of the shell command line.
Table 4.4 contains all of the special read-only variables used in the C shell:

Table 4.4: Special read-only variables used in the C shell.

$argv      the list of command line arguments
$#argv     the number of command line arguments
$status    the exit status of the last command executed
$$         the process ID of the current shell
$<         a line read from the standard input
$?VAR      1 if the variable VAR is set, 0 otherwise
$0         the name of the current program


For example, to check whether noclobber is set, one could enter the following:
% echo $?noclobber
1
which would indicate that noclobber is indeed set. The usefulness of these variables will
become apparent in the shell programming sections.

Variable Modifiers
Variable substitutions can be altered in a variety of ways using variable modifiers. A
modifier is incorporated into a substitution by appearing immediately after the variable
name. The general format is
% $Variable_Name:M
where M is one of the modifiers summarized in table 4.5.

Table 4.5: Modifiers in the C shell.

:e     return the extension
:h     return the header (the directory path)
:r     return the root (everything up to the extension)
:t     return the tail (the filename without the path)
:gM    apply the modifier M (one of e, h, r or t) to every word


These terms require some additional explanation. In order to best explain them, the
variable VAR will be set as follows:
% set VAR=(/usr/local/src/prog.c /home/normb/memo.text)
The main use for these modifiers is in working with pathnames, so they will be explained
in that particular context. An extension is what is tagged onto a filename following a dot
(.), and the extension of $VAR is c /home/normb/memo.text. This may seem odd, but
what has happened is that the extension of the first pathname (c) has been returned along
with the rest of the variable, which is the second complete path. The variable modifiers
will also accept an integer index to isolate a particular word (anything between
whitespace - not necessarily a word in the traditional sense) within the variable. To return
the extension of the first word without returning the words that follow, the following
modifier statement would be used:
% echo $VAR[1]:e
c
This way any extension from any word can be returned. To return the extensions from all
of the words contained in a variable, the :ge modifier would be used. This is called the
global extension modifier. A g in front of either the :e, :h, :r or :t modifiers causes the
same effect. Another of the modifiers is the header (:h) modifier. The header is the
directory path where the program is found. For example, $VAR[2]:h would return
/home/normb, while $VAR:h would return /usr/local/src /home/normb/memo.text for
the reasons outlined above for the extension modifier. The root of a variable is everything
up to but not including the extension, and the tail is what follows the header (i.e. the tail
is the filename minus the path). For example,
% echo $VAR[2]:t
memo.text
% echo $VAR:r
/usr/local/src/prog /home/normb/memo.text
% echo $VAR:gt
prog.c memo.text
If the variable name is enclosed in curly braces for any reason, the modifiers go inside
the closing curly brace.

C Shell Programming
Many of the sections leading up to this point in the chapter have hinted at the fact that the
material covered in those sections would be better utilized in shell programs (called shell
scripts). Indeed shell programming is the most powerful aspect of the C shell (or any
shell for that matter). The C shell provides a rich scripting language that, at best, has a
slight similarity to the programming language C. Shell scripting languages provide the
user with a great many tools for handling everyday tasks around the system and even
some less everyday tasks. A shell script can contain UNIX commands as well as the shell
commands discussed in the previous and following chapters. Unlike compiler-based
languages, shell scripts are executed by the shell one line at a time. While this will
obviously make for slower performance, an advantage is gained in the ease of modifying
programs without all of the hassle of compiling and linking. All that is required for a
shell script to be executed is that it be made executable with the following command:
% chmod u+x script_name
Shell scripts are often written to handle some of the more tedious tasks that a user
encounters on a regular basis. A simple C shell script could be a list of UNIX commands
that archives and compresses the user's home directory and copies it to a specified
mounted disk partition for storage:
#!/bin/csh
# backup tars and compresses ~/ and puts in storage on /dsk2/strg/
#
tar -cvf dec18_95.nbdat.tar ~/
compress dec18_95.nbdat.tar
cp dec18_95.nbdat.tar.Z /dsk2/strg/
With the exception of the lines that start with hash marks (#), the script is a list of simple
UNIX commands. This task could most certainly have been entered on a single command
line with use of a pipeline, but it illustrates the basic format of a C shell script. Almost
any line starting with a hash mark will be ignored by the shell, and hence indicates a
programmer comment. The one exception to this rule is the hash bang (#!) sequence of
characters, which has special meaning to the shell: it tells the shell which environment to
start for execution of the script. This could be any shell, or even another scripting
environment such as perl (Practical Extraction and Report Language) [1] or tcl (Tool
Command Language) [2], which are UNIX scripting languages but not shells (at least not
interactive shells like those discussed in this book). The shells are usually found in the
/bin directory, but this might differ from system to system. The powerful feature of shell
scripts over simply writing the commands on a command line is that scripts can contain
many types of safety, logging, and other features to provide a worry-free and organized
working environment. As the scripts in this chapter become more complex, this
point should become clear.

• Testing and Branching
• Signal Handling

Testing and Branching


One of the main features of any programming language is decision making. When writing
a program there will be a time when the program must examine some data and determine
by preprogrammed conditions what to do. Some decisions that might be encountered in a
shell script are: if an end of file (EOF) marker is reached, proceed to the next file; if the
file is write protected, display an error message to the user; or if a number n is greater
than another number m, terminate the script. Each of these statements has the form if [some
condition] then [do something], which is the general form of a branch statement in the C
shell. The shell will examine the condition and decide what to do based on the result of
the examination, or test. The actual form of the if statement is as follows:
if (condition_1) then
command list
else if (condition_2) then
command list
else if (condition_3) then
command list
.
.
.
else if (condition_n) then
command list
else
default command list
endif
When a test of a condition returns a TRUE value (non-zero), the commands listed
immediately following the condition are executed and the construct is exited, with
command execution resuming after the endif statement. If a test returns a FALSE value
(zero), the next else if or else (if either is present) is executed. The else statement
contains the last command list and acts as a default situation - if all else fails, execute
these commands. The else statements are optional and may be omitted if a single branch
test is all that is required. The minimum if test would then be if (condition) then command list
endif. The endif is required any time an if statement is used. Every time a command is
executed, an integer value is returned depending on the success of the command
execution. If the command executes as desired without incident, a zero is returned. For
example, the grep command can be used to see if a user is logged onto the system by
piping the output of a who command to it, such as
% who | grep jblow
jblow tty1 Dec 22 2:42
% echo $status
0
The command demonstrated that jblow was logged onto the system, but more
importantly, so did the return value of zero. Had jblow not been on the system, the
who | grep pipeline would have returned a non-zero value, as the search would have
failed. This fact gives a new way of testing command success in shell scripts. The above
commands could have been placed in a C shell script like the following:
#!/bin/csh
#
# usron checks to see if the userid given as an argument
# is logged on to the system
#
# Usage: usron id_name
#
who | grep $1 > /dev/null
if (! $status) then
echo "$1 is logged on"
else
echo "$1 is not logged on"
endif
This script may look a bit more complicated than the last example, but there is not much
new here. The test checks the return value of the pipeline and returns a string depending
on which case is true. The $status holds the return value of the last executed command.
It is the same as the $? in the Bourne shell. There is a subtle, yet important, detail to
notice here. The if condition will execute the command list if the expression
inside the parentheses has a non-zero value. The $status variable returns a zero if the last
command executed properly (or in the case of the grep command, returned a match), and
a one otherwise. While this may well seem backwards relative to the if condition
evaluation, it can be easily handled by placing a (!) before the test. The $1 refers to the
first argument given on the command line after the script name which allows the user to
use the script to check on any user without having to alter the script. The only new idea in
this script is the redirection of the output of the pipeline to /dev/null. The null device
(/dev/null) is really just a place to dump unwanted output. Anything that is sent to the
null device vanishes without a trace. It is important to realize that nothing can be
recovered from this device so care must be taken when redirecting anything there. If the
output of the pipeline had not been sent to the null device, it would have been sent by
default to stdout (the screen), making for ugly output. When designing and testing a
script it is often useful to send the output either to the screen or to a file for examination
of any problems that might arise.
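As an aside, the C shell's >& form redirects stdout and stderr together, so the pipeline
above could be silenced completely (the user id is illustrative) with:
% who | grep jblow >& /dev/null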
The above example also illustrates what we consider to be good form in script
documentation. Documentation is an important aspect of any programming task, and
UNIX script programming is no exception. The example shows two important
documentation features, a description of the script and a line describing its usage such as
arguments and or options. Since this script was relatively short, further documentation is
not necessary, but more would likely be included in longer more complicated scripts.
When writing scripts to handle system related tasks, it is often necessary to have the
programs make decisions based on file status. For instance, a file might be checked to
determine if it is a file before sending it to a printer, or a directory might be checked for
set write-access bits. The shell provides a method for testing file attributes. If one of the
flags (or operators) in table 4.6 is placed before a filename in an if construct test, the
resulting boolean truth value determines the direction of the branch.

Table 4.6: Operators that test file attributes.

-d     the file is a directory
-e     the file exists
-f     the file is a plain file
-o     the user owns the file
-r     the user has read access
-w     the user has write access
-x     the user has execute access
-z     the file is zero size (empty)


For example, if a.out was executable:
% if (-x a.out) echo "Yes"
Yes
would be the result of a test of executability. Of course tests such as these are much more
useful in a script since any of the information attained in this manner from a command
line could have been acquired more easily from a long listing (ls -l). The negation
operator (!) can be used to negate the truth value of any test if it is placed before the test
operator. Logical AND (&&) and OR (||) operators can also be used to make compound
condition statements such as
#!/bin/csh
#
# prok checks to see if filename is a file and user readable
# before attempting to print it
#
# Usage: prok filename
#
if (-f $1 && -r $1) then
lpr -Php_printer $1
else
echo "Error:: $1 cannot be printed"
endif
Comparing string patterns will often be a required task when shell programming in the
UNIX environment. The grep command is often useful when dealing with strings, but
string comparisons can be built into the scripts themselves. A typical string comparison
would look something like
if ("APPLE" == "ORANGE") then
echo Match
else
echo No Match
endif
which would result in the string No Match being echoed to the screen. There are four
comparators for checking string equivalence in the C shell. The == and =~ operators
check for equivalence, while the != and !~ operators check for non-equivalence. While
all four can be used on strings, the =~ and !~ comparators match the string on the left
against a filename-substitution pattern on the right. This makes it possible to use a simple if
statement rather than the switch statement for pattern matches.
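A brief sketch of such a pattern match (the variable FILE and the pattern are illustrative):
if ($FILE =~ *.tex) then
echo "$FILE looks like a TeX file"
endif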
The C shell goes much further than the Bourne shell in its handling of arithmetic
operations. Arithmetic in the C shell is quite easy to implement. To give an integer value
to a variable, the same procedure described in the variable section is used. For example,
set X=10 would put the integer value 10 in the variable named X. To square X and place
the value in a variable called SQR, the following expressions would be used:
% set X=10
% @ SQR = $X * $X
% echo $SQR
100
The @ operator tells the C shell that what is about to follow is an arithmetic expression.
Care must be taken to use whitespace when writing arithmetic expressions to prevent
complaints from the shell. Table 4.7 is a list of the arithmetic operators (including
comparisons) provided by the C shell:

Table 4.7: Comparison operators in the C shell.

==    equal to
!=    not equal to
=~    matches pattern
!~    does not match pattern
<     less than
>     greater than
<=    less than or equal to
>=    greater than or equal to
&&    logical AND
||    logical OR
!     logical NOT


When evaluating arithmetic expressions, the C shell uses an order of precedence. This
means that A*B+C will have a different meaning than A*(B+C), and the user will have to
know what he wants to ensure that the proper parentheses are put into place.
Like the C programming language, the C shell provides a wide variety of assignment
operators to simplify expressions. For example, adding 10 to the variable SUM would
normally look like this
@ SUM = $SUM + 10
but it could also be written as follows
@ SUM += 10
which is more to the point, although it requires a bit of getting used to for people who are
not used to the C programming language. A complete list of arithmetic assignment
operators is contained in table 4.8.

Table 4.8: Arithmetic operators in the C shell.

=     assign
+=    add and assign
-=    subtract and assign
*=    multiply and assign
/=    divide and assign
%=    take modulus and assign
++    increment by one
--    decrement by one
&     bitwise AND
|     bitwise OR
^     bitwise XOR
~     bitwise complement
<<    shift left
>>    shift right


The bitwise operators above are special operators that allow operations to be carried out
on bit patterns (ones and zeros). For example, the decimal number 7 can be written as the
binary number 0111. The bitwise shift operators do just that: they shift the bit pattern in
one direction or the other. 12 >> 2 shifts the bit pattern for 12 two places to the right,
resulting in the decimal number 3. The actual binary shift would look like

12        1100
12 >> 2   0011   (decimal 3)

and 3 << 2 would shift it back to 12 again. A bitwise shift to the right can be looked at as
dividing a number by 2 for each place shifted, while a bitwise shift to the left can be
viewed as multiplying it by 2 for each place shifted.
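These shifts can be verified directly at the prompt with the @ operator introduced earlier
(the variable name X is arbitrary; the parentheses keep the shift operators from being read
as redirections):
% @ X = ( 12 >> 2 )
% echo $X
3
% @ X = ( $X << 2 )
% echo $X
12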
The bitwise AND operation compares each bit (in terms of its place) in two patterns and
results in a 1 where there was a 1 in both, and a 0 otherwise. 5 & 12 would result in 4:

5        0101
12       1100
5 & 12   0100   (decimal 4)
The bitwise OR operation compares bits in the same fashion as the AND, but results in a
0 only if neither pattern has a 1 in that place. 5 | 7 would result in 7:

5       0101
7       0111
5 | 7   0111   (decimal 7)
The bitwise complement (~) simply reverses the bit pattern. Within the four-bit patterns
used here, ~7 would give 8:

7    0111
~7   1000
While these would not likely come up often in day-to-day shell programming, they are
worth a mention.
The if:then:else construct can be repeatedly nested to allow as many tests as are required
for a particular situation, but this can become very convoluted. A more appropriate
construct for dealing with many choices is the switch statement. The switch statement
allows many choices to be examined for a particular situation, in a more structured
manner than the nested if statement. The format of the switch statement is as follows:
switch (test_pattern)
case pattern1:
command_list1
breaksw
case pattern2:
command_list2
breaksw
.
.
.
case patternN:
command_listN
breaksw
default:
command_list
endsw
which would check for a match between test_pattern and patterns 1 through N,
executing the command list for the first match. If no match is found through the Nth
pattern, the command list following the default label will be executed. The reader will
notice that there is a breaksw following every command list, with the exception of the list
following the default case. The reason for the breaksw is that once a match is made and
execution of the corresponding command list has completed, execution would otherwise
fall through into the command lists of the successive case statements. A breaksw placed
at the end of each case statement passes execution to the point after the endsw statement.
Since the default case contains the last command list which could be executed within the
switch construct, no breaksw is required.
The following example would prompt the user prior to deleting a file:
echo "Do you wish to delete $file ?"
set answer=$<
switch ($answer)
case [yY]*:
echo "Removing $file"
rm $file
breaksw
case [nN]*:
echo "$file was not removed"
breaksw
default:
echo "Didn't understand response :: Aborting operation"
exit 1
endsw

This example could have been done relatively easily with nested if statements, but if
there were even one or two more choices the if:then:else construct begins to lose its
appeal.

Loop Control

Along with control branching, loop control gives programs their power. Branching gives
a program the ability to determine the direction program execution will take, and looping
allows the program to repeatedly execute one or more commands until some condition is
or is not met. This is, after all, what gives computers their power. The C shell provides
two tools for controlling program looping, the while construct and the foreach construct.
A while loop can best be explained in plain English as follows: while a certain condition
is met, a group of commands will be repeatedly executed. As soon as the condition is not
met, the execution ceases. The actual form of a while statement is:

while (condition)
command_list
end
If the condition is never met, the command list is never executed, while on the other hand
if the condition is always met, the execution will never stop. For example, the while loop
in:
set A=1 B=10
while ($B < $A)
echo $A
@ B -= 1
end
will never execute, while in:
set A=1 B=2
while ($B > $A)
echo $B
@ B += 1
end
will execute forever (not really).
The while loop can be used for many things, but one very useful purpose is to handle
arguments given on the command line issuing the script. The following script fragment
will echo back to the screen the arguments given:
while ($#argv)
echo $argv[1]
shift
end
The shift statement works to move the elements of the array (argv in this case - the
default) one place to the left. The form of the shift statement is:
shift [variable]
The variable $var[2] becomes $var[1], $var[3] becomes $var[2], et cetera. The
original $var[1] is destroyed (or at least made unusable). An error will result if
the variable given is unset or has a NULL value.
The foreach construct is quite different from the while construct in that no actual
expression is evaluated; the foreach loop's execution is determined instead by the number
of elements in a list. The form of a foreach construct is as follows:
foreach variable (list)
command_list
end

Each element of the list is removed from the list and placed in variable during execution
of the command list. When no elements remain in the list, the execution of the command
list ceases. This construct can also be used to manipulate arguments given at the command
line that initiates the script. The following script fragment will, as in the above example,
print out a list of the arguments given:

foreach ARG ($argv)
echo $ARG
end
This version is shorter and probably a bit clearer than the first, but each user will have his
own preference when it comes to looping constructs for this type of situation. The
foreach list can also contain a list of words such as:
foreach color (red green blue yellow)
something_interesting
end
This is a situation where the foreach construct is the only way to proceed. Likewise,
there are situations where only while statements will provide the control required, such as
when the loop condition depends on arithmetic expressions resulting from the execution
of the command list.
There is really only one more loop control mechanism provided by the C shell: the
overused quick fix, the goto statement. Anyone who has programmed in BASIC is well
aware of the goto statement, as well as how quickly it can add bugs to a piece of code.
Regardless of its downside, the goto statement is provided for use in the C shell scripting
language. The goto statement has the form:
goto LABEL
where LABEL is a string placed within the script, followed by a colon, at the point where
execution is to resume. The label cannot, however, reside within a loop or branch
construct, and any attempt to place one there will result in an error. Almost any code
where a goto is used can be rewritten using the control constructs described above. All
that is required is a little bit of forethought and imagination. With the tools outlined in
this and earlier sections, complex scripts can be constructed, as will be seen shortly.
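As a hedged sketch of label placement (the script, label name and printer command are
hypothetical):
#!/bin/csh
# print the file named by the first argument, or complain
if (! -f $1) goto NOFILE
lpr $1
exit 0
NOFILE:
echo "$1 is not a plain file"
exit 1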

Signal Handling
The C shell is not quite as sophisticated at handling signal interruptions during script
execution as the Bourne shell. The C shell can handle signals in a very general way with
the onintr command. By general it is meant that all signals are essentially handled the
same way, which is unlike the way that trap works in the Bourne shell. There are three
ways in which to use the onintr command within a script. The first of these is simply
onintr
which will cause the script to terminate on any signal. The second use of onintr
onintr -
works in the opposite manner by ignoring all signals, which might be used to allow a
script to clean up any files written during execution. The third and last use of the onintr
command is
onintr LABEL
which sends the script to LABEL to continue execution of commands after that point.
Again, this could be used to remove any temporary files or write out any last minute
information to a file. The execution of commands after LABEL will continue until either
an exit command or the end of the script, so an exit command should be placed after the
desired commands.
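A short sketch of this third form (the label and file names are illustrative):
#!/bin/csh
onintr CLEANUP
# main work: sort into a temporary file, then replace the original
sort big.file > /tmp/work.$$
mv /tmp/work.$$ big.file
exit 0
CLEANUP:
# reached only on an interrupt
rm -f /tmp/work.$$
exit 1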
The C shell has another way of handling the signal interruptions caused by turning off a
terminal or disconnecting the line to a remote terminal. The nohup command (no
hangup) causes a command to continue execution after a hangup. Used on a command
line, nohup has the following form:
nohup COMMAND
The C shell provides an automatic nohup in the sense that any commands or programs
run in the background using the ampersand (&) (see next section) will not be killed by a
hangup signal (HUP). In a shell script, nohup will cause the remainder of the script to run
to completion, ignoring the HUP signal.
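For example (command and file names are illustrative), a long compile can be protected
from hangups and placed in the background in one line:
% nohup gcc -g -o myprog myprog.c >& build.log &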

Job Control
In the UNIX world, a job refers to a command or group of commands sent to the shell for
execution. The job exists so long as the shell is either executing it or keeping track of its
status. A job will be in one of two places after being given to the shell: the foreground or
the background. If a command is entered on the command line like the following:
% gcc -g -o myprog myprog.c
the user will be left to watch the compile take place without being able to use the terminal
(of course with xterms this isn't really a problem). This command was executed in the
foreground. For most commands, like cd or ls, this is not a real inconvenience, but many
tasks can take a great deal of time and leave the terminal unusable until they have
completed (or at least terminated). A job executed in the background will not use the
terminal; it will simply execute out of sight. The simplest way to put a job in the
background is to place an ampersand (&) after the command on the command line:
% find / -name june_report.ps -print >& find.log &
Jobs can be placed in the background in a couple of other ways as well. A job can be
placed in the background using the bg command followed by the name of the job. If a job
is started in the usual fashion in the foreground, it can be stopped using the Control Z
(^Z) sequence, and then moved to the background to continue execution using the bg
command. The following sample session fragment demonstrates moving a grep job into
the background in this fashion:
% grep "normb" one_huge_log.file >& grep.info
^Z
Child exited
% jobs
[1] Running xclock -d -update 1
[2] + Child exited grep "normb" one_huge_log.file
% bg %2
%
This example warrants a few explanations. First, jobs is a C shell command that gives a
list of background jobs. The number in brackets is the ID of the job and the plus sign
beside the grep job indicates it is a current job (a minus sign would indicate a previous
job). Second, jobs are referred to by their ID number preceded by a percent sign. The
references to %2 refer to the grep job. The first job [1] is just an xclock that had been
started in the background earlier in the session. It can also be seen that the shell issues a
notice that a child process (of the interactive shell) has been exited or stopped. It is also
important to keep in mind that both the grep and find commands used in the previous
examples send their output to stdout by default and so if the output is not redirected to a
file (or device) it will echo back to the terminal, defeating the purpose of putting the job
in the background. In both examples the stderr has also been redirected. Some situations
may arise when the user will want stderr to remain directed to the screen to monitor
errors that would otherwise go unnoticed until analysis of the log file or other output.
This will especially be the case when compiling large and complicated programs or
packages. To stop a job in the background, the stop command followed by the job ID
number can be used.
It may be the case that the user would like to bring a background job to the foreground.
An example of this is emacs. In emacs, the ^Z character will stop emacs and place it in the
background in its stopped state. To resume work in emacs it must be brought to the
foreground. This is done with the fg command. By itself, fg will bring the job most
recently placed in the background to the foreground; usually, though, fg will be given a
job ID as an argument. To kill a background job (i.e. terminate it), the kill command can
be used with the -9 (SIGKILL) option, referring to the job in the same basic manner as
the previous examples. The following script presents a user friendly interactive prompting
environment for killing processes:
#!/bin/csh
#
# BSD Version (SunOS)
# Interactive process killer
#
# Usage: killp [pattern]

unset notify
if ($#argv > 0) then
    set joblist="`ps | grep $argv[1] | grep -v grep`"
    set counter=1
else
    set joblist="`ps`"
    set counter=2
endif

while ($counter <= $#joblist)
    set temp=($joblist[$counter])
    set proc_id=$temp[1]
    set proc_name=($temp[5-])
    @ counter++
    if ($proc_id == $$) then
        continue
    endif
    echo "Kill $proc_name ? [y|n|q]"
    set response=$<
    switch ($response)
    case y:
        kill -9 $proc_id >/dev/null
        breaksw
    case n:
        breaksw
    case q:
        exit
    default:
        echo "Response not understood"
        @ counter--
    endsw
end
Rather than refer to the jobs by their job ID, the script uses the UNIX ps command to get
their process IDs and kills them using those. The reason for using ps rather than jobs is to
provide a more general script that works in multiterminal (such as X11) environments.
The jobs command will only present jobs placed in the background from the current
terminal. The ps command will show all processes whether they are in the background or
foreground on any terminal, so long as the owner of the process is the current user. The
above script will behave noticeably differently depending on the setting of
the notify shell variable. If notify is set, the script will immediately echo back to
the terminal the message that a process has been killed whenever one is chosen to
be killed. This does not affect the overall performance of the script, but it will make for a
confusing looking screen, with prompts for the user mixed with messages from the shell.
The only way to rectify this problem is to unset the notify variable. Unsetting this
variable will make the shell wait until the next command line prompt before echoing job
related messages.
One last point should be made: referring to jobs is not restricted to the %n format
described above. Table 4.9 contains the job referencing operators provided by the C
shell:

Table 4.9: Job referencing operators in the C shell.

%n         the job with ID number n
%string    the job whose command begins with string
%?string   the job whose command contains string
%% or %+   the current job
%-         the previous job

Special Files
An appropriate way to end this chapter is with a discussion of the special files recognized
by the shell (often referred to as dot files because of their leading dots). One of these files,
the .history, was covered in the section on command history. It holds the last specified
number of commands executed. The C shell has three more files that it recognizes, the
.login, the .cshrc, and the .logout files, which are all found in the user's home directory
(~/). The .login is read by the shell at the start of a session, or more precisely, at login.
This file should contain commands that are only to be executed at login, perhaps running
a welcome message or the date. Any environment variables, such as the path, should be
set here as well. An exception to the rule of the .login file only executing at login can
occur in X windows when using xterm. If an xterm is started with the -ls flag, the
resulting terminal window will be a login shell. This seems a harmless enough situation;
there is no real problem with redefining variables with the same value, or even with
displaying a welcome message in each terminal window (if one is displayed in the .login
file). Some subtle problems can arise, however. One problem that comes to mind is the
following. On most systems, the path will be set to some basic value (such as
/bin:/usr/bin:/usr/local/bin, etc.) so that if a proper path has not been set by the user, basic
commands like vi will still work. Most users define their path in their .login script as:
set path = ( $path /usr/public/bin ~/bin . )
Now clearly what will happen if the .login script executes on each invocation of a shell is
that the path will continue to grow. This is not a pretty thing to see when echoing the
$path variable, but even worse, it can begin to affect the performance of the shell.
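One hedged workaround (the guard variable name PATH_SET is arbitrary) is to mark the
first pass with an environment variable, which later login shells will inherit:
if (! $?PATH_SET) then
set path = ( $path /usr/public/bin ~/bin . )
setenv PATH_SET yes
endif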
The .cshrc file is executed each time a new shell is invoked and thus any variables
wanted for each shell should be defined here. Even though a global definition can be set
in the .login file, the definition can be changed at any time and it is always wise to define
important variables, like noclobber, in the .cshrc file. It is possible to define the path
variable in the .cshrc file, but if any additions have been appended to the path in any
other shell, they will be overridden within the current shell. Setting the path there does
guarantee that it can never be left unset, but the overriding just described is the more
common outcome. It is up to each user how and where the path will be set. The .cshrc is
an ideal place to define aliases.
The .logout file is a file not used by many. Most tasks are handled nicely by the shell
during logout, and thus it is not necessary to handle them in the .logout script. There are a
couple of good applications for the .logout file, however. The first is for handling a safe
deletion system. A user can make a directory called ~/tmp.dump and then define a
command called del with the following script:
foreach file ( $argv )
mv $file ~/tmp.dump
end
Instead of the files being deleted, they are stored in the tmp.dump directory, and
undeleting is as simple as copying them back from this directory later. Since files will
accumulate and start to dig into the user's disk quota, they should be removed (officially
deleted) at some point. The .logout script can be the ideal method of handling this task.
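In the simplest case (using the directory name from the example above), a single line in
the .logout will do:
# empty the safe-deletion holding area at logout
rm -rf ~/tmp.dump/*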
The directory can either be cleaned completely at each logout, or a more sophisticated
script can be written to discard files only after they have been in the directory a certain
period of time. There is no reason that external scripts cannot be run from the .logout (or
any DOT file for that matter) using the source command. Another reason to use the .logout file
comes from personal experience. There exists a program called plan, which is a date
planner with built-in alarms. In order to keep track of scheduled alarms, a daemon
process is initiated when X windows is started. When X is exited and the shell is exited,
this special process does not die. Fortunately, the package comes with a special program
to kill the process, which, when added to the .logout, produces the desired result of
killing the daemon at logout.

Introduction to UNIX Security


With the discussion of DOT files should come a few notes about security. When the ls -l
(long listing) command is given, the security settings, so to speak, are visible for each
file. For example, giving the ls -l command in a user's home directory might give output
similar to the following:
% ls -l
total 347
-rwxr--r-- 1 author users 2540 Jan 5 20:30 appenda1.tex.gz
-rwxr--r-- 1 author users 67043 Jan 2 20:53 bourne.tex
-rwxr--r-- 1 author users 86011 Feb 23 19:33 cshell.tex
-rwxr--r-- 1 author users 82476 Feb 23 18:46 cshell.tex~
-rwxr--r-- 1 author users 11142 Jan 3 12:58 introduction.tex
drwx------ 2 author users 1024 Jan 12 21:31 mail
drwxr-xr-x 2 author users 1024 Feb 12 21:04 scripts
drwxr-xr-x 2 author users 1024 Jan 14 18:50 bin
The string on the left side of each listing gives the security attributes as well as the file
type attribute. The above example shows that there are three directories (denoted by the d
in the first place of the attribute string) and five regular files (denoted by dashes, -). The
remaining nine characters in the string represent the security attributes (or more correctly,
permissions) for each file and directory. The directory settings will be left for discussion
in a more appropriate text, as they are not as straightforward as the regular file
permission settings. Each permission string contains nine characters, which are actually
three sets of three related settings, broken down as follows:
USER GROUP OTHER
--- --- ---
rwx rwx rwx
The three sets are the USER set, which corresponds to the user who owns the file, the
GROUP set, which corresponds to group ownership of the file, and the OTHER set which
corresponds to any other user on the system. Displaying the files in a directory with the
long listing command ls -l (ls -lg on SunOS) the user and group owners will be displayed.
In the above example, the user author owns all of the files and the files are all owned by
the users group. Each of the three sets has a read (r), write (w), and execute (x) setting.
If a dash (-) is present in the string it means that permission for that action is not given.
In the example all of the files are readable by everyone, but only the owner, author, can
write or execute (which means nothing in this case) the files. To change the owner or
group ownership of a file (or directory), the UNIX commands chown and chgrp can be
used. More importantly, to change the permissions, the chmod command can be used.
This command takes the form:
% chmod [options] filename
For the particular options, the man pages can be examined. For example, adding write
permission for all users in the group users to the TeX files would look like this:
% chmod g+w *.tex
This would not necessarily be a good idea as now any user in the users group could alter
or even delete the TeX files. The point of this book is clearly not UNIX security, but the
point of these permissions becomes of great importance with DOT files used by the shell.
If the permissions were set such that any user could write to the .login or .cshrc file, they
could alter it in any way they wanted. If this user wanted to cause damage (say, delete all
of the files in this user's directory), all they would have to do is add the line
rm -rf *
While most users are clearly not that malicious, some are. Another point to keep in mind
is that if group ownership is given to a file, group members can also be given write access
to a DOT file. There is no situation where another person needs to have write access to
another user's DOT files, and thus dot files should have permission settings similar to the
following:
-rwxr--r-- 1 jblow users 86011 Feb 23 19:33 .cshrc
-rwxr--r-- 1 jblow users 86011 Feb 23 19:33 .login
-rwxr--r-- 1 jblow users 86011 Feb 23 19:33 .logout
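Settings like these can be arranged in a single command (a hedged one-liner):
% chmod go-w .cshrc .login .logout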
This will prevent anything bad from happening (at least due to unsafe permission
settings).
