
Introduction To Scripting

(Emphasis On BASH Scripting)

Written by
Enterprise Linux Professionals, LLC
http://www.enterpriselinux.pro

Enterprise Linux Professionals, LLC © 2016


Table of Contents
1 Overview ...............................................................................................................................................10
2 History ...................................................................................................................................................11
2.1 Punch Cards ...................................................................................................................................12
2.2 Integrated Circuits .........................................................................................................................13
2.3 Stored Procedures ..........................................................................................................................14
2.4 Applications ...................................................................................................................................14
2.5 The Bourne Shell (Bell Labs) ........................................................................................................15
2.6 BASH (Free Software Foundation) ...............................................................................................15
2.7 Linux (The Linux Foundation) ......................................................................................................15
3 Command Line Interpreters .................................................................................................................16
4 Shells .....................................................................................................................................................16
5 Login Shells ..........................................................................................................................................17
6 UNIX and Linux File Systems ..............................................................................................................18
6.1 Hidden Files ..................................................................................................................................19
6.2 Filename Wildcards .......................................................................................................................19
6.3 Globbing .........................................................................................................................................19
7 Scripting Basics .....................................................................................................................................21
7.1 Input / Output (I/O) Redirection .....................................................................................................21
7.2 Input Redirection ............................................................................................................................21
7.3 Output Redirection ........................................................................................................................21
8 Scripts ....................................................................................................................................................23
8.1 “#!” (The Magic Cookie) ..............................................................................................................23
8.2 echo ...............................................................................................................................................23
8.3 “Hello, World!” .............................................................................................................................23
9 Basic Commands ...................................................................................................................................24
9.1 ls ....................................................................................................................................................24
9.2 cd ...................................................................................................................................................24
9.3 pwd ................................................................................................................................................24
9.4 cat ..................................................................................................................................................24
9.5 date ................................................................................................................................................24
9.6 man ................................................................................................................................................25
9.7 rm ...................................................................................................................................................26
9.8 touch ..............................................................................................................................................26
9.9 test .................................................................................................................................................26
9.10 which ..........................................................................................................................................26
10 Automation ..........................................................................................................................................27
10.1 Commands versus Scripts ...........................................................................................................27
10.2 Cron (Scheduled Tasks) ...............................................................................................................27
11 Operations ...........................................................................................................................................28
11.1 File Manipulation .......................................................................................................................28
11.2 Printing Output ...........................................................................................................................28
11.3 Running Programs ......................................................................................................................28
12Input.....................................................................................................................................................29
12.1“<” (Standard Input).....................................................................................................................29

12.2 Terminals .....................................................................................................................................29
12.3 Reading from Files .....................................................................................................................29
13 Output ..................................................................................................................................................30
13.1 “>” (Standard Output) ..................................................................................................................30
13.2 Displaying Values ........................................................................................................................30
13.3 Writing to Files ..........................................................................................................................30
14 Logging ...............................................................................................................................................31
14.1 “Permission denied” ...................................................................................................................31
14.2 “No such file or directory” .........................................................................................................31
14.3 /dev/null (also known as the Bit Bucket) ...................................................................................31
15 Exit Status ............................................................................................................................................32
15.1 “$?” .............................................................................................................................................32
15.2 Success = 0 .................................................................................................................................32
15.3 Failure ≠ 0 ..................................................................................................................................32
16 Important Variables .............................................................................................................................33
16.1 Echo “\“$VARIABLE\”” ............................................................................................................33
16.2 PATH / MANPATH ....................................................................................................................33
16.3 EDITOR / VISUAL ....................................................................................................................33
16.4 PS1 ..............................................................................................................................................33
16.5 PS2 ..............................................................................................................................................33
16.6 PS3 ..............................................................................................................................................33
16.7 PS4 ..............................................................................................................................................34
16.8 HOME ........................................................................................................................................34
16.9 HOSTNAME ..............................................................................................................................34
16.10 HISTSIZE .................................................................................................................................34
16.11 SHELL ......................................................................................................................................34
16.12 LS_COLORS ............................................................................................................................34
17 Command Substitution ........................................................................................................................35
17.1 $(Command) ...............................................................................................................................35
17.2 `Command` .................................................................................................................................35
18 Login Shell ..........................................................................................................................................35
18.1 /etc/passwd .................................................................................................................................35
19 Login Scripts .......................................................................................................................................35
19.1 .bashrc .........................................................................................................................................35
19.2 .bash_profile ...............................................................................................................................35
20 Logout Scripts .....................................................................................................................................36
20.1 .bash_logout ...............................................................................................................................36
21 Aliases .................................................................................................................................................36
21.1 “When I… Do…” ......................................................................................................................36
22 export ..................................................................................................................................................36
22.1 Making Variables Accessible ......................................................................................................36
23 Executable ...........................................................................................................................................37
23.1 Extensions Not Required ............................................................................................................37
23.2 Execute Permissions ...................................................................................................................37
24 The “bash” Command ........................................................................................................................37
25 Bash Syntax .........................................................................................................................................38

25.1 Commonly Used “.sh” Extension ................................................................................................38
25.2 “#!/bin/bash” ..............................................................................................................................38
26 Comments ...........................................................................................................................................38
26.1 Anything After “#” Is Not Executed ...........................................................................................38
27 Quotes .................................................................................................................................................38
27.1 Quote Symmetry .........................................................................................................................39
27.2 Single Quotes .............................................................................................................................39
27.3 Double Quotes ............................................................................................................................39
27.4 Back Quotes ...............................................................................................................................39
27.5 Apostrophe versus Tick Mark .....................................................................................................39
28 Interactive Scripts ...............................................................................................................................40
28.1 read -p “Prompt” VAR1 VAR2 …...............................................................................................40
29 Positional Parameters ..........................................................................................................................41
29.1 $0, $1, $2, etc. ............................................................................................................................41
29.2 Pass Arguments When Script is Executed ..................................................................................41
29.3 “$*” (All Parameters) .................................................................................................................41
29.4 “$@” (All Parameters) ...............................................................................................................41
29.5 “$#” (Number of Parameters) .....................................................................................................41
29.6 “$$” (Process ID (PID) for this Shell) ........................................................................................42
30 Setting (Declaring), Referencing and Unsetting Variables .................................................................42
30.1 VAR=value .................................................................................................................................42
30.2 Referencing Variables ................................................................................................................42
30.3 unset ............................................................................................................................................43
31 Variable Scope .....................................................................................................................................43
31.1 Global versus Local Variables ....................................................................................................43
31.2 When Variables Apply ................................................................................................................43
32 sort Command ....................................................................................................................................43
32.1 Organizing Output ......................................................................................................................43
32.2 Consolidating All Duplicates ......................................................................................................43
33 cut Command ......................................................................................................................................44
33.1 Isolating Output ..........................................................................................................................44
34 uniq Command ....................................................................................................................................44
34.1 Ignoring Contiguous Duplicates .................................................................................................44
35 find Command .....................................................................................................................................44
36 Definitions ...........................................................................................................................................45
36.1 blank ............................................................................................................................................45
36.2 word ...........................................................................................................................................45
36.3 name ..........................................................................................................................................45
36.4 metacharacter ...............................................................................................................................45
36.5 control operator ...........................................................................................................................45
37 Reserved Words ...................................................................................................................................46
37.1 “!” ...............................................................................................................................................46
37.2 “( )” and “(( ))” and “{ }” .........................................................................................................46
37.3 “[ ]” and “[[ ]]” .........................................................................................................................46
37.4 “case” and “esac” .......................................................................................................................46
37.5 “do” and “done” .........................................................................................................................47

37.6 “elif” and “else” ..........................................................................................................................47
37.7 “for” ............................................................................................................................................47
37.8 “function” ...................................................................................................................................47
37.9 “if” and “fi” ................................................................................................................................47
37.10 “in” ...........................................................................................................................................47
37.11 “select” .....................................................................................................................................47
37.12 “then” ........................................................................................................................................48
37.13 “until” and “while” ...................................................................................................................48
37.14 “time” .......................................................................................................................................48
38 Command Delimiters ..........................................................................................................................49
38.1 “;” “&” or “<newline>” (Command Terminators) .....................................................................49
38.1.1 Semicolon ...........................................................................................................................49
38.1.2 Ampersand ..........................................................................................................................49
38.1.3 Newline ...............................................................................................................................49
38.2 “&&” and “||” (Command Separators) .......................................................................................49
38.2.1 “&&” (...if true…) ..............................................................................................................49
38.2.2 “||” (...or else…) ..................................................................................................................50
39Lists .....................................................................................................................................................50
40Comparisons ........................................................................................................................................51
40.1 Integer Test Operators.................................................................................................................51
40.1.1 Equal ...................................................................................................................................51
40.1.2 Not Equal ............................................................................................................................51
40.1.3 Greater Than .......................................................................................................................51
40.1.4 Less Than ............................................................................................................................51
40.1.5 Equal Or Greater Than ........................................................................................................51
40.1.6 Equal Or Less Than ............................................................................................................51
40.2 String Test Operators...................................................................................................................51
40.2.1 Equal to String ....................................................................................................................52
40.2.2 Not Equal to String ..............................................................................................................52
40.2.3 Equal to Pattern .....................................................................................................................52
40.2.4 Not Equal to Pattern .............................................................................................................52
40.2.5 Length of String is Not Zero ................................................................................................52
40.2.6 Length of String is Zero ........................................................................................................52
40.3 File Test Operators .......................................................................................................................52
40.4 Exists versus Does Not Exist ......................................................................................................54
41 Calculations .........................................................................................................................................54
42 Regular Expression (regex) .................................................................................................................55
42.1 Regular Expression Anchors ......................................................................................................55
42.2 search* ........................................................................................................................................56
42.3 search+ ........................................................................................................................................56
42.4 search? ........................................................................................................................................56
42.5 search ..........................................................................................................................................56
42.6 “[:alnum:]” ..................................................................................................................................56
43 expr ......................................................................................................................................................56
43.1 Basic Regular Expressions ..........................................................................................................56
44 sed .......................................................................................................................................................56

44.1 Basic Regular Expressions .........................................................................................................57
44.2 Text Manipulation ......................................................................................................................57
44.3Parameter Gotchas ......................................................................................................................57
45grep ......................................................................................................................................................57
45.1 Basic Regular Expressions .........................................................................................................57
45.2 Filtering ......................................................................................................................................57
45.3 Recursion ....................................................................................................................................57
46 egrep ....................................................................................................................................................58
46.1 Extended Regular Expressions ...................................................................................................58
46.2 Filtering ......................................................................................................................................58
46.3 Recursion.....................................................................................................................................58
47 awk ......................................................................................................................................................59
47.1 Extended Regular Expressions ...................................................................................................59
47.2 Print Field ...................................................................................................................................59
47.3 Pattern Matching ........................................................................................................................59
47.4 Keyword Matching .....................................................................................................................59
47.5 Passing Local Variables ..............................................................................................................59
48 Regular Expression Labs .....................................................................................................................61
48.1 Lab 1 ............................................................................................................................................61
48.2 Lab 2 .............................................................................................................................................61
48.3 Lab 3 .............................................................................................................................................61
49 Daemons ..............................................................................................................................................61
49.1 Non-interactive Processes ..........................................................................................................61
50 Services ...............................................................................................................................................62
50.1 Managed Programs .....................................................................................................................62
50.2 SystemV Init Scripts ...................................................................................................................62
50.3 SystemD Targets .........................................................................................................................62
51 Processes .............................................................................................................................................62
51.1 Active Programs .........................................................................................................................62
51.2 Running Processes ......................................................................................................................62
51.3 Sleeping Processes .....................................................................................................................62
51.4 Zombie Processes .......................................................................................................................62
51.5 Foreground Processes .................................................................................................................62
51.6 Background Processes ................................................................................................................63
52Jobs ......................................................................................................................................................63
52.1 Processes Attached to a Terminal ...............................................................................................63
52.2 jobs .............................................................................................................................................63
53“ps” (Process Status) ...........................................................................................................................64
53.1 Process Query .............................................................................................................................64
53.2 “ps -ef” (All Processes) ..............................................................................................................64
53.3 “ps -f -p $$” (The current process) .............................................................................................64
53.4“ps axo pid,ppid,%cpu,%mem,args”............................................................................................64
54Signals .................................................................................................................................................65
54.1 Sending Messages to Processes ..................................................................................................65
54.2 kill ...............................................................................................................................................65
55Loops ...................................................................................................................................................66
55.1 while and until Statements .........................................................................................................66
55.2 for Statements .............................................................................................................................66
55.3 Instantiation ................................................................................................................................66
56if Statements .......................................................................................................................................67
57case ......................................................................................................................................................68
57.1 Complex Conditions ...................................................................................................................68
58Solving Real-World Problems .............................................................................................................70
58.1 Determine Which Systems are Available on a Network .............................................................70
58.2 Speed Up Common Tasks ..........................................................................................................70
58.3 Create 100 Users ........................................................................................................................70
58.4 Get a List of Files Accessed in the Last Five Minutes ...............................................................70
58.5 Unexpire a User’s Account and force them to set a new password.............................................70
58.6 Generate a List of Files by Size and Owner................................................................................70
58.7 Generate a List of Home Directories by Size and Owner...........................................................71
59References ..........................................................................................................................................71
60Regular Expression Lab Answers .......................................................................................................72
60.1Lab 1 Answer ...............................................................................................................................72
60.2Lab 2 Answer ...............................................................................................................................73
60.3Lab 3 Answer................................................................................................................................74

Introduction to Shell Scripting

A good administrator can get almost anything accomplished on a system, but a great
administrator can get the system to accomplish almost anything automatically. It is on this
premise that we begin our journey into the world of scripting. The purpose of this course is to
convert the commands that you already run on GNU/Linux distributions (from this point on
referred to as “Linux”) to tasks that are run programmatically.

As any administrator who has recurring items on their “to do” list will tell you, automation will
make you more efficient. What you do with the time you save by scripting common commands is
up to you. You can spend it organizing your desk, replying to emails, helping colleagues with
their automation scripts, or just let everyday people think that you are extra smart and wonder
how you manage to finish your work and still leave with confidence every day at the exact same
time.

A suggestion we can make is that you use your newfound time to challenge yourself to take
your skills to the next level. We have told system administrators for years that they should learn a
programming language. We also recommend that developers familiarize themselves with the
systems their programs run on. The jobs of developers and administrators seem to be sharing
more common skills as time goes by.

Many organizations are deliberately taking steps to blur the lines between the two in an attempt
to become more efficient. The combination of Development and Operations (DevOps) is a good
thing. Organizations that embrace this philosophy are enjoying substantial benefits in
development and infrastructure costs. We recommend that you become a part of this
movement rather than part of the resistance.

The skills you will learn in this book will provide you more options when approaching
administrative challenges. Something you can look forward to is getting to the point where you
have a repository of code (whether it be your own or from the open source community) and
can quickly repurpose an old script to solve new problems.

1 Overview

One of the first things that should be acknowledged is that an introduction, such as this, will not
be able to comprehensively cover such a vast topic as “Scripting.” Instead, the aim of this
material is to provide an entry into the world of scripting along with a fundamental understanding
to get you started.

Scripting is the foundation of automation and takes on many different forms. Some scripts are
interactive and require input from a user to run properly. Other scripts are noninteractive and
may require that certain data be available in order to run successfully. Instruction sets within an
application are scripts which conditionally provide feedback or call other scripts within the
application.

Scripts can be written in any language that the system can interpret or compile. Languages that
are not compiled ahead of time are run by programs known as “interpreters.” The following
programs are examples of interpreters: bash, python, and perl.

Nearly all Linux distributions include bash, so we will start with that interpreter. Later in the course we
will show you several scripts written for different interpreters that do the exact same thing so that
you can recognize the difference between them and know what you are dealing with when you
see it.

2 History

Humans were “scripting” long before computers were among us. Consider all of the
abbreviations that we still use today. Much like scripts, abbreviations are used because they are
commonly understood, even though many people do not know what they abbreviate. When
referencing a year, people commonly use AD (Anno Domini, or “in the year of the Lord”), BC
(Before Christ), or BCE (Before the Common Era), but may not be sure what the letters stand
for. Consider some Latin abbreviations, such as “etc.” (et cetera) and “cf.” Most people will tell
you that “cf.” somehow stands for “cross reference,” which is close to the concept it represents,
but it really abbreviates “confer,” from the Latin “conferre” (“to bring together”), and is used to
mean “compare.” Thomas Jefferson was an early American scripter. He signed the Declaration
of Independence as “Th. Jefferson.”

Even though these are simple abbreviations they express something that was not
necessary to express in its complete form. The word “abbreviation” describes something
that has been shortened, and that is exactly what we hope to do with scripting. We plan
to take something that might take several minutes or hours to run through an interactive
shell and accomplish the same in seconds.

2.1 Punch Cards

Punch cards were first used in the textile industry. In 1805 Joseph Marie Jacquard
used punch cards to “program” textile machines with the pattern to use when creating
cloth. His invention saved a lot of time and money.

Charles Babbage is credited with designing the first general-purpose computing machine, and he
borrowed Jacquard's punch card design to supply it with preconfigured instructions.

Herman Hollerith improved upon previous designs to create a statistical analysis system and
found a perfect real-world application as a tabulator for the 1890 US Census. In 1911 Hollerith's
company merged with some strategic partners to create the Computing-Tabulating-Recording
Company. This company was renamed in 1924 to International Business Machines Corporation,
which you probably recognize as IBM.

Punch cards played a very important role in storing data and also in providing instructions for
machines. To give you an idea of how large the role of punch cards was, consider these
production numbers:

Year    Millions of Cards Produced Per Day
1914    2
1937    Between 5 and 10
1955    72.5

Punch Cards were commonly used until the mid 1980’s. They were eventually replaced by disk
storage. Disks today contain data, but they also contain the instructions for computers just like
punch cards did.

2.2 Integrated Circuits

While IBM was printing millions of punch cards every day, other organizations were doing their
best to innovate in other ways. When electricity became widely available, machines started using
electrical impulses as signals (instructions) and data. In 1904 John Fleming invented the
vacuum tube diode, which gave the system an on or off value. Eventually (in the 1940’s)
complex logic circuits were created using thousands of vacuum tubes, but the vacuum
tubes had a high failure rate and had to be replaced like light bulbs.

In 1947, Bell Labs invented the transistor. Transistors replaced vacuum tubes and could be soldered
to a circuit board for rapid development of advanced logic circuits. The problem still existed,
though, of connecting thousands of individual components in one machine. The solution to this
problem would prove to be the most important invention in modern times: the integrated circuit.

In 1958 and 1959, Jack Kilby of Texas Instruments and Dr. Robert Noyce of Fairchild Semiconductor
independently invented the integrated circuit (IC). This brought many advantages for the advancement of
computers and devices. The first is that the time needed to create circuits was dramatically
reduced, because the process could now be automated by printing circuits on silicon wafers
rather than soldering individual components on boards. The second advantage is that the size
of the circuits was reduced. Now we are able to embed chips in devices because of advances in
this technology. Eventually multiple integrated circuits were put into one chip. These IC chips
reduced latency between components and power consumption, thus further shortening every
transaction significantly.

In 1971, the microprocessor was invented. By combining several integrated circuits onto one
chip, processing power has continued to grow as Moore’s Law predicted. The law is often
quoted as saying that the number of transistors on a chip will double roughly every 18 months
to two years. Gordon Moore made that prediction in 1965; since 1971, microprocessors have
gone from 2,300 transistors per chip to over a billion.

Dr. Noyce and Gordon Moore decided to start their own company that would address the
memory and processing needs of mainframe computers. The company was Intel, and in 1971
Ted Hoff and other engineers there created a single-chip microprocessor, in place of a planned
set of 12 separate integrated circuits, for a Japanese client that used the chip in a calculator.

2.3 Stored Procedures

In the mid 1970’s, Donald Chamberlin and Raymond Boyce invented Structured Query
Language (SQL). Their work did a lot to allow retrieving data from relational databases.
Anybody who has worked with databases is probably familiar with basic SQL statements.

A stored procedure is a group of SQL statements saved under a single name. So, instead of running
every SQL statement separately, they can all be run together by calling the stored
procedure.

If you are familiar with this concept then you will easily be able to understand scripts, because a
script is simply a group of commands that get run together. The group of commands is identified
by the file they are written in, or the script.

2.4 Applications

Any application you run on a computer can be thought of as being made up of scripts. It may be
one gigantic script, but it is more likely made up of a lot of small scripts. It is not uncommon for
an application to start out as a small script and then grow into a larger project.

There are many sections of scripts that often lie dormant because a condition was not satisfied
for the code to execute. Applications contain many dormant scripts that are just waiting for some
kind of input. Unless the script is expected to run on a very small system or consume enormous
amounts of cycles, the scripter's primary concern is usually to make the script work. By stepping
back and looking at an application it is easy to see that the purpose of it is to simplify and
automate a series of tasks.

2.5 The Bourne Shell (Bell Labs)

In 1979, Stephen Bourne from Bell Labs released the Bourne shell, which became the default
command line interpreter for Unix. The command for this shell was simply “sh.” This shell
became a great interface between users and systems.

The original shell lacked all kinds of features, such as job control, aliases, command history, tab
completion, custom prompt, local variables, etc. This shell, however, gave a lot of scripting
abilities to administrators for the first time on a multiuser system.

2.6 BASH (Free Software Foundation)

When Richard Stallman's GNU Project, backed by the Free Software Foundation, set out to create
free software versions of most of the software in the UNIX operating system, it also needed a
replacement for the Bourne shell. Brian Fox wrote that shell for the project, and it was first
released in 1989. The name “bash” is an acronym for “Bourne-Again SHell.” The core syntax of
bash has changed relatively little since it was created in the 1980’s.

The Free Software Foundation was the organization that produced the GNU operating system.
GNU is a recursive acronym that means “GNU’s Not UNIX.” By the time The Free Software
Foundation was considering a new kernel for the GNU operating system there were free
alternatives to most software that was included in the Bell Labs’ paid version of UNIX.

2.7 Linux (The Linux Foundation)

The search for a new kernel for GNU ended when The Free Software Foundation found a
community contribution made by a student at the University of Helsinki named Linus Torvalds.
The minimalistic kernel came to be known as the Linux kernel. Although many people refer to
the GNU/Linux operating system (GNU with the Linux kernel) as “Linux,” most GNU/Linux
distributions are primarily made up of GNU software.

The Linux kernel supercharged the GNU operating system. The performance and stability of the
GNU/Linux distributions today can be largely attributed to the design of the Linux kernel. It is
upon this background that we will begin our adventure into scripting.

3 Command Line Interpreters

Although we most commonly use bash as a shell, there are many command line interpreters
available in every modern operating system. Each has its origin and strength in some
application or programming language. Prior to UNIX, command line interpreters were generally
part of the operating system. UNIX ran its shell as an ordinary program, separate from the
operating system kernel, which opened the door for other shells such as Stephen Bourne's.

4 Shells

The first alternative shell was the “C Shell,” which was created by Bill Joy at the University of
California, Berkeley for the Berkeley Software Distribution (BSD) of UNIX. Most commonly known as
“csh,” the C Shell has a syntax that resembles the C programming language.

Ken Greer improved on the C Shell to create tcsh. The “t” comes from TENEX, an operating
system whose DEC successor was nicknamed Twenex, from which tcsh borrowed its command
completion. This shell added command completion and many built-in history commands. tcsh was
the default shell in Mac OS X until version 10.3, and it was long the default root shell in FreeBSD
and derivative operating systems.

In 1986, AT&T released an unsupported version of the Korn Shell, written by David Korn. The
Korn Shell claims to be all that the bash shell is with additional customization and command line
editing abilities. AT&T open sourced the code for the Korn Shell in March of the year 2000.

5 Login Shells

The term “shell” is a generic reference to any text interface with the system. Login shells are text
interfaces that specifically relate to the logging in of a user. Login shells have the important
responsibility of customizing a user’s environment.

Every user is able to set their own local variables that may be different from the default values
or other variables of other users on the same system.

Users are also able to create aliases and run commands automatically as they log into a
system. A simple alias example is the following, which tells the system to recognize “c” as the
“clear” command: alias c="clear"

By placing this alias on its own line in the .bashrc file of a user’s home directory, the alias is
set automatically for each interactive bash session. (A bash login shell reads .bash_profile,
which on most distributions in turn reads .bashrc.)

6 UNIX and Linux File Systems

[Note: From here going forward in this document, when Linux is mentioned, assume it to include
UNIX and all UNIX and Linux derivatives unless otherwise specified.]

There is an old saying, “When all you have is a hammer, everything looks like a nail.” Linux
operates like a hammer, except that, instead of nails, to Linux everything looks like a file. So we will
spend a little time describing the Linux file system.

The Linux file system is a hierarchical file system composed of directories and files. Every file is a
member of exactly one directory and directories can also have sub-directories. So, similar to a tree,
there is a trunk with branches and leaves, with the directories being the branches and the files being the
leaves. At the root of the tree is the root directory, denoted by a slash (/). The slash is also used as the
delimiter between directories, forming a path to a file or directory. In /home/karl/bin, home is
a directory under the root directory (slash), karl is a directory under home, and bin is a directory
under karl.

Four terms that you will encounter related to the file system are “absolute path,” “relative path,”
“basename” and “dirname.” An absolute path always begins with a slash, indicating the root directory.
So /usr/share/man/man1/cat.1 is an absolute path. Contrast this with man/cat.1, which is a
relative path. This raises the question, “Relative to what?” A relative path is relative to the current
working directory. The current working directory is always an absolute path, so adding a relative path
to an absolute path yields another absolute path. A period (.) is a shortcut meaning the
current working directory, but only when it is a complete path component, that is, by itself or
between slashes. Thus, .bashrc does not mean the
current working directory followed by bashrc. Similarly, two successive periods (..) are a shortcut to
indicate the parent of the current working directory. When double periods are used in a path, one can
move around with (pardon the pun) relative ease. For example, if the current working directory is
/usr/share/zoneinfo/posix/US, referencing ../../../../lib/mozilla/extensions would get to
/usr/lib/mozilla/extensions.

Symbolically, each step below removes one directory name together with the double period that
follows it, until no more double periods remain:
/usr/share/zoneinfo/posix/US/../../../../lib/mozilla/extensions
/usr/share/zoneinfo/posix/../../../lib/mozilla/extensions
/usr/share/zoneinfo/../../lib/mozilla/extensions
/usr/share/../lib/mozilla/extensions
/usr/lib/mozilla/extensions
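
The resolution above can be checked in a scratch directory. The following is a minimal sketch, assuming a temporary directory is acceptable as a stand-in for the real root; the variable names sandbox and resolved are illustrative and are not part of the course material.

```shell
#!/bin/bash
# Build a throwaway copy of the directory layout used in the example.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/usr/share/zoneinfo/posix/US"
mkdir -p "$sandbox/usr/lib/mozilla/extensions"

# Start in the "US" directory, then follow the relative path.
cd "$sandbox/usr/share/zoneinfo/posix/US"
cd ../../../../lib/mozilla/extensions

# Strip the sandbox prefix to show where the relative path led.
resolved=${PWD#"$sandbox"}
echo "$resolved"
```

Each ../ climbs one level, so four of them lead from US up to usr before the path descends into lib.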
The basename is the part of a path to the right of the right-most slash, or the entire name if there are no
slashes in the path. The dirname (directory name) portion of a path is the path with the right-most slash
and the basename removed. If there are no slashes in the path, the dirname is a period, i.e., the current
working directory. The basename and dirname programs return the basename and directory
name of their arguments, respectively. Since filenames can contain whitespace (space, tab or
newline), parameters to these programs should be quoted. Note that a file does not need to exist for
basename and dirname to work; they merely operate on the supplied string and treat it as a path.

Multiple consecutive slashes are effectively one, e.g., //usr////share///lib is the same as
/usr/share/lib.

Embedded single periods are effectively the same as if they were not there, e.g., /usr/./share/././lib
is the same as /usr/share/lib.
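
As a brief illustration of basename and dirname, the sketch below runs both programs on strings only. The paths are examples and do not need to exist on the system; the variable names are hypothetical.

```shell
#!/bin/bash
path="/usr/share/man/man1/cat.1"

base=$(basename "$path")    # part after the right-most slash
dir=$(dirname "$path")      # path with the right-most slash and basename removed

# Quoting keeps a name containing spaces together as one argument.
spaced=$(basename "/tmp/no such dir/my file.txt")

# With no slash in the path, dirname reports the current directory.
nodir=$(dirname "cat.1")

echo "$base | $dir | $spaced | $nodir"
```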
Since filenames can begin with a hyphen (-), they can be confused with options on the command line.
Many commands now recognize two hyphens (--) as the end of options so what follows can begin with
a hyphen and not be considered as an option. However, if one has two options that take filenames and
both filenames begin with hyphens, the use of a double hyphen may not be sufficient. An easy way
around this is to use ./-filename in place of -filename since the hyphen is no longer the first
character of the field.
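
Both approaches to filenames that begin with a hyphen can be tried safely in a scratch directory. This is a sketch under the assumption that the installed cat and rm accept -- as the end of options, as the common GNU and BSD versions do; the file name -n is chosen only because it looks like an option.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"

# Create a file whose name looks like an option.
echo "some text" > ./-n

# Two hyphens mark the end of options, so -n is read as a filename.
first=$(cat -- -n)

# Prefixing ./ works even with commands that do not understand --.
second=$(cat ./-n)

rm -- -n
```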
One final note about path names: putting a trailing slash on a path does not guarantee that the path is or
will be a directory. Many commands effectively ignore trailing slashes on a path name.

6.1 Hidden Files

Hidden files are those files (and directories) that have a period as their first character, e.g., .bashrc. In
general, hidden files are ignored by most commands unless specifically mentioned or requested. For
example, the * wildcard does not match them, which prevents them from being accidentally removed by /bin/rm *.
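
A short experiment shows the effect. The sketch below assumes bash with its default settings (the dotglob option off) and uses two made-up file names in a scratch directory.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
touch .hidden visible

star=$(echo *)        # the shell expands * without the hidden file
dots=$(echo .h*)      # naming the leading period explicitly matches it

rm *                  # removes only "visible"
remaining=$(ls -A)    # -A lists hidden entries other than . and ..
echo "$remaining"
```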

6.2 Filename Wildcards

Filename wildcards are the question mark (?), the asterisk (*) and square brackets ([ ]) where characters
are enclosed in the brackets. The question mark says any single character is allowed in this position in
the filename. The asterisk matches zero or more characters in the filename. The square brackets
represent a single character in the filename. Individual characters may be specified or a range of
characters may be specified by putting a hyphen between the two ends of the sequence. An
exclamation point as the first character following the left bracket says match any single character except
those specified.

Some examples of filename wildcards:
*.o             All files that end with a .o suffix.
?????.o         All files with five characters in front of the period and a .o suffix.
*.*             All files containing at least one period.
ab*yz           All files that begin with “ab” and end with “yz”.
a[a-m][!n-z]    All files with three-character names beginning with “a”, followed by a letter from
                “a” through “m”, followed by a character not in the range “n” through “z”.
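
The patterns above can be tried against a handful of empty files. In this sketch the file names are invented for the demonstration, and LC_ALL=C is set as an assumption so that character ranges and the sort order of the results are predictable.

```shell
#!/bin/bash
LC_ALL=C
export LC_ALL

sandbox=$(mktemp -d)
cd "$sandbox"
touch hello.o main.o utils.o notes.txt README abyz abcxyz aa1 aab amz

objects=$(echo *.o)             # every name ending in .o
five=$(echo ?????.o)            # exactly five characters before .o
dotted=$(echo *.*)              # names containing a period
abyz=$(echo ab*yz)              # * may match zero characters
three=$(echo a[a-m][!n-z])      # amz is excluded by [!n-z]

printf '%s\n' "$objects" "$five" "$dotted" "$abyz" "$three"
```

Note that main.o is missing from the ?????.o result because only four characters precede its period.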

6.3 Globbing

Globbing is the process of giving the shell filename prototypes with unquoted wildcards; the shell
responds by providing all the filenames that match the prototypes with the wildcards resolved. If a
filename containing wildcards is quoted, the wildcard characters stand for themselves.
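
The difference between a quoted and an unquoted wildcard is easy to see with echo. This is a small sketch using two made-up file names in a scratch directory.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
touch one.txt two.txt

unquoted=$(echo *.txt)      # the shell replaces the pattern with matching names
quoted=$(echo "*.txt")      # quoting leaves the asterisk as an ordinary character

echo "$unquoted"
echo "$quoted"
```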

7 Scripting Basics

When commands and scripts begin, they normally have “file descriptors” associated with them. The
first three provide standard locations for input, output and errors, though these can be changed by I/O
redirection on the command line. File descriptor zero (fd 0) is “standard input” or “stdin” and is input
to the command, fd 1 is “standard output” or “stdout” and gets output from the command, and fd 2 is
“standard error” or “stderr” and gets error messages (assuming the command writes its errors to stderr).
Standard input is typically the keyboard. Standard output is typically the display terminal.
Standard error provides a place for the command to send any error messages so as not to mess up
standard output, though initially this will be the same place as stdout, e.g., the display terminal. File
descriptors 3 through 9 are also available to the command, typically as holding places.
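
To see that stdout and stderr really are separate streams, consider the sketch below. The function name report is hypothetical; it writes one line to each descriptor so that each can be captured on its own.

```shell
#!/bin/bash
report() {
    echo "normal output"         # written to fd 1 (stdout)
    echo "error message" >&2     # written to fd 2 (stderr)
}

# Capture stdout only; stderr is discarded.
out=$(report 2>/dev/null)

# Capture stderr only: point fd 2 at the capture, then discard fd 1.
err=$(report 2>&1 >/dev/null)

echo "stdout carried: $out"
echo "stderr carried: $err"
```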

7.1 Input / Output (I/O) Redirection

Redirection is the way the shells control where stdin comes from and where stdout and stderr go when
these differ from their standard places. Redirection is done on the command line for that command.
Redirection is evaluated left-to-right. In the examples that follow in this section, where whitespace is
shown, the whitespace is optional, but if no whitespace is shown, there must be none.

7.2 Input Redirection

Standard input can be redirected to read from a file by using < filename or from the script itself
using a “Here document” with the syntax <<sentinel. It is called a “Here document” because the
input is “here” in the script file. For a Here document, the shell redirects stdin to read from the
script file until the sentinel is seen. If the sentinel is quoted (or escaped, e.g., <<\EOF), dollar signs
and backquotes are not evaluated. If the sentinel is not quoted, e.g., <<EOF, dollar signs and
backquotes are evaluated. This allows variable references to provide values to the document. An
escaped dollar sign (\$) suppresses evaluation of that one reference. The sentinel (EOF) must appear
beginning in column one on a line by itself unless a hyphen follows the “<<”, e.g., <<-EOF or
<<-\EOF. The hyphen allows leading tabs to appear in front of the sentinel. As long as different
sentinels are used, Here documents can be nested, but that may be inviting confusion and errors.
If you have problems with Here documents, ensure there is no trailing whitespace or other
characters following the sentinel on its line. In particular, do not let a backquote or a right
parenthesis follow the sentinel on the same line.
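
The effect of quoting the sentinel can be sketched as follows. The variable names are illustrative, and the closing parenthesis of each command substitution is deliberately placed on its own line, after the sentinel line.

```shell
#!/bin/bash
name="World"

# Unquoted sentinel: $name is evaluated, \$name is left alone.
expanded=$(cat <<EOF
Hello, $name! The variable is written as \$name.
EOF
)

# Quoted sentinel: nothing inside the document is evaluated.
literal=$(cat <<'EOF'
Hello, $name!
EOF
)

echo "$expanded"
echo "$literal"
```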

7.3 Output Redirection

To redirect output to a file, use n> file or n>> file or n>&m where n and m are file descriptors
from 1 through 9. If n is omitted, 1 (stdout) is assumed. The first form tells the shell to send output
written to fd n to a file. The file will be overwritten if it exists. The second form appends to the file,
creating the file if it does not exist. The third form says send output from file descriptor n to the
same place as file descriptor m.
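
The three forms can be compared side by side in a scratch directory. This is a minimal sketch; the file names out.log and both.log are invented for the example.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
log="$sandbox/out.log"

echo "first"  > "$log"     # n omitted, so fd 1: creates or overwrites the file
echo "second" > "$log"     # overwrites again; "first" is gone
echo "third" >> "$log"     # appends to the existing file

# Send fd 1 to a file, then send fd 2 to the same place as fd 1.
{ echo "to stdout"; echo "to stderr" 1>&2; } > "$sandbox/both.log" 2>&1

contents=$(cat "$log")
both=$(cat "$sandbox/both.log")
echo "$contents"
echo "$both"
```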
One additional output redirection is the pipe, indicated by the vertical stroke (|). This says to send only
standard output from a command through the pipeline to the next program, so the output of the first
program becomes the input of the second command. This is one of the features of Linux that makes it
so powerful. When filenames are not specified, each tool generally assumes input is coming from stdin
and output is going to stdout. Each tool can be very good (and efficient) at what it does and if further
processing is needed, the output can be piped to another tool. Remembering that redirection is
processed left-to-right, if we want both stdout and stderr to be passed to the next program, we use
command1 2>&1 | command2. This says send stderr (fd 2) of command1 to the same file
descriptor as stdout (fd 1) then send stdout (fd 1) through the pipeline to command2 where it is
available as stdin (fd 0). In bash 4 and later, “|&” is a synonym for “2>&1 |”.
There are some special cases: n<&- says close fd n for input (if n is omitted, 0 is assumed) and n>&-
says close fd n for output (if n is omitted, 1 is assumed).
A few examples:
Send stdout and stderr from command to the same file: command > file 2>&1
Append stdout and stderr from command to the same file: command >> file 2>&1
Send stdout and stderr into the pipeline: command1 2>&1 | command2
Swap stdout and stderr: 3>&2 2>&1 1>&3
Send stdout to a file and stderr to the pipeline, for example to allow command2 to analyze any errors:
command1 2>&1 1> file | command2. When using something like this, it is probably a good
idea to include a comment stating that the order of the redirects is intentional.
Send stderr to the Bit Bucket to avoid seeing error messages: 2>/dev/null
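
Several of these combinations are sketched below. The function emit is a hypothetical stand-in for a real command; it writes one line to stdout and one to stderr so the destination of each stream can be observed.

```shell
#!/bin/bash
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Only stdout travels through the pipe.
only_out=$(emit 2>/dev/null | tr '[:lower:]' '[:upper:]')

# Both streams travel through the pipe.
both=$(emit 2>&1 | tr '[:lower:]' '[:upper:]')

# The order of these redirects is intentional: stderr joins the pipe
# first, then stdout alone is moved away from it.
errs_only=$(emit 2>&1 1>/dev/null | tr '[:lower:]' '[:upper:]')

# Swap the two streams using fd 3 as a holding place.
swapped=$( { emit 3>&2 2>&1 1>&3; } 2>/dev/null )

printf '%s\n' "$only_out" "$both" "$errs_only" "$swapped"
```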

8 Scripts

Scripts are generally used to run a series of commands in a repeatable fashion. Much of Linux is
processing textual data in files. The shells make it easy to process text files because most of the tools
operate on ASCII text and the shells have syntax to easily identify files.

8.1 “#!” (The Magic Cookie)

Scripts typically begin with an octothorpe (pound sign or hash) and an exclamation mark. This is
sometimes called the “Shebang.” This combination of characters, placed at the very start of the first
line, tells Linux to run the script with the program whose path follows these two characters. Since the
pound sign indicates a comment in the shells, the line is unlikely to cause problems when the script is
read by a shell rather than executed directly.

8.2 echo

The echo command is built-in to all shells and merely prints out the values on its command line.

8.3 “Hello, World!”

Generally the first program to write is one that prints “Hello, World!” to indicate that the computer is
able to talk to the output device (e.g., printer, display terminal). So we can use echo "Hello, World!"
and see if it appears.
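
Putting the last three sections together, the sketch below writes a two-line script, makes it executable and runs it. It assumes that bash is installed at /bin/bash and that files in the temporary directory may be executed; the file name hello.sh is made up for the example.

```shell
#!/bin/bash
sandbox=$(mktemp -d)

# Create the script. The quoted sentinel keeps its contents literal.
cat > "$sandbox/hello.sh" <<'EOF'
#!/bin/bash
# The first line tells Linux to run this file with /bin/bash.
echo "Hello, World!"
EOF

# A script must be executable before it can be run by name.
chmod +x "$sandbox/hello.sh"

greeting=$("$sandbox/hello.sh")
echo "$greeting"
```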

9 Basic Commands

This section describes some commands that can be used to navigate the file system. Some commands
are built-in to the shell, like period (.), colon (:), break, cd, chdir, continue, eval, exec, exit, export,
pwd, return, shift, test and “[”, trap, umask and unset and are described in the shell's man page. Other
commands are standalone and have their own man page.

9.1 ls

The ls (list) command prints a listing of the files and directories specified on the command line or the
current directory if no files or directories are specified, like the dir command in MS-DOS. Parameters
on the command line can include options to specify what properties to print and the files or directories
to look at.

9.2 cd

The cd command is built-in to the shells. In some shells, though not bash, chdir is a synonym for cd.
It changes the current directory to that specified on the command line, or to $HOME if no directory is
specified. Two common reasons a cd may fail are that the directory does not exist or that the user does
not have execute (search) permission on it.

9.3 pwd

The pwd command is built into the shells and prints the working directory, that is, the current
directory.

9.4 cat

The cat (concatenate) command prints one or more files. UUOC is a common acronym in newsgroups and
forums related to Linux. It stands for Unnecessary Use Of Cat. Since cat is a separate program, it must
be loaded and run as an extra process before the real work can start. Most other programs accept
file names on their command lines, so instead of using cat file | program, just use program file.
It has been said that if you are using cat with a single file, you are probably doing it wrong.

9.5 date

The date program is used to display or set the current date and time on the computer. Setting the date
requires root privilege, also known as super-user privilege. The output from the date program can be
controlled by a format string on the command line, introduced by a leading plus sign (+) that will not
appear in the output. One mistake novice programmers make is to use multiple calls to date to get the
various pieces of the date and time. Consider extracting the year, month, day, Julian day (the day of the
year), hour, minute and second in order to use some of those pieces in a log file name and the year and
Julian day for other calculations, such as the number of days between two dates. The typical sequence
could look like this:
yr=`date "+%Y"`
mo=`date "+%m"`
da=`date "+%d"`
jd=`date "+%j"`
hr=`date "+%H"`
mn=`date "+%M"`
se=`date "+%S"`
Then these variables are put together to create the date stamp portion of a file name, e.g.,
"log.$mo$da.$hr$mn", and yrjd="$yr$jd". Now consider what would happen if this sequence of statements
ran right at midnight, specifically if yr, mo and da were captured on Jan 31, 2016 at 23:59:59 and jd,
hr, mn and se were captured on Feb 1, 2016 at 00:00:00. This would cause no end of confusion: the log
file would be named log.0131.0000, so it would look like it was created at the beginning of Jan 31,
possibly before other log files created on that day. The year and Julian day (2016032) would describe
Feb 1, one day off from the date in the log name.
The preferred way to do this is to capture all date formats the script will use with one date command
and split the result as appropriate using sed or cut. For example:
alldates=`date "+%m%d.%H%M,%Y%j"`
logname=`echo "$alldates" | sed -e "s/,.*//"`
yrjd=`echo "$alldates" | cut -d, -f2`
The sed command removes the comma and all following characters from the alldates string, leaving the
month, day, a period, the hour and minute. The cut command, using the comma as a delimiter, returns the
second field, which contains the year and Julian day.
Of course, if you are including beginning and ending time stamps to analyze performance, the end time
must use its own date call at the end of the script.
Another tip: when using timestamps in log files, the preferred format is %Y/%m/%d-%H:%M:%S
because it can be sorted as a single field and you don't need to worry about whether Jun gets sorted
before or after Jul or whether 10 gets sorted between 1 and 2.
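The single-call idea can also be sketched with shell parameter expansion, which splits the string
without starting sed or cut. This is an alternative to the example above, not a replacement for it.
The $( ) form of command substitution is used, and the "log." prefix is only illustrative.

```shell
#!/bin/bash
# Capture every date field the script needs with ONE call to date,
# so all the pieces describe the same instant.
alldates=$(date "+%m%d.%H%M,%Y%j")

# Split at the comma with parameter expansion.
logname="log.${alldates%%,*}"   # text before the comma: log.MMDD.HHMM
yrjd="${alldates#*,}"           # text after the comma: year and Julian day

# A sortable timestamp for entries written to the log.
stamp=$(date "+%Y/%m/%d-%H:%M:%S")
echo "$stamp log file is $logname, year and Julian day are $yrjd"
```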

9.6 man

The man command is used to print or display manual pages for the various commands that are
available on the system. This command uses the MANPATH environment variable to provide the
directories in which to look for man pages. The standard path for man pages is /usr/share/man. If
third-party tools are installed, they probably have their own man pages, so their directories must be
added to MANPATH to be able to see the man pages for those third-party tools. MANPATH is a
colon-separated list of directory names.

9.7 rm

The rm command removes files. Generally directories will not be removed by the rm command unless
the -r (recursive) option is used. The remove command is the one Linux command that can really do
damage because its effects are irreversible without a lot of work to restore files from backups. There is
a -i option to make the command interactive and ask whether each file should be removed, but in a
script that should not be used. Similarly, in a script, you probably want to specify the full path for the
rm command (/bin/rm …) just in case the user running the script has an alias that includes the -i option.
Also, parameters to the rm command should be absolute paths. Imagine what would happen if you did
a cd to a directory that did not have execute permission, and then removed all the files (rm -f *). Since
the cd would have failed, this would remove all the files in the directory you were in before the cd.
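The failed-cd hazard can be avoided by testing the cd before removing anything, or by not relying on cd
at all. A minimal sketch, assuming /bin/rm exists (as it does on typical Linux systems) and using a
scratch directory created only for the demonstration:

```shell
#!/bin/bash
workdir=$(mktemp -d)                 # scratch directory for this sketch
touch "$workdir/a.tmp" "$workdir/b.tmp"

# Guard the cd: the rm runs only if the cd succeeded.
if cd "$workdir"; then
    /bin/rm -f ./*.tmp
    cd - >/dev/null                  # return to the previous directory
else
    echo "cannot cd to $workdir; nothing removed" >&2
fi

# Safer still: give rm absolute paths and skip the cd altogether.
touch "$workdir/c.tmp"
/bin/rm -f "$workdir"/*.tmp

remaining=$(ls -A "$workdir")        # empty if everything was removed
rmdir "$workdir"
```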

9.8 touch

The touch command updates the timestamp of a file and creates the file if it does not exist. One
problem with the touch command is that if you are expecting to create an empty file, but the file
already exists with data in it, the touch will not make the file empty. Scenario: Perhaps the previous
run of the script failed to remove the file because it was interrupted before the file could be removed. A
better solution is to copy from /dev/null (the Bit Bucket) to the file as cp /dev/null filename or
even more simply, >filename. This ensures that the file will be empty and has a current timestamp.
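The difference between touch and truncation can be seen in a short sketch. The scratch file comes from
mktemp and stands in for a file left over from an interrupted run.

```shell
#!/bin/bash
scratch=$(mktemp)                    # stands in for the leftover file
echo "left over from an interrupted run" > "$scratch"

touch "$scratch"                     # new timestamp, old data still present
size_after_touch=$(wc -c < "$scratch")

: > "$scratch"                       # the colon does nothing; > empties the file
size_after_truncate=$(wc -c < "$scratch")

echo "after touch: $size_after_touch bytes, after truncation: $size_after_truncate bytes"
rm -f "$scratch"
```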

9.9 test

The test command is built into the shells. It evaluates a condition and sets its exit status to zero if
the condition is true, and non-zero if false. A synonym for test is the left square bracket ([), which
requires a right square bracket (]) as its last argument.
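A short sketch of both spellings. The root directory (/) is used because it always exists; the variable
names are only illustrative.

```shell
#!/bin/bash
# test and [ are the same command; [ just needs a closing ].
if test -d /; then
    with_test="directory"
else
    with_test="not a directory"
fi

if [ -d / ]; then
    with_bracket="directory"
else
    with_bracket="not a directory"
fi

# A false condition gives a non-zero status, so the else branch runs.
if [ -f / ]; then
    plain_file="yes"
else
    plain_file="no"
fi
echo "$with_test / $with_bracket / regular file: $plain_file"
```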

9.10 which

The which command prints the path of a program if it can be found through the PATH variable or if it
was given with path information. The type built-in does much the same thing and also reports aliases,
functions and built-ins.

10 Automation

10.1 Commands versus Scripts

There is a difference between commands and scripts. In general, commands are run at a command
prompt and are probably single commands or a few commands connected with pipelines ( | ). One
problem with typed commands is that each time a set of commands is rerun by hand, the chance of a
typographical error (a typo) increases with the complexity of the commands and the tiredness of the
programmer. Scripts are generally whole sets of commands to do a lot of work and, once debugged,
should not have typos. That is not to say that errors cannot occur with scripts, but usually these are
data-dependent errors.
A few examples of data-dependent errors are:
Using globbing to get a file list but the file list becomes too large. Typically this would be something
like cd LogDirectory; grep error *, where all the files in the log directory are searched by grep
for the word “error.” When the shell sees the “*”, it replaces it with the names of all files in the
current directory, and if the resulting command line is too long the command fails with “Argument list
too long.” One way to prevent this error is to use the xargs program, not described in this class.
Using data provided by the user without first validating that it has proper syntax. For instance, if a
numeric comparison is used (e.g., -eq) but the user-provided value is not numeric, one could see the
message “integer expression expected.” This could be data provided by the user on the script
command line or in a data file.
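One way to guard a numeric comparison is to check the value first. The helper below, is_integer, is a
hypothetical name introduced for this sketch and is not a standard command. It accepts an optional sign
followed by digits.

```shell
#!/bin/bash
# is_integer VALUE: succeed only if VALUE is an optionally signed integer.
is_integer() {
    case "${1#[+-]}" in              # drop one optional leading sign
        ''|*[!0-9]*) return 1 ;;     # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

results=""
for value in 42 -7 abc 4x ""
do
    if is_integer "$value"; then
        results="$results ok"
    else
        results="$results bad"
    fi
done
echo "results:$results"

# Validate before comparing, so -eq never sees non-numeric data.
limit="abc"                          # pretend this came from the user
if is_integer "$limit" && [ "$limit" -eq 10 ]; then
    echo "limit is ten"
else
    echo "limit is missing, not numeric, or not ten"
fi
```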

10.2 Cron (Scheduled Tasks)

Cron jobs are command sequences or scripts that get run by the operating system on a recurring basis.
This includes things like nightly file backups, disk space monitoring, daily transaction batch
processing, etc.
One important note regarding cron jobs is that the environment in which they run is significantly
simpler than the typical user's environment. This problem manifests itself when a script runs
perfectly during development but breaks when it gets moved to cron. The usual suspects are
environment variables, PATH in particular, that are set in the user's login environment but not in the
cron environment.
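The effect can be approximated from the command line with env -i, which starts a command with an empty
environment. This is only a rough stand-in for cron, whose exact defaults vary by system. REPORT_DIR is
a hypothetical variable used for illustration.

```shell
#!/bin/bash
export REPORT_DIR=/tmp/reports       # set in the login environment

# Run a child shell with almost nothing in its environment.
seen_by_job=$(env -i PATH=/usr/bin:/bin /bin/sh -c 'echo "${REPORT_DIR-unset}"')
echo "REPORT_DIR in the stripped environment: $seen_by_job"

# Defensive start for a script that cron will run:
# set what the script depends on instead of inheriting it.
PATH=/usr/local/bin:/usr/bin:/bin
REPORT_DIR="${REPORT_DIR:-/tmp/reports}"
export PATH REPORT_DIR
```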

11 Operations

11.1 File Manipulation


In addition to creating files (e.g., >myfile) and appending to files (e.g., >>myfile) with redirection,
there are a number of ways files can be manipulated.
Files contain data but they also have metadata (kept in the file's inode) that contains information
like who is the owner of the file, what group the file belongs to, what the file access
permissions are, when the file was last modified, etc.
Files can be moved from one place to another (e.g., mv myfile /tmp), renamed
(e.g., mv ./myfile ./itisafile) or copied to another place (e.g., cp myfile /tmp). The difference
between moving a file to another place and copying the file to another place is that the move will
remove it from the source location, whereas the copy will leave the original alone, resulting in two
copies of the file.

11.2 Printing Output


The lp command prints files. Output from a program or script can be printed as well. The lp
command takes a list of filenames to print, or if a hyphen (-) is used as the filename, data may be piped
directly into lp.

11.3 Running Programs


Running programs in Linux is easy. The program name is really just the name of the file containing the
program. Whether the program is a compiled program (e.g., written in C or C++ and linked) or a
script, it is invoked the same way. The filename is entered as the first token on a command
line or a line in a script. Parameters to the program follow on the command line. If the program name is
just a basename (with no path information), the directories in the PATH variable, a colon-separated list
of directory names, are searched in order until the program name is found. If the shell exhausts PATH
without finding the name, the message command not found will be printed.

12 Input

12.1 “<” (Standard Input)


Programs can receive data from many sources, but they all channel through the input location known as
Standard Input. Input could come from a terminal or keyboard, a file, a network device or any other
input device. The < operator redirects standard input from a file, e.g., sort < myfile.

12.2 Terminals
Terminals are where we do our everyday work with computers. We type on a keyboard and see the
results on the screen. We may use a mouse or trackball to move a cursor around on the screen and we
might keep data on a clipboard.

12.3 Reading from Files


Reading from files is done in a script with the read command, generally in a loop. One way is the
following:
# Save stdin in fd 3, point stdin to file $myfile
exec 3<&0 0< "$myfile"
while read -r field1 field2 remain
do
    : # Process the data here (a loop body needs at least one command).
done
# Restore stdin from fd 3, then close fd 3.
exec 0<&3 3<&-
Another way is to include the file descriptor on the read command with the -u option.
# Point fd 3 to file $myfile
exec 3< "$myfile"
while read -r -u 3 field1 field2 remain
do
    : # Process the fields here (a loop body needs at least one command).
done
# Close input from fd 3.
exec 3<&-
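A third form, common in practice, redirects the file into the loop itself so that no exec is needed. A
minimal sketch with a small sample file created only for the demonstration:

```shell
#!/bin/bash
myfile=$(mktemp)                     # sample input file for this sketch
printf '%s\n' 'alice 30 New York' 'bob 25 Los Angeles' > "$myfile"

count=0
while read -r field1 field2 remain
do
    count=$((count + 1))
    echo "name=$field1 age=$field2 city=$remain"
    last_city=$remain                # the last variable gets the rest of the line
done < "$myfile"                     # stdin is redirected for the loop only

rm -f "$myfile"
```

Because the loop is fed by a redirection and not by a pipeline, variables set inside it (count,
last_city) are still available after done.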

13 Output

13.1 “>” (Standard Output)


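After a program executes, the result is some kind of output, which by default goes to Standard Output.
That output can be sent to a display, a terminal, a log file, a printer or some robotic or other output
device. The > operator redirects standard output to a file, e.g., ls > listing.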

13.2 Displaying Values


Values can be displayed using the echo command, the print command and the printf command.
The echo command just prints the values. The print command does the same, but it is a ksh built-in and
is not available in bash. The printf command is a formatting print statement, akin to the printf
function in the C language. The first parameter to printf is the format string, and the optional
following arguments supply values to the %-fields in the format. For example, %d says to treat the
parameter as an integer.
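A few printf formats, as a minimal sketch. The variable names and values are only illustrative.

```shell
#!/bin/bash
name="backup"
count=7

# %s inserts a string, %d an integer.
summary=$(printf '%s processed %d files' "$name" "$count")
echo "$summary"

# %03d pads an integer with leading zeros to three digits.
padded=$(printf '%03d' "$count")
echo "$padded"

# Unlike echo, printf adds a newline only when the format contains \n.
printf '%-10s|%5d|\n' "$name" "$count"
```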

13.3 Writing to Files


Writing to files is typically done using redirection of stdout and stderr.
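The common redirection forms, as a minimal sketch. The files are scratch files from mktemp, and the
report function is a hypothetical stand-in for any program that writes to both stdout and stderr.

```shell
#!/bin/bash
out=$(mktemp)
err=$(mktemp)
both=$(mktemp)

report() {
    echo "result line"               # goes to stdout
    echo "warning line" >&2          # goes to stderr
}

echo "first line"  > "$out"          # > creates the file or truncates it
echo "second line" >> "$out"         # >> appends to it
report >> "$out" 2> "$err"           # stdout and stderr to different files
report > "$both" 2>&1                # both to one file: stdout first, then 2>&1

out_lines=$(wc -l < "$out")
err_lines=$(wc -l < "$err")
both_lines=$(wc -l < "$both")
rm -f "$out" "$err" "$both"
```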

14 Logging
Logging is something most programs do, especially when unattended or non-interactive. A log file is
just a text file containing entries. Most entries will have a timestamp so the analyst can know when a
logged event occurred.
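A small logging helper, sketched under the assumption that one line per entry with a sortable timestamp
is wanted. The function name log and the variable logfile are hypothetical.

```shell
#!/bin/bash
logfile=$(mktemp)                    # stands in for the real log file

# log MESSAGE...: append one timestamped entry to $logfile.
log() {
    printf '%s %s\n' "$(date '+%Y/%m/%d-%H:%M:%S')" "$*" >> "$logfile"
}

log "backup started"
log "backup finished"

entries=$(wc -l < "$logfile")
first_entry=$(head -n 1 "$logfile")
echo "$entries entries, first: $first_entry"
rm -f "$logfile"
```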

14.1 “Permission denied”


Permission denied is a message that appears when a user does not have permission to access a file.
There are three permissions: read, write and execute. There are three classes of user that can try to
access a file: the file's owner, members of the file's group, and others (i.e., the world). When you run
ls -l file, the first field shows the file type character followed by three permission triplets. Each
triplet can be rwx or rw- or r-x or r--, etc. A letter indicates that the permission is granted and a
hyphen indicates that it is not, so r-x says read and execute are permitted but write is not. The three
triplets correspond to the owner, the group and the world. The owner is constrained by the owner
permissions. A user who is not the owner is constrained by the group permissions if the user is a member
of the file's group, and by the world permissions otherwise.
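The triplets can be seen by setting a mode and reading it back. A minimal sketch on a scratch file; mode
640 is chosen only as an example.

```shell
#!/bin/bash
demo=$(mktemp)                       # scratch file for this sketch
chmod 640 "$demo"                    # owner rw-, group r--, world ---

# The first ten characters of ls -l: file type, then the three triplets.
perms=$(ls -l "$demo" | cut -c1-10)
echo "$perms"
rm -f "$demo"
```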

14.2 “No such file or directory”


No such file or directory is a common message when one types a filename that does not exist.
Things to look for are typos in the filename or the wrong directory or a missing directory.

14.3 /dev/null (also known as the Bit Bucket)


The file /dev/null is used when one needs a bit bucket. This can be used to provide a place for error
messages you don't want to see, e.g., ls -l myfile 2>/dev/null. It can also be used to create empty
files, e.g., cp /dev/null myfile.

15 Exit Status

15.1 “$?”

The question mark variable, $?, contains the exit status of the previous command. It is usually a good
idea to capture it in another variable if you want to refer to it later, since the next command will
change it.

15.2 Success = 0

Success of a command is indicated by a zero status.

15.3 Failure ≠ 0

Failure of a command is indicated by a non-zero status. Using non-zero for failures allows different
values to indicate different types of errors.
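A minimal sketch of capturing the status before it is overwritten. The function check_disk is a
hypothetical command that fails with status 3.

```shell
#!/bin/bash
check_disk() {
    return 3                         # pretend: 3 means "disk nearly full"
}

check_disk
status=$?                            # capture immediately
echo "check finished"                # this echo succeeds...
after_echo=$?                        # ...so $? has already changed to 0

case "$status" in
    0) meaning="success" ;;
    3) meaning="disk nearly full" ;;
    *) meaning="other failure" ;;
esac
echo "status=$status ($meaning), after echo: $after_echo"
```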

16 Important Variables

16.1 Echo “\“$VARIABLE\””

To find out what value a variable has, one can use the echo command. It is always a good idea to use
double quotes when referring to variables. By including escaped quotes inside the bounding quotes,
one can see if there is leading or trailing whitespace in the variable.
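A minimal sketch of the technique. The variable name and its value, with a deliberate trailing space,
are only illustrative.

```shell
#!/bin/bash
name="report.txt "                   # note the trailing space

plain=$(echo $name)                  # unquoted: the trailing space vanishes
marked=$(echo "\"$name\"")           # escaped quotes mark where the value ends

echo "$plain"
echo "$marked"
```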

16.2 PATH / MANPATH

The PATH and MANPATH variables are used to provide a colon-separated list of directories to search.
When a program is specified by only its basename, PATH is searched for the name of an executable
file. When a man page is requested, MANPATH is searched to find a man file for the name specified.

16.3 EDITOR / VISUAL

The EDITOR and VISUAL environment variables provide the names of editing programs (e.g., ed and vi,
respectively) to be used by another program. One such program is crontab. When using crontab -e,
the program named by VISUAL is used to let the user edit the cron table; if VISUAL is not set, the
program named by EDITOR is used. When the editor exits, the crontab program regains control and can
signal cron to reread the cron table.

16.4 PS1

The PS1 variable is the primary prompt string to use when reading commands from the terminal.

16.5 PS2

The PS2 variable is the secondary prompt string, used when reading commands from the terminal and
the command is not yet complete, for example when the previous line ended with a backslash. The
default is “> ”.

16.6 PS3

The PS3 variable value is used as the prompt for the select command.

16.7 PS4

The PS4 variable value is printed before each command bash displays during an execution trace, as
when set -x is in effect. The first character of PS4 is replicated multiple times, as necessary, to
indicate multiple levels of indirection. The default is “+ ”.

16.8 HOME

The HOME environment variable contains the absolute path of the user's home directory.

16.9 HOSTNAME

The HOSTNAME environment variable contains the output of the hostname command.

16.10 HISTSIZE

The HISTSIZE variable contains the number of commands that should be remembered in the shell's
history list. The number of lines kept in the history file is set separately by HISTFILESIZE.

16.11 SHELL

The SHELL variable is the user's shell at login time. Starting a different shell later does not update
it, so the SHELL variable must be changed manually if it is to be kept in sync.

16.12 LS_COLORS

The LS_COLORS variable is used to provide color information when --color=auto is specified on the
ls command.

17 Command Substitution

17.1 $(Command)

This is the newer syntax for command substitution. The command is executed and its standard output,
with trailing newlines removed, replaces the whole expression on the command line, from the dollar sign
and left parenthesis through the right parenthesis. The benefit of this over the backquote method is
that backquotes cannot be easily nested.

17.2 `Command`

This is the original (older) syntax for command substitution. The command is executed and its
standard output, with trailing newlines removed, replaces the whole expression on the command line,
including the backquotes.
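A minimal sketch of nesting with both forms. The path is only an example and does not need to exist,
since dirname and basename work on the string alone.

```shell
#!/bin/bash
# Nested $( ): the inner command runs first, and no escaping is needed.
parent=$(basename "$(dirname /usr/share/man/man1)")

# The same with backquotes: the inner pair must be escaped with backslashes.
parent_bq=`basename \`dirname /usr/share/man/man1\``

echo "$parent $parent_bq"
```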

18 Login Shell

A login shell is one where you log in by providing your username and password.

18.1 /etc/passwd

The seventh colon-separated field of the /etc/passwd file indicates the shell to use as the login shell.

19 Login Scripts

19.1 .bashrc

The $HOME/.bashrc file contains commands to run when an interactive bash shell that is not a login
shell begins.

19.2 .bash_profile

The $HOME/.bash_profile file contains commands to run when a bash login shell begins.

20 Logout Scripts

Logout scripts are only called when a login shell is terminating.

20.1 .bash_logout

The $HOME/.bash_logout file contains commands to run when the login shell is bash and it is
terminating. It does not need to exist.

21 Aliases

Aliases are used to provide abbreviations or shortcuts for commands. They can be used to always
provide certain options for commands. One example is alias ls='ls --color=auto' which will
always include the --color=auto option when ls is executed. There are two ways to avoid using an
alias. One is to provide a path to the command, e.g., /bin/rm, and the other is to escape the first
character, e.g., \ls, though escaping any character would be sufficient (l\s). Shell functions can
provide a rough equivalent to aliases.

21.1 “When I… Do…”

For example, root could have an alias rm='rm -i' which effectively says, when I say rm, do rm -i
which runs the remove command interactively, and asks for permission to remove files.

22 export

Exported variables are available to the current shell and scripts and programs it calls. It must be noted
that there is no simple way to make an exported variable visible to the parent process.

22.1 Making Variables Accessible

The best place for defining exported variables is in the profile file that gets read at the beginning of the
login shell. Once these exported variables are set, they are available to all subprocesses.
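The difference between exported and non-exported variables can be sketched by asking a child shell what
it sees. LOCAL_ONLY and SHARED are hypothetical names.

```shell
#!/bin/bash
unset LOCAL_ONLY SHARED
LOCAL_ONLY="not exported"
SHARED="exported"
export SHARED

# The child sees only the exported variable.
child_view=$(sh -c 'echo "${LOCAL_ONLY-unset}/${SHARED-unset}"')
echo "child sees: $child_view"

# A change made in the child does not travel back to the parent.
sh -c 'SHARED="changed in child"'
echo "parent still has: $SHARED"
```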

23 Executable

An executable is a script or program that the shell can run when it is mentioned as the first word on a
command line.

23.1 Extensions Not Required

Extensions are not required for executable programs. Whereas DOS executable programs have a
“.exe” extension, Linux has no such restriction. Some people choose to include a “.sh” extension to
indicate that a file is an executable shell script. From C++ we learned about “Information Hiding,”
which means one should not advertise what the code is doing or how it is doing it. If a script is
eventually replaced with a compiled program for performance reasons, retraining users to leave off
the “.sh” suffix is more trouble than the suffix was ever worth.

23.2 Execute Permissions

Files (whether scripts or programs) must have execute permission in order to execute. This is a form of
protection for the system so a file containing random bits cannot be executed unless execute permission
is explicitly set.

24 The “bash” Command

The bash command is the Bourne Again Shell.


The remainder of this document should be viewed in concert with the system's man pages. Just do a
man command and it should take you to the bash_builtins page for built-in commands or the standalone
command page for non-built-in commands. As such, the descriptions of commands below do not
simply repeat text that can be found in the man pages. Instead, they provide insight
into the commands garnered from years of experience.

25 Bash Syntax

25.1 Commonly Used “.sh” Extension

Some programmers like to include a .sh extension on their scripts. Unlike execute permission, this is
not a requirement.

25.2 “#!/bin/bash”

The first line of a bash script should be #!/bin/bash. This tells the kernel to load the bash interpreter
to run this file.

26 Comments

Comments in scripts are indicated by a pound sign followed by text through the end of the line. It is
important to document scripts well. Returning to a script after six months of running perfectly and
trying to resolve a new error can be difficult, even for the script's author, let alone his or her
replacement. Kernighan and Pike in The Practice of Programming, Addison-Wesley, have a section on
comments. Their subsections are labeled, “Don't belabor the obvious,” “Comment functions and global
data,” “Don't comment bad code, rewrite it,” “Don't contradict the code” and “Clarify, don't confuse.”

26.1 Anything After “#” Is Not Executed

While anything after a pound sign and the pound sign itself are treated as comments, it must be noted
that the pound sign is a valid filename character and need not be quoted when embedded in a filename.
The more accurate statement would be “Anything after a non-quoted whitespace-pound sign is a
comment as is a line beginning with a pound sign.”
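The rule can be checked with set --, which stores its arguments as the positional parameters. A minimal
sketch, with report#1 as an illustrative file name:

```shell
#!/bin/bash
# No whitespace before the pound sign: it is part of the word.
set -- report#1
joined=$1

# Whitespace before the pound sign: the rest of the line is a comment.
set -- report #1
split=$*

# Inside quotes the pound sign is ordinary text.
set -- "report #1"
quoted=$1

echo "$joined | $split | $quoted"
```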

27 Quotes

Quotes are important because they hold multiple words together to be treated as a single field. They
also allow the shells to recognize a field even though it may be empty. Some scripts may use code like
if [ "x$var" = "x" ] … to prevent a “unary operator expected” error when var is empty. The
truth is that with the quotes, if [ "$var" = "" ] … is sufficient and makes reading the script easier.
There are a few reasons not to quote variables. The primary one is when the variable can be empty and
it is being used on a command line. If an empty variable has quotes around it, it is still counted as an
(empty) argument on the command line and may cause problems. An example from this
author's experience is calling the fmt program with the unquoted variable $Fc. If the kernel name
(uname -s) is “Linux” or “SunOS”, Fc is set to “-c”, allowing crown formatting, where the
first line can have a different indentation than the other lines; otherwise Fc is set to
“” so no parameter is passed to fmt. Another example is in the Lab 1 Answer at the end of this
document.
Quoting may be mixed. For instance, try echo '$HOME is '"$HOME". Here, the first $HOME is in
single quotes, so “$HOME is ” is printed literally, and the second $HOME is in double quotes, so the
variable's value, e.g., “/home/userid”, is printed.
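The effect of quoting an empty variable can be sketched by counting arguments. The function count_args
is a hypothetical helper, and opt plays the role of an optional flag such as $Fc.

```shell
#!/bin/bash
# count_args: print how many arguments it received.
count_args() {
    echo "$#"
}

opt=""                               # an optional flag that happens to be empty

quoted=$(count_args "$opt" file.txt)     # the empty string is still an argument
unquoted=$(count_args $opt file.txt)     # the empty variable disappears

echo "quoted: $quoted arguments, unquoted: $unquoted argument"

# Quotes also make the empty-string test safe without the x prefix.
var=""
if [ "$var" = "" ]; then
    var_state="empty"
else
    var_state="not empty"
fi
```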

27.1 Quote Symmetry

Quotes must be matched. That is, for every leading quote there must be a trailing quote. Lots of
strange syntax errors can appear when a quote is mismatched.

27.2 Single Quotes

Single quotes (') are special in the sense that every enclosed character loses its special meaning. The
one exception is the single quote itself, which always ends the quoted string. So, echo '$USER' will
display a dollar sign and the string “USER”.

27.3 Double Quotes

Double quotes (") should be used for most quoting since variable references ($var) will be expanded
by replacing the dollar sign and variable name with that variable's value. So, echo "$USER" will
display the user's login id. Similarly, in double quotes, command substitution is performed.

27.4 Back Quotes

Back quotes are not really used for quoting in the normal sense. Instead, they are used to bound a
command indicating command substitution with `command`. The new command substitution uses
$(command). This $(...) syntax has the benefit that commands can be nested.

27.5 Apostrophe versus Tick Mark

If you want a literal apostrophe outside of quotes, use a backslash followed by a single quote (\').
Inside double quotes an apostrophe needs no escaping.

28 Interactive Scripts

Interactive scripts are those that request input from the user in addition to the parameters provided on
the command line.

28.1 read -p “Prompt” VAR1 VAR2 …

The read command is a way for a script to get input from the user via stdin (standard input). The -p
option and its prompt are optional, and the prompt can contain variable references. The input is split
on the characters in the internal field separator (IFS), which are normally the space, tab and newline;
in other words, whitespace. Generally, leading and trailing whitespace on the input line is ignored
unless IFS is empty or just the newline. If there is more data on the line than variables after it has
been split by IFS, the last variable on the read line will get all the remaining data. So read line will
get the entire line except any leading and trailing whitespace, while IFS= read line will get the entire
line including that whitespace.
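The splitting rules can be sketched without a terminal by feeding read from a here-document. The -r
option, which stops read from treating backslashes specially, is an addition not discussed above.

```shell
#!/bin/bash
# More words than variables: the last variable gets the remainder.
read -r first rest <<EOF
alpha beta gamma delta
EOF

# One variable, default IFS: leading whitespace is dropped.
read -r trimmed <<EOF
   indented text
EOF

# Empty IFS for this one read: the line is kept exactly as written.
IFS= read -r exact <<EOF
   indented text
EOF

echo "first=[$first] rest=[$rest] trimmed=[$trimmed] exact=[$exact]"
```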

29 Positional Parameters

The shell allows parameters on the command line. Note: The original Bourne shell only allowed direct
access to positional parameters 0-9, but newer shells (e.g., bash, ksh) allow access to more than nine,
but they must be referenced using braces (${11}).

29.1 $0, $1, $2, etc.

The parameters on the command line include $0, which is the command being executed. This could be
the script path or the basename of the script. $1 is the first parameter following the command name, $2
the second and so forth. How do you refer to the tenth and following parameters when only a single
digit following the dollar sign is recognized? Besides braces, there are a few ways. First, there is a
shift command with an optional number following (shift n). If n is not specified, one (1) is assumed.
The shift command removes $1 from the list and all following parameters move down so $2 becomes the new
$1, etc. The other way is to use a for loop with just a variable, e.g., for arg ; do … ; done. This is
the same as for arg in "$@" ; do … ; done. Each iteration will see one command line parameter.
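A minimal sketch of the three approaches. The set -- command is used here only to supply eleven sample
arguments, a through k, in place of a real command line.

```shell
#!/bin/bash
set -- a b c d e f g h i j k         # eleven sample arguments

tenth=${10}                          # braces are needed past $9
not_tenth=$10                        # this is $1 followed by a literal 0

# A for loop with no "in" list walks the positional parameters.
listed=""
for arg
do
    listed="$listed[$arg]"
done

shift 9                              # discard the first nine parameters
now_first=$1
remaining=$#

echo "tenth=$tenth not_tenth=$not_tenth now_first=$now_first remaining=$remaining"
```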

29.2 Pass Arguments When Script is Executed

Arguments on the command line are passed to the script. Note that unquoted redirections
are not considered arguments to the script. The shell processes them itself before it gives control to
the script, and the script never sees them.

29.3 “$*” (All Parameters)

When referenced with this form, all parameters are joined into a single word, separated by the first
character of IFS (normally a space), so the boundaries between the original arguments are lost. This
is the same as "$1 $2 $3 … $n". For example, if the arguments are 1 "2a 2b" 3 4, "$*" will be treated
as "1 2a 2b 3 4".

29.4 “$@” (All Parameters)

When referenced with this form, arguments with embedded whitespace maintain that whitespace and
the number of arguments does not change. This is the same as “$1” “$2” “$3” “$4” … “$n”.
Using the same arguments as above, “$@” will be treated as “1” “2a 2b” “3” “4”.
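The difference can be sketched by counting what each form passes on. The function count_args is a
hypothetical helper, and set -- supplies the sample arguments used in the text.

```shell
#!/bin/bash
# count_args: print how many arguments it received.
count_args() {
    echo "$#"
}

set -- 1 "2a 2b" 3 4                 # four arguments, one with a space

star=$(count_args "$*")              # one word: "1 2a 2b 3 4"
at=$(count_args "$@")                # four words, boundaries preserved
bare=$(count_args $*)                # unquoted: re-split on whitespace

echo "\"\$*\" gives $star, \"\$@\" gives $at, unquoted \$* gives $bare"
```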

29.5 “$#” (Number of Parameters)

The $# variable is the number of arguments on the command line. It does not count $0 as an
argument; $0 is the program or script being executed.

29.6 “$$” (Process ID (PID) for this Shell)

Every process currently running on the system has a unique Process ID or PID. The PID is used on the
kill command when it is necessary to send a signal to a process. It is also fairly common practice to use
the PID as a part of temporary filenames, since no other running process has the same one. When doing
this, ensure that there is a non-digit character in front of the number because if you end your script
with rm /tmp/*$$ and your PID is 123, this remove could also remove /tmp/1123, /tmp/2123, etc., which
could pull the rug out from under other users. Better would be to use something like suffix=".$$" and
use rm /tmp/*$suffix.
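A minimal sketch of the suffix approach. The base name myscript is hypothetical. The mktemp program,
which is not covered in the text above, is shown as a common alternative that creates the file with a
unique name.

```shell
#!/bin/bash
tmpdir=${TMPDIR:-/tmp}
suffix=".$$"                         # the period keeps other PIDs from matching

work="$tmpdir/myscript$suffix"
: > "$work"                          # create the temporary file
# ... use "$work" here ...
rm -f "$tmpdir"/myscript"$suffix"    # removes only this script's file

if [ -e "$work" ]; then
    cleaned="no"
else
    cleaned="yes"
fi

# Alternative: let mktemp choose a unique name.
safer=$(mktemp "$tmpdir/myscript.XXXXXX")
rm -f "$safer"
```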

30 Setting (Declaring), Referencing and Unsetting Variables

30.1 VAR=value

Variables are set by providing the name immediately followed by an equal sign and an optional value.
If no value is provided, an empty string (“”) is assigned to the variable. There must be no whitespace
around the equal sign.

30.2 Referencing Variables

Referring to a variable is as simple as $variablename or ${variablename}. The braces are only required
if the name is followed by a character that is a valid variable name character (letter, digit or
underscore) or one of the alternate forms below is used.
To provide a variable's value but use a default value if it is not set, use ${variable-default}, where
the default value will be used if variable is not set. With a colon, ${variable:-default}, the default
is also used when variable is set but empty. Consider the following sequence. First, a default
variable is set and a variable is set to an empty string (dftlogfile="/log/name.log"; logfile="").
Then the command line parameters are processed, which may set logfile to a non-empty value. Finally,
if logfile is still empty after all command line parameters have been processed and there is a
non-empty environment variable, its value is used; otherwise the default is used. This last step would
look like this:
if [ "$logfile" = "" ] ; then
    logfile="${LOGFILE:-$dftlogfile}"
fi
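The two default forms differ only when the variable is set but empty. A minimal sketch using the
LOGFILE name from the text; the paths are illustrative.

```shell
#!/bin/bash
dftlogfile="/log/name.log"

unset LOGFILE
unset_dash="${LOGFILE-$dftlogfile}"      # unset: the default is used
unset_colon="${LOGFILE:-$dftlogfile}"    # unset: the default is used

LOGFILE=""
empty_dash="${LOGFILE-$dftlogfile}"      # set but empty: stays empty
empty_colon="${LOGFILE:-$dftlogfile}"    # set but empty: the default is used

LOGFILE="/var/tmp/run.log"
set_colon="${LOGFILE:-$dftlogfile}"      # non-empty: its own value is used

echo "[$unset_dash] [$unset_colon] [$empty_dash] [$empty_colon] [$set_colon]"
```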

30.3 unset

The unset command removes the definition of one or more variables.

31 Variable Scope

31.1 Global versus Local Variables

Exported variables can be thought of as global variables. They are sometimes called environment
variables because they are part of the environment when scripts begin. Local variables are those
variables that are not exported. These will not be seen by called scripts.

31.2 When Variables Apply

Variables apply when they are referenced: the value used is the one the variable holds at the moment
the line that refers to it is executed.

32 sort Command

The sort command takes one or more files as input and sorts the lines based on the values of the sort
keys specified on the command line. If no files are mentioned on the command line, stdin is read.
Output goes to stdout where it may be redirected to an output file or piped to another program or script.

32.1 Organizing Output

Sort can be used for organizing records of data. One reason for doing this is to be able to implement a
binary search.

32.2 Consolidating All Duplicates

When the -u option is used with sort, the output is composed of only unique records. Records count as
duplicates when all of their sort keys have the same values. Caution: if two records match on the sort
keys but differ in fields that are not sort keys, data will be lost, because sort keeps only one of
the records with all of its fields and discards the other, including its non-key fields.
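The data-loss caution can be sketched with three sample records in which two share the same first
field. The records and names are invented for illustration.

```shell
#!/bin/bash
records=$(mktemp)                    # sample data for this sketch
printf '%s\n' 'smith 10' 'jones 20' 'smith 30' > "$records"

# Whole line as the key: all three records differ, so all are kept.
whole_line=$(sort -u "$records" | wc -l)

# Only field 1 as the key: the two smith records collapse into one,
# and the second field of the discarded record is lost.
by_name=$(sort -u -k1,1 "$records" | wc -l)

echo "whole line: $whole_line records, by name: $by_name records"
rm -f "$records"
```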

33 cut Command

The cut command allows one to extract character sequences or fields from a file. Some important
things to note are: 1) the default delimiter for the -f (field) option is the tab character, 2) successive
delimiters produce empty fields, 3) characters and fields are returned in strict increasing order
regardless of the order specified in the command line parameters, 4) a character or field will only
ever be returned once.
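Points 2 and 3 can be sketched with a colon-delimited sample line:

```shell
#!/bin/bash
# Fields come back in increasing order, whatever order is requested.
reordered=$(echo 'a:b:c:d' | cut -d: -f3,1)

# Successive delimiters produce an empty field.
middle=$(echo 'a::c' | cut -d: -f2)
third=$(echo 'a::c' | cut -d: -f3)

# Character positions work the same way with -c.
chars=$(echo 'abcdef' | cut -c2-4)

echo "reordered=$reordered middle=[$middle] third=$third chars=$chars"
```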

33.1 Isolating Output

Just as grep filters records to only show certain records, cut can be thought of as a filter that works on
records to return only certain fields.

34 uniq Command

The uniq command returns only the unique adjacent lines from an input file. This means that if the
input file is not first sorted, duplicates could appear in the output. The alternative is to use sort -u.

34.1 Ignoring Contiguous Duplicates

Of each run of contiguous (adjacent) identical records, only the first is kept in the output; the
repeats are removed.
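The need to sort first can be sketched with four sample lines in which the duplicates are not adjacent:

```shell
#!/bin/bash
# b a b a has no adjacent duplicates, so uniq alone removes nothing.
unsorted=$(printf '%s\n' b a b a | uniq | wc -l)

# After sorting, the duplicates are adjacent and uniq collapses them.
sorted=$(printf '%s\n' b a b a | sort | uniq | wc -l)

# sort -u gives the same result in one step.
one_step=$(printf '%s\n' b a b a | sort -u | wc -l)

echo "uniq alone: $unsorted lines, sort | uniq: $sorted lines, sort -u: $one_step lines"
```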

35 find Command

The find command is one of a system administrator's most useful and important utilities. It finds files
and directories based on their names or other attributes. It has the ability to dive down directory
structures.
The syntax of the find command is not the usual command [options] [filenames]. Instead it is
find path [...] operators [...].
The path is typically the starting point and there can be multiple starting points, e.g., /bin /usr/bin. Path
may also be individual filenames, for example, to check for certain ownership or permissions.
The operators are processed sequentially with success of one operator being necessary to process the
next operator. There are two operator connectors, -a (AND) and -o (OR). If there is no connector
between two operators, -a is assumed. It also allows grouping using the \( and \) operators.
See the find man page for full details.
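As a sketch of the implied -a connector and of grouping with \( and \), the following builds a throwaway directory tree (the file names are invented for the example) and searches it:

```shell
# Scratch tree so the example does not depend on the local file system.
workdir=$(mktemp -d)
mkdir "$workdir/sub"
touch "$workdir/a.sh" "$workdir/b.txt" "$workdir/sub/c.sh" "$workdir/sub/d.log"

# No connector between -type and -name, so -a (AND) is assumed:
# regular files whose names end in .sh, at any depth.
scripts=$(find "$workdir" -type f -name '*.sh' | sort)

# Grouping with \( \) and -o (OR): names ending in .txt or .log.
others=$(find "$workdir" -type f \( -name '*.txt' -o -name '*.log' \) | sort)

printf '%s\n' "$scripts" "$others"
rm -rf "$workdir"
```

Without the escaped parentheses, -type f would bind only to the first -name test because -a has higher precedence than -o.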

36 Definitions

The following definitions are used throughout the rest of this document.

36.1 blank

A space or tab. Also called whitespace.

36.2 word

A sequence of characters considered as a single unit by the shell. Also known as a token.

36.3 name

A word consisting only of alphanumeric characters and underscores, and beginning with an alphabetic
character or an underscore. Also referred to as an identifier.

36.4 metacharacter

A character that, when unquoted, separates words. Generally, metacharacters lose their special status
when escaped with a preceding backslash. One of the following: | & ; ( ) < > space tab

36.5 control operator

A token that performs a control function. It is one of the following symbols: || & && ; ;; ( ) | |&
<newline>

37 Reserved Words

Reserved words are words that have a special meaning to the shell. The following words are
recognized as reserved when unquoted and either the first word of a simple command or the word “in”
when the third word of a case, for or select command:
! case do done elif else esac fi for function if in select then until while { } time [[ ]]

37.1 “!”

At an interactive prompt, the exclamation mark introduces a history reference. A good shell script will
not use the history feature. As a reserved word, “!” precedes a pipeline (after the optional time) to
logically negate the pipeline's exit status.

37.2 “( )” and “(( ))” and “{ }”

Parentheses create a subshell and run the list of commands inside in that subshell. Output from the
subshell's commands may be redirected by a single redirect outside the parentheses, not one for each
enclosed command. Environment changes (e.g., variable assignments, cd newdir) made in the
subshell are not seen by the parent.
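Double parentheses with a leading dollar sign, $(( expression )), perform arithmetic expansion: the
expression is evaluated and replaced by its value. Without the dollar sign, (( expression )) is a
command that returns a zero status when the value of the expression is non-zero.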
Braces define a list and run the commands in that list without needing a subshell. Individual
commands in a list are separated by a semicolon or a newline. Output from the list's commands may be
redirected by a single redirect, not one for each enclosed command. Environment changes (e.g.,
variable assignments, cd newdir) made in the list are seen by the parent.
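The difference between the two groupings can be sketched as below. The variable names are made up for the example.

```shell
color=red
( color=blue )              # subshell: the assignment is lost when it exits
after_subshell=$color

{ color=green; }            # list: runs in the current shell
after_list=$color

start_dir=$PWD
( cd / )                    # the parent shell stays where it was
dir_after_subshell=$PWD

# One pipe (or redirect) captures the output of every command in the list.
line_count=$( { echo one; echo two; } | grep -c '' )

echo "$after_subshell $after_list $line_count"
```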

37.3 “[ ]” and “[[ ]]”

Single brackets are usually used to denote a test.


Double brackets surround a conditional expression and a zero or one status is returned depending on
the results of the condition.

37.4 “case” and “esac”

The case command begins a case statement.


The esac (“case” spelled backwards) terminates a case statement.

37.5 “do” and “done”

The do and done statements bound the body of a loop. Each must be the first word on a line or follow a
semicolon. Putting the same whitespace in front of the do and done statements makes it easy to see
which done is associated with which do.

37.6 “elif” and “else”

The elif command is “else if ...” and is seen (processed) when the preceding “if” is false.
The else command is seen (processed) when all preceding “if” statements are false.

37.7 “for”

The for command begins a loop.

37.8 “function”

The function statement begins a shell function definition. The syntax is: function name { list }. A
synonym is: name () { list }.

37.9 “if” and “fi”

The if statement begins a conditional. The format is: if [ condition ] ; then list ; fi
The fi (“if” spelled backwards) command indicates the end of the if statement. Both “then” and “fi” must
be the first word on a line or follow a semicolon. Putting the same whitespace in front of the if and fi
statements makes it easy to see which fi is associated with which if.

37.10 “in”

The in statement is used with the “for” statement as: for variable in list; do list ; done

37.11 “select”

The select statement is used for menu-type processing.

37.12 “then”

The then statement follows an “if” condition.

37.13 “until” and “while”

The until statement begins a loop as: until [ condition ] ; do list ; done. If the condition is
initially true, the list is not executed.
The while statement begins a loop as: while [ condition ] ; do list ; done. If the condition is
initially false, the list is not executed. A good infinite loop construct is: while : ; do list ; done.
Ensure that there is a conditioned break in the list of commands.

37.14 “time”

The time command provides timing information for the command following “time”.
Loop Control
“break” and “continue”
The break command exits a loop immediately.
The continue command skips the rest of the current iteration and resumes with the next iteration of the
loop.
Loop controls are “for”, “while” and “until” and the loop is enclosed by “do” and “done”.
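A minimal sketch of both commands inside the infinite loop construct mentioned earlier. The limit of 6 and the variable names are arbitrary choices for the example.

```shell
visited=""
n=0
while : ; do
    n=$((n + 1))
    if [ "$n" -gt 6 ] ; then
        break               # the conditioned exit from the infinite loop
    fi
    if [ $((n % 2)) -eq 0 ] ; then
        continue            # skip even numbers; go straight to the next pass
    fi
    visited="$visited${visited:+ }$n"
done
echo "$visited"
```

Only the odd values below the limit are collected, because continue bypasses the final assignment for the even ones.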

38 Command Delimiters

There are command terminators and command separators.

38.1 “;” “&” or “<newline>” (Command Terminators)

There are three command terminators. These tell the shell to run the commands that are to the left of
these characters.

38.1.1 Semicolon

The semicolon allows multiple serial commands on a single line, e.g., ls -ld . ; ls -l . where the first ls
prints information about the current directory and the second ls prints the contents of the current
directory.

38.1.2 Ampersand

The ampersand tells the shell to run the commands that are to the left of this character in the
background.

38.1.3 Newline

The (non-escaped) newline tells the shell to run the commands that are to the left. If the newline is
escaped as when typing a backslash immediately before hitting the ENTER key, it is not a command
terminator, rather it indicates that the command is to be continued on the next line.

38.2 “&&” and “||” (Command Separators)

The command separators tell the shell to run a command and depending on the success or failure of that
first command, run another command.

38.2.1 “&&” (...if true…)

The syntax is command1 && command2 … where command1 must finish normally (zero exit
status) to allow command2 to run. If command1 fails, command2 is not run.

38.2.2 “||” (...or else…)

The syntax is command1 || command2 … where command1 must finish abnormally (non-zero
exit status) to allow command2 to run. If command1 is successful, command2 is not run.
These two compound operators may be combined thus: command1 && command2 || command3 where
if command1 is successful, command2 is run and if it is successful, command3 is not run. If either of
command1 or command2 fails, command3 is run. Be careful because if command1 is successful but
command2 fails, command3 is still run even though command1 was successful.
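The three cases can be sketched with true and false standing in for real commands; each letter appended to log records which branch ran.

```shell
log=""
true  && log="${log}A" || log="${log}B"   # command1 succeeds: A runs, B does not
false && log="${log}C" || log="${log}D"   # command1 fails: C is skipped, D runs
true  && false         || log="${log}E"   # command2 fails: E runs even though
                                          # command1 succeeded
echo "$log"
```

When command3 must run only if command1 fails, an if ... then ... else ... fi statement is the safer construct.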

39 Lists

Lists take two forms. Both allow the output of the enclosed commands to be captured with a single
redirect following the right delimiter. One uses parentheses, ( and ), which creates a subshell to do
the work. The other, the true list, uses curly braces, { and }, and does not create a subshell and so
could be faster. [Some implementations run a list in a subshell if there is a pipe involved. See UNIX
Power Tools, section 14.08.] Three notable differences between the two are: 1) a cd command in a
subshell does not change the parent shell's current directory, but in a list it does; 2) a variable set in
a subshell is not passed to the parent shell, but a variable set in a list is; and 3) the right brace of a
list must be first on a line or follow a semicolon.
Within both forms, normal command syntax is allowed, including the use of command delimiters.

40 Comparisons

40.1 Integer Test Operators

40.1.1 Equal

To test for two integers being equal, use -eq, e.g., if [ $num1 -eq $num2 ] ; then … fi

40.1.2 Not Equal

To test for two integers being not equal, use -ne, e.g., if [ $num1 -ne $num2 ] ; then … fi

40.1.3 Greater Than

To test for one integer being greater than another, use -gt, e.g., if [ $num1 -gt $num2 ] ; then …
fi

40.1.4 Less Than

To test for one integer being less than another, use -lt, e.g., if [ $num1 -lt $num2 ] ; then … fi

40.1.5 Equal Or Greater Than

To test for one integer being greater than or equal to another, use -ge, e.g., if [ $num1 -ge
$num2 ] ; then … fi

40.1.6 Equal Or Less Than

To test for one integer being less than or equal to another, use -le, e.g., if [ $num1 -le $num2 ] ;
then … fi
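The integer operators can be combined as in the sketch below. The compare function is a hypothetical helper written for this example, not a shell builtin.

```shell
# Hypothetical helper: report how the first integer relates to the second.
compare() {
    if [ "$1" -eq "$2" ] ; then
        echo equal
    elif [ "$1" -gt "$2" ] ; then
        echo greater
    else
        echo less
    fi
}

r1=$(compare 7 7)
r2=$(compare 10 9)      # numeric comparison: 10 is greater than 9
r3=$(compare 9 10)
echo "$r1 $r2 $r3"
```

Quoting the operands keeps the test well formed when a value contains whitespace, although an empty or non-numeric value still causes an error from the integer operators.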

40.2 String Test Operators

40.2.1 Equal to String

Use string1 = string2

40.2.2 Not Equal to String

Use string1 != string2

40.2.3 Equal to Pattern

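Use string = pattern inside double brackets, as in [[ string = pattern ]]. The pattern is a shell (glob)
pattern and must be unquoted; inside single brackets the right-hand side is compared as a literal string.

40.2.4 Not Equal to Pattern

Use string != pattern inside double brackets.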

40.2.5 Length of String is Not Zero

Use -n string.

40.2.6 Length of String is Zero

Use -z string.
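A short sketch of the string operators. The variable names and values are made up; the pattern test is shown inside double brackets, where an unquoted right-hand side is treated as a pattern.

```shell
name="report.txt"
empty=""

if [ "$name" = "report.txt" ] ; then eq=yes ; else eq=no ; fi
if [ "$name" != "other" ] ; then ne=yes ; else ne=no ; fi
if [ -n "$name" ] ; then nonzero=yes ; else nonzero=no ; fi
if [ -z "$empty" ] ; then zero=yes ; else zero=no ; fi

# Double brackets: the unquoted right-hand side is a glob pattern.
if [[ $name = *.txt ]] ; then pat=yes ; else pat=no ; fi

# Single brackets: the quoted right-hand side is a literal string.
if [ "$name" = "*.txt" ] ; then lit=yes ; else lit=no ; fi

echo "$eq $ne $nonzero $zero $pat $lit"
```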

40.3 File Test Operators

-b name   True if name exists and is a block special file
-c name   True if name exists and is a character special file
-d name   True if name exists and is a directory
-e name   True if name exists
-f name   True if name exists and is a regular file
-g name   True if name exists and its set-group-id bit is set
-h name   True if name exists and is a symbolic link
-k name   True if name exists and its sticky bit is set
-L name   True if name exists and is a symbolic link
-p name   True if name exists and is a FIFO special file (named pipe)
-r name   True if name exists and is readable
-s name   True if name exists and its size is greater than zero
-S name   True if name exists and is a socket
-t n      True if file descriptor n is open and associated with a terminal device
-u name   True if name exists and its set-user-id bit is set
-w name   True if name exists and is writable
-x name   True if name exists and is executable

And now a few notes about the above.


If you know you want a file or you know you want a directory, do not use -e, instead use -f or -d as
appropriate. If the name exists as the wrong type you could have problems.
-h and -L are the same, -h came first but -L is more intuitive.
If doing tests to produce very detailed error messages, consider getting all the pieces into separate
variables first, since multiple trips to the file system for the same information is a waste of time. For
example:
rd=$(test -r "$file"; echo $?)
wr=$(test -w "$file"; echo $?)
ex=$(test -x "$file"; echo $?)
if [ "$rd$wr$ex" = "000" ] ; then
    : "file has read, write and execute permissions"
else
    # At least one of rd, wr or ex is non-zero.
    msg=""; msep=""
    if [ $rd -ne 0 ] ; then msg="$msg${msep}read"; msep=", "; fi
    if [ $wr -ne 0 ] ; then msg="$msg${msep}write"; msep=", "; fi
    if [ $ex -ne 0 ] ; then msg="$msg${msep}execute"; msep=", "; fi
    # Change the final ", name permission"
    # to " and name permissions".
    msg=$(echo "$msg permission" | sed \
        -e 's/, \([^,][^,]*\)$/ and \1s/' )
    echo "File $file is missing $msg." >&2
fi

40.4 Exists versus Does Not Exist

The test -e name test returns true (zero) if the name exists and non-zero otherwise.
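The advice to prefer -f or -d over -e can be sketched with a scratch directory. The classify function and the file names are hypothetical and exist only for this example.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/empty"
echo data > "$tmpdir/full"
ln -s "$tmpdir/full" "$tmpdir/link"

# Hypothetical helper: name the kind of object a path refers to.
classify() {
    if [ -h "$1" ] ; then
        echo symlink
    elif [ -d "$1" ] ; then
        echo directory
    elif [ -f "$1" ] && [ -s "$1" ] ; then
        echo "non-empty file"
    elif [ -f "$1" ] ; then
        echo "empty file"
    elif [ -e "$1" ] ; then
        echo other
    else
        echo missing
    fi
}

r_dir=$(classify "$tmpdir")
r_empty=$(classify "$tmpdir/empty")
r_full=$(classify "$tmpdir/full")
r_link=$(classify "$tmpdir/link")
r_none=$(classify "$tmpdir/nothing")
rm -rf "$tmpdir"
echo "$r_dir, $r_empty, $r_full, $r_link, $r_none"
```

The -h test comes first because the other tests follow symbolic links and would report on the link's target.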

41 Calculations

Calculations may be done in three ways: Arithmetic Expansion, “$(( expression ))” in the shell;
expr arg1 operator arg2 [operator arg3...]; or bc.
See the bash man page sections on Arithmetic Expansion and ARITHMETIC EVALUATION. These
calculations are done within the shell without needing a new process.
Two gotchas with expr are that each operand and operator must be a separate argument, separated from
the others by whitespace, and that shell metacharacters like “*” (multiply) need to be escaped. See the
expr man page for its arithmetic operations.
Both the shell and expr only do integer arithmetic. For floating point operations and extremely large
numbers, use bc, the arbitrary precision calculator. In a script this is usually done as follows.
echo "scale=3; 1.5+2.25; 10 /3" | bc produces two lines: 3.75 and 3.333, where any of the
numbers in the echo statement could be variable references. See the bc man page.
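The three methods side by side, as a sketch. The values 17 and 5 are arbitrary, and the bc step is guarded because bc is assumed to be optional on some installations.

```shell
a=17
b=5

# Arithmetic expansion: done inside the shell, integers only.
sum=$((a + b))
quotient=$((a / b))             # integer division discards the remainder

# expr: every operand and operator is its own argument; * must be escaped.
product=$(expr "$a" \* "$b")

# bc: needed for fractional results. Skipped if bc is not installed.
ratio=""
if command -v bc >/dev/null 2>&1 ; then
    ratio=$(echo "scale=3; $a / $b" | bc)
fi

echo "$sum $quotient $product $ratio"
```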

42 Regular Expression (regex)

This section describes Regular Expressions (REs) which are one very powerful feature of Linux.
In the subsections that follow, “search” is either a single character to be repeated some number of times
or a character list to be repeated some number of times.
A character list item is bounded with square brackets ([ ]).
Within a character list item, there can be a list of actual characters, a range of alphabetic or digit
characters, e.g., A-Za-z0-9, or special characters. Since the hyphen (-) is used to indicate a range in a
character list item, to recognize a literal hyphen it must be first or last in the brackets. If the character
list needs to include a right bracket, it must come first, immediately following the leading left bracket
(or the leading caret). The caret (^) when immediately following the left bracket indicates that the
characters in the character list are not to appear at the current position; any other character will match.
Just as CD-ROMs are Write Once, Read Many (WORM), it has been said that Regular Expressions are
WORN (Write Once, Read NEVER !) As such it is a good idea to provide comments describing what
REs are trying to do when writing scripts, because going back later to try to add comments requires
understanding what the expressions were trying to do. Keeping true to “Don't contradict the code,” if
there are comments to compare with the code and a contradiction is seen, the comments may provide
insight into a Regular Expression error.
That being said, it is very important to thoroughly test REs, with both good and bad input, to ensure
that the RE is working as desired. Once an RE gets into a production environment, it will be hard to
explain that it failed for lack of testing.
To test basic REs, one can start by echoing a few lines and piping them into sed -n -e "/newRE/p"
with the newRE being developed. This will print the lines that match the new RE. Then expand this to
a file of values.
To test Extended REs, start by piping individual lines to egrep (equivalent to grep -E) and then expand
to a file with several values.
Carefully check the output to ensure that the RE sees what it should see and doesn't see what it
shouldn't see.
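The testing approach described above might look like the following sketch. The RE under test (a line made up only of digits) and the sample lines are invented for the example.

```shell
# RE under development: one or more digits and nothing else on the line.
newRE='^[0-9][0-9]*$'

# Feed both good input (123, 007) and bad input (12a, an empty line).
matched=$(printf '%s\n' 123 12a '' 007 | sed -n -e "/$newRE/p")

printf '%s\n' "$matched"
```

Only the lines that the RE should accept are printed, so anything unexpected in the output, or anything missing from it, points to an error in the expression.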

42.1 Regular Expression Anchors

Regular expression anchors are the caret (^) and dollar sign ($).
The caret at the beginning of an RE says the RE is anchored to the beginning of the string.
The dollar sign at the end of an RE says the RE is anchored to the end of the string. So an RE for an
apparently empty line would be "^[ $T]*\$" where T is a tab character, possibly created with
T=$(echo "t" | tr "t" "\t").
Regular Expressions are “greedy,” that is, they find the left-most longest string that matches the RE.
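A sketch of the anchored RE for an apparently empty line. Here the tab is produced with printf rather than tr, which is simply an alternative way to get the same character.

```shell
# T holds a single tab character.
T=$(printf '\t')

# Four input lines: text, a truly empty line, a line of blanks and a tab, text.
blank_count=$(printf 'text\n\n \t \nmore\n' | grep -c "^[ $T]*\$")

echo "$blank_count"
```

Both anchors are required; without them the RE would match every line, since zero blanks can be found anywhere.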

42.2 search*

The asterisk (*) is a Regular Expression (RE) metacharacter that says, “Match zero or more times.”

42.3 search+

The plus sign (+) is an extended RE metacharacter that says, “Match one or more times.”

42.4 search?

The question mark (?) is an extended RE metacharacter that says, “Match zero or one time.”

42.5 search

A search item with no *, + or ? following must match exactly one time.
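The four repetition forms can be compared on the same input, as sketched below with grep -E so that the extended metacharacters + and ? are available. The sample words are made up.

```shell
words='ac
abc
abbc'

star=$(printf '%s\n' "$words" | grep -E -c '^ab*c$')    # b zero or more times
plus=$(printf '%s\n' "$words" | grep -E -c '^ab+c$')    # b one or more times
opt=$(printf '%s\n' "$words" | grep -E -c '^ab?c$')     # b zero or one time
one=$(printf '%s\n' "$words" | grep -E -c '^abc$')      # b exactly once

echo "$star $plus $opt $one"
```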

42.6 “[:alnum:]”

43 expr

The expr program does arithmetic calculations and string calculations.

43.1 Basic Regular Expressions

The expr program uses basic regular expressions for string evaluation, so it will not recognize the
extended RE metacharacters +, ?, |, ( ) and { }. In a basic RE, grouping and repetition counts are
written with backslashes, as \( \) and \{ \}.

44 sed

The sed program is the stream editor. It basically takes input from a file or pipeline and processes
each input line and writes the result to stdout.

44.1 Basic Regular Expressions

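The sed program uses basic regular expressions by default, so it will not recognize the extended RE
metacharacters +, ?, |, ( ) and { }. In a basic RE, grouping and repetition counts are written with
backslashes, as \( \) and \{ \}.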

44.2 Text Manipulation

The sed program takes all parameters, whether from individual command line parameters (-e
expression) or a set of parameters from a file (-f file). It then reads the input and processes the
parameters in sequence. Note that following parameters work on the previous parameter's changes to
the input.
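That later expressions see the changes made by earlier ones can be shown with a two-step sketch; the words are arbitrary.

```shell
# The second expression acts on the output of the first: cat -> dog -> bird.
result=$(echo "cat" | sed -e 's/cat/dog/' -e 's/dog/bird/')

# In the opposite order the first expression finds nothing to change,
# so only one substitution takes effect: cat -> dog.
reversed=$(echo "cat" | sed -e 's/dog/bird/' -e 's/cat/dog/')

echo "$result $reversed"
```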

44.3 Parameter Gotchas

There are a few Regular Expression gotchas. The most important one is that every input character
matched by the RE of a substitute command (s/fromRE/to/) is consumed, and if the global flag (g) is
used, the next attempted match starts after the consumed characters. An example will show it best. In
the answer to Lab 2, if the second sed expression is changed from "s/ \([a-z]\)[a-z]*/\1/g" to
"s/ \([a-z]\)[a-z]* /\1/g" (a space added after [a-z]*), you can see how the output is different. Only the
odd words become a single character and the even words are unchanged.
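The same effect can be reproduced without the lab answer. This sketch uses a shortened, all-lowercase sentence chosen for the example.

```shell
sentence="a rat in the house might"

# Each match is a space plus one word; every word after the first is reduced
# to its first letter.
good=$(echo "$sentence" | sed -e "s/ \([a-z]\)[a-z]*/\1/g")

# With a trailing space in the RE, each match also consumes the space that the
# following word needs in order to match, so every other word is skipped.
bad=$(echo "$sentence" | sed -e "s/ \([a-z]\)[a-z]* /\1/g")

printf '%s\n%s\n' "$good" "$bad"
```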

45 grep

The grep program is the version of grep that assumes basic Regular Expressions.

45.1 Basic Regular Expressions

Only basic Regular Expressions are available.

45.2 Filtering

The grep program is excellent at filtering records to reduce the amount of data one must look at.

45.3 Recursion

46 egrep

The egrep program is the version of grep that assumes extended Regular Expressions.

46.1 Extended Regular Expressions

The egrep program uses extended regular expressions.

46.2 Filtering

The egrep program is excellent at filtering records to reduce the amount of data one must look at.

46.3 Recursion

47 awk

The awk program name is derived from the first letter of the last names of the original program authors
– Aho, Weinberger and Kernighan. The awk program is an interpreter that reads a script, either self-
contained on the command line itself or in a file, and performs the operations as specified. The
program is made to process input files, so little has to be done to read files. It automatically splits up
input records into fields making it easy to look for data, using a dollar sign and a number per field. So
$0 is the entire input line, $1 is the first field, $2 is the second field, etc. Several built-in variables are
maintained as records are read, one being NF, the number of fields. Referencing the last field on each
line is then just looking at $NF. Since awk uses dollar signs freely, scripts that are self-contained on the
command line are generally in single quotes so the dollar signs do not get expanded to variable values.
Linux systems have gawk, the GNU awk program that has several extensions to the original awk
program. awk is sufficiently complex that it should have its own class, so it will not be seriously
described here.

47.1 Extended Regular Expressions

The awk program uses Extended Regular Expressions

47.2 Print Field

One of the more frequent one-line awk commands is awk '{print $1}' file. This says print the first
field from each line of input. Another example is awk -F: '{print $1, $3}' /etc/passwd which
prints the user name and numeric uid from the /etc/passwd file.

47.3 Pattern Matching

Pattern matching is easy to do. One can say awk -F: '/^[a-m]/ {print $1}' /etc/passwd which will
print all userids in /etc/passwd that begin with a letter from “a” through “m”.
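The field and pattern examples can be tried without touching the real /etc/passwd. The three records below are invented sample data in the same format.

```shell
passwd_sample='root:x:0:0:root:/root:/bin/bash
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
zed:x:1001:1001::/home/zed:/bin/bash'

# Print the first and third fields of every record.
names_uids=$(printf '%s\n' "$passwd_sample" | awk -F: '{print $1, $3}')

# Print the first field of records that begin with a letter from a through m.
a_to_m=$(printf '%s\n' "$passwd_sample" | awk -F: '/^[a-m]/ {print $1}')

# $NF is the last field, whatever the number of fields.
shells=$(printf '%s\n' "$passwd_sample" | awk -F: '{print $NF}' | sort -u)

printf '%s\n' "$names_uids" "$a_to_m" "$shells"
```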

47.4 Keyword Matching

This is just a variant of pattern matching.

47.5 Passing Local Variables

Passing local variables to awk is easy. The preferred method is to use -v variable=value on the
command line and then refer to variable in the awk script. For example, name=ftp; awk -F: -v
seek="$name" '$1 == seek {print $0}' /etc/passwd returns
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
The negative example for passing local variables is to mix single and double quotes around the script.
This would make the above example look like this. name=ftp; awk -F: '$1 == "'"$name"'"
{print $0}' /etc/passwd
The inner pair of double quotes is needed so that awk sees a string constant rather than the name of an
awk variable. Some people are lazy and would omit the double quotes around $name which could cause
problems if there is whitespace in $name. Also, if multiple variables are needed, the quoting changes
get increasingly complex and error prone.
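A sketch of the -v method with a value that contains whitespace, which is where the quoting approach tends to break. The sample records are invented, and the match is on the fifth (comment) field.

```shell
passwd_sample='root:x:0:0:root:/root:/bin/bash
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin'

name="FTP User"

# The shell variable is quoted once, on the command line; inside the script
# seek is an ordinary awk variable and needs no quoting at all.
found=$(printf '%s\n' "$passwd_sample" |
    awk -F: -v seek="$name" '$5 == seek {print $1}')

echo "$found"
```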

48 Regular Expression Labs

The following lab exercises should give you a chance to try what you have learned up to now.

48.1 Lab 1

Given a list of names, names="bob carol ted alice", capitalize the first letter of each name and
change the space between names to “ and ” and print the result which should be “Bob and Carol and
Ted and Alice”.

48.2 Lab 2

Given the string, sentence="A Rat in the House might eat the Ice Cream", produce the
word “arithmetic” from the first letter of each word.

48.3 Lab 3

Given the string, sentence="The quick, brown fox jumps over the lazy dog", produce the
number of unique letters.

49 Daemons

Daemons (pronounced like demons) are background jobs that perform services when requested. For
instance, there may be a printer daemon that monitors print queues and, when it sees work, forks another
process to perform that work while the main process goes back to its monitoring role. When the forked
process finishes its task, it can simply exit.

49.1 Non-interactive Processes

Most scripts should be written as non-interactive scripts unless there is a specific need for them to be
interactive. That will make it possible to have the script started by cron, the job scheduler.

50 Services

50.1 Managed Programs

50.2 SystemV Init Scripts

50.3 SystemD Targets

51 Processes

Processes are the way work gets done on a Linux system.

51.1 Active Programs

Active programs are those with at least one process running or eligible to run.

51.2 Running Processes

Running processes are those that are actively using the CPU.

51.3 Sleeping Processes

Sleeping processes are those that request to sleep for a time, e.g., sleep 10 says the current process can
sleep for ten seconds. Only the minimum sleep time is guaranteed: once the sleep time has expired, the
process is added to the queue to request CPU time, so exactly when the process runs again depends on
the system load. Additionally, sleeping processes can be waiting for input or output to complete.

51.4 Zombie Processes

A zombie process is one that has completed but its parent process has not requested its final status.

51.5 Foreground Processes

Foreground processes are those that are started from the command prompt without the ampersand (&)
that says to run the command in the background. If necessary, one can use Ctrl-Z to pause a
foreground process and then type bg to put the paused job in the background.

51.6 Background Processes

Background processes are those that are started from the command line with an ampersand (&).
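In a script, a background process is usually followed up with $! and wait, as in this sketch. The one-second sleep and the exit status of 3 are arbitrary stand-ins for real work.

```shell
# Start a command group in the background; the script continues immediately.
( sleep 1; exit 3 ) &
bg_pid=$!                   # process ID of the most recent background command

# ... other work could be done here while the background process runs ...

# wait blocks until the process finishes and returns its exit status.
bg_status=0
wait "$bg_pid" || bg_status=$?

echo "background process $bg_pid exited with status $bg_status"
```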

52 Jobs

52.1 Processes Attached to a Terminal

52.2 jobs

The jobs command shows the jobs the user is currently running.

53 “ps” (Process Status)

The ps command returns a formatted list of processes. The format of the output can be controlled by
command line options as can the selection of processes. It is important to note that the output returned
from ps is a snapshot. This means that if one wants to follow parent PIDs up the chain to cron or init,
for example, to determine whether a process belongs to a cron job, save the ps output to a file
(ps -ef >/tmp/ps.dat) and then follow the PID chain within the file. This is the only way to guarantee
consistency.

53.1 Process Query

Many times a process query simply starts with the ps command and its output gets filtered to cut down
the amount of data to look at. For example, ps -ef prints information about all processes (-e
everything), and formats the output in a standard way (-f). The output may then be filtered to only
show processes of the user (ps -ef | grep userid).

53.2 “ps -ef” (All Processes)

This prints information about all processes that are running, sleeping, zombie, etc.

53.3 “ps -f -p $$” (The Current Process)

Print a standard one line output describing the current process ($$). This is a good way to check your
shell.

53.4 “ps axo pid,ppid,%cpu,%mem,args”

This ps command shows the process ID, parent pid, percent cpu, percent memory and command with
arguments.

54 Signals

Signals allow communication between processes and the user.

54.1 Sending Messages to Processes

Some of the messages that are sent are to tell an executing process that the user has hung up (SIGHUP),
interrupted the process (SIGINT) or sent a SIGUSR1 or SIGUSR2 signal, while some are generated by
the computer itself and sent to processes, like Illegal Instruction (SIGILL), Floating Point Exception
(SIGFPE), Death of Child (SIGCHLD), etc.
Some background processes watch for a Hang Up signal to tell them to reread their control file. The
process goes something like this. The administrator makes changes to the printer control file and then
sends a SIGHUP to the printer daemon which will then reread its control file and become aware of the
changes that were made.

54.2 kill

The kill program is used to send signals to individual processes. Its name is derived from the fact that
most of the time it is used to kill a process. Two signals cannot be caught or ignored by a process:
SIGKILL (-9) and SIGSTOP.
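A sketch of sending a signal from a script and observing the result. The sleep 30 command is only a stand-in for a long-running process.

```shell
# Start a long-running process in the background and remember its PID.
sleep 30 &
victim=$!

# Ask it to terminate. SIGTERM (15) is what kill sends by default and,
# unlike SIGKILL, it can be caught by the target for cleanup.
kill -TERM "$victim"

# For a process killed by a signal, bash's wait returns 128 plus the
# signal number.
status=0
wait "$victim" || status=$?

echo "exit status: $status"
```

Trying SIGTERM first and reserving SIGKILL for processes that do not respond gives the target a chance to clean up.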

55 Loops

There are several loops available to the shell programmer. They are while, until, for and select. (The
select loop is not covered in this document but you can read about it in the bash man page.)

55.1 while and until Statements

while testlist; do list; done


until testlist; do list; done
The testlist is a command that returns zero or non-zero. The while continues as long as the testlist
returns zero. The until continues as long as the testlist returns non-zero.
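Both loops are sketched below with counters, followed by a while loop whose testlist is the read command rather than a bracket test. All names and values are made up for the example.

```shell
# while: runs as long as the test returns zero.
count=0
while [ "$count" -lt 3 ] ; do
    count=$((count + 1))
done

# until: runs as long as the test returns non-zero.
remaining=3
until [ "$remaining" -eq 0 ] ; do
    remaining=$((remaining - 1))
done

# The testlist can be any command: read returns non-zero at end of input.
total=0
while read -r n ; do
    total=$((total + n))
done <<EOF
1
2
3
EOF

echo "$count $remaining $total"
```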

55.2 for Statements

for name [ [ in [ word ... ] ] ; ] do list ; done


for (( expr1 ; expr2 ; expr3 )) ; do list ; done
for name ; is equivalent to for name in “$@” ; to process the arguments on the command line or
a function call.
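The three forms of the for statement, as a sketch. The count_args function is a hypothetical helper used to show the form with no in clause.

```shell
# for name in word ...
joined=""
for word in alpha beta gamma ; do
    joined="$joined${joined:+,}$word"
done

# Arithmetic form (a bash extension).
sum=0
for (( i = 1 ; i <= 4 ; i++ )) ; do
    sum=$((sum + i))
done

# With no "in" clause the loop walks the positional parameters, here the
# arguments of the function, keeping "a b" together as one argument.
count_args() {
    local n=0
    for arg
    do
        n=$((n + 1))
    done
    echo "$n"
}
nargs=$(count_args "a b" c)

echo "$joined $sum $nargs"
```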

55.3 Instantiation

Indent each for, while and until to the same level as its do and done, each case to the same level as its
esac, and each if to the same level as its elif, else and fi.
Though “then” is part of the if statement, it is more of a noise word, so some prefer to put it following a
semicolon after the right bracket of the test. For example,
for arg
do
    # Stop at the first "--". (A nested until loop on $arg would never
    # end, because $arg does not change inside it.)
    if [ "$arg" = "--" ] ; then
        break
    fi
    case $arg in
        -a) aopt=1;;
        -c) copt=1;;
        -w) wopt=1;;
        *)
            if [ "$arg" = "" ] ; then
                echo "Empty value specified."
            fi
            ;;
    esac
done

56 if Statements

if list; then list; [ elif list; then list; ] ... [ else list; ] fi
If the first condition is very complex and if it is false you need to produce an error, many times it is
easier to not try to invert the first condition, instead simply use the colon (:) after the then and use the
else to produce the error message. For example, assuming we are in a loop,
if [ "$month" = "Nov" -a "$dayOfWeek" = "Tue" \
     -a "$dayOfMonth" -le 7 \
     -a "$timeOfDay" -ge 0700 \
     -a "$timeOfDay" -le 1900 ] ; then
    : "Go Vote"
else
    echo "Polls are closed." >&2
    break
fi
# Continue with voting process...

57 case

case word in [ [(] pattern [ | pattern ] ... ) list ;; ] ... esac


If you have an if elif elif … sequence that are all checking the same variable for a different value, a
case statement is probably more efficient. So, instead of:
if [ "$day" = "Sun" ] ; then sun=1
elif [ "$day" = "Mon" ] ; then mon=1
elif [ "$day" = "Tue" ] ; then tue=1
elif [ "$day" = "Wed" ] ; then wed=1
elif [ "$day" = "Thu" ] ; then thu=1
elif [ "$day" = "Fri" ] ; then fri=1
elif [ "$day" = "Sat" ] ; then sat=1
else echo "Invalid day \"$day\"." >&2 ; fi
use:
case $day in
    Sun) sun=1;;
    Mon) mon=1;;
    Tue) tue=1;;
    Wed) wed=1;;
    Thu) thu=1;;
    Fri) fri=1;;
    Sat) sat=1;;
    *) echo "Invalid day \"$day\"." >&2;;
esac

57.1 Complex Conditions

Complex conditions can occur in both if and case statements.


Remembering our example above that was:
# At least one of rd, wr or ex is non-zero.
msg=""; msep=""
if [ $rd -ne 0 ] ; then msg="$msg${msep}read"; msep=", "; fi
if [ $wr -ne 0 ] ; then msg="$msg${msep}write"; msep=", "; fi
if [ $ex -ne 0 ] ; then msg="$msg${msep}execute"; msep=", "; fi
# Change the final ", name permission"
# to " and name permissions".
msg=$(echo "$msg permission" | sed \
    -e 's/, \([^,][^,]*\)$/ and \1s/' )
echo "File $file is missing $msg." >&2
Knowing that rd, wr and ex are either 0 or 1, using a case statement would look like this:
case $rd$wr$ex in
    000) msg="";; # All permissions are set
    001) msg="executable permission";;
    010) msg="writable permission";;
    011) msg="writable and executable permissions";;
    100) msg="readable permission";;
    101) msg="readable and executable permissions";;
    110) msg="readable and writable permissions";;
    111) msg="readable, writable and executable permissions";;
    *) msg=""; echo "Unexpected rwx \"$rd$wr$ex\"." >&2;;
esac
if [ "$msg" != "" ] ; then
    echo "File $file is missing $msg." >&2
fi
Another example could be something like if all variables are empty, print an error:
if [ "$var1$var2$var3" = "" ] ; then
    echo "error"
fi
which is simpler than:
if [ "$var1" = "" -a "$var2" = "" -a "$var3" = "" ] ; then
    echo "error"
fi

58 Solving Real-World Problems

58.1 Determine Which Systems are Available on a Network


Determine your network and ping every ip address in it using a for loop.
for i in $(seq 1 254);
do ping -c1 192.168.1.$i;
done

58.2 Speed Up Common Tasks


Alias the navigation of a system by using the ~/.bashrc login script:
alias eth='cd /etc/sysconfig/network-scripts'
alias kern='cd /usr/share/doc/kernel-doc-3.10.0/sysctl'
Look through your history to find two more common tasks you can shorten.

58.3 Create 100 Users


Hint: This is done with the newusers command.

58.4 Get a List of Files Accessed in the Last Five Minutes


The find command has a -newer option that selects files modified more recently than the file named
after -newer. We can use touch to give a reference file a timestamp five minutes earlier than now and
name that file after the -newer option. We can use the date command to get the current time and
subtract 5 minutes. We will make the number of minutes a variable so we can easily change it, and we
will recognize -a, -c and -m options for access time, change time or modify time, respectively. We will
give the reference file a unique name: newerfile="/tmp/newer.$$". Then build the touch command:
touch -t $(date "+%Y%m%d%H%M.%S" -d "-$min minutes") "$newerfile", then run find as:
find / -type f -newer "$newerfile" -print ; /bin/rm -f "$newerfile"
Note that -newer compares modification times; GNU find provides -anewer and -cnewer for the access
time and change time comparisons.
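The steps above can be put together as in the following sketch. It assumes GNU date (for the -d option); recent_files is a hypothetical helper name, and it uses mktemp for the reference file instead of the /tmp/newer.$$ name, which is a substitution made for this example. Only the modification-time case is shown.

```shell
# Hypothetical helper: list regular files under DIR modified in the last
# MINUTES minutes.   Usage: recent_files DIR MINUTES
recent_files() {
    local dir=$1 min=$2
    local newerfile
    newerfile=$(mktemp) || return 1

    # Stamp the reference file with the time MINUTES minutes ago (GNU date).
    touch -t "$(date "+%Y%m%d%H%M.%S" -d "-$min minutes")" "$newerfile"

    # Select files modified more recently than the reference file.
    find "$dir" -type f -newer "$newerfile" -print

    rm -f "$newerfile"
}

# Example: files under the current directory modified in the last 5 minutes.
recent_files . 5
```

Searching from / as in the text works the same way but usually needs root privileges and 2>/dev/null to hide permission errors.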

58.5 Unexpire a User’s Account and force them to set a new password

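Use usermod --unlock --expiredate yyyy-mm-dd userid to unlock the account and set a new expiration
date. To force a new password at the next login, follow with chage -d 0 userid.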

58.6 Generate a List of Files by Size and Owner

This comes in handy when a file system partition nears 100% capacity. It allows the administrator to
quickly determine who the heavy users are. Sending an email to those users will usually yield results
by freeing up some space that they no longer need.

The find starts at $dir (the directory that is getting full). $size is a size threshold in bytes,
e.g., size=10000, so small files are not listed. The --time-style option simplifies the date to
yyyy/mm/dd so users can tell whether a file is recent (very old files can sometimes be removed
with little thought). The 2>/dev/null discards error messages from find. The cut keeps the owner,
size, date and filename fields. The sed removes a leading "./" from the front of the filename. The
sort orders by file size (largest first), then filename, then date. An alternate sort,
-k1,1 -k2nr,2 -k4,4 -k3,3, would sort by owner first, then size, filename and date.
find "$dir" -type f -size +${size}c -exec ls -l \
--time-style="+%Y/%m/%d" {} \; 2>/dev/null | \
cut -d" " -f3,5-7 | sed -e "s~ [.]/~ ~" | \
sort -k2nr,2 -k4,4 -k3,3
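As a possible variation, GNU find can print the owner, size, date and filename itself with
-printf, which avoids starting one ls per file and removes the need for cut. The sketch below
assumes GNU find. The function name big_files and the default threshold are illustrative.

```shell
#!/bin/bash
# big_files dir [size]
# List files under dir larger than size bytes as
# "owner size yyyy/mm/dd filename", largest first.
# Requires GNU find for -printf; the name big_files is hypothetical.
big_files() {
    local dir="$1" size="${2:-10000}"
    # %u = owner, %s = size in bytes, %TY/%Tm/%Td = modify date, %p = path
    find "$dir" -type f -size +"${size}"c \
        -printf "%u %s %TY/%Tm/%Td %p\n" 2>/dev/null |
        sed -e "s~ [.]/~ ~" |
        sort -k2nr,2 -k4,4 -k3,3
}
```

For example, big_files /var 1000000 lists the files under /var that are larger than one million
bytes.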

58.7 Generate a List of Home Directories by Size and Owner

This comes in handy when the /home file system partition nears 100% capacity. It allows the
administrator to quickly determine who the heavy users are. Sending an email to those users will
usually yield results by freeing up some space that they no longer need.
# d will match a digit
d="[0-9]"
# nm will match a number
nm="$d${d}*"
# T is a tab character
T=$(echo "t" | tr "t" "\t" )
# ws will match whitespace
ws="[ $T][ $T]*"
# ns will match a non-slash field
ns="[^/][^/]*"
du /home | sed -n -e "/^$nm$ws\/$ns\/$ns\$/p" | sort -k1nr,1 -k2,2
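The pipeline above lists each home directory with its size. The directory name usually matches
the login name, but the owner is not printed. A minimal sketch that adds an owner column follows.
It assumes GNU stat (for -c "%U"), and the function name home_usage is illustrative.

```shell
#!/bin/bash
# home_usage [base]
# Print "sizeKB owner directory" for each directory directly under
# base (default /home), largest first.
# Assumes GNU stat; the name home_usage is hypothetical.
home_usage() {
    local base="${1:-/home}" d
    for d in "$base"/*/
    do
        [ -d "$d" ] || continue
        d="${d%/}"                      # drop the trailing slash
        printf "%s %s %s\n" \
            "$(du -sk "$d" 2>/dev/null | cut -f1)" \
            "$(stat -c "%U" "$d")" \
            "$d"
    done | sort -k1nr,1 -k3,3
}
```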

59 References

Unix Power Tools, Jerry Peek, Tim O'Reilly and Mike Loukides, O'Reilly & Associates
CentOS bash man page.

60 Regular Expression Lab Answers

This section provides one possible answer to each of the three labs above. All three ran
successfully on a CentOS 6.8 system.

60.1 Lab 1 Answer

Given a list of names, names="bob carol ted alice", capitalize the first letter of each name and
add "and" between each name and print the result which should be "Bob and Carol and Ted and Alice".
This is one way to do it:
#!/bin/bash
# Lab 1
# Given a list of names, e.g., names=“bob carol ted alice”,
# capitalize the first letter of each name and change the space
# between each name to " and " and print the result which should
# be "Bob and Carol and Ted and Alice".
# Uses three programs: cut, tr and sed.
names="bob carol ted alice"
output=""
osep=""
# $names is deliberately not quoted,
# so the for loop processes it as
# for name in bob carol ted alice
for name in $names
do
    First=$(echo "$name" | cut -c 1-1 | tr "[a-z]" "[A-Z]" )
    # The sed expression does the following.
    # 1) Keep the first character and the third through the end,
    #    dropping the second character, which is the first
    #    lowercase letter of the original name.
    Name=$(echo "$First$name" | sed -e "s/\(.\).\(.*\)/\1\2/" )
    # Add this name to the output following the previous output
    # value and the output separator.
    output="$output$osep$Name"
    # Change the output separator to space-"and"-space.
    osep=" and "
done
echo "$output"
exit 0
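For comparison, bash 4 and later can capitalize the first letter without cut, tr or sed by using
the ${name^} parameter expansion. This is only a sketch of an alternative and it departs from the
lab's goal of practicing regular expressions.

```shell
#!/bin/bash
# Lab 1 alternative: ${name^} uppercases the first character
# of $name (bash 4.0 or later).
names="bob carol ted alice"
output=""
osep=""
for name in $names
do
    output="$output$osep${name^}"
    osep=" and "
done
echo "$output"    # prints: Bob and Carol and Ted and Alice
```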

60.2 Lab 2 Answer

Given the string, sentence1="A Rat in the House might eat the Ice Cream", produce the
word "arithmetic" from the first letter of each word. This is one way to do it:
#!/bin/bash
# Lab 2
# Given the string, sentence1=”A Rat in the House might eat the
# Ice Cream”, produce the word “arithmetic” from the first letter
# of each word.
# Uses one program: sed.

sentence1="A Rat in the House might eat the Ice Cream"


# The echo puts a space in front of the string so sed expression 2
# processes the first character.
# The sed expressions do the following.
# 1) Change any uppercase letters to lowercase letters.
# (a better way would be to create two variables, e.g.,
# AZ="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# az="abcdefghijklmnopqrstuvwxyz"
# and use "y/$AZ/$az/")
# 2) Capture the first character following a space and replace with
# just the captured character, skipping the space and following
# characters.
echo " $sentence1" | sed \
-e "y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/" \
-e "s/ \([a-z]\)[a-z]*/\1/g"
exit 0
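The same result can be sketched with bash parameter expansion alone, again as a comparison rather
than a replacement for the sed answer. ${w:0:1} takes the first character of a word and ${first,}
lowercases it (bash 4.0 or later). The variable names are illustrative.

```shell
#!/bin/bash
# Lab 2 alternative: build the word from the first letter of each
# word using only bash parameter expansion.
sentence1="A Rat in the House might eat the Ice Cream"
word=""
for w in $sentence1
do
    first="${w:0:1}"         # first character of the word
    word="$word${first,}"    # append it in lowercase
done
echo "$word"    # prints: arithmetic
```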

60.3 Lab 3 Answer

Given the string, sentence2="The quick brown fox jumps over the lazy dog", produce
the number of unique letters. The answer is 26. This is one way to do it:
#!/bin/bash
# Lab 3
# Given the string, sentence2=“The quick brown fox jumps over
# the lazy dog”, produce the number of unique letters. The
# correct answer is 26.
# Uses five programs: sed, tr, sort, tee and grep.

# tmpfile is a temporary file we will use and remove.
# By using a variable, we can easily change the name
# in only one place.
tmpfile="/tmp/lab3Letters"
AZ="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
az="abcdefghijklmnopqrstuvwxyz"
sentence2="The quick brown fox jumps over the lazy dog"
# The first sed expressions do the following.
# 1) Convert all uppercase letters to lowercase.
# 2) Remove all spaces.
# 3) Add a space after each character.
# 4) Remove the extra space following the last character.
# The tr makes a separate line of each letter. This same
# thing can be used with words.
# The sort keeps the unique ones.
# The tee keeps a copy of the sort output for the last step
# that shows how to convert from lines back to a string.
# The grep -c returns the count of lines with a character
# at the beginning of the line.
# The final sed puts a sentence around the number.
echo "$sentence2" | sed \
-e "y/$AZ/$az/" \
-e "s/ //g" \
-e "s/\(.\)/\1 /g" \
-e 's/ $//' | \
tr " " "\n" | \
sort -u | \
tee "$tmpfile" | \
grep -c "^." | \
sed -e "s/\(.*\)/The number of unique letters is \1./"
# Now show how we can go from separate lines to a string
# of characters.
# We need the tr ; echo in a list or subshell because the
# tr removes the final newline, and the echo "" adds one
# back in. Why? Because sed does not like to work with
# a line with no newline.
{ tr "\n" " " <"$tmpfile"; echo ""; } | sed -e "s/ //g"
# This echo shows that there are 26 characters.
echo "123456789.123456789.123456"
/bin/rm -f "$tmpfile"
exit 0
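A shorter pipeline is possible if a temporary file is not wanted. The sketch below assumes a grep
that supports -o (GNU grep does), which prints each match on its own line. The variable name
letters is illustrative.

```shell
#!/bin/bash
# Lab 3 alternative: no temporary file.
# tr lowercases, grep -o puts each letter on its own line,
# sort -u keeps the unique ones, tr -d joins them into one string.
sentence2="The quick brown fox jumps over the lazy dog"
letters=$(echo "$sentence2" | tr "A-Z" "a-z" | grep -o "[a-z]" |
    sort -u | tr -d "\n")
# ${#letters} is the length of the string of unique letters.
echo "The number of unique letters is ${#letters}."
echo "$letters"
```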
