
Lecture 1: Introduction, Basic UNIX

Advanced Programming Techniques Summer 2003

Unix Programming Environment

Objective: To introduce students to the basic features of Unix and the Unix Philosophy (collection of combinable tools and environment that supports their use)

Basic commands
File system
Shell
Filters (wc, grep, sort, awk)


Examples for this lecture are drawn from the UNIX Prog. Env. and AWK books (see lecture outline for full references)

Operating Systems
An Operating System controls (manages) hardware and software.

It provides support for peripherals such as keyboard, mouse, screen, and disk drives.
Software applications use the OS to communicate with peripherals.
The OS typically manages (starts, stops, pauses, etc.) applications.

Unix and Users

Most flavors of Unix (there are many) provide the same set of applications to support humans (commands and shells). Although these user interface programs are not part of the OS directly, they are standardized enough that learning your way around one flavor of Unix is enough.

Flavors of Unix
There are many versions of Unix that are used by lots of people:

SysV (from AT&T)
BSD (from Berkeley)
Solaris (Sun)
IRIX (SGI)
AIX (IBM)
LINUX (free software)

The power of Unix

Open source, portability
You can extend the basic functionality of Unix:

customize the shell and user interface.
string together a series of Unix commands to create new functionality.
create custom commands that do exactly what we want.

Structure of the UNIX system

[Figure: layers of the system, labeled Shell and Kernel (OS)]

There are many standard applications:
file system commands
text editors
compilers
text processing


Kernel (OS)
Interacts directly with the hardware through device drivers
Provides sets of services to programs, insulating these programs from the underlying hardware
Manages memory, controls access, maintains file system, handles interrupts, allocates resources of the computer
Programs interact with the kernel through system calls

Logging In
To log in to a Unix machine you can either:

sit at the console (the computer itself)
access remotely, via SSH, e.g.

The system prompts you for your username and password. Usernames and passwords are case sensitive!

CS Dept. Accounts
See f/csLogin.html
All CS machines (that you have access to) are running Linux

tux machines - the farm; you may connect to it from anywhere
lab machines - any of the desktops you may sit at in the lab or classrooms

Not administered by Drexel IRT

A username is:
(typically) a sequence of alphanumeric characters of length no more than 8.
the primary identifying attribute of your account.
(usually) used as an email address.
The name of your home directory is usually related to your username.

A password is a secret string that only the user knows (not even the system knows!)
When you enter your password the system encrypts it and compares it to a stored string.
Passwords should have at least 6 characters.
It's a good idea to mix case and include numbers and/or special characters (don't use anything that appears in a dictionary!)

Home Directory
The user's personal directory. E.g.,

/home/kschmidt
/home/vzaychik

Where all your files go (hopefully organised into subdirectories)
Mounted from a file server, so it is available (seamlessly) on *any* department machine you log into

Home Directory (cont.)
Your current directory when you log in
cd (by itself) takes you home
Location of many startup and customization files. E.g.:
.vimrc .bashrc .bash_profile .forward .plan .mozilla/ .elm/ .logout

Files and File Names

A file is a basic unit of storage (usually storage on a disk). Every file has a name. Filenames are case-sensitive! Unix file names can contain any characters (although some make it difficult to access the file) except the null character and the slash (/). Unix file names can be long!

how long depends on your specific flavor of Unix

A directory is a special kind of file - Unix uses a directory to hold information about other files. We often think of a directory as a container that holds other files (or directories). A directory is the same idea as a folder on Windows.

More about File Names

Review: every file has a name (at least one). Each file in the same directory must have a unique name.

Files that are in different directories can have the same name.

The Filesystem (eg)




[Figure: example directory tree containing the directories bin, etc, scully and hollid2, and the entries ls, who, X, netprog and unix]

Unix Filesystem
The filesystem is a hierarchical system of organizing files and directories. The top level in the hierarchy is called the "root" and holds all files and directories in the filesystem. The name of the root directory is /

The pathname of a file includes the file name, the name of the directory that holds the file, the name of the directory that holds that directory, and so on, up to the root. The pathname of every file in a given filesystem is unique.

Pathnames (cont.)
To create a pathname you start at the root (so you start with "/"), then follow the path down the hierarchy (including each directory name) and you end with the filename. In between every directory name you put a "/".

Pathname Examples
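[Figure: directory tree with / at the top; below it bin/, etc/, home/, tmp/ and usr/; home/ holds scully/ (containing X) and hollid2/ (containing netprog and unix/); bin/, local/, ls and who also appear in the tree]

The file Syllabus in unix/ has the pathname /home/hollid2/unix/Syllabus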


Absolute Pathnames
The pathnames described in the previous slides start at the root. These pathnames are called "absolute pathnames".
Special absolute:

~kschmidt/ - /home/kschmidt (for users' home directories only)
~/ - your home directory (so, relative to login, $USER)

Relative Pathnames
Prefixed w/the current directory, $PWD
So, relative to the current working directory
$ cd /home/hollid2
$ pwd
/home/hollid2
$ ls unix/Syllabus
unix/Syllabus
$ ls X
ls: X: No such file or directory
$ ls /home/scully/X
/home/scully/X

Special Relative paths

. - the current directory
.. - the parent directory
$ pwd
/home/hollid2
$ ls ./netprog
./netprog
$ ls ../scully
X
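The difference between absolute and relative pathnames can be tried out safely in a scratch area. The following is a minimal sketch, assuming a temporary directory created by mktemp stands in for the root; the directory and file names are borrowed from the slides purely for illustration.

```shell
# Build a small scratch tree (location chosen by mktemp; names are illustrative)
top=$(mktemp -d)
mkdir -p "$top/home/hollid2/unix" "$top/home/scully"
echo "syllabus text" > "$top/home/hollid2/unix/Syllabus"
echo "x file" > "$top/home/scully/X"

cd "$top/home/hollid2"

# Absolute pathname: starts at the root, works from any directory
abs=$(cat "$top/home/hollid2/unix/Syllabus")

# Relative pathname: resolved against the current working directory
rel=$(cat unix/Syllabus)

# . is the current directory, .. is the parent directory
dot=$(cat ./unix/Syllabus)
sibling=$(ls ../scully)

echo "$abs / $rel / $dot / $sibling"

# Clean up the scratch tree
cd /
rm -rf "$top"
```

All three ways of naming Syllabus reach the same file; only the starting point of the lookup differs.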

Disk vs. Filesystem

The entire hierarchy can actually include many disk drives.

some directories can be on other computers

[Figure: the root / with subdirectories bin, etc, users, tmp, usr]



Commands for Traversing Filesystem

ls - lists contents of a directory
    -a all files
    -l long listing
pwd - print working (current) directory
cd - change directory
    w/out argument, takes you home

man Pages
To get information about anything that's been properly installed, use man:
man ls
man cat
man man

Linux boxes also have info pages

The ls command
The ls command displays the names of some files. If you give it the name of a directory as a command line argument it will list all the (unhidden) files in the named directory.

Command Line Options

We can modify the output format of the ls program with a command line option. The ls command supports a bunch of options:

-l long format (include file times, owner and permissions)
-a all (shows hidden* files as well as regular files)
-F include special char to indicate file types.
-C place into columns

*hidden files have names that start with "."

cd - change directory
The cd command can change the current working directory.
The general form is: cd [directoryname]

Viewing files
cat - concatenate, send to stdout. View contents of text files
less, more - paging utilities (hit h for help)
od - octal dump. For viewing raw data in octal, hex, control chars, etc.

Copying, removing, linking

rm remove file
rm ~/tmp/download

mv move (rename) file

mv old.file ../otherDir/

cp copy file
cp someDir/file someDir/file.copy

ln create hard (inode) or soft (symbolic) links to a file

Commands for directories

mkdir - make directory
rmdir - remove directory
Directories can also be moved or renamed (mv), and copied (cp -r)

Commands for Archiving

tar Tape Archive

makes a large file from many files

compression utility

gzip, gunzip

tar on Linux does gzip compression with the z option:
$ tar czf 571back.tgz CS571
$ tar xzf assn1.tgz
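A full round trip (create, list, extract) may make the option letters easier to remember. This is only a sketch: it assumes a tar that supports the z option, and the directory CS571 and the file notes.txt are made-up stand-ins created in a temporary directory.

```shell
# Scratch area; CS571 and notes.txt are illustrative names
work=$(mktemp -d)
cd "$work"
mkdir CS571
echo "lecture notes" > CS571/notes.txt

# c = create, z = gzip compression, f = name of the archive file
tar czf 571back.tgz CS571

# t = list the contents without extracting
listing=$(tar tzf 571back.tgz)

# x = extract; done in a separate directory so the original is untouched
mkdir restore
cd restore
tar xzf ../571back.tgz
restored=$(cat CS571/notes.txt)

echo "$listing"

# Clean up
cd /
rm -rf "$work"
```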

File attributes
Every file has some attributes:

Times:
when the file's status (inode) was last changed
when the file's contents were last modified
when the file was last read

Size
Owners (user and group)
Permissions
Type - directory, link, regular file, etc.

File Time Attributes

Time Attributes:

when the file was last changed: ls -l
sort by modification time: ls -lt

File Owners
Each file is owned by a user. You can find out the username of the file's owner with the -l or -o option to ls:
[jjohnson@ws44 winter]$ ls -l
total 24
drwxr-xr-x 7 jjohnson users   80 Jan 3 2005 cs265/
-rw------- 1 jjohnson users 8258 Jan 3 2005 cs265.html
-rw-r--r-- 1 jjohnson users 8261 Jan 3 2005 cs265.html~

ls -l
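$ ls -l foo
-rw-rw---- 1 hollingd grads 13 Jan 10 23:05 foo

Fields: permissions (-rw-rw----), link count (1), owner (hollingd), group (grads), size (13), time (Jan 10 23:05), name (foo)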



File Permissions
Each file has a set of permissions that control who can mess with the file. There are three types of permissions:

read - abbreviated r
write - abbreviated w
execute - abbreviated x

There are 3 sets of permissions:

1. user
2. group
3. other (the world, everybody else)

ls -l and permissions

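The first character of each ls -l line gives the type of file:
-  plain file
d  directory
l  symbolic link
The next nine characters are three rwx triples, for User, Group and Others.

For a file:
r - allowed to read
w - allowed to write
x - allowed to execute

For a directory:
r - allowed to see the names of the files
w - allowed to add and remove files
x - allowed to enter the directory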


Changing Permissions
The chmod command changes the permissions associated with a file or directory. There are a number of forms of chmod; this is the simplest:
chmod mode file

chmod numeric modes

Consider permission for each set of users (user, group, other) as a 3-bit #

r = 4
w = 2
x = 1

A permission (mode) for all 3 classes is a 3-digit octal #

755 rwxr-xr-x
644 rw-r--r--
700 rwx------
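To check the arithmetic, each octal digit can be applied to a throwaway file and read back with ls -l. A minimal sketch, assuming a temporary directory; the file names are hypothetical.

```shell
# Scratch files with made-up names
work=$(mktemp -d)
cd "$work"
touch script.sh page.html secret.txt

chmod 755 script.sh    # 7 = rwx (4+2+1), 5 = r-x (4+1), 5 = r-x
chmod 644 page.html    # 6 = rw- (4+2),   4 = r--,       4 = r--
chmod 700 secret.txt   # 7 = rwx,         0 = ---,       0 = ---

# The first 10 characters of an ls -l line are the file type
# followed by the 9 permission characters
m755=$(ls -l script.sh | cut -c1-10)
m644=$(ls -l page.html | cut -c1-10)
m700=$(ls -l secret.txt | cut -c1-10)
echo "$m755 $m644 $m700"

# Clean up
cd /
rm -rf "$work"
```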

chmod - examples
$ chmod 700 CS571
$ ls -o Personal
drwx------ 10 kschmidt 4096 Dec 19 2004 CS571/

$ chmod 755 public_html
$ chmod 644 public_html/index.html
$ ls -ao public_html
drwxr-xr-x 16 kschmidt 4096 Jan 8 10:15 .
drwx--x--x 92 kschmidt 8192 Jan 8 13:36 ..
-rw-r--r-- 5 kschmidt 151 Nov 16 19:18 index.html

$ chmod 644 .plan
$ ls -o .plan
-rw-r--r-- 5 kschmidt 151 Nov 16 19:18 .plan

chmod symbolic modes

Can be used to set, add, or remove permissions
Mode has the following form: [ugoa][+-=][rwx]

u user
g group
o other
a all
+ add permission
- remove permission
= set permission

chmod examples
$ ls -al foo
-rwxrwx--x 1 hollingd grads foo
$ chmod g-wx foo
$ ls -al foo
-rwxr----x 1 hollingd grads foo
$ chmod u-r .
$ ls
ls: .: Permission denied
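The three operators (+, - and =) can be compared side by side on one scratch file. This is a sketch under the assumption that a temporary directory is acceptable; foo is just the placeholder name used in the slides.

```shell
work=$(mktemp -d)
cd "$work"
touch foo
chmod 771 foo                      # start from rwxrwx--x, as in the slide
before=$(ls -l foo | cut -c1-10)

chmod g-wx foo                     # - removes: group loses write and execute
after_minus=$(ls -l foo | cut -c1-10)

chmod o=r foo                      # = sets: other becomes exactly r--
after_set=$(ls -l foo | cut -c1-10)

chmod a+x foo                      # + adds: execute for user, group and other
after_plus=$(ls -l foo | cut -c1-10)

echo "$before $after_minus $after_set $after_plus"

# Clean up
cd /
rm -rf "$work"
```

Note that = replaces whatever the class had before, while + and - leave the other bits of that class alone.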

Shell as a user interface

A shell is a command interpreter, an interface between a human (or another program) and the OS

runs a program, perhaps the ls program.

allows you to edit a command line.
can establish alternative sources of input and destinations for output for programs.

Is, itself, just another program

Bourne-again Shell (bash)

We'll teach bash in this course
Extension of the Bourne Shell (sh)
Contains many of the Korn Shell (ksh) extensions
You may use the shell of your choice (tcsh, zsh, etc.), but that's on you.

Session Startup
Once you log in, your shell will be started and it will display a prompt.

(for our examples, we will use $ as the prompt. It is not part of the input)

When the shell is started it looks in your home directory for some customization files.

You can change the shell prompt, your PATH, and a bunch of other things by creating customization files.

Each shell supports some customization.

User prompt
Where to find mail
Shortcuts

The customization takes place in startup files: files that are read by the shell when it starts up

Startup files
sh, ksh: /etc/profile (system defaults), ~/.profile
bash: ~/.bash_profile, ~/.bashrc, ~/.bash_logout
csh: ~/.cshrc, ~/.login, ~/.logout

Incorrect login
You will receive the Password: prompt even if you type an incorrect or nonexistent login name. Can you guess why?

Entering Commands
The shell prints a prompt and waits for you to type in a command. The first token on the line is taken to be a command (for now). Commands come in 2 flavors:

shell builtins - commands that the shell interprets directly.
external programs (utilities) - standalone programs on disk (directories in your $PATH are searched, in order)

Interpreting a Command - type

When a command is seen, the shell:
1. Checks for aliases
2. Checks for user-defined functions
3. Looks for a builtin
4. Checks directories in $PATH for a utility

Use Bash's type builtin to see what the shell is using:

kschmidt@ws60 kschmidt> type echo
echo is a shell builtin
kschmidt@ws60 kschmidt> type chmod
chmod is /bin/chmod

Command Options and Arguments

standardized command syntax (applies to most commands):
command option(s) arguments
options modify the way in which a command works; they are often single letters prefixed with a dash (and can sometimes be combined after a single dash)

Getting help
manual - original Unix help (flat, single page)
$ man who
$ man man
info - 2-d system, emacs-like navigation
$ info who
The resource frame on the class page
Internet - google, wikipedia
The Linux Documentation Project
Safari online
Friends, group-mates, and others

Some simple commands

date - print current date
who - print who is currently logged in
finger usr - more information about usr
ls -ao - lists (long) all files in a directory
du -sh - disk usage summary, human readable
quota

Logging off
exit command

Exits the shell
If it is the login (top-level) shell, then it disconnects you

A shell is just another program that is running.
Can recursively invoke shells
Please don't just disconnect w/out exiting

Standard I/O
When you enter a command the shell creates a subshell to run the process or script. The shell establishes 3 I/O channels:

Standard Input (0) - keyboard
Standard Output (1) - screen
Standard Error (2) - screen

These streams may be redirected to/from a file, or even another command

Programs and Standard I/O

[Figure: a program with Standard Input (STDIN) going in, and Standard Output (STDOUT) and Standard Error (STDERR) coming out]

Terminating Standard Input

If standard input is your keyboard, you can type stuff in that goes to a program. To end the input stream, press Ctrl-D (^D), the end-of-file (EOF) indicator, on a line by itself. The shell is a program that reads from standard input. What happens when you give the shell ^D? (see the bash set command, ignoreeof)

Shell metacharacters
Some characters have special meaning to the shell. These are just a few:

I/O redirection: < > |
wildcards: * ? [ ]
others: & ; $ ! \ ( ) space tab newline

These must be escaped or quoted to inhibit special behavior

* matches 0 or more characters
? matches exactly 1 character
[<list>] matches any single character in <list>
E.g.
ls *.cc - list all C++ source files in directory
ls a* - list all files that start w/a
ls a*.jpeg - list all JPEGs that start w/a
ls * - (make sure you have a subdirectory, and try it)

Wildcards (more examples)

ls file? - matches file1, file2, but not file nor file22
ls file?.*.DEL - matches file1.h.DEL, file3..DEL but not file8.DEL nor file.html.DEL
These are not regular expressions!

Wildcards - classes
[abc] matches any of the enclosed characters
ls T[eE][sS][tT].doc

[a-z] matches any character in a range
ls [a-zA-Z]*

[!abc] matches any character except those
ls [!0-9]*
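The wildcard rules are easiest to see against a known set of file names. The sketch below assumes an empty temporary directory and a handful of made-up file names; echo is used instead of ls so that only the shell's expansion is shown.

```shell
work=$(mktemp -d)
cd "$work"
# Made-up file names chosen to hit and miss each pattern
touch file file1 file2 file22 Test.doc tEsT.doc 9lives notes.cc

q=$(echo file?)                    # exactly one character after "file"
star=$(echo file*)                 # zero or more characters after "file"
class=$(echo T[eE][sS][tT].doc)    # capital T, then either case for e, s, t
neg=$(echo [!0-9]*)                # names that do not start with a digit

echo "file?  -> $q"
echo "file*  -> $star"
echo "class  -> $class"
echo "negate -> $neg"

# Clean up
cd /
rm -rf "$work"
```

Here file? expands to file1 file2 only, while file* also picks up file and file22. The class pattern skips tEsT.doc because its first letter is a lowercase t, and the negated class leaves out 9lives.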

Shell Variables
bash uses shell variables to store information
Shell variables are used to affect the behavior of the shell, and many other programs
We can access these variables:

set new values for some to customize the shell.
find out the value of some to help accomplish a task.

Setting/Viewing Variables
To assign (in sh, ksh, bash):
VAR=someString
OTHER_VAR="I have whitespace"
Note, no whitespace around the =!

To view (dereference) a variable:

$ echo $VAR
someString
$ echo $OTHER_VAR
I have whitespace

Shell maintains some variables

Some common ones:
PATH - list of directories shell searches for non-shell commands
PS1 - primary prompt
USER - user's login name
HOME - user's home directory
PWD - current working directory

Other Useful Ones

SHELL - the login shell
TERM - the type of terminal interface
HISTFILE - where your command history is saved
EDITOR - holds user's preferred editor
HOSTNAME - machine's hostname
SHELLOPTS - status of various shell options (see Bash's set built-in)

Displaying Shell Variables

Prefix the name of a shell variable with "$". The echo command will do:
$ echo $HOME
$ echo $PATH
You can use these variables on any command line:
$ ls -al $HOME

Setting Shell Variables

You can change the value of a shell variable with an assignment command (this is a shell builtin command):
HOME=/etc
PATH=/usr/bin:/usr/etc:/sbin
NEWVAR="blah blah blah"

set command (shell builtin)

The set command with no args prints out a list of all the shell variables.
Some bash options

noclobber - won't let re-direct overwrite an existing file
ignoreeof - shell won't exit on ctrl-D
vi - use vi-style interface
-n - dry-run (just parse, but don't execute). Handy for scripts

Quoting - escape character, \

Use the backslash to inhibit the special meaning of the following character:
$ echo $USER
kschmidt
$ echo \$USER
$USER
$ echo a\\b
a\b

Quoting - double quotes

Double quotes inhibit all behavior except variable substitution, command substitution, and the escape, \
$ echo "$USER is $USER"
kschmidt is kschmidt
$ echo "\$USER is $USER"
$USER is kschmidt
$ echo "I said, \"Wait a moment\""
I said, "Wait a moment"

Quoting - single quotes

Single quotes inhibit nearly all special behavior
May not contain a single quote
$ echo 'I said "Wait!"'
I said "Wait!"
$ echo 'My name is $USER'
My name is $USER
$ mv rambleOnByLedZeppelin 'ramble on led zeppelin'
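The three quoting mechanisms can be compared on the same variable. A small sketch follows; NAME and GREETING are hypothetical variables introduced here so the result does not depend on who is logged in.

```shell
NAME=kschmidt                        # no whitespace around the =
GREETING="I have whitespace"         # quotes keep the value together as one word

plain=$(echo $NAME)                  # unquoted: the variable is expanded
escaped=$(echo \$NAME)               # backslash: the $ loses its special meaning
double=$(echo "user is $NAME")       # double quotes: variables still expand
single=$(echo 'user is $NAME')       # single quotes: the text is taken literally
nested=$(echo "I said, \"Wait a moment\"")   # \" puts a double quote inside double quotes

echo "$plain | $escaped | $double | $single | $nested | $GREETING"
```

A rough rule of thumb: reach for double quotes when the value should still be substituted, and single quotes when the text must pass through untouched.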

Input Redirection
The shell can attach things other than your keyboard to standard input.

A file (the contents of the file are fed to a program as if you typed it). A pipe (the output of another program is fed as input as if you typed it).

Output Redirection
The shell can attach things other than your screen to standard output (or stderr).

A file (the output of a program is stored in file). A pipe (the output of a program is fed as input to another program).

Redirecting stdout
Use > after a command (and its arguments) to send output to a file:
ls > lsout

if lsout previously existed it will be truncated (gone), unless noclobber is set (see bash)

Redirecting stdin
To tell the shell to get standard input from a file, use the < character:
sort < nums
The command above would sort the lines in the file nums and send the result to stdout.

You can do both!

sort < nums > sortednums
tr a-z A-Z < letter > rudeletter

Appending Output
Use >> to append output to a file:
ls /etc >> foo
ls /usr >> foo

Easy way to concatenate files:

cat rest_of_file >> my_file

Redirecting stderr
stderr is file descriptor 2, so:
gcc buggy.c 2> error.log
grep [Vv]era *.html > log 2> errorlog

To send both to the same place (stdout is file descriptor 1):

find . -name 'core*' > core.lis 2>&1

find . -name 'core*' 2> core.lis
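The redirection operators from the last few slides can be exercised together in a scratch directory. This sketch assumes a temporary directory and uses made-up file names; ls on a missing name is used only because it reliably writes one message to stderr.

```shell
work=$(mktemp -d)
cd "$work"
printf '3\n1\n2\n' > nums

sort < nums > sortednums          # stdin from a file, stdout to a file
echo "appended" >> sortednums     # >> appends instead of truncating

# ls reports the missing name on stderr (descriptor 2) and the
# existing name on stdout (descriptor 1); "|| true" ignores its exit status
ls nums no_such_file > out.log 2> err.log || true

# 2>&1 sends stderr to wherever stdout currently points
ls nums no_such_file > both.log 2>&1 || true

sorted=$(cat sortednums)
out_lines=$(wc -l < out.log)
err_lines=$(wc -l < err.log)
both_lines=$(wc -l < both.log)
echo "stdout: $out_lines line, stderr: $err_lines line, combined: $both_lines lines"

# Clean up
cd /
rm -rf "$work"
```

The order matters in the last form: > both.log is processed first, so 2>&1 copies a descriptor that already points at the file.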

Pipes connecting processes

A pipe is a holder for a stream of data. A pipe can be used to hold the output of one program and feed it to the input of another.


Asking for a pipe

Separate 2 commands with the | character. The shell does all the work!
ls -1 | sort
ls -1 | sort > sortedlist
ls -1 | sort | head > top.ten

Process Control
Processes are run in a subshell (by default)
Subshells inherit exported variables
Each process has an ID (pid) and a parent (ppid)
Use the ps utility to look at some processes:
$ ps
PID   TTY   TIME     CMD
350   pts/4 00:00:00 bash
22251 pts/4 00:00:00 vim
22300 pts/4 00:00:00 ps
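The remark about exported variables can be checked by starting a child shell before and after an export. A minimal sketch, in which COURSE is a hypothetical variable name and sh -c stands in for any program the shell launches:

```shell
unset COURSE                      # make sure we start from a clean slate
COURSE=cs571                      # plain shell variable, not exported

# The child shell does not see an unexported variable
seen_before=$(sh -c 'echo "${COURSE:-unset}"')

export COURSE                     # now it is part of the environment

# The same child command now inherits it
seen_after=$(sh -c 'echo "${COURSE:-unset}"')

echo "before export: $seen_before, after export: $seen_after"
```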

Process Control (cont.)

Use the -f option for a long listing:
$ ps -f
UID      PID   PPID C STIME TTY   TIME     CMD
kschmidt 350   349  0 10:06 pts/4 00:00:00 -bash
kschmidt 22251 350  0 17:32 pts/4 00:00:00 vim myHomework
kschmidt 22437 350  0 17:36 pts/4 00:00:00 ps -f

Use the -e option to see more processes (all of them).

$ ps -e | grep xmms
29940 pts/0 00:33:47 xmms

Killing a process (not usually nice)

The kill command sends a signal to a process (the given pid)
By default, it sends TERM (terminate), which asks the process to finish, so that it may do clean-up
Use -9 to send a KILL (won't be ignored), but no cleanup
My mp3 player hangs once in a while:
$ kill -9 29940
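The two signals can be tried on a harmless process. In this sketch sleep 60 is a stand-in for a hung program, and the exit status reported by wait is used to see which signal ended it (common shells report 128 plus the signal number).

```shell
sleep 60 &                        # a stand-in for a long-running program
pid=$!                            # $! holds the pid of the last background process

kill "$pid"                       # default signal is TERM (15)
term_status=0
wait "$pid" || term_status=$?     # collect the exit status of the killed process

sleep 60 &
pid=$!
kill -9 "$pid"                    # KILL (9) cannot be caught or ignored
kill_status=0
wait "$pid" || kill_status=$?

echo "after TERM: $term_status, after KILL: $kill_status"
```

Try TERM first in practice; -9 gives the program no chance to save its state or remove temporary files.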

Job Control
The shell allows you to manage jobs

place jobs in the background
move a job to the foreground
suspend a job
kill a job

Background jobs
If you follow a command line with "&", the shell will run the job in the background.

you don't need to wait for the job to complete, you can type in a new command right away.
you can have a bunch of jobs running at once.
you can do all this within a single terminal (window).
ls -lR > saved_ls &

Listing jobs
The command jobs will list all background jobs:
$ ls -lR > saved_ls &
$ jobs
[1] Running ls -lR > saved_ls &

The shell assigns a number to each job (this one is job number 1).

Suspending and Resuming the Foreground Job

You can suspend the foreground job by pressing ^Z (Ctrl-Z).

Suspend means the job is stopped, but not dead. The job will show up in the jobs output.

Use fg to bring a job back to the foreground. You give fg a job number (as reported by the jobs command) preceded by a %.

Without an argument, fg brings the last job forward

$ jobs
[1] Stopped ls -lR > saved_ls &
$ fg %1
ls -lR > saved_ls

Placing a suspended job in the background

If it's in the foreground, suspend it
Use bg, just as you did fg, to let a suspended job continue in the background:
$ bg %3

Killing a job
kill may also take a job number or even a job name, introduced by %:
$ find . -name core\* -print > corefiles &
$ firefox &
$ jobs
[1]- Running find . -name core\* -print > corefiles &
[2]+ Running firefox &
$ kill %2

A text editor is used to create and modify text files. The most commonly used editors in the Unix community:

vi (vim on Linux)
$ vimtutor

emacs
$ emacs
Then, hit ctrl-h t (that's control-h, followed by t)

You must learn at least one of these editors

The Unix Philosophy

Stringing small utilities together with pipes and redirection to accomplish non-trivial tasks easily
E.g., find the 3 largest subdirectories:
$ du -s * | sort -nr | head -n3
120180 Files
22652 Zaychik
9472 tweedledee.tgz

Filters are programs that read some input (but don't change it), perform a simple transformation on it, and write some output (to stdout)
Some common filters

wc - word count (line count, character count)
tr - translate
grep, egrep - search files using regular expressions
sort - sorts files by line (lexically or numerically)
cut - select portions of a line
uniq - removes identical adjacent lines
head, tail - displays first (last) n lines of a file

pipes and combining filters

Connect the output of one command to the input of another command to obtain a composition of filters
who | wc -l
ls | sort -f
ls -s | sort -n
ls -l | sort -nr -k4
ls -l | grep ^d
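To see a composition of filters at work without depending on who is logged in, the pipeline can be fed fixed input. The login names below are made-up sample data, not output from a real system.

```shell
# Sample input: one login name per line (made-up data)
logins=$(printf 'alice\nbob\nalice\ncarol\nbob\nalice\n')

# wc -l counts the lines
total=$(printf '%s\n' "$logins" | wc -l)

# sort groups identical lines, uniq -c counts each group,
# sort -nr puts the largest count first, head -n1 keeps the top line
top=$(printf '%s\n' "$logins" | sort | uniq -c | sort -nr | head -n1)

# tr and cut are filters too: upper-case, keep the first 3 characters,
# then sort | uniq to drop the duplicates
short=$(printf '%s\n' "$logins" | tr 'a-z' 'A-Z' | cut -c1-3 | sort | uniq)

echo "lines: $total"
echo "most frequent: $top"
echo "$short"
```

uniq only removes adjacent duplicates, which is why sort comes before it in both pipelines.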