
Essentials for Scientific Computing: Bash Shell Scripting Day 4

Ershaad Ahamed TUE-CMS, JNCASR

May 2012

Glob Expressions

Recall that in the example above, our script accepted a list of filenames on the command line, which the shell stored in $@. $@ was then used as the list of values over which the for loop iterates, assigning each value to filename in turn, one per repetition. Suppose we need to modify our script so that the for loop iterates over all files in the current working directory whose names end with txt. Our script will be

    #!/bin/bash
    for filename in *txt
    do
        sort "$filename" | uniq
    done

What happens in this case is that the shell expands *txt into a list of all files in the current directory whose names consist of zero or more characters followed by the literal string txt. Suppose the output of the ls command for the directory in which we run the script looks like

    contxt  fruits.txt  txt  txtfile.dat  vegetables.txt

When you execute the script, the shell interprets *txt and looks for any files in the current working directory that have zero or more characters in the name followed by txt. In our case that is the following list.

    contxt  fruits.txt  txt  vegetables.txt

The shell then replaces the expression *txt with this list. Therefore the for line in the script is effectively replaced with

    for filename in contxt fruits.txt txt vegetables.txt

The expression *txt is called a glob expression. Within glob expressions, the characters *, ? and [] have special meanings. You already know what * stands for. A ? matches any single character. The glob expression r??l will match the pathnames reel, real, roll, or even r12l.

Suppose you need to match a single character as ? does, but want to restrict which characters it matches. You can use the expression []. For example, if you only needed pathnames beginning with r and ending with l, with a single digit in between, you would use r[0123456789]l. Here [0123456789] matches any one of the enclosed characters. Thus it will match the pathnames r2l and r9l, but not r93l. To match any character other than those enclosed, negate the list by placing a ! as the first character. Thus to match rol and r l but not r9l, you would use r[!0123456789]l. Bash also supports character ranges, so the above can be written more conveniently as r[!0-9]l. Ranges may also be [a-z] for lowercase characters, [A-Z] for uppercase characters, or any subset such as [M-Q] or [4-7].

What happens if a glob expression does not match any pathnames? In that case, by default, the shell leaves the glob expression unexpanded, and the characters *, ? and [] are treated literally.
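The matching rules above can be tried out safely in a scratch directory. The following is a minimal sketch, assuming the mktemp command is available; the variable demo_dir and the pattern *.csv are made up for illustration, while the file names are the ones used in the text.

```shell
#!/bin/bash
# Work in a throwaway directory so that no real files are touched.
# demo_dir is a hypothetical name chosen for this sketch.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT      # remove the directory when the script exits
cd "$demo_dir" || exit 1

# Recreate the file names used in the text as empty files.
touch contxt fruits.txt txt txtfile.dat vegetables.txt
touch reel r2l r9l r93l rol

echo *txt        # contxt fruits.txt txt vegetables.txt
echo r??l        # r93l reel   (exactly two characters between r and l)
echo r[0-9]l     # r2l r9l     (exactly one digit between r and l)
echo r[!0-9]l    # rol         (exactly one non-digit between r and l)
echo *.csv       # *.csv       (nothing matches, so the pattern stays literal)
```

The last line matters for loops: if nothing matches, a loop such as for filename in *txt runs once with filename set to the literal string *txt. A common guard is to start the loop body with [ -e "$filename" ] || continue, which skips names that do not exist.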

Command Substitution

In command substitution, the shell replaces an expression with the output of a command. Consider the seq command. seq 10 produces the output

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10

As an example, suppose you had a directory with files suffixed with numbers, like dataset1, dataset23, etc., and you wanted to loop over the files dataset1 through dataset10. You could use the following for loop.

    for i in $(seq 10)
    do
        echo dataset$i
    done

The output will be

    dataset1
    dataset2
    dataset3
    dataset4
    dataset5
    dataset6
    dataset7
    dataset8
    dataset9
    dataset10

What happens here is that the shell interprets the expression enclosed in $() as a command. The command (or pipeline) is executed, and its output is substituted in place of the expression. So the effective for line above, after substitution, is

    for i in 1 2 3 4 5 6 7 8 9 10
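Two further points about command substitution are easy to check by hand: a whole pipeline can go inside $(), and quoting decides whether the substituted output is split into separate words. The following is a small sketch; the variable names last and first_three are made up for illustration.

```shell
#!/bin/bash
# A pipeline inside $() behaves like a single command:
# only the final output of the pipeline is substituted.
last=$(seq 10 | tail -n 1)
echo "last value: $last"        # last value: 10

# Store the output of seq 3 (three lines: 1, 2, 3) in a variable.
first_three=$(seq 3)

# Unquoted, the stored output is split into words on whitespace,
# so the loop body runs three times.
for i in $first_three
do
    echo "dataset$i"
done

# Quoted, the output stays a single string with its newlines intact.
echo "$first_three"
```

The for loop in the text relies on the unquoted behaviour: $(seq 10) has to be split into ten separate words for the loop to run ten times.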

Scripting on the Command Line

Earlier we mentioned that the bash shell is a complete programming environment. You have seen some of the most commonly used constructs in the examples above. Until now, we have been creating scripts using an editor and saving them in a file to execute later. The good news is that the bash shell provides access to the complete programming environment at the command line itself. That is, you can type any of the scripting examples above right at the command line with few or no modifications and it will be executed. This is usually as simple as replacing each newline with the ; character and typing the script as a single line. The exception is directly after a keyword such as do, which is followed by a space rather than a ; character. For example, this script

    #!/bin/bash
    for filename in *txt
    do
        sort "$filename" | uniq
    done

need not be stored in a file and can be executed at the command line by typing

    for filename in *txt ; do sort "$filename" | uniq ; done

Notice that we omitted the line #!/bin/bash since we're not executing a script file anymore, and that we joined the lines into a single command by separating them with the ; character. Scripting is generally done this way when the script is too short to be worth the effort of writing it in a script file, and when you're probably going to execute it only once and will not need it again.

Certain elements of shell programming, like redirection and glob expressions, are commonly used in everyday work at the command line.
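To check that the two forms really are equivalent, one can capture the output of each and compare. This is only a sketch, assuming mktemp is available; the variable names and the contents written to fruits.txt are made up for illustration.

```shell
#!/bin/bash
# Scratch directory with one small input file (contents are made up).
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT      # remove the directory when the script exits
cd "$demo_dir" || exit 1
printf 'pear\napple\npear\n' > fruits.txt

# Script form: one construct per line.
multi_line=$(
for filename in *txt
do
    sort "$filename" | uniq
done
)

# Command-line form: newlines replaced by ; (but no ; directly after do).
one_line=$(for filename in *txt ; do sort "$filename" | uniq ; done)

echo "$one_line"
[ "$multi_line" = "$one_line" ] && echo "both forms give the same output"
```

With the input above, both forms print apple and pear on separate lines, since sort orders the lines and uniq removes the repeated pear.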



When writing scripts out on the command line, remember that the commands are executed by your current shell and not in a subshell (a separate shell process), which is the case when a script file is executed. This means that there may be side effects on your current shell session. For example, a cd command will change the current working directory of the shell session, and any variables that are created will still be available after the prompt returns (like $filename in the last example; type echo $filename to see it). This also means that you should be careful before assigning new values to any variable, because that variable may already exist and contain some value in the current shell session. Assigning to it will overwrite the previously stored value.
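The difference can be seen by running the same commands twice, once in a child bash process and once directly. In the sketch below, the script's own shell stands in for your interactive session, bash -c stands in for executing a script file, and the file names a.txt and b.txt are placeholders that need not exist.

```shell
#!/bin/bash
start_dir=$PWD
filename="important value"

# 1. Child process, as when a script file is executed: the cd and the
#    loop variable exist only inside the child and vanish when it exits.
bash -c 'cd / ; for filename in a.txt b.txt ; do : ; done'
dir_after_child=$PWD
name_after_child=$filename
echo "after child shell:   dir=$dir_after_child filename=$name_after_child"

# 2. The same commands run directly: they act on the current shell, so
#    the directory changes and filename is overwritten with the last
#    value of the loop (b.txt).
cd / ; for filename in a.txt b.txt ; do : ; done
dir_after_direct=$PWD
name_after_direct=$filename
echo "after direct typing: dir=$dir_after_direct filename=$name_after_direct"

# Return to where we started.
cd "$start_dir" || exit 1
```

If a one-off command line needs to cd or set variables without affecting the session, wrapping it in parentheses, as in ( cd / ; ls ), runs it in a subshell and leaves the current shell untouched.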