Geophysical Computing

L01 – Introduction to the Unix OS
L02 – Awk, Cut, Paste, and Join
L03 – C Shell Scripting – Part 1
L04 – C Shell Scripting – Part 2
L05 – Generic Mapping Tools (GMT) – Part 1
L06 – Generic Mapping Tools (GMT) – Part 2
L07 – Generic Mapping Tools (GMT) – Part 3
L08 – Generic Mapping Tools (GMT) – Part 4
L09 – Fortran Programming – Part 1
L10 – Fortran Programming – Part 2
L11 – Fortran Programming – Part 3
L12 – Fortran Programming – Part 4
L13 – Supercomputing – Part 1
L14 – Supercomputing – Part 2
L15 – POV-Ray – Part 1
L16 – POV-Ray – Part 2
L17 – POV-Ray – Part 3
L18 – Finalizing Illustrations for Publication

L01 – Introduction to the Unix OS

1. What is Unix?

Unix is an operating system (OS): it manages the way the computer works by driving the processor, memory, disk drives, keyboards, video monitors, etc., and by performing useful tasks for the users. Unix was created in the late 1960s as a multiuser, multitasking system for use by programmers; its original purpose was to facilitate software development. The philosophy behind its design was to provide simple yet powerful utilities that could be pieced together in a flexible manner to perform a wide variety of tasks.

A key difference between Unix and other operating systems you may be familiar with (e.g., on a PC) is that Unix is designed for multiple users. That is, several users may each have several tasks running simultaneously. It is the primary OS used by physical scientists everywhere, and all supercomputing facilities use it. To put it bluntly, if you are at all on the numerical side of the physical sciences, then you need to learn how to operate on a Unix OS.

In this class we are actually using a Linux OS. What is Linux? Basically the same thing as Unix, except that Linux is developed by user contributions. Several flavors have arisen (Red Hat, SUSE, Fedora, etc.), but they are all basically the same thing. What you can do in Unix you can do in Linux (the converse isn't necessarily true). In this class I refer to Unix and Linux interchangeably: if I say Unix I mean Linux, and most of what I say is applicable to both.

The main differences between the two are:

(1) Unix development has corporate support. This means it tends to be a more stable OS and is the choice of those for whom stability is the top priority.
(2) Linux is developed by a community of users and is free. Thus, you get what you pay for? Well, it has some stability issues and bugs creep up. But the bugs are also quickly squashed, and new content, programs, and functionality have quickly outpaced those of Unix.

I'm not going to go into detail about what the Unix/Linux OS is composed of, but there are 3 basic entities:

1) The Kernel – The core of the UNIX system. It is loaded at system start-up (boot) and manages all of the resources of the system. Examples of what it does are: interpreting and executing instructions from the shell, managing the machine's memory and allocating it to processes, and scheduling the work done by the CPUs.
2) The Shell – Whenever you log in to a Unix system you are placed in a shell program. The shell is a command interpreter; it takes each command and passes it to the operating system kernel to be acted upon. It then displays the results of this operation on your screen. Several shells are usually available on any Unix system, each with its own strengths and weaknesses. Examples are the Bourne Shell (sh), C Shell (csh), and Bourne Again Shell (bash).
3) Utilities – UNIX provides several hundred utility programs, often referred to as commands. The commands accomplish universal functions such as printing, editing files, etc.

2. Logging into the Unix side of things
To log into the Linux side of the FASB computers, enter your username and password and, before hitting return, select: Session > GNOME

3. Getting started – really basic Unix
Now that you've logged in and opened up a terminal, you are looking at a window that starts out in your home directory. In case you are already confused, on Unix systems we refer to folders as directories.

Your Home Directory

• Each user has a unique home directory. Your home directory is that part of the file system reserved for your files. After login, you are put into your home directory automatically. This is where you start your work.
• You are in control of your home directory and the files which reside there. You are also in control of the file access permissions to the files in your home directory. Generally, you alone should be able to create/delete/modify files in your home directory. Others may have permission to read or execute your files as you determine.
• In most UNIX systems, you can move around or navigate to other parts of the file system outside of your home directory. This depends upon how the file permissions have been set by others and/or the System Administrator.

Unix Commands

Unix commands are programs that are supplied with the Unix OS to do specific tasks. They generally take the form:

>> command arguments

Unlike on your PC or Mac, instead of clicking a program icon, you type a program name in the terminal window. For example, type the following:

>> date

date is an example of a Unix command. When used as above it simply returns the current date and time. But we can often supply arguments to a command that modify the way the program works. For example:

>> date --date=yesterday

Here we supplied an argument asking date to return yesterday's date instead of today's.

One of the most important Unix commands is the ls (list) command. It lists the contents of the directory you are currently in. Try the following:

>> ls
>> ls -l
>> ls -la

We can create a new directory with the mkdir (make directory) command. Try:

>> mkdir garbage

Now entering the ls command should show us that we have a new directory called garbage. We can go into this directory by using the cd (change directory) command:

>> cd garbage

To move back out of the garbage directory into the directory above it type:

>> cd ../

Note that we can go back multiple directories if we want to:

>> cd ../../../ (etc.)

Here .. always stands for the parent directory, i.e., the directory one level above the one you are in. After moving around directories it can get confusing as to where you are. So use pwd (print working directory):

>> pwd

to see where you are. You can always go right back to your home directory by typing either:

>> cd ~username

or just

>> cd ~

or even just

>> cd

The primary reason to use the tilde (~) is so that we can refer to directories starting from our home directory (e.g., >> cd ~/Utilities/ if the Utilities directory were located in my home directory).

Perhaps you're sick of your directory called garbage. You can get rid of it with rmdir (remove directory), which only removes directories that are empty:

>> rmdir garbage

We can also make files. We will talk more about this later, but for now let's just try the following:

>> echo "I love geophysics" > geophys.txt

>> echo "I really love geophysics" >> geophys.txt

The echo command just echoes whatever you write. In the first case the > character redirected "I love geophysics" into a text file called geophys.txt. In the second case the >> that follows the quoted text (not to be confused with the >> prompt at the start of the line) appended a second line to the end of that file.

Perhaps I wasn't happy with the file name I created and wanted it to be named Geophys.txt (note that we are working in the C Shell and that file names are case sensitive). Then I could use the mv (move) command:

>> mv geophys.txt Geophys.txt

Or maybe I wanted another copy of the file Geophys.txt called Geo_copy.txt:

>> cp Geophys.txt Geo_copy.txt

You can guess already that the cp command means copy.

As you can tell, there are a lot of Unix commands. The examples shown above are some of the most important, but are really just the tip of the iceberg. The following web page shows some of the most important basic commands:

http://mally.stanford.edu/~sr/computing/basic-unix.html
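The commands introduced so far can be strung together into one short session. The following is only a sketch: it works in a throwaway directory created with mktemp (an assumption of mine, so that nothing in your home directory is touched), and the variable names workdir and line_count are made up for the illustration.

```shell
# Work in a scratch directory so nothing in your home directory is changed.
workdir=$(mktemp -d)
cd "$workdir"

mkdir garbage                                   # make a directory
cd garbage                                      # step into it
echo "I love geophysics" > geophys.txt          # > creates (or overwrites) the file
echo "I really love geophysics" >> geophys.txt  # >> appends a second line
mv geophys.txt Geophys.txt                      # rename the file
cp Geophys.txt Geo_copy.txt                     # make a copy
ls -l                                           # list what we now have

# Count the lines in the copy (tr strips the padding some systems add).
line_count=$(wc -l < Geo_copy.txt | tr -d ' ')
echo "Geo_copy.txt has $line_count lines"

cd ..                                           # back up to the parent directory
```

Because >> appends rather than overwrites, the copy should contain both lines of text.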

Special Characters

There are also some very special characters that you can type in Unix. The next table shows just a few of them:

Character   Function
*           wildcard for any number of characters in a filename
?           wildcard for a single character in a filename
$           references a variable
&           executes a command in the background
;           separates commands on the same line
>           redirects standard output
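To make the table a little more concrete, here is a small sketch that exercises each character once. It is written in sh syntax so it can be pasted into most shells; the file names and variable names are made up, and the one place where the C Shell differs (setting a variable) is flagged in a comment.

```shell
# Scratch directory with a few empty, made-up files to match against.
workdir=$(mktemp -d)
cd "$workdir"
touch Geophys.txt Geo_copy.txt notes.txt a1.dat a2.dat a10.dat

star=$(echo Geo*)           # * matches any number of characters
question=$(echo a?.dat)     # ? matches exactly one character, so a10.dat is skipped
name=Geophys                # sh syntax; in the C Shell this is: set name = Geophys
dollar=$(echo $name.txt)    # $ substitutes the value of the variable
both=$(echo one; echo two)  # ; runs two commands one after the other
echo "$star" > matches.txt  # > sends the output to a file instead of the screen
sleep 1 &                   # & starts the command in the background
wait                        # pause here until the background job is done
```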

From the earlier mv and cp example we should still have two files around named Geophys.txt and Geo_copy.txt. What if I want to see all of the files I have that have a name starting with Geo? I can use the * character:

>> ls Geo*

Or just files with the word copy in them?

>> ls *copy*

We will introduce more of these characters later.

Getting Information on Commands

Most Unix commands have several options and can be used in a variety of ways. To get full instructions on a Unix command there is the man (manual) utility. For example, to see all of the ways you can use the ls command type:

>> man ls

Logging Off the System

To log off the system select the Red Hat icon in the lower left-hand corner of the screen. Choose the Log Out option.

4. Editing Files
One of the most important choices you will make in learning Unix is which text editor to use. This is likely not a question many of you anticipated, as on Windows or a Mac one rarely ever uses a text editor, unless you count Microsoft Word (it can edit text, but how many times do you store data in .txt format?). On a Unix system there are several choices of editors. People's defense of their choice of editor is similar to a religious conviction, so be careful about talking badly about other editors. Two of the most popular choices of editors are:

• vi – (pronounced vee-eye) one of the earliest advanced editors; it was installed on every system. "vi or die" was a common expression, as it was the only editor found on many systems.
• emacs – a more recent editor; many people prefer this one, and it can now be found as commonly as vi. Emacs has many more commands than vi.

You can use whatever text editor you choose. The choice is yours, but you are responsible for learning one on your own. Learning to use an editor is not optional, though: it is mandatory if you want to be successful in computation. I have heard not knowing an advanced editor described as "being like a car without an engine under its hood."

I personally like vi, which is reportedly rather difficult to learn at first. But I can maneuver around a file in vi way faster than in any other editor, so here's a super fast intro. To create a new file, or open an old file, type:

>> vi myfile.txt

The key thing to remember is that vi has two modes: command and insert. When you are in command mode, everything you type on the keyboard gets interpreted by vi as a command. Command mode is the mode you start out in. Now that you are in your file, enter insert mode by hitting the i (for insert) key. You should notice that at the bottom of the screen it now says you are in -- INSERT -- mode. Now anything you type shows up on the screen. When you are in insert mode, you can switch back to command mode by pressing the Esc key on your keyboard.

When you are in command mode, there are many keys you can use to get into insert mode; each one gives you a slightly different way of starting to type your text. In addition to i for insert there is also a for append, o for open a line, etc.

To wrap up a vi session, hit the Esc key to get back into command mode. Now you save the file and exit by hitting Shift+ZZ (hold Shift and press z twice).

A good tutorial can be found here:
http://www.rru.com/~meo/useful/vi/vi.intro.html

A vi reference card is located here:
http://limestone.truman.edu/~dbindner/mirror/vi-ref.pdf

5. A few more important commands
Now that we know how to create files, what are the basic ways we can access their contents? To start out, everyone create a file called temp.txt:

>> vi temp.txt

Hit the i key for insert and type some words. For example, here are some nice words from Edward Abbey's famous book Desert Solitaire:

"The love of wilderness is more than a hunger for what is always beyond reach; it is also an expression of loyalty to the earth, the earth which bore us and sustains us, the only home we shall ever know, the only paradise we ever need – if only we had the eyes to see. Original sin, the true original sin, is the blind destruction for the sake of greed of this natural paradise which lies all around us – if only we were worthy of it."

Now hit the Esc key and save the file by hitting Shift+ZZ. If we do an ls we can see that our file temp.txt now exists. But this doesn't tell us anything about what is in the file.

Viewing Files:

There are several ways we can see the contents of the file. Try the following commands:

>> cat temp.txt
>> less temp.txt

So, what was the difference between the two commands? Now try the following:

>> head -1 temp.txt
>> tail -1 temp.txt

These commands are obviously useful if you want to see the top or bottom of a file.

What if we want to know something about the file, like how many words it contains? Look up the man page on wc (word count) and find out how you can determine (a) how many words the file contains, (b) how many characters the file contains, and (c) how many lines the file contains.

(a)
(b)
(c)

Another really useful command is grep. This allows us to search files to find specific instances of words. For example, let's find just the lines in temp.txt that contain the word paradise:

>> grep paradise temp.txt

Grep is really useful when I'm searching for something specific in a lot of files.
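As a sketch of what searching "a lot of files" can look like, the block below builds two small made-up files and tries a few standard grep flags (-n, -l, -c, -i). The file names and their contents are invented for the illustration.

```shell
# Two small made-up files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
printf 'the only paradise we ever need\nOriginal sin\n' > abbey.txt
printf 'shear waves\ncompressional waves\n' > waves.txt

numbered=$(grep -n paradise abbey.txt)  # -n prefixes each match with its line number
which_files=$(grep -l waves *.txt)      # -l prints only the names of files that match
count=$(grep -c waves waves.txt)        # -c counts the matching lines
nocase=$(grep -i original abbey.txt)    # -i ignores upper/lower case

echo "$numbered"
echo "$which_files"
```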

Zipping and Unzipping Files:

It's important to understand this, as most of the files you will download off the webpage for this course are zipped. There are, as is typical, many choices of zipping utilities. Generally we use the utility called gzip. To zip up, or compress, our file temp.txt simply type:

>> gzip temp.txt

Now if you do an ls you will see the filename has changed to temp.txt.gz. Note that the .gz extension will often be found on files you get from me. This means they have been compressed with gzip. What happens if you try to view the contents of this file with cat temp.txt.gz? Right, to view its contents we need to unzip or uncompress it:

>> gunzip temp.txt.gz

You will also notice that most of the files you download from me have the .tar extension. The name tar stands for tape archive (tapes are still in use for backups today!). Usually we use tar to lump a group of files together into one single file. Then we only need to send one file and not a bunch. To see how tar works let's do as follows:

>> cp temp.txt temp_copy.txt
>> mkdir TempFiles
>> mv temp*.txt TempFiles
>> tar cvf TempFiles.tar TempFiles

Now you will notice there is a file called TempFiles.tar. The tar command above used the flags c (create), v (verbose), and f (the archive file name follows) to create a file called TempFiles.tar from the directory TempFiles and its contents. Most of the files you download from the webpage will have the .tar extension. To unpack these files:

>> tar xvf TempFiles.tar

where now we used the x (extract) flag.

Job status:

It is also useful to see what is currently running on the computer you are using. The quickest way to do this is to use the top utility. Just type:

>> top

But you can get more specific information using the ps (processes) utility. E.g., to see what programs you personally are running type:

>> ps -u username

where you fill in username with your personal username. This is especially useful if you've started a bunch of jobs, or maybe someone else did on your computer, and it's eating up the CPU or memory.

Notice that all jobs have a number associated with them under the column PID (Process ID number). This number is important. Don't actually do this now, but in the event that you absolutely need to stop something that is running you can do this with the kill command:

>> kill -9 PID

This will force the process, whatever it is, to be stopped. So only use this if you absolutely need to stop the job and you know what the job is.
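If you would like to see the PID bookkeeping in action without endangering anything important, a harmless sleep job makes a good guinea pig. This is only a sketch in sh syntax: it first sends the default (polite) signal with plain kill, and leaves kill -9 as the last resort described above. The variable names pid and still_running are made up.

```shell
sleep 30 &                 # start a harmless long-running job in the background
pid=$!                     # $! holds the PID of the most recent background job

# Show just that one process (ps may be missing on very minimal systems).
ps -p "$pid" || echo "ps could not report on process $pid"

kill "$pid"                        # polite request to stop (the default TERM signal)
wait "$pid" 2>/dev/null || true    # collect the job so it is really gone

# kill -0 sends no signal at all; it only checks whether the process exists.
if kill -0 "$pid" 2>/dev/null; then
    still_running=yes              # only now would kill -9 be worth considering
else
    still_running=no
fi
echo "still running: $still_running"
```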

6. Customizing your environment – the .cshrc file
Before we wrap up this intro let's talk about your C Shell resource file, or .cshrc (some people also call it a C Shark file, but that drives me crazy so please don't). This file is really important because the C Shell reads it every time you log in and every time you open up a new terminal window. This file lives in your home directory, so change directories to your home directory and let's look at its contents:

>> cd ~
>> less .cshrc

There are two main things I want to point out in this file: (1) your search path, and (2) aliases.

Search Path

When you type a Unix command at the command prompt (e.g., ls or mkdir) the shell looks for a program with that name. Things like ls or mkdir are programs that reside in a directory somewhere. For example, if you want to find out where the ls command lives type:

>> which ls

So, on the computer I am working on as I write this document, I see that ls is located at /usr/bin/ls. In other words, it lives in the directory /usr/bin/. For the shell to be able to execute the ls command, that directory has to be in the search path. That is, Unix has a special variable called PATH that contains a collection of directory names to search through for commands that are typed. To see what directories are currently being searched for you type:

>> echo $PATH

OK, but what if I make a program that I want Unix to be able to use (and believe me you will!)? The most common thing to do is to create a special directory where you will store your personal programs, then add that directory name to the search path. I put all of my personal programs into a directory called Utilities/bin. So, you could do the same:

>> mkdir Utilities
>> cd Utilities
>> mkdir bin

Now we need to add this directory to the search path. We do this by adding a line to our .cshrc file:

>> cd ~
>> vi .cshrc

Now go down somewhere near the bottom of the file and insert:

set path = ($path ~/Utilities/bin)

(In the C Shell the lowercase variable path is a list of directories that the shell keeps in sync with the PATH environment variable.) Save the file with the Esc then Shift+ZZ sequence. After you've saved the file, we need to tell the shell to re-read our .cshrc file. We do this by typing:

>> source .cshrc

Aliases

Another fine use of the .cshrc file is to create aliases, or shortcuts. For example, instead of just the normal ls command I like using the following flags: ls -F -h --color=always. So, I can add a line to my .cshrc file that says every time I type ls, actually do: ls -F -h --color=always. We can do this by adding the following line to our .cshrc file (note that the C Shell alias syntax does not use an = sign):

alias ls 'ls -F -h --color=always'

Another favorite of mine is to just be able to type net to launch an internet browser:

alias net 'mozilla &'

I also like the fancy printing style that comes out of:

alias lpt 'a2ps -o- -d --medium=letter'
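To see the search path at work, the sketch below creates a tiny program in a Utilities/bin directory and shows that the shell can only find it once that directory is on the path. Note the assumptions: the block is written in sh/bash syntax so it can run as a stand-alone script (the C Shell line for your .cshrc is shown in a comment), it uses a scratch directory rather than your real home directory, and the program name hello_geo is made up.

```shell
# Scratch stand-in for your home directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/Utilities/bin"

# A tiny personal program (hello_geo is a made-up name for this example).
cat > "$workdir/Utilities/bin/hello_geo" <<'EOF'
#!/bin/sh
echo "hello from my own program"
EOF
chmod +x "$workdir/Utilities/bin/hello_geo"

# sh/bash syntax; the C Shell equivalent is: set path = ($path ~/Utilities/bin)
PATH="$PATH:$workdir/Utilities/bin"

found=$(command -v hello_geo)   # like which: where does the shell find it?
output=$(hello_geo)             # run it by name, with no directory given
echo "$found"
echo "$output"
```

The new directory is appended to the end of the path, so a system command with the same name would still be found first.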

7. Homework
Probably the most important thing you can do this week is start getting a good handle on a text editor. Hence, I want you to pick a text editor and practice using it. If you do not do this you will quickly fall behind in this class in a manner you will not be able to recover from. So, your homework is:

Choose a text editor and create some files.

Create one file and tell me: (1) what your major is, (2) if you are a graduate student, who you are working with and what your research project is about, or if you are an undergraduate, what your plans are after graduation, and (3) what you want to get out of this class. Also, if there is a special computational task or tool you want to learn in this class that isn't currently on the syllabus, please tell me what it is and why it's important for you to learn it.

Create a second file and tell me which editor you chose to use and what your experience of learning it has been.

Now make a directory and move these files into that directory. Gzip the files and make a tar file of the directory with the files. Create the tar file with the following naming convention: Lastname_Firstname_HW1.tar. Copy that file to a location on my personal home space. That is, copy the file with:

>> cp Lastname_Firstname_HW1.tar ~mthorne/GG5920_HW

All of your homework will be turned in to me this way.

L02 – Awk, Cut, Paste, and Join

1. Awk

Awk will be your best friend. If you know how to use awk you can just throw Excel in the trash and ponder why anyone ever decided it was a good idea to write Excel in the first place. So, now that you know how I feel, what is awk? Awk is a programming language. However, in the geosciences it is typically used on the command line to process text-based data. The name awk comes from its authors' names: Alfred Aho, Peter Weinberger, and Brian Kernighan. (The O'Reilly book features an auk on the cover.)

This lecture is aimed at giving you a basic working knowledge of awk. This document should just be viewed as an awk primer; for more info on all the things you can do with awk there are a ton of amazing resources available on the web.

To get started let's create a simple example file to play around with. Using your favorite text editor create the following file named example.txt.

File: example.txt

1 shear 5 20.00
2 compressional 10 2.00
3 anisotropy 30 3.50
4 perovskite 2 45.50
5 olivine 25 33.19

Note: in awk we refer to each line in the file as a record, and each column as a field. So, in the above example file we have 5 total records and 4 fields. Awk works by scanning through each line of text (or record) in the file and carrying out any instructions you tell it on that line.

In awk we access fields using syntax like $1 or $2. $1 indicates that you are referring to the first field or first column.

Example 1 – Printing fields

What is the output for the following examples?

>> awk '{print $2}' example.txt
>> awk '{print $1, $4}' example.txt
>> awk '{print $4, $2}' example.txt
>> awk '{print $1$2}' example.txt
>> awk '{print $0}' example.txt
>> awk '{print $1$2"-->$"$4}' example.txt
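So as not to give away the answers for example.txt, here is the same idea sketched on two made-up records (station, network, latitude, longitude). The data and variable names are invented for the illustration, and the records are piped into awk with printf instead of being read from a file.

```shell
# Two made-up records: station, network, latitude, longitude.
data='HRU UU 40.1 -111.8
NOQ UU 40.6 -112.1'

second=$(printf '%s\n' "$data" | awk '{print $2}')              # a single field
swapped=$(printf '%s\n' "$data" | awk '{print $4, $3}')         # comma: fields separated by a space
glued=$(printf '%s\n' "$data" | awk '{print $1$2}')             # no comma: fields run together
labeled=$(printf '%s\n' "$data" | awk '{print $1 " lat=" $3}')  # quoted text is printed literally

echo "$swapped"
echo "$labeled"
```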

We can also do some simple arithmetic with awk.

Example 2 – Simple arithmetic on fields

>> awk '{print ($1*$3)}' example.txt
>> awk '{print ($4 - $3), ($1 + $1)}' example.txt
>> awk '{print ($3/$1), $2, (2*3.14*$1)}' example.txt
>> awk '{print int($4)}' example.txt

The last example shows that in addition to the simple arithmetic operators, awk also has some useful numeric functions, such as sin, cos, sqrt, etc. To see the full list check out the awk man page.

A really useful ability is to be able to search within files. First, let's introduce some of the variables that are built into awk:

awk variable name   What it stands for
FILENAME            Name of current input file
RS                  Input record separator (default is new line)
OFS                 Output field separator string (blank is default)
ORS                 Output record separator string (default is new line)
NF                  Number of fields in input record
NR                  Number of input record
OFMT                Output format of number
FS                  Field separator character (blank & tab is default)

These may not all make sense right now, but we'll come back to some of them later.

Example 3 – Simple sorting routines

Try these examples on for size:

>> awk 'NR > 3 {print $0}' example.txt
>> awk 'NR <= 3 {print $2}' example.txt
>> awk '$3 >= 10 {print $0}' example.txt
>> awk '$2 ~ /perov/ {print $0}' example.txt
>> awk '$2 !~ /perov/ {print $0}' example.txt

The comparison operators that awk allows are:

Operator   Meaning
<          Less than.
<=         Less than or equal.
==         Equal.
!=         Not equal.
>=         Greater than or equal.
>          Greater than.
~          Contains (for strings)
!~         Does not contain (strings)
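A quick way to get a feel for these operators is to run them against a few records where you can predict the answer by eye. The sketch below uses three made-up records (a station name, a latitude, and a third number standing in for a depth); the data and variable names are assumptions for the illustration only.

```shell
# Three made-up records: station, latitude, depth.
data='HRU 40.1 12
NOQ 40.6 3
GMU 38.9 25'

deep=$(printf '%s\n' "$data" | awk '$3 >= 10 {print $1}')        # numeric comparison on field 3
second=$(printf '%s\n' "$data" | awk 'NR == 2 {print $0}')       # select by record number
n_only=$(printf '%s\n' "$data" | awk '$1 ~ /^N/ {print $1}')     # field 1 starts with N
not_n=$(printf '%s\n' "$data" | awk '$1 !~ /^N/ {print NR, NF}') # record number and field count
others=$(printf '%s\n' "$data" | awk '$1 != "HRU" {print $1}')   # string comparison

echo "$deep"
echo "$not_n"
```

The pattern in front of the braces decides which records the action is applied to; records that fail the test are simply skipped.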

To make things even more interesting we can add some logic to our conditionals! In the following examples && is the AND operator and || is the OR operator. Note that && binds more tightly than ||, so use parentheses if you want the OR evaluated first.

Example 4 – sorting with logic

>> awk 'NR > 2 && NR < 5 {print $0}' example.txt
>> awk '$3 > 10 && $4 > 2.5 {print $0}' example.txt
>> awk '$2 ~ /aniso/ || $2 ~ /oliv/ {print $0}' example.txt
>> awk 'NR >= 2 && $2 ~ /aniso/ || $2 ~ /oliv/ {print $0}' example.txt

You can also specify that awk does something either before starting to scan through the file (BEGIN) or after awk has finished scanning through the file (END).

Example 5 – BEGIN and END

>> awk 'END {print $0}' example.txt
>> awk 'END {print NR}' example.txt
>> awk 'END {print NF}' example.txt
>> awk 'BEGIN {print NF}' example.txt
>> awk 'BEGIN { OFS = "_"} {print $1, $2}' example.txt
>> awk 'BEGIN { FS = "o"} {print $1, $2}' example.txt
>> awk 'BEGIN {print "Example #5"} {print $2} END {print "End of Example"}' example.txt

You can also set variables in awk and do operations with them. Occasionally it comes in handy.

Example 6 – awk variables

Here's a quick example that sets a variable x = 1 at the beginning and increments the variable by one at each record, printing the variable out as a new field for each record.

>> awk 'BEGIN {x=1} {print $0, x++}' example.txt

This is a slight variation on the above example.

>> awk 'BEGIN {x=0} {print $0,x+=10}' example.txt

The following table might help to make the above examples a little more transparent.

Assignment operator   Use for                               Example    Equivalent to
+=                    Assign the result of addition         a += 10    a = a + 10
                                                            d += c     d = d + c
-=                    Assign the result of subtraction      a -= 10    a = a - 10
                                                            d -= c     d = d - c
*=                    Assign the result of multiplication   a *= 10    a = a * 10
                                                            d *= c     d = d * c
%=                    Assign the result of modulo           a %= 10    a = a % 10
                                                            d %= c     d = d % c
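Here is a short sketch of these operators at work on three made-up records. The variable names (total, x, r) are arbitrary; note that an awk variable that has never been assigned starts out as zero, so total needs no BEGIN block.

```shell
# Three made-up records: a label and a number.
data='a 5
b 10
c 30'

# += keeps a running total that carries over from record to record.
running=$(printf '%s\n' "$data" | awk '{total += $2; print $1, total}')

# x++ prints the current value of x first and only then adds one.
counter=$(printf '%s\n' "$data" | awk 'BEGIN {x = 1} {print $1, x++}')

# %= replaces r with the remainder after division by 4.
remainder=$(printf '%s\n' "$data" | awk '{r = $2; r %= 4; print r}')

echo "$running"
echo "$counter"
echo "$remainder"
```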

In example #3, we showed an example of using awk with a conditional.

>> awk 'NR > 3 {print $0}' example.txt

Essentially, this example states: if the record number is greater than 3 then print out the entire line of the file. Awk also supports a syntax with if statements. E.g.,

>> awk '{if (NR > 3) print $0}' example.txt

is another way of doing the same thing. However, it is sometimes very useful to also have an else or else if statement to play around with. The next couple of examples show how to do this.

Example 7 – Control structures

>> awk '{if ($1 > 2) print $0; else print $1}' example.txt
>> awk '{if ($1 > 2) print $0; else if ($1 > 1) print $2; else print $1}' example.txt
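A typical use of if, else if, and else is to sort records into classes. The sketch below labels three made-up records according to their second field; the thresholds (20 and 5) and the class names are arbitrary choices for the illustration, not anything physical.

```shell
# Three made-up records: station and a number standing in for a depth.
data='HRU 12
NOQ 3
GMU 25'

labels=$(printf '%s\n' "$data" | awk '{
    if ($2 > 20)
        print $1, "deep"           # first test that succeeds wins
    else if ($2 > 5)
        print $1, "intermediate"   # only reached if the first test failed
    else
        print $1, "shallow"        # everything that is left over
}')
echo "$labels"
```

An awk program can be spread over several lines inside the single quotes, which is much easier to read once the logic grows.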

Using the command printf it is possible to format the output from awk. Printf is essentially the same as that in C. You define the width of the column, whether to left or right justify, and the type of information that will be output, such as a string (%s), a floating point number (%f), or a decimal integer (%d).

Example 8 – Formatted Output

>> awk '{print $1, $2, $3, $4}' example.txt
>> awk '{printf( "%4d %-20s %-5d %-7.2f\n", $1, $2, $3, $4)}' example.txt
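To see exactly what the width and justification settings do, it helps to print a visible marker between the columns. The sketch below uses the | character for that purpose on one made-up record; the marker and the data are assumptions of mine, not part of the example above.

```shell
# One made-up record: a name, an integer, and a real number.
# %-6s  : string, left justified in 6 columns
# %4d   : integer, right justified in 4 columns
# %7.2f : real number, right justified in 7 columns with 2 decimal places
row=$(printf 'HRU 12 3.14159\n' | awk '{printf("%-6s|%4d|%7.2f|\n", $1, $2, $3)}')

# A zero after the % pads with zeros instead of blanks.
padded=$(awk 'BEGIN {printf("%03d\n", 7)}')

echo "$row"
echo "$padded"
```

Unlike print, printf does not add a new line for you, which is why the format ends in \n.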

2. Cut, Paste, and Join
This section describes three utilities that are often used in conjunction with awk for quickly manipulating fields in files.

Paste

Sometimes you may want to extract columns of information from different files and combine them into one file. Paste is the perfect utility for this. Consider the two files:

A.txt      B.txt
a1         b1
a2         b2
a3         b3
a4         b4
a5         b5

We can combine them as follows:

>> paste A.txt B.txt > C.txt

Join

If two separate files share a common field they can be combined with join. Consider two files:

A.txt          B.txt
Vs 7.2         Vs 6.3
Vp 11.3        Vp 12.4
Rho 6.6        Rho 5.9

Now try:

>> join A.txt B.txt > C.txt
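The difference between the two utilities is easiest to see side by side: paste glues lines together by position, while join matches lines on a shared first field. In the sketch below the two files are written with printf into a scratch directory, and the rows are listed in sorted order of the first field because join expects its input sorted on the join field. The variable names are made up.

```shell
# Scratch directory with the two small files, sorted on the first field.
workdir=$(mktemp -d)
cd "$workdir"
printf 'Rho 6.6\nVp 11.3\nVs 7.2\n' > A.txt
printf 'Rho 5.9\nVp 12.4\nVs 6.3\n' > B.txt

pasted=$(paste A.txt B.txt)  # line 1 of A, a tab, then line 1 of B, and so on
joined=$(join A.txt B.txt)   # the shared first field appears only once per line

echo "$pasted"
echo "$joined"
```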

Cut

Cut is incredibly useful for chopping up files into fields. Use the -d flag to specify a new delimiter, and the -f flag to state which fields to print out. Consider a file as follows (A.txt) that uses underscores to separate fields:

Vs_7.2
Vp_11.3
Rho_6.6

One could extract just the numeric values by:

>> cut -d_ -f2 A.txt

Another place I find cut useful is in extracting information out of file names. For example, suppose I have a bunch of SAC files (seismograms) that look as follows:

>> ls
>> HRU.UU.EHZ NOQ.UU.HHZ GMU.UU.EHZ CTU.UU.EHZ

The filename convention here looks like: station_name.network.component

If I want to make a list of just the station names I could do something like:

>> ls *UU* | cut -d. -f1 > stationlist.txt
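Here is a sketch of a few more ways to slice up names of this form. Rather than relying on real SAC files being present, the names are fed to cut as plain text with printf; the variable names are made up.

```shell
# Two file names of the form station_name.network.component, as plain text.
names='HRU.UU.EHZ
NOQ.UU.HHZ'

stations=$(printf '%s\n' "$names" | cut -d. -f1)     # first field only
components=$(printf '%s\n' "$names" | cut -d. -f3)   # third field only
sta_net=$(printf '%s\n' "$names" | cut -d. -f1,2)    # fields 1 and 2, delimiter kept
first_three=$(printf '%s\n' "$names" | cut -c1-3)    # by character position instead

echo "$stations"
echo "$sta_net"
```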

3. Homework
1) Consider two files given below that each contain a set of Cartesian coordinates. Write an awk script to compute the distance between these pairs of points. Feel free to use any of the other commands we learned in this lecture as well.

x1     y1          x2      y2
0.0    0.0         0.0     0.0
0.5    0.1         -0.25   0.1
0.75   0.2         -0.5    0.2
1.0    0.3         -1.0    0.3

2) Below is a table of S-wave velocities at the coordinates given by the indicated latitude, and longitude (φ) in degrees. Create a file exactly as shown below, and write an awk command that will convert the longitudes given in the file below from the interval: -180° ≤ φ ≤ 180° to the interval: 0° ≤ φ ≤ 360°. Note: longitudes from 0° to 180° in the original file should not change. Format your output, such that you have three distinct labeled columns and add a single decimal place to both the latitude and longitude values.

Lon    Lat    dVs
-180   -10    2.3
-135   -10    2.4
-90    -10    2.0
-45    -10    1.8
0      -10    0.0
45     -10    -0.3
90     -10    -1.2
135    -10    -1.5
180    -10    0.0
-180   10     2.4
-135   10     2.6
-90    10     2.1
-45    10     1.6
0      10     -0.1
45     10     -0.4
90     10     -1.0
135    10     -1.0
180    10     0.3

3) Consider a file that looks as follows:

Vs
Vp
Rho
Vs
Vp
Rho
Vs

Write an awk command that will print the total number of lines that contain the string Vs.

4) I have a group of SAC files named as follows:

>> HRU.UU.EHZ NOQ.UU.HHZ GMU.UU.EHZ CTU.UU.EHZ

Using awk, how can we change the names of all of these files so that the EHZ or HHZ is replaced by just Z? So, for example, the first file is renamed as: HRU.UU.Z

5) Write an awk command that will print the sum and average of column #1 of a file. The output should look like:

>> Sum is: X; Average is: X

awk cheat sheet

# get total number of records in a file
awk 'END {print NR}'

# If NR is equal to shell variable 'n' print line
awk 'NR == '$n' {print $0}'

# Sum the values along a column (column #2 in this example)
awk '{ sum += $2} END {print sum}'

# Print the sums of the fields of every line
awk '{s=0; for (i=1; i<=NF; i++) s=s+$i; print s}'

# Print out file with double spacing
awk '{print ; print " "}'

# Print fields in reverse order
awk '{ for (i = NF; i > 0; --i) print $i }'

# if else syntax
awk '{if ($1 > 2) print $0; else print $1}' file

# Concatenate every 5 lines of input, using a comma separator between fields
awk 'ORS=NR%5?",":"\n"' file
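The second entry on the cheat sheet works by closing the single quotes, letting the shell paste in the value of n, and opening the quotes again. An alternative that some people find easier to read is awk's standard -v option, which hands the shell value to awk as a proper awk variable. The sketch below shows both on three made-up lines; it is written in sh syntax (in the C Shell the first line would be set n = 2).

```shell
n=2   # sh syntax; in the C Shell this would be: set n = 2

# Quote-splicing, as on the cheat sheet: awk sees the program NR == 2 {print $0}
spliced=$(printf 'first\nsecond\nthird\n' | awk 'NR == '$n' {print $0}')

# -v copies the shell value into an awk variable, here also called n.
with_v=$(printf 'first\nsecond\nthird\n' | awk -v n="$n" 'NR == n {print $0}')

echo "$spliced"
echo "$with_v"
```

Both forms select the same record; the -v form keeps the awk program inside a single pair of quotes.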

L03 – C Shell Scripting - Part 1

1. What is a shell?

So now you've been using the Linux operating system for a couple of weeks. Things are a little different here than in the world of Mac or PC that you are likely accustomed to. One major difference is that you are playing around in a terminal, and typing directly into a command line. Getting started in a Linux environment is like going through mouse detox. Instead of clicking our way around, everything happens at the command line of the terminal. But how are the commands we type interpreted? This depends on the shell, where the shell is just a command-line interpreter. That is, a shell is really just a computer program that reads what you type into the terminal and then interprets what to do with it.

There are a lot of different shells. The most common shells today seem to be the Bourne Again Shell (bash) and the C Shell, but there are some older ones you might encounter such as the Korn shell. To see which shells are actually available on your system type:

>> cat /etc/shells

I'm a big fan of the bash shell, and hence, in some nerdy circles am referred to as a basher! Nonetheless, C shell is a very common shell to use in geophysics. This is somewhat historical: the bash shell wasn't released until 1989, long after most geophysicists started a tradition of shell scripting. The C shell was written in the late 1970s and hence has had a longer time to get indoctrinated into the geophysics community. It also turns out to be quite simple to use.

2. What is a shell script?
Normally, when you are sitting at your terminal, the shell is interactive. This means the shell takes the command you type in and then executes it. This can be rather tedious if you want to run a large number of commands in a specific order, and maybe do it over and over again on different sets of data. Luckily, we can just write our sequence of commands into a text file, and then tell the shell to run all of the commands in this text file. This text file containing all of our commands is a shell script. Let’s make a simple one as an example. Open up a new file named example.csh with your favorite text editor and type the following:

    #!/bin/csh
    # the simplest shell script possible
    clear
    echo "geophysics kicks ass"

After creating this file, type the following on the command line:

>> chmod +x example.csh


This will set the permissions for your new file example.csh such that you are allowed to execute it. You only need to do this once for a new file, not every time you edit it. Now you can execute the commands in this text file by typing:

>> ./example.csh

A couple of notes on the above script.

Line 1: #!/bin/csh - this says that I want to use the C Shell to interpret these commands. Every C Shell script must start with this as the top-most line.

Line 2: # the simplest… - you can (and frequently should) add comments to your scripts by starting the line with the # symbol.

Filename: example.csh – unlike on a Windows machine, Linux machines do not require a file extension in most cases. However, it usually makes sense to adopt some kind of nomenclature so that you quickly know what kind of file you are dealing with. Hence, I usually use .csh to let me know that I have a C Shell script.

OK, now that we have that out of the way, type up the following script and see what it does:

    #!/bin/csh
    # Script to print the current user, the current date & time,
    # and the number of users currently logged in
    clear
    echo "Hello $USER"
    echo "Today is \c "; date
    echo "Number of users logged in : \c" ; who | wc -l
    echo "Calendar"
    cal

Note that some versions of the C Shell require you to use echo -e so that the \c will not print to the screen.

3. C Shell Variables
There are two types of variables:

(1) System variables – these are created and maintained by the Linux system itself. We saw one example in the script above: $USER. As another example, to print out your home directory you could type:

>> echo $HOME

(2) User defined variables – these are created and maintained by the user.


User defined variables in a C Shell script come in the following forms:

(a) String variables. String variables are just treated as a bunch of text characters, i.e., you cannot do math with them. String variables are created with the set command as shown below.

    #!/bin/csh
    set x = 1
    set y = 10.5
    set myvar = super
    echo $x $y $myvar
    echo $x + $y

(b) Numeric variables. The C Shell can only handle integer valued numeric variables. Numeric variables are set with the @ symbol. Below is a simple example.

    #!/bin/csh
    @ x = 1
    @ x = $x + 10
    echo $x

What happens if you try: set x = $x + 10 in the above script?

(c) Arrays of String Variables. You can also use a single variable name to store an array of strings.

    #!/bin/csh
    set days = (mon tues wed thurs fri)
    echo $days
    echo $days[3]
    echo $days[3-5]

As a special note: variables are case sensitive. For example, the following three combinations of the letters n and o are considered to be three different variables by the C Shell. This is important to remember as it is not the case in some other programming languages (e.g., in Fortran all three of these variable names would be considered to be the same variable).

    set no = 10
    set No = 11
    set nO = 12
    echo $no $No $nO

4. Displaying Shell Variables
In case you haven’t figured it out by now, we typically use the echo command to display text or the value of a variable on the screen (writing to the screen is usually called writing to standard out). Usually, one just types:

    echo $my_variable_name

But, in case you want to get fancy, do a man on echo and see what the following examples should produce:

    #!/bin/csh
    set minX = 80
    echo "Xaxis Minimum is set to: " $minX
    echo "Xaxis Minimum is set to: \a" $minX
    echo "Xaxis Minimum is set to: "; echo $minX
    echo "Xaxis Minimum is set to: \c"; echo $minX
    echo "Xaxis Minimum is set to: \t" $minX
    echo "Xaxis Minimum is set to: \\" $minX

It is also prudent at this point to consider the action of different types of quotes. There are three types of quotes:

Quotes   Name            Meaning

"        Double quotes   Characters enclosed in double quotes lose their special meaning (except \ and $). For example, if we set arg = blah, then echo "$arg" would result in blah being printed to the screen.

'        Single quotes   Text enclosed in single quotes remains unchanged (including $variables). For example, echo '$arg' would result in $arg being printed to the screen. That is, no variable substitution would take place.

`        Back quote      Executes the enclosed command. For example, `pwd` would execute the print working directory command.

To see the effect of the single or double quote add the following to the above script:

    echo "$minX"
    echo '$minX'

The back quote is really useful. It allows us to set a shell variable to the output from a Unix command:

    #!/bin/csh
    set mydir = `pwd`    # set variable to current working directory

    # what does this do?
    @ nr = `awk 'END {print NR}' input_file`
    @ nfiles = `ls *UU* | wc -l`

As a final note on displaying shell variables, it is often useful to concatenate shell variables:

    #!/bin/csh
    set year = 2010
    set month = 12
    set day = 30

    set output1 = ${year}_${month}_${day}
    set output2 = ${year}${month}${day}
    echo $output1
    echo $output2
    mv inputfile ${output1}.txt

Note that we use the { } brackets in this example. This is because if I just typed $year_ then the shell would look for a variable called year_.

5. Command Line Arguments
It is often useful to be able to grab input from the command line or to read user input. The next example shows a simple way to interactively get information and set the result to a variable.

    #!/bin/csh
    echo "How many records in this file do you want to skip? "
    set nlines = $<
    echo $nlines

To see how command line arguments are handled let’s consider the following example where I want to read in a filename and then perhaps do some action on this file later.

    #!/bin/csh
    set ifile = $argv[1]
    echo "Now let's perform some kind of action on file: $ifile"

If I named this C Shell script action.csh and we want to perform the action on the file foo.txt then we need to type:

>> action.csh foo.txt

on the command line to make this work. This is really useful when we want to make generalized scripts that don’t require editing the variable names every time we want them to run.

6. Redirection of standard output/input
The input and output of commands can be sent to or received from files using redirection. Some examples are shown below:

    date > datefile
The output of the date command is saved into the file datefile.

    a.out < inputfile
The program a.out receives its input from the file inputfile.

    sort gradefile >> datafile
The sort command appends its output to the end of the file datafile.
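The three redirection operators can be tried out together on a throwaway file. This is only a sketch: the file names fruit.txt and sorted.txt and their contents are made up for illustration, and the commands are limited to ones that work the same in the C Shell and in Bourne-type shells.

```shell
# Write three lines into a new file (fruit.txt is a made-up name)
printf 'pear\napple\nfig\n' > fruit.txt

# sort reads fruit.txt on standard input; its output goes to sorted.txt
sort < fruit.txt > sorted.txt

# Append one more line to the end of sorted.txt
echo kiwi >> sorted.txt

# Display the result: apple, fig, pear, kiwi (one per line)
cat sorted.txt
```

Note that kiwi ends up last: >> only appends, it does not re-sort the file.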

A special form of redirection is used in shell scripts.
    calculate << END_OF_FILE
    ...
    ...
    END_OF_FILE

In this form, the input is taken from the current file (usually the shell script file) until the string following the << is found. An example using the program SAC (Seismic Analysis Code) is shown below (it is becoming more and more of a rarity for people to write SAC macros!):


    #!/bin/csh
    sac << EOF
    r infile.sac
    qdp off
    ppk
    q
    EOF
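If SAC is not installed, the same << form can be tried with any program that reads standard input. The sketch below uses awk instead; the marker END_OF_DATA, the three numbers, and the output file total.txt are all made up for illustration.

```shell
# Everything between the two END_OF_DATA markers is fed to awk as its input
awk '{ sum += $1 } END { print sum }' << END_OF_DATA > total.txt
1.5
2.5
4
END_OF_DATA

# Show the result: 8
cat total.txt
```

The closing marker must sit at the very start of its own line, exactly as typed after the <<.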

If the special variable noclobber is set, any redirection operation that would overwrite an existing file will generate an error message and the redirection will fail. In order to force an overwrite of an existing file using redirection, append an exclamation point (!) to the redirection operator. For example, with the command:

    date >! datefile

the file datefile will be overwritten regardless of whether it already exists.

The output of one command can be sent to the input of another command. This is called piping. The commands which are to be piped together are separated by the pipe character (|). For example:

    ls -l | sort -k 5n

This command takes the output of the ls -l command and feeds it into the sort command, which sorts the listing numerically on its fifth column (the file size).
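Pipes can be chained, so that each command works on the output of the one before it. The following is a small sketch with made-up file names and sizes standing in for a real directory listing; smallest.txt is likewise a hypothetical output file.

```shell
# Three made-up "name size" records are sorted numerically on column 2,
# and head keeps only the first (smallest) one
printf 'b.sac 300\na.sac 20\nc.sac 1000\n' | sort -k 2n | head -n 1 > smallest.txt

# Show the result: a.sac 20
cat smallest.txt
```

No temporary files are needed for the intermediate steps; only the final result is written to disk.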

7. Homework
1) Write a C Shell script that will allow you to set the name of an input postscript file and the desired output name of a jpg file, and then use ImageMagick’s convert command to convert the postscript file into a jpeg image. At the very least, I should be able to enter, either on the command line or by interactive input, the name of an input .ps file and the desired name of the output .jpg file, and the script should automatically create the .jpg file.

2) Write a C Shell script that will add the current date to the end of a filename. E.g., if today is Dec 25, 2010, then the shell script should change the filename to:

    filename.20101225

The script should read the filename from the command line. Hence, if we named this script adddate then execution of this command should look like:

>> adddate filename


3) Write a C Shell script that will remove dates added with the script written in Problem #2. Note: this script should also work when there is a dot in the filename. E.g., the code should work for any filename of the form:

    foo.20101225
    foo.foo.20101225
    foo.foo.foo.20101225
    foo.foo.foo.*.20101225

Output file names for the examples above should be:

    foo
    foo.foo
    foo.foo.foo
    etc.

4) Write a script that will replace spaces in file names with underscores. E.g., if the input file is named: My File.txt , then the output file should be named My_File.txt.


L04 – C Shell Scripting - Part 2

1. Control Structures: if then else
Last time we worked on the basics of putting together a C Shell script. Now it is time to add the control structures that actually make scripting useful. The following example shows the three primary forms of a conditional test: if, if/else, and if/else if/else.

    #!/bin/csh
    echo "Enter a number between 1 and 10... "
    @ number = $<

    if ($number == 6) then
        echo "that's the lucky number!"
    endif

    if ($number > 5 && $number < 7) then
        echo "that's the lucky number!"
    else
        echo "you lose. try again."
    endif

    if ($number > 0 && $number <= 5) then
        echo "a low pick."
    else if ($number >= 7 && $number <= 10) then
        echo "a high pick."
    else if ($number == 6) then
        echo "that's the lucky number!"
    else
        echo "you didn't pick a number between 1 and 10!"
        echo "follow the instructions and try again..."
    endif

Remember though, when testing numbers in a C Shell script, it can not handle real numbers!
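One possible workaround, sketched here on the assumption that awk is available, is to let awk do the real-number comparison and print 1 (true) or 0 (false), which is an integer the shell can then test. The two numbers and the awk variable name ok are arbitrary choices for illustration.

```shell
# awk evaluates the real-number comparison and prints 1 (true) or 0 (false)
awk 'BEGIN { ok = (2.5 > 1.75); print ok }'
```

Inside a C Shell script the printed value could be captured with back quotes into a numeric variable and then used in an ordinary integer test such as if ($ok == 1).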

2. Control Structures: goto
I shudder to actually write down the goto statement. It is, in my opinion, an abomination. It was declared obsolete back in the 1960s, yet here it is, still in existence in a handful of languages. Here are a couple of quick examples of how to use it, and then I wash my hands of it! First, let’s look at the example given above and put a goto statement in, such that if you choose a number outside of the range 1 to 10 the script will force you to re-pick a number.


    #!/bin/csh

    select:
    echo "Enter a number between 1 and 10... "
    @ number = $<

    if ($number > 0 && $number <= 5) then
        echo "a low pick."
    else if ($number >= 7 && $number <= 10) then
        echo "a high pick."
    else if ($number == 6) then
        echo "that's the lucky number!"
    else
        echo "you didn't pick a number between 1 and 10!"
        echo "follow the instructions and try again..."
        goto select
    endif

The following example shows how one could test for the proper usage of a C Shell script:

    #!/bin/csh
    #
    # Example script requires 2 command line arguments
    # 1) the name of an input file, 2) the name of an output file

    if ($#argv < 2) goto usage

    set ifile = $argv[1]
    set ofile = $argv[2]

    # normal completion
    exit 0

    usage:
    echo "Usage: myprog input_file output_file"
    exit 1

My hands are clean.

3. Control Structures: loops
Once you can loop you are pretty much set. There are two main ways to loop in a C Shell: either with a while or a foreach statement. Examples of each are given below.


Example of using a while statement:

    #!/bin/csh
    # Example of looping through a list of files.
    #
    # e.g., imagine I have a bunch of SAC files that all end with the
    # suffix .R
    # i.e., I have them all rotated to the radial component.
    # Now I want to do something with those files, in this example
    # use SAC to cut them.

    # make a temporary file listing all of my .R files
    ls *.R >! file_list

    # find out how many files I have
    @ nr = `awk 'END {print NR}' file_list`

    # define a looping variable
    @ n = 1

    # start the loop
    while ($n <= $nr)

    # grab nth file name from the list
    set if = `awk 'NR == '$n' {print $1}' file_list`

    echo "cutting file $if .."

    sac << eof
    r $if
    cuterr fillz
    cut 0 200
    r
    w over
    q
    eof

    # increase n by one
    @ n = $n + 1

    end
    # end loop

    # clean up temporary files
    rm file_list


Example of using a foreach statement:

    #!/bin/csh
    set phase_list = (ScP PcP P)
    set depths = (100.0 200.0 300.0 400.0 500.0 600.0)

    # loop through all seismic phases and depths set above
    foreach phase ($phase_list)
        foreach depth ($depths)
            echo $phase $depth
        end
    end

4. Control Structures: Switch Case
This is a really nice structure that is similar to an if then type of structure. Suppose I wanted to do some action based on what kind of seismic arrival I was looking at. So, if I was interested in a PKP arrival I could write some code that did tests like:

    if ($some_string == 'PKP') then
        do something…
    else if ($some_string == 'SKS') then
        do something else
    else
        do another something else
    endif

OK, a more elegant way to do this is to use the Switch Case structure:

    #!/bin/csh
    set input_phase = PKP

    switch ($input_phase)
      case PKP:
        echo "PKP arrival"
        breaksw
      case SKS:
        echo "SKS arrival"
        breaksw
      case SPdKS:
        echo "SPdKS arrival"
        breaksw
    endsw


5. Control Structures: if then else revisited
Sometimes, to make your scripts more robust, it is useful to do some checks before you actually carry out some action. For example, there is no sense in trying to move the file named blah if the file blah doesn’t even exist. To see how this works, create a temporary file named example.txt and a temporary directory named ExampleDir. Now let’s do some tests on these temporary files (in the Linux system directories are really just files as well).

    #!/bin/csh
    set if = example.txt    # so we don't have to type out the
                            # filename a bunch of times.
    set id = ExampleDir     # as above...

    if (-e $if) then
        echo "the file $if exists!"
    endif

    if (-e $id) then
        echo "$id exists!"
    endif

    if (-f $id) then
        echo "$id is a normal file"
    else
        echo "$id is NOT normal."
    endif

    if (-d $id) then
        echo "$id is a directory!"
    endif

The table below shows all of the attributes one may search for relating to files:

    Letter   Attribute
    d        The file is a directory file.
    e        The file exists.
    f        The file is an ordinary file.
    o        The user owns the file.
    r        The user has read access to the file.
    w        The user has write access to the file.
    x        The user has execute access to the file.
    z        The file is 0 bytes long.


6. The Dialog utility
Let’s wrap up our lectures on C Shell scripting with an entertaining utility. Perhaps you want to impress your advisor and make him/her think you’ve already developed these mad hacking skills. Well, try asking for input using the dialog utility. I guarantee that you will impress the entire faculty in this Dept. (with the exception of me of course). As a quick demo:

    #!/bin/csh
    dialog --title "----- WARNING -----" \
      --infobox "This computer will explode \
      unless you press a key within the next 5 seconds!" 7 50;
    set exit_status = $status

The dialog utility uses the following syntax:
    dialog --title {title} --backtitle {backtitle} {Box options}

where Box options can be one of the following (other options also exist if you check out the man page):

    --yesno {text} {height} {width}
    --msgbox {text} {height} {width}
    --infobox {text} {height} {width}
    --inputbox {text} {height} {width} [{init}]
    --textbox {file} {height} {width}
    --menu {text} {height} {width} {menu height} {tag1} {item1} ...

Here is an example of how to create a yes/no box:

    #!/bin/csh
    set ifile = 'blah.txt'

    dialog --title "----- Yes/No Example -----" \
      --yesno "Do you want to delete file $ifile" 7 60

    set exit_status = $status    # get the dialog utility's exit status

    echo " "

    switch ($exit_status)
      case 0:      # user selected 'yes'
        echo "Deleting file $ifile"
        rm $ifile
        breaksw
      case 1:      # user selected 'no'
        echo "Saving file $ifile"
        breaksw
      case 255:    # user hit escape key
        echo "Operation Canceled..."
        breaksw
    endsw

As a final example of the dialog utility, let’s use it to grab some text from the user. In this example we will prompt the user to type in a file name to delete:

    #!/bin/csh
    dialog --title "----- Text Input Example -----" \
      --inputbox "Enter the name of the file you want to delete" \
      7 60 'file' \
      --stdout > temp_menu.txt

    set exit_status = $status    # get the dialog utility's exit status

    # get the string that the user typed in the input box
    set ifile = `cat temp_menu.txt`

    echo " "

    switch ($exit_status)
      case 0:      # A file name was entered
        echo "Deleting file $ifile"
        breaksw
      case 1:      # The cancel button was pressed
        echo "Cancel button pressed"
        breaksw
      case 255:    # User hit the escape key
        echo "Escape key pressed"
        breaksw
    endsw

    rm temp_menu.txt    # get rid of temporary files


7. Debugging C Shell Scripts
There are two quick ways to debug a C Shell script. The script can be run from the command line as in one of the following two examples:

>> csh -x myscript
>> csh -v myscript

or the top-most line of the script can be written as one of the following:

    #!/bin/csh -x
    #!/bin/csh -v

The -x option echoes each command line after variable substitution. The -v option echoes each command line before variable substitution.

8. Homework
1) Write a C Shell script that will loop through a list of files and add a counter to the beginning of each filename. For example, if I have 10 files named:

    a.txt b.txt c.txt … j.txt

the code should move the files to be named:

    01_a.txt 02_b.txt 03_c.txt … 10_j.txt

This kind of utility is often needed in naming files, especially, as we will see in later lectures, when automatically generating animations or movie files.

2) Write a C Shell script that will repeat a command many times. We will call this script forever. For example, sometimes I want to see if a job I submitted to the supercomputer has started yet. To do so I would type qstat -a. Well, I’m anxious to see if it starts, so I will keep typing qstat -a until I get confirmation that the job did indeed start. Instead I want to type forever qstat -a, and what should happen is that qstat -a keeps getting invoked (after a couple of seconds delay) until I decide to cancel it. Your script should be able to take any Unix command as input. For example, it should work as forever ls, or forever ls -la, or forever cat inputfile, etc.


3) In the C Shell one cannot do floating point operations. That is, you cannot do math with real numbers. However, it is sometimes necessary to do so. A quick workaround is to do the math inside a program like the basic calculator (e.g., use: bc -l). Write a shell script that will allow you to do a simple calculation on floating point numbers. Take as input a coordinate position in polar coordinates (radius, and angle theta in degrees) and output the equivalent Cartesian coordinate position.

4) Write a C Shell script using the dialog utility to create a menu box. The menu box should provide several options of actions you want to carry out on a seismogram. For example, the menu box may have options as follows:

    Please choose an Action to be performed on Seismogram:
    1  Flip Polarity of Seismogram
    2  Low Pass Filter Seismogram
    3  Make Time Picks on Seismogram
    4  Discard Seismogram

The script doesn’t actually have to perform any actions on a seismogram file, but is aimed at getting you to write a script using the dialog utility. Output, in the form of some kind of recognition of which option was chosen should be provided in the code.


L05 – Generic Mapping Tools (GMT) - Part 1

1. What is GMT?
It’s not Greenwich Mean Time (there is a story there). According to the GMT webpage itself:

GMT is an open source collection of ~60 tools for manipulating geographic and Cartesian data sets (including filtering, trend fitting, gridding, projecting, etc.) and producing Encapsulated PostScript File (EPS) illustrations ranging from simple x-y plots via contour maps to artificially illuminated surfaces and 3-D perspective views. GMT supports ~30 map projections and transformations and comes with support data such as GSHHS coastlines, rivers, and political boundaries.

GMT is a set of programs that allows you to make plots, and most importantly for geoscience people it has a ton of tools for manipulating and creating maps. It is also scriptable. That is, we make plots using GMT by writing shell scripts. This is awesome because it means that once we write a shell script to create a certain type of plot we can use this script over and over to make the same kind of plot on all kinds of different data. Plots can even be made automatically. E.g., once we do some kind of data processing, we can have the code launch a GMT script and make the plots for us. You can go away, let your codes run, come back after lunch or a weekend (not if you’re a grad student; you should be here every day!) and have plots sitting on your desktop waiting for you. Beautiful!

Another key feature of using GMT is that you have total control over every aspect of the plot. You can make just about any style of plot you can dream up, even if no one has ever dreamed up that style of plot before. Try doing that with Excel!

All right, for a few examples let’s go to the GMT web page. It’s a good site to know because it has the entire GMT manual online:

http://gmt.soest.hawaii.edu

Once here, click on Examples. (Actually most of these examples suck if you ask me, but they give at least a quick idea of what can be done.)
Referencing GMT: If you use GMT, and most people in the geosciences do for generating figures, then it is appropriate to acknowledge the fact in your publications. Typically one says something as follows in the acknowledgements: “figures were drawn using the Generic Mapping Tools (Wessel and Smith, 1998).”

Wessel, P. and W.H.F. Smith (1998), New, improved version of the Generic Mapping Tools released, Eos Trans. AGU, 79, 579.


2. Getting Started with GMT
OK, plotting with GMT is probably not like anything you are used to right now. But you can’t learn how to swim without jumping in the water, so let’s dive in:

    #!/bin/csh
    # Example of a simple location map

    pscoast -R0/360/-90/90 -JG-111/45/4.5i -Bg30 -Dc -A8000 \
      -G10/10/10 -W3/10/10/10 -P -K >! globe.ps

    psxy -R -JG -W6/255/0/0 -P -O -Am << END >> globe.ps
    -100 40
    -100 50
    -120 50
    -120 40
    -100 40
    END

    gs -sDEVICE=x11 globe.ps

If you type up the above script and execute it, you should get a ghostscript window that pops up with an image that looks as follows:

The image is just that of the Earth, with the continents filled in black and a red box drawn around some area of interest. It’s actually a useful script, and if you look at several of my publications (and those of coauthors of mine who now use the same basic format) you will see that I use a small image just like this to show my study region. Now that you’ve done that, let’s take a quick look through the GMT commands we used. Note that to find out information on these commands you should always consult the man pages:

>> man pscoast
>> man psxy

pscoast and psxy are two of the most important commands you will use. Let’s dissect them.

2.1 pscoast
The pscoast command is what we use to plot land features (or water) on maps. It can plot coastlines, rivers, lakes, and political boundaries. The command we typed was:

    pscoast -R0/360/-90/90 -JG-111/45/4.5i -Bg30 -Dc -A8000 \
      -G10/10/10 -W3/10/10/10 -P -K >! globe.ps

The options we used are described as follows:

Flag   Options          Purpose

-R     0/360/-90/90     The -R flag sets the region of interest. Here I am plotting a map of the entire globe, so I set the region of interest as being between longitude 0° and 360° and latitude from -90° to 90°.

-J     G-111/45/4.5i    The -J flag sets what map projection you want to use. Here we select G, which says use an Orthographic projection (for an overview of the map projections available go to the GMT webpage > DOCS > GMT Technical Reference and Cookbook > 6. GMT Map Projections). Different map projections have different input options. -JG requires 3 numbers: -111 says center the map on a longitude of -111°, 45 means center the map on a latitude of 45°, and 4.5i means make the map 4.5 inches across. So, now is a good time to play around with the script above. Try changing the center latitude and longitude and rerun the script.

-B     g30              The -B flag tells us how often to draw latitude and longitude lines. Here the g says how often to draw gridlines (every 30°). Try the script as g5 and see what happens.

-D     c                This flag tells us what resolution of coastline data to use. Here the c stands for crude. You can also try full (f), high (h), intermediate (i), and low (l).

-A     8000             In the example this flag says: do not plot features smaller than 8000 km^2. Just completely remove the -A option and see what happens. Now change the -D option to -Df and see what happens. What does this do to the file size?

-G     10/10/10         This option tells us what color we want to use to plot the land masses (see the section below on the R/G/B color space). To get an idea how this works try plotting using -G0/255/0.

-W     3/10/10/10       This says we want to draw solid lines around our coastlines. Here the 10/10/10 says what RGB color to use and the 3 says use a line thickness of 3. Try it as -W5/0/0/255 (you probably want to have the -A8000 flag turned on for this one).

-P                      This flag says the plot should be in Portrait mode (i.e., page size 8.5" × 11"). Leaving out the -P option would say plot in landscape mode (11" × 8.5").

-K                      This is an extremely important flag! Many of your problems using GMT will center on forgetting to put this flag in, or putting it in when it is not needed. What it says is that I will provide more GMT commands later, so do not close the postscript file yet!

Note that we typed the full command as follows:

    pscoast -R0/360/-90/90 -JG-111/45/4.5i -Bg30 -Dc -A8000 \
      -G10/10/10 -W3/10/10/10 -P -K >! globe.ps

There are two important things to note:

1) At the end of the first line we used a “\” symbol (a backslash). This says that we are not finished typing the command, but that we ran out of room on the first line. The backslash says to continue this command on the next line.

2) The output of the pscoast command goes into >! globe.ps. This says (remember our C Shell conventions) to force the creation of a new file called globe.ps and put all of the output in this file. Since this is the first GMT command we used in this script, we used the >! redirection to ensure that we opened up a new file.

Wow, so that’s a lot of options with the pscoast command. But, if you noticed in looking at the man page, there are even more that we didn’t even touch! At first, it will likely seem really arbitrary, but after a little while you will generally just remember what all of the flags are and be able to create plots quickly.

2.2 psxy
Another command you will really get to know is psxy. This command is used to plot lines, symbols, or polygons on a map. In our example we used it to plot a red box around our study region. The command we used was:

    psxy -R -JG -W6/255/0/0 -P -O -Am << END >> globe.ps
    -100 40
    -100 50
    -120 50
    -120 40
    -100 40
    END


Let’s go through the options we used for this command also:

Flag   Options      Purpose

-R                  The -R flag always serves the same purpose, i.e., to specify the region of interest. In this case, we do not add a region. This is because we specified the region with the pscoast command already. Hence, GMT will use the same region specified above.

-J     G            We want to plot our box on the same map as specified with pscoast, hence we just leave the option as -JG and GMT knows to keep the projection as is.

-W     6/255/0/0    Draw the box with a red line (255/0/0) with a thickness of 6 pts.

-P                  Portrait mode.

-O                  Muy importante! In pscoast we used the -K flag, which said we will add more to the plot later. The -O option is now used, saying let’s overlay the results of this psxy command on top of whatever was supplied in previous commands. Note that we do not use the -K flag here because we don’t want to add anything else to the plot later.

-A     m            This flag controls how to draw connecting lines, i.e., as great circle arcs or not. In this case, I wanted the lines to follow lat/lon lines, so the m option did this.

So, what is this odd redirection we did with this command?

    << END >> globe.ps
    lon lat
    lon lat
    ...
    END

1) Note, we use >> globe.ps. This is important as we want to append the results of the psxy command to our file globe.ps.

2) We also used << END. This states that we are going to redirect anything between the current line and the END statement into the psxy command. In this case we have a list of longitudes and latitudes we want psxy to connect into a box.

Another option is to put the lons and lats into a file. For example, if we placed our locations in a file named box.xy, then we could run psxy as:

    psxy box.xy -R -JG -W6/255/0/0 -P -O -Am >> globe.ps


3. Interlude - The RGB color space
Specifying color in GMT is always done with Red/Green/Blue values. The amount of red, green, or blue varies from 0 – 255. So, 255/0/0 indicates to use the maximum amount of red and no green or blue, i.e., make the color pure red. 0/0/0 indicates not to use any color values, i.e., black. 255/255/255 indicates using the maximum amount of each color, i.e., make it white. Color is often specified as shown above as either a line color (generally specified with the –W flag) or as a fill color (specified with the –G flag). But, in a later lecture we will discuss Color Palette Tables, which also use the RGB color space. As a final note, it is often useful to have a webpage bookmarked that shows you colors and the equivalent RGB value. There are plenty of sites on the web (e.g., google rgb colors). An example is given below: http://cloford.com/resources/colours/500col.htm
    Color         Red   Green   Blue
    cadetblue 3   122   197     205
    cadetblue 4    83   134     139
    turquoise 1     0   245     255
    turquoise 2     0   229     238

This is really nice. For example, scrolling through the colors I think that cadetblue 4 may be useful for plotting the fill colors of lakes. To use this color I can quickly see that: Red=83, Green=134, and Blue=139.
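To turn a row of such a color table into the R/G/B string that the -G and -W flags expect, the three numbers just need to be joined with slashes. A small sketch, using the cadetblue 4 values quoted above as input:

```shell
# Join the three columns (red green blue) into GMT's R/G/B form: prints 83/134/139
echo "83 134 139" | awk '{ print $1 "/" $2 "/" $3 }'
```

The printed string could then be used directly as a fill color, e.g. -G83/134/139.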

4. Interlude #2 – What are postscript files?
You may have noticed by now that we have given our plot files the .ps extension. This is because our GMT output is in the postscript format. Postscript is a language that describes what goes on a printed page. Its development was intimately tied to the development of the first laser printers and it still remains a standard. The postscript language was created by John Warnock (a UU graduate who co-founded Adobe and for whom the Warnock building is named) and Charles Geschke around 1982. Note that postscript files are just text files that you could actually edit (not recommended) with your favorite text editor (open one up and check it out). As another note, it is usually advisable to store your plots in postscript format if you are thinking about using them in a scientific publication. As we will later see, this makes them exceptionally easy to edit in Adobe Illustrator. Note that the encapsulated postscript format (.eps), which is an export option in Matlab, is essentially the same thing: a postscript file describing a single figure.


5. The –K and –O flags
These flags are often the most confusing to new users of GMT, but they are quite simple to understand. The basic rules are:

1) If I am going to add more to my plot with additional GMT commands, then I need to use the -K flag, which states: I will append more to this plot later.
2) Any time I am appending to a plot that I have already drawn to (i.e., in the previous command I used the -K flag), I must use the -O flag, which states: I am overlaying onto a plot that has already been started.
3) The very last command used in generating a plot must NOT have a -K flag. This is because the postscript file needs to be closed with an end statement in the postscript language. The -K flag keeps the file open for more plotting instructions and does not close it. If you send your postscript file to the printer and it doesn't print, the most common culprit is that you didn't close out the file.

Here are some examples of proper usage of these flags.

Example 1: GMT plot with a single command.

    # For a single command don't use either the -K or -O flag
    gmtcommand -flags >! plot.ps

Example 2: GMT plot with two commands.

    # Use the -K flag on the first line
    gmtcommand -flags -K >! plot.ps
    # Use the -O flag on the last line, but not the -K flag
    gmtcommand -flags -O >> plot.ps

Example 3: GMT plot with three or more commands.

    # Use the -K flag on the first line
    gmtcommand -flags -K >! plot.ps
    # Use both -K and -O flags on all but the last command
    gmtcommand -flags -K -O >> plot.ps
    ...
    # Just use the -O flag on the last line
    gmtcommand -flags -O >> plot.ps
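The three rules reduce to one pattern: every command but the last gets -K, every command but the first gets -O, and only the first command creates the file. The sketch below prints that pattern for a plot built from n commands. It only prints text and does not call GMT; gmtcommand is the same placeholder used in the examples, and the file names ko_flags.awk and ko_flags.txt are made up.

```shell
# Print the -K/-O pattern for a plot built from n GMT commands.
cat << 'EOF' > ko_flags.awk
BEGIN {
    for (i = 1; i <= n; i++) {
        flags = ""
        if (i < n) flags = flags " -K"      # more commands follow: keep the file open
        if (i > 1) flags = flags " -O"      # not the first command: overlay
        redirect = (i == 1) ? ">!" : ">>"   # only the first command creates the file
        print "gmtcommand -flags" flags " " redirect " plot.ps"
    }
}
EOF
awk -v n=3 -f ko_flags.awk > ko_flags.txt
cat ko_flags.txt
```

Changing n=3 to n=1 or n=2 should reproduce Examples 1 and 2.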


6. X-Y Plots
GMT is definitely a powerful tool for mapping applications. However, one of its most powerful aspects for a researcher is its potential for creating simple 2D plots. The reason I think it is so powerful is that (1) it is scriptable, meaning that from one single script I can plot multiple data sets in exactly the same way with minimal effort, and (2) GMT provides complete control over every aspect of how the plot looks. These are huge advantages over plotting programs that require you to click your way through options and only offer minimal control over how the plot looks (e.g., Excel). To wrap up this lecture let's just look at a really quick plot of some data.

    #!/bin/csh
    #---------------------------------------------------------------#
    # First we will create a data set to plot
    #
    # Here we are going to create a wave packet example.
    # This is just a combination of two sinusoids with a 100
    # and 20 sec period
    # i.e., angular frequency = 2*pi/ T
    #---------------------------------------------------------------#
    @ time = 0
    while ($time <= 1000)
      if ($time == 0) then
        echo $time >! temp.xy
      else
        echo $time >> temp.xy
      endif
      @ time = $time + 1
    end

    # now let's calculate some amplitude values
    awk '{print $1, (5*sin(-$1*.063) + 2.5*sin(-$1*.314))}' temp.xy >! timeseries.xy
    rm temp.xy

    # plot the time series
    psxy timeseries.xy -JX6i/3i -R0/1000/-10/10 -W2/0/0/255 -P \
        -B100g10000f10/5g10000nSeW -X1.5 -Y6.0 -K >! plot.ps

    # add some text describing the plot and axes
    pstext -R0/8/0/11 -JX8i/11i -P -O -N << END >> plot.ps
    0.0 3.25 16 0 1 1 Wave Packets
    2.6 -0.6 12 0 1 1 Time (sec)
    -0.5 1.1 12 90 1 1 Amplitude
    END

    gs -sDEVICE=x11 plot.ps
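The constants .063 and .314 in the awk line of the script are the angular frequencies 2*pi/T for the two periods, T = 100 sec and T = 20 sec, rounded to three decimals. A quick check (omega.txt is just a made-up scratch file):

```shell
# Angular frequency w = 2*pi/T for the two periods used in the wave packet.
awk 'BEGIN {pi = atan2(0, -1); printf "%.3f %.3f\n", 2*pi/100, 2*pi/20}' > omega.txt
cat omega.txt
```

To build a wave packet from different periods, the same expression gives the constants to substitute.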


If you typed everything in correctly you should get a plot that looks like this:

Note that we use psxy as our primary command for generating this plot. I will not go into detail as to what all of the options mean. It’s better just to get some practice on your own. Hence, it’s time for the homework!

7. Homework
1) Create a global map showing the major plate boundaries (an x,y table of plate boundaries is given on the web page) using a Mollweide projection centered on the Pacific Plate.

2) Create a map that plots the locations of all Global Seismic Network (GSN) stations using a Winkel projection. Put the station code (e.g., AAK) next to the plot symbol for each station. The station names and locations are provided in a file on the web page.

3) The data file envelope.xy is given on the web page. This file contains the envelope of a seismogram recorded at Eielson Air Force Base in Alaska, with the direct P-wave arrival aligned at 0 sec. The envelope is given here to show the exponential decay of the coda waves. Make a plot of this seismogram in two separate panels. In the top panel show the raw seismogram as provided in the file envelope.xy. Any time a quantity displays exponential behavior it is customary to plot the natural log of the amplitudes instead of the raw amplitudes. Such a plot shows a linear decay instead of an exponential decay, and the decay rate can then be estimated by fitting a straight line to the linear region. The bottom panel of the plot should show the natural log of the amplitude of the seismogram. Indicate which region immediately following the direct P-arrival can be fit with a straight line.

4) This will be one of the most useful scripts you will ever write. When doing any type of scientific work one always generates tabular data (i.e., x, y pairs of data). It is really nice to be able to take a quick glance at these data without having to write a special GMT script every time. For a finalized plot you will probably want to write a special script, but sometimes it's really, really useful just to be able to take a quick look. Write a generic script that will plot a 2D group of data. Make an option for plotting solid lines or symbols. Name the script xyplot. The plot should contain a title that includes the name of the file being plotted. The script should run as:

    >> xyplot input.xy s

where, if the s option is given, the data will be plotted with filled-in circles. Hint: Use GMT's minmax and gmtmath commands to help in determining the plot range and the spacing between tickmarks for the axes.


L06 – Generic Mapping Tools (GMT) - Part 2

1. Plotting Fields - Grdimage
Last time we plotted a couple of maps. This was great, but there wasn't a lot of information on those maps other than where the coastlines are located. The next example shows how to add some additional information. In this case, we are plotting seismic S-wave velocities derived from the tomographic inversion by Jeroen Ritsema (U. Michigan). The values of S-wave velocity are given (in % difference from a standard reference model, PREM) in the file s20rts.grd (this file is supplied on the webpage along with the data sets for use in the homework). Grab the files available for the homework and try the following example.

    #!/bin/csh
    # Generate a color palette table
    makecpt -Ccelsius -T-5/5/.1 -Z -I >! colors.cpt
    # Plot the gridded image
    grdimage s20rts.grd -Ccolors.cpt -R120/280/-55/40 -JX6.4i/3.8i \
        -B20g10000f10/10g10000nSeW -P -E300 -K >! plot.ps
    # Add the coastline information
    pscoast -R -JX6.4d/3.8d -Dc -W1/2/255/255/255 -P -O -K -A10000 \
        -N1/1/255/255/255 >> plot.ps
    # Add a scale bar of the colors
    psscale -D2.0i/4.75i/3.5i/.3ih -O -Ccolors.cpt -B2.5 >> plot.ps
    rm colors.cpt
    gs -sDEVICE=x11 plot.ps

The resulting image you should get is shown here:

So, what did we do in creating this image? I am assuming you are an expert by now at searching the man pages to determine what all of the flags mean. But let's discuss a couple of the main points in creating such an image:


1) We need data in a specific type of gridded format. For this example we needed our data in longitude, latitude, S-wave velocity format. In this case it was supplied in a special kind of file called a netCDF or grid file. This is the supplied file called s20rts.grd. Later on in this lecture we will talk about how to create this type of file.

2) We needed a file that provides a mapping between seismic wave velocities and the colors used to plot them. This is called a color palette table or .cpt file. Here we used the file colors.cpt. The next section of this lecture shows how to generate such a table.

3) Once we have a .grd and a .cpt file we can create a colored image using the GMT command grdimage.

4) Note that we can now plot coastline information with pscoast, but it has to overlay the color information or else we won't be able to see it.

5) Finally, we should always add a scale bar to our plot so we know what the colors mean. This is done with GMT's psscale command.

2. Color Palette Tables
Now that we've seen an example of plotting a gridded image, let's discuss how the coloring of this image is done. Note that in the above example we specified a file: colors.cpt. The .cpt extension is used to denote a color palette table. Here is an example of a cpt file designed specifically for plotting the topography of Scotland:
    # scotland.cpt
    #
    # by Eric Gaba
    -1750 121 178 222 -1500 121 178 222
    -1500 132 185 227 -1250 132 185 227
    -1250 141 193 234 -1000 141 193 234
    -1000 150 201 240  -750 150 201 240
     -750 161 210 247  -500 161 210 247
     -500 172 219 251  -200 172 219 251
     -200 185 227 255  -100 185 227 255
     -100 200 235 255   -50 200 235 255
      -50 216 242 254     0 216 242 254
        0 172 208 165    50 172 208 165
       50 148 191 139   100 148 191 139
      100 168 198 143   200 168 198 143
      200 189 204 150   400 189 204 150
      400 209 215 171   600 209 215 171
      600 239 235 192   800 239 235 192
      800 222 214 163  1000 222 214 163
     1000 202 185 130  1200 202 185 130
     1200 192 154  83  1400 192 154  83
    B 0 0 0
    F 255 255 255
    N 255 0 0

What does this mean? Looking at the first line we have:


-1750 121 178 222 -1500 121 178 222

What this says is that we want to color elevations between -1750 and -1500 (so this is actually below sea level) with the RGB color R=121, G=178, B=222, which is a light blue. The next line is:

    -1500 132 185 227 -1250 132 185 227

which simply states that we want to color elevations between -1500 and -1250 with the color R=132, G=185, B=227. So the cpt file is just a mapping between elevations and the colors used to represent those elevations. The final three lines of the cpt file are special:

    B 0 0 0
    F 255 255 255
    N 255 0 0

Note that the color palette table in our example scotland.cpt is valid for elevations between -1750 and 1400 m. These three lines state:

1) B: Elevations are less than the defined range. In this example, if an elevation occurs in the map with a value < -1750 m, then color that space black (R=0, G=0, B=0).
2) F: Elevations are greater than the defined range. In this example, if the elevation is > 1400 m, then color that space white (R=255, G=255, B=255).
3) N: If no elevation data is given for that space (i.e., the elevation is given as NaN, Not-a-Number), then color that space red (R=255, G=0, B=0).

This is an example of a categorical type of cpt. What this means is that we do not interpolate color within a given block of data. That is, in our example, for the range of elevations between -1750 and -1500 m, we use a single color. Hence, a scale bar for this type of cpt looks like discrete blocks as shown here:

However, we can also make a continuous cpt, where color is linearly interpolated between data ranges. To do this, just consider the first line of our cpt file:
-1750 121 178 222 -1500 121 178 222

Now, what would happen if we wrote:
-1750 121 178 222 1400 192 154 83

This line by itself could be a cpt file. Try it out by creating a cpt file with just that line, and plotting a color bar using psscale.
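To see what "linearly interpolated" means for that one-line cpt, here is a small sketch that works out the color at a single elevation by hand. It assumes each of the R, G, and B channels is interpolated independently between the two end colors (which, as far as I know, is what GMT does with its default RGB color model). The elevation z = 0 and the file name color_at_z.txt are just illustrative choices.

```shell
# Color at elevation z for the continuous cpt line
#   -1750 121 178 222 1400 192 154 83
# f is the fractional distance of z between the two end elevations.
echo "-1750 121 178 222 1400 192 154 83" | awk -v z=0 '{f = (z - $1)/($5 - $1); printf "%.0f/%.0f/%.0f\n", $2 + f*($6 - $2), $3 + f*($7 - $3), $4 + f*($8 - $4)}' > color_at_z.txt
cat color_at_z.txt
```

Sea level sits a little more than halfway up the range, so the result lies slightly closer to the brown end color than to the blue one.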


As you can see, we can make cpt files just by hand editing. But luckily GMT comes with a number of preloaded cpts. To see their names just type makecpt. The makecpt utility is great for generating cpts with custom bounds. For example, what if I wanted to make a continuous color palette table using the GMT base file relief for topography, but with the elevations limited to the range +/- 1000 feet? We can do this easily by:

    >> makecpt -Crelief -Z -T-1000/1000/25 >! color.cpt

A couple of flags are of special note:

1) -Z: This flag says to make a continuous cpt file.
2) -I: This flag says to reverse the colors (it is not used in the command above, but we used it in the first example of this lecture).

As a final note, there is an excellent resource called CPT City located at:

http://soliton.vm.bytemark.co.uk/pub/cpt-city/

There are literally hundreds of color palette tables contributed to this web site, many by professional cartographers, so it's often useful to peruse the site for interesting color schemes. Note that the colors there are given in a variety of formats, but the cpt format is the same as that used in GMT.

3. Generating Grid Files (xyz2grd)
In general our data sets aren't arranged in .grd files. But luckily GMT comes supplied with a nice command to create them: xyz2grd. To make a .grd file one typically needs to have data ordered in a table of x, y, and z values. For example, if I have a global data set of temperatures on the Earth's surface at a specific time, then I would want the table to look like:

    lon 1   lat 1   T1
    lon 2   lat 2   T2
    lon 3   lat 3   T3
    …
    lon N   lat N   TN

If I named this input file temperature.xyz, and my longitude and latitude points were spaced on a 1° × 1° interval, then I could generate a .grd file as follows:

    >> xyz2grd temperature.xyz -Gtemps.grd -R0/360/-90/90 -I1/1 -V

Here, the -G flag says that we want to create a grid file named temps.grd. The -I flag sets the spacing between grid points in x and y. In our example we suggested a 1° × 1° interval, which is why this flag is -I1/1. The -V flag states that we want xyz2grd to spit out as much information as possible while generating the .grd file. This can be really useful in trying to debug problems.
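If you do not have a real temperature table handy, a stand-in with the right layout is easy to generate. The sketch below writes one row per node of a 1° × 1° grid covering -R0/360/-90/90. The temperature model (30 at the equator falling to -30 at the poles) is entirely made up, and the file name synthetic_temps.xyz is hypothetical. Assuming gridline registration, which I believe is the GMT default, both end points are included in each direction, so the table has 361 × 181 = 65341 rows.

```shell
# Write a synthetic lon/lat/T table on a 1 x 1 degree grid.
# T = 60*cos(lat) - 30 is an invented model, used only to fill the third column.
awk 'BEGIN {d2r = atan2(0, -1)/180; for (lat = -90; lat <= 90; lat++) for (lon = 0; lon <= 360; lon++) printf "%d %d %.2f\n", lon, lat, 60*cos(lat*d2r) - 30}' > synthetic_temps.xyz
head -3 synthetic_temps.xyz
awk 'END {print NR, "rows"}' synthetic_temps.xyz
```

A file like this can be passed to the xyz2grd command above in place of temperature.xyz.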

4. Global Datasets
The National Oceanic and Atmospheric Administration (NOAA) manages a number of really nice global datasets, which can often be downloaded in netCDF grid file format. The Etopo series of surface topography and bathymetry is probably the most useful data set you will find on-line. http://www.ngdc.noaa.gov/mgg/global/global.html

Other interesting data sets include ocean crustal ages: http://www.ngdc.noaa.gov/mgg/ocean_age/ocean_age_2008.html


and Geomagnetic data: http://www.ngdc.noaa.gov/geomag/geomag.shtml

5. What are .grd files?
In the older GMT examples, the authors always referred to their gridded data sets as .grd files, or grid files. However, I notice that they now more commonly refer to them as .nc files. There is no change in the file format; the files are still in the format called netCDF.

NetCDF (Network Common Data Form) is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.

http://www.unidata.ucar.edu/software/netcdf/

This data format is embraced much more in the atmospheric sciences than it is in the geophysical community. In the geophysical community everyone still thinks they should design their own data format for everything they do. Many of the old timers still like to cling to their data as well. But if we were to adopt a common format for sharing data, netCDF would probably be it. If you really want details on the netCDF format, see my other handout on the web page. Here is a reiteration of the important points:

NetCDF is an array-based data structure for storing multidimensional data. A netCDF file is written with an ASCII header and stores the data in a binary format. The space saved by writing binary files is an obvious advantage, particularly because one does not need to worry about the byte order. Any byte-swapping is handled automatically by the netCDF libraries, so a netCDF binary file can be read on any platform. Some features associated with using the netCDF data format are as follows:


Coordinate systems: Support for N-dimensional coordinate systems.
• X-coordinate (e.g., lon)
• Y-coordinate (e.g., lat)
• Z-coordinate (e.g., elevation)
• Time dimension
• Other dimensions

Variables: Support for multiple variables.
• E.g., S-wave velocity, P-wave velocity, density, stress components…

Geometry: Support for a variety of grid types (implicit or explicit).
• Regular grid (implicit)
• Irregular grid
• Points

Self-Describing: Dataset can include information defining the data it contains.
• Units (e.g., km, m/sec, gm/cm3,…)
• Comments (e.g., titles, conventions used, names of variables (e.g., P-wave velocity), names of coordinates (e.g., km/sec),...)

As a special point, netCDF comes with a utility called ncdump that will allow you to dump the contents of a netCDF file into an ASCII readable format. It also allows you to check just the header information. Try it:

    >> ncdump -h netcdf_file

GMT also comes with a similar utility:

    >> grdinfo netcdf_file

6. Homework
1) Write a C Shell script utility that one can use to visualize a color palette table. That is, it should use the psscale command to generate a plot from which one can quickly see what the .cpt file looks like. If we name the script viewcpt then we should be able to run it as:

    >> viewcpt cptfile

where cptfile is the name of the color palette table we are interested in seeing.

2) Using the Etopo1 grid file create a world map to emulate the first plot shown in Section 4.

3) Make a global plot showing elevation of land masses using the Etopo1 elevations. Instead of using bathymetry for the oceans use the ocean ages, and overlay the plate boundaries we used in the last homework. This problem will require clipping. Refer to GMT Example #17 for hints on how to do this. Hint: Elevations will look best here as gray scale.

4) The webpage contains a file: Tomography_TXBW.xyz. This file has longitude, latitude, and variations of S-wave velocity based on the tomographic inversion of Steve Grand (University of Texas at Austin). The velocities are given for a layer just above the core mantle boundary. Convert this file into a .grd file and plot it in two panels centered on (a) the central Pacific Ocean region and (b) Africa. It is customary to plot the lowest values (low velocities) with red colors and the highest values (high velocities) with blue colors. Make the color palette table a continuous table. Also plot world hot spot locations as triangles (to emulate volcanoes) on the maps. Be sure to include a scale bar for the colors. Is there any relation between deep mantle seismic velocities and hot spot locations?


L07 – Generic Mapping Tools (GMT) - Part 3

1. Local Scale Mapping
In our last lecture we showed some examples of global scale mapping. We also introduced a global elevation model, Etopo1. This file has a single elevation value for each 1 minute × 1 minute grid point, which corresponds to grid cells of about 1.85 km × 1.85 km. This is a really fantastic grid size for plotting at global scales, or even on the statewide scale. Consider the next script, which shows an example of plotting elevations for the state of Utah.

    #!/bin/csh
    makecpt -CColor_DEM.cpt -Z -T800/5000/100 >! utah.cpt
    grdimage ETOPO1_Ice_g_gmt4.grd -R-116/-108/36/43 -JX5i/6i \
        -B2g10000nSeW -Cutah.cpt -V -P -K >! utah.ps
    pscoast -R -JX5d/6d -Df -V -P -O -K -W2 -N2 \
        -S33/204/255 >> utah.ps
    # draw an inset box
    psxy -R -JX -W8/255/0/0 -O -P << eof >> utah.ps
    -111.833 40.75
    -111.833 40.583
    -111.5 40.583
    -111.5 40.75
    -111.833 40.75
    eof
    gs -sDEVICE=x11 utah.ps

The execution of which results in the following plot:


Note that I use a color palette table derived from Color_DEM, which was obtained from the CPT City website and is available in this lecture's accompanying material on the web site. The resulting image on the scale of the state of Utah is quite nice. But notice the red inset box, which approximately covers the area between Mill Creek and Big Cottonwood Canyons. If we zoom into this space, what we get looks like this:

As you can see, the resolution isn't that pretty at this scale. So, what do we do? Fortunately there are a ton of good data sources for Digital Elevation Models (DEM), ranging from 90 m × 90 m (lateral grid size) down to 2 m × 2 m in certain locations.

Data sources

A fair amount of data is currently available online. However, it is usually organized on a state-by-state basis. For example, one can find DEM data for the state of Arizona at: http://sco.az.gov. But what is available there compared with other states varies quite dramatically. Utah has one of the nicest repositories of DEM data I've seen yet. This is available at: http://gis.utah.gov

To get to the data click on GIS Data & Resources > SGID, Utah GIS Data

Note that here there are two important types of data resources: (1) Vector Data, and (2) Raster GIS Data. We will talk about using both of these file sources.

2. Generating netCDF files from DEM files.
We will first look at Raster GIS Data, and work through collecting elevation models and converting them to a .grd file. Just follow the instructions below, which are rather recipe-like.

• Go to GIS Data & Resources > SGID, Utah GIS Data > Raster GIS Data > Elevation/Terrain Data
• Next go to: 5 Meter Auto-Correlated Elevation Model (DEM)
• Next go to: Retrieve 5 meter DEM via Interactive Map.
• Let's generate an image of the region between Mill Creek and Big Cottonwood Canyons that will look nicer than the above image. Zoom into the region around Salt Lake City, and notice that the map is broken up into squares. Each square represents a different DEM file. Download the files: 12TVK200800.asc, 12TVK400800.asc, 12TVL200000.asc, and 12TVL400000.asc
• The .asc files we downloaded are not written in the most convenient manner for converting to a .grd file with GMT. In the material for this lecture there is a C Shell script called catascfiles.csh. This is a script I wrote that combines all of the files we downloaded into one big file. The final output file is named temp01, and is just a table of easting, northing, and elevation (in meters). The brunt of the work is done with a Fortran 90 code called Lidar_ASC2XYZ.f90. We will cover Fortran 90 later in the semester, but for now just assume this code I wrote works. So, let's make one big file of x, y, z positions by typing:

    >> ./catascfiles.csh

• Now, let's generate a grid file. Here is an example script showing how to do this:
    #!/bin/csh
    #
    # Example script for generating .grd files
    #
    # Input file is 'temp01' written out from catascfiles.csh
    #
    #------------------------------------------------------------------#

    #------------------------------------------------------------------#
    # Set input parameters
    #------------------------------------------------------------------#
    set ifile = temp01       # input file output from catascfiles.csh
    set elev  = f            # output elevation in meters (m) or feet (f)
    set cell  = 5            # cellsize (5 meters for the 5-m DEM)
    set ofile = Mt_Olympus   # Prefix naming for output files
    #------------------------------------------------------------------#

    #------------------------------------------------------------------#
    # Determine boundaries of grid file
    #------------------------------------------------------------------#
    echo "Determining map bounds..."
    minmax -C $ifile >! temp.f1

    set xmin = `awk 'NR == 1 {print $1}' temp.f1`
    set xmax = `awk 'NR == 1 {print $2}' temp.f1`
    set ymin = `awk 'NR == 1 {print $3}' temp.f1`
    set ymax = `awk 'NR == 1 {print $4}' temp.f1`

    rm temp.f1
    #------------------------------------------------------------------#

    #------------------------------------------------------------------#
    # Convert elevations to feet if switch is set
    #------------------------------------------------------------------#
    if ($elev == 'f') then
      echo "Converting elevations from meters to feet..."
      awk '{print $1, $2, ($3*3.2808399)}' $ifile >! temp.f2
    else
      cp $ifile temp.f2
    endif
    #------------------------------------------------------------------#

    #------------------------------------------------------------------#
    # make .grd file
    #------------------------------------------------------------------#
    echo "Gridding file..."
    xyz2grd temp.f2 -R${xmin}/${xmax}/${ymin}/${ymax} \
        -I${cell}/${cell} -V -G${ofile}.grd
    #------------------------------------------------------------------#
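The script relies on minmax -C to report the extremes of each column on a single line. To see what that step produces without running GMT, the same four numbers can be found with awk alone. In this sketch the three-row table demo.xyz is made-up data standing in for temp01, and bounds.txt is a hypothetical scratch file.

```shell
# Made-up easting/northing/elevation rows standing in for temp01.
cat << 'EOF' > demo.xyz
430000 4488500 1500.0
430005 4488500 1502.5
430000 4488505 1498.2
EOF
# Track the smallest and largest x and y, then print them in the order
# xmin xmax ymin ymax, the same order the script reads from temp.f1.
awk 'NR == 1 {xmin = xmax = $1; ymin = ymax = $2} {if ($1 < xmin) xmin = $1; if ($1 > xmax) xmax = $1; if ($2 < ymin) ymin = $2; if ($2 > ymax) ymax = $2} END {print xmin, xmax, ymin, ymax}' demo.xyz > bounds.txt
cat bounds.txt
```

These are the values that end up in the -R flag of xyz2grd.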

At this point we can generate a plot using grdimage to view our elevations. The following is an example that will get us in the same vicinity as our image above.

    #!/bin/csh
    makecpt -CColor_DEM.cpt -Z -T5000/10500/100 >! mtolympus.cpt
    grdimage Mt_Olympus.grd -R430000/459995/4488500/4502000 -JX6i/2.7i \
        -B10000g100000/2000g1000000000nSeW -Cmtolympus.cpt -V -P -K \
        -X1 -Y2 >! olympus.ps
    # last command: -O only, no -K, so the postscript file is closed
    pscoast -R -JX3d/2d -Df -V -P -O -W2 -N2 >> olympus.ps
    gs -sDEVICE=x11 olympus.ps

This gives us the following image:

This has significantly more detail than our image generated using Etopo1. But, we can make it look better than this.


3. Intensity Files
If we really want our elevation images to pop then we need to add some illumination. The easiest way to do this is with the grdgradient command. Grdgradient is used to calculate the directional derivative (or gradient) of a .grd file. Consider our example above where we generated a grid file for the Mt. Olympus Wilderness Area. We could just add the following couple of lines to the end of our script that generated the .grd file, to also produce shading.
    #---------------------------------------------------------------------#
    # make intensity (.gradients) file
    #---------------------------------------------------------------------#
    echo "Calculating intensities..."
    grdgradient ${ofile}.grd -A0/270 -G${ofile}.gradients -Ne0.6 -V
    rm temp.f2
    #---------------------------------------------------------------------#

What is most important here is the -A flag, which sets the direction (azimuth in degrees clockwise from true North) from which we shine our light. To see how this works, we can create a simple .grd file of a mound. In this case we create a mound in a Cartesian space that is longer in the y-direction than in the x-direction. These data are located in the file mound.xyz. Using these data we can explore different gradient effects. Here is an example:
    #!/bin/csh
    xyz2grd mound.xyz -Gmound.grd -R-100/100/-100/100 -I1/1

    grdgradient mound.grd -Ne0.6 -A0   -GAzi_0.grd
    grdgradient mound.grd -Ne0.6 -A45  -GAzi_45.grd
    grdgradient mound.grd -Ne0.6 -A90  -GAzi_90.grd
    grdgradient mound.grd -Ne0.6 -A-45 -GAzi_m45.grd

    makecpt -T-1000/2000/100 -Z -Cgray >! color.cpt

    grdimage mound.grd -R -JX3i/3i -B0 -Ccolor.cpt -P -K -X1 -Y6 \
        -IAzi_0.grd >! mound.ps
    grdimage mound.grd -R -JX -B0 -Ccolor.cpt -P -O -K -X4 \
        -IAzi_45.grd >> mound.ps
    grdimage mound.grd -R -JX -B0 -Ccolor.cpt -P -O -K -X-4 -Y-4 \
        -IAzi_90.grd >> mound.ps
    grdimage mound.grd -R -JX -B0 -Ccolor.cpt -P -O -X4 \
        -IAzi_m45.grd >> mound.ps

    rm Azi_0.grd Azi_45.grd Azi_90.grd Azi_m45.grd mound.grd
    gs -sDEVICE=x11 mound.ps


[Figure: the mound illuminated from four directions. Panels: Azimuth = 0°, Azimuth = 45°, Azimuth = 90°, Azimuth = -45°]
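To make the role of the -A azimuth concrete, here is a small sketch of the quantity grdgradient computes at a single point. As I understand the grdgradient documentation, it is the negative of the directional derivative, -[dz/dx*sin(azim) + dz/dy*cos(azim)], so slopes that face the light come out positive (bright) and slopes that face away come out negative (dark). The slope values dz/dx = 0.3 and dz/dy = 0.1 are made up for illustration, the -Ne normalization is ignored, and the file names shade.awk and shade.txt are hypothetical.

```shell
# Shading value at one point of a surface for four light azimuths.
cat << 'EOF' > shade.awk
BEGIN {
    d2r  = atan2(0, -1)/180   # degrees to radians
    dzdx = 0.3                # invented slope toward the east
    dzdy = 0.1                # invented slope toward the north
    n = split("0 45 90 -45", az, " ")
    for (i = 1; i <= n; i++) {
        a = az[i]*d2r
        # azimuth is clockwise from north, so the light direction is (sin a, cos a)
        printf "%d %.2f\n", az[i], -(dzdx*sin(a) + dzdy*cos(a))
    }
}
EOF
awk -f shade.awk > shade.txt
cat shade.txt
```

This patch of ground rises toward the east and north, so it faces roughly west-southwest. Light from azimuths 0, 45, and 90 strikes its back side and gives negative values, while light from -45 (the northwest) gives a positive one.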

Finally, once we have calculated the gradients for our Mt. Olympus DEM file, we can re plot the data as follows:
    #!/bin/csh
    makecpt -CColor_DEM.cpt -Z -T5000/10500/100 >! mtolympus.cpt
    grdimage Mt_Olympus.grd -R430000/459995/4488500/4502000 -JX6i/2.7i \
        -B10000g100000/2000g1000000000nSeW -Cmtolympus.cpt -V -P -K \
        -X1 -Y2 -IMt_Olympus.gradients -Sb -E300i >! olympus.ps
    # last command: -O only, no -K, so the postscript file is closed
    pscoast -R -JX3d/2d -Df -V -P -O -W2 -N2 >> olympus.ps
    gs -sDEVICE=x11 olympus.ps

The topography definitely pops more with the shading added! As a note, in this image I put the lighting at two different azimuths (-A0/270). Notice in the image that the topography is dominated by East-West and North-South running ridgelines. Thus putting a light source at both 0° and 270° makes both orientations of ridgelines stand out.


4. Map Scale in GMT
Thus far we've been specifying how big to make our maps in terms of inches along either the x- or y-axis. For example, we've typed -JX6i/2.7i to draw our map of the Mt. Olympus Wilderness Area at roughly the same scale in the x- and y-directions. There is a much easier way to accomplish this. Everyone in here has looked at topographic quads (at least all geoscience majors). Most commonly these are plotted at the scale 1:24,000. How do I draw a map to that scale in GMT? Simple, we just change our flag to -Jx1:24000. Our map above covers an area that is a little too large for such a scale to fit on a sheet of paper, hence it might be better to try something like -Jx1:200000. All we did was use a lower case x instead of an upper case X, and GMT will look for a map scale instead of an absolute map size. Note that this works with all map projections (e.g., -Jr as opposed to -JR).

At this point it would also be useful to add a distance scale bar to the plot. This is most conveniently done using the -L flag in either the pscoast or psbasemap command. However, this can be challenging when working in the UTM coordinate system, as we have been with our DEM data. The key to getting around this is as follows:

1) Convert map bounds from UTM coordinates to latitude and longitude. In our plot of the Mt. Olympus region we used the UTM region:

    -R430000/459995/4488500/4502000

We can use GMT's mapproject command to find the longitude and latitude. Above we have the minimum x- and y-location given by 430000, 4488500. To find the minimum lon- and lat-location:

    echo 430000 4488500 | mapproject -Ju12T/1:1 -F -C -I

To find the maximum lon- and lat-location:

    echo 459995 4502000 | mapproject -Ju12T/1:1 -F -C -I

Note that we used the -Ju projection, which is the UTM projection. With the UTM projection we have to specify the UTM zone, which for our area of interest is zone 12T. Also note that we use the -I flag, which does the inverse transformation. That is, we want to go from UTM coordinates to longitude and latitude.

2) Plot map data with pscoast in the UTM coordinate system. Our command may look something like the following, where the latitude and longitude bounds determined in step 1 are represented by the variables $lon1, etc.
    pscoast -Ju12T/1:175000 -R${lon1}/${lon2}/${lat1}/${lat2} \
        -Dc -P -O -K -W2 -N1 -A100000 \
        -Lf-112.24/40.84/${lat1}/5m+l1:175000+u --LABEL_FONT_SIZE=10 \
        >> plot.ps

Note that here it is the -L flag that we are interested in. The most basic form of the -L flag looks like:

    -Lf${lon_pos}/${lat_pos}/${lat_scale}/${distance}

where,

    lon_pos   = the longitude position on the map where you want the scale located.
    lat_pos   = the latitude position on the map where you want the scale located.
    lat_scale = the latitude at which the length of the scale bar is determined.
    distance  = the distance that the scale bar represents.

In its most basic form, we could have just written:

    -Lf-112.24/40.84/40.84/5m

where the 5m indicates we want the scale bar to show 5 miles.
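Returning to the -Jx map scale from the start of this section: the size of the map on paper is just the ground extent divided by the scale denominator. The sketch below works this out for the Mt. Olympus UTM bounds (in meters) at the two scales discussed. The file names mapsize.awk and mapsize.txt are made up.

```shell
# Paper width and height (inches) of the region
#   -R430000/459995/4488500/4502000
# at 1:24,000 and at 1:200,000. UTM coordinates are in meters.
cat << 'EOF' > mapsize.awk
BEGIN {
    m2in = 1/0.0254              # inches per meter
    w = 459995 - 430000          # east-west extent (m)
    h = 4502000 - 4488500        # north-south extent (m)
    n = split("24000 200000", scale, " ")
    for (i = 1; i <= n; i++)
        printf "1:%d %.2f %.2f\n", scale[i], w/scale[i]*m2in, h/scale[i]*m2in
}
EOF
awk -f mapsize.awk > mapsize.txt
cat mapsize.txt
```

At 1:200,000 the map comes out at about 5.9 by 2.7 inches, close to the -JX6i/2.7i size we typed by hand earlier. At 1:24,000 it would be more than four feet wide, which is why that scale does not fit on a sheet of paper for this region.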


6. Homework
In this homework and in the next we will make a variety of plots of Antelope Island. Hence, whatever work you do for this week's HW should be saved for future use.

Download 5-m DEM data covering the Antelope Island region. Convert these elevation data into a netCDF format grid file, and generate a series of maps of Antelope Island. Make three separate plots of the island showing: (A) just colored elevations, (B) elevations colored with intensity shading, (C) bi-modal color (e.g., light brown for land masses and blue for water) with intensity shading. Provide scale bars on the maps for both color (elevation in feet) and distance (in miles). Be sure to label the axes and to label each plot.

Bring a printed copy of your map to the next class. We will vote on the best plot, with the winner receiving a prize. As an example, the third plot might look something like this:

L08 – Generic Mapping Tools (GMT) - Part 4

We’ve come a long way thus far in our GMT usage, yet we have barely touched on some of GMT’s functionality. There are a few more useful tricks I would like to share with you before we move on to our next topic. We will continue to look at local scale applications in our examples, but note that the commands we talk about here are also applicable to global scale applications.

1. Contouring in GMT – Topo Maps
To put it simply, contouring is a pain. It isn’t difficult to generate contour lines, but it can be difficult to get everything to look nice. One just needs patience and a lot of trial and error. So, how do we do it? Let’s use our data set from the Mt. Olympus region DEM. If you don’t have the file handy you can grab a new DEM file from the website. The next example shows a simple way to generate contour lines. Our primary command here is grdcontour.
    #!/bin/csh

    # Plot region zoomed around Grandeur Peak
    set xmin = 433000
    set xmax = 437500
    set ymin = 4504000
    set ymax = 4508500

    # Make cpt
    makecpt -CColor_DEM.cpt -Z -T3000/15000/1000 >! grandeur.cpt

    # Make intensity file
    grdgradient Mt_Olympus.grd -A0/270 -GMt_Olympus.gradients -Ne0.6 -V

    # Grid image
    grdimage Mt_Olympus.grd -R${xmin}/${xmax}/${ymin}/${ymax} -Jx1:24000 \
        -B1000g10000nSeW -Cgrandeur.cpt -V -P -K -IMt_Olympus.gradients \
        -Sb -Ei600 >! grandeur.ps

    # plot 100 foot contour lines
    grdcontour Mt_Olympus.grd -Jx -R -W1/80/80/80 -C100 -P -S4 -O \
        -A500+f3+k80/80/80+s8t -G2i/10 -Djunk -V >> grandeur.ps
    rm junk_*.xyz    #get rid of temporary contour files

    # remove excess files
    rm grandeur.cpt
    rm Mt_Olympus.gradients

    gs -sDEVICE=x11 grandeur.ps

Executing this script will result in the following image:

Here, I went for less emphasis on the contour lines. But, now is your chance to play around. Try out the above script. There are a couple of important flags to keep in mind:

• -C - the easiest way to make contours is to supply the contour interval with this flag. In this example we set it at -C100, which means make contours at 100 foot intervals. Try out 40 and 200 foot intervals.
• -G - the primary use of this flag is to set how far apart the contour labels are spaced. In this example I said put them every 2 inches. But, you should also try spacing them closer together or farther apart.
• -A - this flag has all kinds of options. I chose:
    500 - only label contours that are multiples of 500 feet (i.e., draw labels on 5000, 5500, 6000, etc., feet).
    +f3 - use font number 3 (Helvetica-BoldOblique) to print contour labels.
    +k80/80/80 - plot labels with color 80/80/80.
    +s8 - use an 8 point font size for labels.

As you can see there are many options with the -A flag. Try playing around with them now!

The previous example uses the simplest way to generate contour intervals. But, sometimes you may want to contour specific levels that aren’t evenly spaced. For example, perhaps I want to draw contours every 200 feet with heavy lines, but every 40 feet with lighter lines. Because every multiple of 200 is also a multiple of 40, I don’t want the lighter 40 foot lines drawn on top of the heavy 200 foot lines. So, what I can do is generate a file that specifies which contour levels to plot. The file would look as follows:

    40   C
    80   C
    120  C
    160  C
    240  C
    280  C
    …    …

Note that we didn’t supply a contour at the 200 foot level. The online material contains a file called contours40.dat that can be used as a test. Using this file one would specify the contour interval by the name of the file:

    -Ccontours40.dat

2. Vector Data
If by chance you have been browsing the Utah GIS data resources you may have noticed that there are tons of interesting vector data resources. Unfortunately, all of these data are written in ESRI shapefile format (.shp). So, how do we deal with these data? That will be left as a future exercise for the reader.

3. 3D Perspectives
GMT can also produce 3D perspective views using the grdview command. Revisiting our plot of Grandeur Peak:
    #!/bin/csh

    # Plot region zoomed around Grandeur Peak
    set xmin = 433000
    set xmax = 437500
    set ymin = 4504000
    set ymax = 4508500

    # Make cpt
    makecpt -CColor_DEM.cpt -Z -T3000/15000/1000 >! grandeur.cpt

    # Make intensity file
    grdgradient Mt_Olympus.grd -A0/90 -GMt_Olympus.gradients -Ne0.6 -V

    # Grdview Azimuth/Elevation #1
    grdview Mt_Olympus.grd -R${xmin}/${xmax}/${ymin}/${ymax} -Jx1:24000 \
        -JZ4i -B1000g10000nSeW -Cgrandeur.cpt -V -P \
        -IMt_Olympus.gradients -E180/30 -Qs >! grandeur.ps

    # Grdview Azimuth/Elevation #2
    grdview Mt_Olympus.grd -R${xmin}/${xmax}/${ymin}/${ymax} -Jx1:24000 \
        -JZ4i -B1000g10000nSeW -Cgrandeur.cpt -V -P \
        -IMt_Olympus.gradients -E260/5 -Qs >! grandeur2.ps

    # remove excess files
    rm grandeur.cpt
    rm Mt_Olympus.gradients

    gs -sDEVICE=x11 grandeur*.ps

With grdview we need to specify both the azimuth from which we are viewing the plot and the elevation, as shown in the following diagram. As usual, azimuth is measured clockwise from North and elevation is measured as the positive angle upwards from horizontal.

We specify these parameters with the -E flag, which has the syntax: -Eazimuth/elevation. In the example script we first look at Grandeur Peak from an azimuth of 180°, that is, from due South. This is similar to our other views, where North is towards the top of our plot.

View of Grandeur Peak from an azimuth of 180° and an elevation of 30°.

The above plot is what Grandeur Peak might look like from the view of an airplane. However, a more familiar view to me is looking at it as I drive East up 3300 South. To mimic this view I set our azimuth as 260° and the elevation as 5°. Note that elevation must be positive in grdview.

View of Grandeur Peak from an azimuth of 260° and an elevation of 5°.

4. Raster Overlays (grdview)
In addition to drawing perspective views, grdview supports another important feature: overlaying raster images. Although this isn’t always the easiest thing to do, it is still useful when you want to script the plotting and/or generate several images. To begin with, what kind of raster images can be draped? Pretty much anything you want. So, let’s take as an example draping a topo quad. For the state of Utah we can get our topo quad images from http://gis.utah.gov. Go to:

    GIS Data & Resources > SGID, Utah GIS Data > Raster GIS Data > USGS Topographic Maps DRGs > By Quad Name, USGS Scanned Topographic Maps (1:24000, GeoTIFF)

For the following example we use the MOUNT AIRE (Q1321) quad. We will generate the following shaded relief image centered on Mount Raymond.

The following script demonstrates the process.

    #!/bin/csh

    # Which DRG files to use
    set prefix = q1321_DRG24k-c        #leave off the extensions
    set geotif = ${prefix}.tif         #actual .tif image
    set reference = ${prefix}.tfw      #reference positions

    # Set bounds of geotif image
    set lon1 = `awk 'NR == 5 {print $1}' $reference`
    set lat2 = `awk 'NR == 6 {print $1}' $reference`
    set x_inc = `awk 'NR == 1 {print $1}' $reference`
    set y_inc = `awk 'NR == 4 {print $1}' $reference`
    set lon2 = `awk 'NR == 1 {print ('$lon1'+'$x_inc'*8752)}' $reference`
    set lat1 = `awk 'NR == 1 {print ('$lat2'+'$y_inc'*11447)}' $reference`
    echo $lon1 $lon2 $lat1 $lat2

    # Convert collarless tif to .ras using ImageMagick
    convert $geotif topo.ras

    # Convert .ras to .grd
    gmt2rgb topo.ras -I1 -F -Gtopo_%c.grd

    # Its easiest to set the bounds of the .grd files with grdedit
    grdedit topo_r.grd -R${lon1}/${lon2}/${lat1}/${lat2}
    grdedit topo_g.grd -R${lon1}/${lon2}/${lat1}/${lat2}
    grdedit topo_b.grd -R${lon1}/${lon2}/${lat1}/${lat2}

    # Make intensity file
    grdgradient Mt_Olympus.grd -A0/270 -GMt_Olympus.gradients -Ne0.6 -V

    # Make map bounds centered on Mt. Raymond
    set xmin = 439000
    set xmax = 443000
    set ymin = 4499000
    set ymax = 4503000

    # Gridview it up
    grdview Mt_Olympus.grd -R${xmin}/${xmax}/${ymin}/${ymax} -Jx1:24000 \
        -JZ4i -B1000g10000nSeW -V -P -IMt_Olympus.gradients \
        -E180/90 -Qi600 -Gtopo_r.grd,topo_g.grd,topo_b.grd >! mtraymond.ps

    rm topo_r.grd topo_g.grd topo_b.grd
    rm Mt_Olympus.gradients topo.ras

    gs -sDEVICE=x11 mtraymond.ps

So what did we do? The following pointers may be helpful:

• Note that the file *.tfw contains the UTM position of the upper-left corner of the image. This file also contains the increment between each pixel in the x- and y-directions.
• The GMT codes can only read Sun raster image files (.ras). Fortunately it is exceptionally simple to convert from .tif to .ras with ImageMagick’s convert utility.
• GMT’s command gmt2rgb is used to convert the raster image (.ras) into a series of three .grd files. The three files contain the Red, Green, and Blue color data from the raster image.
• I typically find it is easiest to change the boundaries of the .grd files using GMT’s grdedit command.
• Lastly, we just drape the .grd files with the grdview command.
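The bounds that the script computes from the .tfw file can be written out explicitly. As a sketch, let (x_UL, y_UL) be the upper-left position (lines 5 and 6 of the file), Δx and Δy the pixel increments (lines 1 and 4), and N_x = 8752 and N_y = 11447 the image dimensions in pixels that are hard-wired into the script. These symbols are introduced here only for illustration; they do not appear in the script itself.

```latex
\begin{aligned}
x_{\min} &= x_{UL}, & x_{\max} &= x_{UL} + N_x\,\Delta x,\\
y_{\max} &= y_{UL}, & y_{\min} &= y_{UL} + N_y\,\Delta y .
\end{aligned}
```

In a world file Δy is normally negative, since image rows are counted from the top downwards, which is why adding N_y Δy to the top edge gives the southern bound. In the script these four values are stored in lon1, lon2, lat2 and lat1 respectively, even though they are UTM coordinates rather than longitudes and latitudes.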

Once you’ve learned how to drape images you can add just about any type of information you want to a plot. As a final example consider the following plot I made that shows peak ground acceleration due to a hypothetical earthquake on the Wasatch Fault. What is important is not so much the peak acceleration values, but the point that anything you can color up can easily be draped in a GMT image.

5. Animations
Generating animations isn’t exactly a GMT task. However, it is a natural extension since all plots made in GMT are scripted. Hence, with some looping and variables we can quickly generate several different plots that can be strung together into an animation. There are several ways to generate animations. Here we will discuss the primary technique and show two different ways to finalize the animation product: (1) using ImageMagick to generate animated gifs (.gif), or (2) using ImageReady to produce QuickTime (.mov) files. All animations start with the same need: animation frames. Here we will generate animation frames for a very simple example: the rotating Lunar surface.
    #!/bin/csh

    # set up some initial values
    #---------------------------------------------------------------------#
    set gfile = lunar_topo.grd                                # lunar topo
    gmtset PAGE_COLOR 0/0/0                                   # color
    makecpt -Cgray -Z -T-9000/8000/200 >! lunar.cpt           #.cpt file
    grdgradient $gfile -A0/270 -Glunar.gradients -Ne0.6 -V    # gradients
    #---------------------------------------------------------------------#

    # start looping through longitudes
    #---------------------------------------------------------------------#
    @ lon = 0
    while ($lon <= 360)

       # create ordered numbering scheme for output files
       if ($lon < 10) then
          set ofile = lunar_00${lon}.ps
          set jfile = lunar_00${lon}.jpg
       else if ($lon < 100) then
          set ofile = lunar_0${lon}.ps
          set jfile = lunar_0${lon}.jpg
       else
          set ofile = lunar_${lon}.ps
          set jfile = lunar_${lon}.jpg
       endif

       grdimage $gfile -R0/360/-90/90 -JG${lon}/20/6.5i -Bg30 \
           -Clunar.cpt -V -P -Sb -Ei300 -Ilunar.gradients >! $ofile

       convert -compress Lossless -density 150x150 $ofile $jfile
       rm $ofile

       @ lon = $lon + 5
    end
    # end looping through longitudes
    #---------------------------------------------------------------------#

    # make an animated gif using ImageMagick
    convert -adjoin -delay 10 -loop 0 *.jpg lunar.gif

    rm lunar.cpt lunar.gradients

In the above example we use grdimage to generate images of the lunar surface using the supplied file: lunar_topo.grd. We loop through longitude and at each step just change the projection center to a different longitude. Note that the most important thing here is our naming scheme for the output images. These need to be named such that an ls command lists them in the correct order, which is why the longitude is padded with leading zeros. Finally, each .ps image is converted to a .jpg image and all of these jpegs are combined into an animated .gif file using ImageMagick. One snapshot is shown below.

In running the above example we are left with a number of .jpg images. This is on purpose so we can see how to stitch them together into a QuickTime movie. Adobe Photoshop also comes with a program called Adobe ImageReady. You will have to be on a Windows or Mac OS to access this. To create a movie file in ImageReady do as follows:

• Copy a directory that contains all of the .jpg images over to the Windows or Mac machine.
• Launch Adobe ImageReady.
• In ImageReady do: File > Import > Folder as Frames… and select the folder containing your .jpg images.
• ImageReady opens up all the files and orders them by frame number. You will see the frame order in the Animation Window.
• Hit the triangle in the upper right corner of the Animation Window > Select All Frames
• Set a delay time between frames (e.g., 0.1 sec).
• Select: File > Export > Original Document
• Choose: Quick Time Movie (.mov); Set the quality from Medium to Best.

Now, all you need to do is launch the QuickTime Player to watch your animation.

6. Some Details
As a final note on GMT, there are a ton of values that GMT has initially set as defaults. Don’t believe me? Then type:

    >> gmtdefaults -D

You may have noticed that in some of our example scripts we actually changed some of these. For example, in our script to plot the Lunar surface we made the default page background black with the following line:
gmtset PAGE_COLOR 0/0/0

Sometimes it is essential in GMT to change a setting. For example, when we used the grdproject command do you know what ellipsoid was being used for the projection? It was WGS-84. But, perhaps the map data were in the NAD-1927 datum. Then one needs to change the ellipsoid being used. In this case the Clarke-1866 ellipsoid would be needed and one would type:

    gmtset ELLIPSOID Clarke-1866

To see what options you have in setting the defaults just type:

    >> man gmtdefaults

There is one more useful item to discuss here that deals with page size. The following file exists on your system: $GMTHOME/share/gmtmedia.d. This file contains the paper sizes that one may use. Of immediate importance is that it is customizable. You may have noticed that when we made our animation the 8.5 × 11 page size wasn’t exactly ideal. Instead we want something more akin to a screen size. A nice thing to do is add the following line to the file gmtmedia.d:

    snap 672 504

This specifies a paper type called snap. Now, if I want to use this paper size I can change the following GMT default:

    gmtset PAPER_MEDIA snap

Now you have a page size that is much better suited for creating animations.

7. Homework
1) Generate a plot showing 3D perspective views of Antelope Island. The plot should contain two panels showing the island from two different viewpoints. As always be sure to include all relevant scale bars.

2) Pick an appropriate 3D perspective view of Antelope Island and generate an animation that mimics changing illumination from sunrise to sunset. That is, the illumination azimuth should start from the East and move to the West.

L09 – Fortran Programming - Part 1

1. Compiled Languages
Thus far in this class we’ve mostly discussed shell scripting, in which we bring a bunch of different utilities together into a recipe, or a script. Once we do that we can execute the script and the job runs. This is a very basic style of programming, if you can even call it programming. Note that this shell scripting relied almost entirely on programs that were developed by other people. So, how do people develop these programs? The answer lies mostly in compiled languages (although many programmers today are getting away from compiled languages). That is, most of the programs we used were written in a language like C or Fortran.

Fortran is an example of a compiled language. What does this mean? It means that we write code in a human readable format, e.g.,

    IF (variable == 10) THEN
       write(*,*) "The variable is 10"
    ENDIF

But, the processor in our computer cannot read such code directly. The shell might interpret something like it, but that takes time. Instead we want to convert the above statement into something that the processor can understand. The processor understands instructions given to it in a binary form that we call machine language (assembly code is a human readable shorthand for it). In the old days, one had to write code in these rather obtuse looking machine languages, and some people still do if they are seriously interested in making their code run fast. But, eventually people started writing programs, called compilers, that can convert a language that is easily readable by humans into something that is easily readable by your cpu. In short, a compiler is like a language translator. It converts the statements you make in a language like Fortran or C into a language the cpu can speak.

In this class we will focus on the human readable language called Fortran 90. The reasons are partly historical. In many of the physical sciences (geophysics, meteorology, etc.) most of the original work was written in Fortran. 
Hence, we have a long history of Fortran codes in our disciplines. But, Fortran still persists as a primary language in these fields. I have heard many computer scientists express their dismay that Fortran is still used with exclamations of, “I thought that was a dead language.” I think it ultimately comes down to the majority of computer scientists these days spending their time on web apps instead of serious problems like simulating the weather, or seismic wave propagation on the global scale, or gravity wave propagation across the universe. Hence, my standby response, “Fortran still produces the fastest executable code.” This makes a serious difference when you are talking about days or weeks of simulation time versus months or years when trying to solve the same problem in something like Java. Fortran is also quite simple for trying to solve mathematical problems, which is what it was designed for. It’s not fancy. But it is powerful.

A Brief Fortran History
When we talk about Fortran programming it is important to distinguish which version of Fortran we are referring to. You will see several versions (e.g., f66, f77, f90, f95 etc.) so a brief background is in order. John Backus and his team at IBM began developing Fortran in 1954 and delivered the first compiler in 1957. It is generally regarded as the first widely used high-level programming language! Different versions of Fortran compilers were being developed almost from the start and the need for some kind of standard became immediately apparent. The first official standard was introduced in 1966 and Fortran 66 (f66) was born. The next major standard was a huge improvement to the 1966 version and was released in 1978. Because the standard was agreed upon in 1977 it was termed Fortran 77 (f77). It included much needed program elements such as IF THEN ELSE blocks! The majority of older code you will encounter was written with the f77 standard (although if you ever get any old seismology related code from Don Helmberger it’s likely still written in f66).

After the 1977 standard there was a long hiatus before a new standard was released. A friend of mine tells an interesting story of how Sun Microsystems was responsible for the hold up. At last, a much needed new standard was finally issued in 1990 (f90). This included a deluge of improvements including the free form source (in f77 you had some incredibly annoying restrictions as to where you could write certain parts of the code). There were a number of bugs in the 1990 standard, which were quickly crushed and put into the 1995 standard (f95). Reportedly, even before the 1995 standard was released new improvements were decided upon, which ultimately gave way to the current 2003 standard. At long last we are in a fluid time in Fortran history and even newer standards are forthcoming.

In these lectures we will refer to programming in f90. This is because the major improvements were all made with the f90 standard. It should be noted, though, that sometimes we will talk about things that are actually parts of the newer f95 and f2003 standards.

A computing hero! John Backus.

Fortran Compilers
There are several different Fortran compilers on the market today. Some are free and others are commercially developed. Their primary differences are related to (1) which features of the most recent standard are included, and (2) how good the optimizations are (we will discuss what this means in a later lecture). Some common compilers available on our systems are:

    Compiler   Developer        Cost   Website
    g95                         free   http://www.g95.org/
    gfortran   GNU Fortran      free   http://gcc.gnu.org/fortran/
    ifort      Intel            $$$    http://software.intel.com/en-us/intelcompilers/
    pgf90      Portland Group   $$$    http://www.pgroup.com/
    pathf95    Pathscale        $$$    http://www.pathscale.com/

2. Your first Fortran code
So, let’s dive in with a really quick example. Open up a new file called myprog.f90 and type in the following:

    PROGRAM first
    IMPLICIT NONE

    write(*,*) "I am a fortran program."

    END PROGRAM first

This is about as simple a Fortran code as can be written. The important points are:

• A new program must be given a name with the PROGRAM statement.
• The end of the program must be specified with END PROGRAM.
• IMPLICIT NONE is not required before the main part of the code is written, but it is terrible, terrible practice to leave it out. More on this later!
• write(*,*) says to write something to standard output (a fancy way to say write it out to the screen).

So, how do we run this code? Well, as noted above, it cannot be run until we compile it. In this class we will use the g95 Fortran compiler. So, at the command line type:

    >> g95 myprog.f90

What happens is that a new file called a.out is created (on some systems this may be named a.exe). This new file is in binary format (you cannot open it up with your text editor and view its contents) and is an executable file. To run the program just type:

    >> ./a.out

Before we move on, let’s introduce our first compile flag: the -o flag.

    >> g95 myprog.f90 -o myprog.x

The -o flag lets us name the executable file that is created. In this case we have a new file called myprog.x. Note that Fortran 90 programs require a .f90 extension or else some compilers assume the code is written with the f77 standard. The .x extension above is just my preferred extension to let me know that I have an executable file.

So, there you have it. You are now a Fortran programmer.

3. Numeric Variables
Creating variables in f90 programs is slightly more complicated than in shell scripts. This is because we have a few different types of variables we can use. The three main types you need to worry about now are:

• INTEGER - These are just the set of integers (e.g., 0, 1, 2, 3, -1, etc.). Use integers for counting, but don’t, for example, use integers to do math like 1/3 because the result is not an integer.
• REAL - The set of real numbers (for your normal mathematical operations).
• CHARACTER - Text strings.

There are other types of variables you may declare as well (e.g., COMPLEX) but most of what you will do revolves around those three types. In an f90 program we define our variables at the beginning of the program. For example:

    IMPLICIT NONE
    REAL(KIND=8)    :: arg    ! argument for trig functions
    INTEGER(KIND=4) :: J      ! looping variable

In this example we named two variables: (1) the first is named arg and is a real number, (2) the second is named J and is an integer. Note that in f90 programs we start writing comments with the exclamation point! We will discuss what the KIND statement means in Section 5 of this lecture. But first, let’s create a new example and see how to assign variables.

    PROGRAM variables
    IMPLICIT NONE
    ! Define my variables
    REAL(KIND=8)    :: A, B, C    ! some real numbers
    INTEGER(KIND=4) :: J, K       ! just a lowly integer

    ! Declaring Real Numbers
    A = 10.0
    B = 20.0
    C = A*B - (A/B)
    write(*,*) "C = ", C

    ! Declaring integers
    J = 10
    J = J + 10
    write(*,*) "J =", J

    ! Improper use of an integer -- see what happens
    K = J + 0.1
    write(*,*) "K =", K

    END PROGRAM variables

The above example shows a very important point: never assign a value to a real variable without a decimal point in it! For example, we set A = 10.0. Never do this as A = 10 (without the decimal point). Why? Some Fortran compilers are buggy and will give you A=10.2348712410246287 or some kind of similar garbage if you don’t put the decimal point there.

Another key point is that in Fortran variable names are not case sensitive:

    PROGRAM casesense
    IMPLICIT NONE
    REAL(KIND=8) :: gH

    gH = 10.0

    ! All these versions of 'gh' are treated the same
    write(*,*) gH, GH, Gh, gh

    END PROGRAM casesense

Another fine point to make here is that in the older styles of Fortran programming (f77) one didn’t have to declare variables at the start of the program. Instead there were specific rules one could follow. For example, if you created a variable name that started with an i (or any letter from i through n) then it was taken to be an integer. Do NOT do this. Writing code like this is an example of the worst of programming practices. Always declare your variables at the beginning of the code. This will trap many errors (some with very subtle effects) that may go unnoticed otherwise. In fact, writing the IMPLICIT NONE statement at the beginning ensures that you will have to, as IMPLICIT NONE tells the compiler to expect all variables to be declared.
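To see the kind of error that IMPLICIT NONE traps, consider the following minimal sketch (the program and variable names are made up for illustration). It deliberately leaves IMPLICIT NONE out and misspells a variable name. Under the old implicit typing rules the compiler quietly treats the misspelled name as a brand new REAL variable instead of reporting an error.

```fortran
PROGRAM typotrap
! NOTE: IMPLICIT NONE is deliberately left out to show the problem
REAL(KIND=8) :: total

total = 10.0

! Typo: 'totl' instead of 'total'. Without IMPLICIT NONE the compiler
! silently creates a new REAL variable called totl.
totl = total + 5.0

! total was never updated, so the value written here is still 10
write(*,*) "total = ", total

END PROGRAM typotrap
```

Adding IMPLICIT NONE directly after the PROGRAM line should turn the typo into a compile-time error, since totl is then an undeclared variable.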

4. Arithmetic
In the examples above we already showed a slight arithmetic example. Just to clarify our options in Fortran, the following are our arithmetic operators:

    Operation         Fortran Symbol
    Addition          +
    Subtraction       -
    Division          /
    Multiplication    *
    Exponentiation    **
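As a quick illustration of these operators, and of why integers should not be used for math like 1/3, here is a minimal sketch. The program and variable names are made up for this example, and it uses the REAL intrinsic function to convert an integer to a real.

```fortran
PROGRAM divide
IMPLICIT NONE
INTEGER(KIND=4) :: I, J
REAL(KIND=8)    :: X, Y

I = 1
J = 3
X = 1.0
Y = 3.0

! Integer division throws away the fractional part
write(*,*) "1/3 with integers:   ", I/J      ! gives 0
write(*,*) "7/2 with integers:   ", 7/2      ! gives 3

! Real division keeps the fractional part
write(*,*) "1.0/3.0 with reals:  ", X/Y

! Convert the integers to reals before dividing
write(*,*) "REAL(I)/REAL(J):     ", REAL(I)/REAL(J)

! Exponentiation uses **
write(*,*) "2**10 =              ", 2**10    ! gives 1024

END PROGRAM divide
```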

There are also many intrinsic functions that we can use:

    Function   Action                              Example
    INT        convert real number to integer      J = INT(X)
    REAL       convert integer to a real number    X = REAL(J)
    ABS        absolute value                      X = ABS(X)
    MOD        remainder of I divided by J         K = MOD(I,J)
    SQRT       square root                         X = SQRT(Y)
    EXP        exponentiation [e^x]                Y = EXP(X)
    LOG        natural logarithm [ln(y)]           X = LOG(Y)
    LOG10      common logarithm [log10(y)]         X = LOG10(Y)
    SIN        sine                                X = SIN(Y)
    COS        cosine                              X = COS(Y)
    TAN        tangent                             X = TAN(Y)
    ASIN       arcsine                             Y = ASIN(X)
    ACOS       arccosine                           Y = ACOS(X)
    ATAN       arctangent                          Y = ATAN(X)
    ATAN2      arctangent(a/b)                     Y = ATAN2(A,B)

Important Note: In Fortran (as in most programming languages) the arguments to the trigonometric functions are expected to be in radians. If your arguments are in degrees you need to first convert them to radians! For example, to take the sine of 45°:

    PROGRAM example
    IMPLICIT NONE
    REAL(KIND=8) :: argument, answer
    REAL(KIND=8) :: pi

    pi = 3.141592654                   !Define pi
    argument = 45.0                    !My argument is 45 deg

    argument = argument*(pi/180.0)     !Convert from deg to rad
    answer = SIN(argument)             !Take the sine
    write(*,*) answer                  !Print out the answer

    END PROGRAM example

One should at this point be excited. Remember how challenging it was to do math inside a shell script. By comparison this is downright easy in Fortran.
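As a variation on the degree-to-radian conversion above, the following sketch computes pi from the ATAN intrinsic instead of typing in its digits, and stores the conversion factor in its own variable. The names degrad and deg2rad are made up for this example, and the _8 suffix on the constants assumes a compiler (such as g95 or gfortran) where KIND=8 is the 8-byte real.

```fortran
PROGRAM degrad
IMPLICIT NONE
REAL(KIND=8) :: pi, deg2rad, angle

! ATAN(1) = pi/4, so this gives pi to the full precision of KIND=8
pi = 4.0_8*ATAN(1.0_8)
deg2rad = pi/180.0_8

angle = 45.0_8
write(*,*) "pi          = ", pi
write(*,*) "sin(45 deg) = ", SIN(angle*deg2rad)
write(*,*) "cos(60 deg) = ", COS(60.0_8*deg2rad)

END PROGRAM degrad
```

The last two values should come out close to 0.7071 and 0.5. If they do not, the most likely culprit is an argument that was left in degrees.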

5. Numeric Types
In our examples we have declared our variables with either KIND=4 or KIND=8. With g95 (and most other compilers) this means that we use numbers that are stored with either 4 or 8 bytes. Let’s take a look at the following example to see what this means:

    PROGRAM simpletest
    IMPLICIT NONE
    INTEGER(KIND=1) :: J

    J = 0
    DO
       write(*,*) J
       J = J + 1
       IF (J < 0) EXIT
    ENDDO

    END PROGRAM simpletest

This isn’t a very complicated program, although we haven’t looked at DO or IF statements yet. The program starts out with the variable J = 0, and starts a loop. It adds 1 to J at each step, so that J = 0, then J = 1, then J = 2, etc. Then it makes a test that says if J is less than 0 let’s exit the loop.

Run this program and write down the last value of J before it becomes < 0:_________________

That’s some odd behavior for sure. We keep adding a positive number to J and eventually J becomes negative. Now let’s change the KIND type in the above program:

    INTEGER(KIND=2) :: J

Now what was the number we got to before J became < 0:_______________________________

Definitely a bigger number. The important point is that we actually have to specify how much memory to use to store the numbers. In the above two examples we used 1 or 2 bytes per integer number. Recall that there are 8 bits per byte and that a bit is either a 1 or 0. So, a 1-byte or 8-bit number might look like:

    10010110

which would actually represent the number:

    1×2^7 + 0×2^6 + 0×2^5 + 1×2^4 + 0×2^3 + 1×2^2 + 1×2^1 + 0×2^0 = 150

The largest possible 8-bit number is:

    11111111 = 1×2^7 + 1×2^6 + 1×2^5 + 1×2^4 + 1×2^3 + 1×2^2 + 1×2^1 + 1×2^0 = 255

However, we haven’t considered sign (i.e., + or -). So we need to use one bit to store the number’s sign. Hence, the largest number we can store in an 8-bit integer is:

    1×2^6 + 1×2^5 + 1×2^4 + 1×2^3 + 1×2^2 + 1×2^1 + 1×2^0 = 127

The largest 2-byte (16 bit) number possible is 32767, and so on. Hence, if we are expecting to do some math with large numbers it is important to know the biggest number our variables can hold.

Real numbers are another story and we need to be concerned with our precision. Remember that a true real number like 1/3 is 0.33333… where the digits extend off into infinity. Well, a computer does not have infinite memory so we don’t have true real numbers in the computer’s representation, but just an approximation to them. In general, when doing mathematical operations we want to store our real numbers with 8 bytes (KIND=8) or else we will start to notice some real precision problems, especially when working with trigonometric functions. However, programs will run faster with smaller storage space per number (e.g., KIND=4) as we aren’t spending as much time writing numbers into memory. So, we don’t always want to go all out and use the largest KIND type available to us.

Fortran does provide an easy way to see what the largest available number is for the KIND type you are using by supplying the HUGE intrinsic function. For example, in the example program add the statement:

    write(*,*) HUGE(J)

What is the largest number available for 8-byte integers? How about 8-byte reals?
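The following minimal sketch pulls these ideas together for the smaller KIND types. It is only an illustration: the program and variable names are made up, the KIND values 1, 2, 4 and 8 assume a compiler such as g95 or gfortran where the KIND number equals the number of bytes, and PRECISION is a standard intrinsic function not otherwise covered in these notes. The 8-byte integer case is left for you to try.

```fortran
PROGRAM limits
IMPLICIT NONE
INTEGER(KIND=1) :: i1
INTEGER(KIND=2) :: i2
INTEGER(KIND=4) :: i4
REAL(KIND=4)    :: r4
REAL(KIND=8)    :: r8

! HUGE only looks at the type of its argument, not its value,
! so the variables do not need to be set first
write(*,*) "Largest 1-byte integer: ", HUGE(i1)    ! 127
write(*,*) "Largest 2-byte integer: ", HUGE(i2)    ! 32767
write(*,*) "Largest 4-byte integer: ", HUGE(i4)

! The same fraction stored with 4 and with 8 bytes
r4 = 1.0_4/3.0_4
r8 = 1.0_8/3.0_8
write(*,*) "1/3 as KIND=4: ", r4
write(*,*) "1/3 as KIND=8: ", r8

! PRECISION reports the number of reliable decimal digits
write(*,*) "Decimal digits, KIND=4: ", PRECISION(r4)
write(*,*) "Decimal digits, KIND=8: ", PRECISION(r8)

END PROGRAM limits
```

Comparing the two printed values of 1/3 shows directly how many more correct digits the 8-byte real carries.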

6. More Information
It is not possible in these tutorials to describe all aspects of Fortran, or even to show you all of the intrinsic functions that are available. Fortunately there are several good web sites available. A couple are listed below:

Numerical Recipes in Fortran: http://www.nrbook.com/institutional/
Fortran Language Reference: http://h21007.www2.hp.com/portal/download/files/unprot/fortran/docs/lrm/dflrm.htm
Michael Metcalf’s Fortran tutorials: http://wwwasdoc.web.cern.ch/wwwasdoc/f90.html

7. Homework
1) Write a program that allows you to define the latitude and longitude at a point on the Earth’s surface and will return the x-, y-, and z- Cartesian coordinates for that point. Assume the positive z-axis goes through the geographic spin axis (N pole). Make sure your coordinate system is right handed.

2) By definition of the dot product we can find the angle between two vectors from the formula:

    a ⋅ b = |a| |b| cos θ

Write a program that will let you define the x-, y-, and z- coordinates of two vectors in a Cartesian space and find the angle between the two vectors.

3) Combining the programs you wrote above, write a program that will allow you to input two latitude and longitude coordinates on the Earth’s surface and will return the angular distance between the two points in degrees. Assume the Earth’s shape to be a sphere with radius = 6371.0 km. Also have the program return the distance (in both km and miles) between the two points on the surface. This distance is the great circle arc distance.

L10 – Fortran Programming - Part 2

1. Control Structures – DO Loops
Just like any computing language, Fortran would be pretty useless without some control structures. Fortunately they aren’t much different from their C Shell counterparts, and are trivial to implement. As always, looping is essential; in Fortran we use a DO loop. There are 3 primary ways we can do this.

DO Loop Example 1: The most common way to do this looks as follows:

    DO Variable = Start, End, Increment
       your code…
    ENDDO

Here is an example code:

    PROGRAM doloopexample
    IMPLICIT NONE
    INTEGER(KIND=4) :: J      ! Looping Variable

    DO J=1,10
       write(*,*) "J = ", J
    ENDDO

    END PROGRAM doloopexample

All we have done is:

• Define a loop variable (in this case the variable J, which is an integer).
• Let J = 1 at the start.
• Execute all commands that are in between the DO and ENDDO statements.
• Increase J by 1 (i.e., J now equals 2) and again execute all statements between the DO and ENDDO statements, but this time with the value of J being 2.
• Keep doing this until J = 10.

Note that in Fortran (as opposed to the C Shell) we do not need to state explicitly that J = J+1. This is assumed by Fortran. But what if we wanted J to increase by more than 1 at a time? Then we just need to add an increment. E.g., change the DO line in the above example code to:

    DO J=1,101,10


Note that Fortran 77 allowed DO statements with REAL loop variables, but the feature was declared obsolescent in Fortran 90 and deleted from the Fortran 95 standard. Some compilers may still accept it as an extension, but portable code should stick to integers as loop variables.
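If you need a loop that steps through real values, one common workaround is to loop over an integer counter and compute the real value from it inside the loop. The following is a minimal sketch of that idea (the program name and the step size of 0.25 are arbitrary choices for illustration):

```fortran
PROGRAM realsteps
IMPLICIT NONE
INTEGER(KIND=4) :: J
REAL(KIND=4) :: X

! Step X from 1.0 to 10.0 in increments of 0.25 using an integer counter
DO J=0,36
   X = 1.0 + 0.25*REAL(J)
   write(*,*) "J = ", J, " X = ", X
ENDDO

END PROGRAM realsteps
```

Computing X from J on every pass, rather than repeatedly adding 0.25 to X, also avoids accumulating round-off error when the step size cannot be represented exactly.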

DO Loop Example 2: This example is akin to the while statement in the C Shell. The basic syntax looks like:

    DO WHILE ( some logical expression )
       your code…
    ENDDO

Here is a simple example:

    PROGRAM domore
    IMPLICIT NONE
    REAL(KIND=4) :: X

    X = 1.0
    DO WHILE (X <= 10.0)
       write(*,*) "X =", X
       X = X + 0.25
    ENDDO

    END PROGRAM domore

This example lets us keep executing the commands inside our DO loop until the value of X has increased above 10.0.

DO Loop Example 3: The final way that we can perform a DO loop is as follows:

    DO
       your code…
       IF ( some logical expression ) EXIT
    ENDDO

This final form is very similar to that in Example #2, except that the test is made wherever we place the IF statement rather than at the top of the loop. We could rewrite that example as:

    PROGRAM domore
    IMPLICIT NONE
    REAL(KIND=4) :: X

    X = 1.0
    DO
       write(*,*) "X =", X
       X = X + 0.25
       IF (X > 10.0) EXIT
    ENDDO

    END PROGRAM domore

2. Control Structures – IF THEN ELSE
If you have the C Shell scripting down, then these will look extremely familiar to you. The syntax for IF THEN statements in Fortran looks like:

    ! Basic form of the IF statement
    IF ( some logical expression ) THEN
       your code…
    ENDIF

    ! Or, with some other options…
    IF ( some logical expression ) THEN
       your code…
    ELSEIF ( some other logical expression ) THEN
       more code…
    ELSE
       even more code…
    ENDIF

The key here is that in Fortran we use the following operators in our logical expressions:

    Operator    Meaning
    ==          Equal to
    /=          Not equal to
    >=          Greater than or equal to
    <=          Less than or equal to
    <           Less than
    >           Greater than
    .AND.       Logical AND
    .OR.        Logical OR
    .NOT.       Logical NOT


As an example of how we use these, let’s just do a simple test of an angle measured in a Cartesian space and see which quadrant it lies in. Note that because x is given as the first argument to ATAN2, the angle is measured clockwise from the positive y-axis (like an azimuth measured from north):

    PROGRAM noname
    IMPLICIT NONE
    REAL(KIND=8) :: Theta, x, y

    !Define x,y coordinates
    x = 0.5
    y = 0.25

    !Determine angle in degrees (clockwise from the +y axis)
    Theta = (ATAN2(x,y))*(180.0/3.141592654)

    ! Let's check and see which quadrant theta lies in
    ! based on the angle Theta
    IF ( Theta >= 0.0 .AND. Theta < 90.0 ) THEN
       write(*,*) "Theta is in upper right quadrant…"
    ELSEIF ( Theta >= 90.0 .AND. Theta <= 180.0) THEN
       write(*,*) "Theta is in lower right quadrant…"
    ELSEIF (Theta < -90.0 .AND. Theta >= -180.0) THEN
       write(*,*) "Theta is in lower left quadrant…"
    ELSEIF (Theta < 0.0 .AND. Theta >= -90.0) THEN
       write(*,*) "Theta is in upper left quadrant…"
    ELSE
       write(*,*) "Error: Theta has a nonrealistic value"
    ENDIF

    END PROGRAM noname

As in the C Shell, Fortran also provides a CASE statement (SELECT CASE). The syntax can be looked up on any of the Fortran related web sites.
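For reference, here is a minimal sketch of the SELECT CASE construct. The integer variable quadrant and its numbering (1 to 4, counted clockwise from the upper right) are hypothetical and only serve to illustrate the syntax. Keep in mind that the expression being tested must be an integer, character, or logical; you cannot use a REAL such as Theta directly.

```fortran
PROGRAM caseexample
IMPLICIT NONE
INTEGER(KIND=4) :: quadrant

quadrant = 3     ! hypothetical value, e.g. set by an earlier calculation

SELECT CASE (quadrant)
CASE (1)
   write(*,*) "Upper right quadrant"
CASE (2)
   write(*,*) "Lower right quadrant"
CASE (3)
   write(*,*) "Lower left quadrant"
CASE (4)
   write(*,*) "Upper left quadrant"
CASE DEFAULT
   write(*,*) "Error: quadrant must be 1, 2, 3, or 4"
END SELECT

END PROGRAM caseexample
```

Only the first matching CASE block is executed, and CASE DEFAULT catches everything else, much like the ELSE branch of an IF THEN construct.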

3. Outputting Results – Writing Files
Writing variables to files in Fortran involves the following steps:

1) OPEN up a file to write with an associated UNIT number.
2) WRITE the data to that file.
3) CLOSE the file.


The following examples show the basic procedure, which varies slightly depending on whether you want to write out ASCII formatted files (the normal situation) or save time and space by writing out binary files.

Writing ASCII Files

In our previous examples of writing our output to the screen (standard out) we used a write(*,*) statement. You may have been asking what the *’s are for. These are essentially shortcuts for the UNIT and FORMAT specifiers, which we describe below. Our first step is to OPEN up files. The simplest way to do this is:

    OPEN(UNIT=1,FILE='myfile.dat')

Here I have associated UNIT number 1 with the file I just created called myfile.dat. I can open up more files at the same time if I want to, but then I need to use a different UNIT number. E.g., to open up another file:

    OPEN(UNIT=2,FILE='anotherfile.xyz')

Now, whenever we refer to either of the above two files we refer to them by their unit number. For example, if I want to write out the variable X into file unit 1, I can do:

    write(UNIT=1,FMT=*) X

I don’t actually have to type out UNIT and FMT every time and can just say:

    write(1,*) X

I could similarly write out the variable Y to file unit 2:

    write(2,*) Y

Once we are done writing to our files we need to CLOSE off the files:

    CLOSE(1)
    CLOSE(2)

A full code example showing how to write an ASCII file is shown here:

    PROGRAM asciiexample
    IMPLICIT NONE
    REAL(KIND=4) :: X
    INTEGER(KIND=4) :: J

    X = 10.0                  !Initialize a variable X

    !Open up a new file called test.data
    OPEN(UNIT=1,FILE='test.data')

    DO J=1,10                 !Loop 10 times
       write(1,*) J, X        !write out J and X to unit 1
       X = X/2.0
    ENDDO

    CLOSE(1)                  !We are done writing so close off Unit 1

    END PROGRAM asciiexample

In the above example we have the Unit=1, and Format=*. In Section 4 we will discuss the Format statement further.

Writing Binary Files

Writing binary files can be accomplished with the open statement as follows:

    OPEN(UNIT=1,FILE='filename',FORM='unformatted')

No format statement can be included when writing, so writing is done as follows:

    WRITE(1) "what ever you want to write", variables

Hence, writing binary files is easier than writing formatted files; however, special care must be taken when reading back in unformatted data. In particular, the exact kind type used to write out the file must be used when reading back in the data, or else you will read in pure garbage. The following code example shows how to write out data in binary format:

    PROGRAM bin
    IMPLICIT NONE
    INTEGER(KIND=4) :: I, nr

    OPEN(UNIT=1,FILE='bin_test',FORM='unformatted')

    nr = 10
    WRITE(1) nr
    DO I = 1,10
       WRITE(1) I
    ENDDO

    CLOSE(1)

    END PROGRAM bin
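To illustrate the reading side, the sketch below first writes the same bin_test file as above and then reads it back. The array name values and its fixed size of 10 are assumptions made to keep the example short. The important points are that the variables being read have the same KIND as those that were written, and that there is one READ for each WRITE, in the same order.

```fortran
PROGRAM binread
IMPLICIT NONE
INTEGER(KIND=4) :: I, J, nr
INTEGER(KIND=4), DIMENSION(10) :: values

! Write a small unformatted file (same layout as the example above)
OPEN(UNIT=1,FILE='bin_test',FORM='unformatted')
nr = 10
WRITE(1) nr
DO I = 1,nr
   WRITE(1) I
ENDDO
CLOSE(1)

! Read it back: same KIND, same order, one READ per WRITE
OPEN(UNIT=2,FILE='bin_test',FORM='unformatted',STATUS='old')
READ(2) nr
DO J = 1,nr
   READ(2) values(J)
ENDDO
CLOSE(2)

write(*,*) "Number of values: ", nr
write(*,*) values(1:nr)

END PROGRAM binread
```

Try changing the declaration of values to INTEGER(KIND=8) and see what happens when you read the file back in.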


Options with the OPEN statement

Our above examples are quite simplistic, but encompass 99% of what you will want to do with writing output files. However, sometimes you may want to do something a little more advanced, such as append to a file that already exists. There are some additional actions that may be done with the OPEN statement. For example:

To only write to a file if it doesn’t already exist:

    OPEN(UNIT=1,FILE='myfile.dat',STATUS='new',IOSTAT=ios)

To open a file that already exists and append to it:

    OPEN(UNIT=1,FILE='myfile.dat',STATUS='old',POSITION='append')

Notice we have used the IOSTAT (Input/Output status) variable ios. Here we need to declare ios at the beginning of our code:

    INTEGER(KIND=4) :: ios

This can help in error detection. Imagine the first situation, where we only want to open the file if it doesn’t already exist. Create a file called testing.dat and then try the next code example:

    PROGRAM testio
    IMPLICIT NONE
    INTEGER(KIND=4) :: ios

    OPEN(UNIT=1,FILE='testing.dat',STATUS='new',IOSTAT=ios)
    write(*,*) ios

    END PROGRAM testio

A common problem is that not all Fortran compilers return the same value for IOSTAT when the file already exists: IOSTAT is zero when the OPEN succeeds and nonzero when it fails, but the particular nonzero value depends on the compiler. If you know what value your compiler returns you can then do something useful, such as give the user a warning that the file isn’t being opened because it already exists. But beware, such code may not be portable to different machines. A better option is to use the INQUIRE statement, which sets a logical variable to true or false depending on whether your file already exists:

    LOGICAL :: file_exists
    INQUIRE(FILE='testing.dat',EXIST=file_exists)
    write(*,*) file_exists
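Putting the two ideas together, here is one possible sketch of a "safe" open. It uses INQUIRE to decide whether the file exists, and it only ever tests IOSTAT for zero versus nonzero, which avoids relying on compiler-specific error values. The program name and the messages are illustrative only.

```fortran
PROGRAM safeopen
IMPLICIT NONE
INTEGER(KIND=4) :: ios
LOGICAL :: file_exists

! Ask first whether the file is already there
INQUIRE(FILE='testing.dat',EXIST=file_exists)

IF (file_exists) THEN
   write(*,*) "Warning: testing.dat already exists, not opening it."
ELSE
   OPEN(UNIT=1,FILE='testing.dat',STATUS='new',IOSTAT=ios)
   ! Only test zero vs. nonzero; the nonzero value varies by compiler
   IF (ios /= 0) THEN
      write(*,*) "Error: could not create testing.dat, IOSTAT = ", ios
   ELSE
      write(1,*) "some output"
      CLOSE(1)
   ENDIF
ENDIF

END PROGRAM safeopen
```

Run it twice in a row: the first run creates the file, and the second run should print the warning instead.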


4. Outputting Results – the Format Statement
So far we have only specified the output format with an asterisk (*), which leaves the formatting entirely up to the compiler. Generally this is all you need to do. However, sometimes you may want the output to look fancy, or you need it to be in a very specific format to be read in by another computer program. Fortran has a simple method to format output:

1) Somewhere in the code put a FORMAT statement with a reference number. This might look like:

    100 FORMAT(I3)

where the 100 before the FORMAT statement is the reference number. We will discuss what goes inside the FORMAT statement later, but suffice it for now to say that it includes all of the directions on how the output should look.

2) Use the reference number in place of the asterisk (*) in your write statements. E.g.,

    write(UNIT=1, FMT=100)

Integer Format

The integer format (I) is the easiest to specify. If I say I2 then I want to use two columns to display my integer. Similarly, I4 would mean to use four columns. Try the following example:

    PROGRAM formatexample
    IMPLICIT NONE
    INTEGER(KIND=4) :: J

    J = 1
    100 FORMAT(I2)
    write(UNIT=*,FMT=100) J

    101 FORMAT(I2.2)
    write(UNIT=*,FMT=101) J

    J = 100
    write(UNIT=*,FMT=100) J

    102 FORMAT(I4)
    write(UNIT=*,FMT=102) J

    END PROGRAM formatexample


F – Format for Reals

With real numbers we need to concern ourselves with the decimal point. Basically we define how many total columns we want to use, and then how many of those columns should be digits after the decimal point. Imagine the example where we want to write out longitudes with 2 decimal places. A typical longitude may be a number like lon = -179.50. So, including the negative sign and the decimal point, we need 7 columns to display this number, and 2 columns after the decimal point. So, we specify our format as F7.2.

Displaying the number -179.50 with F7.2 Format:

    column   1   2   3   4   5   6   7
    value    -   1   7   9   .   5   0

In our code we just write:

    100 FORMAT(F7.2)

A – Alphanumeric Format

We haven’t talked about character strings yet, but the easiest way to specify that we are writing out characters is:

    100 FORMAT(A)

i.e., just use the letter A.

More complicated Output

Of course all of the above types may be combined into a single format statement. Try the following example, where we use an X to represent the number of spaces in between numbers:

    PROGRAM example
    IMPLICIT NONE
    INTEGER(KIND=4) :: J
    REAL(KIND=4) :: x, y

    J = 1
    x = 1000.5
    y = 200.1564

    write(*,100) "J, x, y = ", J, x, y
    100 FORMAT(A,5X,I1,2X,F7.2,2X,F7.2)

    END PROGRAM example
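It is worth seeing what happens when the field is too narrow for the number. The short sketch below (program name chosen for illustration) writes the longitude -179.50 once with F7.2, which fits exactly, and once with F6.2, which is one column too short.

```fortran
PROGRAM widthcheck
IMPLICIT NONE
REAL(KIND=4) :: lon

lon = -179.50
write(*,100) lon      ! 7 columns: fits exactly
write(*,101) lon      ! 6 columns: too narrow for -179.50

100 FORMAT(F7.2)
101 FORMAT(F6.2)

END PROGRAM widthcheck
```

When a number does not fit, Fortran does not truncate it; it fills the whole field with asterisks instead. If you ever see a row of ****** in your output files, the field width in your FORMAT statement is the first thing to check.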


5. Homework
1) One of the easiest ways to determine how long it takes a code to run is to embed the executable between two date commands in a C Shell script. For example, suppose I have a code named mycode.x and I want to determine how long it takes to run. I could make a C Shell script like:

    #!/bin/csh
    date
    ./mycode.x
    date

The only problem with this is that the output might look like:

    Tue Aug 3 02:59:27 MDT 2010
    Thu Aug 5 03:55:35 MDT 2010

In most circumstances this is easy enough to decipher, but sometimes the code may run for days and you cannot quickly determine how much time it took to run. If you are doing a lot of benchmarking you might just want to know quickly how many minutes the code took to run. Hence, the solution is to write a code: difdate.f90. This code will read in the two lines from the date command. [Note that reading formatted input is the same as writing formatted output, i.e., use read(*,FMT=100) instead of write(*,FMT=100)]. As output this code will report the difference in time between the two date commands. It will report the difference in two ways: (1) total number of decimal minutes, e.g., 50.23 m, and (2) total days, hours, minutes, and seconds, e.g., 2 d 13 h 4 m 13.2 s. The code should work for any combination of dates and times, even if there are years in between the outputs of the two date commands. Hint: The easiest solution may involve converting your year, month and day into an integer value based on Julian day.


L11 – Fortran Programming - Part 3

1. Dealing with Characters
The key data type we have thus far ignored is the character. In general Fortran is not all that nice about handling characters, but it does provide rudimentary tools for dealing with them. To start, we need to know how to declare characters. This is slightly different from our declarations of integers or real numbers. For example:

    CHARACTER(LEN=1) :: A
    CHARACTER(LEN=80) :: mystring

In the above example we have declared two variables: (1) A, which will only store 1 character [as defined by LEN=1], and (2) mystring, which will store 80 characters. With character strings we have to define how many characters the variable can store, but we don’t have to fill up all the characters. For example, I may want to use a variable month that stores the current month. I may want to give the variable a length of LEN=9 so it will hold the name of the month with the most characters (September). Otherwise, character variables are used much like regular numbers. The following example shows how to read in a character input from the user:

    PROGRAM readuser
    IMPLICIT NONE
    CHARACTER(LEN=80) :: window_function

    !write a query to standard output
    write(*,*) "What type of window function would you like to use?"
    write(*,*) "  [E.g., 'Boxcar or Blackman']"

    !Now read in the user's response
    read(*,*) window_function

    !Now write back out the user's response
    write(*,*) " "
    write(*,*) "You selected the: ", window_function, " function…"

    END PROGRAM readuser

There are also operators and intrinsic functions that work on characters. The most important operator is the concatenation operator, which is two forward slashes: //. That is, I may want to join two character strings together. For example, to combine two strings into one:

    CHARACTER(LEN=5) :: A, B
    CHARACTER(LEN=10) :: C

    A = 'one'
    B = 'test'
    C = A//B


Try writing out C in the above example. What happens? As you may have noticed, there are a number of spaces between the two strings, because A and B are each padded with blanks out to their full length of 5 characters. This may not be desirable, so there are the useful intrinsic functions TRIM and ADJUSTL. The ADJUSTL function shifts all the characters to be left justified, and the TRIM function trims off all the trailing blanks. Try the concatenation again with the following line:

    C = TRIM(ADJUSTL(A))//TRIM(ADJUSTL(B))

It is often necessary to also convert integers and real numbers to characters. However, there is no intrinsic function in Fortran to do this, so we need to use a minor Fortran trick. Let’s assume we want to loop through an integer (say from 1 to 10) and append the number of the loop to a character string. The following example shows the way:

    PROGRAM aa
    IMPLICIT NONE
    INTEGER(KIND=4) :: J
    CHARACTER(LEN=3) :: Jstr
    CHARACTER(LEN=10) :: output

    output = 'example_'

    DO J=1,10
       ! Convert integer J into a character Jstr with 3 columns
       write(Jstr,"(I3.3)") J
       write(*,*) TRIM(ADJUSTL(output))//Jstr
    ENDDO

    END PROGRAM aa

Note that we have to write our variable J into the variable Jstr.
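The same trick works for real numbers, and it also works in reverse: a read statement can pull a number back out of a character string. The sketch below shows both directions. The variable names (depth, depthstr, label, depth2) and the value 660.0 are made up for illustration.

```fortran
PROGRAM internalio
IMPLICIT NONE
REAL(KIND=4) :: depth, depth2
CHARACTER(LEN=7) :: depthstr
CHARACTER(LEN=20) :: label

depth = 660.0

! Real -> character: write the number into a string
write(depthstr,"(F7.2)") depth
label = 'depth_'//TRIM(ADJUSTL(depthstr))
write(*,*) TRIM(label)

! Character -> real: read the number back out of the string
read(depthstr,*) depth2
write(*,*) depth2

END PROGRAM internalio
```

This is handy for building file names from numerical values, or for pulling numbers out of a line of text that you have read in as a character string.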

2. Intro to Arrays
Arrays are the last data type we will talk about in this class. Note, there are also logical and complex data types that may be important for some of your work, but arrays are definitely the most important. Arrays allow us to store and manipulate many values with a single variable name. For example, we may have 4 measurements of seismic S-wave velocity and we want to store all of those measurements in the single variable called Vs. The following example shows us how we can hard wire these measurements into a single variable name:


    PROGRAM arrays
    IMPLICIT NONE
    REAL(KIND=4), DIMENSION(4) :: Vs

    Vs(1) = 7.26
    Vs(2) = 3.48
    Vs(3) = 2.50
    Vs(4) = 2.48

    write(*,*) Vs
    write(*,*) Vs(3)

    END PROGRAM arrays

The above example demonstrates the following points:

• When we declare an array variable we need to specify its size. In this case, we want to store 4 real numbers, so we declare the variable as being REAL, and that we will give it a DIMENSION of 4.
• The position of each element in the array is given by a number inside parentheses. This is called the array index. Here, the third element of the variable Vs [denoted by Vs(3)] is 2.50.

The above example shows one way to set the elements of our array. Here is an equivalent example where we use the array index notation to state that we will be setting the values of array elements 1 through 4:

    PROGRAM arrays
    IMPLICIT NONE
    REAL(KIND=4), DIMENSION(4) :: Vs

    Vs(1:4) = (/ 7.26, 3.48, 2.50, 2.48 /)

    write(*,*) Vs(3)

    END PROGRAM arrays

These examples have thus far been confined to vector data (i.e., one column of data); however, our arrays do not have to be confined in such a manner. We can specify an array to have several rows and columns of data. For example, if I have 8 rows and 4 columns of data, then I can specify this as:

    REAL(KIND=4), DIMENSION(4,8) :: myvariable

In general we prescribe dimensions as DIMENSION(number_columns, number_rows). Note that this is backwards from matrix notation (as is used in Matlab), which usually specifies array indices as (row,column). Strictly speaking, Fortran itself only knows a first and a second index; the convention used here is chosen so that the values belonging to one row of a data file sit next to each other in memory, because Fortran stores arrays with the first index varying fastest. Let’s look at an example in detail here. Let’s assume we have 4 rows and 2 columns of data. This could be depth in the Earth and seismic wave velocity, for example:


             column 1    column 2
    row 1       0.0        1.45
    row 2      20.0        6.80
    row 3      40.0        8.10
    row 4      60.0        8.08

Using our index notation we can declare the variable Vp for P-wave velocity in the Earth and store both our depth (first column) and velocity (2nd column).

    PROGRAM arrays
    IMPLICIT NONE
    REAL(KIND=4), DIMENSION(2,4) :: Vp
    INTEGER(KIND=4) :: J

    ! Put data values into column 1, rows 1:4
    Vp(1,1:4) = (/0.0, 20.0, 40.0, 60.0/)

    ! Put data values into column 2, rows 1:4
    Vp(2,1:4) = (/1.45, 6.80, 8.10, 8.08/)

    ! Write out data one row at a time
    DO J=1,4
       write(*,*) Vp(:,J)
    ENDDO

    END PROGRAM arrays
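A quick way to convince yourself how Fortran lays out a two-dimensional array in memory is to write out the whole array at once. The following sketch (program name chosen for illustration) uses the same Vp data as above.

```fortran
PROGRAM storageorder
IMPLICIT NONE
REAL(KIND=4), DIMENSION(2,4) :: Vp

Vp(1,1:4) = (/0.0, 20.0, 40.0, 60.0/)     ! depths
Vp(2,1:4) = (/1.45, 6.80, 8.10, 8.08/)    ! velocities

! Writing the whole array lists the elements in memory order:
! Vp(1,1), Vp(2,1), Vp(1,2), Vp(2,2), ...
write(*,*) Vp

! SHAPE returns the size of each dimension
write(*,*) SHAPE(Vp)

END PROGRAM storageorder
```

The first index varies fastest, so each depth is immediately followed by its velocity. This is why looping over the second index and writing Vp(:,J) reproduces the rows of the data table.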

3. Reading in array data
A key feature of Fortran 90 that wasn’t available in Fortran 77 or earlier versions is the addition of ALLOCATABLE arrays. With f77 you always needed to declare the size of the array at the onset of the program. However, with the new syntax you can wait until later. All you have to do is declare the number of dimensions. Then at some later point you can decide how many elements will go into the array. The following example shows how to read data from a file into a Fortran array variable. The data from the input file may have differing numbers of lines.

    PROGRAM aa
    IMPLICIT NONE
    REAL(KIND=4), DIMENSION(:), ALLOCATABLE :: mydata
    INTEGER(KIND=4) :: nr, J
    CHARACTER(LEN=100) :: infile

    ! Ask the user for some information about the data to be read in
    write(*,*) "Enter the name of the data file to read..."
    read(*,*) infile
    write(*,*) "Enter the number of lines in the input data file..."
    read(*,*) nr

    ! Allocate the memory required in variable mydata
    ALLOCATE(mydata(nr))

    ! Open up the file to read
    OPEN(UNIT=1,FILE=infile)

    ! Now read the file into variable mydata
    DO J=1,nr
       read(1,*) mydata(J)
    ENDDO

    ! We are done with the file so now close it out
    CLOSE(1)

    ! For fun, let's write back out to standard out
    DO J=1,nr
       write(*,*) mydata(J)
    ENDDO

    END PROGRAM aa

The important points to note are:

• We stated that we don’t know how many elements will be in our array with the DIMENSION(:), ALLOCATABLE statement.
• We read how many elements to expect from the user into the integer variable nr.
• Once we knew how many elements to expect we reserved space in our memory with the ALLOCATE(mydata(nr)) statement. This just reserved nr elements for our array mydata.
• Now that the memory is allocated we can read our data into the array, as is done with the read statement. Note, we loop through the file (after opening it) with a standard DO loop, letting the loop variable J act as the array index.

Note that sometimes we may want to use the same variable again, but with a different number of elements. In this case we can use the DEALLOCATE statement to clear the memory. Then we can ALLOCATE the variable again with a new size.

IMPORTANT: Often when we run one of our programs we will get an error. A common error to see is the much dreaded Segmentation Fault. If you see this error it is generally because you tried to write a value to an array index that doesn’t exist. For example, I may have an array mydata that has DIMENSION(100). If I try something like mydata(101) = something, then I will get an error because I am asking the program to place a value into a memory location that doesn’t belong to the array. OK, you have been warned!

As a final note: Check out the Fortran tricks notes that I have put together. In these notes I show how to read data into an array where you never need to specify how many lines of data exist.
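One common way to avoid asking the user for the number of lines is to read the file twice: once to count the lines, and once to load the data. The sketch below illustrates this approach; it is not necessarily the method used in the Fortran tricks notes. The file name count_test.dat and the variable dummy are made up, and the program writes its own small test file first so that it can be run as is.

```fortran
PROGRAM countlines
IMPLICIT NONE
REAL(KIND=4), DIMENSION(:), ALLOCATABLE :: mydata
REAL(KIND=4) :: dummy
INTEGER(KIND=4) :: nr, J, ios

! Make a small test file so the example is self-contained
OPEN(UNIT=1,FILE='count_test.dat')
DO J=1,5
   write(1,*) 1.5*REAL(J)
ENDDO
CLOSE(1)

! First pass: count the lines by reading until IOSTAT is nonzero
nr = 0
OPEN(UNIT=1,FILE='count_test.dat',STATUS='old')
DO
   read(1,*,IOSTAT=ios) dummy
   IF (ios /= 0) EXIT
   nr = nr + 1
ENDDO

! Second pass: allocate, go back to the top of the file, and read
ALLOCATE(mydata(nr))
REWIND(1)
DO J=1,nr
   read(1,*) mydata(J)
ENDDO
CLOSE(1)

write(*,*) "Number of lines: ", nr
write(*,*) mydata

DEALLOCATE(mydata)
END PROGRAM countlines
```

The price is reading the file twice, which is rarely a problem for ASCII files of modest size.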


4. Whole Array Operations
Now we can start to look at the beauty of storing arrays in Fortran 90: whole array operations! Doing operations with arrays is similar to doing operations with regular variables. For example, we can add each element of two arrays just by:

    C = A + B

Or, we can take the square root of each element in an array:

    C = SQRT(B)

Or element by element multiplication:

    C = A*B

Note that the above operation multiplies each element of the array by the corresponding element in the other array. E.g.,

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \times \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 \times 1 & 2 \times 2 \\ 3 \times 3 & 4 \times 4 \end{pmatrix}$$
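As a minimal sketch of these element by element operations, the program below applies them to two short vectors. The array names follow the text; the values are arbitrary and chosen so the results are easy to check by hand.

```fortran
PROGRAM wholearray
IMPLICIT NONE
REAL(KIND=4), DIMENSION(4) :: A, B, C

A = (/ 1.0, 2.0, 3.0, 4.0 /)
B = (/ 1.0, 4.0, 9.0, 16.0 /)

C = A + B          ! 2, 6, 12, 20
write(*,*) C

C = SQRT(B)        ! 1, 2, 3, 4
write(*,*) C

C = A*B            ! 1, 8, 27, 64
write(*,*) C

C = 2.0*A + 1.0    ! a scalar is applied to every element: 3, 5, 7, 9
write(*,*) C

END PROGRAM wholearray
```

The arrays on both sides of an operation must have the same shape; the only exception is a scalar, which is applied to every element.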

If you want to do matrix multiplication instead you can use the MATMUL intrinsic function. E.g.,

    C = MATMUL(A,B)

If your arrays are vectors (one-dimensional arrays), then there is also an intrinsic function for the dot product:

    C = DOT_PRODUCT(Vector_1, Vector_2)

These kinds of features represent a huge improvement over f77, in which one would have to write DO loops over all of the elements in the arrays, performing the operations element by element. The following example shows the use of a couple of the most useful array intrinsics:


    PROGRAM minmax
    IMPLICIT NONE
    REAL(KIND=4), DIMENSION(2,8) :: mydata
    REAL(KIND=4), DIMENSION(2) :: maxcolumns
    REAL(KIND=4) :: mydata_max, mydata_min

    ! Load in some data values to play with
    ! --------------------------------------------------------------!
    ! Put data values into column 1, rows 1:8
    mydata(1,1:8) = (/0.0, 20.0, 40.0, 60.0, 40.0, 20.0, 0.0, 15.0/)

    ! Put data values into column 2, rows 1:8
    mydata(2,1:8) = (/1.45, 6.80, 8.10, 8.08, 7.20, 7.00, 9.34, 2.65/)
    ! --------------------------------------------------------------!

    ! Play around with the min/maxval functions
    ! --------------------------------------------------------------!
    ! Find the Maximum value in mydata
    mydata_max = MAXVAL(mydata)
    write(*,*) "Maximum Value anywhere in array: ", mydata_max

    ! Find the Minimum value in mydata
    mydata_min = MINVAL(mydata)
    write(*,*) "Minimum Value anywhere in array: ", mydata_min

    ! Using a MASK, find the largest value in mydata less than 10.0
    mydata_max = MAXVAL(mydata,MASK=mydata < 10.0)
    write(*,*) "Maximum Value < 10.0: ", mydata_max

    ! Using DIM, find the maximum values in each column
    maxcolumns = MAXVAL(mydata,DIM=2)
    write(*,*) "Maximum Value of column 1: ", maxcolumns(1)
    write(*,*) "Maximum Value of column 2: ", maxcolumns(2)
    ! --------------------------------------------------------------!

    END PROGRAM minmax

The above example demonstrates two really useful features of many of the whole array intrinsics: (1) the ability to limit the operation based on logical expressions using the MASK feature, and (2) the ability to perform the same operation separately on the columns or rows of the array using the DIM feature. Both of these features would have previously required the use of looping but can now be done in a single line of code. Very sexy indeed.

Sometimes what you want to know is not what the maximum or minimum values are, but where they occur in an array. For example, suppose we have a seismogram and we are not concerned with what the maximum amplitude is, but at what time the maximum amplitude occurs. Consider the following example:


    PROGRAM minmax2
    IMPLICIT NONE
    REAL(KIND=4), DIMENSION(2,8) :: seismogram
    REAL(KIND=4) :: time_max, amp_max
    INTEGER(KIND=4), DIMENSION(2) :: time_max_index

    ! Make a fake very, very short seismogram
    ! --------------------------------------------------------------!
    ! Put timing information in column #1
    seismogram(1,1:8) = (/0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0/)

    ! Put amplitude information in column #2
    seismogram(2,1:8) = (/1.45,1.53,1.62,1.73,1.41,1.38,1.33,1.20/)
    ! --------------------------------------------------------------!

    ! Now let's play with the MAXLOC intrinsic
    ! --------------------------------------------------------------!
    ! Find the array indices of the max in each column
    time_max_index = MAXLOC(seismogram,DIM=2)

    time_max = seismogram(1,time_max_index(2))
    amp_max = seismogram(2,time_max_index(2))

    write(*,*) "Time to maximum value: ", time_max, " (sec)"
    write(*,*) "Amplitude of maximum value: ", amp_max
    ! --------------------------------------------------------------!

    ! Now let's do it again using the MASK and say we want to find
    ! the next highest amplitude
    ! --------------------------------------------------------------!
    time_max_index = MAXLOC(seismogram,DIM=2,MASK=seismogram < amp_max)

    time_max = seismogram(1,time_max_index(2))
    amp_max = seismogram(2,time_max_index(2))

    write(*,*) "Time to next largest value: ", time_max, " (sec)"
    write(*,*) "Amplitude of next largest value: ", amp_max
    ! --------------------------------------------------------------!

    END PROGRAM minmax2

The final array feature I wish to discuss is the incredibly versatile WHERE control structure. This is similar to IF THEN statements, only it applies to entire arrays. The basic syntax looks like:

    WHERE ( some logical expression )
       your code…
    ELSEWHERE ( another logical expression )
       your code…
    ENDWHERE

To make its operation clear let’s look at a simple example:

    PROGRAM wherestatement
    IMPLICIT NONE
    REAL(KIND=4), DIMENSION(2,8) :: seismogram
    INTEGER(KIND=4) :: J

    ! Make a fake very, very short seismogram
    ! --------------------------------------------------------------!
    ! Put timing information in column #1
    seismogram(1,1:8) = (/0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0/)

    ! Put amplitude information in column #2
    seismogram(2,1:8) = (/1.45,1.53,1.62,1.73,1.41,1.38,1.33,1.20/)
    ! --------------------------------------------------------------!

    ! write out the initial seismogram (time, amplitude)
    ! --------------------------------------------------------------!
    write(*,*) " "
    write(*,*) "Initial seismogram"
    write(*,*) "--------------------------------------"
    DO J=1,8
       write(*,*) seismogram(1,J), seismogram(2,J)
    ENDDO
    write(*,*) "--------------------------------------"
    write(*,*) " "

    ! Manipulate the amplitude values based on the
    ! time values w/o a Loop!
    ! --------------------------------------------------------------!
    WHERE (seismogram(1,:) >= 6.0)
       seismogram(2,:) = 0.0
    ELSEWHERE (seismogram(1,:) >= 4.0)
       seismogram(2,:) = 1.0
    ELSEWHERE
       seismogram(2,:) = 2.0
    ENDWHERE

    ! write out the final seismogram (time, amplitude)
    ! --------------------------------------------------------------!
    write(*,*) " "
    write(*,*) "Final seismogram"
    write(*,*) "--------------------------------------"
    DO J=1,8
       write(*,*) seismogram(1,J), seismogram(2,J)
    ENDDO
    write(*,*) "--------------------------------------"
    write(*,*) " "

    END PROGRAM wherestatement


Note that in the above example we can base our logical statements on subarray sections! In this case we based our logic just on the values in the first column of data.

5. Optimization Flags
Thus far we haven’t talked too much about compiling our codes. But at this point it may be prudent to point out that there are tons of flags that we can use at compile time. The compiler man page describes these, and you should look around to see what is available. As a primary note, all compilers have a standard set of optimization flags. If I want to compile my code and make it run a little faster I might type:

    >> g95 mycode.f90 -O2 -o mycode.x

where the -O2 flag says to optimize this code to run a bit faster. You might get even better performance from the -O4 flag:

    >> g95 mycode.f90 -O4 -o mycode.x

But you need to be a little bit careful to make sure the results still make sense. Sometimes the compiler will perform tricks that make the code run faster, but at the expense of the numerical accuracy of the calculations. In addition to the -O flags, most compilers have special flags that let you optimize the code for the specific CPU your computer is using.

6. Homework
1) Write a program that will read in a 1-column data set of arbitrary size from a file, and will output the average, standard deviation, and variance of the data set. Recall the definition of standard deviation (σ) is:

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( x_i - \bar{x} \right)^2}$$

where $N$ is the total number of samples in the data set, $x_i$ is the ith data sample, and $\bar{x}$ is the average of the data set. Recall that the variance is just $\sigma^2$.


L12 – Fortran Programming - Part 4
In all of our Fortran lectures so far we have told our codes what we want them to do with our data in a linear fashion. If we are performing a lot of operations on our data our code can get really messy. There are also occasions where we want to perform the same operations to our data many times. All of this begs for an elegant solution. Never fear, Fortran provides a nice class of solutions: subroutines, functions, and modules. That will be the subject of this lecture!

1. Functions
We have used all kinds of intrinsic functions in Fortran such as ABS, NINT, SIN, etc. But there may be other functions not supplied in Fortran that you want to have access to. For example, we are always converting our latitudes and longitudes from degrees into radians. A function like DTR that converts from degrees to radians would be nice. Let’s see how we could create such a function.

    PROGRAM funex
    IMPLICIT NONE
    REAL(KIND=8) :: lat, DTR

    lat = 45.0
    lat = DTR(lat)
    write(*,*) lat

    END PROGRAM funex

    !---------------------------------------------------------------!
    ! DTR - convert degrees to radians
    ! Input 'argument' is the angle in degrees
    !---------------------------------------------------------------!
    FUNCTION DTR(argument)
    IMPLICIT NONE
    REAL(KIND=8) :: DTR
    ! pi/180; the _8 suffix keeps the constant in 8-byte precision
    REAL(KIND=8), PARAMETER :: conv=.01745329251994329444_8
    REAL(KIND=8), INTENT(IN) :: argument

    DTR = argument*conv

    END FUNCTION DTR
    !---------------------------------------------------------------!

This is as simplistic an example as one can define; however, it illustrates all of the main points one needs to know to create more elaborate functions:

• The function is named DTR and DTR has a type. That is, in the function we set DTR as a REAL number. This means our function will output a REAL number.
• The function can use input parameters. In this case, we only use the input parameter we named argument. We tell the function that we want to read this variable into the function with the INTENT(IN) attribute.
• Our function is defined entirely within the FUNCTION – END FUNCTION block. Note that we don’t include this definition within the block of the main program. That is, our program is written within the block PROGRAM – END PROGRAM, and our function is defined elsewhere.
• Our function uses variables (e.g., the conv variable) that are not used by our main program. Hence, we need to define all variables we use inside our function. The main program does not see these variables, so there is no problem if functions have some variable names that are also used in the main program.
• Even though we define DTR to be a REAL number in our function, we also still have to define DTR as a REAL number in the main program.
• Note that in the function we call the latitude we are reading in by the variable argument. In the main program we call it lat. This doesn’t matter. What matters is that they have the same type!

Functions are useful for cleaning up our codes, but more advanced operations can be defined using subroutines.

2. Subroutines
Most programmers tend to have the majority of their coding time invested in writing good subroutines. Subroutines are generally small bits of code designed to do one specific task, but to do that specific task well. The main program tends to be a collection of CALLS to these subroutines. To see how we write a subroutine I show an example below that I wrote for calculating vector cross products.
!Cross Product for determining euler rotation pole
!Angle between two points on a sphere and
!Distance between two points on a sphere
!***************************************************
SUBROUTINE cross(v1, v2, v3, w1, w2, w3, u1, u2, u3, alpha, s)
! For vectors v and w, where
!   v = (v1, v2, v3)
!   w = (w1, w2, w3)
! the cross product v x w = x = (x1, x2, x3)
! output is u = (u1, u2, u3); where u is the unit
!   vector in the direction of the vector x
! the angle between v and w is given as alpha;
!   where alpha = arcsin(|v x w|/|v||w|)
! s is the distance along the arc between the two
!   endpoints of vectors v and w, where s=r*theta
! alpha and s are only valid for angles <= 90 deg
! v and w must not be parallel (|v x w| = 0), otherwise u is undefined
IMPLICIT NONE
INTEGER, PARAMETER :: k=kind(0d0)
REAL(k), PARAMETER :: pi=3.141592653589793_k
REAL(k), INTENT(IN) :: v1, v2, v3, w1, w2, w3
REAL(k), INTENT(OUT) :: u1, u2, u3, alpha, s
REAL(k) :: magV, magW, magX, arg, rtd, theta
REAL(k) :: x1, x2, x3

rtd = 180.0_k/pi          !radians to degrees

x1 = v2*w3 - v3*w2
x2 = -v1*w3 + v3*w1
x3 = v1*w2 - v2*w1

magV = SQRT(v1**2 + v2**2 + v3**2)
magW = SQRT(w1**2 + w2**2 + w3**2)
magX = SQRT(x1**2 + x2**2 + x3**2)

u1 = x1/magX; u2 = x2/magX; u3 = x3/magX

arg = magX/(magV*magW)
theta = ASIN(arg)
s = magV*theta
alpha = (ASIN(arg))*rtd

END SUBROUTINE cross

The subroutine looks very similar to our function. The only additional points are:

• We use SUBROUTINE and END SUBROUTINE to define it.
• We use a number of input arguments, defined with INTENT(IN), and also output a number of different arguments, defined with INTENT(OUT).
• This is not shown in the above example but we can use the same variables in a subroutine to bring variables in and out of the subroutine with the INTENT(INOUT) attribute. We can also pass arrays into and out of subroutines (not shown in the above example).
• As with our functions the SUBROUTINE ... END SUBROUTINE definition does not occur within the main program.
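Since INTENT(INOUT) and array arguments are mentioned but not shown, here is a minimal sketch of both. The subroutine normalize and the program inout_demo are hypothetical names made up for this illustration; they are not part of the lecture codes.

```fortran
PROGRAM inout_demo
IMPLICIT NONE
INTEGER, PARAMETER :: k=kind(0d0)
REAL(k) :: v(3)

v = (/ 3.0_k, 0.0_k, 4.0_k /)
CALL normalize(v, 3)       !v is passed in, modified, and passed back out
write(*,*) v

END PROGRAM inout_demo

!---------------------------------------------------!
! normalize - scale a vector to unit length in place
! (hypothetical helper used only for illustration)
!---------------------------------------------------!
SUBROUTINE normalize(vec, n)
IMPLICIT NONE
INTEGER, PARAMETER :: k=kind(0d0)
INTEGER, INTENT(IN) :: n
REAL(k), INTENT(INOUT) :: vec(n)   !comes in unnormalized, goes out normalized
REAL(k) :: mag

mag = SQRT(SUM(vec**2))
IF (mag > 0.0_k) vec = vec/mag     !leave a zero vector untouched

END SUBROUTINE normalize
```

The array is passed together with its length n, and the same variable carries the input and the result. For the vector (3, 0, 4) the result should be the unit vector (0.6, 0, 0.8), up to rounding.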

Using subroutines in a program is slightly different than with functions. The following example shows how to use the cross product subroutine in a program.
PROGRAM calc_cross
IMPLICIT NONE
INTEGER, PARAMETER :: k=kind(0d0)
REAL(k) :: x1, y1, z1      !vector 1
REAL(k) :: x2, y2, z2      !vector 2
REAL(k) :: x3, y3, z3      !vector 3
REAL(k) :: alpha, s        !angle and distance

!Define vector 1
x1 = 0.0; y1 = 5.0; z1 = 1.0

!Define vector 2 (must not be parallel to vector 1, or the cross product is zero)
x2 = 0.0; y2 = 1.0; z2 = 5.0

!Take the cross product between the two vectors
CALL cross(x1, y1, z1, x2, y2, z2, x3, y3, z3, alpha, s)

!write out the coordinates of the output vector
write(*,*) "x3 = ", x3
write(*,*) "y3 = ", y3
write(*,*) "z3 = ", z3

END PROGRAM calc_cross

The main point is: • We CALL the subroutine to use it.

Now, where do we keep our subroutines? Here we have three basic options:

1) We add the subroutine to the same file as our main program, after the END PROGRAM statement, or
2) We create an entirely new file that just contains the subroutine, or
3) We add it to a MODULE file as described in the next section.

If we choose option number two we have to compile our code in a special way. As it turns out, this is exactly the same way we compile modules and is thus described in the next section.

3. Modules
Modules provide a convenient way to package subroutines and functions (and other more exotic features we haven’t talked about) into a single program file that can then be used by other programs. For example we could create a module file called mod_constants.f90 that looks like:

MODULE constants
IMPLICIT NONE
REAL(KIND=4), PARAMETER :: Avogadro = 6.022137E23
REAL(KIND=4), PARAMETER :: G = 6.6726E-11
REAL(KIND=4), PARAMETER :: c = 2.99792458E8
END MODULE constants

If we populated this file with all kinds of constants we use on a regular basis, then it might be useful to have this file around for use in all of our programs. In fact, this is exactly what we can do. We can now create a program file that uses these constants. E.g., let’s now create a file called program_main.f90:

PROGRAM main
USE constants
IMPLICIT NONE
REAL(KIND=4) :: numbr

numbr = Avogadro
write(*,*) numbr

END PROGRAM main

Note that in the above program all we had to do was state USE constants (where constants is the name we gave to the module) to have access to these variables. All we really have to concern ourselves with is compiling these codes. First of all we need to compile the module file:

>> g95 -c mod_constants.f90

We use the -c flag, which doesn’t produce an executable file but an object file (mod_constants.o) and a module file (constants.mod) that the main program can use. Now we need to compile the main program:

>> g95 program_main.f90 -o main.x ./mod_constants.o

Here the last argument links the object file to the main program.

In the last example we just added a bunch of variables for use in other programs. But we can also add subroutines and functions to a module. The basic syntax looks like:

MODULE module_name
CONTAINS

   SUBROUTINE mysub1(arguments)
   IMPLICIT NONE
   ...
   END SUBROUTINE mysub1

   SUBROUTINE mysub2(arguments)
   IMPLICIT NONE
   ...
   END SUBROUTINE mysub2

   ...

END MODULE module_name

All we did here is add the CONTAINS statement, which allows us to pack the module file full of subroutines. We USE this type of module in the same way as in the above example.
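To make the CONTAINS syntax concrete, here is a small sketch that packs the degrees-to-radians function from Section 1 into a module, together with its inverse. The module name angles, the function rtd, and the program use_angles are names invented for this illustration. For brevity the module and the program are shown in one file; they could equally be split into two files and compiled as described above.

```fortran
MODULE angles
IMPLICIT NONE
INTEGER, PARAMETER :: k=kind(0d0)
REAL(k), PARAMETER :: pi=3.141592653589793_k

CONTAINS

   ! dtr - convert degrees to radians
   FUNCTION dtr(deg)
   REAL(k), INTENT(IN) :: deg
   REAL(k) :: dtr
   dtr = deg*pi/180.0_k
   END FUNCTION dtr

   ! rtd - convert radians to degrees
   FUNCTION rtd(rad)
   REAL(k), INTENT(IN) :: rad
   REAL(k) :: rtd
   rtd = rad*180.0_k/pi
   END FUNCTION rtd

END MODULE angles

PROGRAM use_angles
USE angles
IMPLICIT NONE
REAL(k) :: lat          !k comes from the module

lat = 45.0_k
write(*,*) 'radians: ', dtr(lat)
write(*,*) 'round trip (degrees): ', rtd(dtr(lat))

END PROGRAM use_angles
```

One practical difference from the stand-alone function in Section 1: because dtr now lives in a module, the main program does not declare its type. The USE statement supplies that information, and the compiler can check that the arguments match.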


4. Good Programming Practice – Large Scale Problems
In the above section we saw that we could pack our subroutines and functions into modules. The key to good programming is a modular approach. That is, we typically write small subroutines or functions aimed at solving specific problems. If we have a good subroutine that solves a problem we encounter over and over, then we should use that well-tested subroutine over and over. The best approach is then to pack these subroutines into module files that we can easily use in our codes. In this manner, we can build very large and sophisticated programs from very small pieces. A few keys that I use are:

1) Pack subroutines and functions that are aimed at solving similar types of problems into their own module. For example, I have a series of subroutines that involve manipulating latitude and longitude points, finding great circle paths between them, distances, angular distances, etc. So, I have packed all of these subroutines into a module called gcarcs stored in the file mod_gcarcs.f90. Hence, any time I am writing a program that involves math on a sphere I can just link my program to the gcarcs module. Another example is that I have a module I call fdoperators stored in the file mod_fdoperators.f90. This module contains subroutines for calculating finite difference (FD) derivatives with a variety of different options. Hence, any time I want to calculate derivatives I just link my program to the fdoperators module and use the subroutines I have already built.

2) For really large programs we may have hundreds of variables. A really good practice is to define a module called global in the file mod_global.f90. This just defines all of the global variables our program may use. This also allows us to use just a few of those variables in specific subroutines with a statement like

USE global, ONLY: variable_1, variable_2

3) Create a single file that just drives the flow of the program. The name of this file is arbitrary, but something like program_main.f90 is a good choice so that we know it is the main driving routine. This program then basically consists of a well ordered list of what happens in the code. For example,

PROGRAM myprog_main
USE global
USE check
USE funapps
USE output
USE input
IMPLICIT NONE

!Initialize program by reading the Input
CALL readinput

!Check that input info makes sense
!(a subroutine cannot share the name of the module that holds it)
CALL checkinput

!Do something useful like calculating some derivatives
DO it = 1,last
   CALL nice_application(inputdata)
   CALL useful_operation(inputdata)
ENDDO

!write output
CALL writeoutput

END PROGRAM myprog_main

4) Create input files that define the major program options. Make the input files readable and include at least a rudimentary explanation of what the variables are. Using input files in this manner allows you to run your codes for various options without having to recompile the codes. It also allows you to keep a record of what options were used. The following shows an example of an input file I use for my code to generate models of small scale random seismic heterogeneity.


===============================================================================
Model -------------------------------------------------------------------------
rseed      =1                  ! Random Seed Number (positive integer)
dimension  =2                  ! Model Dimension (1,2,3)
nx         =1000               ! Number of X grid points
ny         =500                ! Number of Y grid points
nz         =1                  ! Number of Z grid points
dx         =200.0              ! X-grid increment (m)
dy         =200.0              ! Y-grid increment (m)
dz         =200.0              ! Z-grid increment (m)
Autocorrelation Functions -----------------------------------------------------
acf        =2                  ! (1-Gauss, 2-Exponential, 3-von Karman)
alx        =10000.0            ! X- autocorrelation wavelength (m)
aly        =2000.0             ! Y- autocorrelation wavelength (m)
alz        =2000.0             ! Z- autocorrelation wavelength (m)
Order      =0.0                ! Bessel function order (acf=3)
Perturbation Parameters -------------------------------------------------------
dVs        =2.0                ! Vs Standard Deviation (%)
dVp        =2.0                ! Vp Standard Deviation (%)
drho       =0.0                ! rho Standard Deviation (%)
alpha      =3.0                ! Multiples of STD to keep
Wavenumber Filter -------------------------------------------------------------
kmin       =0.0                ! Currently Unsupported
kmax       =10.0               ! Currently Unsupported
Input -------------------------------------------------------------------------
modeltype  =1                  ! (0-Homogeneous, 1-crfl)
modelfile  =./models/crfl.dat
Vs0        =3000.0             ! Vs for homogeneous models (m/sec)
Vp0        =6000.0             ! Vp for homogeneous models (m/sec)
rho0       =2000.0             ! rho for homogeneous models (kg/m^3)
Output ------------------------------------------------------------------------
prefix     =./output/test1     !
status     =1                  ! (0,1 = off,on) show status messages
oformat    =2                  ! 1-Ascii; 2-E3D
otype      =1                  ! 0- output %dV pert.; 1- output actual
ofiles     =0                  ! 0-just final model files; 1-all files
===============================================================================
rseed:     Positive integer. Sets the random seed of the random number
           generator. Works such that the same distribution will always be
           returned for the same 'rseed' number.
nx,ny,nz:  These do not have to be powers of 2. MUST be even numbers.
Order:     Order of Bessel Functions for von Karman type media. Must be set
           between 0.0 and 0.5
otype:     if otype is set to 0 then the %dV perturbation is written and not
           the actual perturbed velocity or density values from the input
           model.

5) Create a makefile that includes directions for properly compiling your codes. The next section will talk more about this.

6) If you think your code will be distributed widely, go the extra step and create a man page. The class website contains a handout on how to prepare man pages.
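Returning to item 4, one way to read an input file of the "name =value ! comment" form is sketched below. This is only an illustration and not the reader used for the code above: the program name read_options, the file name demo_input.txt, and the variables key and val are assumptions, and only two of the options are handled. The program first writes a tiny stand-in input file so that it can be run on its own.

```fortran
PROGRAM read_options
IMPLICIT NONE
CHARACTER(LEN=132) :: line, val
CHARACTER(LEN=32)  :: key
INTEGER :: ios, ieq, ibang
INTEGER :: nx
REAL :: dx

nx = 0; dx = 0.0

! write a tiny stand-in input file so the example is self-contained
OPEN(UNIT=10, FILE='demo_input.txt', STATUS='REPLACE')
write(10,'(A)') '========================================='
write(10,'(A)') 'Model -----------------------------------'
write(10,'(A)') 'nx        =1000     ! Number of X grid points'
write(10,'(A)') 'dx        =200.0    ! X-grid increment (m)'
CLOSE(10)

OPEN(UNIT=10, FILE='demo_input.txt', STATUS='OLD')
DO
   READ(10,'(A)',IOSTAT=ios) line
   IF (ios /= 0) EXIT                  ! end of file
   ieq = INDEX(line, '=')
   IF (ieq == 0) CYCLE                 ! section headers have no '='
   key = ADJUSTL(line(1:ieq-1))        ! option name, left of the '='
   ibang = INDEX(line, '!')
   IF (ibang > ieq) THEN
      val = line(ieq+1:ibang-1)        ! strip the trailing comment
   ELSE
      val = line(ieq+1:)
   ENDIF
   SELECT CASE (TRIM(key))             ! unknown names are simply skipped
   CASE ('nx')
      READ(val,*) nx
   CASE ('dx')
      READ(val,*) dx
   END SELECT
ENDDO
CLOSE(10)

write(*,*) 'nx = ', nx, ' dx = ', dx

END PROGRAM read_options
```

Matching on the option name rather than on line position means the options can appear in any order and new ones can be added without breaking older input files.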


5. Makefiles
This section is not aimed at telling you how to write makefiles. There is a handout on the web page that describes this. However, many codes use makefiles to provide instructions on how to compile them. At the very least you should know how to use these makefiles to compile codes you get from other sources. Below is an example of what a makefile looks like (note that the indented command lines must begin with a tab character):

#Makefile for program sac2xy
#--------------------------------------------------------------------
F90=g95
FFLAGS=-O4
RM=/bin/rm -f
BINDIR=../../bin

all : main

#Compile modules
mod_sac_io.o : mod_sac_io.f90
	$(F90) $(FFLAGS) -c mod_sac_io.f90

#Compile Source-code and link modules
sac2xy : sac2xy.f90
	$(F90) $(FFLAGS) sac2xy.f90 -o sac2xy ./mod_sac_io.o

#Copy executable to appropriate directories
main : mod_sac_io.o sac2xy
	cp sac2xy $(BINDIR)

clean :
	$(RM) sac2xy mod_sac_io.o sac_i_o.mod

I won’t talk about all the details. But here are the important points:

• I have a variable F90 which sets which Fortran compiler to use.
• Another variable is FFLAGS which sets which compile flags to use during compilation.
• An important variable here is BINDIR which is where the final executable will be copied to.
• The primary actions are: (1) all of the Fortran modules are compiled, (2) the main program is compiled and linked with the pre-compiled modules, and (3) the executable gets copied to the BINDIR location.

If a file named makefile exists as above, then to compile the code all you have to do is type:

>> make

Occasionally the file will have a name other than makefile (e.g., makefile_mycode). Then you can compile it by typing:

>> make -f makefile_mycode

Note that in the above example there is a set of instructions called clean. This is common practice and usually gives directions for removing the compiled codes. One can use this by typing:

>> make clean

6. Homework
1) One of the most important concepts in signal processing is that of convolution. For example, in seismology a seismic signal is just the convolution between a source time function, the receiver structure, and the Earth structure (otherwise known as the Green’s functions). Hence, having a convolution code handy is a must. For example, if we compute the Green’s functions in the Earth for a seismic signal, we can adequately synthesize what is recorded on a seismometer by convolving those Green’s functions with a source time and receiver structure function. In this exercise we will create a basic convolution code in Fortran 90. If we have discrete signals h[n] and x[n] the convolution between them is defined as:

$$ y[n] = h[n] * x[n] = \sum_{k=-\infty}^{+\infty} x[k]\, h[n-k] $$

Write a code that will read in an arbitrary signal x[n] and convolve it with one of a set of predefined functions h[n]. The predefined functions that should be available are: (1) box car, (2) triangle, and (3) Gaussian function. The code should output the convolved signal. Also write a C Shell script that will drive the convolution program and produce a plot of the input and output signals. Your code should be written such that it is easy to read, which means that you should break up the main constituents of the program into subroutines. For example, you may want separate subroutines that (a) read in the data, (b) create box car functions, (c) create triangle functions, (d) create a Gaussian function, (e) perform the convolution, and (f) write out the convolved signal.


L13 – Supercomputing - Part 1
Without question, the recent widespread availability of large scale distributed computing (supercomputing) is revolutionizing the types of problems we are able to solve in all branches of the physical sciences. Almost every major university now hosts some kind of supercomputing architecture, and hence most researchers have the ability to develop software for such an environment. This is in stark contrast to the situation a decade ago, when one had to obtain computing time from dedicated supercomputing centers, which were few and far between. This availability of resources is only going to increase in the future, and as a result it is important to know the basics of how to develop code for, and how to utilize, supercomputer facilities. We could dedicate an entire seminar series to supercomputing, but in this class we only have two lectures. So, what we will do here is (1) introduce the primary concepts behind supercomputing, and (2) introduce the fundamentals of how to actually write code that supercomputers can run. There are many details on the coding aspects that are better suited to a full scale course.

1. What is Supercomputing?
So, what is a supercomputer? Here’s a picture of one, the common type of picture you will see on a website. Looks impressive, right? A whole room full of ominous looking black boxes just packed with CPUs.

[Figure: Typical picture of a now obsolete supercomputer.]

Here’s the official definition of a supercomputer:

• A computer that leads the world in terms of processing capacity, particularly speed of calculation, at the time of its introduction.

My preferred definition is:

• Any computer that is only one generation behind what you really need.


So, the definition of a supercomputer is really set by processing speed. What does this mean for our current supercomputers?

Computer Speed

Computer speed is measured in FLoating point Operations Per Second (FLOPS). Floating point is a way to represent real numbers (not integers) in a computer. As we discussed previously, this involves an approximation as we don’t have infinite memory locations for our real numbers. We usually represent real numbers by a number of significant digits which we scale using an exponent:

significant digits × base^exponent

[Figure caption: In Terminator 3, Skynet is said to be operating at “60 teraflops per second”; either this makes no sense or the speed of Skynet’s calculations is accelerating.]

We are generally most familiar with the base 10 system, so as an example we could represent the number 1.5 as: 1.5 × 10^0, or 0.15 × 10^1, or 0.015 × 10^2, etc. We say floating point because the decimal point is allowed to float relative to the significant digits of the number. So, a floating point operation is simply any mathematical operation (addition, subtraction, multiplication, etc.) between floating point numbers.

Currently the LINPACK Benchmark is officially used to determine a computer’s speed. You can download the code and directions yourself from:

http://www.netlib.org/benchmark/hpl

The benchmark solves a dense system of linear equations (Ax = b) where the matrix A is of size N × N. It uses a solution based on Gaussian elimination (which every student here should at least recall) with a numerical approach called partial pivoting. The calculation requires

$$ \frac{2}{3}N^3 + 2N^2 $$

floating point operations. The benchmark is run for different size matrices (different N values), searching for the size Nmax where the maximal performance is obtained. To see the current computing leaders you can check out the website:

http://www.top500.org
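As a rough illustration of how the LINPACK operation count quoted above turns into a speed, suppose (purely as an assumed example) that a machine solves an N = 1000 system in 0.5 seconds:

```latex
\frac{2}{3}(1000)^3 + 2(1000)^2 \approx 6.69\times10^{8}\ \text{operations},
\qquad
\frac{6.69\times10^{8}\ \text{operations}}{0.5\ \text{s}} \approx 1.34\times10^{9}\ \text{FLOPS} \approx 1.34\ \text{Giga FLOPS}
```

The N and the run time here are made-up numbers chosen only to show the arithmetic. The machines on the Top500 list reach their quoted speeds at far larger N.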


It’s truly amazing to look at this. The last time I gave a talk on supercomputing the most recent update to the Top500 list was posted in Nov. 2006. At that time the computer BlueGene/L at Lawrence Livermore National Laboratory (LLNL) was the unchallenged leader with a max performance of 280.6 Tera FLOPS. It’s amazing to see how dramatically this has changed. The current leader (June 2010) is the Jaguar supercomputer at Oak Ridge National Laboratory which maxes out at 1759 Tera FLOPS. Blue Gene/L is now at about 480 Tera FLOPS but has dropped to the number 8 position. The first parallel computers were built in the early 1970’s (e.g., the ILLIAC IV). But, we can see a pretty linear progression in computing speed when speed is viewed on a logarithmic scale:

Year   Speed              Computer
1974   100 Mega FLOPS     CDC STAR 100 (LLNL)
1984   2.4 Giga FLOPS     M-13 (Scientific Research Institute, Moscow)
1994   170 Giga FLOPS     Fujitsu Numerical Wind Tunnel (Tokyo)
2004   42.7 Tera FLOPS    SGI Project Columbia (NASA)
2006   280.6 Tera FLOPS   Blue Gene/L (LLNL)
2010   1759 Tera FLOPS    Jaguar (Oak Ridge National Laboratory)

This result is a basic outcome of Moore’s Law which states that the number of transistors that can be placed inexpensively on an integrated circuit has doubled approximately every two years. The next figure is an interesting look at what may happen if this trend continues.

From Rudy Rucker’s book, The Lifebox, the Seashell, and the Soul: What Gnarly Computation Taught Me About Ultimate Reality, the Meaning of Life, and How to Be Happy.
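As a rough check on the growth rate implied by the 1974 and 2010 entries in the speed table above (100 Mega FLOPS and 1759 Tera FLOPS, 36 years apart), we can estimate the doubling time:

```latex
\frac{1759\times10^{12}}{100\times10^{6}} \approx 1.76\times10^{7} \approx 2^{24.1},
\qquad
\frac{36\ \text{years}}{24.1\ \text{doublings}} \approx 1.5\ \text{years per doubling}
```

This back-of-the-envelope estimate uses only two data points, so it should be read loosely. It comes out somewhat faster than the two-year doubling of transistor counts, plausibly because the leading machines have also grown in the number of processors they use.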


2. Parallelism in Physics
To understand why the current model of supercomputing has been so successful we must first look at what this model is. Basically the preferred supercomputer architecture today is called Parallel Computing, which means that we divide our problem up among a number of processors. The following diagram shows the basic computer layout:

The main points are:

• The computer is divided up into nodes.
• Each node may have multiple processors (e.g., most Linux clusters may have 2 processors per node, but the majority of the computers I’ve worked on have 8 processors per node).
• Each processor has access to a global memory structure on its node, but doesn’t have access to the memory on the other nodes.
• Communication of information can occur between processors within or across nodes.
• Each processor can access all of the memory on its own node.

The reason this strategy is so important is that the fundamental laws of physics are parallel in nature. That is, the fundamental laws of physics apply at each point (or small volume) in space. In general we are able to describe the dynamic behavior of physical phenomena by a system(s) of differential equations. Examples are:

• Heat flow
• The Wave Equation
• Mantle Convection
• Hydrodynamics
• etc.

The art of parallel programming is identifying the part of the problem which can be efficiently parallelized. As a quick example let’s look at the 1-D wave equation. We can write this as:


$$ \frac{\partial^2 p(x,t)}{\partial t^2} = c^2\, \frac{\partial^2 p(x,t)}{\partial x^2} $$

Where p is pressure and c is velocity. Here we have time derivatives that describe how the system evolves with time and spatial derivatives describing the interaction of different particles. We can solve this equation by a simple finite difference approximation:

$$ p(x, t+dt) = 2p(x,t) - p(x, t-dt) + c^2\, dt^2\, \frac{p(x+dx, t) - 2p(x,t) + p(x-dx, t)}{dx^2} $$

Consider we are solving our wave equation at discrete spatial points represented by the green circles separated by a distance of dx. At the point x, solution of the spatial derivative (2nd derivative in this case) only involves the values of pressure at the points in the immediate vicinity of x (e.g., using a 3-point centered difference approximation the solution only involves the two neighboring points inside the blue box).

Note that what happens in the near future (t + dt) at some point x only depends on:

• the present time (t),
• the immediate past (t - dt), and
• the state of the system in the nearest neighborhood of x (x ± dx)
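The finite difference update above can be sketched as a short serial Fortran program. The grid size, velocity, time step, initial pulse, and fixed-end boundary condition are all arbitrary choices made for this illustration. The time step is chosen so that c*dt/dx = 0.5, since this explicit scheme is only stable for c*dt/dx <= 1.

```fortran
PROGRAM wave1d
IMPLICIT NONE
INTEGER, PARAMETER :: k=kind(0d0)
INTEGER, PARAMETER :: nx=201, nt=200     !grid points and time steps
REAL(k), PARAMETER :: c=1500.0_k         !velocity (m/s)
REAL(k), PARAMETER :: dx=10.0_k          !grid spacing (m)
REAL(k), PARAMETER :: dt=0.5_k*dx/c      !time step (s), gives c*dt/dx = 0.5
REAL(k) :: pold(nx), p(nx), pnew(nx)     !pressure at t-dt, t, and t+dt
INTEGER :: i, it

!initial condition: a Gaussian pulse in the middle of the grid, at rest
DO i = 1, nx
   p(i) = EXP(-(REAL(i - (nx+1)/2, k)/5.0_k)**2)
ENDDO
pold = p

DO it = 1, nt
   pnew(1) = 0.0_k; pnew(nx) = 0.0_k     !hold the end points fixed
   DO i = 2, nx-1
      !new value at x needs only t, t-dt, and the neighbors x-dx and x+dx
      pnew(i) = 2.0_k*p(i) - pold(i) &
              + (c*dt/dx)**2 * (p(i+1) - 2.0_k*p(i) + p(i-1))
   ENDDO
   pold = p
   p = pnew
ENDDO

write(*,*) 'max |p| after ', nt, ' steps: ', MAXVAL(ABS(p))

END PROGRAM wave1d
```

The inner loop over i is the part that lends itself to parallelization: each pnew(i) depends only on a handful of nearby values from earlier time levels, so different stretches of the grid can be updated by different processors.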

This type of behavior is inherent in physics. The key now is to determine how best to subdivide the problem amongst the many processors you have available to you. That is, we want to parallelize the problem. It is important to note our desire is to Parallelize and not Paralyze our code. In the example above it makes sense that we may want to divide the problem up spatially and have different processors work on chunks of the problem that are closely located in space. An equivalent 2D example may look as follows, where we have here shown the 2D grid divided up into 3 blocks.


However, these spatial divisions can get much more difficult in 3D problems. Below is an example grid from Martin Käser (Ludwig Maximilians University, Munich) where each color represents the part of the problem that a different node will work on.

[Figure: Grid from Martin Käser.]

One of the primary issues in parallelizing code has to do with the exchanging of information at domain boundaries:

• Each processor is working on a single section of the problem, but at the boundaries requires information from other processors. For example, in our example of the 1D wave equation we may need the pressure values being calculated on other processors to be able to calculate the FD approximation in our own domain.


Hence, some form of communication needs to take place. This is where the Message Passing comes into play.

We have two fundamental concerns: (1) Load balancing – we want to divide the problem up as equally as possible so as to keep all of the processors busy, and (2) we want to minimize the interprocessor communication. There is generally a tradeoff between processing and communication.
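As a small illustration of the load balancing concern, the sketch below divides a 1D grid among processors as evenly as possible, giving any leftover points to the lowest ranks. The grid size, the number of processors, and the program name decompose are assumptions for this example; the loop over rank simply stands in for what each processor would compute for itself.

```fortran
PROGRAM decompose
IMPLICIT NONE
INTEGER, PARAMETER :: nx=1000       !total number of grid points
INTEGER, PARAMETER :: nprocs=3      !number of processors
INTEGER :: rank, base, extra, npts, istart, iend

base  = nx/nprocs                   !points every processor gets
extra = MOD(nx, nprocs)             !leftover points to hand out

DO rank = 0, nprocs-1
   npts = base
   IF (rank < extra) npts = npts + 1          !first 'extra' ranks get one more
   istart = rank*base + MIN(rank, extra) + 1
   iend   = istart + npts - 1
   write(*,*) 'rank ', rank, ' works on points ', istart, ' to ', iend
ENDDO

END PROGRAM decompose
```

With contiguous chunks like these, each processor only needs to exchange values at the two ends of its chunk with its neighbors, which keeps the interprocessor communication small relative to the computation.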

3. Parallel Programming Environments
Parallel computers require special programming techniques to exploit their speed. Typically, Fortran produces faster code than C or C++ (this is because it is really hard to optimize pointers) and as a result most supercomputer applications are written in Fortran. This is definitely the case in Seismology (all major supercomputing codes in global seismology are written in Fortran 90) and appears to be the case in meteorology from the people I’ve talked to. In any case, parallel programming can be done in Fortran, C, or C++ (and in other languages as well, but less commonly). When I was employed at the Arctic Region Supercomputing Center I asked one of the people running the center what language was used the most in applications running on their computers. I was actually a little surprised that greater than 90% of the applications were written in Fortran; however, this was dominated by the meteorologists who were running the weather models. I don’t know if this pattern holds elsewhere.

How one exploits the parallelism depends on the computing environment. For each environment there are different utilities available:

Distributed Memory:
• MPI (Message Passing Interface)
• PVM (Parallel Virtual Machine)

Shared Memory - Data Parallel (also known as multi-threading):
• OpenMP (Open Multi-Processing)
• Posix Threads (Portable Operating System Interface)

Usually parallel computers address all of these environments. It is up to the programmer to decide which one suits the problem best. In this class we will focus on distributed memory systems and MPI programming, which is the most common. However, it is not uncommon to use a combination of methods. Think about our example of how supercomputers are set up. One node is a shared memory environment, and looking across nodes is a distributed memory environment. Hence, it is common to use OpenMP to deal with parallelization between processors on the same node, and to use MPI to deal with the parallelization across nodes.

5. Intro to Message Passing Concepts
Here we will start to describe the concepts of actually writing parallel code using the Message Passing Interface (MPI). The key point is that we are going to write our code to solve a problem where we have several different processors working on a different chunk of the problem. For example, suppose we are going to numerically integrate a 2D function. The first thing we might do is decide how we are going to break this problem up. We might just want each processor to compute an equal part of the integral. Hence, if I have 4 processors at my disposal each processor might try and compute these parts of the integral

The main points here are that:

• I divided my problem up into 4 sections, and have decided that each processor is going to do the numerical integration in one of these sections.
• In parallel programming we refer to each of our sections as ranks, and we start our numbering scheme with rank = 0. Hence, we refer to the part of the problem that our first processor is working on as rank 0. Our second processor is working on rank 1, etc.
• Our task as a programmer is to tell each processor what it should be doing. That is, we specify the actions of a process performing part of the computation rather than the action of the entire code. In this example we are simply telling every processor to sum up an area under the curve, but we are telling each processor to calculate this sum under a different region of the curve.
• Note that each rank is only solving a part of the integral. To determine the final answer we have to communicate the result of all ranks to just a single rank and sum the answers.
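The division of labor described above can be sketched in ordinary serial Fortran, with a loop over myrank standing in for the four processors. This is not parallel code. The integrand f(x) = x**2, the limits 0 to 2, the midpoint rule, and the 1-D setting are assumptions chosen to keep the example short.

```fortran
PROGRAM split_integral
IMPLICIT NONE
INTEGER, PARAMETER :: k=kind(0d0)
INTEGER, PARAMETER :: nranks=4            !number of sections (ranks 0 to 3)
INTEGER, PARAMETER :: nper=1000           !midpoint-rule intervals per rank
REAL(k), PARAMETER :: a=0.0_k, b=2.0_k    !integration limits
REAL(k) :: partial(0:nranks-1), width, xlo, h, x
INTEGER :: myrank, i

width = (b - a)/REAL(nranks, k)           !width of each rank's section

DO myrank = 0, nranks-1                   !serial stand-in for the processors
   xlo = a + REAL(myrank, k)*width        !left edge of this rank's section
   h = width/REAL(nper, k)
   partial(myrank) = 0.0_k
   DO i = 1, nper
      x = xlo + (REAL(i, k) - 0.5_k)*h    !midpoint of interval i
      partial(myrank) = partial(myrank) + f(x)*h
   ENDDO
   write(*,*) 'rank ', myrank, ' partial sum = ', partial(myrank)
ENDDO

!the final answer needs the results of all ranks gathered in one place
write(*,*) 'total = ', SUM(partial)

CONTAINS

   FUNCTION f(x)
   REAL(k), INTENT(IN) :: x
   REAL(k) :: f
   f = x**2
   END FUNCTION f

END PROGRAM split_integral
```

Every rank runs the same instructions and differs only in xlo, which is computed from myrank. The total should come out close to 8/3. In a real parallel code the last step, collecting the partial sums on one rank, is where message passing comes in.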


As another example, imagine that we just have two processors. At the start of the code execution we initialize the variable X = 0.0.

                    Processor 1       Processor 2
                    myrank: 0         myrank: 1
Initialization:     X = 0.0           X = 0.0

Here we use the variable myrank to tell us which process we are using. At this point we could provide some code. For example:

Code (the same on Processor 1 and Processor 2):

IF (myrank == 0) THEN
   X = X + 10.0
ENDIF

As you can see our code is giving a specific instruction based on which processor is doing the work. After execution of this line of code we get:

                    Processor 1       Processor 2
                    myrank: 0         myrank: 1
Result:             X = 10.0          X = 0.0

The important point is that although we are just using the single variable X, it can take on different values depending on which processor we are referring to. But, at some point one processor may be interested in what the value of a variable is on another processor. For example, Processor 2 wants to know what X is on Processor 1:

                    Processor 1       Processor 2
                    myrank: 0         myrank: 1
                    X = 10.0          X = 0.0

                    Processor 2 asks: "Hey, what do you have for X?"

To determine this we have to Pass a Message from rank 1 to rank 0 asking it to supply its value of X, and then we have to send the answer from rank 0 back to rank 1. In Passing Messages the following items must be considered:

• Which processor is sending a message? (which rank)
• Where is the data on the sending processor? (which variable)
• What kind of data is being sent? (e.g., integer, real, …)
• How much data is being sent? (e.g., a single integer, how many array elements)
• Which processor(s) is (are) receiving the message? (which rank)
• Where should the data be left on the receiving processor? (which variable)
• How much data is the receiving processor prepared to accept? (e.g., how many array elements)
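As a preview, the questions in this list map fairly directly onto the arguments of the standard MPI_Send and MPI_Recv calls. The sketch below is an assumption about how the two-processor example might be coded, not the version developed in the next lecture. It needs an MPI compiler and must be run on at least two processes. The program name pass_x and the message tag 0 are arbitrary choices.

```fortran
PROGRAM pass_x
USE mpi
IMPLICIT NONE
INTEGER :: mpi_ierr, nprocs, myrank
INTEGER :: status(MPI_STATUS_SIZE)
REAL(KIND=8) :: X

CALL MPI_Init(mpi_ierr)
CALL MPI_Comm_size(MPI_COMM_WORLD, nprocs, mpi_ierr)
CALL MPI_Comm_rank(MPI_COMM_WORLD, myrank, mpi_ierr)

X = 0.0
IF (myrank == 0) X = X + 10.0

IF (nprocs >= 2) THEN
   IF (myrank == 0) THEN
      !which variable: X; how much: 1 value; what kind: double precision;
      !who receives: rank 1; message tag: 0
      CALL MPI_Send(X, 1, MPI_DOUBLE_PRECISION, 1, 0, MPI_COMM_WORLD, mpi_ierr)
   ELSE IF (myrank == 1) THEN
      !where to leave it: X; prepared to accept: 1 value;
      !who sends: rank 0; message tag: 0
      CALL MPI_Recv(X, 1, MPI_DOUBLE_PRECISION, 0, 0, MPI_COMM_WORLD, status, mpi_ierr)
      write(*,*) 'rank 1 now has X = ', X
   ENDIF
ENDIF

CALL MPI_Finalize(mpi_ierr)

END PROGRAM pass_x
```

Note that in this sketch rank 1 never literally asks for X. Because both ranks run the same program, the programmer arranges in advance for rank 0 to send and rank 1 to receive at the matching point in the code.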

In the next lecture we will show the details of how this is done using the Message Passing Interface.

6. Homework
This is a bye week. Have fun!


L14 – Supercomputing - Part 2

1. MPI Code Structure
Writing parallel code can be done in either C or Fortran. The Message Passing Interface (MPI) is just a set of subroutines that has bindings in either language. That is, we write our codes as normal and use the MPI subroutine set to handle the details of communication between processors for us. We just need to worry about when that communication takes place and what is said. MPI has many subroutines (125 total functions); however, it is really easy to work with and many programs can be written using just 6 functions. The six main functions of MPI are:

1) MPI_Init - Initialize MPI environment
2) MPI_Finalize - Finalize MPI environment
3) MPI_Comm_size - Determine total number of processors
4) MPI_Comm_rank - Determine the rank of the current processor
5) MPI_Send - Send a message
6) MPI_Recv - Receive a message

All MPI programs will have the same basic structure. The main elements are organized as follows:

PROGRAM example_mpi
USE mpi
IMPLICIT NONE
INTEGER :: mpi_ierr, nprocs, mpi_rank

CALL MPI_Init(mpi_ierr)
CALL MPI_Comm_size(MPI_COMM_WORLD, nprocs, mpi_ierr)
CALL MPI_Comm_rank(MPI_COMM_WORLD, mpi_rank, mpi_ierr)

! Do your calculations here, i.e., the main program elements

CALL MPI_Finalize(mpi_ierr)

END PROGRAM example_mpi

Note that what we do is:

• We start out by initializing the MPI environment. This is done with just the MPI_Init subroutine. All this does is say start up MPI. We have designated the variable mpi_ierr to tell us about the status of each MPI action we perform.
• Next we find out how many processors we are actually using. Here we use the subroutine MPI_Comm_size and place the result into the variable nprocs.
• Next we find out which processing rank we are. We use the subroutine MPI_Comm_rank and place the result into the variable mpi_rank.
• Now the MPI environment is completely set up and we can write the main part of the code.
• Once everything is done we need to close off the MPI environment with the subroutine MPI_Finalize.

2. Your First Parallel Code
Often when we start to write code our first code is a Hello program. We will do this with MPI because it is a little more interesting than in the normal case. Our examples will be shown for the environment of the University of Utah’s Center for High Performance Computing (CHPC). Most of you will have an account on one of CHPC’s computers (if not, join up with someone who does for this exercise). Log in now to sanddunearch:

    >> ssh -X -l username sanddunearch.chpc.utah.edu

Our basic hello world program is mpiexample.f90:

    PROGRAM example
    USE mpi
    IMPLICIT NONE
    INTEGER :: mpi_ierr, nprocs, mpi_rank

    ! initialize MPI environment
    CALL MPI_Init(mpi_ierr)
    CALL MPI_Comm_size(MPI_COMM_WORLD, nprocs, mpi_ierr)
    CALL MPI_Comm_rank(MPI_COMM_WORLD, mpi_rank, mpi_ierr)

    ! just have rank 0 state how many processors we are using
    IF (mpi_rank == 0) THEN
      write(*,*) nprocs, "processes have been requested."
    ENDIF

    ! Here is the hello world part...
    write(*,*) "Hi, I am Rank: ", mpi_rank

    CALL MPI_Finalize(mpi_ierr)

    END PROGRAM example

Compiling MPI Codes

The first thing we may now note is that we can’t compile MPI code with just the standard g95 type of call. Instead we need to use a Fortran 90 compiler built for MPI. Usually this is called mpif90. On sanddunearch we have several flavors (similar to all the different flavors of f90). You can see the list of all the compilers CHPC supports on their web page:

    http://www.chpc.utah.edu/docs/manuals/user_guides/arches/

Here we will use the pathscale mpif90 compiler since it is my favorite. Hence, to compile we just need to know the path to this compiler:

    >> /uufs/sanddunearch.arches/sys/pkg/mvapich/std/bin/mpif90 mpiexample.f90 -o mpiexample.x

Executing MPI Codes

Note that now you should have an executable file called mpiexample.x but that we can NOT just type ./mpiexample.x to execute this code. There are two basic ways to execute it: (1) through interactive mode, or (2) through the batch system. Typically we will execute our codes through the batch system. To enter the interactive mode you would type:

    >> qsub -I -l nodes=1:ppn=4,walltime=10:00

Before we go on, note that we are requesting to use 1 node and 4 processors (ppn = processors per node) for a total time of 10 minutes. But it is preferable to just create a script that we can submit through the batch system. To run this code we need to use the program mpirun. But, we also need to use the same version of mpirun that was set up for the version of mpif90 that we used above:

    >> /uufs/sanddunearch.arches/sys/pkg/mvapich/std/bin/mpirun_rsh -rsh -np 4 -hostfile $PBS_NODEFILE ./mpiexample.x

So, to run this job let’s create a file, run_mpi.pbs:

    #!/bin/bash
    #PBS -N testjob
    #PBS -A tj-sda
    #PBS -l qos=thorne
    #PBS -l walltime=00:10:00
    #PBS -o test.out
    #PBS -e test.err
    #PBS -l nodes=4

    cd $PBS_O_WORKDIR
    /uufs/sanddunearch.arches/sys/pkg/mvapich/std/bin/mpirun_rsh -rsh -np 4 -hostfile $PBS_NODEFILE ./mpiexample.x


We can now submit the job to be run by typing:

    >> qsub run_mpi.pbs

Note that the output we wrote to screen is written to the file test.out. You can see what the program output is by examining this file. This is an example of how to use the batch system to run jobs. We will discuss this further in Section 5, but note that there are two directives in this script (-A tj-sda and -l qos=thorne) which specifically state to run on my personal nodes. This is fine in the confines of this class, but please don’t use my nodes for your personal jobs. This is of course a really simple example, but note that we got a response from each processor. Now let’s take a look at how to pass information between processors.

3. Basic Communication Routines (Send/Receive)
Let’s think of a simple example with three processors in which each rank sends a real number to the rank immediately to its left (wrapping around to the highest rank if we are at rank 0), so that every rank receives a value from the rank immediately to its right. The situation may look like:

    Rank 0:   Send (2)   Recv (1)
    Rank 1:   Send (0)   Recv (2)
    Rank 2:   Send (1)   Recv (0)

The first thing we may want to do is define a variable we will call torank that defines which processor rank we want to send information to:

    IF (myrank > 0) torank = myrank - 1
    IF (myrank == 0) torank = 2

To actually send this information we use the MPI_Send subroutine. The basic format of this subroutine looks like:

    MPI_Send(buf, count, datatype, dest, tag, comm, ierror)

Where,

    buf       = the actual variable to send
    count     = the number of elements to send
    datatype  = the type of data to send
    dest      = the rank of the process to send the message to
    tag       = an integer number identifying the message
    comm      = the communicator (e.g., MPI_COMM_WORLD)
    ierror    = the Fortran return code

Hence, if the real number we wanted to send was in the variable data we would do:

    CALL MPI_Send(data,1,MPI_REAL,torank,tag,MPI_COMM_WORLD,mpi_ierr)


So far we have told each processor where to send its data. But, we haven’t specified which processors should be listening for data. To receive the data being sent we need to add an MPI_Recv call. For our above example:

    IF (myrank < 2) fromrank = myrank + 1
    IF (myrank == 2) fromrank = 0
    CALL MPI_Recv(rec,1,MPI_REAL,fromrank,tag,MPI_COMM_WORLD, &
                  mpi_status,mpi_ierr)

The MPI_Recv subroutine is quite similar to the MPI_Send subroutine:

    MPI_Recv(buf, count, datatype, source, tag, comm, status, ierror)

Where,

    buf       = the variable in which to store the received data
    count     = the number of elements to receive
    datatype  = the type of data to receive
    source    = the rank of the process to receive the message from
    tag       = an integer number identifying the message
    comm      = the communicator (e.g., MPI_COMM_WORLD)
    status    = message status (an integer array of size MPI_STATUS_SIZE)
    ierror    = the Fortran return code

Let’s put this all together into a program:

    PROGRAM mpisendexample
    USE mpi
    IMPLICIT NONE
    REAL :: X, Y
    INTEGER :: mpi_ierr, nprocs, myrank
    INTEGER :: mpi_status(MPI_STATUS_SIZE)   ! the status argument must be an array
    INTEGER :: torank, fromrank, tag

    ! Initialize the MPI environment
    !---------------------------------------------------------------------!
    CALL MPI_Init(mpi_ierr)
    CALL MPI_Comm_size(MPI_COMM_WORLD,nprocs,mpi_ierr)
    CALL MPI_Comm_rank(MPI_COMM_WORLD,myrank,mpi_ierr)

    ! Let's make the variable X be something specific to each processor:
    !---------------------------------------------------------------------!
    X = 10.0*float(myrank)

    ! Now let's read the value of X from the rank to the right
    ! and store it in the variable Y
    !---------------------------------------------------------------------!

    ! First let's send the data
    !---------------------------------------------------------------------!
    IF (myrank > 0) torank = myrank - 1
    IF (myrank == 0) torank = nprocs - 1
    tag = 1
    write(*,*) "myrank: ", myrank, "sending to rank: ", torank
    CALL MPI_Send(X,1,MPI_REAL,torank,tag,MPI_COMM_WORLD,mpi_ierr)

    ! We will add a barrier here, not because it's necessary but so that our
    ! output comes in a more reasonable fashion for this example.
    ! (This relies on MPI buffering such a small message, so that MPI_Send
    ! returns before the matching MPI_Recv has been posted.)
    !---------------------------------------------------------------------!
    CALL MPI_Barrier(MPI_COMM_WORLD,mpi_ierr)
    IF (myrank == 0) write(*,*) "--------------------------"

    ! Now let's receive it!
    !---------------------------------------------------------------------!
    IF (myrank < (nprocs-1)) fromrank = myrank + 1
    IF (myrank == (nprocs-1)) fromrank = 0
    write(*,*) "myrank: ", myrank, "receiving from rank: ", fromrank
    CALL MPI_Recv(Y,1,MPI_REAL,fromrank,tag,MPI_COMM_WORLD, &
                  mpi_status,mpi_ierr)

    ! We will add a barrier here as well
    !---------------------------------------------------------------------!
    CALL MPI_Barrier(MPI_COMM_WORLD,mpi_ierr)

    ! Now let's report what we have
    !---------------------------------------------------------------------!
    IF (myrank == 0) write(*,*) "--------------------------"
    write(*,*) "On rank: '", myrank, "'; X =", X, " and Y = ", Y

    CALL MPI_Finalize(mpi_ierr)

    END PROGRAM mpisendexample
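The wrap-around logic for torank and fromrank can also be written with modular arithmetic, which avoids the separate IF statements for the end ranks. The following is only a minimal Python sketch for checking the logic without MPI; it is not part of the course code, and the function name ring_neighbors is made up for illustration. (In Fortran the MODULO intrinsic gives the same result for a negative first argument.)

```python
def ring_neighbors(myrank, nprocs):
    """Return (torank, fromrank) for a ring in which each rank sends to its left."""
    torank = (myrank - 1) % nprocs    # rank 0 wraps around to nprocs - 1
    fromrank = (myrank + 1) % nprocs  # rank nprocs - 1 wraps around to 0
    return torank, fromrank


if __name__ == "__main__":
    nprocs = 3
    for myrank in range(nprocs):
        torank, fromrank = ring_neighbors(myrank, nprocs)
        print("rank", myrank, "sends to", torank, "and receives from", fromrank)
```

For three ranks this reproduces the send/receive pattern sketched at the start of Section 3, and it works unchanged for any number of processors.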

It is useful to note here that in our send and receive messages we had to specify that we were sending a real number with MPI_REAL. The primary data types you will use in Fortran are:

    MPI_REAL
    MPI_INTEGER
    MPI_CHARACTER

4. Some Useful MPI Functions
The above example utilized another function called MPI_Barrier. The action of the Barrier function is to synchronize processes. That is, it essentially halts the program until all of the processors have reached the Barrier call. We used it above so that the output would be written in a little more sequential manner. Nonetheless, it wasn’t necessary. As noted there are over 100 MPI functions. You can find what they are and their syntax at the following web page:

    http://www.dei.unipd.it/~addetto/manuali_online/SP/MPISubRef/d3d80mst02.html

But, let’s review a couple of the most useful here so you can see how these functions work in general.

MPI_BCAST – Which is short for BroadCAST. With the broadcast command one processor sends the same message to a number of recipients with a single operation. Let’s look at a simple example that reads in some information to rank 0 and then broadcasts that information to all other processors. Let’s first create a file input.txt with some information to read in (a character string, an integer, and a real number, one per line since each is read with its own read statement):

    my_input_example
    10
    13.567

Our code might look like:

    PROGRAM mpisendexample
    USE mpi
    IMPLICIT NONE
    REAL :: realnum
    INTEGER :: nr
    INTEGER :: mpi_ierr, nprocs, myrank
    CHARACTER(LEN=30) :: title

    ! Initialize the MPI environment
    !---------------------------------------------------------------------!
    CALL MPI_Init(mpi_ierr)
    CALL MPI_Comm_size(MPI_COMM_WORLD,nprocs,mpi_ierr)
    CALL MPI_Comm_rank(MPI_COMM_WORLD,myrank,mpi_ierr)

    ! Read in file from standard input on rank 0
    !---------------------------------------------------------------------!
    IF (myrank == 0) THEN
      read(*,*) title
      read(*,*) nr
      read(*,*) realnum
    ENDIF

    ! Now let's send this information to all of the other processors
    !---------------------------------------------------------------------!
    CALL MPI_BCAST(title, 30, MPI_CHARACTER, 0, MPI_COMM_WORLD, mpi_ierr)
    CALL MPI_BCAST(realnum, 1, MPI_REAL, 0, MPI_COMM_WORLD, mpi_ierr)
    CALL MPI_BCAST(nr, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, mpi_ierr)

    ! Now let's write out the value of one of these variables on the
    ! other processors to check that it worked
    !---------------------------------------------------------------------!
    write(*,*) "On rank: ", myrank, "; title= ", title

    CALL MPI_Finalize(mpi_ierr)

    END PROGRAM mpisendexample

Note that to run this code we would direct the input into the code through standard in:

    >> mpirun_rsh -rsh -np 4 -hostfile $PBS_NODEFILE ./mpiexample.x < input.txt

MPI_allreduce – This handy little utility lets you choose a variable and find the minimum or maximum value of the variable across all ranks and put the output of the action in another variable. Here’s an example:

    PROGRAM mpiredex
    USE mpi
    IMPLICIT NONE
    REAL :: X, MinX, MaxX
    INTEGER :: mpi_ierr, nprocs, myrank

    ! Initialize the MPI environment
    !---------------------------------------------------------------------!
    CALL MPI_Init(mpi_ierr)
    CALL MPI_Comm_size(MPI_COMM_WORLD,nprocs,mpi_ierr)
    CALL MPI_Comm_rank(MPI_COMM_WORLD,myrank,mpi_ierr)

    ! Make some dummy variable X
    !---------------------------------------------------------------------!
    X = 0.5*float(myrank+2)
    write(*,*) "Rank: ", myrank, "; X= ", X

    ! Find the min and max of X across all ranks and store in the
    ! variables MinX and MaxX
    !---------------------------------------------------------------------!
    CALL MPI_AllReduce(X,MinX,1,MPI_REAL,MPI_MIN,MPI_COMM_WORLD,mpi_ierr)
    CALL MPI_AllReduce(X,MaxX,1,MPI_REAL,MPI_MAX,MPI_COMM_WORLD,mpi_ierr)
    write(*,*) "Rank: ", myrank, "; MinX= ", MinX
    write(*,*) "Rank: ", myrank, "; MaxX= ", MaxX

    CALL MPI_Finalize(mpi_ierr)

    END PROGRAM mpiredex
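To see what values to expect from the reduction before running on a cluster, the operation can be mimicked serially. The following is a rough Python sketch, not MPI code: the helper simulate_allreduce is invented here purely for illustration, and it simply applies the reduction to a list holding one value per rank and hands the same result back to every rank, which is the defining behavior of an all-reduce.

```python
def simulate_allreduce(values, op):
    """Mimic an all-reduce: every rank ends up with the same reduced value."""
    result = op(values)
    return [result for _ in values]


if __name__ == "__main__":
    nprocs = 4
    # Same dummy variable as the Fortran example: X = 0.5*float(myrank+2)
    x = [0.5 * (myrank + 2) for myrank in range(nprocs)]
    min_x = simulate_allreduce(x, min)
    max_x = simulate_allreduce(x, max)
    for myrank in range(nprocs):
        print("Rank:", myrank, "; X=", x[myrank],
              "; MinX=", min_x[myrank], "; MaxX=", max_x[myrank])
```

With four ranks, X takes the values 1.0, 1.5, 2.0, and 2.5, so every rank should report MinX = 1.0 and MaxX = 2.5.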


Obviously there are many more MPI functions we could talk about. But, to be honest, many of the parallel codes I’ve written haven’t needed to use any other functions than the ones I covered in this lecture.

5. The Batch System
The final thing we need to talk about is submitting jobs. On the CHPC computers here at UU we use the Portable Batch System (PBS) for job scheduling. Other supercomputers may use other systems but they all basically work in the same way, although there might be slight differences in syntax. So, as noted, a typical batch script may look as follows:

    #!/bin/bash
    #PBS -N testjob
    #PBS -A tj-sda
    #PBS -l qos=thorne
    #PBS -l walltime=00:10:00
    #PBS -o test.out
    #PBS -e test.err
    #PBS -l nodes=4
    #PBS -M michael.thorne@utah.edu
    #PBS -m ab

    cd $PBS_O_WORKDIR
    /uufs/sanddunearch.arches/sys/pkg/mvapich/std/bin/mpirun_rsh -rsh -np 4 -hostfile $PBS_NODEFILE ./mpiexample.x

You can find a description of the flags at:

    http://www.chpc.utah.edu/docs/manuals/user_guides/arches/#batch

The most important points are:

• We must specify an amount of time (the walltime) that the job will require. If your job exceeds the walltime it will get killed!
• You must specify how many processors to use. On sanddunearch this is just done with the nodes option.
• When executing the code, you must again specify how many processors to use. This is done with the flag -np.
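Because the processor count appears twice (once in the nodes option and once after -np), it is easy to change one and forget the other. One way to keep them in sync is to generate the script from a single number. The following is only an illustrative Python sketch: the function make_pbs_script and its arguments are hypothetical, the account and qos lines are deliberately left out, and the mpirun command is shortened to a bare name that you would replace with the full path used above.

```python
def make_pbs_script(job_name, nodes, walltime, executable, mpirun="mpirun_rsh"):
    """Return the text of a PBS batch script that uses `nodes` for both counts."""
    lines = [
        "#!/bin/bash",
        "#PBS -N " + job_name,
        "#PBS -l walltime=" + walltime,
        "#PBS -l nodes=" + str(nodes),
        "#PBS -o " + job_name + ".out",
        "#PBS -e " + job_name + ".err",
        "",
        "cd $PBS_O_WORKDIR",
        # the same number is reused for -np so the two settings cannot disagree
        mpirun + " -rsh -np " + str(nodes) + " -hostfile $PBS_NODEFILE " + executable,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_pbs_script("testjob", 4, "00:10:00", "./mpiexample.x"))
```

The printed text could be redirected to a file such as run_mpi.pbs and submitted with qsub as usual.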

To submit the code we use the qsub command:

    >> qsub pbs_script

Once we have submitted the code, we can check its status by just typing showq or qstat. However, I find this to be a little annoying since it shows everyone’s jobs. Hence, it is useful to create an alias that just shows your jobs. For example, in the C shell:

    alias q 'qstat -a | grep username'

(In bash the equivalent is alias q='qstat -a | grep username', with no spaces around the equals sign.) You will obviously substitute in your own username.

6. Homework
This is another bye week as not all students have access to the supercomputing facilities.


L15 – POV-Ray - Part 1

1. What is POV-Ray?
POV-Ray stands for the Persistence of Vision Raytracer. POV-Ray belongs to a class of programs called ray tracers. For you seismologists this concept should be quite familiar: although it has nothing to do with tracing seismic ray paths, the idea is essentially the same. Here we are creating an image by ray tracing light paths from an object to an observer or camera. You can generate some really incredible images using this tool. The cover art for Peter Shearer’s new seismology text was designed using POV-Ray by Gunnar Jahnke (of LMU, Munich). For some really nice examples of what you can do check out the POV-Ray Hall of Fame: http://hof.povray.org

In this class we will use POV-Ray on the Windows side of the computers. However, POV-Ray can be run on almost any system. The reason is that POV-Ray is a very basic program that doesn’t include any graphics. It can produce images, but relies on other software available on any computer system to view these images. In this sense POV-Ray is similar to the Generic Mapping Tools (GMT). To create an image in POV-Ray we create a text file in the POV scene description language and then we render this image in POV-Ray. Just as in GMT we will be writing out our scenes in a text file which tells POV-Ray where the camera, lights, and objects are.

2. Getting Started
To launch POV-Ray:

    Start > Programs > POV-Ray for Windows

Now let’s create a new text file to describe our scene:

    File > New File

Here is a simple example you can type in:

    // This is a simple sphere

    // first, the camera position
    camera {
      location <0,8,-15>
      look_at <1,0,5>
    }

    // now add some light
    light_source {
      <5,10,-15>
      color rgb <1,1,1>
    }

    plane { // the floor
      y, 0 // along the x-z plane (y is the normal vector)
      texture {
        pigment { color <0,0,1> } // solid blue
        normal { ripples 20 }     // rippled surface
      }
    }

    sphere {
      <1,3,-5>, 3
      pigment {
        marble
        turbulence 1
        color_map {
          [0.0 color <1,0,0>]
          [0.25 color <0,0,1>]
          [1.0 color <0,1,0>]
        }
        scale 3
      }
      finish { reflection 0.2 refraction 0.8 ior 1.5 phong 1 }
    }

First we need to save the file:

    File > Save As…

Let’s just save the file name as sphere.pov. And now we can render the image by hitting the Run button. The following image should be rendered. Note that POV-Ray will save the image as sphere.bmp in the same location as you saved the sphere.pov file.


Note that we can change the size of the rendered image by going to Render > Edit Settings/Render and selecting a different size (e.g., 1024 × 768). Initialization files (.INI files) contain information such as the resolution of the rendered image. If you need a resolution that is not included in the QUICKRES.INI file you can change them by going to: Tools > Edit resolution INI file.

3. Camera Angle
The example in the previous section shows a few key points about what is necessary in a POV-Ray file: (1) camera location, (2) lighting location, and (3) some objects to display (in this case a plane and a sphere). To understand where to place the camera we need to first understand the POV-Ray coordinate system. POV-Ray uses a simple Cartesian coordinate system (a left-handed system: by default x points to the right, y points up, and z points into the screen).

The two easiest attributes of the camera object to use are the ones shown in the first example, namely the location and look_at attributes. In our above example we used:

    camera {
      location <0,8,-15>
      look_at <1,0,5>
    }

Where the location just gives us the x-, y-, and z-coordinates of where the camera is located (x = 0, y = 8, z = -15), and we stated we wanted it to look at the position x = 1, y = 0, and z = 5. We could make a change and say we want to look directly at the origin by changing the look_at attribute:

    look_at <0,0,0>

It’s more fun now to change the location attribute. Try a few changes, for example:

    location <0,10,-50>
    location <0,5,-10>
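When experimenting with location and look_at it can help to know which way the camera is pointing and how far it is from its target. The following small Python sketch (not POV-Ray code; the helper name view_vector is made up for illustration) computes both from the two vectors.

```python
import math


def view_vector(location, look_at):
    """Return the (unnormalized) viewing direction and the camera-to-target distance."""
    direction = tuple(t - c for c, t in zip(location, look_at))
    distance = math.sqrt(sum(d * d for d in direction))
    return direction, distance


if __name__ == "__main__":
    # Camera from the example scene: location <0,8,-15>, look_at <1,0,5>
    direction, distance = view_vector((0, 8, -15), (1, 0, 5))
    print("direction:", direction)
    print("distance: %.2f" % distance)
```

For the example camera the viewing direction is <1,-8,20>, i.e., mostly along +z (into the scene) and tilted downward, at a distance of about 21.6 units from the look_at point.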


4. Lighting
Adding lighting to our scene is very similar to how we describe the camera position. The simplest light source is just a point light source.

    light_source {
      <5,10,-15>
      color rgb <1,1,1>
    }

Here we first specify that we want the light placed at position x = 5, y = 10, and z = -15. We next state we want to use a white light. We use an RGB vector to describe the color of the light, but note that whereas in GMT our RGB colors ranged from 0 to 255, in POV-Ray our colors range from 0 to 1. So, the vector <1,1,1> would be the same as 255/255/255 in GMT. In general it’s best to use a white light source, but you are not restricted to it. To change our light source so that it is shining directly down on our object try the position:

    <0,15,0>

You can also add additional light sources. E.g., just add another light_source block:

    light_source {
      <-20,0,0>
      color rgb <0.5,1,1>
    }
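If you already have colors written in the GMT R/G/B style, converting them to the POV-Ray range is just a division by 255. A minimal Python sketch of that conversion follows; the function name gmt_to_pov and the three-decimal output format are choices made here for illustration only.

```python
def gmt_to_pov(gmt_color):
    """Convert a GMT 'R/G/B' string (0-255) to a POV-Ray rgb vector string (0-1)."""
    r, g, b = (int(part) / 255.0 for part in gmt_color.split("/"))
    return "color rgb <%.3f,%.3f,%.3f>" % (r, g, b)


if __name__ == "__main__":
    for gmt_color in ("255/255/255", "0/0/255", "255/128/0"):
        print(gmt_color, "->", gmt_to_pov(gmt_color))
```

The resulting string can be pasted directly into a light_source or pigment statement.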

5. Simple POV-Ray Objects
Defining a camera position and light source is great, but they serve no purpose unless you have something to look at. The simplest thing one can do as a newcomer to POV-Ray is learn how to manipulate some of the basic POV-Ray shapes. In general primitive POV-Ray objects are described as:

    Object_Name {
      Object_Parameters
      Some_Simple_Attribute
      Some_Complicated_Attribute { Some_Attribute }
    }

5.1 Sphere

We already saw an example of a sphere above. The basic form of drawing a sphere is as follows:

    sphere { <center>, radius }

But, we should add some color to it with the pigment attribute. An example is:

    sphere {
      <0,0,0>, 3
      pigment { color rgb <0, 0, 1> }
    }

To make sure we understand how these objects work let’s create a simple POV-Ray file that we can play with:

    // Make Some Simple Objects
    camera {
      location <0,10,-10>
      look_at <0,0,0>
    }

    light_source {
      <5,10,-15>
      color rgb <1,1,1>
    }

    // Simple Sphere
    sphere {
      <0,0,0>, 3
      pigment { color rgb <0, 0, 1> }
    }

There are several classes of basic objects we can use in POV-Ray. The next table shows the basic syntax for some of the most important.

Object: Box
  Syntax:  box { <corner-1>, <corner-2> }
  Example: box { <0,0,0>, <3,3,3> pigment { color rgb <0,0,1> } }

Object: Cone
  Syntax:  cone { <center-1>, radius-1, <center-2>, radius-2 }
  Example: cone { <0,0,0>, 3, <0,4,0>, 0 pigment { color rgb <0,0,1> } }

Object: Cylinder
  Syntax:  cylinder { <center-1>, <center-2>, radius }
  Example: cylinder { <0,0,0>, <0,4,0>, 3 pigment { color rgb <0,0,1> } }

Object: Disc
  Syntax:  disc { <center>, <normal>, radius [, hole radius] }
  Example: disc { <0,0,0>, <1,1,0>, 3, 2 pigment { color rgb <0,0,1> } }

Object: Plane
  Syntax:  plane { <normal>, offset }
  Example: plane { <0,1,0>, 1 pigment { color rgb <0,0,1> } }
  Note:    Just a plane surface!

Object: Sphere
  Syntax:  sphere { <center>, radius }
  Example: sphere { <0,0,0>, 3 pigment { color rgb <0,0,1> } }

Object: Torus
  Syntax:  torus { major radius, minor radius }
  Example: torus { 3, 0.5 pigment { color rgb <0,0,1> } }

Object: Triangle
  Syntax:  triangle { <corner-1>, <corner-2>, <corner-3> }
  Example: triangle { <0,0,0>, <1,0,0>, <0.5,1,0> pigment { color rgb <0,0,1> } }


6. Finish
An object’s finish describes how the object interacts with light, for example, how much light it reflects. To play with the finish attribute let’s look at our simple sphere example again. We can make our object shine a bit by having our light source hit the sphere and giving it the phong finish:

    camera {
      location <0,10,-10>
      look_at <0,0,0>
    }

    light_source {
      <5,5,-15>
      color rgb <1,1,1>
    }

    // create a checker board surface
    plane {
      <0,1,0>, -5
      pigment { checker color <1,0,0> color <1,1,1> }
    }

    // Simple Sphere
    sphere {
      <0,0,0>, 3
      pigment { color rgb <0, 0, 1> }
      finish { phong 0.8 }
    }

The next table describes the primary ways we can apply finish to our objects.

Finish: ambient
  Description: How much of the lighting comes from ambient light (i.e., light that bounces off other objects). Range: {0.0 to 1.0}. 0.0 means that objects that are not directly lit will be black. Higher values will make an object appear to glow.
  Example: finish { ambient 1.0 }

Finish: brilliance
  Description: How much of the lighting from direct light sources will bounce off of the object. Range: {0.0 to ??}. Large numbers can make an object more metallic looking. Numbers less than 1.0 make the object look softer.
  Example: finish { brilliance 0.5 }

Finish: crand
  Description: Used to make an object appear to have a rough surface. Range: {0.0 to 1.0}. The larger the number the rougher the surface.
  Example: finish { crand 0.5 }

Finish: diffuse
  Description: Similar to the ambient keyword, but how much of the lighting will come from diffuse (direct) light sources. Range: {0.0 to 1.0}.
  Example: finish { diffuse 1.0 }

Finish: phong
  Description: Creates a highlight on the object. Range: {0.0 to 1.0}. The larger the number the brighter the highlight.
  Example: finish { phong 1.0 }

Finish: phong_size
  Description: Describes the size of the phong highlight. Range: {1.0 to 250.0}. The larger the number the smaller (tighter) the highlight.
  Example: finish { phong 1.0 phong_size 1 }

Finish: metallic
  Description: Only works in conjunction with phong and specular finishes. The highlight on the object takes on the color associated with the object and not just the light color.
  Example: finish { phong 1.0 metallic }

Finish: reflection
  Description: Using reflection will give the surface a mirrored finish. Range: {0.0 to 1.0}. A value of 0.0 turns reflection totally off. The larger the number the more reflective the surface is.
  Example: finish { reflection 1.0 }

Finish: specular
  Description: This is similar to phong, in that it is used to create a highlight on the object. This is purportedly more realistic than phong. Range: {0.0 to 1.0}. The larger the number the brighter the highlight.
  Example: finish { specular 1.0 }

Finish: roughness
  Description: This controls the size of the highlight used in the specular command. Range: {0.0005 to 1.0}. The smaller the number the smoother the object is.
  Example: finish { specular 1.0 roughness 1.0 }

Finish: refraction
  Description: Refraction only works if your objects are partly transparent. This can be changed in the pigment statement. For example: pigment { color rgbf <0,0,1,0.8> }. With partially transparent objects, light can now refract through them. Refraction = 0 means to turn off refraction. Refraction = 1 means to turn on refraction.
  Example: finish { refraction 1 }

Finish: ior
  Description: Stands for index of refraction. By default the ior = 1.0, which means no refraction (the speed of light is the same inside and outside of the object). This allows us to change light’s index of refraction for an object.
  Example: finish { refraction 1 ior 1.5 }

Now that we know all about finish, just for fun try out the following:

    // Simple Sphere
    sphere {
      <0,0,0>, 3
      pigment { color rgbf <1, 1, 1, 0.8> }
      finish { reflection 0.1 refraction 1.0 ior 1.5 phong 0.8 }
    }

For even more fun start adding in texture statements:

    // Simple Sphere
    sphere {
      <0,0,0>, 3
      texture {
        pigment { color rgbf <1, 1, 1, 0.8> }
        normal { bumps 1/2 scale 1/6 }
        finish { reflection 0.1 refraction 1.0 ior 1.5 phong 0.8 }
      }
    }


7. Image Overlays
In our next lecture we will go over the different uses of the pigment command in detail. But, for now let’s look at one of the most useful keywords: image_map. On the course web page the material for this lecture contains a file called: Earth Surface, Clouds & Ocean. Download this file into the directory where you are keeping your .pov files. Unzip the file and note that its contents are just .png images of the Earth. Now, see how simple it is to overlay this on a sphere in POV-Ray using the pigment command:

    // Earth Sphere
    sphere {
      <0,0,0>, 3
      pigment { image_map { png "./Earth/03_Earth_Land.png" map_type 1 interpolate 2 } }
    }

Of course, we don’t necessarily need to restrict ourselves to a spherical Earth…

    box {
      <-3,-3,-3>, <3,3,3>
      pigment { image_map { png "./Earth/03_Earth_Land.png" map_type 1 interpolate 2 } }
    }

The most important modifier is map_type:

• map_type = 0: This is a planar mapping. We will describe it more below.
• map_type = 1: This is the spherical mapping. The image is wrapped around the origin.
• map_type = 2: This is a cylindrical mapping. The image is wrapped around the y-axis.
• map_type = 5: This is a toroidal mapping. Fantastic if you ever find the need to wrap an image on a donut.

For plotting maps onto spheres we will typically want to use the map_type 1 modifier. But it is often useful to paste planar images onto a surface. For example the following image of a nebula is taken from the Hubble telescope:

This image is of size 1024 × 831 pixels. It might make a really nice background image. So, let’s create a plane surface in the x-y plane and image_map this onto it:

    plane {
      <0,0,1>, 50
      pigment { image_map { png "./Hubble/04_Hubble.png" map_type 0 interpolate 2 } }
    }

What you get is the next image. It’s not too useful. The problem is that the normal image map takes any image no matter what its size and maps it onto the interval from 0-1 (for both axes). So, not only does the image turn out small, it is also now stretched. It also repeats the image over and over.

The key is to start scaling the image. Since the image is wider than it is tall we can use different scales for the different axes. For example,

    camera {
      location <0,0,-40>
      look_at <20,18,-5>
    }

    light_source {
      <5,5,-15>
      color rgb <1,1,1>
    }

    plane {
      <0,0,1>, 50
      pigment {
        image_map { png "./Hubble/04_Hubble.png" map_type 0 interpolate 2 once }
        scale <123,100,1>
      }
    }

Here the x and y scales (123 and 100) follow the 1024 × 831 aspect ratio of the image. The z scale is set to 1 because a scale factor of 0 is not valid (POV-Ray issues a warning and substitutes 1). Using the once keyword will turn off wrapping the image. The image below shows that we can build up perspective views of our image maps for nice effects.
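The x and y scale factors only need to keep the width-to-height ratio of the original image. As a quick check of the numbers, here is a short Python sketch (the function name image_map_scale is made up for illustration, and leaving z at 1 is an assumption based on the planar map lying in the x-y plane):

```python
def image_map_scale(width_px, height_px, height_units):
    """Return an (x, y, z) scale that preserves the aspect ratio of a planar image map."""
    width_units = height_units * width_px / height_px
    # z is left at 1 since the planar map only extends in x and y
    return (round(width_units, 1), float(height_units), 1.0)


if __name__ == "__main__":
    sx, sy, sz = image_map_scale(1024, 831, 100)
    print("scale <%g,%g,%g>" % (sx, sy, sz))
```

For the 1024 × 831 pixel Hubble image and a height of 100 units this gives a width of about 123 units, which is where the x scale in the example comes from.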


8. Translation and Rotation
If you play around with image mapping onto spherical objects, POV-Ray only seems to behave well if you place the sphere at the plot origin. That isn’t very useful if you want multiple objects. Luckily, we have the translate function. Translate basically just tells us by how much to move our object in the x-, y-, and z-directions. E.g., for our example of the Earth we can move it over in the positive x-direction by:

    camera {
      location <0,0,-10>
      look_at <0,0,0>
    }

    light_source {
      <5,5,-15>
      color rgb <1,1,1>
    }

    sphere {
      <0,0,0>, 3
      pigment { image_map { png "./Earth/03_Earth_Land.png" map_type 1 interpolate 2 } }
      translate <5,0,0>
    }

Rotate is really useful for our image maps mapped to spheres. We could change our look_at parameter for our camera. But, it would be easier just to rotate the sphere. The rotate command looks like:

    rotate <x angle, y angle, z angle>

Which tells us how many degrees to rotate our object around each of the principal axes. For example, change our earth plot to:

    sphere {
      <0,0,0>, 3
      pigment { image_map { png "./Earth/03_Earth_Land.png" map_type 1 interpolate 2 } }
      rotate <30,180,0>
    }

9. Homework
1) On the course web page there is a large tar file called Solar System Objects. This file contains surface maps of many of the bodies in our solar system. This is my collection of image maps, and is not complete, but about as complete and as current as you will find anywhere I think. Use these images to do something creative in POV-Ray that utilizes at least 2 objects. Print out a copy of your image and bring to the next class. We will vote on the best image with the winner taking home a prize! Here is an example image I created in about 15 minutes. Now, good luck and have fun!


L16 – POV-Ray - Part 2

1. Pigment
In the previous lecture we briefly introduced pigments. This section should tell you pretty much everything you need to know about them.

Solid Colors

Our previous examples showed us how to make objects solid colors. To remind us, let’s draw a sphere and make it red:

    camera {
      location <0,0,-10>
      look_at <0,0,0>
    }

    light_source {
      <5,5,-15>
      color rgb <1,1,1>
    }

    background { color rgb <1,1,1> }

    sphere {
      <0,0,0>, 4
      pigment { color rgb <1,0,0> }
    }

We just declared our pigment to be a color with the statement color rgb.

Color Maps

Many of the pigment options that we can use are best used when combined with a color map. The concept of a color map in POV-Ray is quite similar to that in GMT. Only we usually define the range of colors to be between 0 and 1, as many of the pigment options act on the 0-1 range of color maps. Below is an example of how to define a color map in POV-Ray.

    // Red-to-Blue through White Color Map
    color_map {
      [0.0 color rgb <1,0,0>]
      [0.3 color rgb <1,0,0>]
      [0.5 color rgb <1,1,1>]
      [0.7 color rgb <0,0,1>]
      [1.0 color rgb <0,0,1>]
    }

As an example of how to use this we can replace the sphere command we used in the above example by:

    #declare Red_Blue = color_map {
      [0.0 color rgb <1,0,0>]
      [0.3 color rgb <1,0,0>]
      [0.5 color rgb <1,1,1>]
      [0.7 color rgb <0,0,1>]
      [1.0 color rgb <0,0,1>]
    }

    sphere {
      <0,0,0>, 4
      pigment {
        agate
        color_map { Red_Blue }
      }
    }

There are three main points to be made from the above example:

• We used the special pigment agate that gives a swirly agate-like appearance to our sphere.
• The colors that the agate function used were defined by the color map.
• We can use #declare to give names to specific objects (in this case a color map) so we don’t have to re-type them over and over if we want to use them more than once.
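To get a feel for what color a pattern value between two color map entries produces, the map can be evaluated by hand. The Python sketch below assumes that colors are blended linearly between neighboring entries, which is how such maps are normally interpreted; the names RED_BLUE and color_at are invented here for illustration and are not POV-Ray keywords.

```python
# The Red_Blue color map from the text as (value, (r, g, b)) entries
RED_BLUE = [
    (0.0, (1.0, 0.0, 0.0)),
    (0.3, (1.0, 0.0, 0.0)),
    (0.5, (1.0, 1.0, 1.0)),
    (0.7, (0.0, 0.0, 1.0)),
    (1.0, (0.0, 0.0, 1.0)),
]


def color_at(color_map, value):
    """Linearly interpolate an rgb color from a sorted list of (value, rgb) entries."""
    # clamp to the range covered by the map
    value = min(max(value, color_map[0][0]), color_map[-1][0])
    for (v0, c0), (v1, c1) in zip(color_map, color_map[1:]):
        if v0 <= value <= v1:
            if v1 == v0:
                return c0
            t = (value - v0) / (v1 - v0)
            return tuple(a + t * (b - a) for a, b in zip(c0, c1))
    return color_map[-1][1]


if __name__ == "__main__":
    for value in (0.0, 0.4, 0.5, 0.6, 1.0):
        print(value, "->", color_at(RED_BLUE, value))
```

A pattern value of 0.4, for example, lands roughly halfway between the red entry at 0.3 and the white entry at 0.5, giving a pink of about <1,0.5,0.5>.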

The next table shows the different ways that we can affect the color table.

Pigment:  Agate
Syntax:   pigment { agate }
Example:  pigment { agate color_map {Red_Blue} }

Pigment:  Bozo
Syntax:   pigment { bozo }
Example:  pigment { bozo color_map {Red_Blue} }

Pigment:  Checker
Syntax:   pigment { checker color color-a color color-b }
Example:  pigment { checker color rgb <1,0,0> color rgb <1,1,1> }

Pigment:  Frequency
Note:     Frequency controls how many times the color map is used over the 0.0 to 1.0 range. E.g., frequency 2 will cause the color map to repeat itself 2 times.
Example:  pigment { bozo color_map {Red_Blue} frequency 5 }

Pigment:  Gradient
Syntax:   pigment { gradient <x,y,z> }
Note:     The vector in this command gives the direction along which the color map varies; the bands of constant color are normal to it.
Example:  pigment { gradient <1,1,0> color_map {Red_Blue} }

Pigment:  Granite
Syntax:   pigment { granite }
Example:  pigment { granite color_map {Red_Blue} frequency 5 }

Pigment:  Hexagon
Syntax:   pigment { hexagon color color-a color color-b color color-c }
Example:  pigment { hexagon color rgb <1,0,0> color rgb <1,1,1> color rgb <0,0,1> }

Pigment:  Leopard
Syntax:   pigment { leopard }
Example:  pigment { leopard color_map {Red_Blue} frequency 2 }

Pigment:  Mandel
Note:     Creates a pattern that looks like the Mandelbrot set. The number after mandel states how many iterations to calculate in making the pattern.
Example:  pigment { mandel 50 color_map {Red_Blue} scale 2.5 }

Pigment:  Marble
Syntax:   pigment { marble }
Example:  pigment { marble color_map {Red_Blue} }

Pigment:  Onion
Syntax:   pigment { onion }
Note:     The onion command doesn’t work well with the sphere. The pattern is made of concentric spherical layers, so a sphere centered on the origin would be only one color.
Example:  box { <-3,-3,-3>, <3,3,3>
            pigment { onion color_map {Red_Blue} }
            rotate <-30,-30,0>
          }

Pigment:  Phase
Note:     Offsets the phase of the color map. Range: 0.0 to 1.0
Example:  box { <-3,-3,-3>, <3,3,3>
            pigment { onion color_map {Red_Blue} phase 0.5 }
            rotate <-30,-30,0>
          }

Pigment:  Radial
Syntax:   pigment { radial }
Example:  pigment { radial color_map {Red_Blue} } rotate <0,60,0>

Pigment:  Spotted
Syntax:   pigment { spotted }
Example:  pigment { spotted color_map {Red_Blue} }

Pigment:  Wood
Syntax:   pigment { wood }
Note:     If you really want wood, then better not use the Red_Blue color table.
Example:  pigment { wood color_map {Red_Blue} }
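One way to picture what the frequency and phase modifiers in the table do is as simple arithmetic on the pattern value before it is looked up in the color map. The Python sketch below assumes the value is multiplied by the frequency, shifted by the phase, and then wrapped back into the 0-1 range. The function name adjust_value is hypothetical, and the model is my reading of the descriptions above, not code taken from POV-Ray.

```python
import math


def adjust_value(value, frequency=1.0, phase=0.0):
    """Map a pattern value in 0-1 to the position used in the color map.

    Assumed model: scale by frequency, shift by phase, then keep the
    fractional part so the result wraps back into the 0-1 range.
    """
    shifted = value * frequency + phase
    return shifted - math.floor(shifted)


# With frequency 2 the color map is traversed twice over the 0-1 range.
for v in (0.0, 0.25, 0.5, 0.75):
    print(v, adjust_value(v, frequency=2))

# With phase 0.5 the color map is entered halfway through.
print(adjust_value(0.25, phase=0.5))
```

Under this model a pattern value of 0.25 and one of 0.75 land on the same spot in the color map when frequency is 2, which is what "repeating the color map" means.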


We can also modify our pigment commands with turbulence. What does turbulence do? You guessed it. It stirs things up a bit. To see its use consider the next two examples:
pigment {wood color_map {Red_Blue} turbulence 0.2 }

pigment {wood color_map {Red_Blue} turbulence <0.2,0,0> }

Here we applied the turbulence command to our wood example from above. The first example applied equal amounts of turbulence (defined in the range from 0.0 to 1.0) in all directions. The second example only applied turbulence in the x-direction. Now, perhaps you can see why we waited until the 2nd lecture to give a rundown on the basics of the pigment command. Namely, it took almost 5 pages to do it.

2. Height Fields
Height fields are one of the most useful objects for visualizing real data in POV-Ray. To demonstrate how to use height fields let’s look back at an example of visualizing elevation data. Lecture #7 discussed making GMT .grd files from Digital Elevation Model (DEM) data. In the material for the current lecture I include a .grd file generated for the Zion National Park area. Download that file now as we will use it to generate a height field in POV-Ray. With GMT and ImageMagick we can make a plot of the Zion.grd file and then convert this image to a .gif file. The following script shows an example of how to do this.

    #!/bin/csh

    # Set input/output
    set Gridfile = Zion.grd          # name of input .grd file to use
    set output   = Zion_Height.gif   # name of output file. Must contain
                                     # .gif extension

    # Set map boundaries
    set xmin = 300000
    set xmax = 339995
    set ymin = 4100005
    set ymax = 4160000

    # Make color palette table to utilize the maximum range of elevations
    # Note that image should be gray scale. Lowest elevations should
    # be colored black, ranging to highest elevations white
    gmtset COLOR_BACKGROUND 0/0/0
    gmtset PAGE_COLOR 0/0/0
    set cscale = `grdinfo -T100 $Gridfile`
    makecpt -Cgray -M -Z $cscale >! height.cpt

    # Generate the postscript file
    grdimage $Gridfile -R${xmin}/${xmax}/${ymin}/${ymax} -Jx1:250000 \
        -Cheight.cpt -Qs -E300 -P >! Height.ps
    rm height.cpt

    # Make .gif file for use with POV-Ray height fields
    convert Height.ps $output
    rm Height.ps

    # Show the .gif height field
    display $output

Running this script will generate a .gif image of Zion National Park that looks like this:

The most important points about this process are:
• To make a height field in POV-Ray we need a .gif image. Other formats are also supported, but the .gif file format is simple to deal with.
• The .gif format allows for a range in gray from 0-255, hence only 256 distinct elevations are allowable. Hence, we wish to maximize our color palette table to include just the range of elevations in the .grd file.
• The .gif file should be in gray scale, with the minimum elevations color coded black and ranging up to white for the maximum elevations.
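The point about only 256 distinct elevations can be checked with a little arithmetic. The Python sketch below uses made-up elevation limits (they are not read from Zion.grd) and hypothetical helper names. It assumes the palette range is the data range rounded outward to multiples of 100, which is what the grdinfo -T100 call in the script is used for, and that gray levels are spread linearly over that range.

```python
import math


def rounded_range(zmin, zmax, dz):
    """Round an elevation range outward to multiples of dz."""
    return math.floor(zmin / dz) * dz, math.ceil(zmax / dz) * dz


def gray_level(z, zmin, zmax):
    """Map an elevation to a gray level: 0 (black) at zmin, 255 (white) at zmax."""
    fraction = (z - zmin) / (zmax - zmin)
    return round(255 * min(max(fraction, 0.0), 1.0))


# Made-up elevation limits in meters, for illustration only.
lo, hi = rounded_range(1117.0, 2660.0, 100)
print("palette range:", lo, hi)
print("meters per gray step:", (hi - lo) / 255)
print("gray level at 1900 m:", gray_level(1900.0, lo, hi))
```

With these numbers each gray step covers a bit more than 6 m of elevation. Stretching the palette over a much wider range than the data actually spans would waste gray levels and coarsen the height field.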


Now that we have a .gif file as created in the above example it is simple to visualize this in POV-Ray. The following script shows the simplest (no-frills) way to do this:

    camera {
      location <0,5,0>
      look_at <0,0,3>
    }
    light_source { <1,10,0> color rgb <1,1,1> }
    height_field {
      gif "Zion_Height.gif"
      smooth
      pigment {color rgb <0,1,0> }
      finish {phong 0.4}
      scale <6,0,6>
      translate <-3,0,0>
    }

An important note to make is that with POV-Ray we are not limited in where we can place our camera, whereas in GMT our camera location is constrained to be off at infinity. So, for example, we can zoom in to unique views. In addition, one can now play with all of POV-Ray’s finishes, textures, pigments and so on to create very realistic views, including the addition of clouds and other objects. It is up to you as an artist at this point.

As another point, almost anything can be turned into a height field. For example, open up Adobe Photoshop or ImageReady and create a black and white file with some text. I created the following .gif file where I just typed my name:

Note that if saving a .gif file in Photoshop or ImageReady you need to set Reduction to Grayscale or it might not work properly. Now we can use this as a height field in POV-Ray:


As a final note, what you do with this type of height field object is pretty much up to your creativity. Once you get used to playing around with POV-Ray you can stick these height fields on objects and do all kinds of interesting things (see the next section on bump maps)!

3. Normals
The normal attribute modifies the normal surface vectors to give an appearance of bumpiness. The next table shows the basic ways this can be done.

Normal:   Bumps
Syntax:   normal {bumps bump_size}
Note:     bump_size can range from 0.0 (no bumps) to 1.0 (maximum size)
Example:  sphere { <0,0,0>, 4
            pigment {color rgb <1,0,0>}
            normal {bumps 1.0}
            finish {phong 0.8}
          }

Normal:   Dents
Syntax:   normal {dents dent_size}
Note:     dent_size can range from 0.0 (no dents) to 1.0 (maximum size)
Example:  normal {dents 1.0}

Normal:   Ripples
Syntax:   normal {ripples size}
Note:     size takes the standard input range from 0.0 to 1.0.
Example:  box { <-3,-3,-3>, <3,3,3>
            pigment {color rgb <1,0,0>}
            normal {ripples 1.0}
            rotate <-30,-30,0>
            finish {phong 0.8}
          }

Normal:   Waves
Syntax:   normal {waves size}
Note:     size takes the standard input range from 0.0 to 1.0.
Example:  normal {waves 1.0}

Normal:   Wrinkles
Syntax:   normal {wrinkles size}
Note:     size takes the standard input range from 0.0 to 1.0.
Example:  normal {wrinkles 1.0}


Bump Maps

One of the most interesting ways to affect the normals is to use the bump_map attribute. This is like a cross between an image_map and a height_field, allowing us to extrude image information onto an object. The next example shows how to do this simply with the .gif image of my name that we generated earlier.

    camera {
      location <0,1,-10>
      look_at <0,0,0>
    }
    light_source { <0,0,-20> color rgb <1,1,1> }
    sphere { <0,0,0>, 3
      pigment {color rgb <1,0,0>}
      normal {
        bump_map { gif "Mike_1.gif" bump_size 1 map_type 1 interpolate 2 }
      }
      finish {phong 0.8}
      rotate <-20,0,0>
    }

4. Homework
(1) Generate a height field image of some region of Utah (e.g., Antelope Island, Twin Peaks Wilderness, King’s Peak area, etc.). Generate two images: (a) an aerial overview and (b) a zoomed-in view of a specific region of interest. Use the various pigment, finish, and normal modifiers we have discussed thus far to generate the image. As an example, below is an image I created looking at the Wasatch Front (Mt. Olympus is in the center of the frame).


L17 – POV-Ray – Part 3

1. Constructive Solid Geometry
Thus far we have only considered some very simple shapes that we have created in POV-Ray (e.g., spheres, triangles). We can, however, create much more interesting objects by combining some of these simple shapes. This is what we term constructive solid geometry. As an example, you may be wondering how the cross-section of the Earth shown in Lecture 15 was drawn. The answer, as you will shortly see, is quite simple. The following table outlines the basic operations we may perform using a sphere and a box. First let’s imagine the simple case where I create a sphere and a box that are overlapping:

    camera {
      location <0,5,-10>
      look_at <0,0,0>
    }
    light_source { <1,10,-10> color rgb <1,1,1> }
    background { color rgb <0.812,0.812,0.922> }
    sphere { <0,0,0>, 3
      pigment {color rgb <1,0,0>}
    }
    box { <0,0,0>, <3,3,-3>
      pigment {color rgb <0,0,1>}
      finish {phong 0.8}
    }

Now, I could subtract the box away from the sphere using the difference command:


    difference {
      sphere { <0,0,0>, 3
        pigment {color rgb <1,0,0>}
      }
      box { <0,0,0>, <3,3,-3>
        pigment {color rgb <0,0,1>}
        finish {phong 0.8}
      }
      rotate <0,15,0>
    }

Voila, now I have a cross-section I can be proud of! Well, almost: we can now think about how to drape imagery onto the cross-section. Other options are:

Intersection

    intersection {
      sphere { <0,0,0>, 3
        pigment {color rgb <1,0,0>}
      }
      box { <0,0,0>, <3,3,-3>
        pigment {color rgb <0,0,1>}
        finish {phong 0.8}
      }
      rotate <20,210,0>
    }

Merge

    merge {
      sphere { <0,0,0>, 3 }
      box { <0,0,0>, <3,3,-3> }
      pigment { bozo color_map {Red_Blue} }
    }

In addition to these basic types of options another important option is the union command. With the union command we can group objects together and then apply pigments, translations, rotations, etc. onto the group of objects at once. Consider the next example where we create an object composed of three spheres:


    #declare TripleSphere = union {
      // sphere #1
      sphere { <0,0,0>, 2 }
      // sphere #2
      sphere { <0,3,0>, 1 }
      // sphere #3
      sphere { <0,-3,0>, 1 }
    } // end union

    // Display the object here!
    object { TripleSphere
      pigment {bozo color_map {Red_Blue}}
      rotate <0,0,20>
    }

Note that now we use object to display the TripleSphere object that we declared. Now we can use just a single pigment and rotate command to act on the entire object! One can create incredibly complex objects using such constructive geometry. However, it is often easiest to draw it out on graph paper first.

2. A CSG Example
Let’s take a look at what we’ve learned thus far in this class and put some things together into a nice cross-sectional image of the Earth. Here, we will build up a complex image by putting several small pieces together in a somewhat cookbook fashion.

2.1 - Declaring Lights and Camera Angle
It is customary to start out by declaring a camera object and light source. Let’s start out as follows:

    // Mantle Cross-Section with Image overlay
    #include "textures.inc"

    // ------------------- LIGHTS/CAMERA ----------------------//
    camera {
      location <0,6000,-15000>
      look_at <0,0,3>
    }
    light_source { <1,12000,-10000> color rgb <1,1,1> }

    // background
    background { color rgb <0,0,0> }
    // -------------------- END LIGHTS/CAMERA -------------------//

Note that on the second line I use the statement: #include "textures.inc". There are some nice pre-defined textures and pigments that come with the standard POV-Ray download. On the class webpage there are several reference sheets that describe what these are. We include this statement here because we will use one of them in this example.

2.2 - The Inner Core
Let’s start out by generating a metallic looking inner core. We can do this as follows:

    // ------------------- INNER CORE -------------------------//
    #declare Inner_Core = sphere { <0,0,0>, 1221.5
      texture { Chrome_Metal
        normal {bumps scale 250}
        finish {phong 1.0}
      }
    }
    // ------------------- END INNER CORE ---------------------//

Note that we scaled our core size to the actual size. Also, we used the Chrome_Metal texture from textures.inc. Since we declared the inner core as its own object, to view it we need to add a statement to the end of our code like:

    object {Inner_Core}

At this point we should see the following when rendering our image:


2.3 - The Outer Core
Obviously, let’s now add the outer core to the mix!

    // ------------------- OUTER CORE -------------------------//
    #declare OC_Color = color_map {
      [0.0 color rgb <0.8,0.8,0.6>]
      [0.5 color rgb <0.8,0.75,0.6>]
      [1.0 color rgb <0.8,0.7,0.6>]
    }

    #declare Outer_Core = difference {
      sphere { <0,0,0>, 3480.0 }
      sphere { <0,0,0>, 1221.5 }
      box { <0,-7000,0>, <7000,7000,-7000> }
      texture {
        pigment {wood color_map {OC_Color} scale 100 turbulence 0.3}
        finish {phong 0.8}
      } // end texture
    } // end difference
    // ------------------- END OUTER CORE ---------------------//

This is a little more complicated of a declaration. Let’s look at what we did here:
• First, we declared a new color map that we will use to color our outer core. We defined this as OC_Color.
• We made the outer core by taking the difference between two spheres: a sphere with the radius of the inner core is subtracted from a sphere with the radius of the outer core. Next we subtracted off a box to generate a cross-sectional view. In plan view, we took the difference between the three following objects:


Which when differenced leaves us with a single object:

With all of the objects combined into one, we can use a single texture statement to color it all. Here we used a wood pigment with a touch of turbulence to try and emulate a feel of mixing or stirring of the liquid outer core.
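If it helps to think about the difference operation in plain terms, a point belongs to the outer core object if it is inside the first sphere and outside everything subtracted from it. The Python sketch below tests that for a few points, using the radii and box corners from the declaration above. The helper names are hypothetical and the code only illustrates the logic; it says nothing about how POV-Ray evaluates CSG internally.

```python
def in_sphere(p, radius):
    """True if point p = (x, y, z) lies inside a sphere centered on the origin."""
    x, y, z = p
    return x * x + y * y + z * z <= radius * radius


def in_box(p, corner1, corner2):
    """True if p lies inside the axis-aligned box spanned by two opposite corners."""
    return all(min(a, b) <= c <= max(a, b) for c, a, b in zip(p, corner1, corner2))


def in_outer_core(p):
    """Difference: inside the first object and outside all of the others."""
    return (
        in_sphere(p, 3480.0)
        and not in_sphere(p, 1221.5)
        and not in_box(p, (0, -7000, 0), (7000, 7000, -7000))
    )


print(in_outer_core((-2000, 0, 0)))    # in the shell, away from the cut-out
print(in_outer_core((2000, 0, -100)))  # in the shell, but removed by the box
print(in_outer_core((-500, 0, 0)))     # inside the inner core
```

Only the first point survives: the second falls in the quadrant removed by the box and the third lies inside the subtracted inner sphere.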

In order to view this image we now create a new object that we call Earth that is the union of the Inner and Outer Cores. Then we can rotate the object as a whole to get some better views.


    #declare Earth = union {
      object {Inner_Core}
      object {Outer_Core}
    }

    object { Earth
      rotate <0,25,0>
    }

The resulting image at this point should look like the following (render this yourself to see the fine turbulent detail in the outer core):

2.4 - The Crust and Earth Surface
It would seem to make more sense to do the mantle now, but for this example the mantle will be a little trickier. Adding the crust, on the other hand, is similar to what we did in creating the outer core, except that now we will add the Earth surface image as a pigment for the outermost layer.

    // ---------------- CRUST AND SURFACE ----------------------//
    #declare Crust = difference {
      sphere { <0,0,0>, 6371.0
        pigment {image_map {png "./Earth/03_Earth_Land.png" map_type 1 interpolate 2}}
        finish {ambient 0 diffuse 1}
      }
      sphere { <0,0,0>, 6331.0
        pigment {color rgb <1,0.8,0.6>}
      }
      box { <0,-7000,0>, <7000,7000,-7000>
        pigment {color rgb <1,0.8,0.6>}
        finish {phong 0.8}
      }
    } // end difference
    // -------------------- END CRUST --------------------------//

Don’t forget we now need to declare our Earth as:

    #declare Earth = union {
      object {Inner_Core}
      object {Outer_Core}
      object {Crust}
    }

Doing this we can see that we have a nice thin shell of crust now encompassing the Core. At this point Neal Adams would be pleased – a perfect rendition of his hollow Earth theory! Nonetheless, we are scientists and know better. Hence, here comes the mantle!


2.5 - The Mantle
We are going to do this in two separate pieces. The first part will have a solid color on the left-hand side of the cross-section. Here is how we would create the solid part:

    // ------------------ SOLID MANTLE ------------------------//
    #declare Mantle_Solid = difference {
      sphere { <0,0,0>, 6331.0 }
      sphere { <0,0,0>, 3480.0 }
      box { <0,-6371,6371>, <6371,6371,-6371> }
      texture {
        pigment {color rgb <0.82,0.87,0.611>}
        finish {phong 0.8}
      }
    } // end difference
    // --------------- END SOLID MANTLE ------------------------//

If you just rendered this you would now get an image like:


Now, on the right-hand side of the cross-section let’s drape some imagery. On the website material for this lecture there is an image file: mantle_cartoon.png. We can drape this just into the mantle portion:

    // ------------------ MANTLE IMAGE -------------------------//
    #declare MantleImage = difference {
      sphere { <0,0,0>, 6331.0
        pigment {color rgb <0.82,0.87,0.611>}
      }
      sphere { <0,0,0>, 3480.0
        pigment {color rgb <0.82,0.87,0.611>}
      }
      box { <0,-6371,0>, <6371,6371,-6371>
        translate <0,6371,0>
        pigment {image_map {png "./mantle_cartoon.png" map_type 0 interpolate 2}
                 scale <6331,6331*2,1> rotate <0,0,0>}
        translate <0,-6371,0>
      } // end box
    } // end difference
    // --------------- END MANTLE IMAGE ------------------------//

Combining all of our elements:

    #declare Earth = union {
      object {Inner_Core}
      object {Outer_Core}
      object {Mantle_Solid}
      object {MantleImage}
      object {Crust}
    }

We now have the following image:


2.6 - Final Touches
At this point there isn’t much else to do. But, one can always play around more and more, for example by adding or changing lights. Here, I am going to add some information on our crustal layer so that we also have a sea surface and clouds. Below is some rather complicated looking code (but not too bad if you just look at it line by line) that adds in seas and clouds. These lines replace the former line that just added the sphere overlain by the Earth Land file. Some of this was modified from material I found on an old website by Constantine Thomas. I can no longer find the original site, so I apologize to the original author of this union script for not linking to his site.

    // seas, land surface, and clouds
    union {
      sphere { <0,0,0>, 6371.0
        pigment {image_map {png "./Earth/03_Earth_Seas.png" map_type 1 interpolate 2 }}
        finish {ambient 0 diffuse 1 specular 0.5 roughness 0.01}
      }
      sphere { <0,0,0>, 6381.0
        pigment {image_map {png "./Earth/03_Earth_Land.png" map_type 1 interpolate 2
                 transmit 2, 1.0 }}
        finish {ambient 0 diffuse 1}
      }

      #declare Indexes = 256;  // number of entries in the color map
      #declare T = 5.5;        // controls how fast the clouds become
                               // opaque towards the center
                               // T=1 -> linear
                               // T<1 -> less transparency on the edges
                               // T>1 -> more transparency on the edges

      sphere { <0,0,0>, 6391.0
        pigment {image_map {png "./Earth/03_Earth_Clouds.png" map_type 1 interpolate 2
          #declare n=0;
          #while (n < Indexes)
            transmit n,1-pow(n/(Indexes-1),T)
            #declare n=n+1 ;
          #end
        } }
        normal {bump_map {png "./Earth/03_Cloud_BumpMap.png" map_type 1 interpolate 2
                bump_size 0.5}}
        finish {ambient 0 diffuse 1}
      }
    } // end union

The final rendered image is shown below.
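To get a feel for what the #while loop in the cloud layer does, it can help to tabulate the transparency it assigns. The Python sketch below evaluates the same expression, 1-pow(n/(Indexes-1),T), for a few palette indexes. The function name transmit simply mirrors the keyword, and the comments assume that the palette of the cloud image runs from dark (index 0) to bright (index 255), which I have not checked against the actual file.

```python
INDEXES = 256   # number of entries in the color map (from the scene above)
T = 5.5         # exponent controlling the transparency ramp (from the scene above)


def transmit(n, indexes=INDEXES, t=T):
    """Transparency assigned to palette index n: 1 is clear, 0 is opaque."""
    return 1 - (n / (indexes - 1)) ** t


# Tabulate the ramp at a few palette indexes.
for n in (0, 64, 128, 192, 255):
    print(n, round(transmit(n), 3))
```

With T = 5.5 the values stay close to 1 (clear) over most of the palette and only drop toward 0 (opaque) near the top, so under the assumption above only the brightest parts of the cloud image block the view of the surface.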


There are many, many other useful things we can do with POV-Ray. One important thing I have not discussed is volume rendering. Perhaps in the future I will post a lecture on this topic as well.

3. Homework
There is no assigned homework for this lecture. Just have fun playing with POV-Ray.


L18 – Finalizing Illustrations for Publication

1. Preparing Figures for Publication
Now that you’ve spent a ton of time generating wonderful graphics of your research material, let’s discuss how we ultimately format these graphics to be published in a research journal. After all, we’ve spent a lot of time and effort doing our research and making our plots, so we should spend a little bit of time to make them really pop in the journal article.

I am a strong advocate of cleaning up all figures using Adobe Illustrator. The primary reason is this: we can spend a long time in programs like GMT or MATLAB trying to make every color and every line and every piece of text perfect. Or, we can spend a smaller amount of time in GMT perfecting line thicknesses and quickly edit everything in Illustrator to look perfect! In what follows we will first describe some basics about preparing images for publication and then do a short tutorial on Adobe Illustrator.

2. Figure Basics
Before preparing your finalized figures you should always check the journal’s web page for figure guidelines. For example, the Journal of Geophysical Research (JGR) is one of the most common journals you may submit your work to. Searching their website one finds a guide for authors and in particular a guide for preparing figures:

http://www.agu.org/pubs/authors/manuscript_tools/journals/graphics_prep.shtml

The guidelines given on this page are similar for all journals. Here are the most important points we should adhere to when considering figure design for all journals:
• For most graphics use the Encapsulated PostScript (.eps) file format.
• For photographic images or raster graphics use the Tagged Image File Format (.tiff). The JGR explanation here is nice: "TIFF provides the highest resolution to ensure patterns and shading are maintained, yet it offers lossless compression and thus smaller file size."
• All lines must be at least 0.5 point thickness.
• Do not use font sizes less than 8 points. In general use 12 point font for figure labels and no less than 10 point font for axis labels. Only use 8 point font as a last resort if it is just impossible to fit text with 10 point font into the figure.
• Use a standard font such as Times New Roman or Helvetica.
• Combine multi-part plates or figures into a single figure, adding letter labels for the individual plates.
• Always determine the allowable figure size (e.g., a single column figure is typically 8.5 cm in width) and scale your figure into the allowable size. Then adjust all fonts and line weights, etc.
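To see why fonts and line weights have to be adjusted after scaling, here is a small Python sketch of the arithmetic. The 20 cm starting width is a made-up example and the function names are hypothetical; the only facts used are that there are 72 points and 2.54 cm to the inch, and that shrinking a figure shrinks its text and lines by the same factor.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def cm_to_points(cm):
    """Convert a length in centimeters to PostScript points."""
    return cm / CM_PER_INCH * POINTS_PER_INCH


def scale_factor(original_width_cm, column_width_cm=8.5):
    """Factor by which a figure must be scaled to fit the column width."""
    return column_width_cm / original_width_cm


# Hypothetical example: a 20 cm wide plot reduced to a single 8.5 cm column.
factor = scale_factor(20.0)
print("column width in points:", round(cm_to_points(8.5), 1))
print("scale factor:", factor)
print("a 12 pt label prints at", round(12 * factor, 1), "pt")
print("a 1 pt line prints at", round(1 * factor, 3), "pt")
```

In this example a 12 point label would end up at roughly 5 points and a 1 point line below 0.5 point, both under the minimums listed above, which is why the fonts and strokes get reset after the figure is scaled.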

3. Adobe Illustrator Cookbook
What we are going to do here is use Adobe Illustrator to take two raw figures generated in MATLAB and prepare them so that they are suitable for publication. First, go to the course webpage and download the material for L18. The files we will be working with here are called ce_ex_new_hv.eps and res_ex_new_hv.eps. Then launch Adobe Illustrator on the PC side of the computers.

Both space and color cost money (depending on the journal, color figures can cost you thousands of dollars), so our goal here is to make a figure combining both plots that is (1) in black & white, and (2) only spans 1 column. As we go along be sure to save your figure often, in .eps format, so that you don’t lose any changes you made due to unforeseen problems.

Step 1) Open Files in Illustrator. First of all, right click on each file and select Open With > Adobe Illustrator. Select the image in one window and copy it into the other window so that both figures are on the same plot. You should have a window that looks something like:


Step 2) Create Figure bounding box and roughly scale figures. In this example we want to create a single column figure. In general column sizes are 8.5 cm (but may vary depending on the journal). So, we will create a box 8.5 cm wide to overlay our figure in, as a guide for how to scale the figure.

• Create a new layer to place the bounding box on. Find the Layers box (select Window > Layers if it isn’t visible on your screen) and hit the Create New Layer button. Select this new layer so that we are working on it.
• We want to create a box that is 8.5 cm wide. So, let’s make sure our units are in cm. Select File > Document Setup… and change the Units to Centimeters.
• Now select the Rectangle Tool and click somewhere on the drawing to create a box. Just clicking on the plot allows us to directly enter the box size.

Now place this rectangle in the center of the drawing board. Select both figures (use the Selection Tool and hold Shift to select more than one object) and scale them down evenly (hold Shift while dragging on the corners of the box).


Step 3) Axis labels and axis tick marks. The figure now looks dramatically different than before in terms of how thick the lines are and how big the axis labels are. Just remember, this is what the journal would do to your figure if you gave them a full size plot. So, if you don’t want your final journal figures to look like crap, you can see the importance of what we are doing. Now it is time to adjust the font size of our axes.

• Zoom into the drawing by either using the Zoom Tool or pressing Ctrl and + at the same time.
• Select each of the figures and ungroup all of the objects: Object > Ungroup.
• Now select one of the axis labels. E.g., here I select one of the X-axis labels. Using the Character window change the font size to 12 pt and the font type to Times New Roman.

I don’t like that Hertz is all capitalized so I want to change HZ to Hz. Use the Type Tool and change the text.

Now go ahead and edit all of the axis labels (Frequency and Ellipticity), making them 12 point font, and the labels on the tick marks (e.g., 0, 2, 4, 6, etc.), making them 10 point font. You may want to delete every other Ellipticity value so that we can fit all of the labels into a single plot. What you end up with should look as follows, such that the labels are now readable.


Step 4) Add letter labels for each subplot. Similar to what we did in the previous step, let’s create individual labels for each subplot. In each case we will name our subplots after the stations the data were recorded at. Make your labels (a) CCP and (b) RES, and make them 12 point font. Align the labels along the left plot edge. After this step your plot should look like:


Step 5) Choose line weight and color for lines.

• Notice that if we try to select an individual line all lines get selected. To be able to select a single line: (1) select a group of lines, then (2) select Object > Clipping Mask > Release. Now note that you can select individual lines.
• Notice that if we select one of the thick black lines, it has a line weight of 5 points, which is stated in the Appearance window as Stroke: 5 pt. Let’s make the stroke smaller. The Color window allows us to change the line’s Weight under the Stroke tab. Let’s change the Weight to 3 points.
• In this case, the thin red, green, and blue lines are essentially the same measurement. Here we have made the decision to make these three lines the same thickness and the color to be a shade of gray. Select the three lines. Note another common problem that now occurs: these lines are still part of a compound path. You can see this by looking at the Appearance window, which shows Compound Path and Mixed Appearance. We can correct this problem by selecting Object > Compound Path > Release.
• Now we can change the color of the lines in the Color window. Let’s make each line 0.5 pt thick (modify the Stroke) and in the Color window choose an RGB color of R = G = B = 102. Your plot should now look like:
• Lastly, let’s select the dashed blue line. Let’s make the Stroke = 2 pt and the color the same as the last lines, R = G = B = 102. I don’t like the current dashed line appearance, so under the Stroke tab let’s change the dashed line to dash = 8 pt, gap = 2 pt, dash = 3 pt, gap = 2 pt.
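If the dash and gap numbers feel abstract, the Python sketch below expands the pattern 8, 2, 3, 2 along a short line so you can see where ink is actually drawn. The function name dash_segments and the 20 pt line length are hypothetical choices for illustration; Illustrator does all of this for you. The last line simply converts the gray value 102 to a fraction of full brightness.

```python
from itertools import cycle


def dash_segments(pattern, length):
    """Split a line of the given length (in points) into (start, end, drawn) pieces.

    pattern alternates dash and gap lengths, starting with a dash.
    """
    segments = []
    position = 0.0
    for index, piece in cycle(enumerate(pattern)):
        if position >= length:
            break
        end = min(position + piece, length)
        segments.append((position, end, index % 2 == 0))
        position = end
    return segments


PATTERN = [8, 2, 3, 2]  # dash, gap, dash, gap in points, as chosen in the text
print("pattern repeats every", sum(PATTERN), "pt")
for start, end, drawn in dash_segments(PATTERN, 20):
    print(start, end, "dash" if drawn else "gap")

# R = G = B = 102 on the 0-255 scale is 40% of full brightness.
print("gray fraction:", 102 / 255)
```

The pattern repeats every 15 pt, so a short legend line of about that length shows exactly one long dash and one short dash.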


Doing these steps to both plot panels we should now have an image that looks like this:

Step 6) Final adjustments to position and size. In the process of making our axis labels big enough for a human to read, we managed to make our overall figure larger than our guide box. Hence, at some point (it may actually have been better to do this earlier on, but better now than never) we need to go back and scale our figure to ensure it all fits in the box. You should know how to do this by now, so go ahead and do it! You should end up with something that looks like the following, noting that your font sizes should still be 12 and 10 in this example.

Step 7) Add a figure legend. Lastly, we can’t just draw lines on a plot and not give a legend describing what they are. You can easily do this by drawing short lines of the same style as in your figure and adding text next to them. Do so as in the next image:


Step 8) Final touches. Now, all that’s left are some final touches to make sure the journal editors don’t screw things up too much.

• First of all, remove the layer with the bounding box. The easiest way to do this is to select the layer and hit the trash icon in the lower right corner.
• Finally, I always like to add a label at the bottom stating:
  i. Figure number
  ii. Authors’ names
  iii. Whether the figure should span a single column or two columns
  iv. Black & white or color

Why, you may ask? Once again, there is nothing worse than spending a year or so of your life working on a research project, producing very elegant figures and illustrations, paying thousands of dollars for the publication, and then having the figure get screwed up when it goes to print. It has never happened to me personally, but I know people to whom it has happened. Take my word for it, it isn’t desirable. Our final figure now looks pretty acceptable. Here it is:


4. A Gallery of Good Figures
In wrapping up this lecture, here are a few example figures that were done entirely in Illustrator. The idea here is just to give you some ideas of what can be done.

Cartoons

Sometimes after you’ve put together your conclusions and final model it’s easiest to demonstrate this with a cartoon drawing. Here are two exceptional examples drawn almost entirely in Adobe Illustrator.


From: Schmerr, N., Garnero, E., (2007), Upper Mantle Discontinuity Topography from Thermal and Chemical Heterogeneity, Science, 318, 623-626.

From: Garnero, E.J., and A. K. McNamara, Structure and dynamics of Earth’s Lower Mantle, Science, 320, 626-628, 2008.


5. Homework
1) Everyone should at this point have some figure they have created that is related to their research. Format it in Adobe Illustrator such that it looks presentable for publication. For this homework please provide a printout of your final illustration to me.
