Quantum User's Guide: Volume 1 Data Editing
COPYRIGHT
Maygrove House
67 Maygrove Road
LONDON
NW6 2EG
England
Please address any comments or queries about this manual to the
Support Department at the above address, or via e-mail to:
support-uk@spssmr.spss.com
All trademarks acknowledged.
Contents
About this guide
1 Introduction
  1.1 What Quantum does
  1.2 Stages in a Quantum run
4 Basic elements
  4.1 Data constants
      Individual constants
      Strings of data constants
  4.2 Numbers
      Whole numbers
      Real numbers
  4.3 Variables and arrays
      Data variables
      Integer variables
      Real variables
      Reading real numbers from columns
  4.4 Subscription
5 Expressions
  5.1 Arithmetic expressions
      Combining arithmetic expressions
      Counting the number of codes in a column
      Generating a random number
  5.2 Logical expressions
      Comparing values
      Comparing data variables and data constants
      Checking the arithmetic value of a field of columns
      Combining logical expressions
Chapters 1 to 3 give you an overview of the language and explain the basic concepts of
Quantum spec writing.
Chapter 6, How Quantum reads data, describes types of records, data structure, trailer cards,
reserved variables, merging data files and reading non-standard data files.
Chapter 7, Writing out data, describes creating a new data file, copying records to a print file,
and writing to a report file.
Chapter 9, Flow control, describes the if and else statements, routing around statements,
loops, rejecting records, jumping to the tabulation section and canceling the run.
Chapter 11, Data validation, describes the require statement, column and code validation, and
validating logical expressions.
Chapter 12, Data correction, describes forced cleaning, on-line data correction, creating clean
and dirty data files, correcting data from a corrections file, and missing values in numeric
fields.
Chapter 13, Using subroutines in the edit, describes how to call up subroutines, the
subroutines in the Quantum library, writing your own subroutines and calling functions from
C libraries.
Chapter 14, Creating new variables, describes how to name and define variables in your
Quantum spec.
Chapter 16, Running Quantum under Unix and DOS, describes how to compile and run your
Quantum program.
Chapter 2, The hierarchy of the tabulation section, describes the components of a tabulation
program, the hierarchies of Quantum, how to define run conditions, the options that are
available on the a, sectbeg, flt and tab statements, the default options file and some sample
tables.
Chapter 3, Introduction to axes, describes how to create an axis, the types of elements within
an axis, how to define conditions for an element, the n count creating elements, subheadings,
netting and axes within axes.
Chapter 4, More about axes, describes the col, val, fld and bit statements, filtering within an
axis, and options on axis elements.
Chapter 5, Statistical functions and totals, describes totals, averages, means, the standard
deviation, standard error and error variance statements and how to create percentiles.
Chapter 6, Using axes as columns, describes special considerations for when axes are used
for the columns of a table.
Chapter 7, Creating tables, describes the syntax of the tab statement, multidimensional tables,
multilingual surveys, combining tables, printing more than one table per page, and suppressing
percentages and statistics with small bases.
Chapter 8, Table texts, describes table titles, underlining titles, printing text at the foot of a
page, table and page numbers and controlling table justification.
Chapter 9, Filtering groups of tables, describes general filter statements, named filters and
nested filter sections.
Chapter 10, Include and substitution, describes filing and retrieving statements, symbolic
parameters and grid tables.
Chapter 11, A sample Quantum job, provides an example of a Quantum specification and the
tables it produces.
Appendix B, Error messages, contains a list of compilation error messages with suggestions
as to why you may see them and how to solve the problems which caused them to appear.
Appendix C, Options in the tabulation section, provides a summary of the options available
in the tabulation section.
Chapter 1, Weighting, describes the weighting methods that you can use in Quantum.
Chapter 2, Row and table manipulation, describes how to create new rows and tables using
previously created tables or parts of previously created tables.
Chapter 3, Dealing with hierarchical data, describes how to use analysis levels in Quantum.
Chapter 4, Descriptive statistics, describes the axis-level and table-level statistical tests that
are available in Quantum and provides details of the chi-squared tests, non-parametric tests on
frequencies and Friedman's two-way analysis of variance.
Chapter 5, Z, T and F tests, describes the Z, T and F tests that are available in Quantum.
Chapter 6, Other tabulation facilities, describes how to include C code and edit statements in
the tabulation section and how to sort tables.
Chapter 7, Special T Statistics, describes the special T statistics that are available in
Quantum.
Chapter 8, Creating a table of contents, describes how to create a formatted list of the tables
that are produced by a Quantum run.
Chapter 9, Laser printed tables with PostScript, describes how to convert the standard
tabulation output into a file suitable for printing on a PostScript laser printer.
Appendix A, Options in the tabulation section, provides a summary of the options available
in the tabulation section.
Chapter 1, Files used by Quantum, describes files you may need to create in order to use
certain Quantum facilities, including the variables file, the levels file, the default options file,
the run definitions file, the merges file, the corrections file, the rim weighting parameters file,
and the C subroutine code file, aliases for Quantum statements, customized texts, and user-definable limits.
Chapter 2, Files created by Quantum, describes many of the files created during a run and
draws your attention to those of particular interest.
Chapter 3, Quantum Utilities, describes how to tidy up after a Quantum run and how to check
column and code usage.
Chapter 4, Data conversion programs, describes the q2cda and qv2cda programs that convert
tables into comma-delimited ASCII format, the qtspss and nqtspss programs that convert
Quantum data into SPSS format, and the qtsas and nqtsas programs that convert Quantum data
into SAS format.
Chapter 5, Preparing a study for Quanvert, describes the tasks you need to perform before
converting a Quantum spec and data file into a Quanvert database.
Chapter 6, Files for Quanvert users, describes files that are specific to either Quanvert Text
or Windows-based Quanvert.
Chapter 7, Creating and maintaining Quanvert databases, describes how to create and
maintain Quanvert databases.
Appendix B, Error messages, contains a list of compilation error messages with suggestions
as to why you may see them and how to solve the problems that cause them to appear.
Appendix D, Using the extended ASCII character set, explains how you can use Quantum with
data that contains characters in the extended ASCII character set.
Appendix E, ASCII to punch code conversion table, provides a table showing ASCII to punch
code conversions.
Appendix F, Will this job run on my machine, offers suggestions on how you can check
whether a particularly large job will run on your computer.
Comments
SPSS MR welcomes any comments you may have about this guide, and any suggestions for ways in
1 Introduction
Quantum is a highly sophisticated and very flexible computer language designed to simplify the
process of obtaining useful information from a set of questionnaires.
Quantum has been designed with market researchers in mind so its syntax and grammar are similar
to English. Nevertheless, it is still a computer language and as such should be used with precision
and understanding.
The four volumes of the Quantum User's Guide have three basic functions:
To provide you with enough information about how Quantum works to enable you to carry out
a specific task.
To help you work out what went wrong when errors occur or when your output is not what you
expected.
Generate tables (in different languages, provided that the translated texts exist).
Any Quantum run may perform as many or as few of these tasks as you like, but for each run the
basic format is the same.
It may be entered directly via a terminal by a telephone interviewer using Quancept CATI.
It may be collected over the World Wide Web using software such as Quancept Web.
Next, the tasks to be performed are defined using the Quantum language.
Then, Quantum translates these tasks into instructions that the computer can understand.
Finally, the computer itself uses this program to run your job.
Quantum comprises two sections: an edit section and a tabulation section. The edit section
checks and validates the data, generates lists and reports, corrects data, produces new data files, and
recodes data and creates new variables. The tabulation section produces tables and performs
statistical calculations.
Quantum reads the records in the data file one at a time and passes them through the various parts
of the Quantum program. As long as there are records remaining in the data file, the loop of "read
a record, edit, tabulate" is repeated; once the last record has been processed, the tables are
ready for printing.
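The read, edit, tabulate cycle can be pictured with a short Python sketch. This is only an illustrative model and not Quantum code: the functions edit and tabulate, and the sample records, are hypothetical stand-ins for whatever your own edit and tabulation sections do.

```python
from collections import Counter

def edit(record):
    """Hypothetical edit step: accept only records that are not blank."""
    return record.strip() != ""

def tabulate(record, table):
    """Hypothetical tabulation step: count the code found in column 1."""
    table[record[0]] += 1

def run(records):
    """Pass the records through the edit and tabulation steps one at a time."""
    table = Counter()
    for record in records:               # read a record
        if edit(record):                 # edit
            tabulate(record, table)      # tabulate
    return table                         # complete once the last record is processed

tables = run(["1abc", "2abc", "", "1xyz"])
print(dict(tables))
```

The point of the sketch is the ordering: each record goes through both steps before the next record is read, and the counts are only complete after the final record.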
If errors occur at any point in a Quantum run, an error message is printed telling you what is wrong.
For details of the error messages that can occur, see appendix B, Error messages in the
Quantum User's Guide Volume 2.
where the file called edit contains editing instructions, the file called tabs contains statements
defining the tables required, and axes contains statements which define the individual rows and
columns each table is to have. The a; statement lists characteristics that all tables are to have,
although some of these characteristics can be overridden for individual tables or individual table
elements.
Edit statement
Quantum edit statements contain a Quantum keyword and other texts and numbers. Statements in
the edit section can generally start in any column, although comments and continuation characters
must start in column 1. A line may contain one or more statements, as long as each statement is
separated by a semicolon.
Edit statements may be preceded by a label number of up to five digits allowing them to be
referenced by other parts of the program, for example:
total = c56 + c57 + c58
if (total.gt.8) go to 100
require sp c(66,70)
100 write
Here we are adding the number in column 56 to those in columns 57 and 58 and saving the result
in a variable called total. If this value is greater than eight we go to statement 100, otherwise we
continue with the statement immediately after the if line.
This says that if column 24 contains a 1, then columns 25 to 30 must not be blank, otherwise, if
column 24 does not contain a 1, then columns 25 to 30 must all be blank.
More generalized checking facilities exist which enable you to produce frequency distributions of
numeric data (e.g., how many respondents have the number 201 in columns 13 to 15) or holecounts
(marginals) which show the broad pattern of coding across all columns in the data. Words
associated with these are list and count.
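To make the two kinds of check concrete, here is a minimal Python sketch of a holecount and a frequency distribution. It is a model of the idea only, not of how the list and count statements are implemented, and it assumes single-coded data in which each character of a record is one column. The function names holecount and frequency are hypothetical.

```python
from collections import Counter

def holecount(records, n_columns):
    """Return one Counter per column, tallying the codes seen in that column."""
    counts = [Counter() for _ in range(n_columns)]
    for record in records:
        padded = record.ljust(n_columns)[:n_columns]
        for col, code in enumerate(padded):
            if code != " ":              # blank columns contribute no codes
                counts[col][code] += 1
    return counts

def frequency(records, start, end):
    """Frequency distribution of the field in columns start..end (1-based, inclusive)."""
    return Counter(record[start - 1:end] for record in records)

records = ["201", "201", "305"]
hc = holecount(records, 3)
freq = frequency(records, 1, 3)
print(freq["201"])
```

Here freq answers "how many respondents have 201 in the field", while hc shows the pattern of codes column by column.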
to write out all records in which column 24 of card 2 contains a 5. The records are written to the
default print file, out2.
Incidentally, many of the statements mentioned in this section may be used for other purposes,
rather than just to deal with errors.
Tabulation statements
Tabulation statements tell Quantum which tables are required and how to create them. They consist
of a start letter or keyword to identify the type, and may be followed by other keywords, numbers
or text. They are used to define rows and columns (elements), the variables that are to be cross-tabulated (axes) and finally, the tables themselves.
There are also statements for weighting your data and for creating tables by manipulating the
contents of tables created previously in the current run or even in other runs.
/     Slash sign for division, or an abbreviation for "through" (i.e., 1/9 is 1 to 9 inclusive)
&
()
{}
>     Greater than sign for identifying tables for manipulation from other runs
%     Percent sign for introducing options on col, val, fld and bit statements
Where symbols have two meanings, the meaning required will become clear in the context in which
the symbol is employed.
The exception to this is text in tables, where the text is printed on the tables in the same case as you
write it in your Quantum program. Additionally, you must set up table text so that it fits on the
paper when you print your tables. Therefore, if you want the table title to be printed on two lines,
you must write it on two lines in your program.
Generally, spaces are allowed anywhere in a Quantum program except within Quantum keywords.
Blank lines in a program are ignored.
As we mentioned earlier, Quantum has separate edit and tabulation sections which may or may not
be in the same file. If your program contains an edit, it must precede the tabulation statements and
must be enclosed by the words ed and end, each on a separate line, thus:
ed
.
edit statements
.
end
Errors will occur if either of these words is missing. If there is no edit, these statements are not
needed.
3.3 Comments
Comment statements insert comments or information into the Quantum program. They do not
affect the way your program works because they are ignored when the program is run to produce
tables.
Comments are identified either by a capital C in column 1 or by a slash and an asterisk in columns
1 and 2 respectively (/*). If a comment needs more than one line, each line must start with the
appropriate notation; otherwise it will be assumed to require some sort of action.
/* This is the first comment
/* This is the second comment
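The two comment notations can be modeled in a few lines of Python. This is a sketch of the rule as stated above, not of the Quantum compiler itself; the function names is_comment and strip_comments and the sample program lines are hypothetical.

```python
def is_comment(line):
    """Return True when a line is a comment under the two rules described above."""
    # Rule 1: a capital C in column 1.  Rule 2: /* in columns 1 and 2.
    return line.startswith("C") or line.startswith("/*")

def strip_comments(lines):
    """Drop comment lines, keeping every other line unchanged."""
    return [line for line in lines if not is_comment(line)]

program = [
    "/* Reset the counter for each record",
    "t1 = 0",
    "C Another comment",
    "  /* not in column 1, so not treated as a comment here",
]
remaining = strip_comments(program)
print(remaining)
```

Note that the last line is kept: because its notation does not start in column 1, this model treats it as a statement requiring action, as the text above describes.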
It is a good idea to put comment statements in your program in case someone else has to take over
your job or alternatively to remind yourself what you are doing and why. For example:
/* Edam is 1 if Edam mentioned at Q1, Q3 or Q6
if (c110'1'.or.c114'1'.or.c121'1') edam=1
3.4 Continuation
Any Quantum statement may be continued over several lines by starting the second and subsequent
lines with + or ++, depending on where the statement is split.
A single plus sign is used when the statement is split between keywords. This assumes that a
semicolon appears at the end of each continued line, whether or not there is actually one there. Take
the statement:
if (c132'12'.and.t5.gt.50) write $t5 incorrect$; else; write ofil
This could be split in three places with a single plus sign for a continuation:
if (c132'12'.and.t5.gt.50)
+write $t5 incorrect$
+else
+write ofil
We have omitted the semicolons at the end of each line, but it would not be wrong to leave them in.
The double-plus sign introduces an internal continuation of a long statement over several lines.
Statements may be split between lexics; that is, between keywords, conditions, lists of numbers,
and so on, but not in the middle of any of these. In our previous example, we could write:
if (c132'12'.and.
++t5.gt.50) write $t5 incorrect$; else; write ofil
A double plus is needed here because we have split an expression in which one parameter is
dependent on the other. The statement on the first line means nothing on its own, neither does the
second line, hence the ++. We could equally well have split the expression before the .and. or
before or after the .gt.. To split it between t and 5, or in any other similar place, is incorrect because
the two characters by themselves do not mean anything.
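The difference between the two continuation markers can be sketched in Python as a function that joins continued lines back into single statements. This is a model of the rules described above, under the assumption that + implies a semicolon at the end of the previous line and ++ simply glues the text on; the function name join_continuations and the sample statements are hypothetical.

```python
def join_continuations(lines):
    """Rebuild whole statements from lines continued with + or ++."""
    statements = []
    for line in lines:
        if line.startswith("++") and statements:
            # Internal continuation: append the text directly to the previous line.
            statements[-1] += line[2:]
        elif line.startswith("+") and statements:
            # Continuation between keywords: a semicolon is assumed at the end
            # of the previous line, whether or not one was typed.
            previous = statements[-1].rstrip()
            if not previous.endswith(";"):
                previous += ";"
            statements[-1] = previous + " " + line[1:]
        else:
            statements.append(line)
    return statements

single = join_continuations(["if (t5.gt.50)", "+write ofil"])
double = join_continuations(["if (t1.gt.5.and.", "++t5.gt.50) write ofil"])
print(single)
print(double)
```

The check for ++ comes first because every line that starts with ++ also starts with +.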
When the Quantum compiler is checking your program and finds an error, it flags the incorrect
statement with an explanatory error message and continues with the next statement. If any of these
errors are fatal (that is, Quantum cannot convert your statement into C code), the run will be
terminated.
Sometimes Quantum finds statements which are not quite correct, but which it can still convert into
C. In these cases the compiler flags the statement with the message "Possible syntax error" and
continues as if nothing were wrong. You can choose to have this type of error treated as fatal and
have the run terminated at the end of the compilation by entering the statement check_ (note the
underscore at the end) at the start of your edit.
The statement nocheck_ causes possible syntax errors to be flagged but ignored, and this is the
default.
When the Quantum compiler finds errors in your program, it copies them to the compilation listing
file. It also displays the first twenty messages on your screen. You may increase or decrease this
number by placing the statement:
errprint n
at the top of your main program file, before the edit and tabulation sections.
n is the number of messages you want to see on your screen: it must be an integer. Thus:
errprint 5
prints the first five error messages on the screen and in the listing file, and then any others only in
the file.
4 Basic elements
There are three basic elements in Quantum:
Data constants.
Integer numbers.
Real numbers.
Data variables       store    data constants
Integer variables    store    whole numbers
Real variables       store    real numbers
An individual constant is one or more of the codes 1234567890-& or blank. The - is sometimes
referred to as the 11 or X punch, and & is sometimes called the 12, V or Y punch. Each code
represents one answer to a question. For example, let's take the question "What is your favorite
color?", which has the response list:
Red       1
Yellow    2
Blue      3
Green     4
Black     5
White     6
coded into one column. If my favorite color is green, this will appear in the data file as a 4 in the
appropriate column, just as if your favorite color is red, there will be a 1 in that column.
To refer to these answers inside your Quantum program (maybe we only want our table to include
those respondents whose favorite color is blue), type in the code enclosed in single quotes:
'3'
You will also have to tell Quantum which column to look in.
To find out how to refer to columns, see Data variables later in this chapter.
Several codes may be combined in the same column and are called multicodes. Throughout this
manual when we talk of multicodes or multicoding we mean two or more codes in the same
column. Suppose the next question asks me to choose three colors from the same list; I pick yellow,
black and white. If these answers were all coded in the same column (a multicoded column), we
would refer to them by typing:
'256'
or
'526'
or
'652'
or any other variation of those three codes. Quantum does not care what order you enter the codes
in.
If you have a series of consecutive codes in the order &-01234567890-&, you may either type each
code separately or you may enter the first and last codes separated by a slash (/) meaning "through",
as shown below:
'1/7'    means    '1234567'
'&/4'    means    '&-01234'
'&/9'    means    '&-0123456789'
'1/&'    means    '1234567890-&'
As you can see, the last two examples mean exactly the same thing. However, the notations '0/&'
and '0-&' are not the same: '0/&' means '01234567890-&' whereas '0-&' is 0, - and &
only.
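The "through" notation can be modeled with a short Python function that expands a code list into the set of codes it names. This is a sketch, not Quantum's own parser. It assumes that the full code order is &-01234567890-& (with - as the 11 punch), that a range runs from the first occurrence of its start code to the next occurrence of its end code, and that spaces are ignored. The names CODE_ORDER and expand_codes are hypothetical.

```python
CODE_ORDER = "&-01234567890-&"   # assumed ordering of the twelve codes

def expand_codes(spec):
    """Expand a code list such as '1/7' or '256' into the set of codes it names."""
    codes = set()
    i = 0
    while i < len(spec):
        if spec[i] == " ":
            i += 1                       # spaces inside the quotes are ignored
        elif i + 2 < len(spec) and spec[i + 1] == "/":
            start = CODE_ORDER.index(spec[i])
            end = CODE_ORDER.index(spec[i + 2], start + 1)
            codes.update(CODE_ORDER[start:end + 1])
            i += 3
        else:
            codes.add(spec[i])
            i += 1
    return codes

print(sorted(expand_codes("1/7")))
print(expand_codes("256") == expand_codes("652"))
```

Because the result is a set, the order in which codes are typed makes no difference, which mirrors the rule for multicodes.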
Some combinations of codes represent ASCII characters; that is, they represent characters which
you can type on your screen:
'&1'    is the equivalent of    'A'
'&2'    is the equivalent of    'B'
The only time you would use letters rather than codes (that is, 'A' rather than '&1') is when the
questionnaire tells you that a column should contain a letter.
For further information, see appendix E, ASCII to punch code conversion table in the
Quantum User's Guide Volume 4.
Sometimes we may need to write a notation for "no codes"; for instance, if my favorite color does
not appear in the list of choices. To do this, we write ' ' (that is, a blank enclosed in single quotes).
The notation ' ' is a special case since blank is not really a code. If you type a blank inside
single quotes with any other characters, Quantum will follow its usual rule of ignoring spaces.
This means that references of the form '1 2' are read as '12'.
When data constants are single-coded or the multicodes correspond to ASCII characters (for
example, 'A', 'B') they may be strung together. Strings of data constants are sometimes called
literals or column fields. Strings are enclosed in dollar signs, with the component single codes
losing their single quotes. For example:
$12345$
$ABC$
$916 7&$
The first string is five columns long with 1 in the first column, 2 in the second, 3 in the third, and
so on. The third string is six columns wide with the fourth column being blank.
Times when you might use strings are:
When the answers to a question are represented by codes of more than 1 digit. For example, in
a car ownership survey the car make and model owned may be represented by a 3-digit code.
To pick up respondents owning a particular type of car you would need to check whether the
relevant columns contained the code for that car. For instance, to look for owners of Ford
Escorts you might ask Quantum to search for the string $132$ in a particular field of columns.
4.2 Numbers
Quantum can print figures in tables with up to ten characters; figures that require more than ten
characters are printed as asterisks. For example, 12345678.12 appears as 12345678.1 when
displayed with one decimal place, but as asterisks (*) when displayed with two decimal places.
However, you can use the scale= option to apply a scaling factor before printing.
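The ten-character limit can be illustrated with a small Python sketch. It models only the rule described above; the function name format_cell is hypothetical, and printing exactly ten asterisks for an oversized figure is an assumption.

```python
def format_cell(value, decimals, width=10):
    """Format a figure for a table cell, or return asterisks if it will not fit."""
    text = f"{value:.{decimals}f}"
    if len(text) > width:
        return "*" * width               # assumed: one asterisk per character position
    return text

one_place = format_cell(12345678.12, 1)
two_places = format_cell(12345678.12, 2)
print(one_place)
print(two_places)
```

With one decimal place the figure needs exactly ten characters and fits; with two it needs eleven and is replaced by asterisks.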
Whole numbers
Quantum can deal with whole numbers in the range -1,073,741,824 to +1,073,741,823 with an
accuracy of up to six significant figures. Numbers with more than six significant figures are
rounded up or down depending on the value of the remaining figures.
For some examples of how Quantum rounds figures up and down, see Real numbers later in
this chapter.
Your data will contain whole numbers whenever there are questions requiring numeric responses:
for example, the question "How many children do you have?" can only be answered with a whole
number. If the respondent has three children, the number 3 will appear in the appropriate column
in his or her data record, whereas a respondent with five children will have a 5 in that column
instead.
Whole numbers are also used if you want to perform arithmetic calculations during the run, for
instance to multiply a field by a number.
Real numbers
Real numbers are numbers containing decimal points. To be valid, they must have at least one digit
on either side of the decimal point:
0.1 and 1.0 are correct
.1 and 1. are not
Quantum deals with real numbers of any size with accuracy up to six significant figures. Numbers
with more than six significant figures have the sixth figure rounded up or down depending on the
value of the remaining figures.
…    is rounded to    96.8253
…    is rounded to    189462.0
…    is rounded to    123457.0
…    is rounded to    123456.0
…    is rounded to    123456.0
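Rounding to six significant figures can be reproduced in Python as a rough check on what to expect. This is a sketch of the arithmetic only, not of Quantum's internal number handling; the function name round_sig and the input values are made up for illustration.

```python
def round_sig(value, figures=6):
    """Round a number to the given count of significant figures."""
    # The 'g' format rounds to a number of significant figures rather than
    # to a number of decimal places.
    return float(f"{value:.{figures}g}")

print(round_sig(96.82534))
print(round_sig(123456.7))
print(round_sig(123456.3))
```

In each case the sixth figure moves up or stays put according to the figures that follow it.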
By default, Quantum calculates cell values in single precision. However, when working with very
large numbers, you can produce more accurate results by using the double precision option (dp) on
the a statement.
For further details on double precision, see chapter 2, The hierarchy of the tabulation section
in the Quantum User's Guide Volume 2.
Data variables
Quick Reference
To refer to a single data variable in the C array, type:
cnumber
To refer to a field of data variables in the C array, type:
c(start_pos,end_pos)
To define a data variable, type:
data var_name sizes
before the edit section. To refer to it, use the same notation as above but replace the c with the
variable's name.
At the start of every job, Quantum provides you with an array of 1,000 data cells called C. This
array is sometimes referred to as the C matrix. The individual cells are called C-variables. Each
C-variable stores one column of data. Quantum reads data from your data file into this array: we
will discuss exactly how it does this in chapter 6, How Quantum reads data. For the time being,
let's say we have a very small questionnaire which uses 43 columns to store the data. Quantum will
read the data for each respondent into cells 1 to 43 of the C array, one respondent at a time. The
codes from column 1 of the data are copied into cell 1 of the C array, the codes from column 2 of
the data are copied into cell 2, and so on. When Quantum has finished with that respondent's data
it clears out the cells in the C array and reads the data for the next respondent, placing it in cells 1
to 43 of the array.
We can access this data by defining the columns whose contents we wish to inspect or change. Let's
take the questions about color that we mentioned earlier. The printed questionnaire tells us that the
respondent's favorite color will be coded into column 15. To look at this column we would write:
c15
or
c(15)
The C may be in uppercase or lowercase, and the parentheses around the column number are
optional.
To refer to column 43 we would write:
c43
or
c(43)
Now suppose we want to look at a field of columns such as the questionnaire serial number in
columns 1 to 5. All we have to do is tell Quantum that the serial number is in a field starting in
column 1 and ending in column 5, as follows:
c(1,5)
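The way the C array is filled and addressed can be modeled in Python. This is a sketch only: it assumes single-coded data with one character per column, and the class name CArray and its methods load and c are hypothetical, chosen to echo the c(start,end) notation.

```python
class CArray:
    """A 1-based array of column cells, cleared and refilled for each record."""

    def __init__(self, size=1000):
        self.size = size
        self.cells = [" "] * (size + 1)  # cell 0 is unused so columns count from 1

    def load(self, record):
        """Clear every cell, then copy the record in, column by column."""
        self.cells = [" "] * (self.size + 1)
        for col, code in enumerate(record[:self.size], start=1):
            self.cells[col] = code

    def c(self, start, end=None):
        """Return the contents of one column, or of a field of columns."""
        end = start if end is None else end
        return "".join(self.cells[start:end + 1])

array = CArray()
array.load("00123" + " " * 9 + "4")     # serial number in 1-5, code 4 in column 15
serial = array.c(1, 5)
color = array.c(15)
array.load("00124")                     # next respondent: column 15 is blank again
print(serial, color, repr(array.c(15)))
```

Loading the second record shows the clearing step: nothing from the previous respondent is left behind in column 15.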
before the edit section. Where the s at the end of each statement causes Quantum to recognize that,
for example, safe1 is the same as safe(1), just as it knows that c15 and c(15) refer to the same
column of data. If you created the arrays without the s, then Quantum would not recognize safe1
as being the same as safe(1).
Data variables which you create remain blank until you copy data into them. If the data about visits
to Sainsbury's is stored in columns 30 to 45, then we might copy this into cells 30 to 45 of the array
called sains. If we then want to use this data we can write statements which refer to sains30 to
sains45. Unless you subsequently change the data in sains(30,45), each time you refer to one of
those cells it is exactly the same as referring to c30, c45, and so on, in the C array, and to columns
30, 45, and so on, in the data file.
In this simple example, there is not much to be gained (apart from an immediate improvement in
readability) by using your own data variables. However, when you have many columns of data per
respondent, or a complicated Quantum program, named data variables can be very useful for
improving readability and also for providing simple yet powerful facilities for data manipulation.
Here are some further examples:
c80
c(130,145)
total(17)
visits(134,136)    means cells 134, 135 and 136 of the array called visits
c(1,80)
To find out more about creating and using named data variables, see chapter 14, Creating new
variables.
Integer variables
Quick Reference
To define an integer variable, type:
int var_name sizes
To refer to an integer variable, type:
name[cell_number]
Integer variables store whole numbers. Strings of integer variables are called integer arrays, and
each cell in the array may store any whole number from -1,073,741,824 to +1,073,741,823.
At the start of each run, Quantum provides an array of 200 integer variables called T. The first cell
in this array is the integer variable t1 which may store any value within the given range; the second
cell in the array is the integer variable called t2 which may also store any value within the given
range.
To illustrate the difference between a data variable and an integer variable, let's suppose that our
data contains the value of the respondent's car to the nearest whole pound. If the value is £6,000,
this will take up 4 columns in the data (assuming that we are only concerned with the digits); that
is, four data variables, the first of which will contain the 6, and the other three of which will all
contain zeros.
If we placed this same value in an integer variable, we would only need one variable to store the
whole value because each variable can store values in the range -1,073,741,824 to +1,073,741,823.
We have already mentioned that Quantum provides an integer array of 200 integer variables. You
may create your own arrays using statements similar to those shown above for data variables.
Suppose you have a household survey in which you have collected the value of each car that the
family owns. You want to set up an integer array in which to store each value, so you write:
int carval 10s
This creates an array called carval which contains ten separate integer variables called carval1 to
carval10. Notice that we have followed the array size with the letter s so that we can omit the
parentheses from the individual variable names. We can then copy the value of the first car into
carval1, the value of the second car into carval2, and so on. If a particular household owns three
cars valued at £6,000, £2,500 and £500, then carval1 would have a value of 6,000, carval2 would
be 2,500 and carval3 would be 500.
If you create your own integer variables, it is recommended that you name them with names that
reflect their purpose in the run, as we have done in our example.
To find out more about creating and using named integer variables, see chapter 14, Creating
new variables.
All integer variables have a value of zero at the start of a run, and they are not reset between
respondents. If you want your integer variables to store information about the current record only,
you must include statements in the edit to reset those variables to zero when a new record is read.
For example, we might write:
carval1 = 0
at the start of the edit to reset the first integer variable of the carval array to zero.
You can also reset an integer variable to zero by using a clear statement.
For further information about the clear statement, see section 8.7, Clearing variables.
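The effect of not resetting integer variables between respondents can be pictured with a small Python sketch. This is a simulation under stated assumptions, not Quantum code: the function process and the two sample households are hypothetical.

```python
def process(records, reset_each_record):
    """Copy each record's car values into a 10-cell array, as carval1..carval10 would be used."""
    carval = [0] * 10              # all cells are zero at the start of the run
    seen = []
    for record in records:
        if reset_each_record:
            carval = [0] * 10      # the edit resets the variables for each new record
        for i, value in enumerate(record):
            carval[i] = value
        seen.append(carval[:3])    # snapshot of the first three cells
    return seen


households = [[6000, 2500, 500], [4000]]
print(process(households, reset_each_record=False))
print(process(households, reset_each_record=True))
```

Without the reset, the second household appears to own the first household's second and third cars; with the reset, those cells are zero.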
T-variables with non-zero values are printed out at the end of the run.
Real variables
Quick Reference
To define a real variable, type:
real var_name sizes
To refer to a real variable, type:
name[cell_number]
You may define real variables and arrays to store real numbers with accuracy up to six significant
figures. Values with more than six significant figures have the sixth figure rounded up or down
according to the value of the extra figures.
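To give a feel for six-significant-figure accuracy, here is a minimal Python sketch. It only imitates the rounding rule described above; how Quantum stores real values internally is not stated in this guide, and the function name to_six_figures is hypothetical.

```python
def to_six_figures(value: float) -> float:
    """Round a value to six significant figures."""
    return float(f"{value:.6g}")


print(to_six_figures(5 / 3))         # 1.66667
print(to_six_figures(200.0 / 70))    # 2.85714
print(to_six_figures(123456.78))     # 123457.0
```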
For further information about real values, see Real numbers earlier in this chapter.
As with integer variables, the names of real variables should give some clue to the type of
information they contain. Real arrays are created by statements of the form:
real liters 5s
This example creates a real array called liters which has five real variables named liters1 to liters5.
It can store five real values, the first in liters1 and the fifth in liters5.
To find out more about creating and using named real variables, see chapter 14, Creating new
variables.
Quantum also provides a set of 100 real variables named X which you may use.
All real variables start with a value of 0.0 and are not reset to zero between respondents.
As an example, let's say that the data contains information on how long, on average, each person
in the household spent watching television during a given week. We want to manipulate these
figures so we create an array of real variables in which to store the average viewing figures:
real tvwatch 8s
This provides room for up to eight people's figures. If our household contains four people with
viewing averages of 20.8 hours, 15.75 hours, 9.75 hours and 10.0 hours, then tvwatch1 will have a
value of 20.8, tvwatch2 will have a value of 15.75, tvwatch3 will be 9.75 and tvwatch4 will be 10.0
hours. The rest of the variables in the array have values of 0.0.
Real variables with non-zero values at the end of the run are not printed out automatically. If you
want to see these values, you will need to write them using a report statement.
For further information about report, see section 7.3, Writing to a report file.
As we have already said, data from the questionnaire is read into columns for use during the run.
When the data contains real numbers you will have to tell Quantum that the dot is to be treated as
a decimal point rather than as a multicode representing a number of different answers. The way to
do this is to refer to the field as cx:
cx(15,20)
cx(131,135)
Here we have two fields containing real numbers: the first is six columns wide including the
decimal place, which means that the number itself contains five digits, whereas the second is only
five columns wide with four digits. Notice that there is no need to tell Quantum where the decimal
point is.
4.4 Subscription
As we have shown above, you may refer to specific variables in integer and real arrays and cells or
columns in data arrays by naming their position in the array.
For example:
c1
t5
time3
seg(2)
Variables within an array may also be referred to using any arithmetic expression. In this case,
parentheses must be used. For example:
c(t1)
The column number depends on the value of t1. If t1 has a value of 10, then the
variable is c10; if t1 is 67, the variable is c67.
c(t4,t5)
The field delimiters depend on the values of t4 and t5. If t4 has a value of 12 and
t5 has a value of 19, the column field referred to is c(12,19).
t(c4)
The variable number depends on the value in c4. If c4 contains a single code in
the range 1 to 9, the integer variable will be one of t1 to t9 depending on the exact
value in c4. If c4 is multicoded, then the result is nonsense.
time(c4*23)
The variable number is the result of multiplying the value in c4 by 23. As in the
previous example, c4 must be single-coded in the range 1 to 9 for this example
to make sense. Thus, if c4 contains just a 4, the value of the expression is 92 so
the variable referred to is time92.
When variables are referenced in this way, the value of the expression must be positive. The
expression c(t1-5) is acceptable as long as t1 is greater than 5. If the expression has a zero or negative
value Quantum will issue an array dimension error when it comes to read the data during the
datapass. Also, if the variable refers to columns, the value of the subscript must not exceed 32,767.
These are called subscripted variables and they greatly increase the flexibility with which you can
write your edit.
Subscription may be used in repetitive processes to save you writing the same thing over and
over again.
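The idea of a subscripted variable can be sketched in Python as a 1-based lookup whose position is computed at run time. This is an illustration only: the helper subscript, the array sizes and the sample contents are assumptions made for the sketch.

```python
def subscript(array, index):
    """Return cell `index` of a 1-based array, as c(expr) or t(expr) would."""
    if index <= 0 or index > len(array):
        raise IndexError(f"array dimension error: subscript {index}")
    return array[index - 1]


t = [0] * 200        # the T array of 200 integer variables
t[0] = 10            # t1 = 10
c = [" "] * 1000     # a hypothetical C array; the size is chosen for illustration
c[9] = "7"           # column 10 holds a 7

print(subscript(c, subscript(t, 1)))       # c(t1) refers to c10
print(subscript(c, subscript(t, 1) - 5))   # c(t1-5) refers to c5, which is blank here
```

A zero or negative subscript raises an error in the sketch, mirroring the array dimension error described above.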
5 Expressions
Quantum recognizes two types of expression: arithmetic and logical. Arithmetic expressions are
used to produce numeric values; logical expressions, when evaluated, produce a value of true
or false.
+----2----+
67
+----2----+
6 7
The same applies to multicoded columns. If you use a multicoded column as part of an arithmetic
expression, the multicoded column will be ignored. The exception to this is a multicode of a digit
and a minus sign which creates a negative number: a minus sign anywhere in a numeric field
negates the value in the field as a whole, not just the number it is multicoded with. For example:
----+----1----+----2
5 3778
9
0
is 5378

2---+----3----+----4
12-4
3
is -1234

4---+----5----+----6
83-
is -83
+ (addition)
- (subtraction)
* (multiplication)
/ (division)
Expressions may contain more than one of these operators, for instance:
t5 + c(134,136) / otot
c(150,152) * 10 + 2.5
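Without parentheses, the division in the first expression is carried out before the addition. Writing
it as (t5 + c(134,136)) / otot instead adds the values of t5 and c(134,136) first and then divides
that by otot. Let's substitute numbers and compare the results. If t5=10, otot=5 and the value in
c(134,136) is 125, the two versions of the expression would read as follows: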
10 + 125 / 5 = 35
and
(10 + 125) / 5 = 27
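The two results can be checked with ordinary operator precedence in Python. This is a sketch only; the variable name field stands in for the value of c(134,136), and // is used for whole-number division.

```python
t5, field, otot = 10, 125, 5

without_parentheses = t5 + field // otot     # division first, then addition
with_parentheses = (t5 + field) // otot      # addition first, then division

print(without_parentheses)   # 35
print(with_parentheses)      # 27
```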
Where two integer expressions are combined, the result is integer (any decimal places are ignored),
but if an expression contains a real then the result will be real. Therefore, if t1=5 and t2=3, then:
t1 + 4 = 9
t1 + 4.0 = 9.0
t1 * t2 = 15
t1 / t2 = 1
t1 * 1.0 = 5.0
t1 * 1.0 / t2 = 1.66667
If you use parentheses in expressions which contain both integer and real variables, you need to
take extra care to ensure that your expression is producing the correct results. Let's look at an
example to illustrate how an expression can look correct but can still produce unexpected results.
If we assume that t40=2 and t41=70, the expression:
t40 * 100.0 / t41
yields a result of 2.85714 (that is, 200.0/70). The final value will be 2.85714 if the result is saved
in a real variable, or 2 if it is saved in an integer variable.
If we use parentheses:
(t40 / t41) * 100.0
the result is 0.0 (or 0 if saved in an integer variable). The reason for this is as follows. Because
Quantum evaluates expressions in parentheses before it deals with the rest of the expression, it
treats that expression as integer arithmetic. The rules for integer arithmetic dictate that real results
are truncated at the decimal point, so the true result of 0.0285714 becomes 0. Any multiplication
involving zero is always zero, so the final result is zero.
If you find that a run gives unexpected zero results, try looking for expressions of this type and
checking whether the parenthesized part of the expression has been truncated because the integer
division results in a decimal number.
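The same pitfall can be reproduced in Python as a sketch. The helper int_div is hypothetical; it imitates integer arithmetic in which the decimal places are dropped.

```python
def int_div(a: int, b: int) -> int:
    """Integer division that drops the decimal places (truncates toward zero)."""
    return int(a / b)


t40, t41 = 2, 70

real_first = t40 * 100.0 / t41               # real arithmetic throughout
parenthesized = int_div(t40, t41) * 100.0    # integer division happens first

print(real_first)        # about 2.85714
print(int(real_first))   # 2 if saved in an integer variable
print(parenthesized)     # 0.0
```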
The function numb is an arithmetic expression which counts the number of codes in a column or
list of columns. Its format is:
numb(cn1,cn2, ... cnn)
where cn1 to cnn are the columns whose codes are to be counted. So, if we wanted to count the
number of codes in columns 132 to 135 we would type:
numb(c132,c133,c134,c135)
Notice that even though the columns are consecutive, each one is entered separately, with each
column number preceded by a c. It is incorrect to define only the start and end columns of a field
when using numb. Therefore it is wrong to write, for example, numb(c(132,135)); if you write a
statement such as this, Quantum will flag it as an error.
Sometimes you will only be interested in certain codes, for instance you may want to know how
many 1, 2 or 3 codes there are in a group of columns. In this case the function is entered as:
numb(cn'p1',cn'p2', ... cnn'pn')
where p1 to pn are the codes to be counted. Only the named codes are counted; any others
appearing in the columns are ignored. Let's say that, in our data on card 1, column 115 is
multicoded 1, 6, 8 and 9, column 121 contains codes 2 through 6, and column 157 contains codes
1 through 7,
and we want to count the number of codes in column 115 and also the number of codes in the range
5/8 in columns 121 and 157. The expression would be entered as:
numb(c115,c121'5/8',c157'5/8')
When Quantum checks these columns and codes, it will tell us that there are 9 codes in these
columns which are within the given ranges. These codes are all four codes in column 115 (we did
not specify which codes to count in that column), codes 5 and 6 in column 121 (codes 2 to 4 are
outside the given range), and codes 5 to 7 in column 157 (codes 1 to 4 are outside the given range).
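The count of 9 can be reproduced with a short Python sketch that treats each column as a set of codes. This is an illustration, not Quantum's implementation: the function numb below is a hypothetical stand-in, and the contents assumed for column 115 (codes 1, 6, 8 and 9) are inferred from the description, which only states that the column holds four codes.

```python
def numb(*columns):
    """Count codes across columns.

    Each argument is either a set of codes (count every code in the column)
    or a (codes, wanted) pair (count only the codes that are in `wanted`).
    """
    total = 0
    for col in columns:
        if isinstance(col, tuple):
            codes, wanted = col
            total += len(codes & wanted)
        else:
            total += len(col)
    return total


c115 = {"1", "6", "8", "9"}    # assumed contents: four codes
c121 = set("23456")            # codes 2 through 6
c157 = set("1234567")          # codes 1 through 7
wanted = set("5678")           # the range 5/8

print(numb(c115, (c121, wanted), (c157, wanted)))   # 9
```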
Quantum can generate random numbers automatically with the random function:
random(n)
where n is the maximum value the random number may take. So, to generate a random number in
the range 1 to 100, the expression would read:
random(100)
The number produced may be saved for later use in an integer variable or column, thus:
rnum=random(32)
c(110,112)=random(156)
When using random with columns, always make sure that the number of columns allocated to the
number is sufficient to store the highest possible number that can be generated. In our example, we
need three columns in order to store numbers up to 156.
random generates a different random value each time it is run, even on reruns of the same job.
If you want to retain the same set of random values between runs, copy them into the data the
first time you run the job.
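A rough Python analogue of these two points is sketched below. It assumes, as the text suggests, that the result is a whole number from 1 to n; the function names are hypothetical.

```python
import random


def random_up_to(n: int) -> int:
    """A random whole number from 1 to n inclusive (assumed to mirror random(n))."""
    return random.randint(1, n)


def columns_for(n: int) -> int:
    """Columns needed to store the largest value that random(n) can produce."""
    return len(str(n))


value = random_up_to(156)
print(value)               # differs from run to run
print(columns_for(156))    # 3 columns, as for c(110,112)
```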
Comparing values
Quick Reference
To compare the values of two arithmetic expressions, type:
arith_exp log_operator arith_exp
where log_operator is one of the operators .eq., .gt., .ge., .lt., .le. or .ne.
Values are compared when you need to check whether an expression has a given value; for
example, did the respondent buy more than 10 pints of milk?
Values are compared by placing arithmetic expressions on either side of one of the following
operators:
.eq.   Equal to
.gt.   Greater than
.ge.   Greater than or equal to
.lt.   Less than
.le.   Less than or equal to
.ne.   Not equal to / unequal to
If the number of pints of milk that the respondent bought is stored in columns 114 and 115, the
expression to check whether he bought more than ten pints would be:
c(114,115) .gt. 10
If the number in these columns is greater than ten the expression is true, otherwise it is false.
In chapter 4, Basic elements, we said that integer variables may take numeric values or the logical
values true and false depending upon whether or not the value is zero. To check whether the
respondent bought any packets of frozen vegetables, we can either write:
fveg .gt. 0
to check the numeric value of the variable fveg, or we can simply say:
fveg
to check whether the logical value of fveg is true. To check whether fveg is false (that is, zero), we
would write:
.not. fveg
For further information about .not., see Combining logical expressions later in this chapter.
Data variables
Quick Reference
To test whether a data variable contains at least one of a list of codes, type:
var_name'codes'
To test whether a data variable contains none of the listed codes, type:
var_namen'codes'
To test whether a data variable contains exactly the given codes and nothing else, type:
var_name = 'codes'
To test whether a data variable contains exactly the given letter and nothing else, type:
var_name = 'letter'
To test whether two data variables contain identical codes, type:
var_name1 = var_name2
To test whether a data variable contains codes other than those listed, type:
var_nameu'codes'
To test whether two data variables do not contain identical codes, type:
var_name1uvar_name2
To check whether a column or data variable contains certain codes, place the codes, enclosed in
single quotes, immediately after the name of the column or data variable. For example:
c1'1'
c156'23'
brand'5'
The expression:
cn'p'
checks whether a column (n) contains a certain code or codes (p). The expression is true as long as
column n contains at least one of the given codes. It does not matter if there are other codes present
since these are ignored.
For example, to check whether column 6 contains any of the codes 1 through 4 we would type:
c6'1/4'
The expression is true if c6 contains any of the codes 1, 2, 3 or 4 or any combination of those codes,
regardless of what other codes may also be present. For instance:
[Data diagrams not reproduced: the original shows example contents of column 6. The expression is
false only when column 6 contains none of the codes 1 through 4.]
In our original example we chose the codes 1 through 4. You can, of course, use any codes you like
and they may be entered in any order.
The opposite of cn'p' is:
cnN'p'
which checks that a column does not contain the given code or codes. The expression is true as long
as the column does not contain any of the listed codes.
For example:
c478n'5/7&'
is true as long as column 478 does not contain a 5, 6, 7 or & or any combination of them. A
multicode of '189' returns the logical value true, because it does not contain any of the codes
'5/7&', whereas a multicode of '1589' makes the expression false because it contains a 5.
The = operator is used to check that the contents of a column are identical to either the given
codes or the given letters.
The expression:
c312='1/46'
is true as long as c312 contains all of the codes 1 through 4 and 6, and nothing else. The expression:
c142=' '
checks that column 142 is blank. The equals sign is optional when checking for blanks, so we could
simply write:
c142' '
Similarly, an expression such as c124='A' checks that column 124 contains the letter A and
nothing else.
The = operator may also be used to compare the contents of two data variables. For example:
c56=c79
checks whether c56 contains exactly the same codes as c79. If so, the expression is true, otherwise
it is false. For example, if the two columns hold the same codes except that column 79 also
contains a 9 (the original data diagram is not reproduced here), the expression yields the value
false because column 79 contains a 9 when column 56 does not.
If you have defined your own data variables, you could write a statement of the form:
brand1=c79
to check whether the data variable called brand1 contains the same codes as c79.
The opposite of = is U (unequal):
cnU'p'
This checks whether column n contains something other than just the code p. Suppose we have
two sets of data: in the first, column 34 is multicoded 1, 4 and 7; in the second, column 34 is
multicoded 1, 5 and 9. We write:
c34u'7'
The expression is true for both sets of data. In the first example, the 7 is multicoded with a 1 and
a 4, while in the second example, column 34 does not contain a 7 at all. The only time this
expression is false is when column 34 contains a 7 and nothing else.
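The four kinds of code test described in this section can be modelled in Python by treating a column as a set of codes. This is a sketch under that assumption; the function names are hypothetical, and the Quantum forms shown in the comments are the ones discussed above.

```python
def contains_any(column, codes):     # like c6'1/4'
    """True if the column holds at least one of the listed codes."""
    return bool(column & codes)


def contains_none(column, codes):    # like c478n'5/7&'
    """True if the column holds none of the listed codes."""
    return not (column & codes)


def equals(column, codes):           # like c312='1/46'
    """True if the column holds exactly the listed codes and nothing else."""
    return column == codes


def unequal(column, codes):          # like c34u'7'
    """True if the column holds anything other than exactly the listed codes."""
    return column != codes


print(contains_any(set("168&"), set("1234")))    # True: a 1 is present
print(contains_none(set("1589"), set("567&")))   # False: a 5 is present
print(unequal(set("147"), {"7"}))                # True: the 7 is multicoded
print(unequal({"7"}, {"7"}))                     # False: a 7 and nothing else
```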
The contents of data fields must be enclosed in dollar signs with each code in the string referring
to a separate column in the field. For instance, to check whether columns 47 to 50 contain the codes
, 6, 4 and 9 respectively we would type:
c(47,50)=$649$
Just as you can test whether a field contains a given list of codes, you can also check
whether a field contains a given list of letters. For example, to check whether columns 55 to 57
contained the string AAA, we would type:
c(55,57)=$AAA$
All our examples have used columns, but the same rules apply to data variables that you define
yourself. For example:
rating(1,4)=$1234$
checks whether the field rating1 to rating4 contains the codes 1, 2, 3 and 4 in that order. That is, it
checks whether rating1 contains a 1, whether rating2 contains a 2, and so on.
When checking the contents of fields in this way, make sure that you enter as many columns as
there are codes in the string (that is, five codes require five columns). The exception to this rule
occurs when you are checking for blanks when the expression may be shortened to:
c(50,80)=$ $
This type of statement may also be used to compare two fields, to check whether the second field
contains exactly the same codes as the first field. When you compare one field with another,
Quantum takes each column in the first field in turn and looks to see whether the corresponding
column in the second field contains exactly the same codes. For example, if the first column of the
first field contains a code 1 and a code 2 and nothing else, then Quantum will check whether the
first column of the second field also contains a code 1 and a code 2 and nothing else. If all columns
of the second field are identical to their counterparts in the first field, then the expression is true;
otherwise it is false. Here is an example:
c(129,132)=c(356,359)
For this expression to be true, column 129 must contain exactly the same codes as column 356,
column 130 must be exactly the same as column 357, and so on. Once again, the two expressions
on either side of the equals sign must be the same length.
Comparisons of one data variable against another are concerned with columns and codes: they
are not concerned with the arithmetic values of the codes in the fields as a whole.
If one of the two fields contains the characters 02 and the other contains a blank followed by
a 2, the expression:
c(24,25)=c(34,35)
is false because the string $02$ is not the same as the string $ 2$. If you want to compare fields
arithmetically (for example, is 02 the same as 2) then you will need to use the .eq. operator:
c(24,25).eq.c(34,35)
to test whether the value in c(34,35) was equal to the value in c(24,25).
For further information about the .eq. operator, see Comparing values, earlier in this chapter.
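The distinction between comparing codes and comparing values can be illustrated with strings and numbers in Python. This is only an analogy: the two fields are modelled as plain strings, and the variable names are hypothetical.

```python
field_a = "02"    # one field holds a zero and a 2
field_b = " 2"    # the other holds a blank and a 2

same_codes = field_a == field_b             # column-by-column comparison, like =
same_value = int(field_a) == int(field_b)   # arithmetic comparison, like .eq.

print(same_codes)   # False
print(same_value)   # True
```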
To check whether the codes in one field do not match a given string or the codes in another field,
we can use the u (unequals) operator:
c(m,n)U$codes$
cmUcn
c(m,n)Uc(m1,n1)
If codes in the field c(m,n) do not match the given string or the codes in c(m1,n1) then the
expression is true. If the two fields are identical, then the expression is false.
The comparison is of codes in columns, where the columns are compared on a one to one
basis. It is not a comparison of a field with a numeric value, or of the numeric values in two
fields. Numeric comparisons for inequality are written with the .ne. operator.
For further information about numeric comparisons, see Comparing values, earlier in this
chapter.
The expression:
c(67,69)uc(77,79)
is true as long as columns 67 to 69 differ by at least one code from columns 77 to 79. If one field
contains 123 and the other contains 256, the expression is true because each of columns 77 to 79
differs from the corresponding column in 67 to 69. Also, if both fields contain 123 but column 77
additionally contains a 5,
the expression is true because column 77 is multicoded 15. The only time the expression is false
is when columns 67 to 69 are identical to columns 77 to 79.
The logical expression range checks whether the number in a field of columns is within a given
range. If so, the expression is true, otherwise it is false. The format of this statement is:
range(start,end,min,max)
where start and end are column numbers and min and max are the range delimiters. For example,
the statement:
range(137,139,100,150)
will return the value true if the number in columns 37 to 39 of card 1 is in the range 100 to 150.
It is important to remember that this statement is designed for use with purely numeric
columns. Columns which contain blanks, multicodes or an ampersand (12 punch)
automatically cause the statement to be false. The exception to this is a multicode of a digit
and a minus sign (11 code) which converts the whole field to a negative number.
A variation of range is rangeb which allows columns to the left of the field to be blank if the
number is right-justified in the field. In all other respects it is exactly the same as range. If our data
is:
----+----2
123 6
the expression:
rangeb(17,18,1,10)
will be true because the string $ 6$ will be read as 6. With range the value would be false.
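The difference between range and rangeb can be sketched in Python as follows. This is a simplified model with hypothetical function names: a field is a plain string, multicodes are not represented, and the minus-sign exception mentioned above is deliberately left out.

```python
def field_value(field: str, allow_leading_blanks: bool):
    """Return the number in a field, or None if the field is not purely numeric."""
    text = field.lstrip(" ") if allow_leading_blanks else field
    if not (text.isascii() and text.isdigit()):
        return None
    return int(text)


def in_range(field: str, low: int, high: int, allow_leading_blanks: bool = False) -> bool:
    """Model of range (strict) or rangeb (leading blanks allowed)."""
    value = field_value(field, allow_leading_blanks)
    return value is not None and low <= value <= high


print(in_range(" 6", 1, 10))                              # False, as with range
print(in_range(" 6", 1, 10, allow_leading_blanks=True))   # True, as with rangeb
print(in_range("125", 100, 150))                          # True
```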
However, the expression:
rangeb(15,18,2000,3000)
.and.   Both/all true.
.or.    One or the other, or both, true.
.not.   Reverses (negates) the expression.
Any number of subexpressions may be combined to form a larger expression, but whether the result
is true or false depends upon the values of the subexpressions and also upon the operators used to
combine them.
The .and. operator requires that all the expressions preceding and following the .and. be true for
the whole expression to be true. Thus, the statement:
int1.eq.9 .and. c116'1'
is true if the integer variable int1 has a value of 9 and column 116 contains a 1. If either
subexpression is false, the whole expression is false too.
By comparison, the .or. operator requires that one expression or the other, or both, be true in order
for the whole expression to be true.
c(249,251)=$159$ .or. numb(c132,c135) .gt. 4
For this expression to be true, columns 249 to 251 must contain nothing but a 1, 5 and 9
respectively or the number of codes in columns 132 and 135 must be greater than 4. It is also true if
both expressions are true. However, if both are false, the overall result is false.
Expressions are reversed (negated) simply by preceding them with the keyword .not. Although it
is not wrong to use it with a single variable, it is more generally used to reverse an expression
containing the keywords .and. and .or. Thus, it is not wrong to write .not.c15'1/5' but it is much
simpler to write this as c15n'1/5'.
Take care when using .not. with the .eq. operator. Statements of the form:
.not. c(1,3) .eq. 100
are incorrect and will not work. They should be written as either:
(.not.(c(1,3).eq.100))
with the expression to be reversed enclosed in parentheses, or, more efficiently, as:
(c(1,3).ne.100)
Any of the operators .and., .or. and .not. may appear in a statement more than once, as long as you
use parentheses to define the order of evaluation.
For example:
(c15'1/47' .or. c16'3579') .and. c22'&'
causes Quantum to check whether the .or. condition is true before dealing with the .and. Suppose
that, in our data, column 15 is multicoded 1 and 7, column 16 is multicoded 3 and 9, and column 22
contains an ampersand.
The first expression (c15'1/47') is true because column 15 contains a 1 and a 7 and the second
expression (c16'3579') is also true since the codes it contains are amongst those listed as
acceptable. Thus, the .or. condition is true. Column 22 contains an ampersand so the last expression
is also true; therefore the expression as a whole is true.
If both expressions in the parentheses were false, the whole expression would be false.
The opposite of (Male .and. Married) is .not. (Male .and. Married), which refers to unmarried men
and all women. This can also be written as:
.not. Male .or. .not. Married
The first .not. collects all the women, the second collects everyone who is not married (for example,
single, widowed, and so on), and together they collect everyone who is female or unmarried or both.
We use .or. instead of .and. here because the latter will gather unmarried women but will ignore the
unmarried men and married women.
Reversing .or. expressions works in exactly the same way. The expression:
(Male .or. Married)
means anyone who is Male, or anyone who is Married, or anyone who is Male and Married. The
opposite of this is:
.not. (Male .or. Married)
which means anyone who is neither Male nor Married; that is, anyone who is a woman and is
unmarried. This can be written as:
.not. Male .and. .not. Married
Expression     Negative             Is the same as
(A .and. B)    .not. (A .and. B)    .not. A .or. .not. B
(A .or. B)     .not. (A .or. B)     .not. A .and. .not. B
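These equivalences can be confirmed by checking every combination of true and false. The following Python sketch does so exhaustively; Python's not, and and or stand in for .not., .and. and .or., and the function name is hypothetical.

```python
from itertools import product


def negation_rules_hold() -> bool:
    """Check both reversal rules for every combination of A and B."""
    for a, b in product([False, True], repeat=2):
        if (not (a and b)) != ((not a) or (not b)):
            return False
        if (not (a or b)) != ((not a) and (not b)):
            return False
    return True


print(negation_rules_hold())   # True
```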
the expression is true because c(135,137) do not contain just the codes 5, 1 and 9 (c135 is
multicoded), and c160 does not contain any of the codes 6 through 0. The expression will only be
false if:
column 135 contains a 5 only, column 136 contains a 1 only and column 137 contains a 9 only,
and
column 160 contains any of the codes 6 through 0, either singly or as a multicode. We could
therefore write the expression as:
.not. c(135,137)=$519$ .and. .not. c160'6/0'
From time to time you may need to check whether a variable or arithmetic expression has one of a
given list of values. For example, if the questionnaire codes brands of frozen vegetables as 3-digit
codes into columns 145 to 147 we might want to check that only valid codes appeared in this field.
This is achieved using the logical expression .in. as follows:
variable-name .in. (list)
or
arithmetic-exp .in. (list)
where variable-name is that of the variable to be checked and list is a list of permissible values.
The arithmetic expression is an expression consisting of data or integer variables, arithmetic
operators and integer values as described earlier in this chapter. If the variable or arithmetic
expression has one of the listed values, the expression is true, if not, it is false.
The left-hand side of the expression may contain integer variables, columns or data variables
containing whole numbers, or expressions using these types of variables. If it is a data variable, then
the list may contain codes enclosed in dollar signs. Quantum will then compare the codes in the
data variable with the codes inside the dollar signs. We could therefore check that the frozen
vegetables have been coded correctly by keying in a statement which says:
c(145,147) .in. ($205$,$206$,$207$,$210$,$215$,$220$)
Quantum will flag any records in which c(145,147) does not contain exactly 205, 206, 207, 210,
215 or 220 (that is, three single-coded columns) as incorrect.
If the data variable contains a valid positive or negative whole number, then the list may also
contain such values. Ranges of values may be entered in the form min:max, where min is the lowest
acceptable value and max is the highest. Since the frozen vegetables have numeric codes, we could
write the expression as:
c(145,147) .in. (205:207,210,215,220)
Any columns in the field which contain non-numeric data (for example, multicodes) will be flagged
as incorrect, as will any which contain values which do not match the specification.
Sometimes, though, the codes and numbers will not be interchangeable. If you have 2-digit codes
in a 3-column field, the statement:
c(206,208) .in. ($ 10$,$ 11$,$ 12$,$ 13$)
is not equivalent to the corresponding numeric statement (for example, one using the list (10:13))
unless column 206 is always blank. If the 2-digit codes have been padded on the left with zeros
instead of blanks (that is, 010, 011) or if they all start in column 206 (that is, $10 $, $11 $), then the
first expression will be false, even though the second one will still be true.
For a fuller explanation of the difference between codes and numbers, see the earlier sections
of this chapter.
If the left-hand side of the expression is an integer variable or an arithmetic expression, the list may
contain positive or negative whole numbers:
total .in. (100,200,500:1000)
Lists may contain up to 247 values or codes, which may be entered in any order. In our examples,
we have always entered them in ascending order, but this is not a requirement of Quantum. You
may enter codes in a list in any order you like. The exception is numeric ranges which must be
entered in the form lowest:highest.
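A numeric list with ranges can be modelled in Python as a mixture of single values and (lowest, highest) pairs. This is a sketch only; the function in_list and the representation of the list are assumptions made for the illustration.

```python
def in_list(value, allowed):
    """True if value matches a single entry or falls within a (lowest, highest) range."""
    for item in allowed:
        if isinstance(item, tuple):
            lowest, highest = item
            if lowest <= value <= highest:
                return True
        elif value == item:
            return True
    return False


fveg = [(205, 207), 210, 215, 220]   # models (205:207,210,215,220)

print(in_list(206, fveg))   # True
print(in_list(208, fveg))   # False
```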
Naming lists
Quick Reference
To assign a name to a list of values, type:
definelist name=(list)
in the edit section, where list is a comma-separated list of numbers, ranges or code strings enclosed
in dollar signs.
If you have a list that is used more than once you may give it a name and refer to it by that name
instead of typing in the complete list each time. To name a list, write:
definelist name=(list)
For example:
definelist fveg=(205:207,210,215,220)
To use a defined list, simply replace the list with the name:
c(145,147) .in. fveg
You cannot use a definelist in an .in. statement with a data-mapped variable. Quantum cannot
handle this syntax because it needs to read the data in the definelist differently for data-mapped
variables (as strings instead of column punches) but does not know at the time the definelist
is parsed whether it will be used with a data-mapped variable.
If you have a large edit, you can speed up the time it takes to run by including the inline statement
in your edit. This instructs the Quantum compiler to convert expressions of the form
c(1,4)=$1234$ into statements in the C programming language in a different way to the way it
normally does. You need not worry about these different methods of conversion, apart from
deciding whether or not to use them.
If you want to speed your program up, place a statement of the form:
inline n
at the beginning of the edit section, where n is the maximum field width to be converted in the
special way.
For example:
inline 6
Here we are saying that fields of six columns or less should be converted in the special way rather
than in the normal way.
Ordinary records
These are strings of codes and numbers, one per respondent, up to a maximum of 32,767 characters
per respondent.
Multicard records
When data originates from punched cards and each questionnaire requires more than 80 columns,
the data is spread over several cards. So that all cards belonging to a particular respondent may be
easily identified, each questionnaire is assigned a serial number which is entered as part of the data
for each card. Within this, each card has a unique card type or card number to distinguish it from
others in the group. It is important that both the serial number and card type be in the same relative
positions on all cards in the file, since this is the only way that Quantum can tell which data belongs
to which respondent.
If the questionnaire serial number is in columns 1 to 4 of each card and the card type is in column 5,
and we are looking at questionnaire 1005, we will see that it has two cards whose first five columns
are 10051 and 10052 respectively. Quantum can deal with records that contain up to 327 cards per
respondent.
Occasionally you may have multicard records in which each card is greater than 80 columns. The
notes that follow refer to multicard records of up to 100 columns per card.
For information on how Quantum deals with cards of more than 100 columns, see section
6.10, Multicard records of more than 100 columns per card.
Here, we have three groups of data at level 2 and eight groups of data at level 3.
Ordinary records
Ordinary records are read into cell 1 onwards of the array. Therefore, for example, the 50th column
is referenced as c50 and the 200th cell as c200.
Multicard records
Records are read into c101 to c200 for card 1, c201 to c300 for card 2, and so on. For example,
80-column cards are read into c101 to c180 for card 1 and c201 to c280 for card 2. Columns 181 to
200, 281 to 300, and so on remain blank. In this case, the C array may be pictured as ten rows of 100
cells each. Column 50 of card 1 is then accessed by referring to it as c150, and column 67 of card
8 is referred to as c867.
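The addressing rule for multicard records of up to 100 columns per card amounts to simple arithmetic, sketched here in Python. The function name c_cell is hypothetical, and the sketch assumes the card type is at least 1.

```python
def c_cell(card: int, column: int) -> int:
    """Cell of the C array that holds the given column of the given card."""
    if card < 1 or not 1 <= column <= 100:
        raise ValueError("card must be at least 1 and column must be 1 to 100")
    return card * 100 + column


print(c_cell(1, 50))   # 150, referred to as c150
print(c_cell(8, 67))   # 867, referred to as c867
```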
For information on longer records, see section 6.10, Multicard records of more than 100
columns per card.
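The addressing rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of Quantum; the function names c_cell and card_row are hypothetical, and the sketch assumes rows of 100 cells as described in this section.

```python
def c_cell(card, column, row_width=100):
    """Return the C array cell that holds `column` of card type `card`.

    Assumes the multicard layout described in the text: card 1 occupies
    c101 to c200, card 2 occupies c201 to c300, and so on.
    """
    if card < 1:
        raise ValueError("card types start at 1")
    if not 1 <= column <= row_width:
        raise ValueError("column must lie within one row of the C array")
    return card * row_width + column


def card_row(card, row_width=100):
    """Return the first and last cells of the row used by `card`."""
    first = card * row_width + 1
    return first, first + row_width - 1


print(c_cell(1, 50))   # column 50 of card 1
print(c_cell(8, 67))   # column 67 of card 8
print(card_row(2))     # cells occupied by card 2
```

With these assumptions, column 50 of card 1 maps to cell 150 and column 67 of card 8 maps to cell 867, matching the references c150 and c867 given in the text.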
If you have records with more than nine cards, you need to extend the size of the C array by using
max=. This also tells Quantum which cells to clear between records.
For further details on max=, see Highest card type number later in this chapter.
For further information, see Record type in section 6.8, Describing the data structure.
For more information about levels, see chapter 3, Dealing with hierarchical data in the
Quantum User's Guide Volume 3.
A card is located with a card type matching that of the previous card (for example, two
consecutive card 2s), or
A card is read with a type lower than its predecessor and matching one of the card types already
read in during the current read (for example, a card 2, a card 3, and then another card 2).
In order to produce useful tables, you will need to know which cards are currently in the C array.
Quantum has four reserved variables (thisread, allread, firstread and lastread) which it uses to
keep track of which cards it has read for each respondent.
thisread
The array called thisread is used to check which cards have been read in during the current read.
thisread1 will be true (or 1) if a card type 1 has just been read in; thisread2 will be true if a card 2
has just been read, and so on.
There are nine such variables (thisread1 to thisread9) available unless extra card types have been
specified using the max= option. In this case, these variables will be numbered 1 to max; if there
are 13 cards, we will have thisread1 to thisread13.
For further details on max=, see Highest card type number later in this chapter.
allread
allread notes which cards have been read in so far for this questionnaire. If cards 1, 2 and 3 have
been read so far, allread1, allread2 and allread3 will all be true. Additionally, each cell of allread
will contain the number of cards of the given type read in; for instance, if two cards of type 3
have been read, allread3 will be true and it will contain the number 2.
As with thisread, there are nine allread variables available unless extra card types have been
specified with max=.
Examples
You can use these variables in your program to associate specific parts of the edit or tabulation
section with specific types of data. For instance:
if (.not. thisread3) go to 400
/* card 3 edit follows
.
.
400 continue
/* calculate average when all cards read for respondent
if (lastread) average=sum / num
.
/* update table when all cards read for this respondent
tab brand demo;c=lastread
Let's take an example and look at the contents of the C array and the values of thisread, allread,
firstread and lastread. Suppose the record has five cards: 1, 2, 2, 2 and 3 of 80 columns each. The
first read places card 1 in c(101,180) and the first card 2 in c(201,280). The second card 2 is not
read into the array yet because it has the same card type as the previous card. As this is the start of
a new respondent, firstread is true (or 1), and because cards 1 and 2 have been read, thisread1,
thisread2, allread1 and allread2 are also true.
The second read deals only with the second card 2 since it is followed by another card of the same
type. thisread2 is true, as are allread1 and allread2. Also, allread2 contains the value 2 because we
have read in 2 card 2s so far. Note that thisread1 is now false (or 0) as no card 1 was read this time.
On the third and final read the third card 2 is read into c(201,280) and card 3 is copied into
c(301,380). lastread is true because we have reached the end of the record, thisread2 and thisread3
are true because we have just read cards 2 and 3, and allread1, allread2 and allread3 are true because
this record contains cards 1, 2 and 3. allread2 now contains the value 3 because there were 3 card 2s
altogether.
The chart below summarizes the cards read and the variables which will be true after each read.
Read    Cards read         thisread    allread    firstread    lastread
1       Card 1, Card 2a    1, 2        1, 2       true
2       Card 2b            2           1, 2
3       Card 2c, Card 3    2, 3        1, 2, 3                 true
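The five-card example above can be traced with a short Python sketch. It is a model of the behaviour described in this section, not Quantum code: the function names are hypothetical, and the rule for ending a read (a card whose type repeats the previous card, or a lower card type already seen in the current read) is taken from the description of trailer cards earlier in this section.

```python
from collections import Counter


def split_into_reads(card_types):
    """Group one respondent's cards into reads.

    A new read starts when a card has the same type as the previous card,
    or a lower type that has already been read in the current read.
    """
    reads, current = [], []
    for card in card_types:
        if current:
            prev = current[-1]
            if card == prev or (card < prev and card in current):
                reads.append(current)
                current = []
        current.append(card)
    if current:
        reads.append(current)
    return reads


def trace_reads(card_types):
    """Return the state of the reserved variables after each read."""
    reads = split_into_reads(card_types)
    allread = Counter()          # card type -> number of cards read so far
    trace = []
    for i, cards in enumerate(reads):
        allread.update(cards)
        trace.append({
            "cards": cards,
            "thisread": sorted(set(cards)),
            "allread": dict(sorted(allread.items())),
            "firstread": i == 0,
            "lastread": i == len(reads) - 1,
        })
    return trace


for step in trace_reads([1, 2, 2, 2, 3]):
    print(step)
```

Under these assumptions the record 1, 2, 2, 2, 3 is split into three reads (cards 1 and 2, then card 2, then cards 2 and 3), and the count held for card type 2 rises from 1 to 2 to 3, as in the walkthrough.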
If Quantum reads a record in which the repeated cards are out of sequence, it inserts blank cards
of the appropriate types wherever necessary to force the cards into the correct sequence. For
example, if the record contains the cards 1, 2, 4, 3, 4, 4 in that order, Quantum will generate a
completely blank card 3 when it reads the first card 4. The record is then processed as if it contained
cards 1, 2, 3, 4, 3, 4, 4.
Set to true when the last record in the file has been read or, in the case of trailer
cards, the last read of the last record has occurred.
rec_count
card_count
For ordinary records, only columns 1 to reclen are reset to blanks, where reclen is the
maximum record length as defined by the reclen= keyword on the struct statement.
For further information about defining the record length, see Record length in the next
section.
In multicard records you may not use c(1,100). However, you may use any columns between the
end of the card (reclen) and the end of that row of the C array. For instance, when reclen=80 you
may use c(181,200), c(281,300) and so on. You may also use full sets of columns in which there is
no data: that is, if the record has only four cards (1, 2, 3 and 4), then c(501,1000) are the spare
columns you may use. Additionally, cells c101 to c(100+reclen), c201 to c(200+reclen), and so on
are reset to blanks before the next record is read in.
For information about levels and how to describe the levels data structure, see chapter 3,
Dealing with hierarchical data in the Quantum User's Guide Volume 3.
The struct statement is used to define the type of records, the location of the serial number and card
type in the record and the number of the highest card type if greater than 9. Its format is:
struct; options
Record type
Quick Reference
To define the record type, type:
struct; read=n
where n is 0 for ordinary records, 2 to read multicard records in sections according to the card type,
or 3 to read multicard records all in one go.
Quantum recognizes two types of record: single card and multicard. The type of record is defined
by the keyword read= on the struct statement:
Ordinary records: Ordinary records are defined using read=0. Each record is read into c1
onwards of the array. Since read=0 is the default, you need only specify it when other options are
required; for example, when the records contain serial numbers and you wish to have the serial
number printed out as part of the record, or when you are working with long records of more
than 100 columns.
Multicard records: Multicard records are identified by the keyword read=2. Each card in
the record is read into the row corresponding to the card type of that card; that is, card 1 in
c(101,200), card 2 in c(201,300), and so on.
We mentioned briefly that it is possible to read in all cards in a multicard record at once and
ignore the card type. The first card goes in c(101,200), the second in c(201,300), and so on.
This is achieved with read=3.
Record length
Quick Reference
To define the record length of records greater than 100 columns, type:
struct; reclen=n
The keyword reclen=n defines the maximum number of characters to be read into the C array, the
number of cells to be reset to blanks and the number of cells to be written out by the write statement.
With ordinary records reclen may take any value, but with multicard records the maximum is
reclen=1000. In both cases, the default is reclen=100. When data is read into the array, any record
which is longer than reclen characters is truncated to that length and a warning message is printed.
When ordinary records are written out with write or split, cells c1 to c(reclen) are copied, with any
trailing blanks being ignored. For instance, if we have:
struct;read=0;reclen=200
and the current record is only 157 characters long, the record written out will be 157 characters
long. This length can be overridden by an option on a filedef statement.
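The effect of reclen on reading and writing ordinary records can be pictured with the following Python sketch. It is a simplified model, assuming that a record is held as a string of reclen cells padded with blanks; the function names are hypothetical and the warning message printed by Quantum on truncation is not modelled.

```python
def read_ordinary(line, reclen=100):
    """Model reading a record: truncate to reclen cells, pad with blanks."""
    return line[:reclen].ljust(reclen)


def write_ordinary(cells, reclen=100):
    """Model write or split: copy c1 to c(reclen), ignoring trailing blanks."""
    return cells[:reclen].rstrip(" ")


record = "7" * 157                      # a record 157 characters long
cells = read_ordinary(record, reclen=200)
print(len(cells))                       # number of cells filled or blanked
print(len(write_ordinary(cells, reclen=200)))
```

With reclen=200 and a 157-character record, the sketch holds 200 cells but writes out only 157 characters, as described above. A record longer than reclen would be cut to reclen characters on input.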
When multicard records are written out, columns c101 to c(100+reclen), c201 to c(200+reclen),
and so on will be output. Thus, if we write:
struct;read=2;reclen=70
and we have 2 cards per record, Quantum will write out c(101,170) and c(201,270).
Finally, with ordinary records cells c1 to c(reclen) are reset to blanks between records, but with
multicard records cells c101 to c(100+reclen), c201 to c(200+reclen), and so on are reset.
For information about the write statement, see section 7.1, Print files.
For information about the split statement, see section 12.4, Creating clean and dirty data
files.
For information about the filedef statement, see section 7.4, Defining the file type.
The keyword ser=c(m,n) defines the field of columns containing the respondent serial number. For
example, if the serial number is in columns 1 to 5 of an ordinary record we would write:
struct;read=0;ser=c(1,5)
Notice that even with multicard records we only give the actual column numbers containing the
serial number, rather than card type and column number as is usually the case when identifying
columns in such records. This is because the column numbers refer to all cards in the data set rather
than to a single card in the file.
Defining the card type location is much the same as defining the position of the serial number in
the record. The keyword is crd=cn for a single digit card type or crd=c(m,n) for a card type of more
than one digit. Once again, m and n are column numbers only, not card type and column number.
For example:
struct;read=2;ser=c(1,4);crd=c5
tells us that we have a multicard record with serial numbers in columns 1 to 4 and the card type in
column 5 of each card. Each card will be read into the row corresponding to its card number.
Sometimes some cards will be optional and others mandatory. You define the cards which must
appear in every record by using the keyword req= followed by the numbers of the cards that each
respondent must have. For example:
req=1,2
tells us that cards 1 and 2 must be present in each record for that record to be accepted. Any other
cards are optional. If a record is read without one of these cards, the error message Card Missing
in Set and a note of the record's position in the file are printed and the record is ignored.
If you have ranges for required card types, you may type the numbers of the lowest and highest
cards separated by a slash (/) or a colon (:) rather than listing each card type separately. For
example, if cards 1 to 4 are all required, you may type:
req=1,2,3,4
or
req=1/4
or
req=1:4
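To show that the three forms above are equivalent, here is a small Python sketch that expands a card list written with commas, slashes or colons and reports which required cards a record lacks. It is an illustration only; parse_card_list and missing_required are hypothetical names, and the sketch does not reproduce Quantum's own parsing or its error messages.

```python
import re


def parse_card_list(spec):
    """Expand a card list such as '1,2,3,4', '1/4' or '1:4' into card numbers."""
    cards = []
    for item in spec.split(","):
        bounds = re.split(r"[/:]", item.strip())
        if len(bounds) == 1:
            cards.append(int(bounds[0]))
        elif len(bounds) == 2:
            low, high = int(bounds[0]), int(bounds[1])
            cards.extend(range(low, high + 1))
        else:
            raise ValueError("a range needs exactly one lowest and one highest card")
    return cards


def missing_required(cards_in_record, req_spec):
    """Return the required card types that do not appear in the record."""
    present = set(cards_in_record)
    return [card for card in parse_card_list(req_spec) if card not in present]


print(parse_card_list("1,2,3,4"))
print(parse_card_list("1/4"))
print(parse_card_list("1:4"))
print(missing_required([1, 3], "1,2"))   # card 2 is absent from this record
```

A record for which missing_required returns a non-empty list corresponds to the case in which Quantum prints the Card Missing in Set message and ignores the record.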
If the data contains trailer cards and the Levels facility is not used, you must list their card types
with the keyword rep=. For instance, if card 2 is a trailer card we would write rep=2. Where there
is more than one trailer card, each card type is listed separated by a comma. If cards 2, 3 and 4 are
all trailer cards we could write:
rep=2,3,4
If you have ranges for repeated card types, you may type the numbers of the lowest and highest
cards separated by a slash (/) or a colon (:) rather than listing each card type separately.
For example, if cards 2 to 4 are all repeated, you may type:
rep=2,3,4
or
rep=2/4
or
rep=2:4
If rep= is not used and a record is read with two or more cards of the same type, the last card of
that type will be accepted and the message Identical duplicate or Non-identical duplicate and a
note of the record's position in the file will be printed. For example:
Record structure error: serial 026, card 234 in run, card 234 in dfile
card type 2 non-identical duplicate
Because rep= refers to trailer cards only, it will be ignored if read=2 and crd= are not both present
on the struct statement.
The only time you need to inform Quantum of the highest card type is when you have records with
more than nine cards. This is so that Quantum can allocate sufficient cells in the C array to store
the extra cards. The highest card type is defined with max=n, where n is the number of the highest
card type. Cells 1 to max*reclen are then cleared between respondents. For example, to read a data
set with 11 cards per respondent we might write:
struct;read=2;ser=c(1,4);crd=c5;req=1,2,3,4;max=11
If you forget max=, and a record is read with more than nine cards, the message Too many cards
per record is printed and the record is rejected. On the other hand, if a card is read with a card type
higher than that defined with max=, the record is rejected with the message Card number out of
range.
Since the maximum size of the C array is 32,767 cells, the maximum value you can set with
max= is 327 cards.
From time to time you may need to read in records with alphabetic as well as numeric card types.
This generally happens in a multicard data set containing more than nine cards per record where
only one column has been allocated to the card type.
Quantum can deal with this data but first you have to say where in the C array the alphabetic card
types should go. This is done with the keyword:
order=n
where n is one or more of the codes 1234567890& or the letters A to Z (in upper or lower case)
not separated by spaces.
The card type bearing the first number in the list is read into c(101,200), the card bearing the second
code in the list is read into c(201,300), and so on. For example, suppose each record has ten cards,
1 to 9 and A. Our struct statement might say:
struct;read=2;ser=c(1,4);crd=c5;max=10;order=123456789A
Data from card A would be read into cells 1001 to 1100 of the C array.
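The mapping from the order= list to rows of the C array can be sketched in Python as follows. The function name order_row is hypothetical, and the sketch assumes that card codes are matched without regard to case, which the text implies but does not state outright.

```python
def order_row(order, card_code, row_width=100):
    """Return the first and last C array cells for a card code.

    The position of the code in the order= list decides the row:
    the first code goes in c(101,200), the second in c(201,300), and so on.
    Raises ValueError if the code is not in the list.
    """
    position = order.upper().index(card_code.upper()) + 1
    first = position * row_width + 1
    return first, first + row_width - 1


print(order_row("123456789A", "1"))   # row used by card 1
print(order_row("123456789A", "A"))   # row used by card A
```

Because A is the tenth code in the list 123456789A, the sketch places it in cells 1001 to 1100, in agreement with the example above.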
When trailer card data is merged during a run with the merge facility, you may wish trailer cards
to be merged in a specific order, according to a sequence number entered as part of the data. The
location of this sequence number can be defined with the keyword seq=cn for a single column code
or seq=c(m,n) for a multicolumn code. For more information on merging data see the next section.
To merge data files you must create a file called merges telling Quantum which items to merge on,
and which files to merge. The type of merge is represented by a number:
1
Merge on serial number. Cards are read in from each data file according to their serial number
only; the card type and sequence number, if any, are ignored. You might use this option
when you have two files, dat01 containing cards of type 1 and dat02 containing cards of
type 2, and you want the files to be merged so that card type 1 is read into the C array, followed
by card type 2.
3
Merge on serial number and card type (default). With this option, cards with the same serial
number read from different data files are merged to form a single record by comparing the
serial number and card type. Cards within a record are then sorted sequentially from 1 so that
each card is read into the appropriate cells of the C array. For example, if dat01 contains cards
1 and 3, and dat02 contains cards of type 2, the merge will produce records containing cards
1, 2 and 3 in that order.
5
Merge on serial number, card type and sequence number. This is similar to merge type 3,
except that trailer cards are merged according to their sequence number. For example, if dat01
contains cards 1 and 2, where card 2 is a trailer card with a sequence number of 2, and dat02
contains cards 2 and 3, where card 2 is a trailer card with a sequence number of 1, the merged
record will contain cards 1, 2/1, 2/2, and 3, in that order.
The type of merge is the first item in the merges file, and is followed by the names of the files to
be merged with the main data file named in the Quantum command line. Items may be entered on
separate lines or all on the same line separated by semicolons. For example, if we want to merge
data in files dat02 and dat03 with data in the main file, dat01, by serial number, card type and
sequence number, the merges file would look like this:
5; dat02; dat03
Notice that we have not mentioned dat01 in the merges file because it will be named on the
Quantum command line instead.
This facility is not designed to work with merge files that contain *include or #include
statements to read additional data files into the current data file. All merge files must be named
in the merges file, which accepts pathnames if the data files are not in the project directory.
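The difference between the merge types can be illustrated with a Python sketch. This is a simplified model under stated assumptions, not a description of how Quantum performs the merge: the Card type and merge_cards function are hypothetical, all cards are held in memory, and a stable sort stands in for the merge so that cards with equal keys keep the order in which their files were named (main file first).

```python
from typing import NamedTuple


class Card(NamedTuple):
    serial: int
    card_type: int
    seq: int = 0      # sequence number; 0 when the card has none


def merge_cards(files, merge_type=3):
    """Combine the cards from several files according to the merge type."""
    cards = [card for one_file in files for card in one_file]
    if merge_type == 1:
        def key(card):
            return card.serial
    elif merge_type == 3:
        def key(card):
            return (card.serial, card.card_type)
    elif merge_type == 5:
        def key(card):
            return (card.serial, card.card_type, card.seq)
    else:
        raise ValueError("merge type must be 1, 3 or 5")
    return sorted(cards, key=key)   # sorted() is stable


# The merge type 5 example from the text, for respondent 1005.
dat01 = [Card(1005, 1), Card(1005, 2, seq=2)]
dat02 = [Card(1005, 2, seq=1), Card(1005, 3)]
for card in merge_cards([dat01, dat02], merge_type=5):
    print(card.card_type, card.seq)
```

In this model the merged record contains card 1, then the card 2 with sequence number 1, then the card 2 with sequence number 2, and finally card 3, which is the order given for merge type 5 above.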
The mergedata statement merges a field of data from an external file with the main data at the
datapass stage of the Quantum run. Merging is by means of a data key present in both the main
records and the records in the external file. If a record in the external file has a key which matches
that of a record in the main data file, the external data will be merged into a user-defined field of
the main record when it is read into the C array.
In order for data to be merged correctly, both the main data file and the external file must be sorted
in ascending order by key value. If the key is the record serial number then the data file will already
be sorted in the correct order (assuming, of course, that the data is sorted by serial number). If you
are using a key that is not the record serial number you must sort the data file so that it is ordered
by key rather than by serial number.
The syntax for mergedata is:
int_variable=mergedata($ex_file$, key_field, key_start, copy_to, data_start)
where:
int_variable
is the name of an integer variable in which the function can place its return value.
ex_file
is the name of the file containing the extra data. It must be enclosed in dollar
signs.
key_field
is the location of the key in the main data file, entered using the standard
Quantum notation for columns and fields.
key_start
is the start column of the key in the external data file, for example, 1 if the key
starts in column 1. The length of the key is taken from the length of key_field.
copy_to
is the field in the main data record in which to place the external data. The field
is defined using the standard Quantum notation for columns and fields.
data_start
is the start column of the data to be copied. Quantum copies as many columns as
are defined by copy_to.
For example:
t1 = mergedata($manuf_codes$,c(178,180),15,c(168,175),1)
tells Quantum to compare the key in columns 178 to 180 of the main record with the key which
starts in column 15 of the external records in the file manuf_codes.
Because the key field in the main record is 3 columns long, Quantum reads columns 15 to 17 of
each external record to obtain its key. If the keys match, Quantum copies the data from the external
record into columns 168 to 175 of the main record in the C array. The external data to be copied
starts in column 1 and, since the destination field is 8 columns long, Quantum copies 8 columns
starting at that column.
This statement returns a value of 1 if a match was found (i.e., merging took place), or 0 if not.
There is no limit on the number of mergedata statements in a specification, but you may only merge
data from up to nine different files per record.
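The matching and copying performed by mergedata can be modelled with the Python sketch below. It is an illustration under several assumptions rather than Quantum's implementation: records are plain strings, keys are compared as character strings, both lists are already sorted in ascending key order, and duplicate or out-of-sequence keys are not detected. The helper names field and mergedata_model are hypothetical.

```python
def field(record, start, end):
    """Return columns start to end (1-based, inclusive), padded with blanks."""
    return record[start - 1:end].ljust(end - start + 1)


def mergedata_model(main_records, ext_records, key_field, key_start,
                    copy_to, data_start):
    """Return (record, flag) pairs; flag is 1 if external data was merged."""
    k_lo, k_hi = key_field
    d_lo, d_hi = copy_to
    key_end = key_start + (k_hi - k_lo)          # key length comes from key_field
    data_end = data_start + (d_hi - d_lo)        # data length comes from copy_to
    results = []
    i = 0
    for rec in main_records:
        key = field(rec, k_lo, k_hi)
        # Skip external records whose keys sort before the current key.
        while i < len(ext_records) and field(ext_records[i], key_start, key_end) < key:
            i += 1
        matched = (i < len(ext_records)
                   and field(ext_records[i], key_start, key_end) == key)
        if matched:
            data = field(ext_records[i], data_start, data_end)
            rec = rec.ljust(d_hi)
            rec = rec[:d_lo - 1] + data + rec[d_hi:]
        results.append((rec, 1 if matched else 0))
    return results


# Mirrors the example: key in c(178,180), external key from column 15,
# external data from column 1 copied into c(168,175).
main = [" " * 177 + "041", " " * 177 + "042"]
external = ["ACME0001" + " " * 6 + "042"]
for rec, flag in mergedata_model(main, external, (178, 180), 15, (168, 175), 1):
    print(flag, repr(field(rec, 168, 175)))
```

Here the record with key 041 has no partner in the external file and is returned with a flag of 0, while the record with key 042 receives the eight columns of external data and a flag of 1.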
Errors
Errors can occur if your run contains a mergedata statement and either the main data file or the file
of supplementary data for merging has records with duplicate keys or records that are out of
sequence. In some cases the run is also canceled after all data has been read, when a complete error
report is available. The following table lists the situations when duplicate or out of sequence data
may occur and shows what happens to your job.
Circumstance
Message
Run
canceled?
No
DUPLICATES IN
DUPLICATES IN
DUPLICATES IN
SEQUENCE IN
SEQUENCE IN
key_field
Yes
key_field
Yes
key_field
Yes
key_field
Yes
key_field
For further details, see Reading non-standard data files in chapter 10, Include and
substitution of the Quantum User's Guide Volume 2.
Data and print files are both accessed by the write statement, but the exact format of the statement
varies according to the type of file and the information being written. You write to report files using
the report statement.
The word write by itself prints out a whole record in the form it is when the write statement is
executed, together with a ruler showing which codes fall in which columns, the line number of the
record in the data file and the message write indicating that the record was generated by a write
statement. Any multicodes in the record are shown as asterisks, but you may change this with an
option on the filedef statement.
For information on the filedef statement, see section 7.4, Defining the file type.
If the record contains more than one card, each card is listed separately beneath the ruler. For
example, the statement:
write
by itself might give us:
Quantum edit report
1 in file
----+----1----+----2-- ... --9----+----0
columns 1 - 100 are |12345
write
2 in file
----+----1----+----2-- ... --9----+----0
columns 1 - 100 are |23456
write
Each write statement will produce a line in the default print file, out2, telling you how many records
were written out, as follows:
2 (1%) write
Which cards are printed from multi-card records depends upon which cards have been read in so
far. Quantum looks at the allread variables and writes out cards for those which are true; so for
example, if allread1, allread2 and allread3 are true, cards 1, 2 and 3 will be printed. If you have
changed the contents of these variables prior to printing out the record, you will see the cards for
which allread is true rather than those which were originally read.
The example above was very simple; more often than not your program will contain several write
statements and you will want some way of identifying which records were printed by which
statement and why. If the write is dependent upon some other statement (for instance, it is part
of an if statement), the whole statement is printed underneath each record, thus:
67 in file
----+----1----+----2-- ... --9----+----0
columns 1 - 100 are |0015263-16*735 *837361 ... 79&
if (c14n'1/4') write
Here, as you can see, we are checking that column 14 contains one of the codes 1 to 4. This record
has been printed out because it contains a 5 instead.
Sometimes it is more helpful to have an explanatory text printed instead of the statement itself. In
this case all that is necessary is to follow the word write with the text to be printed enclosed in dollar
signs:
if (c308n'1/5') write $c308 incorrect$
if (numb(c117,c118,c119).gt.3) write $too many choices$
Our first statement writes out all records in which column 308 does not contain any of the codes
1/5, and the second picks up all records having more than 3 codes in columns 117 to 119.
Normally all output from write goes to the default print file, and whenever the current record is
written to this file, the variable printed_ becomes true. You may change the output file by following
the word write with the name of the file to write to. For example:
write pfile $First Print$
For information on the filedef statement, see section 7.4, Defining the file type.
If two or more write statements apply to a single record, the record is printed out once in the state
it was when the first applicable write was read, with all relevant write statements or texts listed
below it. If a record satisfies two or more write statements which write to different files, Quantum
writes the record out once for each statement, in the state it is when each write is executed.
If you want to write out more than one field at a time, or to print more than one text, you can
define those fields and/or texts on an ident statement. All write statements from that point on
will then print those fields and texts.
To find out more about ident, read section 7.5, Default print parameters for write statements.
checks that columns 110 and 119 both contain a 2, and if so prints out columns 110 to 120 in the
print file, followed by the text Married woman. If you are writing out fewer than ten columns,
Quantum does not print a ruler above the codes.
If you are dealing with multi-card records, you may prefer to use this form of write to print only
the card containing the error, rather than all cards in the record. If we take our previous example
where we were checking the contents of column 308:
if (c308n'1/5') write $c308 incorrect$
The write statement can only write out information from the C array.
write may also be used to copy records to a data file. This is useful if you want to separate a
particular card type from the rest of the data, or if you want to correct errors and save the corrected
data in a new file for later tabulation.
To write records to a data file the command is:
write filename
to write the whole record to the named file, or
write filename c(m,n)
to write columns m to n only.
If you use write in a levels job to write data to a new data file, the statement write datafile at
any level will write out data for that level only. Additionally, if the write statement is inside an if
clause, or a return statement is encountered, then only relevant data is written for that level. To
write out data for all levels, you will need one write statement per level.
In all cases, records are written in the state they are when the write is executed, and all cards read
in with the current read are copied; that is, all cards for which thisread is true. For instance, if
thisread1, thisread2 and thisread3 are true, Quantum will write out cards 1, 2 and 3. To prevent any
of these cards being written, you may set the appropriate variable to false (zero); therefore to print
only card 1 of our three cards, we would write:
thisread2=0; thisread3=0
write newdat
Any number of writes to data files are allowed in the edit, and each one may write to a different file.
Records written by write are normally as long as the record length defined with reclen on the struct
statement. You may change this with len= on the filedef statement. The exception is where records
end with blank columns. In this case Quantum ignores the blank columns. If you want to create a
data file of fixed length records, and your data is single coded, you can use the reportn statement.
If your data is multicoded you can convert it to single coded first by using the explode statement.
For further information about explode, see Converting multicoded data to single-coded data
in chapter 13, Using subroutines in the edit.
If your data is multicoded and you need to preserve the multicodes, the only way of writing out
fixed length records if the data currently has trailing blank columns is to insert a dummy code in
the last column of those records.
A report file is a special type of print file in which you can print out records, fields or variables in
the format of your choice. To write information in a report file, use the report statement, as follows:
report filename parameters
where filename is the name of the file to be written to, and parameters define exactly what is to be
written.
Lines in a report may be up to 1024 characters long. Report does not start a new line automatically
at the end of each write, but you may tell it to do so by following the keyword report with the
letter n:
reportn filename parameters
In both cases, the named file must be identified as a report file using a filedef statement, as
described in section 7.4, Defining the file type.
The parameter list defines what is to be printed in the report file. It may contain variables, texts,
and special characters representing tabs and spaces.
Data variables
Quick Reference
To print the contents of a data variable, type:
var_name
or
var_name(start,end)
To print the contents of a field, evaluated as an integer right-justified in a field of a given width,
type:
var_name:field_width
To print the contents of every column in a field, even if they are multicoded or blank, type:
start:field_width
where start is the first position in the field. You may also use this notation to print fields whose
contents evaluate to a value greater than the maximum integer value Quantum can deal with.
All data variables that are single coded are printed using as many positions as there are columns in
the variable. For example, if the data is:
----+----4
511 538253
  2
  &
the statement:
report rfile c31,c35,c40
prints the contents of columns 31, 35 and 40 one after the other, as follows:
553
The statement:
report rfile c(35,40)
prints the contents of columns 35 to 40 as the value 538253.
In both the examples the last column of the field has contained a code. If the last column or columns
of a field are blank, Quantum omits those columns when printing the contents of the field. (You
can get round this by entering the field specification as start:field_width as described later in this
section.)
A single data variable that is blank is printed as such, while a single data variable that is multicoded
is printed as an asterisk. The statement:
report rfile c35, c34, c33, c32
prints 5 *1, because column 35 contains a 5, column 34 is blank, column 33 is multicoded and
column 32 contains a 1.
If a variable refers to a string that contains multicoded or blank columns, Quantum ignores the
multicodes and blanks and evaluates the contents of the remaining columns as an integer. For
example, using the data shown above, the statement:
report rfile c(31,40)
prints a line containing the value 51538253. The value starts in the first print position available.
If the field you wish to print is very long, its contents may produce an incorrect value when
evaluated as an integer (the maximum integer value which Quantum can deal with is
1,073,741,824). You can get round this by specifying the first column and the field width as
described below.
If you want to see all columns in a field which contains blanks or multicodes, or you need to have
the correct evaluation of a long field, you will need to deal with each column in the field separately.
You could type each column number separately, but it is quicker just to specify the start column
and the total number of columns you want to print starting at that column.
The format for this type of reference is:
start:field_width
For instance, to print columns 31 to 40 you would type:
report rfile c31:10
The output from this command would be 51* 538253, the same as if you had typed each column
number separately. As before, the data is printed starting in the first print position available.
You can use this alternative notation with field specifications too. In this instance Quantum will
evaluate the contents of the field as an integer and will print the result right-justified in a field of
the given width. If you type, for example:
report rfile c(31,40):10
Quantum will print the value 51538253 in positions 3 to 10 of a ten-position field. The first two
positions will be blank.
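The three ways of printing columns 31 to 40 can be compared with a Python sketch. It models only the behaviour described in this section and is not Quantum code: each column is represented as a string of the codes it holds (an empty string for a blank column, several characters for a multicode), the function names are hypothetical, and treating a field with no single-coded digits as 0 is an assumption of the sketch.

```python
def show_column(codes):
    """Print form of one column: blank, the single code, or * for a multicode."""
    if len(codes) == 0:
        return " "
    if len(codes) > 1:
        return "*"
    return codes


def report_columns(columns):
    """Model start:field_width, which prints every column separately."""
    return "".join(show_column(codes) for codes in columns)


def field_as_integer(columns):
    """Model a field reference: blanks and multicodes are ignored."""
    digits = "".join(c for c in columns if len(c) == 1 and c.isdigit())
    return int(digits) if digits else 0


def report_field(columns, width=None):
    """Model var(start,end) with an optional :field_width."""
    text = str(field_as_integer(columns))
    return text if width is None else text.rjust(width)


# Columns 31 to 40 of the sample data; column 33 holds codes 1, 2 and &.
c31_to_40 = ["5", "1", "12&", "", "5", "3", "8", "2", "5", "3"]

print(repr(report_field(c31_to_40)))        # report rfile c(31,40)
print(repr(report_columns(c31_to_40)))      # report rfile c31:10
print(repr(report_field(c31_to_40, 10)))    # report rfile c(31,40):10
```

Under these assumptions the field reference yields 51538253, the column-by-column form yields 51* 538253, and the field reference with a width of 10 yields the same value preceded by two blanks.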
This notation is also useful if you need to create data files with fixed length records, and some
records end with blank columns. Writing records to a data file preserves multicodes but ignores
trailing blank columns. Writing to a report file allows you to create a single-coded data file with
fixed length records. If your data is multicoded you will need to convert it to single-coded form
before writing it out. You can do this by exploding any multicodes into a field of single codes.
You use the explode statement for this.
For information on how to use explode, see Converting multicoded data to single-coded data
in chapter 13, Using subroutines in the edit.
Once your data is in single-coded form you can then write the whole record out to a report file using
a reportn statement as follows:
reportn repdata c101:80
reportn repdata c201:80
Integer variables
Quick Reference
To print the contents of an integer variable, type:
var_name[:field_width]
If the report statement names a variable by itself, Quantum prints the variable's value starting in
the first print position available. If the specification includes a field width, Quantum prints the
variable's value right-justified in a field of the given width. Any extra columns on the left of the
field width are shown as blanks.
prints the values of the variable called codenums right-justified in a field five positions wide.
Values that are shorter than five characters are padded on the left with blanks.
Real variables
Quick Reference
To print the value of a real variable, type:
var_name[:field_width.dec_places]
where field_width is the width of the field in which the values are to be printed (values are
right-justified and padded on the left with blanks if necessary) and dec_places is the number of decimal
places to be shown for each number. If you omit these parameters, Quantum prints the values
starting in the first available print position and with six decimal places.
you can create a neat column of figures all with two decimal places and all right-justified in a field
six characters wide.
Most reports require some sort of text or spacing on the line, either on the same line as the values
or on lines by themselves to create titles, column headings, and the like.
To print text on a report, type:
$text$
The text may contain spaces.
To print spaces between the values on a line, you can either use spaces or tabs. To print a given
number of spaces between one value and the next, type:
[number]x
where number is the number of spaces required. The default is one space.
If you are producing tabular or columnar output you'll probably find tabs are more useful for
creating blank space since they allow you to skip to a particular print position on the line. For
example, typing:
25t
takes you directly to position 25 on the line, regardless of the current print position. Compare this
with 25x which moves you 25 positions on from your current position.
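The difference between nx and nt can be modeled outside Quantum. The following Python sketch is an illustration only and is not part of Quantum: the function name build_line and the tuple layout are hypothetical, and it assumes that print positions are numbered from 1 and that a tab only ever moves forward along the line.

```python
def build_line(items):
    """Build one report line from a list of items.

    Each item is a tuple whose first element names its kind:
      ("text", s)            literal text
      ("int", value, width)  integer, right-justified in width positions
      ("real", value, width, places)  real number with a fixed number of decimals
      ("x", n)               skip n positions from the current position
      ("t", n)               jump to print position n (1-based, forward only)
    """
    line = ""
    for item in items:
        kind = item[0]
        if kind == "text":
            line += item[1]
        elif kind == "int":
            line += str(item[1]).rjust(item[2])
        elif kind == "real":
            line += f"{item[1]:.{item[3]}f}".rjust(item[2])
        elif kind == "x":
            line += " " * item[1]
        elif kind == "t":
            # The next character lands at position n, which is index n-1.
            line = line.ljust(item[1] - 1)
        else:
            raise ValueError(f"unknown item kind: {kind}")
    return line


# Models: 20t,$Bought Brand A$,1x,brda:3,1x,$times$ with brda equal to 7
example = build_line([
    ("t", 20), ("text", "Bought Brand A"), ("x", 1),
    ("int", 7, 3), ("x", 1), ("text", "times"),
])
print(example)
```

The tab item pads the line up to an absolute position, whereas the x item always adds to whatever has been printed so far.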
Examples
Here are some examples of report statements:
reportn summary 20t,$Bought Brand A$,1x,brda:3,1x,$times$
produce a report showing the serial numbers of all respondents who buy yogurt. As you can see,
we have given our report a title.
As a final example, let's look at the difference between printing a field of columns all in one go and
printing them one at a time. If our data is:
+----4----+
18 036
&
/
7
the statement:
reportn test $c(37,43) is $,c(37,43)
0*6
You cannot write information to the standard print file (usually called out2) using report. To
do this use the function qfprnt.
For information about qfprnt, see section 7.6, Writing out data in a user-defined format.
All files named on write and report statements must be defined by a filedef statement before they
are used. This tells Quantum whether the file is a report, print or data file, and defines more
specifically how the output should be written. So that you can be sure that all filenames will be
recognized, you are advised to place all filedef statements at the beginning of the edit.
For report files, the definition is:
filedef filename[=pathname] report [len=rec_len]
where filename is the name of the report file and report is a mandatory keyword indicating that the
file is a report file.
If you are writing out more than 200 characters to a report file, you need to set len= on the
filedef statement to more than 200 to ensure that no lines are truncated.
Quantum normally creates report files in the main project directory. If you want the report file to
be created in a different directory, follow the filename with =pathname. When specifying a
pathname, the filename acts as a short-hand reference (tag). This means that you still have to tell
Quantum the filename by appending it to the pathname.
For example, to declare a report file called repfile1 that is to be created in the directory /home/ben,
you would write:
filedef repfile1=/home/ben/repfile1 report
This example says that records written to the data file newdat1 must be 80 columns long.
The file definition statement for print files is:
filedef filename[=pathname] print options
where filename is the name of the print file with an optional pathname, print is a mandatory
keyword indicating that the file is a printout file, and options is a list of optional keywords defining
more specifically how the records should be written. Filename lengths are as described above for
data files.
$text$
mpa
Prints the codes in a multicode across the page enclosed in curly brackets. For example:
000401 635495{134}45111
Here, we have a multicode of 134. The ruler is of little use when multicodes are printed
in this manner, so you may prefer to suppress it with the option norule.
mpd
mpe
Prints multicodes as an asterisk, but lists the individual codes within each multicode
beneath the record. For example:
----+----1----+----2
000401 635495*45111
Column 14 contains codes 134
norule
noser
Prevents the messages Record nnn and n in File from being printed.
The default output file is a print file called out2, and the default output style is as described above.
To change the output style for this (for example, to suppress the ruler or print multicodes in a
different format), simply use a filedef statement naming this file and giving the appropriate options
from the list above:
filedef out2 print norule mpe
The ident statement gives you increased control over the content of the print file by allowing you
to print more than one field of columns and one text per write statement.
The format of this statement is:
ident[] [$text$] [,variable_name] [,variable_name, ]
Each ident statement may contain any number of texts, variable names and columns as long as each
one is separated from the others by a comma. The order in which you define items with this
statement controls the order in which they will be printed. For example, if you type:
ident $bad film code$, c(1,10)
if (films0 .gt. 0) write $check c(1,6)$
and Quantum finds a record which fails this test, it will print the following:
bad film code
Column c(1,10) is |----+----|
040506
check c(1,6)
Notice that the text defined with ident does not replace the text given with write. If you do not
define a message on the write statement, Quantum will print the complete statement as it usually
does.
In this example there is not much difference between using ident and writing the test as:
if (films0 .gt. 0) write c(1,10) $bad film code - check c(1,6)$
The real power comes when you want to write out more than one field and/or text per write
statement, or if you want to write out the values of data, integer or real variables. For example, if
you type:
ident t1, t2, t3
write
is
is
is
10
15
20
in the print file (the values reported will, of course, be the values of the variables as they are in your
run).
In ident statements you can refer to a field of adjacent entries in a data variable array by
specifying the first and last entries. For example, you can specify c(1,12) to refer to columns
1 through 12 of the C array. However, like most other Quantum statements, you cannot use
this syntax for other types of variable, such as integer arrays.
So the example above must be specified as:
ident t1, t2, t3
You can combine texts, columns and variable names. The statements:
ident $Bad film code$, c(1,10), films0, films1, films2, films3
if (films0 .gt. 0) write
might print:
Bad film code
Column c(1,10) is |----+----|
010209
if (films0 .gt. 0) write
films(0) is 1
films(1) is 1
films(2) is 1
films(3) is 0
You could use this type of output for checking records which may be incorrectly coded for use with
field and bit statements.
For information about field, see section 8.6, Reading numeric codes into an array.
For information about bit, see section 4.4, Responses with numeric codes: bit in the
Quantum User's Guide Volume 2.
When ident writes out data variables, it prints the data according to the specification on the filedef
statement for the file to which you are writing the data. If the filedef statement includes the keyword
norule to suppress the ruler, the data is written out without a ruler, otherwise the ruler is always
printed above the data, as in the previous example.
You can alter this behavior without having to respecify the filedef command by typing a + or - sign
at the end of the ident keyword. If filedef normally requests a ruler, type:
ident- data variables
to print the listed variables without a ruler. If filedef normally suppresses the ruler, type:
ident+ data variables
to print the variables with a ruler.
To switch off ident and revert to the standard write behavior, type:
noident
Print an integer variable in the next num_pos positions on the line. If the
variable has a negative value the value is printed starting with a minus
sign.
%num_pos.dec_plr
Print a real variable in the next num_pos positions on the line and with
dec_pl decimal places. The number of print positions must allow for the
required number of decimal places and a decimal point.
%num_colc
Print num_col columns starting with the column whose name or number
appears in the variable list. Columns are printed as texts not punch codes;
that is, multicodes are converted to letters where possible.
%numberb
write and report are both powerful statements for writing out data, but they do have limitations
which you may find restrictive in some circumstances. The write statement lets you write data out
to a print file, including the standard print file (usually called out2), but it always writes the data in
a fixed format that you cannot change. The report statement lets you write out data and text in any
format you like, but only to a report file. You cannot write to a print file with report.
The qfprnt function brings together the functionality of write and report by writing text and data to
the standard print file in a format of your choice. To use it, type:
call qfprnt(0, $format$, variables)
where format defines the format in which the data is to be written and the data types of the variables
used. variables is a comma-separated list of the variables to be written out. Variables must be listed
in the order they are used in the format statement.
Here is a simple example to start with:
call qfprnt(0,$Number of products tested is: %2i$,t1)
If the respondent tested five products this statement will appear in the standard print file as:
Number of products tested is: _5
The underscore character in front of the 5 represents a space and appears as such in the print file.
We'll explain why we have printed it here shortly. First, let's look at the qfprnt statement itself.
The format section of the statement consists of text to be printed exactly as it is written and
references to variables whose values are to be substituted in the text at the given points. In this
example we are writing out the value of the numeric (integer) variable t1. The variable is named in
the variable list section of the statement and is represented by the characters %2i in the format
section.
There are three parts to the variable reference. The % sign signals to Quantum that it has reached
a variable reference: all references start with a % sign. The i says that the variable is an integer
variable and the 2 says how many print positions to reserve for printing this variable. In the example
two positions are reserved for printing the value of t1, but since the value of t1 is only 5, Quantum
prints the value on the right of the reserved space and fills the remaining positions with spaces. In
the sample output we have used an underscore to represent this space.
Here is another example using two integer variables:
call qfprnt(0,$Record %4i tested %2i products$,recnum,t1)
As before, the underscore represents a space used to pad a value to the full field width.
This qfprnt statement produces the correct results because the variables are in the same order as
their references in the format section. This is your responsibility. As long as a variable has the same
type as the reference in the corresponding position in the format section, Quantum will print its
value at that point in the statement. So, if we had written:
call qfprnt(0,$Record %4i tested %2i products$,t1,recnum)
As you can see, Quantum does not increase the number of print positions to accommodate the value
it needs to print. Instead, it prints asterisks. In this example, the asterisks would alert you to the fact
that there is something wrong with the qfprnt specification, but this would not always be so.
More often than not you'll be printing positive values. If Quantum needs to print a negative
number, it prints the minus sign directly in front of the first digit, just as you would write it
manually.
Besides integer variables, you can also print real variables, columns or fields of columns and blank
strings. You use a reference similar to the one you've seen for integer variables.
To print a real variable, type:
%num_pos.dec_plr
where num_pos is the number of print positions required and dec_pl is the number of decimal
places. As an example, the statement:
call qfprnt(0,$%5.2r liters bought$,liters)
prints the value of the real variable called liters in a field 5 positions wide. The value is printed with
two decimal places so, allowing for the decimal point, the maximum value that can be printed is
99.99:
15.27 liters bought
9.01 liters bought
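The integer and real references described so far can be imitated in a few lines of Python. This is a sketch for illustration only: qfprnt_sketch is a hypothetical name, it handles only the %Ni and %N.Dr references, and filling the whole field with asterisks when a value is too wide is an assumption (the text says only that asterisks are printed).

```python
import re


def qfprnt_sketch(fmt, *values):
    """Substitute values into %Ni (integer) and %N.Dr (real) references."""
    pending = list(values)

    def substitute(match):
        width = int(match.group(1))
        places = int(match.group(2) or 0)
        kind = match.group(3)
        value = pending.pop(0)          # values are consumed in format order
        if kind == "i":
            text = str(int(value))      # a minus sign sits directly before the digits
        else:
            text = f"{float(value):.{places}f}"
        if len(text) > width:
            return "*" * width          # assumption: the whole field is starred
        return text.rjust(width)        # right-justify, pad with spaces on the left

    return re.sub(r"%(\d+)(?:\.(\d+))?([ir])", substitute, fmt)


print(qfprnt_sketch("Number of products tested is: %2i", 5))
print(qfprnt_sketch("%5.2r liters bought", 9.01))
# Variables in the wrong order: 1234 does not fit in two positions
print(qfprnt_sketch("Record %4i tested %2i products", 5, 1234))
```

As in Quantum, the sketch never widens a field to fit a value, so listing the variables in the wrong order shows up as asterisks only when a value happens to be too wide.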
Quantum can also print the text values of a column, a field of columns or a data variable. By this
we mean that Quantum converts multicodes to letters or other keyboard characters before printing
them. Multicodes that do not correspond to letters or characters are printed as asterisks. For
example, the multicode &1 translates into the letter A and would be printed as such; the multicode
&123 is simply a collection of codes and would therefore be printed as an asterisk.
To print single columns, type:
%numberc
in the format section, where number is the number of print positions required, and the name of a
single column in the corresponding position in the variable list. Quantum will then print number
columns starting at the named column. For example:
call qfprnt(0,$Record %4c tested %2i products$,c1,t1)
might produce:
Record 1234 tested  5 products
The statement:
call qfprnt(0,$Columns 11 to 20 are %10c: $,c11)
To replace certain codes in one column with those from a second column.
To copy codes from groups of columns into another column using the logical operators and, or
and xor.
In spite of the diversity of these functions the basic format of any assignment statement is:
variable=item
where item defines what is to be copied into the variable.
Remember that comments can be identified by an uppercase C in column 1. If the first variable in
your statement starts with a C, make sure that you type it in lower case otherwise the whole line
will be read as a comment and will be ignored. For example:
c(15,16)=$12$
is correct, but
C(15,16)=$12$
is read as a comment and ignored.
Alternatively, you may precede assignment statements with the word set, thus:
set c(15,16)=$12$
Copying codes
Quick Reference
To copy codes into a single data variable, overwriting the variable's original contents, type:
variable=codes
To copy a string of codes into a field, type:
var_name(start,end)=$codes$
To copy the contents of one variable or field into another, type:
variable1 = variable2
Assignment statements are most commonly used to copy codes into a column or to copy the
contents of one variable into another. For instance:
c121='159'
c121=c134
In the first example we are copying the codes 1, 5 and 9 into column 121 overwriting whatever is
already there. The second example copies everything in column 134 into column 121, again
overwriting what was originally there. Column 134 remains unchanged.
You can also copy strings of characters into fields of columns. Let's say we want to copy the code
59642 into columns 76 to 80 of card 3; we would write:
c(376,380)=$59642$
Notice that the characters to be copied into the array are enclosed in dollar signs as is the rule when
dealing with strings.
If you need to use a semicolon in a string, you must type it as:
\;
Quantum uses a semicolon to mark the end of a statement, and will issue an error message if it finds
a semicolon by itself in the middle of a string. The backslash in front of the semicolon tells
Quantum to read the next character as an ordinary character with no special meaning. For example:
c(376,380)=$59\;42$
When characters are being copied into columns, the equals sign may be omitted:
c10'4'
is the same as
c10='4'
c(11,14)$6353$
is the same as
c(11,14)=$6353$
Just as the contents of a single column can be copied into another, so the contents of one field can
be copied into another field. For example:
c(10,19)=c(70,79)
or
c(20,22)=c(45,47)
copies the contents of c(70,79) into c(10,19) and the contents of c(45,47) into c(20,22), in both
cases overwriting the original contents of those columns.
Data variables in assignment statements may be subscripted. The following are valid:
c(t1)=c145
c(178,180)=c(t4,t5)
c(t3,t5)=c(t10,t10+2)
means:
c(120,122)=c(240,242)
Generally you will know how many characters are required to hold the information they will
receive, but this is not always the case. What if the field on the left of the equals sign is longer than
the string to be copied into it? Quantum always copies a string starting with the right-most column
and transferring it into the right-most column of the field. It continues in this way until all
characters have been copied, then if there are still columns left in the field they are reset to blanks.
When strings are copied in this way they are called right-justified and blank-padded.
...
---4----+----5
84635
and we enter:
c(241,245)=c(185,187)
...
---4----+----5
100
If there are fewer characters than there are columns in the field, the characters are right-justified in
the field with the remaining columns set to blanks. If the reverse is true, and there are more
characters than there are columns in the field, the error message Attempt to set too many columns
into too few columns is issued.
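The right-justified, blank-padded rule can be sketched in Python as follows. This is an illustration rather than Quantum code: copy_into_field is a hypothetical name, and a field is modeled as a plain string with one character per column.

```python
def copy_into_field(characters, field_width):
    """Return the field contents after copying characters into it.

    Characters are placed from the right; unused columns on the left
    become blanks. Too many characters is an error, as in Quantum.
    """
    if len(characters) > field_width:
        raise ValueError("Attempt to set too many columns into too few columns")
    return characters.rjust(field_width)


# Copying a three-column value into a five-column field
print(repr(copy_into_field("100", 5)))
```

Raising an error instead of silently truncating mirrors the behavior described above, where Quantum issues a message rather than dropping characters.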
Columns in assignment statements may overlap; for instance:
c(145,150)=c(143,148)
copies the contents of columns 143 to 148 into columns 145 to 150, so:
----+----5
83645902
becomes
----+----5
83836459
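The overlapping copy behaves as if the source columns were read in full before any destination column is changed. A minimal Python sketch of that reading, with the hypothetical helper copy_columns and the record modeled as a string whose first character is column first_col:

```python
def copy_columns(data, first_col, source, destination):
    """Copy the columns in source=(start, end) over destination=(start, end)."""
    cols = list(data)
    s_from, s_to = source[0] - first_col, source[1] - first_col + 1
    d_from, d_to = destination[0] - first_col, destination[1] - first_col + 1
    # The slice on the right is a separate copy, so overlap is harmless.
    cols[d_from:d_to] = cols[s_from:s_to]
    return "".join(cols)


# Columns 143 to 150 hold 83645902; c(145,150)=c(143,148)
print(copy_columns("83645902", 143, (143, 148), (145, 150)))
```

The sketch assumes both fields have the same length, as they do in the example.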
When a field is set to blanks it is never wrong to type in as many blanks (enclosed in dollar signs)
as there are columns in the field, but it is much quicker and more efficient to type, say:
c(301,380)=$ $
Assignment statements are also used to replace parts of one column with those of another, leaving
the remaining contents of that column intact. Note that this is the only time that assignment does
not overwrite everything in the recipient variable. Let's start with a simple example. Suppose we
have:
----+----3
3
/
7
...
----+----6
6
/
8
and we want column 124 to contain a 1 only if column 159 contains a 7. We would write:
c124'1'=c159'7'
...
----+----6
6
/
8
However, if we wrote:
c124'3'=c159'3'
meaning that c124 should only contain a 3 if c159 contains a 3, Quantum would give us:
----+----3
4
/
7
...
----+----6
6
/
8
As you can see, the 3 in c124 has been deleted because there is no 3 in c159. Both examples
could equally well be written using if, else, emit and delete, but an assignment statement is much
more efficient when you have a set of codes to check for.
For further information about if, see section 9.1, Statements of condition if.
For further information about else, see section 9.2, Statements of condition else.
For further information about emit, see section 8.2, Adding codes into a column.
For further information about delete, see section 8.3, Deleting codes from a column.
Column 10 contains a 1 and a 2 because c11 contains a 4 and a 5. The 3 that was originally
there has been removed because there was no 6 in c11. The 4 in column 10 remains untouched
because it has no corresponding code in c11.
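Partial assignment can be pictured as a code-by-code pairing: each code named on the left is set or cleared according to whether its partner on the right is present, and codes not named on the left are left alone. The Python sketch below models a column as a set of codes. The name partial_assign is hypothetical, and the starting contents of column 10 (a 3 and a 4) are assumed for the purpose of the illustration.

```python
def partial_assign(target, target_codes, source, source_codes):
    """Model cA'target_codes' = cB'source_codes' on sets of codes."""
    if len(target_codes) != len(source_codes):
        raise ValueError("both sides must name the same number of codes")
    result = set(target)
    for t_code, s_code in zip(target_codes, source_codes):
        if s_code in source:
            result.add(t_code)       # partner present: code is set
        else:
            result.discard(t_code)   # partner absent: code is removed
    return result


c10 = {"3", "4"}          # assumed starting contents
c11 = {"4", "5"}
c10 = partial_assign(c10, "123", c11, "456")
print(sorted(c10))
```

Because only the named codes are touched, the 4 in column 10 survives even though the 3 is removed.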
Partial assignment need not have different column numbers either side of the equals sign. Quantum
accepts statements of the form:
c127'0/3' = c127'1/4'
which can be used for recoding incorrectly coded data. The example we have used will recode a
0 in column 127 as a 1, a 1 in column 127 as a 2, and so on.
When entering codes with this type of statement, make sure that there are the same number of codes
on either side of the equals sign and that they are in the same relative positions in the order
&-0123456789. In the previous example we used 123 and 456. We could also have used &-1,
789 or 234 instead of 456, to name but a few alternatives. The important thing is that the two
groups follow the same pattern: if the first set names alternate codes (for example, 1357) then so
must the second (for example, &024).
The following statements are valid:
c21&0=c92456
c2105=c8649
The statement for columns 56 and 91 is incorrect because blank is not a valid code here; the
statement for columns 78 and 81 is wrong because the codes 367 cannot be superimposed on
123 (either 345 or 567 would be correct).
In many of your Quantum programs you will need to save the result of some arithmetic expression
in a variable. The variable may be a column or an integer or real variable and the arithmetic
information may be the contents of a column, integer or real variable, an integer or real number, or
the results of the functions numb or random. It can also include arithmetic expressions which have
been manipulated using the arithmetic operators +, -, / and *. Here are some examples to start with:
var1=100
/* Next statement expects that variable ntim is < 10
c135=ntim
/* In next example, if c31'5678', variable np=4
np=numb(c31)
/* Increment rect (record total) by 1 for each record processed
rect=rect+1
Copying a number into an integer or real variable is easy because the variable has no predetermined
size; that is, Quantum does not say that such variables may only store numbers of up to, say, three
digits. Integer variables can store any whole number in the range +2,147,483,648 to -2,147,483,647
and real variables may take values of any magnitude with six digits accuracy.
Suppose our questionnaire tells us how many pints of milk a respondent bought and we want to
save this in an integer variable called npt. Here's what we might write:
npt=c(125,126)
Similarly, if we know how many miles the respondent travels to work each day, and we want to
convert this to kilometers, we could save the conversion in a real variable called km:
km=c(213,214) * 1.609
If the respondent travels 5 miles, km will have the value 8.045, but if he or she travels 9 miles, km
would be 14.481.
The main difference between the two examples is the type of variable in which the results are saved.
The number of pints bought will always be a whole number so we save it in an integer variable,
whereas the conversion from miles to kilometers is likely to produce a real number so we save it in
a real variable.
Real values placed in an integer variable lose their decimal places:
t1=2.5 + 3.4
gives
t1=5
but integer values placed in a real variable are saved as reals with decimal places and accuracy to
6 significant figures:
x1=1 + 7
gives
x1=8.0
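The same two conversions can be reproduced in Python as a rough analogy. Python's int() drops the fractional part, which matches the result shown for t1. Python floats carry more than six significant figures, so the precision limit itself is not modeled here.

```python
# A real result stored in an integer variable loses its decimal places.
t1 = int(2.5 + 3.4)

# An integer result stored in a real variable gains a decimal part.
x1 = float(1 + 7)

print(t1, x1)
```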
Integer variables are often used to count the number of respondents having a specific characteristic.
For instance, to count the number of respondents holidaying at home and the number taking
holidays abroad we can say,
/* Home is c113'1'; abroad is c113'2'; both is c113'12'
if (c113'1') home=home+1
if (c113'2') abroad=abroad+1
This example uses the if statement that is described in chapter 9, Flow control.
Whenever a record is read with c113'1', the variable home will be incremented by one and
whenever a record is read with c113'2' the variable abroad will be increased by 1.
Let's say we have five respondents who took the following holidays:
Respondent 1   Home         c113'1'
Respondent 2   Both         c113'12'
Respondent 3   Home         c113'1'
Respondent 4   No holiday   c113' '
Respondent 5   Abroad       c113'2'
At the start of the run, the variables home and abroad are both zero. After these records have been
processed, home will equal 3 and abroad will be 2. The person unlucky enough to have no holiday
at all will be ignored.
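The running totals can be checked with a short Python sketch. Column 113 of each record is modeled as a set of codes; the variable names follow the example, but the code is an illustration and not Quantum syntax.

```python
# Column 113 for the five respondents, one set of codes per record
records = [{"1"}, {"1", "2"}, {"1"}, set(), {"2"}]

home = 0
abroad = 0
for c113 in records:
    if "1" in c113:      # holiday at home
        home += 1
    if "2" in c113:      # holiday abroad
        abroad += 1

print(home, abroad)
```

The two tests are independent, so a respondent coded with both a 1 and a 2 adds to both totals.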
In the example above we were accumulating information about holiday habits for all respondents
together, but on many occasions you will want to store information on a per respondent basis
instead. Normally, integer and real variables are not reset between respondents, but all you need
do to overcome this is to enter a statement at the start of your edit to reset the variable in question
to zero each time a new record is read. For instance:
home=0
We will discuss in more detail the times when you might want to do this when we describe the
do statement in section 9.5, Loops.
Columns which contain single codes may be treated as a whole number. For instance, if our data is:
+----2----+
4922
the statement:
value=c(219,222)
will assign the value 4922 to value. If any of the columns are blank or multicoded in any way, they
are ignored.
+----2----+
49 2
and
+----2----+
4912
2
Columns
Columns may also store arithmetic information, but unlike other variables they have a predefined
size which means they can only store numbers of a certain size. For instance, c(1,10) can store
numbers of up to ten digits whereas c(1,3) only stores numbers of up to three digits.
If the number is negative Quantum places the minus sign in the column immediately to the left of
the first digit, but if there are no spare columns the first digit will be dropped and the minus sign
placed in the left-hand column. If t5=-278, the statement:
c(46,49)=t5
gives
4----+----5
      -278
but:
c(47,49)=t5
yields
4----+----5
       -78
Note that this does not hold true for negative numbers whose length exceeds the field width by
more than one character. Then, the number is copied into the field from the right and the minus sign
and any excess digits are ignored. Thus, if t5=-1278, c(42,44) will contain the number 278.
If the value to be saved has fewer digits than there are columns in the field, it will be right-justified
in the field and the remaining columns padded with zeros.
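The rules for storing a whole number in a field of columns can be collected into one Python sketch. The function name store_int is hypothetical. Two cases are not spelled out in the text and are assumptions here: a positive number that is too long keeps its right-most digits, and a negative number with spare columns is padded with blanks to the left of the minus sign.

```python
def store_int(value, width):
    """Return the characters placed in a field of the given width."""
    digits = str(abs(value))
    if value >= 0:
        if len(digits) <= width:
            return digits.rjust(width, "0")   # right-justify, pad with zeros
        return digits[-width:]                # assumption: keep right-most digits
    needed = len(digits) + 1                  # digits plus the minus sign
    if needed <= width:
        return ("-" + digits).rjust(width)    # assumption: blanks left of the sign
    if needed == width + 1:
        return "-" + digits[1:]               # first digit dropped to make room
    return digits[-width:]                    # sign and excess digits ignored


print(store_int(-278, 4), store_int(-278, 3), store_int(-1278, 3))
```

The middle case is the one to watch for in real data: a value of -278 stored in three columns reads back as -78, with no warning.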
Here are some more examples:
/* Room to store values of t60 between -99 and +999
c(110,112)=t60
/* visits*4 should be between -999 and +9999, otherwise truncated
c(34,37)=visits*4
/* Result never truncated since maximum value is 81
c(10,11)=c7*c8
/* Total holidays taken
c(224,230)=home + abroad
/* Count the number of codes
pch=numb(c21,c22,c23'1/5',c24'1/9')
When copying real numbers into columns, Quantum needs to know how many decimal places are
required. This is done by following the variable with a colon and a digit defining the number of
places. For example, if x5=10.22, the statement:
cx(15,19):2=x5
results in:
----+----2
    10.22
If the real number has more decimal places than we have allowed for, say 3 instead of 2, the extra
decimal places will be ignored.
The final type of assignment is copying codes from a set of columns. The codes copied depend
upon the type of operator used:
and
or
xor
Suppose we have:
----+----4
111
/22
453
77
and we type:
c181=and(c137,c138,c139)
Notice that even though the codes 3 and 7 appear in more than one column they are not copied
to c181 because they are not common to all columns.
Let's take the same three columns with the or operator. We type:
c182=or(c137,c138,c139)
c182 contains a list of all codes present in at least one of the named columns.
yields:
----+----4 ... ---8----+
111
4
/22
5
453
77
Here only two codes have been copied because all other codes appear in more than one column. If
one column was blank, this would be ignored if there were other codes unique to one column. Only
if there were no other unique codes would column 183 be blank. For instance, if we have
c11=' ', c12='12', c13='13' and we type:
c14=xor(c11,c12,c13)
we would have c14='23', but if c13 were to contain a '12' instead, c14 would be blank.
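The three operators correspond closely to set operations, which the following Python sketch uses to model them. The function names are hypothetical, and each column is represented as a set of codes with a blank column as the empty set.

```python
from collections import Counter


def and_codes(*columns):
    """Codes present in every column."""
    return set.intersection(*(set(col) for col in columns))


def or_codes(*columns):
    """Codes present in at least one column."""
    return set().union(*columns)


def xor_codes(*columns):
    """Codes present in exactly one column."""
    counts = Counter(code for col in columns for code in set(col))
    return {code for code, seen in counts.items() if seen == 1}


c11, c12, c13 = set(), {"1", "2"}, {"1", "3"}
print(sorted(xor_codes(c11, c12, c13)))
```

Note that xor here means "unique to one column", which for three or more columns is not the same as a chained pairwise exclusive-or.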
All our examples so far have referred to whole columns, but sometimes you will only be interested
in specific codes in those columns. To write this in Quantum, follow each column number with the
positions to be checked enclosed in single quotes. Any unnamed codes in those columns are then
automatically ignored. Here is an example. Our data is:
----+----4----+----5
1
1
2
/
3
/
5
5
6
...
8----+----9
3
Even though columns 31, 41 and 45 all contain a 3 and a 5, Quantum only copies the 3 because
the 5 is not part of our specification. We have used the same code specification for all three
columns, but you can use whatever combination you like.
These types of statement are extremely useful for setting up shorthand references to the codes
present in a group of columns. Say, for instance, that you wanted various statements
throughout the edit to be executed only if there was a 1 in one or more of c110, c112, c120
and c125. You can always write out each column and code separately each time:
if(c110'1'.or.c112'1'.or.c120'1'.or.c125'1') .....
especially if you will need to refer to the contents of these columns again later on in the edit.
This facility may also be used to simplify what would otherwise be complicated filter
conditions in the tabulation section.
The emit statement inserts codes into a column leaving the original contents intact. Its format is:
emit cn'p'
Suppose we have:
----+----7
4
5
&
More than one column may be entered on each line, provided that each one is separated by a
comma.
emit c567'7', c110'2', c(t5+6)'7'
emit can only be used with single columns; string variables are not valid: emit c(109,110)$99$
does not work.
The delete statement is the opposite of emit in that it deletes codes from a column leaving the
remainder intact. Its format is:
delete cn'p'
Suppose we have:
+----1----+
5
6
8
9
More than one deletion may be effected with the same delete statement as long as each column is
separated by a comma.
delete c110'5', c(t1+3)'6', c179'56'
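Treating a column as a set of codes, emit and delete behave like adding to and removing from that set. The Python sketch below is an illustration only; the function names simply echo the Quantum keywords and the sample column contents are made up.

```python
def emit(column, codes):
    """Add codes to a column, keeping whatever is already there."""
    return set(column) | set(codes)


def delete(column, codes):
    """Remove codes from a column, keeping everything else."""
    return set(column) - set(codes)


column = {"4", "5"}                 # hypothetical starting contents
column = emit(column, "&")          # now holds 4, 5 and &
column = delete(column, "5")        # now holds 4 and &
print(sorted(column))
```

Emitting a code that is already present, or deleting one that is absent, leaves the column unchanged in this model.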
Sometimes when you are cleaning your data you will come across a column which is multicoded
when it ought to contain only one code. You can either print out the record and change the incorrect
codes later or you can have Quantum do it for you automatically. When data is to be corrected
automatically, you will need to write a statement saying which codes should be discarded and
which are to be kept. Obviously, there can be no hard and fast rule since the codes may vary
between questionnaires, so what you may do is assign each code a priority so that when a certain
code is found Quantum knows that all others in that column are to be deleted.
The statement used for this is:
priority cn'code1', 'code2', ... 'coden' [,cn2'code1a', 'code2a', 'code3a', ... ]
where cn is the column whose codes are to be checked and code1 to coden are the positions to
check, entered in order of priority, the most important first.
priority checks only the listed positions; if any other codes are present they are ignored.
Let's work through an example to clarify this.
Suppose one of the questions in a survey asks respondents to give their overall opinion of a product,
rated on a scale of 1 (Poor) to 5 (Excellent). You have been told that if the question has accidentally
been multicoded you are to assume that the higher rating is correct and delete the lower rating from
the column. You will not know beforehand exactly what multicodes there are, if any, but you will
know the column and the possible codes it may contain, and also that low codes should be discarded
in favor of high ones. If this question is coded into column 249, you could write:
priority c249'5', '4', '3', '2', '1'
This causes Quantum to scan column 249 to see first whether it contains a 5 and, if so, to delete
all subsequent codes in the list. If c249 contains a 5 and nothing else, obviously there will be no
extra codes to delete; this does not matter. If there is no 5 in c249, Quantum then checks whether
it contains a 4; if so, any other codes in the range 1/3 are deleted, otherwise the program skips
to the next code in the list and checks for that. If none of the listed codes are found, the column
remains unchanged.
If our first record has c249'53' Quantum will give us c249'5', but if the second has c249'942' we
will end up with c249'94'; the 9 has not been removed because it was not one of the named
positions.
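The scan described above can be written as a short loop. This Python sketch covers the single-column case only; the function name priority_sketch is hypothetical and the column is modeled as a set of codes.

```python
def priority_sketch(column, ordered_codes):
    """Keep the highest-priority listed code; drop lower-priority listed codes.

    Codes that are not listed are never touched.
    """
    column = set(column)
    for position, code in enumerate(ordered_codes):
        if code in column:
            lower = set(ordered_codes[position + 1:])
            return column - lower
    return column                     # none of the listed codes found


print(sorted(priority_sketch({"5", "3"}, "54321")))
print(sorted(priority_sketch({"9", "4", "2"}, "54321")))
```

Listing the codes in the opposite order, 12345, would keep the lowest rating instead of the highest.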
You can also use priority to force a field to be single-coded simply by listing the columns and codes
to be checked in order of importance. If a listed code is found in the first column, any other listed
codes will be removed from that column, as will any that appear in subsequent columns. For
example, if our record is:
-----+----6
22
3
5
and we write:
priority c55'2', '3', '4', c56'1', '2', '3', '4', '5'
However:
-----+----6
22
3
&
would become
-----+----6
2&
In the previous example, we have named two different columns on the same priority statement
because together they form a field which must be single coded overall. If you want to force two
completely separate columns to be single-coded, you must write two priority statements, one for
each column. If our data is:
+----3----+
21
33
6
the statement
priority c129'1','2','3',c130'1','2','3'
leaves only the 2 in column 129 and the 6, which is not a listed code, but:
priority c129'1', '2', '3'
priority c130'1', '2', '3'
results in:
+----3----+
21
6
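The difference between one priority statement that spans a field and separate statements for each column can be sketched in Python. This is an illustrative model only: the function name, the dictionary layout and the assumed contents of columns 55 and 56 (a 2 and a 3 in column 55, a 2 and an & in column 56) are assumptions, not part of Quantum.

```python
def apply_priority_field(record, listed):
    """record: dict of column -> set of codes.
    listed: (column, code) pairs in priority order, possibly spanning columns."""
    result = {col: set(codes) for col, codes in record.items()}
    for i, (col, code) in enumerate(listed):
        if code in result.get(col, set()):
            # Every listed code after the first one found is deleted,
            # whichever column it belongs to.
            for later_col, later_code in listed[i + 1:]:
                result.get(later_col, set()).discard(later_code)
            break
    return result


# Model of: priority c55'2','3','4', c56'1','2','3','4','5'
listed = [(55, c) for c in "234"] + [(56, c) for c in "12345"]
after = apply_priority_field({55: {"2", "3"}, 56: {"2", "&"}}, listed)
```

With the assumed record, column 55 is left with the 2 and column 56 with the &, because & is not a listed code.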
Occasionally you may wish to set a random code into a column, perhaps because the code in that
column is incorrect. To do this, write:
cvar = rpunch('p')
where cvar is the column into which one of the codes p is to go. For example:
c115 = rpunch('1/5')
Once this statement has been executed, column 115 will contain one of the codes 1 to 5.
On some studies you will find responses which are represented by numbers rather than codes. There
are various methods of checking and tabulating these responses. Which one you use depends on
whether you want to know the number of respondents whose record contains a given code in a field
or group of fields, or the number of times a code appears in a group of fields.
To illustrate this, let's suppose the question and response list in the questionnaire are as follows:
Q6A: Which films did you see on your last three visits to the cinema?
                                (12-13)   (14-15)   (16-17)
Columbus ...................       01        01        01
Aliens 3 ...................       02        02        02
Pretty Woman ...............       03        03        03
Green Card .................       04        04        04
Batman 2 ...................       05        05        05
If you want a table which shows how many people saw each film, one way of tabulating this data
is to use a fld statement in the axis which tells Quantum which columns to read and which codes
represent each film.
For information about the fld statement, see section 4.3, Responses with numeric codes: fld,
in the Quantum User's Guide Volume 2.
Another way is to use a combination of field in the edit and bit in the axis. This is particularly
efficient if, rather than wanting to count the number of people who saw each film, you want to count
the number of times each film was seen.
The field statement counts the number of times a particular code appears in a list of fields for each
respondent. It stores these counts in an integer array that consists of as many cells as there are codes
to count. In the films example, the array will have five cells. Cell 1 will hold the number of times
code 01 appears in the fields c(12,13), c(14,15) and c(16,17). If the respondent saw Green Card
then Batman 2 and then Green Card again, his/her data will be:
1----+----2
  040504
Cell 4 (Green Card) of the array will be set to 2, and cell 5 (Batman 2) of the array will be set to 1.
You can then tabulate the contents of this array using a bit statement in the axis.
The format of the field statement is:
field output_array = column_specs [,special_specs]
output_array is the name of the array in which you wish to store the counts of responses. You can
use spare columns in the C array, but you may find your program is easier to read if you define an
integer array of your own with a name which reflects the type of information it contains. For
example, if you want an integer array called films, you might write:
int films 5s
ed
field films = .....
When you define the integer array, make sure that you request as many cells as there are codes in
the data. In this example there are five films so you define the array as having five cells. Quantum
automatically creates an extra cell (cell 0) which it uses to count responses for which there is no
cell allocated. If there were six films, for example, Quantum would increment cell 0 each time it
found code 06 in the films columns. You might like to check the value of this cell as a means of
reporting on invalid codes:
if (films0 .gt. 0) write c(1,20) $Bad film code$
Negative and zero values also cause cell zero to be incremented. Codes which are shorter than the
field width are accepted as long as they are left-padded with blanks or zeros. Codes which are
shorter than the field width and which are right-padded with blanks only increment cell zero.
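The counting rules above, including the use of cell 0, can be modeled with a short Python sketch. It is not Quantum code. The function name field_counts and its arguments are hypothetical, and the treatment of a wholly blank field (skipped here) is an assumption, since the text does not say what Quantum does in that case.

```python
def field_counts(record, fields, n_codes):
    """Count numeric codes found in the listed fields.

    record  : the card image as a string (column 1 is record[0])
    fields  : (start_column, end_column) pairs, 1-based and inclusive
    n_codes : number of cells requested; cell 0 is the extra catch-all
    """
    cells = [0] * (n_codes + 1)
    for start, end in fields:
        text = record[start - 1:end]
        if text.strip() == "":
            continue  # assumption: a wholly blank field is not counted
        stripped = text.lstrip(" ")  # left padding with blanks is accepted
        if stripped.isdigit() and 1 <= int(stripped) <= n_codes:
            cells[int(stripped)] += 1
        else:
            # Zero, negative, out-of-range and right-padded values go to cell 0.
            cells[0] += 1
    return cells


film_fields = [(12, 13), (14, 15), (16, 17)]
record = " " * 11 + "040504"  # Green Card, Batman 2, Green Card
films = field_counts(record, film_fields, 5)
```

For the record shown, cell 4 holds 2 and cell 5 holds 1. A code 06 in any of the three fields would increment cell 0 instead.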
The column_specs part of the statement defines the columns to read. You have a number of choices
here. First, you may list each column or field reference one after the other, separated by commas.
The list must be enclosed in parentheses. In our example this would be:
field films = (c(12,13), c(14,15), c(16,17))
Second, if you have sequential fields as you do here, you can type the start columns of each field
followed by the field length. The list of start columns is separated by commas and enclosed in
parentheses, and the field length comes after the closing parenthesis and starts with a colon. If you
use this notation for the film example you would write:
field films = (c12, c14, c16) :2
If you wish, you can abbreviate this further by typing just the start columns of the first and last
fields, followed by the field length.
field films = c12, c16 :2
Third, if the fields are not sequential, you list the start columns and field width of each group of
columns (as shown above) and separate each group with a slash. For example, to read data from
columns 12 to 17 and 52 to 57, with each field being two columns wide, you would type:
field films = c12, c16 / c52, c56 :2
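The abbreviated notations all describe the same thing: a list of equal-width fields. The Python sketch below expands the start columns of the first and last field in each group, plus the field length, into explicit column ranges. The function name expand_fields and its argument layout are assumptions made for illustration.

```python
def expand_fields(groups, width):
    """groups: (first_start_column, last_start_column) pairs, one per group.
    width : the field length given after the colon."""
    fields = []
    for first_start, last_start in groups:
        # Fields in a group are sequential, so start columns step by the width.
        for start in range(first_start, last_start + 1, width):
            fields.append((start, start + width - 1))
    return fields


one_group = expand_fields([(12, 16)], 2)             # c12, c16 :2
two_groups = expand_fields([(12, 16), (52, 56)], 2)  # c12, c16 / c52, c56 :2
```

Under this reading, the first call yields the fields c(12,13), c(14,15) and c(16,17), and the second adds c(52,53), c(54,55) and c(56,57).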
If you want to count more than one non-numeric code, list each one individually, separated by
commas.
To tabulate data counted by a field statement, you use a bit statement which names the integer
array you have created and defines the element texts associated with each cell of the array.
For further information about the bit statement, see section 4.4, Responses with numeric
codes: bit, in the Quantum User's Guide Volume 2.
Quantum normally resets the cells of the integer array to zero at the start of each record. If you want
counts to continue from one record to another, use a fieldadd statement instead of field. For
example:
fieldadd films = (c12, c14, c16) :2
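The difference between the two statements is only in when the array is reset. The Python sketch below models it with two hypothetical respondents whose film codes have already been read as integers; the names count_into, field_cells and fieldadd_cells are illustrative and are not Quantum identifiers.

```python
def count_into(cells, codes):
    """Add one to the cell for each code; unallocated codes go to cell 0."""
    for code in codes:
        cells[code if 1 <= code < len(cells) else 0] += 1


records = [[4, 5, 4], [1, 4, 2]]  # hypothetical film codes for two respondents

field_cells = []            # one array per record, as with field
fieldadd_cells = [0] * 6    # a single running array, as with fieldadd
for codes in records:
    per_record = [0] * 6    # field: cells start from zero for each record
    count_into(per_record, codes)
    field_cells.append(per_record)
    count_into(fieldadd_cells, codes)  # fieldadd: counts carry over
```

After both records, the running array holds 3 in cell 4 (Green Card), whereas the per-record arrays hold 2 and 1 respectively.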
The advantage of using field or fieldadd is that they automatically count the number of times
a code appears in a list of fields. If you want a table which uses this information, you just tell
Quantum to increment the counts in the table by the values stored in the appropriate cells of
the array.
You can also manipulate the values stored in the cells before you tabulate the data. For
example, if you had codes for Aliens 1, 2 and 3, you might wish to merge them into a single
cell for all Aliens films so that the tabulation spec is easier to write.
Data variables are reset to blank, integer variables are reset to 0 and real variables are reset to 0.0.
Variables can also be cleared using assignment statements (e.g., t1=0), but there are advantages to
using clear instead. Firstly, clear is much easier to write. Secondly, with clear the compiler checks
that the subscripts are in the correct range (e.g., 1 to 33 if myarray has only 33 cells); this is not
possible with the loop method because the subscript is a variable. However, if you use variables as
subscripts with clear (e.g., clear c(t1,t1+5)), subscript checking once again cannot be done.
Quantum normally terminates if it detects that you are writing beyond the end of an array. For
example:
int number 10s
ed
do 5 t1=1,12,1
number(t1)=c(132,135)*t1
5 continue
Here, we have defined an integer array called number as having 10 cells. When Quantum reads
the assignment statement and detects that it refers to number(11) it will terminate because there
are only 10 cells in the array, not 11. The same would be true for statements which referred to, say,
t201 when the size of the T array had not been extended past the default of 200 cells.
The exceptions to this are emit, delete, partial column moves and reads from fetch files.
emit, delete and partial column moves are discussed earlier in this chapter. For further
information about fetch files, see The fetch statement in chapter 13, Using subroutines in
the edit.
While they may save you time in the long run, these checks do mean that your job will run slightly
slower than it otherwise would.
If you wish to run without these checks, insert a nobounds statement near the start of the edit.
You may use a *set statement in the data file to assign a value to a T variable. Its format is:
*set tn = value
where n is a number between 1 and 200 (unless you have increased the number of T-variables).
The statement must start in column 1. You may type set in upper or lower case, and may follow
it with any number of spaces. If Quantum reads anything that it cannot interpret as a T variable, it
terminates the run immediately.
This facility is available in all jobs with or without levels (trailer cards). You may use it as many
times as you need throughout the data file to assign different values to the same T-variable, or to
assign different values to a number of T-variables.
9 Flow control
Statements in the edit section are usually dealt with in the order in which they occur in the program.
Quantum provides statements which may be used to alter this normal order of execution, for
example, by missing out a statement or repeating a group of statements a number of times.
The if statement has exactly the same meaning as in English; it defines a statement whose execution
depends upon the value of a logical expression. Let's first take an English sentence to explain this:
we might say "If it is raining, I will take my umbrella". Here, the statement is "I will take my
umbrella" and it depends upon the logical expression "It is raining". If the expression is true (i.e., it
is raining), the statement is executed (I take my umbrella); if it is false (no rain) it is ignored (I don't
even think about my umbrella).
Now let's take a Quantum sentence. We have a shopping survey in which respondents have been
asked to name the supermarkets in which they shop at least once a week. These responses are coded
into column 21 of card 1, and we want to keep a count of the number of respondents shopping in
Safeway (code 4). Our sentence would say "If column 21 contains a 4, increment our counter by 1".
A Quantum if statement consists of three items:
1. The keyword if.
2. The logical expression whose value controls the action to be taken, enclosed in parentheses.
3. The statement(s) to be executed if the expression is true.
For further information about logical expressions, see section 5.2, Logical expressions.
Thus, to translate our sentence into the Quantum language, we would write:
if (c121'4') safe=safe+1
The logical expression to be tested states that the number of codes in columns 10, 11 and 12 is
greater than three. If it is true, and there are, say, 5 codes altogether in those columns, we will add
a 9 into column 20 in addition to what is already there. On the other hand, if it there are 3 or fewer
codes in that field we leave column 20 as it is and continue with the statement on the line
immediately after the if. For instance:
+----1----+----2----+
621
0
/
4
yields
+----1----+----2----+
621
0
/
4
9
but:
+----1----+----2----+
21
0
/
4
yields
+----1----+----2----+
21
0
/
4
Once the emit statement has been executed, Quantum continues with the statement on the next line.
The statement to be executed if the expression is true may be any Quantum statement, even another
if. For example:
if (c130'1'); if (c131'9') c181'19'
says if c130 contains a 1, and then if c131 contains a 9, then put the multicode '19' in c181.
This statement is not incorrect, but it can be more efficiently written as:
if (c130'1'.and.c131'9') c181'19'
The if keyword may be followed by a whole series of statements as long as each one is separated
by a semicolon. These statements will then be executed in the order in which they appear. For
example:
if (t4.le.5) c235'45'; emit c567'2'; delete c789'0'
This says, if the value of t4 is less than or equal to 5, put the multicode 45 in column 235
overwriting whatever is there already, then add a 2 into column 567 and, finally, remove the 0
from column 789.
You cannot switch missing values processing on or off with an if statement. A missingincs
statement is always executed wherever it appears in the edit. This means that although the
compiler will accept statements of the form:
if (....) missingincs 1
Quantum will, in fact, switch on missingincs for the rest of the edit or until a missingincs 0
statement is read. It does not switch on missingincs selectively for only those records that
satisfy the expression defined by the if clause.
For further information about missingincs, see section 12.6, Missing values in numeric
fields.
In Quantum the keyword else means "otherwise". In English we would say "If it's raining I'll take
the car, otherwise I'll walk"; in Quantum we write:
if (expression) statement(s); else; statement(s)
This says, if the expression is true, execute the statements immediately after the if, but if it is false,
execute those following the else. For example:
if (c76'4') t3=1; delete c76'3'; else; t3=2; emit c77'2'
Here, if c76 contains a 4, t3 is set to 1 and a 3 is deleted from c76. However, if c76 does not
contain a 4, t3 is set to 2 and a 2 is added into c77.
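For readers more familiar with general-purpose languages, the same branch can be written as the Python sketch below. It is a model only: the dictionary layout, with columns held as sets of codes, and the function name edit are assumptions.

```python
def edit(record):
    """record: dict with 'c76' and 'c77' as sets of codes and 't3' as an integer."""
    if "4" in record["c76"]:
        record["t3"] = 1
        record["c76"].discard("3")  # delete removes the code if it is present
    else:
        record["t3"] = 2
        record["c77"].add("2")      # emit adds to whatever is already there
    return record


with_four = edit({"c76": {"4", "3"}, "c77": set(), "t3": 0})
without_four = edit({"c76": {"1"}, "c77": {"5"}, "t3": 0})
```

Only one of the two branches runs for any record, exactly as with the statements on either side of else.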
The else keyword may only be used as part of an if statement and must be separated from the if by
at least a semicolon. Statements of the form:
if (c115'1'); else; emit c140'&'
are correct, but since action is only required if the expression is not true, it is more usual to write:
if (c115n'1') emit c140'&'
For example, the statement:
if (c121n'1') go to 50
causes Quantum to go immediately to the statement labeled 50 if column 121 does not contain a
1 (for example, the respondent did not buy Brand A soap powder). Any statements between this
if statement and statement 50 are ignored whenever a record is read where c121n'1' is true.
The statement labeled 50 may be any Quantum statement, but many people just write:
50 continue
to gather all respondents together before continuing through the rest of the program. This statement
is described in the next section. All labels must be attached to statements: a label by itself is an error
and Quantum will tell you so.
You may route forwards or backwards in your program, but when routing backwards, take care that
you are not creating a situation from which it is impossible to escape: the following will go on and
on forever if you let it:
10 t1=t1+1
- - other statements - -
go to 10
The only way to avoid situations like this is to make sure that somewhere between statement 10
and go to is another statement that routes you past the go to at some time, for example:
10 t1=t1+1
- - other statements - -
if (t1.gt.10) go to 15
go to 10
15 continue
9.4 continue
Quick Reference
Attach the keyword:
continue
to a label to mark a place in the edit.
This statement is a dummy statement whose sole purpose is to join various bits of a program
together. It is often used with a statement label as a destination for routing with go to, or to identify
the end of a loop.
To find out more about using continue with loops, see do with individually specified numeric
values in the following section.
9.5 Loops
Quick Reference
To define a set of repetitive statements, type:
do label_number int_variable=value_list
statements
label_number statement
Loops are extremely important structures because they enable the same set of basic statements to
be executed over and over again on a changing series of numbers, columns or codes. Their use can
reduce the work involved in checking data. The statement which introduces a loop is do which is
formatted as follows:
An integer variable (for numbers or columns) or a letter (for codes) whose value is to be used
by the statements in the loop.
An equals sign.
A list of whole numbers, integer variables or codes which are the values the integer variable or
letter is to take. These may be entered in two ways (see below).
Loops should be terminated by any statement other than go to, stop, return, another do or an if
containing any of these words. The main purpose of the terminating statement is to identify the end
of the loop and send the program back to the start of the loop. Go to and return send the record
elsewhere, stop terminates the run and another do indicates the start of another loop. The statement
most often used to terminate a loop is the dummy statement continue. Any statement that
terminates a loop must be preceded by a label number.
For information about the return statement, see section 9.7, Jumping to the tabulation
section.
The simplest way to define the values for the loop is to list them individually. In this case, values
must be whole numbers, separated by commas with the whole list enclosed in parentheses. For
example:
do 20 t5 = (125,130,140,145)
if (c(t5,t5+4).gt.3000) c(t5,t5+4)=$ $
20 continue
Before we discuss what this loop is doing, let's look at the way it has been written. The do statement
tells us three things, namely that the loop is terminated by the statement labeled 20, the integer
variable to be used is t5, and the statements within the loop are to be repeated four times (there are
four values in the list). The statement labeled 20 is continue which just sends Quantum back to do.
The purpose of this loop is to check whether the contents of four fields are greater than 3000, and
if so to reset those columns to blank. The first time through the loop, t5=125. When substituted into
the if statement it yields:
if (c(125,129).gt.3000) c(125,129)=$ $
The next statement is continue which sends us back to the top of the loop. t5 is now pointing to the
second value in the list, 130. The if statement reads:
if (c(130,134).gt.3000) c(130,134)=$ $
This process is repeated until t5 has taken all values in the list. There is no need to include
statements which check the value of t5 and jump out of the loop when the last value is reached:
Quantum keeps a count of how many values there are and it knows that once the last value has been
reached it should continue with the statements following the loop.
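The substitution of each listed value into the body of the loop can be modeled in Python. This sketch is illustrative only: the function name, the representation of the card as a list of characters and the decision to ignore fields that are not wholly numeric are assumptions.

```python
def blank_large_fields(card, starts, width=5, limit=3000):
    """card: list of single characters, with column 1 at index 0."""
    for t5 in starts:  # t5 takes each listed value in turn
        field = "".join(card[t5 - 1:t5 - 1 + width])
        # Assumption: only wholly numeric fields are compared with the limit.
        if field.strip().isdigit() and int(field) > limit:
            card[t5 - 1:t5 - 1 + width] = " " * width  # reset the field to blank
    return card


card = list(" " * 200)
card[124:129] = "03500"  # columns 125 to 129
card[129:134] = "02999"  # columns 130 to 134
blank_large_fields(card, (125, 130, 140, 145))
```

The loop ends by itself once the last value in the list has been used, so no test on t5 is needed inside the body.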
Sometimes there will be a pattern to the numbers in the list: for example, they may increase in steps
of 5. You may list them all individually if you prefer, but it is quicker to enter them as a range with
a start, end and incremental value (in our example, 5) separated by commas. The start value must
be smaller than the end value, and the increment must be positive. Quantum checks the start and
end values and if the start is larger than the end value, the statements inside the loop will not be
executed at all. If the increment is negative, the loop will be executed for the start value only.
do 20 t5 = 125,145,5
if (c(t5,t5+4).gt.3000) c(t5,t5+4)=$ $
20 continue
This loop is very similar to that used in the previous section. It will be executed for all values of t5
between t5=125 and t5=145 where the value is incremented by 5 each time. The loop says:
if (c(125,129).gt.3000) c(125,129)=$ $
if (c(130,134).gt.3000) c(130,134)=$ $
if (c(135,139).gt.3000) c(135,139)=$ $
if (c(140,144).gt.3000) c(140,144)=$ $
if (c(145,149).gt.3000) c(145,149)=$ $
You may enter as many range specifications as you like on one line, as long as each one is separated
by a slash (/):
do 15 t1 = 25,35,2 / 50,62,3
if (numb(c(t1)).gt.1) c(t1)=$ $
15 continue
This loop replaces eleven if statements: t1 will take the values 25, 27, 29, 31, 33, 35, 50, 53, 56, 59
and 62.
If the loop has only one range, and the incremental value is 1, the 1 may be omitted. If t3=11 and
t4=15:
do 15 t2 = t3,t4
if (numb(c(t2)).gt.1) c(t2)=$ $
15 continue
checks that columns 11, 12, 13, 14 and 15 each contain no more than 1 code. If not, the column is
reset to blank.
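The values that a range specification produces can be checked with a short Python sketch. The function name do_values and the use of a text string for the specification are assumptions made for illustration; the sketch also applies the default increment of 1 to any range that omits it, which is more lenient than the rule stated above.

```python
def do_values(spec):
    """Expand a range list such as '25,35,2 / 50,62,3' into loop values."""
    values = []
    for part in spec.split("/"):
        numbers = [int(n) for n in part.split(",")]
        start, end = numbers[0], numbers[1]
        step = numbers[2] if len(numbers) > 2 else 1  # increment defaults to 1
        if start > end:
            continue                # the loop body is not executed at all
        if step < 0:
            values.append(start)    # executed for the start value only
            continue
        values.extend(range(start, end + 1, step))
    return values


two_ranges = do_values("25,35,2 / 50,62,3")
default_step = do_values("11,15")
```

The first call produces the eleven values listed in the text, and the second produces 11, 12, 13, 14 and 15.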
do with codes
Quick Reference
To repeat a set of statements for all codes in a given range, type:
do label 'variable' = 'code1','code2'
To repeat a set of statements for each of a given list of codes, type:
do label 'variable' = ('code1','code2', ... )
Sometimes you will want to repeat a statement or set of statements for a given set of codes, rather
than for columns or other types of variable. The way to do this is to write a do statement which,
instead of naming an integer variable and whole numbers, defines a list of codes and a temporary
variable which points to each code in turn. When you want to refer to the current code, you simply
enter the name of the temporary variable and Quantum will substitute the value of the current code
in the statement before it is executed.
The format of a do statement for codes is therefore:
do label.num 'var.name' = 'p1', 'p2'
to execute statements for all codes in the range p1 to p2, where the sequence of codes is
&-01234567890&;
or:
do label.num 'var.name' = ('p1', 'p2', 'p3', ... )
to execute statements for the listed codes only.
In both formats, note that the variable name and the codes must all be enclosed in single quotes.
Additionally, you may not use the notation ' ' to indicate a blank code, nor may you use the
temporary variable in partial column moves (that is, in statements of the form c(1,4)=c(3,6)).
Here is an example which illustrates how to check for certain codes in a series of columns:
do 10 'code' = ('1','3','5')
if (c110'code' .or. c111'code') emit c180'code'
10 continue
This loop is executed three times, once for each of the three listed codes. The first time the loop is
executed, the statement will read:
if (c110'1' .or. c111'1') emit c180'1'
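The three passes through this loop can be modeled in Python as follows. The function name copy_codes and the use of sets to hold the codes in each column are assumptions; the sketch shows only the substitution of each listed code, not Quantum syntax.

```python
def copy_codes(c110, c111, c180, codes=("1", "3", "5")):
    """Add to c180 each listed code that is present in c110 or c111."""
    for code in codes:  # the temporary variable takes each listed code in turn
        if code in c110 or code in c111:
            c180.add(code)  # emit adds the code to what is already in c180
    return c180


result = copy_codes({"1", "2"}, {"5"}, set())
```

Here codes 1 and 5 are found, so both are added to column 180; the 2 in column 110 is ignored because it is not in the list.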
Nested loops
Loops may contain other loops: this is called nesting. Loops may be nested up to six levels deep,
but they must not overlap. Also, each loop must have a separate terminating statement. In other
words, they must always take the form:
do 60 t2 =
do 70 t3 =
do 80 t4 =
.
.
80 continue
70 continue
60 continue
or
do 60 t2 =
do 70 t3 =
.
70 continue
do 80 t4 =
.
80 continue
60 continue
What we are saying in this loop is that if a given column specified by t1 is single-coded (i.e.,
contains one code only) we set a spare column equal to 1 and send the record out of the loop. If not,
we set the column being checked to blank and return to the top of the loop to get the next value of
t1. This process continues until a single-coded column is found, or until all values of t1 have been
tried.
What is not permissible is:
if (c176'3') go to 76
.
.
do 150 t1 = 125,145,5
76 if (numb(c(t1)).eq.1) c(t1+1)'&'
150 continue
If c176'3', the program would jump into the middle of the loop and have an unidentified value for
t1. An error message will be printed under the offending statement.
Normally all records are passed straight from the edit to the tabulation section regardless of whether
or not they contain errors. The reject statement tells Quantum to continue editing the record but
not to include it in the tables. The record is also rejected from the weighting and where split is used,
it is rejected from the clean file and may be found in the dirty file.
For example, we might write:
if (c73'8') reject
if (c80'1') t5=t5+1
to reject records in which column 73 contains an 8 from the tabulations but not from the rest of
the edit. Therefore, even if c73'8', the record is still checked for a 1 in column 80 and if one is
found, t5 is incremented.
Whenever a record is rejected the variable rejected_ becomes true. You may use this variable in
your program to deal with rejected records in a different way to accepted records. For instance, we
may wish to write all rejected records out in the file rejfil for later inspection and correction:
if (rejected_) write rejfil
The variables rec_rej and rec_acc count the total number of records rejected and accepted so far.
You may wish to check these variables and terminate the run if too many records are rejected.
There is an example of how to do this in section 9.9, Canceling the run below.
If you are working with hierarchical (levels or trailer card) data, reject at a given level will reject
all data at that level. Additionally, data at a level higher than that currently being edited may be
rejected from tables; for instance, in the edit of data at the item level, you may reject all data at
person level. The syntax for this is:
reject levelname
where levelname is the name of the parent level to be rejected.
When used with split, reject at any level rejects the whole record from the clean file.
For more information about levels data, see chapter 3, Dealing with hierarchical data, in the
Quantum User's Guide Volume 3.
The word return in Quantum bears no relation to the same word in English. It does not mean go
back to the start of the edit or anything like that, rather it means terminate the edit immediately
and jump to the tabulation section. Once the record is tabulated Quantum reads in another record
as usual. If there is no tabulation section, the next record is read in straight away.
The return keyword is often used with reject to reject a record without finishing the edit. For
example:
if (c73'8') reject; return
if (c80'1') t5=t5+1
end
Here any records in which c73'8' are rejected from the tables, but, because reject is followed by
return which sends records to the tabulation section, editing is terminated immediately. Thus, only
records in which c73n'8' will be tested for a 1 in column 80. Compare this example with the one
in section 9.6, Rejecting records above.
Do not put reject after return because it will never be reached. Once the return is read, the edit
is terminated immediately and the record is passed to the tabulation section without the rest of
the statement ever being read:
if (c73'8') return;reject
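The effect of adding return after reject can be seen by modeling both forms of the edit in Python. The function names and the representation of each column as a set of codes are assumptions made for illustration.

```python
def edit_with_return(c73, c80, t5):
    """Model of: reject followed by return, then the test on column 80."""
    rejected = False
    if "8" in c73:
        rejected = True
        return rejected, t5  # return: the rest of the edit is skipped
    if "1" in c80:
        t5 += 1
    return rejected, t5


def edit_without_return(c73, c80, t5):
    """Model of: reject alone, so editing continues."""
    rejected = "8" in c73
    if "1" in c80:
        t5 += 1
    return rejected, t5


stopped_early = edit_with_return({"8"}, {"1"}, 0)
carried_on = edit_without_return({"8"}, {"1"}, 0)
```

Both forms reject the record, but only the form without return goes on to increment t5 for it.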
On some surveys you may want to run test tables on a few records only. This can be done using the
word stop.
stop tells Quantum to stop the run and print tables once editing has been completed on the current
record. For example, we may want test tables for 50 people who own goldfish, so we set up a
counter and terminate the run when it reaches 50:
/* gfish counts those owning goldfish
if (c113'5') gfish=gfish+1
if (gfish.gt.49) stop
If we did not wish to restrict ourselves to goldfish owners, and were satisfied with just the first 100
respondents, we could use the reserved variable rec_count in our test and stop when it reached 100:
if (rec_count.eq.100) stop
Alternatively, to be sure that we stop when 100 records have been accepted for tabulation, we could
write:
if (rec_acc.eq.100) stop
When the stop statement is executed, the reserved variable stopped_ becomes true.
A variation of stop is
stop n
where n is the number of times the statement is to be executed. If stop is part of a routing pattern
in the edit, it may be necessary to read in more than the n records to execute the statement n times.
As an example, here is another way of counting goldfish owners:
/* only deal with goldfish owners
if (c113n'5') goto 20
- - other statements - -
stop 50
/* everyone comes here
20 continue
Here, the stop statement is only executed whenever we find someone who owns a goldfish. We may
need to read data for 72 respondents before we reach our target of 50 goldfish owners.
When either form of stop is used, editing and tabulation are completed for the respondent at which
the condition is fulfilled, and no more records are read. Therefore, if we have to process 72
respondents in order to find 50 goldfish owners, a holecount requested by the edit would include
72 records and errors in those 72 records would be included in the error listings.
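The relationship between the number of times stop n is executed and the number of records read can be modeled in Python. The data below is hypothetical and was constructed so that the fiftieth goldfish owner is the seventy-second respondent; the function name run is also an assumption.

```python
def run(records, target=50):
    """records: one set of codes for column 113 per respondent.
    Returns (records read, times the stop statement was executed)."""
    executed = 0
    read = 0
    for c113 in records:
        read += 1
        if "5" not in c113:
            continue        # routed past the stop statement
        executed += 1       # the stop statement is executed for this record
        if executed == target:
            break           # this record is completed; no more are read
    return read, executed


# Hypothetical data: 49 owners, 22 non-owners, then further owners.
records = [{"5"}] * 49 + [{"1"}] * 22 + [{"5"}] * 11
outcome = run(records)
```

With this data, 72 records are read before the statement has been executed 50 times, and the remaining records are never read.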
The word cancel, which is similar in format to stop, terminates the run immediately, producing
tables only for those respondents already passed to the tabulation section. It is often used to halt a
run when too many errors have been detected in the data. For instance, to cancel the run when more
than 100 errors have been found, we might have:
/*
if
if
if
To cancel the run when more than 50 records have been rejected, we could write:
if (rec_rej.gt.50) cancel
Alternatively, cancel may be followed by a number indicating that the run should be cancelled
when the statement has been executed a specific number of times:
cancel 100
cancels the run when this statement has been executed 100 times.
As with stop, holecounts and error listings will only contain information about records read prior
to the cancellation condition being fulfilled. If 400 records are read before 101 errors are found, we
will see the errors for those 400 records.
The process statement is used when you need to tabulate portions of a record more than once. For
example, if our survey asks shoppers about the brands of bread they purchased the last four times
they visited the shops, our data may be set out as follows:
c134 : Brand purchased first time (1=Brand A; 2=Brand B; 3=Brand C; 4=Brand D)
c135 : Number of loaves purchased at that time
c136 : Brand purchased second time
c137 : Number purchased second time
c138 : Brand purchased third time
c139 : Number purchased third time
c140 : Brand purchased fourth time
c141 : Number purchased that time
Suppose we wish to create a table showing the total number of loaves of each brand bought by all
(or selected groups of) respondents during their four trips to the store. The simplest way to do this
is to set up an axis of the form:
l brd;inc=c135
n23Number of Loaves Bought
col 134;Brand A;Brand B;Brand C;Brand D
and then to write:
process
in the edit at the point you want to tabulate the record for the first brand.
The next set of edit statements will be:
c(134,135)=c(136,137)
process
This overwrites the information about the first purchase with information about the second
purchase, and the record is processed a second time. The total number of loaves bought on the
second trip will be added to the total number of loaves bought on the first trip.
The statements continue:
c(134,135)=c(138,139)
process
c(134,135)=c(140,141)
process
When we finish, the total number of loaves of each brand bought by all respondents during those
four visits will be contained in the relevant cells of the axis.
In a situation like this we would probably put the process statements in a loop at the end of the edit,
although this is not strictly necessary. For example:
do 10 t1 = 134,140,2
c(134,135)=c(t1,t1+1)
process
10 continue
This performs exactly the same task as the list of statements shown earlier; it is just a more efficient
way of writing them.
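What the loop accumulates for a single respondent can be modeled in Python. The record below is hypothetical, and the function name tally_loaves and the dictionary of totals stand in for the table cells; none of this is Quantum code.

```python
def tally_loaves(record):
    """record: dict mapping column number to the single character it holds."""
    totals = {"1": 0, "2": 0, "3": 0, "4": 0}  # Brand A to Brand D
    for t1 in range(134, 141, 2):              # 134, 136, 138, 140
        # Copy the current purchase into columns 134 and 135.
        record[134], record[135] = record[t1], record[t1 + 1]
        # process: the brand in c134 is incremented by the number in c135.
        totals[record[134]] += int(record[135])
    return totals


# Hypothetical respondent: A x2, C x1, A x4, B x1.
record = {134: "1", 135: "2", 136: "3", 137: "1",
          138: "1", 139: "4", 140: "2", 141: "1"}
loaves = tally_loaves(record)
```

The respondent contributes six loaves to Brand A and one each to Brands B and C, because the axis is incremented once for every pass through the loop.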
Be careful if process is the last statement in your edit: the record will be passed to the
tabulation section by process and then again by the end statement. If this is not what you want,
omit the last process.
For another example of process, see Incrementing tables more than once per respondent in
chapter 4, More about axes, in the Quantum User's Guide Volume 2.
10 Examining records
There are a number of ways of examining your data once it has been read into the C array. You
may:
Create a frequency distribution reporting the different values found in a column or field of
columns.
Write out specific records and examine them individually, as discussed in chapter 7, Writing
out data.
10.1 Holecounts
Holecounts are used to obtain an overall picture of the data before you write your edit program. For
each column they show:
A distribution of the codes; for example, how many respondents have a 2 in column 56.
The density of coding: how many respondents have one, two, or three or more codes in each
column.
There is an example of a holecount on the next page. The first column tells us the columns for
which codes are being counted; in this case it is columns 1 to 16 of card 1. The numbers across the
top are the individual codes, and the total in the top left-hand corner is the total number of
respondents (records): our data has 605 respondents.
As you can see, there are two numbers in each cell; an absolute figure and a percentage. The former
tells us how many records were found with a specific code in a column and the latter tells us what
percentage of the total data that is.
For example, there are 169 records with a code 1 in column 14 and this is 27.9% of the total.
Similarly, 32 records have a code 4 in column 15 which is 5.3% of the total records. Notice that
when the cell total is zero, no percentage figure is printed: this all makes it easier to see the pattern
of coding in each column.
The four right-hand columns of the holecount show the density of coding in each column. The
column headed Den1 shows the total number of records with only one code of any sort in the
column. Den2 is the number of records with two codes in the column, and Den3+ tells us how many
records were multicoded with three or more codes in that column. The TOTAL is the total number
of codes in that column; a record with two codes contributes two to it, and so on.
Lets look at column 115. 162 records have one code only in that column; six have two codes and
one has three or more codes. The total number of codes in this column is 177, and each card has an
average of 0.29 codes in this column.
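The way the cells and the density columns are built up can be modelled in a few lines of Python. This is only a sketch of the arithmetic described above, not Quantum code: the function name holecount_row, the list of code characters and the five sample records are all assumptions made for the illustration.

```python
from collections import Counter

# The twelve code positions a column may hold (assumed set for this sketch).
CODES = "1234567890-&"

def holecount_row(column_values):
    """Summarise one column. Each item is the string of codes one record holds there."""
    per_code = Counter()                      # one cell per code
    density = {"Den1": 0, "Den2": 0, "Den3+": 0}
    total_codes = 0
    for codes in column_values:
        present = {c for c in codes if c in CODES}
        per_code.update(present)              # each code present adds 1 to its cell
        total_codes += len(present)           # TOTAL counts every code, not every record
        if len(present) == 1:
            density["Den1"] += 1
        elif len(present) == 2:
            density["Den2"] += 1
        elif len(present) >= 3:
            density["Den3+"] += 1
    return per_code, density, total_codes

# Hypothetical records: one blank, two single-coded, one with two codes, one with three.
codes_by_record = ["", "1", "4", "14", "159"]
per_code, density, total = holecount_row(codes_by_record)
average = total / len(codes_by_record)
print(per_code["1"], density, total, average)
```

Note that blank records add nothing to any density column, which is why the density figures need not sum to the number of records.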
The holecount is the starting place in your search for errors. There are many holecounts in which
it is immediately apparent that the presence of certain codes indicates an error. It is also clear
whether or not the column should be multicoded.
Creating a holecount
Quick Reference
To create a holecount, type:
count c(start_col, end_col) [$text$]
where text is the holecount title.
Quantum itself accepts double quotes in the holecount heading, but the C compiler which processes
the code that Quantum creates from your specification does not. Generally, it will issue an error
message that refers to a missing ) symbol at the point the double quote occurs. To prevent this
happening, precede the double quote with a backslash. For example:
count c(101,116) $Demo for \"Quantum Users Guide\"$
You may count as many or as few columns as you like, as long as the columns to be counted are
consecutive: to count, say, columns 135 to 140 and columns 160 to 180 you will need two
statements, one for each field.
Records are counted at the stage they are when the count is read. If you have previously altered any
columns, say, with assignment or emit statements, the count will refer to the columns as they are
after the alterations rather than as they were in the original data file. Similarly, any changes which
are effected after the count are not reflected in the output.
If you place a count statement in a loop, Quantum sums the counts for all the columns in the
statement and reports the total number of codes as the count for the first column only.
Filtered holecounts
A filtered holecount is one in which only records fulfilling a specific condition are counted. They
can be created using the if statement to define the occasions when a record should be counted.
For example, suppose we only wish to include male respondents in our holecount. Our statement
might be:
if (c106'1') count c(101,108) $Demonstration Survey Males$
We can also create filtered holecounts of trailer cards based on characteristics of the individual
cards. Suppose we have a trailer card for each store visited, in which the store is identified in c79.
The trailer card is the 5-card. We would write:
if (c579'1') count c(501,580) $Harrods$
Multiplied holecounts
Quick Reference
To create a multiplied or weighted holecount, type:
count c(start_col, end_col) [$text$] c(m_start, m_end)
where text is the holecount title and c(m_start,m_end) is the field in the C array containing the
multiplier or weight for each record.
In ordinary holecounts, the cells are simply counts of records: each time a record is read with a
specific code in a given column, the relevant cell in the holecount is incremented by one. If 231
records have a 7 in column 79, the figure in that cell will be 231.
Holecounts may also be created by incrementing each cell by the value found in a column field in
the record. This value is the record's multiplier. If the multiplier is 15, and the record has a 6 in
column 152, the count for c152'6' will be incremented by 15 rather than by 1 for this record. You
may hear this type of holecount referred to as a weighted holecount because multiplying a record
by a given value is the equivalent of weighting it.
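The difference from an ordinary holecount is simply the size of the increment. The Python sketch below models that idea only; the function name multiplied_holecount and the sample records and multipliers are hypothetical, and it says nothing about how Quantum stores the figures internally.

```python
from collections import defaultdict

def multiplied_holecount(records):
    """records: (codes in the column, multiplier) pairs, one pair per record."""
    cells = defaultdict(float)
    for codes, multiplier in records:
        for code in set(codes):
            cells[code] += multiplier   # add the record's multiplier rather than 1
    return dict(cells)

# Three hypothetical records for one column; the first carries a multiplier of 15.
records = [("6", 15.0), ("6", 2.5), ("16", 1.0)]
cells = multiplied_holecount(records)
print(cells["6"], cells["1"])
```

With every multiplier set to 1 the result is the same as an ordinary holecount.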
If the multiplier is being calculated during the run, it must be placed in the C array using wttran
before the holecount is requested.
For further details on weighting and wttran, see section 1.9, Copying weights into the data
in the Quantum User's Guide Volume 3.
The figures used to create the multiplied holecount would then be 22.4, 12.7, or 11.9, depending
upon the contents of c104 in each record. Suppose we have 27 home owners (that is, 27 people have
c104'2'), the count for a 2 in column 4 of card 1 would be 604.8 (27 × 22.4), which would appear
in the output file as 605.
Other points to notice are:
Since we are copying a real number into a field of columns we use the notation cx to refer to
the columns and follow them with the number of decimal places required.
Because the word count is written in lower case it may start in column 1. If it had been written
in upper case it would need to start in a column other than 1 to prevent it being read as a
comment.
The sum of factors, that is, the sum of all wholly numeric items (values which occur more
than once are counted as many times as they occur).
The mean for the numeric items listed (that is, the sum of factors divided by the number of
numeric items).
If the field is numeric and the run has missing values processing switched on, fields that are
non-numeric will contain the value missing_. This value is counted as zero by the sum of factors, mean
and standard deviation lines of the report.
Statements are provided for requesting a frequency distribution sorted in alphabetic or numeric
order only.
A frequency distribution, as shown in the example on the next page, is created with the list
statement, as follows:
list c(m,n) [$text$]
where c(m,n) is the column field whose contents are to be listed and text is the heading to be printed
at the top of each page. If no heading text is given, the heading Frequency Distribution is used
instead.
The list statement, as shown above, produces both the alphabetic and numerically-sorted
distributions. To request an alphabetic distribution only, type:
lista c(m,n) [$text$]
and for a ranked distribution only, type:
listr c(m,n) [$text$]
The first example produces a frequency distribution of the contents of c(107,108) sorted in numeric
order; the second example generates a list of car brands which will be sorted in alphabetic order.
Additionally, we are using subscripts to represent the column numbers. If t1 has a value of 36,
Quantum will list the values found in columns 36 to 40.
The rules for double quotes in the text are the same as for holecounts, that is, you must precede
them with a backslash.
The list in the diagram below shows a frequency distribution for the column field c(123,125). It
was created by the statement:
list c(123,125) $PRICE PAID$
Since it was run on a data file containing 200 respondents, the total is 200.
Let's start with the first table, the alphabetical sort. The figures in the column headed string are
the values found in columns 123 to 125, in this case, the price paid for a bottle of mineral water.
The next column (item) tells us how many times each code occurred in those columns, that is,
how many people paid each price. We can see the actual number of people and also what
percentage of the total sample that is. For instance, 31 respondents paid 111p which is 15.5% of the
total (200).
The columns labeled cumulative show accumulated totals and percentages for each value found.
There are 86 respondents who paid between 111p and 114p, and these are 43.0% of the total
respondents.
The second table shows exactly the same information presented in rank order, with the most
frequently occurring value first. The example shows that this is 212, and that 41 respondents or
20.5% of all the respondents paid 212p for a bottle of mineral water.
Unlike count, if list is part of a loop, it will be executed once for each pass through the loop. All
values found will be entered in the same list: Quantum does not create a separate listing for each
pass through the loop.
PRICE PAID
Total = 200
Alphabetical Sort

string         item             cumulative
111         31   15.5%        31    15.5%
112         29   14.5%        60    30.0%
113         17    8.5%        77    38.5%
114          9    4.5%        86    43.0%
121         17    8.5%       103    51.5%
122         21   10.5%       124    62.0%
123          4    2.0%       128    64.0%
124          1     .5%       129    64.5%
211          3    1.5%       132    66.0%
212         41   20.5%       173    86.5%
213          1     .5%       174    87.0%
214          3    1.5%       177    88.5%
311          9    4.5%       186    93.0%
312         14    7.0%       200   100.0%

Number of categories = 14
Number of numeric items = 200
Sum of factors = 32218.00
Mean Value = 161.09
Std deviation = 67.97
PRICE PAID
Total = 200
Rank Sort

string         item             cumulative
212         41   20.5%        41    20.5%
111         31   15.5%        72    36.0%
112         29   14.5%       101    50.5%
122         21   10.5%       122    61.0%
113         17    8.5%       139    69.5%
121         17    8.5%       156    78.0%
312         14    7.0%       170    85.0%
114          9    4.5%       179    89.5%
311          9    4.5%       188    94.0%
123          4    2.0%       192    96.0%
211          3    1.5%       195    97.5%
214          3    1.5%       198    99.0%
124          1     .5%       199    99.5%
213          1     .5%       200   100.0%
For further information about weighting and wttran, see section 1.9, Copying weights into
the data in the Quantum User's Guide Volume 3.
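To see how the two sorts and the summary lines in the PRICE PAID example relate to each other, the figures can be rebuilt outside Quantum. The Python sketch below is a model of the arithmetic only. The names frequency_tables and numeric_summary are hypothetical, and the rule used to order values that occur equally often (by value) is an assumption; the standard deviation line is not reproduced.

```python
from collections import Counter

# Counts per price, taken from the PRICE PAID example.
PRICE_COUNTS = {"111": 31, "112": 29, "113": 17, "114": 9, "121": 17, "122": 21,
                "123": 4, "124": 1, "211": 3, "212": 41, "213": 1, "214": 3,
                "311": 9, "312": 14}

def frequency_tables(values):
    """Return (alphabetic, ranked) rows of (string, item, %, cumulative, cumulative %)."""
    counts = Counter(values)
    total = sum(counts.values())

    def with_cumulative(rows):
        out, running = [], 0
        for value, n in rows:
            running += n
            out.append((value, n, 100.0 * n / total, running, 100.0 * running / total))
        return out

    alphabetic = with_cumulative(sorted(counts.items()))
    # Most frequent first; ties ordered by value (an assumption for this sketch).
    ranked = with_cumulative(sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])))
    return alphabetic, ranked

def numeric_summary(values):
    """Number of numeric items, sum of factors and mean, over wholly numeric values."""
    numbers = [int(v) for v in values if v.strip().isdigit()]
    total = sum(numbers)
    mean = total / len(numbers) if numbers else None
    return len(numbers), total, mean

# One value per respondent, as list would read them from c(123,125).
values = [price for price, n in PRICE_COUNTS.items() for _ in range(n)]
alphabetic, ranked = frequency_tables(values)
items, sum_of_factors, mean = numeric_summary(values)
print(alphabetic[3], ranked[0], items, sum_of_factors, round(mean, 2))
```

The fourth alphabetic row carries the cumulative figures quoted in the text (86 respondents, 43.0%), and the first ranked row is the 212p price.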
11 Data validation
In earlier chapters, we discussed ways of examining the data for a set of records (with count) or for
an individual record (with write). In general, however, we want to check the validity of the data for
individual records by putting in the edit a set of testing sentences which will tell us not only whether
a record contains an error but also what that error is.
There are two types of checking sentence. The first involves checking whether a column contains
the correct type of coding (single-coding/ multicoding) and whether the codes in that column are
valid. Take the question on a respondent's sex, which may be Male, coded c106'1', or Female,
coded c106'2'. c106 must be single-coded because a person cannot have two sexes, and the only
codes which may appear in that column are 1 and 2. Any record in which c106 is not single-coded
with a 1 or a 2 will be flagged as incorrect.
The second type of checking involves making sure that columns whose contents depend on the
contents of other columns contain the correct codes. For instance, suppose the questionnaire asks
whether the respondent has ever used a particular brand of washing up liquid. The answer is coded
into c125 as 1 for Yes and a 2 for No. If the answer is Yes, the next questions concerning price
and quality are asked. If c125'2', indicating that the respondent has not used that brand of washing
up liquid, the following columns must be blank. Conversely, if c125'1', the following columns
must be coded according to the codes on the questionnaire.
11.1 require
Both tasks listed above can be carried out using if but sometimes they can become very complicated
and repetitive. Therefore, Quantum has an additional testing statement, require, specifically
designed to increase the efficiency of this checking process.
For more information on the if statement, see section 9.1, Statements of condition if.
The require statement is used in three different ways:
Column validation. Tests columns against a given set of characteristics and deals with records
not meeting the requirements according to a specified action code.
Testing the validity of a logical expression. Tests a logical expression and, if it is true,
continues with the next statement. If the expression is false, the record is dealt with according
to the given action code.
Testing the equivalence of logical expressions. Compares the logical value of a group of
logical expressions. If all are true or all are false, the run continues with the next statement, but
if the expressions yield a mixture of values the specified error action is carried out.
The actions which are carried out when the stated conditions are violated are determined by an error
action code defined either in the require statement itself or in a global statement placed at the start
of the edit.
For information about the error action code, see The action code in the following section.
The require statement has three forms, depending upon the function it performs, and these are
described in the subsequent sections. Each one must start with the word require which may be
abbreviated to r.
Our example checks that columns 110 and 125 are not blank (nb). Any records in which this is not
the case are written out to a new file and rejected from any tables that may be produced (/5/).
Let's deal with each of these items separately.
The action code is a number between 0 and 7 which tells Quantum what to do with records that do
not match the required conditions (for example, records which are blank but which should contain
codes). The action code may either be entered as a parameter on each require statement or, if it is
the same for all statements, on an rqd statement.
Action codes are:
0  Print a summary of errors only: records are not listed individually, but a count is kept of the
   number of records failing each require statement. This is printed out at the end of the run.
3  Print the record and reject it from the tables. This is the default.
5  Write the record into the output data file, punchout.q, and reject it from the tables.
6  Print the record in the print file, out2, and write it into the output data file, punchout.q.
To write a statement which would print out incorrect records but include them in the tables, we
would write:
r /2/ ....
Similarly, to have all incorrect records printed in the print file, written into the output data file and
rejected from the tables, we would write:
r /7/ ....
In both cases the action code is part of the individual require statement, but where the same action
applies to all requires, it is quicker and more efficient to define the action code on an rqd statement
at the beginning of the edit. For instance, if all erroneous records are to be written out and rejected
we would write:
rqd 5
The default action is to print the record out and reject it from the tables:
r /3/ ....
or
rqd 3
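The examples above suggest that the action code is additive: 2 prints, 3 prints and rejects, 5 writes and rejects, and 7 prints, writes and rejects. The Python sketch below decodes a code on that basis. It is an inference from the examples, not a statement of how Quantum is implemented; in particular the meanings given to 1 (reject only) and 4 (write only) are assumptions, as are the names REJECT, PRINT, WRITE and describe_action.

```python
# Assumed flag values, inferred from the /2/, /3/, /5/ and /7/ examples.
REJECT, PRINT, WRITE = 1, 2, 4

def describe_action(code):
    """List the actions implied by an action code between 0 and 7."""
    if not 0 <= code <= 7:
        raise ValueError("action code must be between 0 and 7")
    if code == 0:
        return ["summary only"]
    actions = []
    if code & PRINT:
        actions.append("print record")
    if code & WRITE:
        actions.append("write to output data file")
    if code & REJECT:
        actions.append("reject from tables")
    return actions

for code in (2, 3, 5, 7):
    print(code, describe_action(code))
```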
b    Blank
nb   Not blank
sp   Single-coded
spb  Single-coded or blank
One of these types must follow the word require since it tells Quantum what to check for.
All that remains is to say which columns are to be inspected; just list each column or field of
columns at the end of the statement. If more than one column or field is defined, each one must be
separated by a comma.
Here are some examples in which the record to be checked is:
----+----1----+----2----+----3----+----4----+
002411123481231&*1927235537*&& 1 1 1
The statement:
r nb c10, c(25,35)
checks that columns 10, and 25 to 35 inclusive are not blank; they may contain any number of
codes. This record satisfies both conditions so it passes on to the next statement in the edit.
The statement:
r sp c11, c15, c23, c41
looks to see whether columns 11, 15, 23 and 41 are single-coded. In our record they are, but if this
were not the case (say c11'123') the record would be printed out and rejected from any tables that
may be produced. Additionally, Quantum would tell us: Column 11 is 123.
Be careful when using field specifications with require: the condition applies to each column
individually, not to the field as a whole. For instance:
r sp c(1,4)
means that each of columns 1, 2, 3 and 4 must contain one code. It does not mean that the field
must contain one code overall. To check that a field contains one code only, use numb.
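The point that a condition applies to each column separately can be modelled as follows. This Python sketch is an illustration only: the names check_column and require_columns are hypothetical, and each column is represented simply as the string of codes it holds.

```python
def check_column(kind, codes):
    """Test one column against b, nb, sp or spb. codes is the string of codes it holds."""
    n = len(set(codes.replace(" ", "")))
    if kind == "b":
        return n == 0
    if kind == "nb":
        return n >= 1
    if kind == "sp":
        return n == 1
    if kind == "spb":
        return n <= 1
    raise ValueError("unknown check type: " + kind)

def require_columns(kind, columns):
    """columns maps column number to its codes. Returns the columns that fail."""
    return [col for col, codes in columns.items() if not check_column(kind, codes)]

# A hypothetical field c(1,4) in which column 3 is blank.
field = {1: "1", 2: "2", 3: " ", 4: "7"}
print(require_columns("sp", field))       # column 3 fails: every column needs one code
print(require_columns("sp", {11: "123"})) # a multicoded column fails sp
print(require_columns("nb", {11: "123"})) # but passes nb
```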
When incorrect records are printed out, require automatically prints a short text describing the
error. Normally, it tells you what codes were found in the column which is wrong, but if this is not
what you want, you may define your own error text by entering it enclosed in dollar signs at the
end of the statement. This text will then be printed in place of the default text when errors are found.
For example, if c329 is multicoded when it should be single-coded, the statement:
r sp c329
will print the whole record and tell us which codes were found in that multicode:
Column 329 is 13
Instead of being told which codes the column contains, you may prefer to see a message linking the
error to a question on the questionnaire. In this case you will need to add your own error text as
follows:
r sp c329 $q21a not sp$
Sometimes it is not sufficient to check just the type of coding, and you will want to know whether
the codes found are valid for that column. To do this, we use the information given in the previous
section as a base, and add on our first optional extra.
To check whether a column or field of columns contains specific codes, follow the column
specification with the codes to be checked, enclosed in single quotes. For example:
r /5/ sp c223'1/5'
tells us that column 223 should be single-coded within the range of codes 1 through 5. Any other
codes in this column are ignored. Thus, a record in which c223'14' is incorrect because it contains
two of the listed codes, whereas a record in which c223'27' is correct because it contains only a 2
from the range 1/5. Of course, any record which does not contain a 1, 2, 3, 4 or 5 at all is also
incorrect, regardless of whether or not it is single-coded: c223'9' is just as wrong as c223'789&'.
Codes may also be defined with all other code types, thus:
r /3/ nb c156'2/6'
If c156 does not contain at least one of the codes 2 through 6 (regardless of anything else it may
contain) the record is printed out. Column 156 may be multicoded as long as at least one of the
codes is within the required range.
----+----6
1
2
and
----+----6
2
7
8
and
-----+----6
2
5
8
This statement tells Quantum that column 134 must never contain any of the codes 1 through 8:
only 0, 9, - and &, or blank, are acceptable. This is the opposite of r sp and r nb, both of which list valid
codes. Any record failing this condition will be printed and rejected via the default action code 3.
Exclusive codes
Quick Reference
To check that a column or field contains no codes other than those listed, type:
r [/err_code/] condition col1'codes1'o
If col1 contains any codes other than those given in codes1, the test is false.
Now that you know how to check codes, the next thing to discuss is how to check that all other code
positions are blank.
We have said that statements of the form:
r sp ca'p'
accept all records containing only one of the codes p in column a, regardless of what other codes
are also present. To check that a column contains only the listed codes and nothing else, follow the
code specification with the letter o (for only) in upper or lower case. For example, to indicate that
c356 must be single-coded in the range 1/5 and that all other positions (6/&) must be blank, you
should type:
r sp c356'1/5'o
Any of the following would cause the record to be printed and rejected:
c356'34'
c356'59'
c356'8'
c356' '
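The difference the o makes can be checked with a small model. The Python sketch below is not Quantum code: the function sp_with_codes is hypothetical, and the listed codes are written out in full ("12345") rather than as a range.

```python
def sp_with_codes(found, listed, only=False):
    """True if exactly one of the listed codes is present in the column.

    found  -- the codes actually in the column, as a string
    listed -- the codes named in the require statement, as a string
    only   -- model of the o suffix: no code outside the list may be present
    """
    found_set = set(found.replace(" ", ""))
    listed_set = set(listed)
    if len(found_set & listed_set) != 1:
        return False            # none, or more than one, of the listed codes
    if only and found_set - listed_set:
        return False            # an unlisted code is present and o was given
    return True

# Without o, unlisted codes are ignored.
print(sp_with_codes("27", "12345"))             # one listed code (2): accepted
print(sp_with_codes("14", "12345"))             # two listed codes: rejected
# With o, the four failing cases from the text.
for found in ("34", "59", "8", " "):
    print(repr(found), sp_with_codes(found, "12345", only=True))
```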
The require statement may define conditions for more than one column. Just follow each column
with the code positions to be checked and separate each set with a comma:
r sp c164'12-', c165'1/70', c166'1/3', c167'1/9-', c168'1/5'
Here the columns to be checked are consecutive but have been listed separately because they each
have different sets of valid codes. If all columns could be single-coded in the range 1 to 7 we might
abbreviate this to:
r sp c(164,168)'1/7' $q10a/e$
since this notation means that each column in the field must be single-coded within the given range
rather than that the field as a whole may contain only one of those codes.
As you know, records found to have errors are printed, coded and/or rejected according to the error
action code. When the run is finished you will look at these records and, if possible, correct the
errors by using the on-line edit or correction file facilities.
For information about on-line editing and the corrections file, see chapter 12, Data
correction.
Occasionally you will know in advance what to do with certain types of error; say, for instance, the
respondent's sex has been miscoded. You may decide or be told to recode this person as a 3 in
the appropriate column indicating that the sex was not known. The way to do all this in one go is
to write the normal require statement that checks columns and codes, and to follow the code
specification with a colon (:) and the replacement code (in this case 3) enclosed in single quotes,
thus:
r /2/ sp c106'12' :'3'
Any record in which c106 is not single-coded with either a 1 or a 2 will have the contents of
c106 overwritten with a 3.
The equivalent using if and an assignment statement would be written:
if (numb(c106'12').ne.1) c106'3';
+write $c106 incorrect$
If we have:
+----4----+
1927
If you use this facility, remember that the replacement code is an alteration to the data, and as
such is operative only as long as each record is in the C array. If you want to save these
modifications you must include a statement in your edit which will write records to another
file. Statements which write out new data files are split and write. Alternatively, you can use
one of the action codes which writes records to the output data file.
For information about split, see section 12.4, Creating clean and dirty data files.
For information about write, see section 7.1, Print files.
By now you will have guessed that require statements can become lengthy things, especially when
specific codes have to be checked, replacement characters defined and error texts entered. In many
cases some, if not all, of these items will be common to the majority of the columns listed in the
statement; for instance, several non-consecutive columns may have the same set of valid codes.
When this happens you may enter these common items at the beginning of the require statement
as defaults for that statement. There are several ways of doing this, so lets take the statement:
r spb c1270/9o, c1290/9o, c1310/9o, c1330/9o
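With the common code specification entered once as a default at the start of the statement, this
could be abbreviated to:
r spb '0/9'o c127, c129, c131, c133
Both statements check whether columns 127, 129, 131 and 133 are single-coded in the range 0 to 9
or are blank. If the - or & codes appear in any of these columns, or if the columns are multicoded,
the offending records will be printed and rejected.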
Defaults defined at the start of a require may be overridden for an individual column or field by
following that item with the new specification. For example:
r sp '1/5' c10, c12, c15, c20'1/3'
tells us that columns 10, 12 and 15 must be single-coded in the range 1 to 5 while column 20 must
be single-coded in the range 1 to 3.
Here is another example which uses the Only operator:
r sp '1/5'o c10, c12, c15, c20'1/7', c24
This checks that columns 10, 12, 15 and 24 are single-coded in the range 1 to 5 and that none of
the codes 6/& are present in those columns. Column 20 has its own code specification which
overrides not only the default codes but also the Only operator. Quantum will check that c20
contains only one of the codes 1 to 7, but it will ignore anything it finds in the range 8/&.
Finally, lets look at one more statement:
r sp '1/5'o :'&' c10, c12, c20'1/7', c24
This is exactly the same as the previous example except that we have added a replacement code to
be used when errors are found. This code refers to all columns named with this require, even
though column 20 has a different set of valid codes.
Items 1, 2 and 4 are exactly as described in section 11.3, Validating logical expressions
above.
For further information about logical expressions, see chapter 5, Expressions.
For example:
r /3/ (c133'4' .and. c140n'5') $Cols 33/40 incorrect$
says that c133 must contain a 4 and c140 must not contain a 5. If one or other or both expressions
are false, Quantum prints the record out with the message Cols 33/40 incorrect and rejects it from
the tables.
This type of require statement is often used to check the number of codes present in a column or
group of columns. For example, if the questionnaire specifies that the respondent should name no
more than three products in his answer, you might write:
r (numb(c139).le.3)
causing any record in which column 39 is multicoded with more than 3 codes to be printed and
rejected. This statement has no error text, so any records printed will be followed by the require
statement itself.
Require can evaluate groups of expressions and perform given tasks depending on whether all
expressions are true or all are false. When all the expressions have the same value (i.e., all true or
all false) Quantum continues with the next statement in the program, whereas if some are true and
some are false, the record being tested will be dealt with according to the given (or default) error
action code.
which says that to be accepted, a record must either have a 2 in column 125 and blanks in columns
126 to 145, or something other than a 2 in c125 with at least one code somewhere in c(126,145).
The following data is designed to clarify this.
----+----3----+----4----+----5
2
15
is accepted, so is
----+----3----+----4----+----5
15 42674 262&03 37 73
9
4
0
but
----+----3----+----4----+----5
2
6
8 15
is rejected, so is
----+----3----+----4----+----5
3
635
The first example is accepted because both expressions are true, the second is accepted because
both expressions are false. The third and fourth examples are both rejected because one expression
is true and the other is false.
Note that in this example, if column 125 does not contain a 2 we are only checking that columns
126 to 145 contain at least one code; we are not checking whether those codes are correct.
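The rule that a record passes when the expressions are all true or all false, and fails on any mixture, can be written out as a short model. In this Python sketch the names equivalent and accept are hypothetical, and the two arguments to accept stand for the contents of c125 and c(126,145) as plain strings.

```python
def equivalent(*expressions):
    """True when every expression is true or every expression is false."""
    values = [bool(e) for e in expressions]
    return all(values) or not any(values)

def accept(c125, c126_145):
    """Model of the test described above: a 2 in c125 must go with a blank field."""
    has_2 = "2" in c125
    field_blank = c126_145.strip() == ""
    return equivalent(has_2, field_blank)

print(accept("2", " " * 20))    # both true: accepted
print(accept("1", "15 4267"))   # both false: accepted
print(accept("2", "6"))         # 2 present but field not blank: rejected
print(accept("3", ""))          # no 2 but field blank: rejected
```

As in the text, the second case only establishes that the field is not blank; nothing here checks that the codes found are valid.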
The test for failure is made on the last require statement executed for the current record. This may
not always be the most recent require statement in the program, and it may not be the require
statement you intend Quantum to execute. If you write:
r sp c112'1/5'
if (c115'1') r b c116
if (failed_) set c116' '
the test for failure could apply to either of the previous statements. If column 115 does not contain
a 1, the second require statement will not be executed and failed_ will be True if column 112 is
not single-coded in the range 1/5. If column 115 contains a 1, then failed_ will be True if column
116 is not blank.
You can get around this potential problem by setting failed_ to zero (the equivalent of False) just
before the require statement you wish to test. For example:
r sp c112'1/5'
failed_ = 0
if (c115'1') r b c116
if (failed_) set c116' '
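The reason for resetting failed_ is easier to see when the flag is followed step by step. The Python sketch below models the two versions of the example; the class Edit, its require method and the run function are hypothetical stand-ins for the edit statements and are not Quantum code.

```python
class Edit:
    """Holds a failed flag that reflects the last require executed."""
    def __init__(self):
        self.failed = False

    def require(self, ok):
        self.failed = not ok
        return ok

def run(c112, c115, c116, reset):
    edit = Edit()
    # r sp c112'1/5'
    edit.require(len(set(c112) & set("12345")) == 1)
    if reset:
        edit.failed = False          # failed_ = 0
    if "1" in c115:
        # r b c116, executed only when c115 contains a 1
        edit.require(c116.strip() == "")
    return edit.failed

# c112 is wrong, c115 has no 1, so the second require never runs.
print(run("9", "2", "7", reset=False))  # True: the flag is left over from the c112 test
print(run("9", "2", "7", reset=True))   # False: the flag now refers only to c116
print(run("1", "1", "7", reset=True))   # True: c116 should be blank but is not
```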
if the respondent didn't try Brand A, the columns associated with it must be blank, or
if he tried Brand A, there must be a code in at least one of the associated columns.
This says that if the respondent did not try Brand A, all columns associated with it must be blank,
but if he tried the product we expect those columns to be single-coded in the range 1/7 or blank.
One can also make require statements apply to smaller sets of data by having records for which
they would be irrelevant go around the statements. Let's say c112 records whether there are
children in the household. If c112'1', there are children and c113 and c114 must contain answers.
We could write:
if (c112n'1') go to 30
r nb c(113,114)
30 continue
This means that all irrelevant records (respondents without children) would not be tested.
This system makes sense when there are several requires and you want to avoid a whole set
of identical if statements. It's more efficient and it's easier to follow. Remember, as well, that
you can put in comments to remind yourself what you are doing and why.
12 Data correction
It is always possible to deal with data which has been incorrectly coded and/or entered. If the errors
themselves cannot be corrected because correct codes cannot be determined, the incorrect data can
be collected under some miscellaneous heading in the tabulations.
However, a cleaner data set can be obtained by correcting or removing invalid data whenever
possible.
There are four ways to correct data:
Replace the incorrect codes with specific codes using edit forcing statements.
Write a file of corrections to be merged with the original data when it is read in by a Quantum
program.
Changing the contents of the original data file is not a function of Quantum: you will need to use
the data editing program, ded, for this. If you do need to edit the original data file, you should
always take a copy of it first in case your editing does not have the desired effect.
For further information about ded, see the SPSS MR Utilities Manual.
This rejects the record from the rest of the edit and the tabulation section as well.
This statement should be at the beginning of the edit to avoid unnecessary editing of a useless
record.
Columns within a record can be removed by blanking them out or setting them to a common reject
code, often a minus or ampersand.
For example:
if (c125n'12') c125'&'; c(126,145)=$ $
All records in which c125 contains neither a 1 nor a 2 will have the contents of that column replaced
with an ampersand, and whatever is in c(126,145) blanked out. As a real-life example, suppose a 1
in c125 means that the respondent visited the market, and a 2 in that column means he did not.
Information about purchases made at the market is stored in c(126,145). If column 125 contains
neither a 1 nor a 2, we cannot clearly establish whether or not the respondent visited the market so
we set c125 to a special code and blank out any information about purchases.
Inserting correct data is generally more difficult than removing invalid data, because you very often
don't know what the correct data is. However, if you do know, you can correct the data record by
record, or make the same correction for any record which is incorrect. For instance:
if (c(101,104)=$2222$) c112'2'; c(113,114)=$ $
corrects the record whose serial number is 2222 by setting a 2 into c112 and blanking out
c(113,114).
If you do not know what the correct data is, you may decide to replace the incorrect code or codes
with a valid code chosen at random. For example:
if (c(101,104)=$3625$) c145=rpunch('1/5')
replaces whatever was in column 145 with one of the codes 1 through 5 for the record whose serial
number is 3625.
When correcting data on a record-by-record basis, it is more convenient to use the methods
outlined below.
On-line correction is a method whereby Quantum interrupts processing when incorrect records are
found, so that corrections, if any, may be made interactively. The record may then be re-edited to
check for further errors straight away.
When an incorrect record is found, the current contents of the C array are written to the print file,
out2, as usual, and a message is displayed on your screen indicating the record's position in the data
file. Any messages associated with the write or require statement finding the error are also
displayed, and you then have the opportunity to accept the record as it is, reject it, correct it or
re-edit it. The record itself is not displayed unless you request it.
To use this facility, enter the word:
online
in the edit at the point you want to be able to correct records.
You may put in as many online statements as you like, but as long as there is one online statement
in the edit, on-line editing will be possible both at the point where the statement occurs and also at
the end of the edit. If there are no errors to be corrected, Quantum ignores the online statements.
Once an incorrect record has passed through the on-line edit, you may leave it to continue through
the rest of the standard edit until it reaches the end statement or you may return it to the start of the
edit to be retested. If you prefer, you may name a statement to which records should return simply
by giving that statement a label number and following online with that number. For example:
online 45
Runs containing on-line edits must be run from a terminal rather than in the background until
the edit section is finished; otherwise you will not know when there is a record awaiting
correction.
Any corrections made during on-line editing are effective only during the current run unless
your edit contains one of the commands split or write to create a new data file. If your program
calls the on-line editor but does not contain split or write, a warning message will be displayed
when your program is checked.
Those which terminate on-line editing either for the individual record or for the file as a whole.
As we said in the introduction to on-line editing, Quantum displays any messages associated with
the write or require statement finding the error, but does not automatically display the record itself.
It also displays an arrow prompting you for a command. To display the full record in its current
state, type display or di. The whole record is displayed underneath a ruler, as with the write
statement.
Sometimes it is easier to see the error if you print out the incorrect column or columns separately
rather than looking at the whole record. To see a column or field only, just follow the di command
with the numbers of the columns you wish to see. For example:
di c10
displays column 10
di c(115,130)
displays columns 115 to 130
Column fields may be entered as just two column numbers separated by a comma, the parentheses
and the C being optional. Thus, the second example could equally well be written:
di 115,130
When a single column is displayed, the individual codes comprising a multicode are shown, but
when fields are displayed, a ruler is printed and multicodes appear as asterisks (*). Here is an
example:
-> di 25,35
+--- 3 ---+
613*9 2 144
-> di 28
159
->
In the first example, the asterisk represents a multicode, whereas in the second example where only
one column is displayed, the codes 1, 5 and 9 are a multicode in column 28.
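The two display modes can be modeled outside Quantum. The following Python sketch is an illustration only and is not Quantum code: the names show_column and show_field are hypothetical, each column is represented as a set of codes, and the listing order 1234567890-& is an assumption.

```python
# Illustrative model only; not Quantum code.
CODE_ORDER = "1234567890-&"  # assumed order in which codes are listed


def show_column(codes):
    """Single column: list every code it contains."""
    return "".join(code for code in CODE_ORDER if code in codes)


def show_field(columns):
    """Field: one character per column, with '*' standing for a multicode."""
    shown = []
    for codes in columns:
        if not codes:
            shown.append(" ")                # blank column
        elif len(codes) == 1:
            shown.append(next(iter(codes)))  # single code
        else:
            shown.append("*")                # multicode
    return "".join(shown)


# Columns 25 to 35 of the record in the example above.
field = [{"6"}, {"1"}, {"3"}, {"1", "5", "9"}, {"9"}, set(),
         {"2"}, set(), {"1"}, {"4"}, {"4"}]
print(show_field(field))      # 613*9 2 144
print(show_column(field[3]))  # 159
```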
Correcting records
Quick Reference
To overwrite the current contents of a column or field with a new code or string, type:
[s] column(s) codes
To insert additional codes into a column or field, type:
e column(s) codes
To delete codes from a column or field, type:
de column(s) codes
In all cases, columns are defined as numbers only, without c or parentheses.
The words used for correcting records are set, emit and delete which are usually abbreviated to s,
e and de. They work in exactly the same way as their counterparts in the ordinary edit section:
s overwrites the original contents of a column or field with new information; e appends a single
code to the codes that are already in a column and de removes one or more codes from a column
leaving the remainder intact.
There are many variations of these commands, all of which are equally correct. Just choose the one
that you find most convenient. Here are some examples. The first group are set statements for
overwriting the contents of a column or field with the given code or string of codes.
set c5'7'      s c5=7      s 5=7      s 5 7
set c9'45&'    s c9=45&    s 9=45&    s 9 45&
s 123,126=4567      s 123,126 $4567$      set c(123,126)=$4567$
If you want to overwrite a single column with a single code, use one of the four formats on the first
line. In all cases you may type in the full command word (set) or the abbreviation (s). All four
variations replace whatever is currently in c5 with a code 7.
The examples on the second line are for overwriting a single column with a multicode. Notice that
if you use the = notation, the single quotes enclosing the multicode are optional.
The last line illustrates how to overwrite a field of columns with a string, in this case to replace
the current contents of columns 123 to 126 with the codes 4, 5, 6 and 7 respectively.
In all on-line set statements you may omit the set or s at the beginning of the command, thus:
c5=7
9=45
123,126 $4567$
When it comes to adding codes to columns, the on-line editor has an option that the ordinary editor
does not. Whereas the ordinary emit statement only allows you to specify single columns, the
on-line editor also allows you to emit strings of single-codes into a field of columns. Thus, the syntax
of the on-line emit statement is:
emit c321'7'
e 321=7
e 321 7
emit c(16,17)=$77$
e c(16,17) $77$
e 16,17 77
The same notes apply to deleting codes: the on-line edit allows you to delete codes from a single
column or a field:
delete c123'7'
de c123 7
de 123 7
delete c54'34'
de c54 34
de 54 34
delete c(16,17)=$77$
de c(16,17) $77$
de 16,17 77
In all the examples we have just shown, the c, equals sign, single quotes and dollar signs are
optional as long as the components of each statement are separated by spaces. Additionally,
in assignments, set (or s) is optional.
Whenever you alter columns with set, emit or delete, the on-line edit checks that the columns you
are editing are within the range of the C array for the current job. If you are using the default array
of 1,000 cells, c1001 and above are out of range for editing.
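As a rough illustration of how the three commands differ, here is a minimal Python sketch. It is not Quantum code: the record is modeled as a dictionary mapping column numbers to sets of codes, and the function names are hypothetical.

```python
# Illustrative model only; not Quantum code.
C_ARRAY_SIZE = 1000  # default size of the C array


def in_range(column):
    """True if the column can be edited with the default C array."""
    return 1 <= column <= C_ARRAY_SIZE


def set_codes(record, column, codes):
    """Like s: overwrite whatever is in the column."""
    record[column] = set(codes)


def emit_codes(record, column, codes):
    """Like e: add codes to those already in the column."""
    record.setdefault(column, set()).update(codes)


def delete_codes(record, column, codes):
    """Like de: remove codes, leaving the remainder intact."""
    record.setdefault(column, set()).difference_update(codes)


record = {5: {"3"}, 9: {"1"}}
set_codes(record, 5, "7")      # s 5 7
emit_codes(record, 9, "45")    # e 9 45
delete_codes(record, 9, "1")   # de 9 1
print(record)
print(in_range(1001))          # False
```

After these three calls, column 5 holds only the code 7 and column 9 holds the multicode 45.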
The following commands may be used to determine a record's path through the remainder of the
edit section and the tabulation section:
ac (accept)
Accepts the record up to the point at which the online statement occurs, whether
or not it has been corrected. The record continues on through the rest of the edit
and will only be re-presented for correction by other online statements or at the
end of the edit if other errors are found. Records accepted in this way are written
to the clean data file if split or write are used.
rt (return)
Terminates the edit for that record: that is, the record is assumed to have reached
the end statement. If split or write has not yet been reached, the record will not be
written to the clean data file even though it will be included in any tables produced
by the run.
rj (reject)
Rejects the record. The record continues through the edit unless it is terminated
with rt. The record is copied to the dirty data file.
The add command adds new cards to the output data file and rm removes cards from it. To add a
card type, type add or ad followed by the number of the card type to be added. If you are adding
several different cards at once, separate the card type numbers by spaces. Quantum will then set the
appropriate thisread variable to be true so that the new card type will be written out with the rest of
the data. Thus:
-> ad 3 4
will set thisread3 and thisread4 to be true so that the new cards 3 and 4 will be written out. Each
card will contain as many columns as the record length defined for the current run. If the C array
already contains data for a card 3 or 4, Quantum issues an error message to this effect.
Removing cards is exactly the same, except that the appropriate thisread variables are reset to false
to prevent the unwanted cards from being written out. It does not alter the data in your original data
file. If you try to delete a card that is not currently in the C array (i.e., the thisread variable is already
false) an error message is displayed.
The edit command (abbreviation, ed) re-edits the record by sending it back to the start of the edit
or to the statement number given with online. If no more errors occur, the record is copied to the
clean data file.
If you prefer, you may hit the return key instead of typing ed.
cancel (abbreviation, ca) cancels on-line editing but continues passing records through the standard
edit program. Any errors found subsequently are not displayed on the screen for correction, but
records are still placed in the clean or dirty files as appropriate.
To find out about this file, see section 1.9, Customized texts in the Quantum User's Guide
Volume 4.
Clean and dirty data files are the terms used to refer to files of correct and incorrect or rejected
records created automatically by the edit statement split.
Each time a record is read and reaches split, it is written out to the appropriate file in its current
state. If any changes have been made with assignment statements, emit, delete, priority, require or
the on-line edit, they will be saved in the clean data file if the record is now correct or in the dirty
data file if the record still contains errors or has been rejected.
Split may occur several times in the edit, but each record will be written out once only. In the
example below, the second split is redundant since all records will have been written out by the first
one. The data to be checked is:
Card 1: column 146 contains 5
Card 2: column 234 contains 2
Card 3: column 309 contains 3
Let's suppose that the record has reached the require statement without error. Since c234'2' and
c309'3', the record is correct so it is copied to the clean file. However, when the next statement is
read and the contents of c146 are checked, we find that it contains a 5 which means that it must
be rejected and should be copied to the dirty file by the second split. This does not happen because
it has already been written out by the previous split. For this example to place the record in the dirty
file instead, it should read:
r sp c2341/5,c3091/5-& :&
if (c146'12') emit c180'1'; else; reject
split
Split is often used at the end of an edit after online. This causes all records found in error by write
and require statements to be offered in the on-line edit for correction and then saved in the clean
or dirty file according to the type of on-line commands you use. For example, if a record is flagged
as incorrect and you correct those errors, the record will be placed in the clean data file. The same
is true if you use ac to accept the record even if you do not make corrections. If you reject the record
with rj, the record will be placed in the dirty data file. By putting both statements at the end of the
edit you can be sure of seeing all erroneous records and of saving all records in their final state.
If some records are rejected from the run using reject;return, these records will not be included in
the clean or dirty files unless the data is split before the records are rejected:
split
if (c132n'1/9') reject; return
In this example, because split appears in the edit before reject;return, all records will appear in one
or other of the clean or dirty files (depending on whether or not they contain errors) even though
records in which c132 does not contain any of the codes 1 through 9 have their edit terminated and
are rejected from the tables.
If the two statements are written in the opposite order:
if (c132n'1/9') reject; return
split
then, because split appears after reject; return, only records in which c132 contains any of the
codes 1 through 9 will appear in the clean or dirty files. Again, which file the records are written to
depends on whether or not they contain errors.
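The effect of where split is placed can be traced with a small model. The Python sketch below is an illustration under stated assumptions and is not Quantum code: an edit is a list of statements, the string "split" stands for the split statement, and every other statement is a hypothetical function that returns None, "reject" or "reject;return".

```python
# Illustrative model only; not Quantum code.
def run_edit(record, statements):
    """Return 'clean', 'dirty', or None if the record reaches no split."""
    written = None
    in_error = False
    for statement in statements:
        if statement == "split":
            if written is None:          # a record is written out once only
                written = "dirty" if in_error else "clean"
            continue
        action = statement(record)       # hypothetical edit statement
        if action in ("reject", "reject;return"):
            in_error = True
        if action == "reject;return":
            break                        # the edit ends for this record
    return written


def check_c132(record):
    """Stands in for: if (c132n'1/9') reject; return"""
    has_1_to_9 = any(code in "123456789" for code in record["c132"])
    return None if has_1_to_9 else "reject;return"


bad = {"c132": "0"}
print(run_edit(bad, ["split", check_c132]))   # clean
print(run_edit(bad, [check_c132, "split"]))   # None
```

In the first call the record is written before it is rejected; in the second it is rejected before it reaches split, so it appears in neither file.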
For further information about using reject, see section 9.6, Rejecting records.
For further information about using return, see section 9.7, Jumping to the tabulation
section.
By default, an intermediate data file is created for splitting. The name of this file is clean.q. If the
run does not contain statements which alter the data (for example, recoding with assignment
statements or creating new columns) then this file will be identical to the original data file. In such
cases, you may save disk space during the run by splitting the original data file instead with the
statement:
split only
When we talk about the original data file, we do not mean that Quantum alters your original data
file in any way; merely that it reads records directly from this file and allocates them to the clean
and dirty files rather than taking a backup copy of this file and reading records from there.
You may not use split only when the datapass reads input from another program (for example,
when you use a corrections file to correct records rather than writing a forced edit or using the
on-line edit). Instead, you should run Quantum using the corrections file only and write all
records to a new data file. Then run the datapass on this new data file.
If you do an on-line edit but forget split or write, your changes will not be saved. Also if you
have created new cards and have not made thisread true for the new cards (for example,
thisread3=1 for a new card 3), they will not be written out.
If you use split on a levels (trailer card) job, splitting is switched on for all levels and must
therefore be part of the top level edit. Additionally, it must appear once only and must not be
part of an if statement. A reject statement at any level rejects the whole record and writes it to
the dirty file.
The last method of correcting errors is to create a file of corrections which will be merged with the
original data when it is read by a Quantum program. The correction file must exist in the directory
or partition in which you will be running your job.
Corrections are made by comparing the serial number of the record currently in the C array with
the serial number given with each correction in corrfile. Consequently, all serial numbers in corrfile
must be in the same order as those in the data file. The format for a correction record is:
serial ; corrections
for non-trailer card records, and
serial /n ; corrections
for records containing trailer cards. In both cases, serial is the record serial number and corrections
are the corrections to be made. The /n in the trailer card format is the read number defining the
trailer card to be corrected; it can be found from the error listing. For example, if our data contains
a card 1, three card 2s and a card 3, and we want to correct an error on the third card 2, the read
number would be /3 because the third card 2 is read into the C array during the third read. If /n is
omitted, the read number is assumed to be 1.
Corrections are entered as follows:
s cn = p      To overwrite a column.
e cn = p      To add codes to a column.
d cn = p
de cn = p     To delete codes from a column.
As in the on-line edit, the s and the equals signs may be omitted. If the correction refers to a field
of columns, you may define a string of codes in place of a single code.
Any number of corrections may be specified for a record as long as each correction is separated by
a semicolon. The data to be corrected may be a single column or a field, and the corrections may
be single-codes or multicodes enclosed in single quotes or strings enclosed in dollar signs. If the
data variable is larger than the string it is to contain, the string will be right-justified and padded
with blanks. If the string is longer than the data variable, a warning message is issued.
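The padding rule can be sketched in Python as follows. This is an illustration only, not Quantum code; the function name fit_string is hypothetical, and what Quantum stores when the string is too long is not stated here, so the sketch simply returns None in that case.

```python
import warnings


def fit_string(value, width):
    """Right-justify value in a field of the given width, padding with blanks."""
    if len(value) > width:
        warnings.warn(f"string {value!r} is longer than the {width}-column field")
        return None   # the stored result in this case is not modeled here
    return value.rjust(width)


print(repr(fit_string("123", 5)))   # '  123'
```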
Here is part of a sample corrections file:
0010; s c112'1'; e c212'3' ; c314='34' ; de c115'3'
0123 /4; c224'3' ; c212'4'
0246 c(316,318)=$123$
0555; c(140,180)=
The first record to be corrected is that with serial number 10. Column 112 is to be overwritten with
a 1, a 3 is to be added into column 212, column 314 is to be overwritten with the multicode 34
and the 3 in column 115 is to be deleted.
The second correction is to the cards in the C array after the fourth read for serial number 123. Both
corrections involve overwriting the original data with new codes.
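To make the record layout concrete, here is a minimal Python sketch that splits a correction record of the form described above into its parts. It is an illustration only: it is not how Quantum reads the file, the function name parse_correction is hypothetical, and it does not handle semicolons inside dollar-sign strings.

```python
# Illustrative parser only; not Quantum code.
KEYWORDS = {"s": "set", "set": "set", "e": "emit", "emit": "emit",
            "d": "delete", "de": "delete", "delete": "delete"}


def parse_correction(line):
    """Split 'serial [/n] ; corrections' into (serial, read, [(op, text)])."""
    head, *parts = line.split(";")
    serial_text, _, read_text = head.partition("/")
    serial = int(serial_text)
    read = int(read_text) if read_text.strip() else 1   # /n defaults to 1
    corrections = []
    for part in parts:
        part = part.strip()
        if not part:
            continue
        words = part.split(None, 1)
        if words[0] in KEYWORDS and len(words) == 2:
            corrections.append((KEYWORDS[words[0]], words[1]))
        else:
            corrections.append(("set", part))   # the s may be omitted
    return serial, read, corrections


print(parse_correction("0123 /4; c224'3' ; c212'4'"))
# (123, 4, [('set', "c224'3'"), ('set', "c212'4'")])
```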
Correcting data with a corrections file is considerably faster than using a forced edit of the
form:
if (c(101,103)=$123$) c109'2'
Corrections in corrfile are made before the statements in the edit section of your program are
executed. If you are rerunning your previous job to correct errors and you have not altered the edit
in any way, you may save more time by telling Quantum to read the data but not to recompile and
load your program. This is done with the option r on the Quantum command line.
For further information about options for Quantum runs, see chapter 16, Running Quantum
under Unix and DOS.
[Questionnaire extract not reproduced: questions 1 to 3, with routing (GOTO Q.3) past question 2.]
If the respondent replies no to question 1 or does not answer it at all, question 2 is not asked and
columns 9 and 10 are left blank. If the respondent replies yes to question 1 then question 2 should
be coded either with a numeric value or, perhaps, with && for a don't know answer. The blank data
and && are missing values.
You may also find missing values when a numeric field is incorrectly coded with a combination of
numbers and letters. This is usually the result of mistyping when the data is entered and can often
be corrected by looking at the questionnaire itself and then cleaning the data within the edit section
of the run.
Manual assignment of the special value missing_ to variables of your choice within the edit.
You may use these statements any number of times in the edit to toggle between using and not using
the missing values features.
The missingincs statement is always executed wherever it appears in the edit. This means that
although the compiler will accept statements of the form:
if (....) missingincs 1
Quantum will, in fact, switch on missingincs for the rest of the edit or until a
missingincs 0 statement is read. It does not switch on missingincs selectively for only
those records that satisfy the expression defined by the if clause.
If a job contains an edit and a tab section and missing values processing is used in the edit, the
setting of missingincs carries forward from the edit to the tab section. If the edit uses missing values
processing but the tab section does not require it, remember to end the edit with a
missingincs 0 statement.
Blanks in an otherwise numeric field are ignored, but totally blank fields are read as zero.
&s in an otherwise numeric field are ignored, but fields full of &s are read as zero.
Multicodes in an otherwise numeric field are ignored, but a field in which all columns are
multicoded is read as zero.
If you switch on missing values processing these rules are modified so that any field that is not
totally numeric or a combination of numbers and blanks is counted as missing.
Missing values are represented by the special value missing_.
Here is a table showing samples of data in a numeric field and the difference missing values
processing makes to the way that data is interpreted:
Data in numeric field      missingincs 0      missingincs 1
123                        123                123
13 (one column blank)      13                 13
10 (one column blank)      10                 10
ABC                        zero               missing_
1AB                        1                  missing_
000                        zero               zero
&&&                        zero               missing_
1&1                        11                 missing_
three blanks               zero               missing_
If you print variables whose values are missing_ in a report file or write them out to a data file,
Quantum will show their values as 1,048,576 rather than as the word missing_.
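The interpretation rules and sample values above can be expressed as a short Python sketch. This is a model for illustration, not Quantum code: the function name numeric_value is hypothetical, the string "missing_" stands in for Quantum's special value, and letters and ampersands are treated simply as non-numeric characters.

```python
# Illustrative model only; not Quantum code.
MISSING = "missing_"   # stands in for Quantum's special value
DIGITS = "0123456789"


def numeric_value(field, missingincs):
    """Interpret a numeric field with missing values processing on or off."""
    digits = "".join(ch for ch in field if ch in DIGITS)
    if missingincs:
        only_digits_and_blanks = all(ch in DIGITS or ch == " " for ch in field)
        if not digits or not only_digits_and_blanks:
            return MISSING
        return int(digits)
    # Without missing values processing, non-numeric characters are
    # ignored and a field with no digits at all is read as zero.
    return int(digits) if digits else 0


for data in ["123", "1 3", "ABC", "1AB", "000", "&&&", "1&1", "   "]:
    print(repr(data), numeric_value(data, 0), numeric_value(data, 1))
```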
If an arithmetic expression uses a variable whose value is missing, the value of the expression
differs depending on whether or not missing values processing is switched on. If missing values
processing is switched on the value of the expression is always missing_. If it is switched off, the
value of the expression is always zero. For example, if c(1,3) contains the string ABC:
missingincs 1
t1 = c(1,3) * 100
sets t1 to missing_, whereas:
missingincs 0
t1 = c(1,3) * 100
sets t1 to zero.
to test whether a variable has the special missing value. Instead, use the function:
ismissing(variable_name)
For example:
if (ismissing(t4)) ....
To use any subroutine, enter the call statement at the point at which the routine is required. The call
statement simply says:
call routine[(arguments)]
where routine is the name of the subroutine to be used and arguments are any other items of
information required by the routine. These will differ from routine to routine and are clearly
explained in the appropriate section below.
Sometimes you will have additional information available that is not part of each respondent's data
record but that nevertheless needs to be read into the C array for use in the analysis. For instance,
suppose we did some additional work on a chocolate purchasing survey and collected information
about the cost of various types of chocolate bars. We can transfer this information to the array in
two ways. We can either write an edit to check which brand has been bought and then copy the
appropriate price into the record using if and an assignment statement, or, in a much simpler
operation, we can put the costs into a look-up file and call them up as required with the fetch
statement.
The first line must contain exactly two whole numbers anywhere on the line. The first is the
key length, the second is the total record length including the key.
All other lines must start with the key which may be followed by any other information as
necessary.
The look-up for our chocolate survey is named costs and is as follows:
1 4
1 14
2 15
3 21
4 17
The first line tells us that the key is 1 character long and that the record length is four characters
long (the space in column 2 is part of that information). The other lines refer to the individual
chocolate bars. Brand A (coded 1) costs 14 pence, Brand B (coded 2) costs 15 pence, Brand C costs
21 pence, Brand D costs 17 pence.
When the first record is read, Quantum inspects c135 and compares its contents with the first field
of the look-up file. If c135'1' (brand A was bought) and a matching key is found in costs, the
information associated with that key is copied into the C array starting at c136. In our example,
brand A chocolate bars cost 14 pence so c(136,138) will contain $ 14$. If a matching key cannot
be found in costs, the destination area c(136,138) will be blanked out.
Calls for the second and third purchases would be entered as:
call fetch($costs$,c150,c151)
call fetch($costs$,c165,c166)
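The look-up itself can be pictured with a short Python sketch. It is an illustration of the file format and matching rule described above, not Quantum's implementation; the function names load_lookup and fetch_value are hypothetical.

```python
# Illustrative model only; not Quantum code.
def load_lookup(text):
    """Read a look-up file: first line gives key length and record length."""
    lines = text.splitlines()
    key_len, rec_len = (int(number) for number in lines[0].split())
    table = {}
    for line in lines[1:]:
        record = line.ljust(rec_len)[:rec_len]
        table[record[:key_len]] = record[key_len:]
    return key_len, rec_len, table


def fetch_value(lookup, key):
    """Return the text stored after the key, or blanks if the key is absent."""
    key_len, rec_len, table = lookup
    return table.get(key, " " * (rec_len - key_len))


costs = load_lookup("1 4\n1 14\n2 15\n3 21\n4 17")
print(repr(fetch_value(costs, "1")))   # ' 14'
print(repr(fetch_value(costs, "9")))   # '   '
```

The returned text is what would be copied into the destination columns: the cost preceded by the blank from column 2 of the look-up file, or all blanks when no key matches.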
When you read additional data in from fetch files, Quantum writes a summary of what it has done
to the file out2. The format of the report is as shown here:
Records  Used  Unused  Calls  Hits  Misses  File
      7     5       2    893   869      24  cost1
      3     3       0    196   196       0  cost2
This tells you that the run used two fetch files. The first file, cost1, contained seven keys; five were
present in the data and two were not. The file was called 893 times altogether and 869 times the
key in the data was found in the fetch file. The 24 misses refer to keys that were present in the data
but not in cost1.
The second file was called cost2 and contained three keys all of which were present in the data. The
file was called 196 times and every time the key in the data was found in the cost2.
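The relationship between the columns of this summary can be shown with a small Python sketch. It is only a model of how the counts relate to each other, not Quantum code; the function name fetch_summary and the sample keys are hypothetical.

```python
# Illustrative model only; not Quantum code.
def fetch_summary(file_keys, looked_up):
    """Count keys and calls in the way the summary columns are described."""
    keys = set(file_keys)
    used = keys & set(looked_up)
    hits = sum(1 for key in looked_up if key in keys)
    return {
        "Records": len(keys),            # keys in the fetch file
        "Used": len(used),               # keys that occurred in the data
        "Unused": len(keys - used),      # keys that never occurred
        "Calls": len(looked_up),         # number of look-ups made
        "Hits": hits,                    # look-ups that found a key
        "Misses": len(looked_up) - hits  # look-ups that found nothing
    }


print(fetch_summary(["1", "2", "3", "4"], ["1", "2", "2", "9"]))
```

For any file, Used plus Unused equals Records and Hits plus Misses equals Calls.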
Nine digits are allowed for each column making the maximum count in a column 999,999,999.
So, to load data from a fetch file called costs and to see a list of used and unused keys, you would
type:
call fetchx($costs$,c150,c151,3)
Used  Unused  Calls  Hits  Misses  File
   5       2    893   869      24  cost1
 - key   unused
 - key   unused
 - key   used
 - key   used
 - key   used
 - key   used
 - key   used
If you use fetchx more than once, the key listings are printed after the summary line to which they
refer. If the listing goes over onto a new page the column headings are repeated at the top of the
page.
When Quantum converts multicoded data into single-coded data, it takes the codes in the multicode
and transfers each one to a separate column in the data, thus creating a single-coded field of
columns in addition to the original multicode. You may choose which codes should be exploded in
this manner, and also the start column of the single coded field.
This conversion is done by the subroutine explode which is formatted as follows:
call explode(mc_start_col,num_cols,codes,sc_start_col)
where mc_start_col is the first multicoded column to be converted, num_cols is the number of
sequential columns to be converted, codes are the codes to be written out as single codes, and
sc_start_col is the first column in the single-coded field.
Codes are exploded in the order 1234567890&. If the first code specified in codes is present in
the multicode, that code will be copied into the first column of the single-coded field. If the code
is not present, the column is blank. For instance, if our data is:
----+----5
   1
   /
   4
and we write:
call explode (c144,1,'1/4',c151)
we will have:
----+----5----+
   1      1234
   /
   4
If we write:
call explode (c132,2,'1/5',c140)
then:
----+----4
 14
 25
 46
 7
becomes
----+----4----+----5
 14      12 4    45
 25
 46
 7
The explode statement says explode codes 1 to 5 in the two columns starting at column 132 into a
field starting at column 140. Quantum copies a 1 into c140 because there is a 1 in c132, and a
2 into c141 because there is also a 2 in c132. Column 142 is blank because there is not a 3 in
c132, and so on. Notice that the 7 in c132 and the 6 in c133 have been ignored because they are
not part of the code specification with explode.
If explode is called for any record in the data file, Quantum prints a map in the out2 print file listing
the contents of the multicoded columns and the columns into which the codes were transferred. If
explode is not called for any record, no map is produced.
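The conversion can be modeled in a few lines of Python. This sketch is an illustration, not Quantum's implementation: the record is a dictionary mapping column numbers to sets of codes, the code order 1234567890-& is an assumption, and the sketch assumes that each multicoded column fills as many single-coded columns as there are codes in the specification.

```python
# Illustrative model only; not Quantum code.
CODE_ORDER = "1234567890-&"  # assumed order in which codes are exploded


def explode(record, mc_start, num_cols, codes, sc_start):
    """Copy each listed code into its own column, leaving blanks for absent codes."""
    ordered = [code for code in CODE_ORDER if code in codes]
    target = sc_start
    for column in range(mc_start, mc_start + num_cols):
        present = record.get(column, set())
        for code in ordered:
            record[target] = {code} if code in present else set()
            target += 1


# c132 contains 1, 2, 4 and 7; c133 contains 4, 5 and 6.
record = {132: set("1247"), 133: set("456")}
explode(record, 132, 2, "12345", 140)
print(repr("".join(next(iter(record[c]), " ") for c in range(140, 150))))
```

The 7 in c132 and the 6 in c133 are not copied because they are outside the codes 1 to 5 given in the call.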
Writing subroutines in C
Quick Reference
To write C subroutines, either type them into a file called private.c in the project directory, or insert
them in the Quantum run immediately before or after the edit section as follows:
#c
C statements
#endc
You can also include executable C statements in the edit section itself, as long as you enclose the
code within #c and #endc statements.
Subroutines written in the C language must be filed in the file private.c in the current directory so
that they will be compiled automatically with the rest of your Quantum program. If you have
already compiled your subroutines before doing your Quantum run, the compiled version must be
stored in the file private.o in the current directory.
Alternatively, you can insert complete C functions immediately before or after the edit section as
long as you enclose the code between #c and #endc statements as shown here:
#c
/* C code
#endc
Here are some examples of how to include a function, square, that calculates the square root of a
number. The Quantum edit that calls this function may look something like this:
real square 1f
ed
cx(181,190):4 = square(cx(1,3))
filedef srdata data
write srdata
end
When calling C functions, be sure to add the f option (where f stands for function) to the end of the
declaration as shown above. If you omit this, Quantum will not recognize the function name and
will issue a syntax error.
If the function you are calling does not return a value, or if you do not need to save the return
value, you can use call to call the function and you do not need to declare it.
For further information about call, see section 13.1, Calling up subroutines.
The first example is a C function in the private.c file:
#include <math.h>
double square(double dval)
{
return (sqrt(dval));
}
In the second example, the C code for the function has been included directly after the end
statement in the Quantum run, and is enclosed by #c and #endc statements.
real square 1f
ed
.
end
#c
#include <math.h>
double square(double dval)
{
return (sqrt(dval));
}
#endc
It is also possible to include executable C statements directly into the Quantum edit section. Again,
the code must be surrounded by #c and #endc statements. Here is an example that calls the standard
C sqrt function directly and assigns the result to the Quantum variable x1.
ed
#c
#include <math.h>
x1 = sqrt(2.0);
#endc
cx(181,190):4 = x1
filedef srdata data
write srdata
end
In addition, any standard C library function, such as sqrt, can be declared and used directly in
Quantum. So the above example can also be written as:
real sqrt 1f
ed
x1=2
cx(181,190):4 = sqrt(x1)
filedef srdata data
write srdata
end
For more details, see Calling functions from C libraries, later in this chapter.
Subroutines written in Quantum must be placed at the end of the edit section, before the end
statement, and preceded by a return, thus:
/* main edit program here
return
/* subroutines here
end
Each subroutine starts with a subroutine statement and ends with a return. The format of the
subroutine statement is:
subroutine name[(var1, var2, ... ) ]
where name is the name of the subroutine. If you define more than one subroutine their names must
be unique within the first six characters of the name so, for example, sqroot and sqrt are acceptable
whereas sqroot and sqroot1 are not.
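A quick way to check a set of names against this rule is sketched below in Python. It is a convenience illustration, not part of Quantum, and the function name clashing_names is hypothetical.

```python
# Illustrative check only; not Quantum code.
def clashing_names(names):
    """Return pairs of names that share the same first six characters."""
    seen = {}
    clashes = []
    for name in names:
        prefix = name[:6]
        if prefix in seen:
            clashes.append((seen[prefix], name))
        else:
            seen[prefix] = name
    return clashes


print(clashing_names(["sqroot", "sqrt"]))      # []
print(clashing_names(["sqroot", "sqroot1"]))   # [('sqroot', 'sqroot1')]
```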
var1, var2, and so on, are variables which the subroutine will use. These variables are generally
referred to as the arguments of the subroutine.
For more information on the variables file, see chapter 14, Creating new variables in this
volume, and chapter 1, Files used by Quantum in the Quantum User's Guide Volume 4.
Variables defined in the variables file or before ed are called external variables and may be
accessed and changed by statements within a subroutine. Variables defined after ed or inside a
subroutine are local variables and cannot be changed by a subroutine. For example:
real cost 1
int items 1
ed
int nshop 1
/* edit statements
return
/* subroutines
end
The variables cost and items are defined before the ed statement. This means they are external
variables and can have their values changed by a subroutine. The variable nshop is defined after ed
so it is a local variable. This means it cannot have its value changed by the subroutine, even though
its value can be passed to the subroutine for use by it.
Information stored in external variables is always available within a subroutine, and may be
accessed and changed regardless of whether you pass it as an argument to the subroutine. For
example, if we define an integer variable called items in the variables file, we can read its contents
and change them in the subroutine even if we do not include items as part of the call statement.
We might write:
call sub1
return
subroutine sub1
if (items.gt.5) emit c134'1'
return
end
This checks, inside the subroutine, whether the value of items is greater than 5 and, if so, inserts a
1 in column 134. We do not pass the value of items to the subroutine because it is an external
variable which is available to the subroutine as a matter of course. Because items is an external
variable we could change its value in the subroutine if we wished. For instance, we could reset it
to zero.
Local variables which are required in the subroutine must be passed to the routine as arguments.
If the items variable was defined after the ed statement we would have to name it on the call
statement and on the subroutine statement thus:
ed
int items 1
call sub1(items)
return
subroutine sub1(items)
if (items.gt.5) emit c134'1'
return
end
This example performs the same task as the previous one. The difference is that this time items is
a local variable, so we must pass it to the subroutine. Once inside the subroutine, we cannot change
the value of items in any way.
In neither example is it necessary to pass c134 as an argument as all cells in the C array are external
variables.
When you use a subroutine which requires arguments, be sure that you call it with as many
arguments as are listed on the subroutine statement for that subroutine. If you give too many or too
few arguments, errors will occur.
For example:
call conv(gallons,liters)
.
subroutine conv(gallons,liters)
is correct because we call the subroutine with the same number of arguments as there are in its
definition, but:
call conv(aa,bb,cc)
.
subroutine conv(aa,bb,cc,dd)
is incorrect because we are calling conv with one argument fewer than its definition specifies.
When you return to the edit from a subroutine, any changes made to external variables will still
exist, but values assigned to local variables defined in the subroutine will not be accessible from
the main edit program. For example:
call sub1
return
subroutine sub1
int doneit 1
if (items.gt.5) emit c134'1'
items = 0
doneit = 1
return
end
Once the subroutine has been executed and control has returned to the edit, the value of items will
be zero but doneit will have no value at all.
Arguments
Generally, subroutines only need arguments when you are passing the values of local edit variables
to the subroutine. All arguments on the call statement must have a corresponding argument of the
same type on the subroutine statement. This is because Quantum does not compare the names of
the arguments on the call and subroutine lines. It simply passes the value of the first argument given
with call to the first argument named with subroutine and so on. For instance, if gallons and liters
are local edit variables and we want to use their values in the subroutine calc, we might write:
int gallons 1s
real liters 1s
ed
call calc(gallons,liters)
.
subroutine calc(input,output)
int input
real output
Here, the value of gallons is passed to input while the value of liters is passed to output. Input and
output are variables used solely within the subroutine so they are defined in the subroutine.
However, if you have a subroutine that is called more than once with different external variables,
you would represent them with local variables in the subroutine. For instance:
if (numb(c119).le.2) call pchk(c120,total)
if (numb(c119).gt.2) call pchk(c220,tot2)
.
subroutine pchk(n1,n2)
data n1
data n2
.
return
Here, n1 represents c120 or c220 and n2 represents total or tot2. n1 and n2 are local to the
subroutine so they are defined after the subroutine statement.
Single data variables (columns in the C array or user-defined data variables with one cell only) are
passed to a subroutine by naming the variable on a data statement as shown here:
subroutine chk(flav,prefb)
/* flavors bought
data flav
/* brand preferred
data prefb
Any multicodes present in this field are ignored. If you have a multicoded field and you want to be
able to access the codes in each multicode, you must treat the field as a series of single data
variables and pass each one separately, using a data statement, rather than passing the field as a
whole. When variables are passed with call they are written in exactly the same way as you would
write them anywhere else in your edit. For example:
call sub1(c15,gallons,cost,c(20,28))
passes the address of the data variable c15, and the integer values of the variables gallons and cost
and the field c(20,28).
Here is a chart summarizing how to define variables for subroutines:
Main definition   Call argument   Subroutine argument   Subroutine definition
int item 1        item            purch                 int purch
int shop 5s       shop3           shop                  int shop
real cost         cost            cost                  real cost
data c 1000s      c(10,11)        week                  int week
data c 100s       c15             pref                  data pref
data tried        tried           tried                 data tried
Notice that in the main definitions the size of the variable is defined, whereas in the subroutine
definition no size is required since all values are passed as integer values or, in the case of a single
data variable, as an address.
As our comments show, the fields to be checked are c(21,23) for those already subscribing to the
satellite network and c(24,26) for non-subscribers. Both calls to the subroutine subchk name the
columns in the field individually. This is because we want to look at the codes present in each
column. We have not defined the data variables at the start of the edit because they are read
automatically from Quantum's variables file. This means that they are external variables and can
have their values accessed by the subroutine.
The subroutine statement uses local variables with names describing the contents of the variables
they represent. The variable high represents c21 and c24 which tell us how likely the respondent
would be to take the new station if it cost $20 a month. Similarly the variable low represents c22
and c25 and dep represents c23 and c26. All local variables are defined in the subroutine as the
name of the variable they represent.
The require statement simply checks whether each column is single-coded in the range 1/59.
If you glance back at the example, you'll notice that although we're talking about columns in the
data, we've actually treated them as integers. The call to the subroutine simply gives the column
numbers without a preceding c. The subroutine itself defines its arguments as integers and then
uses them as pointers into the C array. There are two reasons for this:
First, it allows Quantum to report the column numbers correctly if it finds records which fail
the require statement. Passing columns to a subroutine as data variables causes Quantum
always to refer to column 0 in the output from require regardless of the true column number
which is in error.
Second, it enables you, if you wish, to set new codes into the columns used in the subroutine.
Normally, any changes made to the C array inside a subroutine are forgotten when control
passes back to the main program. Referring to the columns as pointers into the C array, as in
this example, causes any changes to the C array to be remembered when the subroutine
finishes.
The C runtime and maths libraries contain a number of general-purpose functions, some of which
may be useful in Quantum programs. For example, if you want to square a number or calculate a
square root, you will almost certainly find functions that do this in one of the C libraries.
Before you use a C function in Quantum, read the documentation on that function to find out what
parameters it needs, and of what type. Having done this, you then need to provide this information
in a format Quantum understands. In order to explain how you do this, we'll use the pow function
which raises a value to a given power.
The Unix documentation for pow( ) states that the function expects two arguments, both of which
are double precision real variables. This means that your Quantum program will need to hold the
value and the power (exponential) in x variables:
x1 = 5
x2 = 2
x3 = pow(x1, x2)
Even if one of the arguments is a constant, as both are in this example, you must assign the values
to variables as Quantum will not accept real constants within the function's parentheses.
pow( ) returns a value which you want to use in your Quantum program. In order to do this, you
must define the function in the variables section of your run (that is, in the variables file or at the
top of your program, before the ed statement). The function's type must be set to the type of data
the function returns. pow( ) returns a double precision value so we define it as:
real pow 1f
1f
cx(11,14)
2.0
pow(x1, x2)
C type            Quantum type
char              int
short             int
int               int
long              int
unsigned char     int
unsigned short    int
unsigned int      int
unsigned long     int
float             real
double            real
When looking things up in this table, bear in mind the following points:
Quantum uses long integers, so all integer variable types except unsigned long can be
accommodated.
Quantum does not support unsigned values, but this is only a problem with unsigned long
variables.
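The lookup described by the table and the two notes above can be sketched in Python. This is an illustrative model only, not Quantum syntax; the names QUANTUM_TYPE and quantum_type are hypothetical and do not exist in Quantum.

```python
# Illustrative model only: maps a C type name to the Quantum type used
# when declaring a function that returns it, following the table above.
QUANTUM_TYPE = {
    "char": "int", "short": "int", "int": "int", "long": "int",
    "unsigned char": "int", "unsigned short": "int",
    "unsigned int": "int", "unsigned long": "int",
    "float": "real", "double": "real",
}


def quantum_type(c_type):
    """Return (quantum_type, fits) for a C type name.

    fits is False only for unsigned long, the one integer type that the
    notes above say cannot be accommodated.
    """
    name = " ".join(c_type.split())  # normalise spacing between words
    return QUANTUM_TYPE[name], name != "unsigned long"


print(quantum_type("double"))
print(quantum_type("unsigned long"))
```

For pow( ), which returns a double, the lookup gives real, which agrees with the definition shown earlier.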
If you are not interested in the value the function returns, or the function does not return a value at
all, you can treat it as a subroutine and run it using call, as you would for the standard Quantum
functions. For example:
call printf($Print this text$)
Quantum stores all names in lower case. So if you want to reference an external function
whose name includes upper case characters, you need to define a function in private.c using a
name in lower case, to call the external function.
For more information about private.c, see section 1.12, C subroutine code file, in the
Quantum User's Guide Volume 4.
However, if you create a data variable whose name ends with a number by writing:
data safe1 15s
Quantum does not recognize safe112 as column 12 of the data. So you have to write:
safe1(12)
So, to avoid unexpected conflict statements during a Quantum run, it is probably simpler to name
your variables using the letters A through Z and the underscore character only.
Before Quantum will recognize named variables in your program, you must say what type of
information the variable is to contain and how many cells it should have. If you wish to increase
the size of the C array, you must indicate how many cells you require.
There are three places that you can declare named variables:
In the variables file. Variables declared here are available in the edit and tab sections of your
program and also in subroutines, and may be changed by the edit or by a subroutine.
At the start of your program before the ed statement. Variables declared here are available in
the edit and tab sections of your program and also in subroutines and may be changed by the
edit or by a subroutine.
In the edit after the ed statement. Variables declared here are available in the edit section only
and may only be changed there. They are unknown to the tab section and to subroutines.
data
int
real
The variable name: C, T or X to increase the number of data, integer or real variables available;
any name for a new variable.
The variable size. This is generally the number of cells the variable is to have.
increases the size of the C array to 1500 cells. This provides space for records with up to 14 cards
per respondent.
int number_of_trips 5
creates an integer variable called number_of_trips which can store up to five whole numbers.
real price 10
Increasing the C array with a data, int or real statement does not cause Quantum to clear the
extra cells between records. However, when you increase the C array by using the max=
option on the struct statement, Quantum automatically clears the entire array between records.
For further information on max=, see Highest card type number in chapter 6, How Quantum
reads data.
When we first talked about variables we said that the individual cells of an array may be referenced
by following the name of the array by the cell number enclosed in parentheses. Therefore:
meals(3)
c(100)
We also mentioned that you may omit the parentheses when you are referring to a single cell in the
C array so that c100 means the same as c(100).
To make this possible you must follow the variable size with the letter s. This is particularly
important when you are increasing the size of the C array as, without it, any references to, say, c15
will cause errors. For instance, if we write:
data c 1200s
we are increasing the size of the C array to 1200 cells, enough for 11 cards per record. Because
the array size is followed by s we can write c1056 when we mean c(1056): Quantum will
substitute the parentheses automatically.
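The arithmetic behind these array sizes, and the substitution of parentheses, can be sketched in Python. This is a model for illustration, not Quantum syntax. It assumes that each card occupies a block of 100 cells starting at cell 101 for card 1; that layout is an assumption, although it agrees with the figures quoted in this chapter (1200 cells for 11 cards, 1500 cells for 14 cards, 1600 cells for card type 15). All function names are hypothetical.

```python
import re

CELLS_PER_CARD = 100  # assumption: card n starts at cell n*100 + 1


def cells_needed(highest_card):
    """Array size (a multiple of 100) that holds cards 1..highest_card."""
    return (highest_card + 1) * CELLS_PER_CARD


def highest_card(cells):
    """Highest card type that fits in a C array with this many cells."""
    return cells // CELLS_PER_CARD - 1


def expand_shorthand(ref):
    """Rewrite c1056 as c(1056); any other reference is returned unchanged."""
    match = re.fullmatch(r"c(\d+)", ref)
    return f"c({match.group(1)})" if match else ref


print(highest_card(1200))
print(cells_needed(15))
print(expand_shorthand("c1056"))
```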
The dimension of the C array will be taken automatically from the value of max= on the struct
statement if this is greater than the dimension requested in the variables file or at the start of your
program file.
For example, if you have:
int c 1300s
in your program, the C array will be increased to 1600 cells to accommodate card type 15.
Do not confuse a declaration of the form:
int brand 1s
with a declaration of the form:
int brand 1
The former creates the variable brand as an array, and you can refer to it in your program as
brand1. The latter creates a single named variable that must be referred to as brand.
This gives you the 1000 data variables, 100 real variables and 200 integer variables mentioned in
chapter 4, Basic elements.
The second statement (colreal cx c) informs Quantum that variables referred to as cx are, in fact,
data variables whose contents are to be treated as real numbers.
For further information about external and local variables, see Passing information between
the edit and a subroutine in chapter 13, Using subroutines in the edit.
15 Data-mapped variables
Data-mapped variables can be used to store the answers to questions, both numerical and
categorical. When storing numerical information, a data-mapped variable can be treated in the
same way as other numerical variables. Categorical values are generally stored and retrieved as text
strings, that is, the response texts of a question.
As the name suggests, data-mapped variables are typically used in conjunction with one or more
data-mapping files and allow Quantum specs to be written without needing column and code
information. Instead, the Quantum spec can be written so that it automatically retrieves the information
it needs from the data-mapping files used. Using this technique, you can specify conditions in your
Quantum run by referring to the response texts that appear in your questionnaire, rather than having
to specify the columns and codes that are involved. An example of this could be:
n01Blue;c=colors $Blue$
n01Green;c=colors $Green$
n01Red;c=colors $Red$
as opposed to:
c=opinions$I liked the first brand much more than the second$
However, you do not need to specify the whole response text, just enough to uniquely identify
it. (In addition, it is very likely that the specifications will have been automatically generated
rather than hand written.) The above example could therefore be written as:
c=opinions$I liked\$
You type in the characters which uniquely identify the text and then append the \ character to
ignore the remaining characters in the string. This is described in more detail later.
For more details about generating a Quantum specification automatically, see section 15.10,
Automatically generating a Quantum spec.
All this does not mean that you must have data-mapping files to use data-mapped variables.
However, without a data-mapping file, you would have to manually load values into the data-mapped variables, which removes many of these advantages.
For numerical fields, the location of the fields in the data file (that is, the card and column
specifications).
For categorical fields, the response text and location (that is, the card, column, and codes) and,
possibly, the unique ID for each category.
Additional information that is not used by Quantum such as the limits of a numerical range.
As a normal variable definition, using the mapvar variable type. The syntax is:
mapvar variable_name [size]
For example:
mapvar my_variable 1
Where data-mapped variables are defined using mapvar, the following should be noted:
If you define an array of mapvar variables (that is, by specifying a size greater than 1), the
actual size of the array is determined by its use and not by the size specified. For example,
if you have the question Which colors did you paint the walls of each room, you could
specify an array of rooms as:
mapvar rooms 2
Using the *usemap statement to introduce a map file. The syntax for this statement is:
*usemap mapfile_name
For example:
*usemap project.qdi
If you use a *usemap statement to introduce a map file, a variable is automatically defined for
each item in the file.
Although you may use the same name for a mapvar variable and a *usemap file, you cannot use
the same name for a data-mapped variable and any other type of variable. For example, you
could assign the name preferences to both a data-mapped variable and a map file, but you
could not then use this name for a data, integer or real variable.
Test the value using logical operators (that is, .eq., .ne., .lt., .le., .ge. and .gt.), for example:
r(my_mapvar .ne. 0)$Zero value given$
Analyze the value using the var statement (described later), for example:
var my_var;=;base=Total;1;2;3;i;4-5;6+
For data-mapped variables storing categorical data, its use is similar to using data variables that are
storing categorical data. In this case, you can:
Analyze the value using the var statement (described later), for example:
var my_mapvar;base=Total;Coke;Pepsi;Other=$_other$;DK=rej
In addition to normal response code names, packages such as Quancept allow certain special
responses in the data. In order to check or set these names, the following special response texts are
recognized by Quantum:
Exclusive responses       $_null$, $_dk$, $_ref$, $_na$
Non-exclusive responses   $_other$
Response groups           $_base$, $_possible$, $_answered$, $_normal$, $_precode$, $_special$
Miscellaneous             $_uniqid$
When using data-mapped variable arrays, you can refer to the array element just as you would any
other variable array, that is, by specifying the element using the numerical index. However, if the
data-mapping file contains names for the array elements (note that using a qdi file which was
generated by Quancept will create arrays for variables that are iterated and the elements are named
after the iterations), then you can use those names to reference specific array elements. For
example, if you had the variable array wrate in the Quantum run that stores the rating given to the
widget suppliers Wilsons Wonderful Widgets and Just Widgets, you could refer to the rating for
each supplier as:
wrate(1)
wrate(2)
or:
wrate($wilsons wonderful widgets$)
wrate($just widgets$)
Also, if the array elements have unique IDs associated with them, then these too may be used to
refer to the elements. As with other uses of unique IDs, the ID text is converted to a response text
format by placing it within parentheses and prepending the underscore character. Therefore, using
the same example as above, you could write this as:
wrate($_(wilsons)$)
wrate($_(justwids)$)
If you are not using a mapping file, or the data-mapped variable is not represented in the mapping
file, then each array element will be created when it is first used.
*usemap map_filename
*usemap map_filename
analysis specs
*include data_filename
Of course, if you have a second data file with a different mapping scheme, you would:
In your run file:
*usemap map_filename1
*usemap map_filename1
analysis specs
*include data_filename1
*usemap map_filename2
*include data_filename2
You can see above how your Quantum run specifications do not change at all. Also note that you
only need to introduce one of the maps in your Quantum specifications. This is because you are just
using the map file to define your variables. If the same items exist in both files, you do not need to
define them twice.
You may sometimes need to assign values to data-mapped variables explicitly. The ways of doing
this are summarized below:
For example:
q23 = t1 + 7
If the variable is either clear or already holds a numerical value, then the result of the arithmetic
expression is stored as a numerical value.
If, however, the variable is already set to one or more categorical responses, then Quantum
attempts to set the categorical response that corresponds to the result of the arithmetic
expression. For example, if the result of the expression is 5, then Quantum sets the 5th
categorical response and all other categorical responses are cleared. Consider the categorical
question:
Q.23 Which of these newspapers do you read regularly?
1. Times
2. Telegraph
3. Independent
4. Mail
5. Other
Y. Don't know
You may then have a data-mapped variable called Q23. Typically, you would expect the
variable to be tested for the exclusive response Mail as:
if (q23=$Mail$) ...
However, you could refer to it by its numeric value (that is, 4):
if (q23.eq.4) ...
So, if you wanted to explicitly set Q23 to be $Mail$, you can do it in one of two ways:
q23 = $Mail$
q23 = 4
However, since the variable is associated with a list of responses, the following would
give a data error:
q23 = 7
This is because there is not a seventh response in the list. If, however, the variable were a true
numeric type, this would be fine.
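The rule for assigning a number to a variable that holds categorical responses can be modelled in Python as follows. This is a sketch for illustration, not Quantum syntax; the function name set_by_number is hypothetical, and a Python ValueError stands in for the data error that Quantum reports.

```python
# Response list for Q.23, in questionnaire order.
Q23_RESPONSES = ["Times", "Telegraph", "Independent", "Mail", "Other",
                 "Don't know"]


def set_by_number(responses, number):
    """Return the single response selected by a 1-based position.

    All other responses are cleared, so the result holds one text only.
    Raises ValueError where the guide says a data error would occur.
    """
    if not 1 <= number <= len(responses):
        raise ValueError(f"no response number {number} in the list")
    return {responses[number - 1]}


print(set_by_number(Q23_RESPONSES, 4))
```

Calling set_by_number(Q23_RESPONSES, 7) raises the error, because the list has no seventh response.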
Assign a response.
You can assign a specific response either by using the method for assigning a numerical value,
or by using the following syntax:
variable_name = $response_text$
For example:
q1 = $Once a week$
If unique ID texts are defined for responses in the data-mapping file, you can assign a response
using its unique ID text. The syntax for this is:
variable_name = $_(unique_ID_text)$
For example:
q1 = $_(Once a week)$
If either the source variable or the target variable holds a numerical value, then the numerical
value of the source variable is copied to the target variable. If this is not the case, then the
categorical responses are copied.
Categorical responses are transferred by matching the text. This means that the target variable
may not contain the same positional value for a response text as the source. For example, if the
variable drank_most_recently contained the response texts:
$Coke$
$Pepsi$
then here the response text $Coke$ would be referenced as response number 1.
then here, the response text $Coke$ could be referenced as response number 2. If a respondent
had $Coke$ as the answer to drank_most_recently, then the statement:
drank_at_all = drank_most_recently
would result in both variables having the value $Coke$. This would, however, be response
number 1 in drank_most_recently, but response number 2 in drank_at_all.
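Copying by text rather than by position can be modelled in Python as follows. This is an illustrative sketch, not Quantum syntax. The response list used for drank_at_all is hypothetical: the only fact taken from the description above is that $Coke$ is response number 2 in that variable. Dropping texts that the target does not define is also an assumption.

```python
# Response lists. DRANK_AT_ALL is a hypothetical list in which Coke is
# the second response, as in the description above.
DRANK_MOST_RECENTLY = ["Coke", "Pepsi"]
DRANK_AT_ALL = ["Pepsi", "Coke"]


def response_number(responses, text):
    """1-based position of a response text in a variable's response list."""
    return responses.index(text) + 1


def copy_by_text(source_selected, target_responses):
    """Copy selected texts to the target by matching the text itself.

    Assumption: texts that the target does not define are dropped.
    """
    return {text for text in source_selected if text in target_responses}


drank_at_all = copy_by_text({"Coke"}, DRANK_AT_ALL)
print(drank_at_all)
print(response_number(DRANK_MOST_RECENTLY, "Coke"))
print(response_number(DRANK_AT_ALL, "Coke"))
```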
Responses are transferred over using the response text as described above. Bear in mind that
the following special responses are exclusive and can only appear in the absence of all other
responses:
$_ref$
$_dk$
$_null$
$_na$
If an assignment results in a combination of any of these exclusive codes and one or more other
responses, Quantum removes the exclusive special responses from the target variable. If an
assignment results in more than one exclusive special response and no other responses,
Quantum removes all but one of the exclusive special responses using a defined order of
precedence. The order of precedence is $_ref$, $_dk$, $_null$, $_na$. So, if an assignment
results in $_null$ and $_dk$, Quantum removes the $_null$ response and leaves the $_dk$.
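The two clean-up rules for exclusive special responses can be modelled in Python as follows. This is a sketch of the rules as stated above, not Quantum syntax; the function name resolve_exclusive is hypothetical, and the response texts are written without their enclosing $ signs.

```python
# Exclusive special responses, highest precedence first.
EXCLUSIVE_PRECEDENCE = ["_ref", "_dk", "_null", "_na"]


def resolve_exclusive(responses):
    """Apply the exclusive-response rules to the result of an assignment."""
    exclusive = [r for r in EXCLUSIVE_PRECEDENCE if r in responses]
    others = set(responses) - set(exclusive)
    if others:
        # Rule 1: any other response removes the exclusive ones.
        return others
    # Rule 2: with no other responses, keep only the highest-precedence
    # exclusive response (or nothing if there were none).
    return set(exclusive[:1])


print(resolve_exclusive({"_null", "_dk"}))
print(resolve_exclusive({"Coke", "_dk"}))
```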
Collecting the logical AND of the responses from several data-mapped variables.
In a similar way to the OR function, you can use the AND function to collect only responses that
appear on every one of the specified list of variables. You can assign the result to a variable as
follows:
target = AND(source1, source2 [, source3, ...])
For example:
tried_both_times = AND(tried_first, tried_second)
As with the OR function, exclusive special responses are unset if they are not valid.
Collecting the logical XOR (exclusive OR) of the responses from several variables.
Again, similar to the OR function, you can use XOR to collect responses that appear on only one
variable of a specified list of variables. This means that if a response is not mentioned on any
of the variables, it is not collected. In addition, if the same response is mentioned on two or
more of the variables, that response will also not be collected. You can assign the result to a
variable as follows:
target = XOR(source1, source2 [, source3, ...])
For example:
tried_once_only = XOR(tried_first, tried_second)
As with the OR function, exclusive special responses are unset if they are not valid.
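The way AND and XOR collect responses can be modelled with Python sets. This is an illustrative sketch, not Quantum syntax; the function names are hypothetical, and the unsetting of invalid exclusive special responses is not modelled here.

```python
from collections import Counter


def and_responses(*variables):
    """Responses that appear on every one of the variables."""
    return set.intersection(*map(set, variables))


def xor_responses(*variables):
    """Responses that appear on exactly one of the variables."""
    counts = Counter(r for v in variables for r in set(v))
    return {r for r, n in counts.items() if n == 1}


tried_first = {"Coke", "Pepsi"}
tried_second = {"Pepsi", "Fanta"}
print(and_responses(tried_first, tried_second))
print(sorted(xor_responses(tried_first, tried_second)))
```

With three or more variables, xor_responses still drops any response mentioned on two or more of them, as described above.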
If the data-mapped variable contains a numerical value, then the value of the variable is tested.
If the data-mapped variable contains a single categorical response, then the value of the
variable is the response number (that is, the first response is counted as 1).
If the data-mapped variable contains several categorical responses, the value of the variable is
zero.
Finally, if the data-mapped variable is unset, then the value of the variable is zero.
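The four rules above can be modelled in Python as follows. This is a sketch for illustration, not Quantum syntax; the function name numeric_value and its two parameters are hypothetical ways of representing the state of a data-mapped variable.

```python
def numeric_value(number=None, selected=()):
    """Value used when a data-mapped variable is tested numerically.

    number   -- the numerical value held, or None if there is none
    selected -- 1-based numbers of the categorical responses held
    """
    if number is not None:
        return number          # numerical value: tested as it stands
    if len(selected) == 1:
        return selected[0]     # one response: its response number
    return 0                   # several responses, or unset


print(numeric_value(number=7))
print(numeric_value(selected=(4,)))
print(numeric_value(selected=(1, 3)))
```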
To test if a data-mapped variable has only one categorical response, and it is the one specified,
you can use the = operator.
To test if a data-mapped variable has one or more categorical response stored, including the
one specified, you can use the & operator.
Testing for categorical response texts is achieved by specifying the variable name, the test operator
and the response text. The syntax is very similar to the way the presence of response punch codes
are tested when using standard data variables. However, it is not possible to test for the presence of
several response texts in a single test. The following examples show how you might check for
responses using standard data variables with punch codes, and using data-mapped variables with
response texts:
Standard data variables   Data-mapped variables
c123'1'                   q23$Yes$
c146n'7'
c109='7'                  q9=$Portable CD Player$
c127'34'                  q27 $Yesterday$.or.q27$Today$
In the same way as using standard variables, you can omit the test operator, in which case & is
assumed. You can also combine or negate tests using the logical operators .or., .and. and .not., and
adjust the order of evaluation using parentheses.
In addition to using just the response texts associated with a given variable, you can also:
Use one of the special response texts described earlier (that is, one of $_base$, $_normal$,
$_dk$, $_ref$, $_other$, $_na$, $_null$, $_precode$, $_special$, $_answered$ and
$_possible$).
Use the unique ID associated with a response using the syntax $_(unique_ID)$.
If you have specified the unique ID on an element (using the uniqid= keyword), then you may use
the special response text $_uniqid$ as a shorthand for $_(unique_ID)$.
You may specify only as much of the response text as is needed to uniquely identify it. When
doing so, you must append a \ character to the text. For example, $Very impressed$ may
become $Very\$ and $Quite impressed$ may become $Quite\$ (or, in fact, just $V\$ and
$Q\$ if these strings are unique).
You may also specify response texts (or the unique ID) using uppercase characters, lowercase
characters, or any combination of the two.
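Matching of abbreviated and mixed-case response texts can be modelled in Python as follows. This is an illustrative sketch, not Quantum syntax. The function name match_response is hypothetical, and raising an error when the text matches no response or more than one response is an assumption; the guide only says that the text given must be unique.

```python
def match_response(pattern, responses):
    """Find the response text that a pattern refers to.

    A trailing backslash means "ignore the remaining characters".
    Upper and lower case are treated as equivalent.
    """
    abbreviated = pattern.endswith("\\")
    wanted = (pattern[:-1] if abbreviated else pattern).lower()
    if abbreviated:
        hits = [r for r in responses if r.lower().startswith(wanted)]
    else:
        hits = [r for r in responses if r.lower() == wanted]
    if len(hits) != 1:
        # Assumption: anything other than a unique match is an error.
        raise ValueError(f"{pattern!r} does not identify one response")
    return hits[0]


texts = ["Very impressed", "Quite impressed"]
print(match_response("Very\\", texts))
print(match_response("q\\", texts))
```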
By default, the var statement uses the element text as the response text to create the condition
required. If this is not correct (as with the Coffee maker element), you can specify the required
response text using the = operator.
When analyzing data-mapped variables that contain numerical values, you can use either the
var or val statements. For example, the following two statements are equivalent:
val q17;=;base=Total;hd=Number Bought;hd=-------------;
+1;2;3;4;5;6;7;8;9;10 or more;Don't know/Not answered=rej
var q17;=;base=Total;hd=Number Bought;hd=-------------;
+1;2;3;4;5;6;7;8;9;10 or more;Don't know/Not answered=rej
Whether analyzing numerical or categorical data, var has an added advantage over col and val
equivalents in that you can combine several variables on one statement. This ability extends
the power of the var statement so that it becomes an equivalent to the fld statement too. To
combine two or more variables, simply place a comma-separated list of variables where you
would normally specify the single variable. For example, if the variable q25 held information
about the first appliance purchased and the variable q26 on the second purchased, you could
use the var statement to combine them as follows:
l q25_26
var q25,q26;base=Total;hd=Appliances purchased;hd=--------------;
+Fridge;Freezer;Microwave oven;
+Coffee maker=$filter coffee\$;Toaster;
+None of these;Don't know/Not answered=rej
Often, lists of items such as these are specified in the data using a code number. Therefore, if,
instead of the actual names, a numerical code is given to each type of appliance and the codes
are assigned to q25 and q26, the above example can be written as:
l q25_26
var q25,q26;base=Total;hd=Appliances purchased;hd=-------------;
+Fridge=134;Freezer=135;Microwave oven=102;
+Coffee maker=117;Toaster=203;
+None of these=0;Don't know/Not answered=rej
If you are using a data-mapped variable array, then you must follow the var(#)= keyword by a
specific array element. For example:
l rates
n01Rating for Wilsons Wonderful Widgets;var(1)=wrate($wilsons\$)
n01Rating for Widgets R Us;var(1)=wrate($widgets\$)
side
var var1;=;1;2;3;4;5;Don't know/No Answer=rej
This is a useful feature in that if two or more tests are to be performed on the same operation,
then it may be better to assign the result to a new variable and test that. This saves you repeating
the AND, OR or XOR operation many times.
For example:
t1 = numb(q25, q26)
n01Average number of purchases;inc=numb(q23)
flt ;c=numb(q23).gt.1
numb counts only precoded responses. This means that it will include all user-defined
responses and also the $_other$ response; it does not include the $_dk$, $_ref$, $_na$ or
$_null$ responses.
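The counting rule can be modelled in Python as follows. This is an illustrative sketch for a single variable, not Quantum syntax; the function name count_precoded is hypothetical, and the response texts are written without their enclosing $ signs.

```python
# Special responses that the count leaves out.
NOT_COUNTED = {"_dk", "_ref", "_na", "_null"}


def count_precoded(selected):
    """Count the responses held, leaving out the four special responses.

    User-defined responses and _other are included in the count.
    """
    return sum(1 for r in set(selected) if r not in NOT_COUNTED)


print(count_precoded({"Fridge", "Toaster", "_other"}))
print(count_precoded({"_dk"}))
```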
Using an existing Quancept qdi file, you can automatically generate a data-mapped Quantum
specification. The qdi file contains details of all the variables defined, their possible values (that is,
the possible responses), and where in the data records the information for particular variables is
located. The generated Quantum run file includes the necessary statement to the qdi file which is
then referred to for information during the Quantum run. You may need to manually adjust the
generated specifications, but in general, this automatic creation can save you a great deal of time
(especially for new spec writers) and reduces the likelihood of errors.
The main body of the Quantum specification is generated using the qdiaxes program. This program
reads the qdi file and creates the following files:
A run file containing a struct statement, a *usemap statement for the specified qdi file, *include
statements for the tab and axes, and a dummy breakdown axis.
A table specification file containing a tab statement line for each data item in the qdi file.
The Quancept utility, qditum, can also generate a basic Quantum specification from a qdi file.
However the specification that it creates does not use the data-mapping feature.
To create a Quantum spec file from an existing Quancept qdi file, type:
qdiaxes [a][t n] input_qdi_file output_filename
where:
a
This option causes qdiaxes to remove all text strings of the form < ... > from
question texts. This is useful for Quancept Web projects where the text may
contain embedded HTML directives.
Note that text-formatting codes resulting from a Quancept CAPI script are
always removed since they are meaningless to Quantum.
t n
input_qdi_file
The name of the qdi file with or without the qdi suffix.
output_filename
The base name for the output files. qdiaxes appends the relevant suffix to each
Quantum output file.
reads the qdi input file holidays.qdi and generates the corresponding Quantum files (that is,
holidays.run, holidays.tab and holidays.axs). Since the t parameter is not specified, the response
texts (written to the holidays.axs file) are truncated to 12 characters.
These formatting codes serve no purpose in the generated Quantum texts and so are always
removed by the qdiaxes program.
Quancept Web: In addition to the formatting control provided with Quancept CAPI, the Quancept
Web product allows the scriptwriter to embed HTML directives into texts. Such directives are
usually enclosed in angle brackets and can optionally be removed from the qdiaxes output by using
the a option.
If the t option was set to a number greater than the longest response text, say 50, then the
conditions generated would read:
c=Q1
c=Q1
c=Q1
c=Q1
c=Q1
&
&
&
&
&
If, however, the text was truncated to the minimum number of unique characters, in this case 3, then
the conditions would read:
c=Q1 & $I l\$
c=Q1 & $I t\$
c=Q1 & $I h\$
c=Q1 & $The\$
c=Q1 & $Not\$
The \ character informs Quantum to ignore the remaining characters in the string.
But, by applying the default truncation length of 12, they would then read:
c=Q1 & $I liked the \$
c=Q1 & $I thought th\$
c=Q1 & $I have no re\$
c=Q1 & $The product \$
c=Q1 & $Nothing at a\$
You may prefer the texts to be shorter or longer. Either way, the t option on the command line can
accommodate your preference.
Note that response texts are never reduced below a minimum threshold; that is, either the limit
set by the t option, or a default of 12.
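The truncation of response texts can be modelled in Python as follows. This is an illustrative sketch of the behavior described above, not the qdiaxes program itself; the function names are hypothetical. Leaving a text unchanged when it is no longer than the limit is an assumption.

```python
DEFAULT_LENGTH = 12  # default truncation length described above


def truncate_text(text, length=DEFAULT_LENGTH):
    """Cut a response text to the given length and append a backslash.

    Assumption: a text no longer than the limit is left as it is.
    """
    if len(text) <= length:
        return text
    return text[:length] + "\\"


def condition(item, text, length=DEFAULT_LENGTH):
    """Build a condition of the form c=item & $text$."""
    return f"c={item} & ${truncate_text(text, length)}$"


full_text = "I liked the first brand much more than the second"
print(condition("opinions", full_text))
print(condition("opinions", full_text, length=50))
```

With the default length of 12 the first condition ends in $I liked the \$, which matches the form of the truncated conditions shown earlier.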
The *usemap statement instructs Quantum to refer to the qdi file for all the relevant variable
information. The two *include statements tell Quantum to read in the contents of the generated tab
and axis files.
The last two statements form the dummy breakdown by which all axes are tabbed. This enables top
lines to be produced immediately.
These statements act as a template for your specification; you can add to, delete and amend the
statements accordingly.
These can easily be stream-edited to use any standard analysis breakdown which you may wish to
define. Data items with multiple iterations produce grid axes which are tabbed as usual, that is:
tab item_name GRID
Where item_name is the name of a data item in the qdi file and XXXXXX or GRID is a dummy name
against which to tabulate the first axis.
You will, no doubt, need to make changes to these statements in the axes file.
Following this specification, a side statement is generated. (side is used to separate the
column definitions from the row definitions.)
Categoric items will produce an n01 element for each category as follows:
n01response_text;c=item_name & $truncated_text$
Any category that has a unique ID associated with it will also have the appropriate
Quantum uniqid= keyword generated. For example:
n01response_text;c=item_name & $truncated_text$;uniqid=unique_id
Under DOS, the version to use is set at the time the software is installed. For information on how to
switch between versions, see your installation instructions.
Each release of Quantum comes with two command files:
quantum
This version silently deletes all temporary files created during a run unless you
include the option k on the command line.
quantumx
In the examples of commands in the rest of this section, we will use the word quantum to mean
quantum or quantumx.
At installations where automatic deletion of temporary files is not desirable, you may find that the
administrator has renamed the files so that quantumx is called quantum, and vice versa. You should
check this before you run your first job.
For further details on file deletion, see section 3.1, Tidying up after a Quantum run, in the
Quantum User's Guide Volume 4.
Run only one section of the job such as the compilation stage or the table creation stage.
Define a run ID when you want to do more than one run in the directory.
Define the names of directories in which Quantum should look for program and data files or
create intermediate files.
Convert the Quantum program and data files into a Quanvert database.
They are:
-c
-id
-lo
-pd
Name the directory in which Quantum should look for program and data files.
Return the duration time for various sections of the run (Unix).
-td
Name the directory in which Quantum should create its intermediate files.
The option to create a Quanvert database is only available if the Quanvert Database
Administration software is installed.
For further information about creating a Quanvert database, see chapter 7, Creating and
maintaining Quanvert databases, in the Quantum User's Guide Volume 4.
For further information about dummy data files see Reading non-standard data files in
chapter 10, Include and substitution, in the Quantum User's Guide Volume 2.
The first step in any Quantum run is to check the syntax of your Quantum specification and to
convert it into C code. We call this compilation. You can run the compilation stage by itself by
typing the quantum command with the -c option:
quantum -c [program_file]
The compilation creates many files, the most important of which are:
out1
The program file listing made as the program is checked. If errors are found,
Quantum marks them in this file.
colmap
A listing of all the columns and codes referred to by all non-ignored axes.
For further information on the contents of these files, see chapter 2, Files created by
Quantum, in the Quantum User's Guide Volume 4.
After a successful compilation, Quantum converts the C code created by the Quantum compile into
a program and, if there are no problems, reads the data. We call this program the datapass program.
You can run this stage as a separate task on Unix systems by typing:
quantum -lo data_file
or:
quantum -ld data_file
If you are working on DOS, type:
quantum -l data_file
This stage also creates a number of files, most of which are normally deleted at the end of the run.
The files you need to know about are:
qtm_ex_
dirty.q
hct_
Holecount output
lst_
out2
punchout.q
sum_
The weighting program, weight, weights records according to the figures given in your Quantum
program file. If the run has no weighting, the weighting program is ignored.
The accumulation program, accum, builds a file containing the cell values for each table.
If your job uses row or table manipulation, Quantum runs a program called manip. This carries out
your manipulation requests and creates a second file of cell values. Note that this file contains
values for all tables whether or not they are the result of manipulation.
You cannot run the weighting, accumulation or manipulation stages in any way except as part of a
complete Quantum run.
Quantum creates the following files, amongst others, during these stages:
weightrp
nums
nums.man
The final step in most runs is to take the cell values and use them to create tables. Quantum reads
the page and table headings and positions them as requested. If tables are to be sorted, added or
placed side by side, the relevant figures are rearranged or combined.
To change the table layout without changing the cell counts (for example, to print more decimal
places for percentages, or to use special characters for absolute zero or rounding) you may rerun
just the compilation and output stages using the command:
quantum -o [program_file]
Files created during this phase which you should know about are:
out3
tab_
Tables
If you want to rerun a single table only, you may run the Quantum output program by name rather
than via the Quantum shell script. Type:
qout -o tab_file -t table_num
where tab_file is the name of the file to which the table will be written and table_num is the number
of the table you wish to reprint. For example, to rerun table 10 and save it in the file tab_10 you
would type:
qout -o tab_10 -t 10
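If you need to rerun several tables one at a time, the same command can be built in a script. The Python sketch below only assembles the argument list in the form shown above; the function name is hypothetical, and naming each output file tab_ followed by the table number is simply the convention of the example, not a requirement.

```python
import shlex


def qout_command(tab_file, table_num):
    """Return the argument list for rerunning one table with qout."""
    return ["qout", "-o", str(tab_file), "-t", str(table_num)]


# Build (but do not run) one command per table to be reprinted.
for table_num in (10, 11, 12):
    print(shlex.join(qout_command(f"tab_{table_num}", table_num)))
```

Each printed line could then be passed to the shell, or the argument list handed to subprocess.run, on a system where qout is installed.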
The notes in this section do not apply to DOS Quantum since these facilities are not available on
that platform.
Quantum normally runs interactively. With large jobs, this can lock up your terminal for a
considerable time, so you may wish to use facilities provided with your operating system to run
your jobs in the background. This then frees up your terminal for other uses.
When you run jobs in the background, they still write messages to your screen unless you redirect
them to a log file. Quantum provides for this on the systems which need it with the -l option. This
writes any messages which would normally appear on your screen into a file called log instead. You
use it on the quantum command in addition to any other options required for the job. For example,
to run a complete job in the background under Unix, you might type:
quantum -l run1 data &
On some systems your system manager may prefer you to run large jobs via the batch system.
You may run more than one job in a directory without overwriting existing files by assigning a
unique suffix to each run. All files created during this run will have names which end with a dot
and the given string. For example:
quantum -id abc run1 data
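As a small illustration of the naming rule just described (a dot followed by the run ID is added to the end of each file name), the Python sketch below shows the names you would expect for a few of the files mentioned in this chapter. The helper name is hypothetical, and the file names are only examples.

```python
def with_run_id(file_name, run_id):
    """Return the name a file would have when created under the given run ID."""
    return f"{file_name}.{run_id}"


# Names expected after: quantum -id abc run1 data
for name in ("out1", "out2", "out3"):
    print(with_run_id(name, "abc"))
```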
Quantum can create its temporary work files in a directory other than that in which the job is
running. The directory is named using the option -td on the command line:
quantum -td temp run1 data
This example tells Quantum to create temporary files in a subdirectory called temp in the project
directory.
Creating temporary files in a different directory is one way of improving the performance of large
jobs running under DOS. When the number of files associated with a job rises above 500, you'll find
that the job runs more quickly if the temporary files are created in a different directory. You'll also
find it more convenient to scan directories' contents when the number of files in each one is
reduced.
Running Quantum under Unix and DOS Chapter 16 / 231
You may also find that using -td when creating a Quanvert database helps to keep the project
directory clean of unwanted files. It is also useful if you need to do multiple Quantum runs to create
the database. As long as you use a different temporary directory for each run, you can then combine
the directories with qvmerge to create the Quanvert database.
The option to create a Quanvert database is only available if the Quanvert Database
Administration software is installed.
For further information about creating a Quanvert database, see chapter 7, Creating and
maintaining Quanvert databases, in the Quantum User's Guide Volume 4.
Quantum normally reads its program, data and include files from the directory in which you are
running the program, and creates permanent output files such as print or report files in that
directory. If you want to use a different directory, define it on the command line with the option
-pd. An example using Unix pathname notation is:
quantum -pd /usr/barbara/qjobs run1 data
The exceptions are filedef and include with absolute pathnames. In these cases Quantum uses the
directory named in the pathname.
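The options described in this chapter can be assembled programmatically when jobs are launched from a script. The Python sketch below is one possible way of doing this, under stated assumptions: the function and parameter names are hypothetical, every option is written with a leading hyphen as in the -id and -pd examples, and the order in which several options are combined is a guess, since the guide shows each option only on its own. The log flag follows the Unix usage of -l described earlier.

```python
import shlex


def quantum_command(program_file, data_file, run_id=None, temp_dir=None,
                    perm_dir=None, log=False):
    """Assemble a quantum command line from the options in this chapter."""
    args = ["quantum"]
    if log:
        args.append("-l")            # write screen messages to the file log
    if run_id is not None:
        args += ["-id", run_id]      # suffix for files created by this run
    if temp_dir is not None:
        args += ["-td", temp_dir]    # directory for temporary work files
    if perm_dir is not None:
        args += ["-pd", perm_dir]    # directory for program and data files
    args += [program_file, data_file]
    return args


print(shlex.join(quantum_command("run1", "data", run_id="abc")))
print(shlex.join(quantum_command("run1", "data", perm_dir="/usr/barbara/qjobs")))
```

With a single option set, the output matches the corresponding example command given in the text.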
Index
This index covers all four volumes of the Quantum User's Guide. The page references consist of the volume
number followed by the page number; for example 2-6 is page 6 of Volume 2, 3-166 is page 166 of Volume 3,
and so on.
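If you process this index by program, the reference format described above can be split into its two parts. The Python sketch below is an illustration only; the function names are hypothetical, and a reference that lacks the volume prefix is deliberately rejected rather than guessed at.

```python
import re

# A reference is a volume number, a hyphen, then a page number, e.g. 3-166.
_REF = re.compile(r"^(\d+)-(\d+)$")


def parse_page_ref(ref):
    """Split one reference such as '3-166' into (volume, page)."""
    match = _REF.match(ref.strip())
    if match is None:
        raise ValueError(f"not a volume-page reference: {ref!r}")
    return int(match.group(1)), int(match.group(2))


def parse_entry_refs(refs):
    """Parse a comma-separated list of references from one index entry."""
    return [parse_page_ref(part) for part in refs.split(",")]


print(parse_page_ref("2-6"))
print(parse_entry_refs("2-18, 2-117"))
```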
A
a, global tabulation parameters 2-8
in tabcon format file 3-190
options on 2-9
Absolutes
decimal places with 2-113
position of percentages relative to 2-18, 2-117
print character before/after 2-41, 2-114
print characters next to 2-81
requesting in tables 2-16
side by side with percentages 2-17
suppress small 2-21, 2-117
ac, accept codes in online edit 1-165
Access rights on files in Quanvert Text 4-81
Accum error messages 297, 4-163
Accum program 1-229
acr100, 100% on base row 2-10, 2-32
Action codes with require 1-145, 1-146
ad, create cards in online edit 1-166
add, add tables 2-182
dummy elements with 2-186
example of 2-184
options with 2-186
Quanvert 4-71
sample program for 2-183
with offsets 2-183
Adding table 2-182
Addition 1-26
Aided and unaided awareness, example of 2-230
Alias file for qvpack/qvtrans 4-128
Aliases for Quantum statements 4-6
allread, cards read for current respondent 1-50
with write 1-66
alp files 4-94
Alpha variables, for Quanvert 4-73, 4-74
Alphanumeric card types 1-58
alter, texts in Quanvert Text 4-83
Analysis levels see Levels
Analysis of variance
Friedman's two-way 3-85
one-way 3-110
example 3-110
formula 3-119
.and., logical comparison (both) 1-39
and, axes for additional tables 2-178
with flt 2-218
and, logical operator for assignment 1-100
Index / 233
Axes
analysis level 2-40, 3-53
bases in 2-56, 2-113
blank lines in 2-58
column and code map for 1-226
column width 2-41, 2-113
column, nested subheadings in 2-63
creation of, in Quanvert (Windows) 4-96
creation of, in Quanvert Text 4-83, 4-96
declare weighting in 3-14
defining for Quanvert 4-68, 4-69
double spacing in 2-41
elements per create in Quanvert Text 4-83
flag as single coded 2-44
generated from qdi file 1-221
grids 2-238
introduction to 2-39
long element texts in 2-66
maximum characters per axis 4-9
maximum per run 4-9
mutually exclusive elements 3-72
naming 2-39, 4-15
naming of files 4-15, 4-95
no double spacing in 2-45
no sorting 2-45
on tab statements 2-171
reflip incorrect 4-98
require single coding 2-40
reset flags between trailer cards 2-41
restrict access to, in Quanvert Text 4-84
sorting 2-44
special characters for laser printing 3-199
subaxes within 2-76
subheadings 2-41
axes.inf files 4-94
axes=, maximum number of axes per run 4-9
Axis names, table titles from 2-10
Axis subgroups 2-76
Axis-level statistics, list of 3-68
axreq=, axis coding requirements 2-25, 2-40
axtt, table titles using axis names 2-10, 2-32
B
b, breakdown element for manipulation 3-41
baft, print base titles last 2-10
Banners see Breakdowns
Base
creating 2-56, 2-113
effective 2-119, 2-153, 3-147
enclose in parentheses 2-81
flag cells with small for stats 2-20
force export to SAS or SPSS 2-114
minimum effective for T statistics 2-29, 3-150
percentage against redefined 2-16
print base title last 2-10
Base (continued)
redefining 2-103
required for statistics 3-71
small for special T statistics 2-20, 3-150
sort on element other than 3-126
suppress elements with small 2-21
suppress percentages with small 2-196
suppress statistics with small 2-196
suppress tables with small 2-21
use to define segments in an axis 3-69
base, base element 2-113
binasc.dat, octal punch codes for ASCII character set 4-171
bineas.dat, octal punch codes for extended ASCII character set 4-171
bintab, convert extended ASCII character set 4-174
bintab.qt, characters in extended ASCII character set 4-171
bit arguments per fld statement 2-267, 4-133
bit files 4-94
bit, elements with numeric codes 2-97
inc= with 2-99
when better than fld 2-99
Blank lines
after column headings 2-14, 2-162
before column headings 2-14, 2-162
in tables 2-58
Blanks
allowing in arithmetic tests 1-39
with col 2-84
bot, titles at bottom of page 2-210
Quanvert 4-71
with flt 2-218
with hitch/squeeze 2-191
boxe, end of box 3-207
Boxes in tables 3-206
boxg, box above G texts 3-207
boxl, draw line inside box 3-207
boxs, start of box 3-207
Brackets, print multicodes in 1-80
Break points, define in element texts 2-163, 3-199
Breakdowns, example of 2-167
btx files 4-94
byrows, export grids row-by-row in Quanvert 2-40, 2-249
C
#c, start C code 1-183, 3-123
C array
columns 1-18
defining size of 1-198
increasing 1-196
C code in Quantum spec 3-123
C compiler error messages 296, 4-162
C library functions, calling 1-192
Chi-squared test
one dimensional 3-73
example of 3-74
formula 3-89
single classification 3-78
example of 3-80
formula 3-90
two dimensional 3-76
example of 3-77
formula 3-89
Clean data file 1-228, 4-16
clean.q, clean data file 1-228, 4-16
clear, reset variables to initial state 1-111
advantages over assignment 1-111
clear=, reset axis cells 2-41, 3-64
clevel
confidence level for special T stats 2-26, 3-156
test for significance with chi-squared test 3-78
Codes
/ with 1-15
adding into columns 1-102
checking exclusive 1-150
checking number in column 1-154
checking type of 1-146
checking with require 1-144, 1-148
comparing 1-31
copying 1-90
counting, in columns 1-28
deleting 1-103
entering 1-14
list of 1-13
replacing 1-92
set random into columns 1-107
symbolic parameters for 2-232
Coding, defining axis requirements 2-25
Coding, summarizing for axes 2-25
col, basic count elements 2-83
blanks with 2-84
conditions 2-86
semicolons in text 2-85
text-only elements 2-88
col, column element 2-115, 2-140
colmap, column/code map for axes 1-226, 4-16
colrep, check column and code usage 4-27
coltxt, print text in main body of table 2-61
Column and code map for axes 1-226, 4-16
Column and code usage, check 4-27
Column headings 2-159
blank lines after 2-14, 2-162
blank lines before 2-14, 2-162
defining for Quanvert 4-71
in laser printed tables 3-199
line titles up with start of 2-204
splitting long texts 2-163
suppress with squeeze=2 2-193
text differs from row text 2-118
underlining 3-203
using colwid= 2-164
Conditions (continued)
types of 2-48
with c= 2-26, 2-46
with col statements 2-86
Confidence level for special T stats 3-156
Constants
comparing 1-31
individual 1-13
strings 1-15
Continuation
elements in sorted tables 3-137
long element texts 2-66
long statements 1-9
continue, read next statement 1-119
Continuity correction for t-test 3-161
Copying weights into the data 3-24
Correcting data
forced edits 1-159
methods of 1-159
online 1-160
split 1-161
write 1-161
Corrections file 1-170, 4-4
corrfile, corrections file 4-4
count, create a holecount 1-135
crd=, card type location 1-55, 3-47, 4-2
Create new data files
split 1-167
write 1-69
Creating a table of contents 3-189
Creating new cards 1-70
Cross-referencing in panel studies 4-73
csort, sort columns 2-10, 3-127
Cumulative output summary file 4-22
Cumulative percentages 2-16
example of 2-34
Currency symbols, print next to absolutes 2-81
Customized text file, define 4-8
C-variables 1-18
D
d, delete codes in online edit 1-163
Data
automatic filtering of in Quanvert Text 4-84
C array 1-18
checking and verifying 1-4
compressed, reading 1-225
convert to Quanvert database 4-93
convert to SAS format 4-56, 4-65
convert to SPSS format 4-38, 4-44
converting multicoded to single coded 1-181
correcting 1-159
counting responses with numeric codes 1-108
define structure in levels file 3-47
merging cards from different files 1-59
Data (continued)
merging fields from an external file 1-61
merging files 4-4
non-standard format 1-63, 1-225, 2-250
output file for require 4-18
overlapping, with special T stats 2-30, 3-159
Quantum format 4-167
reading into C array 1-48
types of 1-47
write out fixed length records 1-69, 1-73
write out in user-defined format 1-84
Data files
#include with 2-227
define T variables in 1-113
non-standard 1-63, 1-225, 2-250
Databases
access Unix with PC-NFS 4-130
add variables to 4-99
convert unpacked files 4-130
copy packed 4-125
create 4-93
do not compress 4-124
files 4-94
icon 4-90
join split for unpacking 4-127
levels 4-72, 4-73
link similar 4-101
make secure 4-116
maximum size of packed file 4-124
new format 4-67
old format 4-67
pack and split 4-124, 4-129
Quanvert (Windows) 4-86
security level 4-117
split large packed 4-127
store variables in subdirectories 4-80
transfer format 4-125
transfer programs for 4-125
unknown file formats 4-128
unpack 4-126
weighted 4-71
see also Quanvert, Quanvert Text, Quanvert (Windows), Multiproject databases
Data-mapped variables 1-201
assigning values to 1-207
defining 1-203
testing values of 1-211
using in analysis specifications 1-213
Data-mapping files 1-201, 1-203
Datapass error messages 297, 4-163
Datapass error summary file 4-18
Datapass program 1-227
date, print date on table 2-10, 2-32
db.ico file for Quanvert (Windows) 4-90
db.nts file for Quanvert (Windows) 4-90
db.ptf, translation file 2-176, 4-23, 4-77
dbhelp.msg file for Quanvert (Windows) 4-90
debug, intermediate figures for special T stats 3-157
E
e, insert codes in online edit 1-163, 3-124
#ed, start edit in tab section 3-124
ed, re-edit current record online 1-166
Examples
aided and unaided awareness 2-230
anlev= 3-53
brand awareness questions 2-52
breakdown 2-167
c=+ 2-134
chi-squared test 3-74
column percentages 2-57
cumulative percentages 2-34
data-mapped variables 1-201, 1-213, 1-215
div 2-187
editing with levels 3-51
Friedman's test 3-87
grids 2-239, 2-240, 2-241
hitch/squeeze 2-191
indices 2-35
Kolmogorov-Smirnov test 3-82
manipulation 3-40, 3-42
maxim and minim 2-37
McNemar's test for differences 3-84
multidimensional tables 2-172
Newman-Keuls test 3-113
one sample T-test 3-102
one sample Z-test 3-94
one-way analysis of variance 3-110
paired T-test 3-102
percentaging against redefined base 2-103
percentaging with nets 2-73
process 1-130, 2-100
product tests 2-247
smbase= 2-199
subtotals 2-136
suppress percents with small bases 2-199
symbolic parameters 2-229, 2-232
table of means 2-36
table with inc= 2-136
total percentages 2-33
total rows in tables 2-121
totals 2-136
Exclude respondents from weighting 3-6
exp, exponentiation manipulation operator 3-27
explode, convert multicoded data to single coded 1-181
export, export element to SAS or SPSS 2-114
Exporting data, suppressing elements 2-115
exportmp, force an axis to be multicoded when exporting to SPSS 2-41, 4-50
Expressions
arithmetic 1-25
combining arithmetic 1-26
combining logical 1-39
comparing data variables 1-31
comparing values 1-30
logical 1-30
manipulation 3-26
mixed mode arithmetic in 1-27
mixing logical operators 1-41
numb 1-28
Expressions (continued)
random 1-29
range 1-38
with table manipulation 3-34
Extended ASCII character set
defining 4-169
laser printed tables 3-212
octal punch code file 4-171
External data file, merge a field from 1-61
External variables 1-199
with subroutines 1-186
F
F and T values with nft 3-108
formula 3-117
fac=, factors for statistics 2-119, 2-138
in same axis as inc= 2-139
on col and val 2-120
on row elements, for T-test 3-101
percentiles 2-144, 2-145
with stat= 3-93
Factor weighting 3-2, 3-7
factor, factor weighting 3-7
Factors
decrementing by a constant 2-120
defining 2-119
incrementing by a constant 2-120
on col and val 2-120
percentiles from 2-144, 2-145, 2-146, 2-148
reverse sequential order for percentiles 2-146
scaling 2-117
switching off 2-120
failed_, action when require fails 1-156
fen, font encoding files 3-212
fetch, load data from a look-up file 1-178
fetchx, load data from a look-up file 1-180
field, count numeric codes across fields 1-108
fieldadd, count numeric codes 1-111
Fields
checking codes in 1-37
comparing 1-35
copying codes into 1-91
merging from an external file 1-61
referring to 1-18
figbracket, print characters around absolutes 2-41, 2-81, 2-114
figchar=, character to print next to absolutes 2-41, 2-81, 2-114
figpost, print character after absolutes 2-41, 2-81, 2-114
figpre, print character before absolutes 2-41, 2-81, 2-114
File formats, unknown for databases 4-128
filedef, define output file type 1-78
override ruler printing with ident 1-83
Files
aliases 4-6
alp 4-94
ax, axis information files 4-94
axes.inf 4-94
binasc.dat 4-171
bineas.dat 4-171
bintab.qt 4-171
bit 4-94
btx 4-94
C subroutine code 4-11
cell counts 3-38, 4-22
clean data 1-167, 4-16
column and code map 1-226, 4-16
comm.qsp 4-44
commands 4-65
commands.qsp 4-51
compilation listing 1-226, 4-13
compiled C subroutine code 4-22
compiled subroutines 1-183
compressed data 1-225
corrections 1-170, 4-4
created at compilation stage 1-226
created by flip 4-94
cumulative output summary 1-230, 4-22
customized table texts 4-7
data merge file 4-4
data.qsp 4-44, 4-51
data-mapping 1-201, 1-203
datapass error summary 4-18
default options 2-32, 4-3
deletion of temporary 1-223
descrips.inf 4-24, 4-94
dirty data 1-167, 4-16
fen 3-212
fli, inverted data files 4-94
flip.cnf 4-78
format file for table of contents 3-194
frequency distribution 4-17
generated by qdiaxes 1-220
graphics output 4-22
holecount 4-17
inc 4-94
intermediate figures for special T stats 3-157
levels 3-45, 4-2, 4-94
log 1-230
machine.def 4-128
manipulated cell counts 3-38
merges 4-4
merging data from different files 1-59
mul 4-75, 4-94
nums 1-229
nums.man 1-229
output data from require 4-18
PostScript 3-198
private.c 4-11
private.o 4-22
ptf, translation file 2-176, 4-23, 4-77
Files (continued)
qdi 1-201, 1-217
Quanvert
levels cross-reference 4-95
numdir.qv 4-80
required for 4-96
tstatdebug 4-76
Quanvert (Windows) 4-86
db.ico 4-90
db.nts 4-90
dbhlp.msg 4-90
qextras file 4-91
qnaire.txt 4-91
sound files 4-74
stats.ini 4-86
Quanvert Text 4-81
access rights 4-81
availang 4-84
foreign language prompts 4-81
mfwaves 4-111
profopts 4-82, 4-85
qotext.dat 4-82
qvtext.dat 4-82
users 4-83
records written by write/require 1-145, 4-17
rim weighting parameters 4-5
run definitions 3-38, 4-3
statdata 4-65
subroutine source 1-183
table of contents format 3-190
tables 1-230, 4-22
texts.qt 4-8
user-defined limits 4-9
variables 1-196, 4-1
weighting report 1-229, 4-19
Filtered holecounts 1-136
Filters
canceling 2-219
groups of tables 2-217
in grid tables 2-247
n00 in axis 2-104
named 2-11, 2-220
nested sections 2-221
on per-user basis in Quanvert Text 4-84
Quanvert 4-71
sample program 2-219
firstread, first card in record read 1-51, 3-64
Fixed length records, writing out 1-69, 1-73
fld, elements with numeric codes 2-94
bit argument limit 2-267, 4-133
options on 2-112
when to use bit instead 2-99
fli files 4-94
Flip, create Quanvert database 4-68, 4-93
configuration file 4-78
files created by 4-94
reasons axes excluded 4-69
remove files used by 4-97
Formulae (continued)
on column proportions 3-179
one sample 3-117
paired 3-117
two sample 3-117
Z-test
one sample 3-115
overlapping samples 3-116
subsample proportions 3-116
two sample on proportions 3-115
Frequency distribution file 4-17
Frequency distributions 1-138
alphabetic 1-139
double quotes in headings 1-140
missing values in 1-139
multiplied 1-142
ranked 1-139
weighted 1-142
friedman, two-way analysis of variance 3-85
Friedman's test 3-85
example of 3-87
formula 3-91
F-test see Analysis of variance, one-way, ANOVA
Functions, C library 1-192
G
g, layout column headings 2-165
combining groups of 2-166
in laser printed tables 3-199
sid statements with 2-181
spacing with 2-166
.ge., greater than or equal to 1-30
Generate Quantum spec from qdi file 1-217
go to, routing in edit section 1-118
graph=, create graphics input files 2-13, 2-32
files created by 4-22
Grid axes see Grids
Grid tables see Grids
grid, identify a grid table 2-244
Grids
#def with 2-243
components of 2-238
creating tables 2-244
data-mapped variables 1-215
example of 2-241
code symbolic parameters 2-240
column and code symbolic parameters 2-240
column symbolic parameters 2-239
export to SAS/SPSS from Quanvert 2-40, 2-249
filtered columns in 2-247
in levels jobs 2-245
increments in 2-243
inctext= invalid with 2-123
Grids (continued)
recognizing 2-238
rotated, op= with 2-245
weighted 2-246
group=, axis group for element 2-79, 2-114
groupbeg, start of subaxis 2-77
groupend, end of subaxis 2-77
Groups in Quanvert (Windows) 4-68
.gt., greater than 1-30
H
Harvard Graphics 4-32
hct_, holecount file 1-228, 4-17
hd=, axis subheading 2-41
hdlev=, nested subheadings for column axes 2-63
hdpos=, position of subheadings above columns 2-65
header=, header length in non-std data file 2-250
heap=, maximum number of characters per axis 4-9
Hierarchical data
process with 3-63
processing with clear= 3-64
processing with levels 3-45
see also Levels
Highest card type 1-57, 1-198, 3-47, 4-2
hitch=, print table on same page as previous table 2-13, 2-188
how Quantum compares table texts 2-194
numbering printed pages 2-19
paper saving mode 2-191
paste one table under another 2-195
print page numbers logically/physically 2-196
short tables with 2-190
table texts with 2-191
hold=, rows to reprint at top of continued tables 2-109, 2-114
Holecount file 4-17
Holecounts 1-133
basic 1-135
double quotes in headings 1-135
filtered 1-136
multiplied 1-136
weighted 1-136
hug=, space required at bottom of page 2-108
I
Icons for Quanvert (Windows) 4-90
ID text, and data-mapped variables 1-209
id, multiple runs in a directory 1-231
id=, manipulation id 2-115, 2-174, 3-36, 3-41
on n/col/val/fld/bit 3-28
J
Jobs
check whether sufficient disk space to run 4-177
compile only 1-226
complete run 1-224
create log file 1-230
create Quanvert database 4-93
creating tables 1-229
deletion of temporary files 1-223
load C code 1-227
modifying for Quanvert 4-68
multiple runs in a directory 1-231
read and process data 1-228
rerun compilation & output stages only 1-229
run in background 1-230
speeding up 1-45
stages in 1-223
temporary space for 4-178
Join split databases 4-125
Jumping to tab section 1-126
Justification
column headings in laser printed tables 3-199
row text in laser printed tables 3-203
K
keep, percentage differences 2-123, 2-126, 2-175
Kolmogorov-Smirnov test 3-81
formulae 3-90
ks, Kolmogorov-Smirnov test 3-81
L
l, name an axis 2-40
Labels 1-4
with do 1-119
with go to 1-118
lang=, specify the language 2-13, 2-176, 4-77
Languages
Quanvert (Windows) 4-77
Quanvert Text 4-81, 4-84
SAS 4-56, 4-64
specify 2-13, 2-176
SPSS 4-38, 4-44, 4-54
tables 2-176
Large numbers, printing 2-27
Laser printed tables
fonts for 2-11
justification of column headings 3-199
justification of row text 3-203
personalized PostScript code 3-213
printing extended ASCII characters in 3-212
special characters with 3-201
suppressing border 3-206
lastread, last card in record read 1-51, 3-65
lastrec, last record in file read 1-52
.le., less than or equal to 1-30
Least significant difference test 3-175
formula 3-182
len=, change the record length 1-78
levbase, increment base at anlev=level 2-124, 3-57
level, edit for a specific level 3-50
Levels
analysis level for tables 2-10
cross-reference files for Quanvert 4-94, 4-95
cross-tabulating axes at different levels 3-53
define data structure in level file 3-47
defining in levels file 3-45, 4-2
defining with struct 3-48
example of edit 3-51
grids with 2-245
how tables are produced 3-51
inc= on flt statements 2-218
introduction to 3-45
levels file 3-45
maximum allowed 3-45
maximum cards per record 4-2
maximum sub-records per record 3-48
naming in edit section 3-50
numeric variables 3-59
preparing for Quanvert 4-72, 4-73
process with 3-63
record length 3-48
special T statistics 3-62, 3-149, 4-88
statistics with 3-61
updating bases in uplev= tables 2-124, 3-57
updating cells with anlev= 3-53
updating tables at higher level than axes 3-54
weighting 3-12
M
m, create a manipulated row 3-25
define manipulation expression 3-26
options on 3-25
machine.def, qvpack/qvtrans alias file 4-128
Means (continued)
table of means 2-28, 2-36
test difference between 3-110, 3-112, 3-165
test for specific values 3-101
test paired differences between 3-101
t-test on column 3-164
two sample T-test for comparing 3-105
with fac= 2-140
with inc= 2-142
median, median values of inc= 2-124
Medians, see percentiles
medint=, interpolation method for percentiles 2-28, 2-151, 2-152
mergedata, merge data from an external file 1-61
merges file 1-59, 4-4
mflip program 4-107
mfwaves file 4-111
min, minimum manipulation operator 3-26
minbase=, very small base for T stats 2-29, 3-150, 3-151
minim, minimum values of inc= 2-29, 2-124
example of use 2-37
Minimum weight, defining 3-8, 3-18
minwt=, minimum weight 3-8
Missing values
assignments 1-173
checking for 1-175
counting with val 2-94
exporting as missing_ 2-124
in arithmetic expressions 1-173
in frequency distribution 1-139
processing in the edit 1-172
Quanvert 4-74
switch on/off in edit 1-172
switch processing on/off in tab section 2-29, 2-32
treat other values as 2-30, 2-42, 2-124
when found 1-172
with inc= 2-122
with n25;inc= 2-142
with pre/postweights 3-8
missing=, treat other values as missing 2-30, 2-42, 2-124
missing_, missing values 1-173, 1-174
missingincs, switch missing values processing on/off 1-172, 2-29, 2-32
with if 1-173
missingval, export missing data as missing_ 2-124
mul files 4-75, 4-94
Multicard records
definition of 1-47
more than 100 columns per card 1-63
reading 1-49
writing 1-66
Multicodes
convert to single codes 1-181
entering 1-14
printing 1-80
N
n statements, options on 2-112
n00, filtering within an axis 2-104
example of use 2-241
with n04 and n05 2-134
with redefined base 2-103
n01, basic counts 2-50
percentiles with inc= 2-144, 2-149
n03, text only 2-58
n04, total 2-133, 2-134
example of 2-121
n05, subtotal 2-133, 2-134
n07, average 2-137
n09, start new page 2-108
n10, base 2-57
n11, base, non-printing 2-57
in Quanvert Text 4-85
n12, mean 2-140
analysis levels with 3-61
suppress if has small base 2-20, 2-196
with ANOVA 3-110
with T-tests 3-101
with two sample T-tests 3-105
n13, sum of factors 2-144
n15, basic counts, non-printing 2-56
n17, standard deviation 2-136
suppress if has small base 2-20, 2-196
with ANOVA 3-110
with T-tests 3-101
with two sample T-tests 3-105
n19, standard error of the mean 2-136
alternative formula for 2-143
calculate using weighted figures 2-31
suppress if has small base 2-20, 2-196
with ANOVA 3-110
with T-tests 3-101
with two sample T-tests 3-105
n20, error variance of the mean 2-136
suppress if has small base 2-20, 2-196
O
One dimensional chi-squared test 3-73
formula 3-89
One sample T-test 3-101
example 3-102, 3-103
formula 3-117
One sample Z-test 3-93
example 3-94
formula 3-115
One-way analysis of variance 3-110
example 3-110
formula 3-119
Online edit
accepting records 1-165
canceling 1-167
correcting data 1-163
creating new cards 1-166
delete codes from column 1-163
deleting cards 1-166
displaying columns 1-162
e 1-163
ed 1-166
insert codes in column 1-163
overwrite column 1-163
redefine command names 1-167, 4-7
re-edit current record 1-166
reject record in 1-165
rt 1-165
s 1-163
split 1-161
terminate for current record 1-165
write 1-161
online, interactive data correction 1-160
op=, output types 2-15, 2-117
A/B percentage differences 2-124, 2-126
order of printing with 2-17
separate tables for different output types 2-17
with rotated grid axes 2-245
Open ended responses 4-74
Options
defining run defaults 2-32
on a 2-8, 2-9
on add 2-186
on col 2-112
on div 2-187
on fld 2-112
on flt 2-9, 2-217
on l 2-40
on m 3-25
on n statements 2-112, 2-117
on sectbeg 2-9
on sid 2-180
on tab 2-9, 2-174
on und 2-180
on val 2-112
on wm 3-7
switching off 2-32
P
p, position cell counts 2-167
Packed databases 4-124
join split database 4-127
maximum size of 4-124
split file 4-127
unpack packed file 4-126
Packing databases 4-129
extra files for Quanvert (Windows) 4-91
<<pag>>, page numbers on tt statements 2-213
pag, page numbers 2-213
Page break
suppress between all tables 2-191
suppress between split wide tables 2-190
suppress between tables 2-190
Page length 2-18
Page numbers
switching off 2-213
user-defined, positioning with tt statements 2-213
with and 2-178
with hitch/squeeze 2-191
with multidimensional tables 2-174
Page width 2-18
set for Quanvert Text 4-85
suggestions for Quanvert 4-71
page, automatic page numbering 2-18, 2-32
Pages
center tables on 2-22
number of lines on 2-18
numbering 2-18, 2-213
print more than one table on 2-188
start new 2-108
suppress numbering 2-18, 2-32
width of 2-18, 4-71, 4-85
Pagination
automatic 2-105
order in split tables 2-107
precedence of rows & columns 2-19
paglen, page length 2-18
pagwid, page width 2-18
Quanvert Text 4-85
Paired preference test 3-170
formula 3-181
P-values for 3-173
Paired T-test 3-101
example 3-102, 3-103, 3-104
formula 3-117
Panel studies
cross-referencing levels in 4-73
flip individual waves 4-112
link waves in 4-113
weighting in 4-113
Paper saving output 2-191
Parentheses, with data variables 1-197
Partial column replacement 1-92
pc, print percent signs 2-18, 2-32
PC-NFS, access Unix databases with 4-130
pcpos=, position of percentages 2-18, 2-117
pcsort, sort on percentages 2-18, 3-128
pczerona, print NA for percents with zero bases 2-19
pd, directory for permanent files 1-232
Penetration tables
creating with celllev= 3-58
creating with clear= 3-65
Percentage differences 2-125
flag table for 2-175
order of op= options with 2-127
Percentages
100% on base row 2-10, 2-16
against redefined base 2-16
column 2-16
example of 2-57
suppress small 2-21, 2-117
cumulative 2-16, 2-34
decimal places 2-11, 2-113
forced rounding to 100% 2-19
nets 2-73
position relative to absolutes 2-117
print NA for percents with zero bases 2-19
print percent signs 2-18, 2-22
printing flush with absolutes 2-11
redefined bases, example of 2-103
row 2-16
suppress small 2-21
side by side with absolutes 2-17
sorting 2-18, 3-128
suppress if have small base 2-20, 2-196
suppress percent signs 2-18, 2-32
suppressing for a single row 2-118
total 2-15
example of 2-33
suppress small 2-21, 2-117
with sid and und 2-182
Percentiles
factors in reverse sequential order 2-146
from absolute values 2-30, 2-144, 2-149
from factors 2-144, 2-145, 2-146, 2-148
interpolation method 2-28, 2-146, 2-151
Permanent files, directory for 1-232
physpag, page numbering with hitch and squeeze 2-19, 2-32
Position of cell counts in tables 2-167
post=, postweighting 3-8, 3-16
inctext= invalid with 2-123
Postprocessors for Quanvert Text 4-82, 4-85
PostScript
personalized code for laser printed tables 3-213
printing tables with 3-198
special characters in axes 3-199
suppress table of contents 3-211
user-definable characters 3-201
#postscript, start PostScript code 3-213
Postweights 3-6, 3-16
Pounds signs in tables 3-198
ppt, paired preference test 3-170
pre=, preweighting 3-8, 3-16
inctext= invalid with 2-123
Precoded response, check for 1-206
Prevent access to unweighted data in Quanvert 4-116
Preweights 3-5, 3-16
Print files
define default output for 1-81
PostScript 3-198
turn off default parameters for 1-83
printed_, current record has been written out 1-67
Printing | and ! in element texts 3-202
Printing DNA and NA for missing values 4-74
Printing multicodes, output options 1-80
Printing records
ident 1-81
qfprnt 1-84
require 1-145
write 1-65
printz, print all-zero tables 2-19
priority, force single-coding 1-104
private.c, C subroutine code file 4-11
private.o, compiled C subroutine code file 4-22
process, tabulate record 1-129
effect on Quanvert databases 4-75
example of 1-130, 2-100
position in edit 1-131
Proportions (continued)
test of differences
between overlapping samples 3-99
between subsamples 3-97
t-test on column 3-160
two sample test of difference 3-95
pstab, create PostScript tables 3-198
ptf, translation file 2-176, 4-23, 4-77
Punch codes, ASCII equivalents 4-175
punch()=, symbolic parameters for codes 2-232
punchout.q, records written out by require 1-228, 4-18
pvals, print P-values for special T stats 3-159
P-values
Newman-Keuls test 3-165
paired preference test 3-173
significant net difference test 3-169
t-test on column means 3-164
t-test on column proportions 3-163
Q
q2cda, Quantum tables to CDA 2-82, 4-32
column headings 2-169
options with 4-35
qdi files 1-201, 1-217
qdiaxes, generate Quantum spec 1-217
qextras.lst file for Quanvert (Windows) 4-91
qfprnt, write out data in user-defined format 1-84
qnaire.txt file for Quanvert (Windows) 4-91
qotext.dat 4-82
qout, output program 1-230
qqhct, holecount file 4-17
qsj, split or join databases 4-125, 4-127
QTAXES, maximum number of axes per run 4-10
qteclean, delete files created by edit-only run 4-25
QTEDHEAP, to adjust edit statement complexity 4-10
QTELMS, max number of elements per axis 4-10
qtext, convert Quantum data to text format 4-167
QTFORM, define special characters for laser printing 3-201
QTHEAP, max number of characters per axis 4-10
QTHOME, Quantum home directory 1-223
QTINCHEAP, max number of characters for inc= variables 4-10
QTINCS, maximum different inc= per run 4-10
QTINLISTHEAP, adjust definelist complexity 4-10
qtlclean, delete temporary compilation files 4-25
QTLEXCHARS, max size of long text strings 4-10
qtm_ex_, datapass program 1-227
QTMANIPHEAP, max size of expressions 4-10
QTNAMEVARS, max num of named variables 4-10
QTNOPAGE, suppress blank page 4-23
QTNOWARN, suppress license expiry warning 4-11
qtoclean, delete files created by quantum -o 4-25
R
Random code, set into column 1-107
Random numbers, generating 1-29
random, generate random numbers 1-29
range, check arithmetic value of field 1-38
rangeb, test arithmetic value of field, with blanks 1-39
Ranges as conditions 2-92
Ranking see Sorting
Ranks in Friedman test 3-85
Raw counts in secure databases 2-45, 2-118, 4-116, 4-118
read=, how to read data 1-54
Real numbers 1-16
copying into columns 1-98
saving in integer variables 1-96
significant figures with 1-16
Real variables 1-21
defining in subroutines 1-189
reset to zero 1-111
Reals and integers in the same expression 1-27
rec_acc, number of records accepted 1-125
rec_count, number of records read so far 1-52
rec_rej, number of records rejected 1-125
reclen=, record length 1-54, 2-250, 4-2
Record length 1-54, 1-78
in levels data 3-48
in non-std data files 2-250
with levels 4-2
Record structure, defining 1-53
Record type, defining 1-53
Records
counting by axis name 2-25
distribute one element across the axis 2-129
examining with list 1-138
last in file, checking for 1-52
maximum cards in, in levels jobs 4-2
maximum sub-records per, in levels data 3-48
multicard with more than 100 cols per card 1-63
number read in so far 1-52
printing 1-145
rejecting from tables 1-145
types of 1-47
writing out parts of 1-68
Redefined base, percentaging against 2-16
Reformatting data 2-53
Refused, data-mapped variables 1-205
rej=, excluding elements from the base 2-125
reject, omit record from tables 1-124
with require 1-126
rejected_, current record has been rejected 1-125
Rejecting records from tables 1-124, 1-145
rep=, repeated card types 1-56
Repeated card types
defining 1-56
in unusual order 1-52
missing 1-52
S
s, assignment in online edit 1-163
s, side element for manipulation 3-41
Sample Quantum job 2-253
Sample tables
cumulative percentages 2-34
hitch/squeeze 2-191
inc= 2-136
indices 2-35
means 2-36
multidimensional tables 2-172
Statements
aliases for 4-6
continuation of 1-9
length of 1-4
statistical 2-136
Statistical elements, in sorted tables 3-141
Statistical statements, list of 2-136
Statistics
analysis levels with 3-61
exclude missing values from 2-142
F and T values 3-108
factors for 2-119
flag cells with small bases 2-20
general notes about 3-71
more than one per axis 3-69
more than one per table 3-71
Quanvert (Windows) 4-86
sorted summary tables of 3-142
spechar with 2-22, 2-137
squared weighting elements for 2-30, 2-49, 2-143, 3-147
summary table of requirements 3-72
table-level 2-31, 3-70
triangular array of 3-71
see also Special T statistics
stats.ini file for Quanvert (Windows) 4-86
stop, terminate the edit 1-127
stopped_, stop statement executed 1-127
Storing your program 1-3
Strings of data constants 1-15
Strings, semicolons in 1-90
struct, define record structure 1-53
with levels data files 3-48
Subaxes
end of group 2-77
naming groups on elements 2-79, 2-114
start of group 2-77
tables from 2-80
Subdirectories, store variables in 4-80
Subheadings
in sorted tables 3-137
in tables 2-62
nesting in column axes 2-63
positioning above columns 2-65
underline 2-63
Subroutines
arguments with 1-188
convert multicoded data to single coded 1-181
defining variables in 1-189
explode 1-181
fetch 1-178
fetchx 1-180
load data from look-up file 1-178, 1-180
using 1-177
writing your own 1-182
Subscription 1-23, 1-91
subsort, start secondary level sorting 2-118, 3-134
T
T and F values with nft 3-108
T statistics see Special T statistics
T variables, define in data file 1-113
t1, one sample/paired T-test 3-101
t2, two sample T-test for comparing means 3-105
<<tab>>, table numbers on tt statements 2-211
Tab section, jump to from edit 1-126
tab, name axes for table 2-171
options on 2-9, 2-174
tab_, tables file 1-230, 4-24
font numbers on right side 2-12
suppressing blank page 4-23
tabcent, center tables on the page 2-22
tabcon 3-189
Table numbers 2-210
justification of 2-211
suppress 2-15
switching off 2-211
user-defined, positioning with tt 2-211
with and 2-211
Tables (continued)
paste one under the other 2-188, 2-195
placing side by side 2-180
position of cell counts in 2-167
position on page 3-211
pounds signs in 3-198
precedence of rows & columns when paginating 2-19
print base title last 2-10
print date on 2-10
print output type on 2-24
print text in main body of 2-61
reprint rows at top of continued 2-109, 2-114
row text width 2-20
separate for different output types 2-17
sorted means 3-141
sorted summary statistics 3-142
sorting 2-21, 3-125
suppress column headings 2-193
suppress if base less than given value 2-21
suppress numbering 2-15
suppress output type on 2-24
suppress page break between 2-190
suppress the base on continuation pages 2-57
suppressing all-zero 2-19, 2-32
suppressing printing 2-15
texts 2-5, 4-7
titles and other texts with hitch/squeeze 2-191
titles at bottom of page 2-210
titles for 2-181, 2-203
titles from axis names 2-10
titles from hd= text 2-22
titles to print first 2-23
titles to print last 2-24
types of data in 2-3
unsorted where default is sorted 2-32
updating cells at higher level than axes 3-54
using dummy data 3-43
using subaxes 2-80
vertical lines in 2-167
Tables file 4-22
tabn.syl, graphics files 4-22
Tabulation section
C code in 3-123
components of 2-7
editing in 3-124
hierarchies in 2-8
Tabulation statements, format of 1-5
Tags, internal variable names 4-15, 4-24, 4-94
Target weighting 3-2, 3-7
target, target weighting 3-7
tb, table numbers 2-210
tba, left justify table numbers on first page 2-211
tbb, right justify table numbers on first page 2-211
tc.def, table of contents format file 3-194
td, directory for temporary files 1-231
Temporary disk space for a run 4-178
Temporary files
delete 4-25
directory for 1-231
summary of 4-23
Terminating the edit 1-127
Terminating the run 1-128
with tables 1-127
without tables 1-128
termwid, output width in Quanvert Text 4-85
Testing values of data-mapped variables 1-211
Text
at the bottom of tables 4-71
break points 2-163
continuing in axes 2-66
indent element when split 2-115
numeric variables 2-27, 2-42, 2-123, 4-42
prevent alteration of, in Quanvert Text 4-83
print in body of table 2-61
row, indenting folded 2-13
symbolic parameters for 2-234
table titles 2-203
underlining on elements 2-119
Text files, convert to Quantum format 4-167
Text strings, limit for 4-10
Text variables, for Quanvert 4-73, 4-74
textconv, translate Quanvert Text prompts 4-82
textdefs, number of text symbolic parameters per run 4-9
Text-only elements 2-58
sorted tables 3-137
with col/val/fld/bit 2-88
textq, convert text to Quantum data format 4-167
texts.qt, customized text file 4-8
thisread, cards read during current read 1-50
title, table titles from axis titles 2-22, 2-32
Titles 2-203
altering default order 2-205
at bottom of page 2-210
creating from axis names 2-10
default printing order 2-205
defining for Quanvert 4-68
footnotes on tables 2-208
in laser printed tables 3-205
justification of 2-203, 2-215
order of 2-24
prevent alteration of in Quanvert Text 4-83
print base last 2-10
suppress automatic for special T statistics 2-15
T statistics 3-151
table description, customizing 4-7
table, from hd= 2-22
underlining 2-207
which to print first 2-23
which to print last 2-24
with hitch/squeeze 2-191
with nested filter sections 2-221
with sid and und 2-181
topc, percent signs at top of column 2-22, 2-32
U
u, underline column headings 2-168
und, tables one under the other 2-180, 2-181, 2-182
Underlining
column headings 2-168
column headings with pstab 3-203
element texts 2-119
for separate column texts in q2cda 2-169
in laser printed tables 3-203
in table of contents 3-190
subheadings 2-63
titles 2-207
Uniform distribution, test for 3-73
uniq_id, unique respondent numbers for Quanvert 4-121
uniqid=, in element texts 4-105
axes generated by qdiaxes 1-222
Unique ID text, and data-mapped variables 1-209
Unknown file formats for databases 4-128
unl, underline text 2-63, 2-119, 2-207
Unpack databases 4-126
Unweighted data, prevent Quanvert access 4-85, 4-116
uplev=, axis update level 2-45, 3-56
comparison with celllev 3-58
example of intermediate file with 3-57
statistics with 3-61
update base for all records at anlev= level 2-124, 3-57
with grids 2-245
useeffbase, use weighted counts for standard error 2-31, 2-32, 2-143
*usemap, define data-mapping file 1-204
User-definable limits 4-9
Quanvert Text 4-83
Users file, for Quanvert Text 4-83
Weights (continued)
minimum 3-18
switching off 3-23
using 3-23
Whole numbers 1-16
Wide tables, print all on one page 2-190
Width of terminal display for Quanvert Text 4-85
Wildcard characters with quclean, qteclean, qtoclean & manipclean 4-26
Windows-based Quanvert see Quanvert (Windows)
wm, define a weight matrix 3-7
wm=, weighting matrix to use 2-31, 2-125, 3-23
wmerrors, weighting error handling 2-31, 2-32, 3-10
write, write out records 1-65
as part of another statement 1-66
correcting errors from 1-160
creating data files 1-69
default output file 1-67
define default print parameters for 1-81
defining the file type 1-78
file of records failing 4-17
override use of ruler with ident 1-83
specifying an output file 1-67
turn off default print parameters 1-83
with explanatory texts 1-67
writing selected fields only 1-68
wtfactor=, factor weighting 3-15
wttarget=, target weighting 3-14
wttran, copy weights into data 3-24
X
xor, logical operator for assignment 1-101
X-variables 1-22
Z
z1, one sample Z-test on proportions 3-93
z2, two sample Z-test on proportions 3-95
z3, Z-test on subsample proportions 3-97
z4, Z-test on overlapping samples 3-99
Zero
exclude from averages 2-137
special characters for 2-21
suppressing columns 2-15
suppressing elements 2-32
suppressing rows 2-15
suppressing tables 2-19, 2-32
Z-test
one sample 3-93
example of 3-94
formula 3-115
overlapping samples 3-99
example of 3-100
formula 3-116
subsample proportions 3-97
example of 3-96, 3-98
formula 3-116
two sample on proportions 3-95
formula 3-115