Compiling And Executing The DATA Step

Looking Behind the Scenes
The DATA step is processed in two phases:
- compilation
- execution

   data flight;
      infile 'raw-data-file';
      input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
            FirstClass 15-17 Economy 18-20;
   run;

At compile time, SAS creates
- an input buffer to hold the current raw data file record that is being processed (the slide shows it as a row of column positions 1-20)
- a program data vector (PDV) to hold the current SAS observation
- the descriptor portion of the output data set.

The PDV and the descriptor portion both carry the same variable names, types, and lengths:

   Flight   Date   Dest   FirstClass   Economy
   $ 3      $ 8    $ 3    N 8          N 8

Compiling the DATA Step
(Slide sequence: SAS compiles the program statement by statement. It first creates an empty input buffer, shown with column positions 1-20, and then adds each variable to the PDV in the order the INPUT statement names it.)

   Statement compiled          Result
   data flight;                compilation of the step begins
   infile 'raw-data-file';     input buffer created
   input Flight $ 1-3          Flight       $ 3   added to the PDV
         Date $ 4-11           Date         $ 8   added to the PDV
         Dest $ 12-14          Dest         $ 3   added to the PDV
         FirstClass 15-17      FirstClass   N 8   added to the PDV
         Economy 18-20;        Economy      N 8   added to the PDV
   run;                        descriptor portion of data set flight created

Executing the DATA Step
(Slide sequence: the same program runs against three raw data records. Each slide shows the raw data, the input buffer, the PDV, and the output data set flight.)

Raw data:

   43912/11/00LAX 20137
   92112/11/00DFW 20131
   11412/12/00LAX 15170

For each record, the slides show the following steps:
1. The variables in the PDV start out as missing.
2. The INPUT statement reads the next record into the input buffer, for example 43912/11/00LAX 20137.
3. The INPUT statement assigns values to the PDV from the column ranges: Flight 439, Date 12/11/00, Dest LAX, FirstClass 20, Economy 137.
4. At the RUN statement, automatic output writes the PDV values to the data set as one observation, and automatic return sends control back to the top of the step.
5. SAS reinitializes the variables to missing, and the next record is read.

Data set flight after the last record:

   Flight   Date       Dest   FirstClass   Economy
   439      12/11/00   LAX            20       137
   921      12/11/00   DFW            20       131
   114      12/12/00   LAX            15       170
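To try the walk-through without an external file, the three records from the slides can be supplied inline. This is a minimal sketch, assuming a DATALINES block stands in for 'raw-data-file' (the original program reads an external file); the PROC PRINT step is added only to display the result.

```sas
/* Sketch: the slide program with the three sample records supplied inline. */
/* Assumption: DATALINES replaces the external 'raw-data-file'.             */
data flight;
   infile datalines;
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;
   datalines;
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170
;
run;

/* Display the observations that automatic output wrote to the data set. */
proc print data=flight;
run;
```

The listing should contain one observation per raw data record, with Flight, Date, and Dest stored as character values and FirstClass and Economy as numeric values.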

DATA Step Execution: Summary
(Flowchart)
1. Compile program.
2. Initialize variables to missing (PDV).
3. Execute INPUT statement.
4. End of file? If yes, go on to the next step. If no, continue.
5. Execute other statements.
6. Output to SAS data set, then return to 2.

Overview of DATA Step Processing
When you submit a DATA step for execution, it is first compiled and then executed.

The Compilation Phase
In this phase, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code, and SAS identifies the type and length of each new variable.

During the compile phase, SAS creates the following three items:
1. Input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement. Note that this buffer is created only when the DATA step reads raw data. (When the DATA step reads a SAS data set, SAS reads the data directly into the program data vector.)

2. Program Data Vector (PDV) is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

PDV
Along with data set variables and computed variables, the PDV contains two automatic variables, _N_ and _ERROR_. The _N_ variable counts the number of times the DATA step begins to iterate. The _ERROR_ variable signals the occurrence of an error caused by the data during execution. The value of _ERROR_ is either 0 (indicating no errors exist) or 1 (indicating that one or more errors have occurred). SAS does not write these variables to the output data set.

3. Descriptor Information is information that SAS creates and maintains about each SAS data set, including data set attributes and variable attributes. It contains, for example, the name of the data set and its member type, the date and time that the data set was created, and the number, names, and data types (character or numeric) of the variables.
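The three items above can be inspected from a small test program. The following is a sketch under stated assumptions, not part of the original slides: the data set name flight_check and the variables Iteration and HadError are hypothetical, DATALINES stands in for the external raw data file, and the third record has been deliberately altered (XX in the FirstClass columns) to provoke a data error.

```sas
/* Sketch: copy the automatic variables into ordinary variables so that   */
/* they reach the output data set. Iteration and HadError are hypothetical */
/* names. The third record is deliberately invalid for FirstClass.         */
data flight_check;
   infile datalines;
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;
   Iteration = _N_;       /* number of the current DATA step iteration    */
   HadError  = _ERROR_;   /* 1 if this record caused a data error, else 0 */
   datalines;
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX XX170
;
run;

/* Show the stored observations, including the copied automatic values. */
proc print data=flight_check;
run;

/* List the descriptor portion: data set name, member type, creation    */
/* date and time, and each variable's name, type, and length.           */
proc contents data=flight_check;
run;
```

The PROC CONTENTS listing should include Iteration and HadError but not _N_ or _ERROR_, since the automatic variables exist only in the PDV. For the altered third record, FirstClass should come out missing and HadError should be 1, with a note about invalid data in the log.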

The Execution Phase
The flow of action in the Execution Phase of a simple DATA step is described as follows:
1. The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1.
2. SAS sets the newly created program variables to missing in the program data vector (PDV).
3. SAS reads a data record from a raw data file into the input buffer, and writes the values from that record into the PDV.
4. SAS executes any subsequent programming statements for the current record.
5. SAS writes an observation to the SAS data set, the system automatically returns to the top of the DATA step, and the values of variables created by INPUT and assignment statements are reset to missing in the program data vector.
6. SAS counts another iteration, reads the next record or observation, and executes the subsequent programming statements for the current observation.
7. The DATA step terminates when SAS encounters the end-of-file in a SAS data set or a raw data file.
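One way to watch this flow is to write the PDV to the SAS log at the top of the step and again after the INPUT statement. This is a minimal sketch, assuming the three sample records are supplied with DATALINES instead of the external file; the two PUT statements and their label text are additions for illustration only.

```sas
/* Sketch: trace the PDV in the log on every iteration.          */
/* Assumption: DATALINES replaces the external 'raw-data-file'.  */
data flight;
   infile datalines;

   /* Top of the step: variables read by INPUT have been reset to missing. */
   put '--- top of step ---';
   put _all_;

   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14
         FirstClass 15-17 Economy 18-20;

   /* After INPUT: the PDV now holds the values from the current record. */
   put '--- after INPUT ---';
   put _all_;
   datalines;
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170
;
run;
```

PUT _ALL_ writes every variable in the PDV, including _N_ and _ERROR_, so the log shows the iteration counter alongside the data values. Because SAS detects end-of-file only when INPUT tries to read past the last record, the top-of-step lines should appear a fourth time with _N_=4 and no matching after-INPUT lines.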
