
Gracefully Handling Empty Files in the SAS® System

Matt Kleinosky, Independent Consultant


O.V. Hanger, Nielsen Media Research

INTRODUCTION
This paper explores methods for gracefully handling empty files and
data sets. In a multiple step SAS program, it is often necessary to
provide for the possibility of these situations. The consequences of
ignoring an empty file condition could be as slight as a SAS program
running to completion and creating no reports or as severe as an abend
in a production environment. The output from the latter case could
include a long list of confusing and, possibly, misleading SAS error
(or warning) messages in the Log and abrupt endings in DATA steps
involving SET, MERGE or UPDATE. There are several ways to provide for
the possibility of empty files. The first approach involves detecting
the empty file in the DATA step which creates the SAS data set. Another
approach introduces a generalized macro, %EMPTYCT, that not only
provides flexible handling of empty files, but also can be used when a
consolidated job summary report of record and observation counts is
needed.

SOME EXAMPLES

Example 1 shows a simple SAS job that encounters an unexpected empty
raw data file. Here the DATA and PROC steps which use the empty SAS
data sets fail to generate their normal output. No special error or
warning messages are produced in the DATA steps. A NOTE is produced for
each PROC involving an empty SAS data set indicating
'NO OBSERVATIONS ...'.

Example 1 - Log

1 DATA EXAMP1A; INFILE INDD1;
2 INPUT @1 NUM1 2. @6 X 4. @11 STRING1 $6.;
3 OUTPUT EXAMP1A;
4 RUN;

NOTE: 0 LINES WERE READ FROM INFILE INDD1.
NOTE: DATA SET WORK.EXAMP1A HAS 0 OBS .....
NOTE: THE DATA STATEMENT USED 0.02 SECONDS AND 172K.

5 PROC SORT DATA=EXAMP1A; BY NUM1;
6 RUN;

NOTE: NO OBSERVATIONS TO BE SORTED.
NOTE: DATA SET WORK.EXAMP1A HAS 0 OBS .....
NOTE: THE PROCEDURE SORT USED 0.01 SECONDS AND 44K.

7 DATA EXAMP1B; SET EXAMP1A;
8 IF NUM1 < 60;
9 IF NUM1 < 9 THEN PUT 'LOW VALUE FOUND' NUM1=;
10 ELSE PUT 'NORMAL VALUE ' NUM1=;
11 RUN;

NOTE: DATA SET WORK.EXAMP1B HAS 0 OBS .....
NOTE: THE DATA STATEMENT USED 0.02 SECONDS AND 172K.

12 PROC PRINT DATA=EXAMP1B; BY NUM1;
13 RUN;

NOTE: NO OBSERVATIONS IN DATA SET WORK.EXAMP1B
NOTE: THE PROCEDURE PRINT USED 0.02 SECONDS AND 172K.

An example of the more extreme events caused by an unanticipated empty
file can be seen in an example involving SAS macro variables. If macro
variables were to be created in the first step and used in subsequent
steps, the effect of an empty input file is not so tidy.

In Example 2 the macro variables do not get created in the first step
so all references to them produce the SAS error messages that surely
indicate the error. But there is less emphasis in the log on the step
which created the errant condition.

Example 2 - Source

DATA EXAMP2A(KEEP=X STRING1); INFILE INDD1;
  INPUT @1 NUM1 2. @6 X 4. @11 STRING1 $6.;
  IF _N_=1 THEN DO;
    CALL SYMPUT('DIMENSN',PUT(NUM1,2.));
    CALL SYMPUT('STRLEN',
                PUT(LENGTH(TRIM(STRING1)),1.));
  END;
RUN;
DATA EXAMP2B; SET EXAMP2A;
  ARRAY TOTALS {*} A1-A&DIMENSN;
  RETAIN A1-A&DIMENSN 0;
  ARRAY LABELS {*} $ &STRLEN L1-L&DIMENSN;
  RETAIN L1-L&DIMENSN;
  DO I = 1 TO &DIMENSN;
    TOTALS{I}=X*I;
    LABELS{I}=SUBSTR(STRING1,I,1);
  END;
RUN;

Example 2 - Log

1 DATA EXAMP2A(KEEP=X STRING1); INFILE INDD1;
2 INPUT @1 NUM1 2. @6 X 4. @11 STRING1 $6.;
3 IF _N_=1 THEN DO;
4 CALL SYMPUT('DIMENSN',PUT(NUM1,2.));
5 CALL SYMPUT('STRLEN',
  PUT(LENGTH(TRIM(STRING1)),1.));
6 END;
7 RUN;

NOTE: 0 LINES WERE READ FROM INFILE INDD1.
NOTE: DATA SET WORK.EXAMP2A HAS 0 OBS .....
NOTE: THE DATA STATEMENT USED 0.02 SECONDS AND 176K.

8 DATA EXAMP2B; SET EXAMP2A;
9 ARRAY TOTALS {*} A1-A&DIMENSN;
        1301
        147
        144
        102
10 RETAIN A1-A&DIMENSN 0;
        1301
        147
        144
        109
11 ARRAY LABELS {*} $ &STRLEN L1-L&DIMENSN;
        1301
        1301
        102
        147
        144
        102
        179
12 RETAIN L1-L&DIMENSN;
        1301
        147
        144
        109
13 DO I = 1 TO &DIMENSN;
        1301
        307
14 TOTALS{I}=X*I;
15 LABELS{I}=SUBSTR(STRING1,I,1);
16 END;
17 RUN;

ERROR 147: THE WORD DOES NOT END IN A NUMBER
ERROR 144: ALPHABETIC ROOTS MUST MATCH.
ERROR 102: WORD DOES NOT START WITH A LETTER OR UNDERSCORE.
           SEE USER'S GUIDE, "SAS NAMES".
ERROR 109: SHOULD BE A NUMBER
ERROR 179: ALL VARIABLES IN ARRAY LIST MUST BE THE SAME TYPE,
           I.E., ALL NUMERIC OR ALL CHARACTER.
ERROR 307: UNRECOGNIZED.
WARNING 1301: APPARENT SYMBOLIC REFERENCE NOT RESOLVED.
NOTE: SAS STOPPED PROCESSING THIS STEP BECAUSE OF ERRORS.
NOTE: SAS SET OPTION OBS=0 AND WILL CONTINUE TO CHECK STATEMENTS.
      THIS MAY CAUSE NOTE: NO OBSERVATIONS IN DATASET.
NOTE: DATA SET WORK.EXAMP2B HAS 0 OBSERVATIONS
NOTE: THE DATA STATEMENT USED 0.03 SECONDS AND 48K.

18 PROC PRINT DATA=EXAMP2B;
19 RUN;

NOTE: THE PROCEDURE PRINT USED 0.02 SECONDS AND 176K.

The extensive error messages all result from the fact that the input
file is empty. This is reported by the NOTE messages between source
lines 7 and 8. While these NOTEs are accurate and complete, they do not
have the prominence of the error messages littering the subsequent
step.

SOLUTIONS

The first step in Example 2 could be modified to include an End Of File
routine that terminates the job if the file is empty. Example 3
demonstrates how this can be done. Control passes, by an implicit GOTO,
to the statements following the EOFRTN label when the INPUT statement
attempts to read past the end of the file. Because of the _N_=1 test,
the message and the ABORT are executed only when this happens on the
first iteration, that is, when the file has no records. Of course, the
ABORT statement could be left out and the example program would simply
issue the error message. The statements that assign the macro variables
DIMENSN and STRLEN their values are executed only if the input file has
one or more records. In this case, the program ends before the
subsequent steps are compiled and a clear error message appears in the
SAS Log:

Example 3 - Log

1 DATA EXAMP3A(KEEP=X STRING1);
  INFILE INDD1 EOF=EOFRTN;
2 INPUT @1 NUM1 2. @6 X 4. @11 STRING1 $6.;
3 IF _N_=1 THEN DO;
4 CALL SYMPUT('DIMENSN',PUT(NUM1,2.));
5 CALL SYMPUT('STRLEN',
  PUT(LENGTH(TRIM(STRING1)),1.));
6 END;
7 RETURN;
8 EOFRTN:
9 IF _N_=1 THEN DO;
10 PUT / 55*"+"
11 / "+ END OF FILE ON FIRST OBS - EMPTY FILE" @55
12 / "+ SAS DATA SET EXAMP3A HAS ZERO OBS." @55
13 / 55*"+";
14 ABORT RETURN 22;
15 END;
16 RETURN;
17 RUN;

+++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ END OF FILE ON FIRST OBS - EMPTY FILE
+ SAS DATA SET EXAMP3A HAS ZERO OBSERVATIONS
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
ERROR: ABORT STATEMENT EXECUTED.
       IT SPECIFIED THE RETURN OPTION.
NUM1=. X=. STRING1= _ERROR_=1 _N_=1
NOTE: SAS SET OPTION OBS=0 AND WILL CONTINUE TO CHECK STATEMENTS.
      THIS MAY CAUSE NOTE: NO OBSERVATIONS IN DATASET.
NOTE: 0 LINES WERE READ FROM INFILE INDD1.
NOTE: DATA SET WORK.EXAMP3A HAS 0 OBSERVATIONS AND 2 VARIABLES.
NOTE: THE DATA STATEMENT USED 0.03 SECONDS AND 176K.

When dealing with an already existing SAS data set in a DATA step using
a SET statement, we can utilize the NOBS= option to detect an empty
file condition. The NOBS= option of the SET statement puts the
observation count of all SAS data sets on the SET statement into a
variable AT COMPILE TIME. Since the NOBS= value is provided at compile
time, it can be used in the DATA step before the SET statement is
actually executed. Example 4 demonstrates a DATA step taking advantage
of this feature to print a standard empty file Log message and abort
the job if NOBS=0:

Example 4 - Log

1 DATA EXAMP4A; INFILE INDD1;
2 INPUT @1 NUM1 2. @6 X 4. @11 STRING1 $6.;
3 RUN;

NOTE: 0 LINES WERE READ FROM INFILE INDD1.
NOTE: DATA SET WORK.EXAMP4A HAS 0 OBS .....
NOTE: THE DATA STATEMENT USED 0.02 SECONDS AND 176K.

4 DATA EXAMP4B;
5 IF OBSCNT=0 THEN DO;
6 PUT / 55*"+"
7 / "+ END OF FILE ON FIRST OBS - EMPTY FILE" @55
8 / "+ SAS DATA SET EXAMP4A HAS ZERO OBS." @55
9 / 55*"+";
10 ABORT RETURN 22;
11 END;
12 SET EXAMP4A NOBS=OBSCNT;
13 RUN;

+++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ END OF FILE ON FIRST OBS - EMPTY FILE
+ SAS DATA SET EXAMP4A HAS ZERO OBS.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
ERROR: ABORT STATEMENT EXECUTED.
       IT SPECIFIED THE RETURN OPTION.
OBSCNT=0 NUM1=. X=. STRING1= _ERROR_=1 _N_=1
NOTE: SAS SET OPTION OBS=0 AND WILL CONTINUE TO .....
      THIS MAY CAUSE NOTE: NO OBSERVATIONS IN DATASET.
NOTE: DATA SET WORK.EXAMP4B HAS 0 OBSERVATIONS AND ...
NOTE: THE DATA STATEMENT USED 0.02 SECONDS AND 176K

While Examples 3 and 4 do solve our stated problem of dealing with the
possibility of an empty file condition, they demonstrate a coding
scheme which requires code to be added to each individual DATA step
concerned. Another disadvantage of these methods is the difficulty of
retrofitting this feature into existing programs. To solve this
challenge, let's examine the macro %EMPTYCT:

%* MACRO EMPTYCT ------ ;
%* THIS MACRO IS TO TEST IF A SAS DATA SET HAS NO OBS. ;
%* IT CAN BE USED TO ABORT OR JUST PRINT A MESSAGE. ;
%* A MACRO VARIABLE W/ THE OBS COUNT IS ALSO CREATED. ;
%* MACRO EMPTYCT ------ ;

%MACRO EMPTYCT(
  SASDS=,     /* REQUIRED: NAME OF SAS DATASET       */
  TEXT=,      /* OPTIONAL: TEXT DESCRIPTION OF FILE  */
  MCOUNTV=,   /* OPTIONAL: MACRO VARIABLE NAME       */
              /* TO HOLD DATA SET RECORD COUNT       */
  STEP=000,   /* OPTIONAL: STEP NAME IF THEY ARE     */
              /* NUMBERED OR IDENTIFIABLE            */
  ABORTCC=00  /* OPTIONAL ABORT RETURN CODE,         */
              /* IF NOT USED (OR 00) THEN NO         */
              /* ABORT IS DONE, AND JUST THE         */
              /* THE WARNING MSG IS PRINTED          */ );
DATA _NULL_;
  /* if macro variable desired */
  %IF "&MCOUNTV" NE "" %THEN %DO;
    CALL SYMPUT("&MCOUNTV",LEFT(PUT(N,Z5.)));
  %END;
  IF N = 0 THEN DO;   /* if SASDS is empty */
    PUT //// 55*"+"
    %IF "&STEP" NE "000" %THEN %DO;
      / "+ ===> - STEP &STEP" @55 "+"
    %END;
      / "+ - EMPTY DATASET ---" @55 "+"
      / "+ THE SAS DATA SET &SASDS," @55 "+"
      / "+ &TEXT" @55 "+"
      / "+ HAS NO OBSERVATIONS!" @55 "+"
      / 55*"+"
    %IF "&ABORTCC" NE "00" %THEN %DO;
      / "+ JOB WILL NOW ABORT !!!!" @55 "+"
    %END;
    %ELSE %DO;
      / "+ WARNING ONLY- PROCESSING CONTINUES.." @55 "+"
    %END;
      / "+" @55 "+" / 55*"+";
    %IF "&ABORTCC" NE "00" %THEN %DO;
      ABORT ABEND RETURN &ABORTCC;
    %END;
  END;                /* end IF N=0 */
  ELSE DO;            /* SASDS has at least one observation */
    PUT //// 55*"+"
      / "+ ===> - STEP &STEP" @55 "+"
      / "+ - - STEP COMPLETE OK ---" @55 "+"
      / 55*"+"
      / "+ &TEXT - &SASDS TOTOBS ===> " N @55 "+"
      / 55*"+";
  END;                /* end ELSE */
  STOP;
  SET &SASDS END=ENDSET NOBS=N POINT=N;
RUN;
%MEND EMPTYCT;

%EMPTYCT adds the following functions to empty file handling:

1. Optional job abort if the data set is empty.
2. Optional macro variable is created with the count of observations
   in the tested data set.
3. Standard message printed with record count or empty file
   notification.

%EMPTYCT could be used immediately after every DATA step which creates
a SAS data set from an external file or subsets (or copies) an existing
SAS data set. The record count macro variable feature could be used for
each data set processed and the values listed in a job summary report.
If permanent SAS data sets are involved and a SAS program consists only
of PROC steps, %EMPTYCT could be used to verify all SAS data sets in
the job before any processing is done. The standard log message would
provide record counts and the ABORTCC= feature could be used to alert
an operator of empty files and terminate the program:

Example 5 - Log

53 DATA EX5; DO X=1 TO 10; OUTPUT; END;
54 RUN;

NOTE: DATA SET WORK.EX5 HAS 10 OBSERVATIONS AND ..
NOTE: THE DATA STATEMENT USED 0.05 SECONDS AND 104K.

55 DATA EMPTY; STOP;
56 RUN;

NOTE: DATA SET WORK.EMPTY HAS 0 OBSERVATIONS AND ..
NOTE: THE DATA STATEMENT USED 0.01 SECONDS AND 104K

+++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ ===> - STEP 000                                     +
+ STEP COMPLETE OK - - - -                            +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ TEMP FILE - EX5 TOTOBS ===> 10                      +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
NOTE: THE DATA STATEMENT USED 0.04 SECONDS AND 168K.

57 %EMPTYCT(SASDS=EX5,TEXT=TEMP FILE,
   MCOUNTV=RC5,ABORTCC=00)
97 %EMPTYCT

+++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ EMPTY DATA SET                                      +
+ THE SAS DATA SET EMPTY,                             +
+ THIS FILE EMPTY                                     +
+ HAS NO OBSERVATIONS !                               +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ WARNING MESSAGE ONLY- PROCESSING CONTINUES          +
+                                                     +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
NOTE: THE DATA STATEMENT USED 0.04 SECONDS AND 168K.

98 (SASDS=EMPTY,TEXT=THIS FILE IS EMPTY,
   MCOUNTV=RCMT,ABORTCC=00)
138 %EMPTYCT

+++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ EMPTY DATA SET                                      +
+ THE SAS DATA SET EMPTY,                             +
+ THIS FILE EMPTY                                     +
+ HAS NO OBSERVATIONS!                                +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ JOB WILL NOW ABORT !!!!                             +
+                                                     +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
ERROR: ABORT STATEMENT EXECUTED.
       IT SPECIFIED THE ABEND OPTION.
NOTE: SAS ENDED DUE TO ERROR.

CONCLUSION
Ignoring the possibility of empty raw data files or empty SAS
data sets can result in confusing log messages (and NOTEs) and
unexpected results if an empty file is encountered. Standardizing
the Log message used for an empty file condition can make debugging
considerably easier and avoid waste of computer resources.
Finally, implementation of a macro such as %EMPTYCT provides
not only consistent and "graceful" empty file processing but also a
start towards standardizing user-friendly log messages. If desired,
the macro variable record count feature is available for any summary
reporting.

SAS is a registered trademark of SAS Institute Inc., Cary, NC, USA.
