
INFILE Statement:

The INFILE statement is used to read data from an external file into SAS. The INFILE statement
must be followed by an INPUT statement that reads the input data variables. INFILE identifies
the file name and location and defines the file structure.
The INFILE statement can read plain-text data such as Notepad (.txt) documents, tab-delimited files and CSV files.

Syntax:
INFILE '<file-specification>' <options>;
Here file-specification is either an external file or in-stream data. In-stream data is data that is
entered directly in the program after a DATALINES or CARDS statement.

The following statements can be used to read data from an external data file into a new
SAS dataset.

DATA <dataset name>;
INFILE '<external location of the file>' <options>;
INPUT var1 var2 … var<n>;
RUN;

Getting the data from text files to SAS:


The following example shows how to get the data from a Notepad file (text document) into SAS.
The INFILE statement tells SAS that the text document named demog is located in the directory D:\.
The INPUT statement reads the variables. The following example creates the dataset demo in the work
library.

EX:
data demo;
infile "D:\demog.txt";
input ptid gender$ age;
run;
proc print data=demo;
run;

Text Document which is located in D drive


Output:

Getting the data from comma delimited text file to SAS:


To read the data from a text document with comma-separated values, use the DLM option to specify the
comma. dlm=',' (DLM is short for DELIMITER) tells SAS that the data is delimited by commas. The following
example creates the dataset demo1 in the work library.

EX:
data demo1;
infile "D:\demog.txt" dlm=',';
input ptid gender$ color$;
run;
proc print data=demo1;
run;

Comma delimited text file which is located in D drive


Output:

Getting the data from TAB delimited text file to SAS:


Sometimes you may have the data in a text document with tab-separated values. To read
the data from a text document with tab-separated values, use the option dlm='09'x. ('09'x is the
hexadecimal representation of the tab character; it is the way of specifying TAB as the delimiter.)
The following example creates the dataset demo2 in the work library.

EX:
data demo2;
infile "D:\demog.txt" dlm='09'x ;
input ptid name$ age;
run;
proc print data=demo2;
run;

TAB delimited file which is located in D drive


Output:

Getting the data from CSV file to SAS:


If you want to get the data from an Excel file with the INFILE statement, first convert the Excel file to a CSV (comma
separated values) file. The INFILE statement cannot read the data directly from the Excel file. In a CSV
file, the default delimiter is a comma, so use the option DLM=','.
The following example creates the dataset demo3 in the work library.

EX:
data demo3;
infile "D:\demog.csv" dlm=',';
input ptid name$ color$;
run;
proc print data=demo3;
run;

CSV file which is located in D drive


Output:

Options in INFILE Statement:


There are a number of options that you can use on the INFILE statement. The following are the
commonly used options.

DLM or DELIMITER:
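By default SAS treats a blank space as the delimiter. DLM can be used to specify the delimiters
(other than blank) which separate the data values in raw data. When several characters are listed,
as in dlm = ', *' below, each of them (comma, blank and asterisk) is treated as a delimiter.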

EX:

data demo;
infile cards dlm = ', *';
input pid name$ age ;
cards;
100, henry, *78
101, james, *84
;
run;
proc print data = demo;
run;
Output with DLM option:

Output without the DLM option:


DSD (Delimiter-Sensitive Data):
The DSD option sets the default delimiter to a comma and strips off the quotation marks that surround the
character data values. It also treats two consecutive delimiters as a missing value. If the data values are
separated by commas, you can use either the DSD option or DLM=','. If any other delimiter separates the data
values, you need to use the DLM option.

EX:
data demo1;
infile cards dsd;
input pid name$ age color$;
cards;
100, 'henry', 78, 'white',
101, 'james', 84, 'black',
;
run;
proc print data = demo1;
run;
Output:

Output without DSD option:
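
As a further illustration of DSD, the following minimal sketch uses made-up in-stream data (the dataset
name demo_dsd and the values are assumptions for illustration, not part of the original example). The
quotation marks sit directly after the commas, and the second record has two commas in a row where the
age should be. With DSD, the quotation marks should be removed and age should be missing for the second record.

```sas
data demo_dsd;                 /* hypothetical dataset name */
infile cards dsd;              /* comma is the default delimiter with DSD */
input pid name$ age color$;
cards;
100,"henry",78,"white"
101,"james",,"black"
;
run;
proc print data = demo_dsd;
run;
```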


MISSOVER:
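This option prevents SAS from going to the next line to look for values when the current line ends
before all the variables in the INPUT statement have been read. The variables without values are set to missing.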

EX:
data physical;
infile cards missover;
input pid age ht wt;
cards;
100 78 160
101 84 170 57
;
run;
proc print data = physical;
run;
Output:
Output without MISSOVER option:

FIRSTOBS=N:
This option tells SAS to start reading at the specified record (line) number, rather than
beginning with the first record. N is the record number. To read a data file that has the
variable names on the first line, use the option FIRSTOBS=2, so that SAS starts reading the data from
the second line.
EX:
data phys;
infile cards firstobs= 4;
input pid age ht wt;
cards;
100 78 160 56
101 84 170 57
102 68 175 60
103 71 180 65
104 58 174 59
105 62 163 70
;
run;
proc print data = phys;
run;

Output:
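
The FIRSTOBS=2 case described above (variable names on the first line) can be sketched as follows. The
dataset name phys_hdr and the header line are assumptions added for illustration. SAS should skip the
first line and create two observations.

```sas
data phys_hdr;                 /* hypothetical dataset name */
infile cards firstobs= 2;      /* skip line 1, which holds the variable names */
input pid age ht wt;
cards;
pid age ht wt
100 78 160 56
101 84 170 57
;
run;
proc print data = phys_hdr;
run;
```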

OBS= N:
This option specifies which line in your raw data file should be treated as the last record. For
example, if you want to read the first 100 records from a raw data file containing 200 records, you
can use OBS=100. N is the record number.

EX:
data phys2;
infile cards obs= 4;
input pid age ht wt;
cards;
100 78 160 56
101 84 170 57
102 68 175 60
103 71 180 65
104 58 174 59
105 62 163 70
;
run;
proc print data = phys2;
run;

Output:
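
FIRSTOBS and OBS can also be used together to read a range of records. The following is a minimal sketch
with the same sample data (the dataset name phys3 is an assumption for illustration). Because OBS gives
the number of the last record to read, not a count, firstobs= 2 obs= 4 should read records 2, 3 and 4 and
create three observations.

```sas
data phys3;                        /* hypothetical dataset name */
infile cards firstobs= 2 obs= 4;   /* read records 2 through 4 only */
input pid age ht wt;
cards;
100 78 160 56
101 84 170 57
102 68 175 60
103 71 180 65
104 58 174 59
105 62 163 70
;
run;
proc print data = phys3;
run;
```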
