
SAS

(Statistical Analysis System)

By:
Kirtikrushna

It was developed by James Goodnight.

1970 - Package
1980 - Language
1990 - Software
SAS products by area:

Technical      Techno-Functional             Functional
SAS/BASE       SAS/Warehouse Administrator   SAS/STAT
SAS/MACROS     SAS/ETL Studio                SAS/GRAPH
SAS/ACCESS     SAS/OLAP                      SAS/OR
SAS/AF
Domains in which SAS can be used:

CLINICAL
BANKING
INSURANCE
INTRODUCTION TO THE SAS SYSTEM:

SAS is an integrated system of software solutions that enables you to
perform the following tasks:

- data entry, retrieval, and management
- report writing and graphics design
- statistical and mathematical analysis
- business forecasting and decision support
- applications development
Base SAS software provides you with essential tools for
the basic data-driven tasks that you commonly perform
as a programmer:
Accessing Data:
You can access data that is stored almost anywhere,
whether it is in a file on your system or in another
database system, and in almost any format, including
raw data, SAS data sets, and files created by other
vendors' software.
Managing Data:
After you have accessed your data, you can use the SAS
programming language to manipulate it. You can format
your data, create variables (columns), use operators to
evaluate data values, use functions to create and recode
data values, subset data, perform conditional processing,
merge a wide range of data sources, and create, retrieve,
and update database information.
Analyzing Data and Presenting Information
Once your data is in shape, you can use SAS to analyze data
and produce reports. Your SAS output can range from a simple
listing of a data set to customized reports of complex
relationships.
Analysis:
Base SAS provides powerful data analysis tools. For
example, you can

- produce tables, frequency counts, and cross-tabulation tables
- create a variety of charts and plots
- compute a variety of descriptive statistics, including
  the mean, sum, variance, standard deviation, and more
- compute correlations and other measures of association,
  as well as multi-way cross-tabulations and inferential
  statistics.
Presentation

For reporting and displaying analytical results, SAS gives you
an almost limitless number of visually appealing output
formats, such as

- an array of markup languages, including HTML4 and XML
- output that is formatted for a high-resolution printer,
  such as PostScript, PDF, and PCL files
- RTF
- color graphs that you can make interactive using ActiveX
  controls or Java applets.
SAS WINDOW ENVIRONMENT

There are five main windows in SAS:

1.Editor window
2.Output window
3.Log window
4.Results window
5.Explorer window

1.Editor window:
The Editor window is where you type, edit, and submit SAS
programs. Saved programs have the extension .sas.
You can type any number of programs in the Editor window.
You can execute all of the programs at once or individually.
2.Output window:
The results of a program are displayed in the Output window.
Saved output has the extension .lst.

3.Log window:
If there are any errors or warnings in the program, those
messages are displayed in the Log window.
The log also displays SAS license and version information and,
for each data set that is created, the number of variables and
the number of observations.

4.Results window:
It lists the results of the programs that were submitted from
the Editor window, so that you can navigate to each result.
The Results window has no file extension of its own.
5.Explorer window:
It contains Libraries and My Computer.
SAS LANGUAGE:

The SAS language consists of statements, expressions, options,
formats, and functions similar to those of many other
programming languages.

In SAS, you use these elements within one of two groups of
SAS statements:

- DATA steps
- PROC steps
DATA STEP:
A DATA step consists of a group of statements in
the SAS language that can

- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values
Syntax:

DATA <data set name>;
INPUT <var1> <var2> ... <varn>;
CARDS;
data values
;
RUN;
EG:
data temp;
input name $ no;
datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;
SAS DATA SETS:
A SAS data set consists of the following:
-descriptor information
-data values.

The descriptor information describes the contents of the SAS
data set to SAS.
The data values are data that has been collected
or calculated. They are organized into rows,
called observations, and columns, called variables.
An observation is a collection of data values that
usually relate to a single object. A variable is the
set of data values that describe a given
characteristic.
SAS VARIABLES AND OBSERVATIONS
The figure below shows a SAS data set. The data
describes participants in a 16-week weight program
at a health and fitness club. The data for each
participant includes an identification number, name,
team name, and weight at the beginning and end of
the program.
PROC STEP:
Once your data is accessible as a SAS data set, you
can analyze the data and write reports by using a set
of tools known as SAS procedures.
A group of procedure statements is called a PROC
step.
SAS procedures analyze data in SAS data sets to
produce statistics, tables, reports, charts, and plots,
to create SQL queries, and to perform other analyses
and operations on your data. They also provide ways to
manage and print SAS files.
PROCEDURE STEP BLOCK

Syntax:
PROC <procedure name>;
Statement 1;
Statement 2;
.
.
.
Statement n;
RUN;
EG:
proc print data=temp;
run;

proc sort data=temp out=samp;
by name;
run;
Data Types in SAS System

1)Numeric data (for example, 102 or 45.5)
2)Character data (letters, digits, and special characters, for example, hari)

By default, the INPUT statement reads every variable as numeric.
To read a variable as character data, put the $ symbol after its name.
A character variable can hold up to 32,767 characters.
SAS reads data values observation by observation.
The number of observations in a SAS data set depends on the
system configuration and the available hard disk space.
In output, a missing numeric value is shown as a period (.)
and a missing character value is shown as a blank.
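
The display of missing values can be checked with a short program. This is a minimal sketch with made-up values; the data set name missing_demo is only illustrative. In list input, a single period in the data stands for a missing value of either type.

```sas
/* Illustrative data: the second record has a missing number,
   the third record has a missing name. */
data missing_demo;
   input name $ no;
   datalines;
hari 102
ravi .
. 105
;
run;

/* In the listing, the missing number prints as a period
   and the missing name prints as a blank. */
proc print data=missing_demo;
run;
```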
Different data sources:

Data that SAS reads is commonly stored in:

Text
Excel
Access
DB2
Oracle
Teradata
LIBRARIES:
There are 2 ways of creating libraries.
1.Menu driven
2.Programming code
1.Menu driven
Explorer window -> right-click -> New
2.Through programming
In the Editor window, submit a LIBNAME statement:
LIBNAME <name of library> "<path>";
Example:
LIBNAME Hari "D:\Ganesh";
To remove a library reference (the files in the folder are not deleted):
LIBNAME Hari clear;
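
Once a libref is assigned, a data set is stored in that library by using a two-level name (libref.dataset). The sketch below is illustrative only: the libref mylib and the data set libdemo are made-up names, and the libref is pointed at the WORK directory (through the PATHNAME function) purely so that the example does not depend on a folder that exists on one particular machine.

```sas
/* Assumption: MYLIB is a hypothetical libref. It is assigned to the
   physical path of the WORK library so the example runs anywhere. */
libname mylib "%sysfunc(pathname(work))";

/* Two-level name: the data set is written to the MYLIB library. */
data mylib.libdemo;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

proc print data=mylib.libdemo;
run;

/* Clearing the libref removes the reference only, not the data set. */
libname mylib clear;
```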
RULES FOR SAS STATEMENTS:
There are only a few rules for writing SAS statements:

_ SAS statements end with a semicolon.
_ You can enter SAS statements in lowercase,
uppercase, or a mixture of the two.
_ You can begin SAS statements in any column of a line
and write several statements on the same line.
_ You can begin a statement on one line and continue it
on another line, but you cannot split a word between two
lines.
RULES FOR MOST SAS NAMES:

SAS names are used for SAS data set names, variable
names, and other items. The following rules apply:

_ A SAS name can contain from one to 32 characters.
_ The first character must be a letter or an underscore (_).
_ Subsequent characters must be letters, numbers, or
underscores.
_ Blanks cannot appear in SAS names.
DATA STEP PROCESSING:
The DATA step is one of the basic building blocks of
SAS programming.
It creates the data sets that are used in a SAS
program’s analysis and reporting procedures.
OVERVIEW OF THE DATA STEP:

The DATA step consists of a group of SAS statements that
begins with a DATA statement. The DATA statement begins
the process of building a SAS data set and names the data
set. The statements that make up the DATA step are
compiled, and the syntax is checked. If the syntax is
correct, then the statements are executed. In its simplest
form, the DATA step is a loop with an automatic output and
return action.
DURING THE COMPILE PHASE:

When you submit a DATA step for execution, SAS checks
the syntax of the SAS statements and compiles them, that
is, automatically translates the statements into machine
code. SAS further processes the code and creates the
following two items:

- input buffer
- program data vector (PDV)

INPUT BUFFER:
The input buffer is a logical area in memory into which SAS
reads each record of raw data when SAS executes an INPUT
statement.

PROGRAM DATA VECTOR (PDV):

The PDV is a logical area in memory where SAS builds a data
set, one observation at a time. When a program executes, SAS
reads data values from the input buffer or creates them by
executing SAS language statements. The data values are
assigned to the appropriate variables in the program data
vector. From here, SAS writes the values to a SAS data set as
a single observation.
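The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to
iterate. It starts at 1 and is incremented by 1 at each iteration.

2) _ERROR_ signals whether an error, such as invalid input data,
occurred while the current observation was processed:

i) _ERROR_=0 means that no error was encountered.

ii) _ERROR_=1 means that an error was encountered.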
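
The two automatic variables can be watched in the log. The following is a minimal sketch with made-up data, in which the second record deliberately contains a non-numeric value for a numeric variable.

```sas
/* DATA _NULL_ runs the step without creating a data set. */
data _null_;
   input name $ no;
   /* Write the automatic variables and the values read to the log.
      _N_ counts the iterations; _ERROR_ is set to 1 for the record
      whose value for NO cannot be read as a number. */
   put _n_= _error_= name= no=;
   datalines;
hari 102
ravi abc
kiran 109
;
run;
```

For the second record the log also contains an "Invalid data" note, and no is set to missing. On the third iteration _ERROR_ is back to 0.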


Creating the Input Buffer and the Program Data Vector:

When DATA step statements are compiled, SAS
determines whether to create an input buffer. If the
input file contains raw data (as in the example below),
SAS creates an input buffer to hold the data before
moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
input TeamName $ ParticipantName $ Event1 Event2 Event3;
TeamTotal + (Event1 + Event2 + Event3);
datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
[Figure: Input buffer and program data vector after DATA step compilation]

[Figure: Position of the pointer in the input buffer before SAS reads data]

The INPUT statement then reads data values from the
record in the input buffer and writes them to the PDV,
where they become variable values.

[Figure: Position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]

[Figure: Program data vector with the computed value of the Sum statement]

Writing an Observation to the SAS Data Set

[Figure: The first observation in the output SAS data set TOTAL_POINTS]
SAS then returns to the DATA statement to begin the next
iteration. SAS resets the values in the PDV in the following
way:
_ The values of variables created by the INPUT statement
are set to missing.
_ The value created by the Sum statement is automatically
retained.
_ The value of the automatic variable _N_ is incremented
by 1, and the value of _ERROR_ is reset to 0.
Compilation:
During the compilation phase, SAS
- checks the code for syntax errors
- translates the code to machine code
- establishes an area of memory called the input buffer, if
  raw data is being read
- establishes an area of memory called the program data
  vector (PDV)
- assigns the required attributes to variables
- creates the descriptor portion of the new data set.
Execution:
During the execution phase, SAS
- initializes the variables in the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional
  processing
- writes the observation in the PDV to the output SAS
  data set at the end of the DATA step
- returns to the top of the DATA step
- resets to missing any variables that are not read from SAS
  data sets
- repeats the process.
Each time the DATA statement executes, a new iteration
of the DATA step begins, and the _N_ automatic variable
is incremented by 1.

As SAS continues to read records, the value in TeamTotal
grows larger as more participant scores are added to the
variable. _N_ is incremented at the beginning of each
iteration of the DATA step. This process continues until
SAS reaches the end of the input file.

The DATA step stops executing after it processes the last
input record.
[Figure: The input stack, the word scanner, and the compiler. The program below is waiting in the input stack.]

Data temp;
Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;

[Figure: The word scanner has taken the first tokens, Data and temp, from the input stack and passes them to the compiler. The rest of the program remains in the input stack.]
"The process that SAS uses to extract words and
symbols from the input stack is called tokenization."
Tokenization is performed by a component of SAS called
the word scanner.
The word scanner starts at the first character in the
input stack and examines each character in turn, grouping
the characters into tokens. There are four types of tokens:
Literal
a string of characters enclosed in quotation marks.
Number
digits, date values, time values, and hexadecimal
numbers.
Name
a string of characters beginning with an underscore or
letter.
Special
any character or group of characters that have special
meaning to SAS. Examples of special characters include
* / + - ; ( ) & and %.
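
To make the four token types concrete, the comments in the following sketch split two ordinary statements into their tokens. The data set tokens_demo and its variables are made-up names used only for this illustration.

```sas
/* A small data set so that the PROC step below has something to print. */
data tokens_demo;
   x1=1; x2=2; x3=3; z=4;
run;

proc print data=tokens_demo;
   /* Six tokens:
      var (name)  x1 (name)  - (special)  x3 (name)  z (name)  ; (special) */
   var x1-x3 z;
   /* Three tokens:
      title (name)  'Tokens' (literal)  ; (special) */
   title 'Tokens';
run;

title;   /* clear the title again */
```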
What Are the SAS Language Elements?

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option:

Data set options specify actions that apply only to the
SAS data set with which they appear. They enable you
to perform operations such as these:

- Renaming variables
- Selecting only the first or last n observations for
  processing
- Dropping variables from processing or from the
  output data set
- Specifying a password for a data set.
Syntax for Data Set Options

Specify a data set option in parentheses after a SAS
data set name. To specify several data set options,
separate them with spaces.

(option-1=value-1<...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);


data points (Keep=Event1 Event2);
input TName $ PName $ Event1 Event2 ;
datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write
data values.
Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:
$
indicates a character format; its absence
indicates a numeric format.
format
names the format.
w
specifies the format width, which for most formats is the
number of columns in the output data.
d
specifies an optional decimal scaling factor in the numeric
formats.

data temp;
amount=1145.32;
put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this
result:

$1,145.32
Informats

Definition of an Informat
An informat is an instruction that SAS uses to read data
values into a variable.
For example, the following value contains a dollar sign and
commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing
the numeric value 1000000 in a variable, read this value with
the COMMA11. informat.
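
The effect of the COMMA11. informat can be seen in the log with a short DATA step. This is a minimal sketch; the variable name amount is illustrative.

```sas
data _null_;
   /* COMMA11. strips the dollar sign and the commas while reading. */
   input amount comma11.;
   put amount=;             /* the stored value is the plain number 1000000 */
   put amount= dollar12.;   /* a format puts the symbols back when writing  */
   datalines;
$1,000,000
;
run;
```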
Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$
indicates a character informat; its absence indicates a
numeric informat.
informat
names the informat.
w
specifies the informat width.
d
specifies an optional decimal scaling factor in the numeric
informats.
data tmp1;
input ename $ edate ;
informat edate ddmmyy8.;
format edate date9.;
cards;
hari 10/10/07
;
run;
Functions:

Definition of Functions
A SAS function performs a computation or system
manipulation on arguments and returns a value.

Syntax of Functions:

The syntax of a function is as follows:

function-name (argument-1<...,argument-n>)

x=max(cash,credit);
x=sqrt(1500);
Statements

Definition of Statements
A SAS statement is a series of items that may include
keywords, SAS names, special characters, and operators.
All SAS statements end with a semicolon.

Commonly used DATA step statements:

INPUT (list), PUT, DATALINES
DO (iterative), DO UNTIL, DO WHILE, END
SELECT, DROP, KEEP, RETAIN
MERGE, SET, FILE, LENGTH
Sum, OUTPUT, DATA
SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS
statement is
OPTIONS option(s);
Here is an explanation of the syntax:
option
specifies one or more SAS system options that you want to
change.

options nodate linesize=72;


STANDARD DATA:
If the data values are in a standard format, that is, values
that SAS can read without an informat, then the data is called
standard data.
Eg: 467
NON-STANDARD DATA:
If the data values are not in a standard format, then the data
is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data:

data dates;
input name $ Bdate : ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats:

Date          Informat     Format
12-07-78      DDMMYY8.     DDMMYY8.
21-09-2005    DDMMYY10.    DDMMYY10.
22Jan89       DATE7.       DATE7.
22jan1989     DATE9.       DATE9.
Numeric Informats:

Numeric    Informat    Format
25,000     COMMA6.     COMMA6.
$3,000     DOLLAR6.    DOLLAR6.
25,000     COMMA6.     WORDS20.
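
The first two rows of the table can be tried out directly. This is a minimal sketch; the data set name amounts and the variable names price and fee are made up for the illustration.

```sas
data amounts;
   /* The colon modifier lets list input use an informat,
      so each value is read up to the next blank. */
   input price : comma6. fee : dollar6.;
   /* Formats control how the stored numbers are written back out. */
   format price comma6. fee dollar6.;
   datalines;
25,000 $3,000
;
run;

proc print data=amounts;
run;
```

The stored values are the plain numbers 25000 and 3000; the commas and the dollar sign reappear in the listing only because of the FORMAT statement.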
DEFINING VARIABLES IN SAS:

The INPUT statement provides instructions for reading data,
and it defines the variables for the data set that come from
the raw data.

SAS variables can have these attributes:

_ name
_ type
_ length
_ informat
_ format
_ label
DIFFERENT WAYS TO READ DATA:

1.RAW DATA IN THE JOB STREAM:

You can place data directly in the job stream with the
programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows.
The single semicolon that follows the last line of data marks
the end of the data.
The DATALINES statement and data lines must occur last
in the DATA step statements:

data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;
2.DATA IN AN EXTERNAL FILE:

If your raw data is already stored in a file, then you do not
have to bring that file into the job stream.
Use an INFILE statement to specify the file containing the
raw data.
The statements that follow show the general form of a DATA
step that reads raw data from an external file:

data <dataset name>;
infile 'your-input-file path\filename.extension';
input <var 1> <var 2> ...;
run;

3.DATA IN A SAS DATA SET:

You can also use data that is already stored in a SAS data set
as input to a new data set.

To read data from an existing SAS data set, you must
specify the existing data set's name in one of these
statements:

_ SET statement
_ MERGE statement

Data Temp;
Set weight_club;
Run;
4.DATA IN A DBMS FILE:

If you have data that is stored in another vendor's database
management system (DBMS) files, then you can use
SAS/ACCESS software to bring this data into a SAS data set.
SAS/ACCESS software enables you to assign a libref to a
library containing the DBMS file. In this example, a libref is
declared that points to a library containing Oracle data. SAS
reads data from an Oracle table into a SAS data set:

libname dblib oracle user=scott password=tiger;

data employees;
set dblib.employees;
run;
DATA SET OPTIONS:

Data set options specify actions that apply only to the SAS
data set with which they appear. The most commonly used
options are described here.

KEEP=:
This example uses the KEEP= data set option in the SET
statement to read only the variables IdNumber and Team
from the data set weight_club:

Data samp;
Set weight_club (Keep= IdNumber Team);
Run;
DROP:
Use the DROP= option to create a subset of a larger data set
when you want to specify which variables are being excluded
rather than which ones are being included. The following
DATA step reads all of the variables from the data set
weight_club except for those that are specified with the
DROP= option, and then creates a data set named A1.

Data A1;
Set weight_club (Drop= IdNumber Name);
Run;
OBS=:

Specifies when to stop processing observations. This example
reads the first 3 observations:

data s1;
set weight_club(obs=3);
run;

FIRSTOBS=:

Specifies which observation SAS processes first. This example
reads observations 2 through 4:

data s1;
set weight_club(firstobs=2 obs=4);
run;
RENAME=:

Changes the name of a variable

data two (rename=(name=Pname));
set weight_club;
run;

PW=:

Assigns a read, write, or alter password to a SAS file and
enables access to a password-protected SAS file.

data two1 (pw=ram);
set weight_club;
run;
WHERE=:

Selects observations that meet the specified condition

Data tmp;
set weight_club (where=(Name="David Shaw"));
run;

IN=:

Creates a variable that indicates whether the data set
contributed data to the current observation.
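
IN= is mostly used when data sets are merged. The following is a minimal sketch with made-up data: the data sets stock and orders, and the flag variables in_stock and in_orders, are illustrative names. The subsetting IF keeps only the part numbers that occur in both data sets.

```sas
data stock;
   input partnum qty;
   datalines;
1 10
2 20
3 30
;
run;

data orders;
   input partnum ordered;
   datalines;
2 5
3 7
4 9
;
run;

/* Both inputs are already in partnum order, as BY processing requires. */
data matched;
   merge stock(in=in_stock) orders(in=in_orders);
   by partnum;
   /* Each IN= variable is 1 when its data set contributed to the
      current observation and 0 otherwise. */
   if in_stock and in_orders;
run;

proc print data=matched;
run;
```

The IN= variables are temporary, so they do not appear in the output data set.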
DATA STEP STATEMENTS:

Data statement:
Begins a DATA step and provides names for any output
SAS data sets.

Creating an Output Data Set

data example1 ;

set weight_club;
run;
When Not Creating a Data Set

data _NULL_;
set weight_club;
put Name ;
run;
CARDS Statement:
Indicates that data lines follow

DATALINES Statement (the newer name for CARDS):

Indicates that data lines follow

Using the DATALINES Statement: In this example,
SAS reads a data line and assigns values to two character
variables, NAME and DEPT, for each observation in the
DATA step:
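
The example itself is not reproduced in these notes. A minimal sketch of such a DATA step, with a made-up data set name (person) and made-up data values, could look like this:

```sas
data person;
   /* Both variables are character, so each name is followed by $. */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```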
DELETE Statement:

Stops processing the current observation

if Team="red" then delete;

FORMAT Statement:
Associates formats with variables

INFORMAT Statement:
Associates informats with variables
data two1;
input ename $ eid hiredate ;
informat hiredate mmddyy8.;
datalines;
hari 101 12/01/05
ravi 102 11/03/06
;
run;
DATALINES4 Statement (or CARDS4):

Indicates that data lines that contain semicolons follow.
The data ends with a line of four semicolons.

data biblio;
input number citation $50.;
datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement:

Submits SAS Program Editor, Log, Procedure Output, or text
editor commands as SAS statements

dm log 'clear';

KEEP Statement:

Includes variables in output SAS data sets

data average;
set weight_club;
keep name team;
run;
LABEL Statement:

Assigns descriptive labels to variables

data rtest;
set weight_club;
label name='Participant name';
run;
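LENGTH Statement:

Specifies the number of bytes for storing variables. The LENGTH
statement must come before the first use of the variable.

data testlength;
length firstname lastname $ 15 name $ 25;
input firstname $ lastname $ n1 n2;
name=trim(lastname)||', '||firstname;
datalines;
Alexander Robinson 35 11
;
run;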
INPUT Statement:

Reads input values from specified columns and assigns
them to the corresponding SAS variables.

This DATA step demonstrates how to read input data
records with column input:

data scores;
input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS:

1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD:
In this method the data values must be separated
by at least one blank. By default, character values are
limited to 8 characters and cannot contain blanks.
EG:
The data temp example in the DATA STEP section uses list input.

2)Column INPUT METHOD:
In this method each variable is read from a fixed range of
columns, so character values can be longer than 8 characters
and can contain blanks.
data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3)NAMED INPUT METHOD
In this method each data value is preceded by the variable
name and an equal sign.
data samp;
input id= name= $ age=;
cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4)FORMATTED INPUT METHOD
In this method an informat (a width followed by a period) after
each variable name specifies how many columns are read for
that variable.
data one;
input id 3. name $11. age 3.;
datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5)ABSOLUTE INPUT METHOD
In this method the column pointer control @n moves the input
pointer to column n, which gives the exact starting location
of each data value.
data two;
input @1 id 3. @10 name $4. @19 age;
cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the
current record across iterations of the DATA step, so that
several observations can be read from one data line.

data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
The INPUT statement in this DATA step uses the & format
modifier with list input to read character values that contain
embedded blanks. Each such value must be followed by at least
two blanks.

DATA AMPERS;
INPUT NAME & $25. AGE GENDER : $1.;
DATALINES;
RASPUTIN  45 M
BETSY ROSS  62 F
ROBERT LOUIS STEVENSON  75 M
;
PROC PRINT DATA=AMPERS;
TITLE 'Example 4';
RUN;
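BY Statement:
Controls the operation of a SET, MERGE, MODIFY, or
UPDATE statement in the DATA step and sets up special
grouping variables.

In BY-group processing, SAS creates two temporary variables
for each BY variable: FIRST.variable and LAST.variable.

Processing BY-Groups:

FIRST.variable has a value of 1 for the first observation in a
BY group, that is, when the value of the variable, or of any
preceding variable in the BY statement, changes from the
previous observation.
In all other cases, FIRST.variable has a value of 0.

LAST.variable has a value of 1 for the last observation in a
BY group, that is, when the value of the variable, or of any
preceding variable in the BY statement, is about to change.
In all other cases, LAST.variable has a value of 0.
Counting the Observations in Each BY Group (the input data set
must be sorted by eid):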
data cnt;
set demo3;
by eid;
if first.eid then ct=0;
ct+1;
if last.eid;
run;
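
The counting step above reads demo3, which is not defined in these notes. A minimal sketch, assuming a made-up demo3 with an employee id (eid) and an amount, shows the whole sequence including the sort that BY processing needs:

```sas
/* Illustrative data: eid values arrive unsorted. */
data demo3;
   input eid amount;
   datalines;
101 50
102 20
101 30
103 10
102 40
;
run;

/* BY-group processing requires the data to be in eid order. */
proc sort data=demo3;
   by eid;
run;

data cnt;
   set demo3;
   by eid;
   if first.eid then ct=0;   /* restart the counter for each new eid */
   ct+1;                     /* sum statement: ct is retained        */
   if last.eid;              /* write one observation per eid        */
   keep eid ct;
run;

proc print data=cnt;
run;
```

With this data, cnt has one observation per eid: a count of 2 for 101, 2 for 102, and 1 for 103.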
FILE Statement:

Specifies the current output file for PUT statements

PUT Statement:

Writes variable values in the specified columns in the output
line
Data samp;
Input Pname $ PID Pwgt ;
File 'D:\hari.txt';
put pname $ PID;
cards;
hari 444 789
ravi 555 878
;
run;
OUTPUT Statement:

Writes the current observation to a SAS data set

data response(drop=time1-time3);
set sulfa;
time=time1;
output;
time=time2;
output;
time=time3;
output;
run;
IF Statement:

Continues processing only those observations that meet
the condition

if sex='F';

IF-THEN/ELSE Statement:

Executes a SAS statement for observations that meet specific
conditions

if status='OK' and type=3 then count+1;


SET Statement:

Reads an observation from one or more SAS data sets

data fitness;
set health exercise well;
run;

data raleigh.members;
set nc.members;
if city='Raleigh';
run;
RUN Statement:

Executes the previously entered SAS statements

LIBNAME Statement:
Associates or disassociates a SAS data library with a libref
(a shortcut name)

Libname hari "D:\file";
Libname hari clear;
RETAIN statement:

The RETAIN statement prevents SAS from re-initializing
the values of new variables at the top of the DATA step.
Previous values of retained variables are available for
processing across iterations of the DATA step.
data samp;
input patname $ patid;
datalines;
ravi 11
kiran 12
ramu 13
ramesh 14
rakesh 15
ganesh 16
venu 17
srinu 18
;
data temp;
set samp;
retain c1 c2;
if patname='ramu' then c1=patid;
else if patname="ganesh" then c2=patid;
run;
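INFILE Statement:

Identifies an external file to read with an INPUT statement

Some INFILE options:

-DELIMITER= (alias DLM=)
specifies the character that separates the data values.
-DSD
sets the comma as the default delimiter and treats two
consecutive delimiters as a missing value.
-FIRSTOBS=record-number
specifies the record number at which SAS begins reading
input data records in the input file.
-OBS=record-number
specifies the last record to read; used together with
FIRSTOBS= to read a range of records.
-MISSOVER
sets the remaining variables to missing when a record is too
short, instead of reading values from the next record.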
data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
infile datalines dsd delimiter='*';
input X Y Z;
datalines;
1*2*3
4*5*6
7*8*9
;
data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
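
FIRSTOBS= and OBS= are listed above without an example. The following minimal sketch uses made-up records in which the first line is a header; the data set name subset is illustrative.

```sas
data subset;
   /* Skip record 1 (the header) and stop after record 3. */
   infile datalines dlm=',' firstobs=2 obs=3;
   input x y z;
   datalines;
x,y,z
1,2,3
4,5,6
7,8,9
;
run;

proc print data=subset;
run;
```

Only records 2 and 3 are read, so subset contains two observations.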
Sum Statement:

Adds the result of an expression to an accumulator variable

data cnt;
set demo3;
by eid;
if first.eid then ct=0;
ct+1;
if last.eid;
run;
END Statement:

Ends a DO group

do;
statements
end;
DO Statement, Iterative:

Executes statements between DO and END repetitively
based on the value of an index variable. Do not change the
index variable inside the loop.

data tc;
do i=1 to 10;
total+i;
end;
run;
DO UNTIL Statement:

Executes statements in a DO loop repetitively until a
condition is true. The condition is checked at the bottom of
the loop, so the loop runs at least once.

Data temp;
n=0;
do until(n>=8);
put n=;
n+1;
end;
run;
DO WHILE Statement:

Executes statements repetitively while a condition is true.
The condition is checked at the top of the loop.

data _null_;
n=0;
do while(n<5);
put n=;
n+1;
end;
run;
MERGE Statement:

Joins observations from two or more SAS data sets into
single observations

Example 1: One-to-One Merging. This example shows how
to combine observations from two data sets into a single
observation in a new data set:

data benefits.qtr1;
merge benefits.jan benefits.feb;
run;
Example 2: Match-Merging. This example shows how to
combine observations from two data sets into a single
observation in a new data set according to the values of
a variable that is specified in the BY statement:

data inventry;
merge stock orders;
by partnum;
run;
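
The difference between the two kinds of merge is easiest to see on the same pair of data sets. This is a minimal sketch with made-up data; the data set names jan, feb, one_to_one, and by_id and the variable names are illustrative.

```sas
data jan;
   input id jan_amt;
   datalines;
1 100
2 200
;
run;

data feb;
   input id feb_amt;
   datalines;
2 250
3 350
;
run;

/* One-to-one merge: observations are paired by position, and the
   id value from feb overwrites the id value from jan. */
data one_to_one;
   merge jan feb;
run;

/* Match-merge: observations are paired by the value of id, and
   unmatched ids get missing values for the other variables. */
data by_id;
   merge jan feb;
   by id;
run;

proc print data=one_to_one;
run;

proc print data=by_id;
run;
```

The one-to-one merge produces two observations and silently mixes values that belong to different ids, while the match-merge produces three observations, one for each id.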
FUNCTIONS:

Definition of Functions:

A SAS function performs a computation or system
manipulation on arguments and returns a value.

Syntax of Functions:
The syntax of a function is
function-name (argument-1<, ...argument-n>)
Scan Function:

The SCAN function extracts a word from a string.

Example: In the example below, the SCAN function extracts the
third word, 'ram', from sd and stores it in a new variable cs.

data n;
sd='hari kris ram ganesh';
cs=scan(sd,3);
run;
INDEX Function:

The INDEX function searches a string for a substring and
returns the position of its first occurrence.

Example: In the example below, the INDEX function finds the
position of the first 'r' (position 10) and stores it in a new
variable sc.

Data samp;
gd='ganesh hari';
sc=index(gd,'r');
Run;
LENGTH Function:

Returns the length of a character string.

Example:
In the example below, it stores the length of 'ganesh hari'
(11) in a new variable ds.

data temp3;
fg='ganesh hari';
ds=length(fg);
run;
SUBSTR Function:

Extracts a substring from a character string.

Example: In the example below, it extracts 5 characters starting
at position 1 ('ganes') and stores them in a new variable dp.

Data samp1;
sg='ganesh';
dp=substr(sg,1,5);
Run;
UPCASE Function:

Translates letters from lowercase to uppercase.

Example: In the example below, 'ganesh' is translated to
'GANESH'.

DATA temp1;
ps='ganesh';
tp=upcase(ps);
Run;
LOWCASE Function:

Translates letters from uppercase to lowercase.

Example: In the example below, 'GANESH' is translated to
'ganesh'.

Data temp2;
rr='GANESH';
ss=lowcase(rr);
Run;
LEFT Function:

Left-aligns a SAS character expression by moving leading
blanks to the end of the value

Data demo1;
a='   DUE DATE';
b=left(a);
put b;
Run;

MAX Function:

Returns the largest value

Data acnt;
x=max(8,3);
Run;
MEAN Function:

Returns the arithmetic mean (average) of the nonmissing
arguments

Data demo2;
x1=mean(2,.,.,6);   /* 4: missing values are ignored */
Run;

Data demo3;
x2=mean(1,2,3,2);   /* 2 */
Run;

MIN Function:

Returns the smallest value

x=min(7,4);   /* 4 */

MOD Function:

Returns the remainder from the division of the first argument
by the second argument, fuzzed to avoid most unexpected
floating-point results

x1=mod(10,3);
put x1;   /* writes 1 */
INPUT Function:

Use INPUT to convert character values to numeric values. The
INPUT function enables you to read the value of source by
using a specified informat. If the INPUT function returns a
value to a variable that has not yet been assigned a length,
by default the variable length is determined by the width of
the informat.
data testin;
input sale $9.;
fmtsale=input(sale,comma9.);
datalines;
2,115,353
;
run;
PUT Function:

Use PUT to convert a numeric value to a character value.

If the PUT function returns a value to a variable that has
not yet been assigned a length, by default the variable
length is determined by the width of the format.

The format must be the same type (numeric or character)
as the value of source. The result of the PUT function is
always a character string.
data temp;
num=15;
char=put(num,hex2.);   /* char is the character string '0F' */
run;
SYSTEM OPTIONS:

Definition of System Options

System options are instructions that affect your SAS session.

DATE System Option:

Prints the date and time that the SAS session was initialized

FIRSTOBS= System Option:

Specifies which observation or record SAS processes first

options firstobs=11;
data a;
set old; /* 100 observations */
run;
OBS= System Option:

Specifies when to stop processing observations or records

options firstobs=2 obs=12;
proc print data=Ages;
run;

PAGESIZE= System Option:

Specifies the number of lines that compose a page of SAS output

PAGENO= System Option:

Resets the page number
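
The page-related options have no example above. A minimal sketch follows; the values 60 and 80 are arbitrary choices, and SASHELP.CLASS is used only because it is a sample data set that ships with SAS.

```sas
/* Suppress the date in the page header, restart page numbering at 1,
   and set the page length and line width of the listing output. */
options nodate pageno=1 pagesize=60 linesize=80;

proc print data=sashelp.class;
run;
```

System options stay in effect until they are changed again or the SAS session ends.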
