SAS Programming 1: Essentials Quick Reference
SAS System Options
OPTIONS DATE | NODATE;
OPTIONS NUMBER | NONUMBER;
OPTIONS LINESIZE=n | LS=n;
OPTIONS PAGESIZE=n | PS=n;
OPTIONS CENTER | NOCENTER;
OPTIONS DTRESET | NODTRESET;
OPTIONS PAGENO=n;
Reading Instream Data
DATA output-SAS-data-set;
   INPUT specifications;
   DATALINES;
instream data
;
RUN;
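As an illustration of the instream-data syntax above, the following minimal sketch reads three records with list input. The data set name work.staff and the variables Name, Dept, and Salary are hypothetical and chosen only for this example.

```sas
/* Hypothetical data set and variable names, for illustration only. */
data work.staff;
   input Name $ Dept $ Salary;   /* $ marks character variables */
   datalines;
Ann Sales 52000
Raj IT 61000
Lee Sales 48500
;
run;

/* List the rows that were read. */
proc print data=work.staff;
run;
```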
LIBNAME Statement
LIBNAME libref 'SAS-library';
LIBNAME libref engine-name <SAS/ACCESS-options>;
LIBNAME libref CLEAR;
Importing an Excel Worksheet
PROC IMPORT OUT=output-data-set
            DATAFILE='input-excel-workbook'
            DBMS=EXCEL REPLACE;
   RANGE='range-name';
RUN;
Creating an Excel Workbook
LIBNAME libref 'physical-file-name';
DATA output-excel-worksheet;
   SET input-data-set;
RUN;
LIBNAME output-libref 'physical-file-name';
PROC COPY IN=input-libref OUT=output-libref;
   SELECT input-data-set1 input-data-set2;
RUN;
PROC EXPORT DATA=input-data-set
            OUTFILE='output-excel-workbook'
            DBMS=EXCEL REPLACE;
RUN;
Displaying Data Set Information
PROC CONTENTS DATA=SAS-data-set;
RUN;
PROC CONTENTS DATA=libref._ALL_ NODS;
RUN;
Reading Raw Data
DATA output-SAS-data-set-name;
   INFILE 'raw-data-file-name' <DLM='delimiter'>;
   INPUT specifications;
RUN;
INPUT variable <$> variable <:informat>;
Reading and Concatenating Data Sets
DATA output-SAS-data-set-name(s);
   SET input-SAS-data-set-name(s);
   <additional SAS statements>
RUN;
Creating Variables
variable=expression;
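The raw-data, concatenation, and assignment syntax above can be combined as in the sketch below. To keep it self-contained, it reads delimited records from instream lines (INFILE DATALINES) rather than from an external file. All data set names, variable names, and the 8% rate are assumptions made for illustration.

```sas
/* Read pipe-delimited records; the colon modifier applies an informat
   while still honoring the delimiter. Names here are hypothetical. */
data work.orders;
   infile datalines dlm='|';
   input OrderID Customer :$20. OrderDate :date9. Amount :comma10.;
   format OrderDate date9. Amount comma10.2;
   datalines;
1001|Acme Corp|15JAN2010|1,250.50
1002|Globex|03FEB2010|980.00
;
run;

data work.more_orders;
   infile datalines dlm='|';
   input OrderID Customer :$20. OrderDate :date9. Amount :comma10.;
   format OrderDate date9. Amount comma10.2;
   datalines;
1003|Initech|21MAR2010|2,400.00
;
run;

/* Concatenate both data sets and create a new variable. */
data work.all_orders;
   set work.orders work.more_orders;
   Tax = Amount * 0.08;   /* illustrative rate, not from the source */
run;
```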
Copyright 2010 SAS Institute Inc., Cary, NC, USA. All rights reserved.
Appending Data Sets
PROC APPEND BASE=SAS-data-set DATA=SAS-data-set <FORCE>;
RUN;
Interleaving Data Sets
DATA SAS-data-set;
   SET SAS-data-set1 SAS-data-set2;
   BY <DESCENDING> BY-variable(s);
   <additional SAS statements>
RUN;
Merging Data
DATA SAS-data-set;
   MERGE SAS-data-set1 SAS-data-set2;
   BY <DESCENDING> BY-variable(s);
   <additional SAS statements>
RUN;
Sorting Data
PROC SORT DATA=input-SAS-data-set <OUT=output-SAS-data-set>;
   BY <DESCENDING> BY-variable(s);
RUN;
Functions
WEEKDAY(SAS-date)
YEAR(SAS-date)
QTR(SAS-date)
MONTH(SAS-date)
TODAY()
MDY(month, day, year)
UPCASE(argument)
SUM(argument1, argument2, ...)
SAS Data Set Options
SAS-data-set (DROP=variable-list)
SAS-data-set (KEEP=variable-list)
SAS-data-set (IN=variable)
SAS-data-set (RENAME=(old-name-1=new-name-1 old-name-2=new-name-2 ... old-name-n=new-name-n))
Printing Data
PROC PRINT DATA=SAS-data-set <option(s)>;
   VAR variable(s);
   BY BY-variable(s);
RUN;
Conditional Processing in the DATA Step
IF expression THEN DO;
   executable statements
END;
ELSE IF expression THEN DO;
   executable statements
END;
IF expression THEN DELETE;
Procedures for Data Summarization
PROC FREQ DATA=SAS-data-set <option(s)>;
   TABLES variable(s) </ option(s)>;
   <additional statements>
RUN;
PROC MEANS DATA=SAS-data-set <statistic(s)> <option(s)>;
   CLASS classification-variable(s);
   VAR analysis-variable(s);
RUN;
PROC SUMMARY DATA=SAS-data-set <statistic(s)> <option(s)>;
   VAR analysis-variable(s);
   CLASS classification-variable(s);
RUN;
PROC UNIVARIATE DATA=SAS-data-set NEXTROBS=n;
   VAR variable(s);
RUN;
PROC TABULATE DATA=SAS-data-set <option(s)>;
   CLASS classification-variable(s);
   VAR analysis-variable(s);
   TABLE page-expression, row-expression, column-expression </ option(s)>;
   <additional statements>
RUN;
Formatting Data and Variable Names
LABEL variable1='label1' variable2='label2' ...;
FORMAT variable(s) format;
LENGTH variable(s) <$> length;
PROC FORMAT;
   VALUE format-name value-or-range1='formatted-value1'
                     value-or-range2='formatted-value2'
                     ...;
RUN;
Subsetting Data
WHERE where-expression;
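A minimal sketch of how PROC FORMAT, FORMAT, LABEL, and WHERE might work together. It assumes the SASHELP.CLASS sample data set that ships with SAS is available. The format name agegrp and the label text are hypothetical.

```sas
/* Define a user-defined numeric format (hypothetical name: agegrp). */
proc format;
   value agegrp low-12  = 'Pre-teen'
                13-high = 'Teen';
run;

/* Apply the format and a label, and subset the rows with WHERE. */
proc print data=sashelp.class label;
   where Sex='F';
   var Name Age Height;
   label Height='Student Height';
   format Age agegrp. Height 5.1;
run;
```

The format changes only how Age is displayed. The stored values in the data set are unchanged.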
Output Delivery System (ODS)
ODS destination FILE='file-specification' <STYLE=style-definition>;
   SAS code generating output
ODS destination CLOSE;
ODS _ALL_ CLOSE;
Creating Graphs
GOPTIONS <options-list>;
PROC GCHART DATA=SAS-data-set;
   chart-form chart-variable(s) </ option(s)>;
RUN;
QUIT;
PROC GPLOT DATA=SAS-data-set;
   PLOT vertical-variable*horizontal-variable </ option(s)>;
   SYMBOL<1...255> <options>;
RUN;
QUIT;
Titles and Footnotes
TITLEn 'text';
FOOTNOTEn 'text';
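The sketch below ties together several statements from the preceding pages: PROC SORT, MERGE with the IN= data set option, IF-THEN DELETE, and PROC PRINT with a title. The data sets work.emp and work.pay and their variables are hypothetical.

```sas
/* Two small hypothetical data sets that share the key EmpID. */
data work.emp;
   input EmpID Name $;
   datalines;
3 Lee
1 Ann
2 Raj
;
run;

data work.pay;
   input EmpID Salary;
   datalines;
1 52000
3 48500
4 70000
;
run;

/* MERGE with BY requires both inputs sorted by the BY variable. */
proc sort data=work.emp;
   by EmpID;
run;

proc sort data=work.pay;
   by EmpID;
run;

/* IN= creates temporary flags that show which input contributed
   to the current observation. */
data work.both;
   merge work.emp(in=inEmp) work.pay(in=inPay);
   by EmpID;
   if not (inEmp and inPay) then delete;   /* keeps EmpID 1 and 3 only */
run;

title1 'Employees With Salary Records';
proc print data=work.both noobs;
   var EmpID Name Salary;
run;
title1;   /* clear the title */
```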
Operators
Arithmetic Operators
Operator  Action           Example        Priority
-         negative prefix  negative=-x;   I
**        exponentiation   raise=x**y;    I
*         multiplication   mult=x*y;      II
/         division         divide=x/y;    II
+         addition         sum=x+y;       III
-         subtraction      diff=x-y;      III
Logical Operators
Operator  Meaning
AND or &  and, both. If both expressions are true, then the compound expression is true.
OR or |   or, either. If either expression is true, then the compound expression is true.
Comparison Operators
Symbol(s)  Mnemonic  Definition
=          EQ        equal to
^= ¬= ~=   NE        not equal to
>          GT        greater than
<          LT        less than
>=         GE        greater than or equal to
<=         LE        less than or equal to
           IN        equal to one of a list
Special WHERE Statement Operators
Mnemonic      Definition
BETWEEN-AND   inclusive range
IS NULL       missing value
IS MISSING    missing value
CONTAINS (?)  character string
LIKE          character pattern
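A short sketch of the operators above in use, assuming the SASHELP.CLASS sample data set is available. The first step shows arithmetic priority. The second combines comparison, logical, and special WHERE operators.

```sas
/* Priority: exponentiation first, then multiplication, then addition. */
data _null_;
   result = 2 + 3 * 4**2;   /* 4**2=16, 3*16=48, 2+48=50 */
   put result=;
run;

/* BETWEEN-AND is inclusive; LIKE uses % as a wildcard for any
   sequence of characters. */
proc print data=sashelp.class;
   where (Age between 12 and 14)
         and Sex in ('F', 'M')
         and Name like 'J%';
run;
```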
Formats and Informats
Commonly Used Formats
Format      Definition
$w.         writes standard character data.
w.d         writes standard numeric data.
COMMAw.d    writes numeric values with a comma that separates every three digits and a period that separates the decimal fraction.
COMMAXw.d   writes numeric values with a period that separates every three digits and a comma that separates the decimal fraction.
DOLLARw.d   writes numeric values with a leading dollar sign, a comma that separates every three digits, and a period that separates the decimal fraction.
EUROXw.d    writes numeric values with a leading euro symbol (€), a period that separates every three digits, and a comma that separates the decimal fraction.
SAS Date Values and SAS Date Formats
Format      Stored Value  Displayed Value
MMDDYY6.    0             010160
MMDDYY8.    0             01/01/60
MMDDYY10.   0             01/01/1960
DDMMYY6.    365           311260
DDMMYY8.    365           31/12/60
DDMMYY10.   365           31/12/1960
DATE7.      -1            31DEC59
DATE9.      -1            31DEC1959
WORDDATE.   0             January 1, 1960
WEEKDATE.   0             Friday, January 1, 1960
MONYY7.     0             JAN1960
YEAR4.      0             1960
Commonly Used Informats
Informat             Definition
$w.                  reads standard character data.
w.d                  reads standard numeric data.
COMMAw.d, DOLLARw.d  reads nonstandard numeric data and removes embedded commas, blanks, dollar signs, percent signs, and dashes.
COMMAXw.d            reads nonstandard numeric data and removes embedded periods, blanks, dollar signs, percent signs, and dashes.
EUROXw.d             reads nonstandard numeric data and removes embedded characters in European currency.
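To see a few of these formats and informats applied, a sketch like the following writes values to the SAS log with PUT statements. The variable names and the sample amount are hypothetical.

```sas
data _null_;
   d = 0;                    /* SAS date value 0 is January 1, 1960 */
   put d mmddyy10.;
   put d date9.;
   put d weekdate.;

   amount = 1234567.891;     /* illustrative value */
   put amount comma14.2;
   put amount dollar15.2;
   put amount commax14.2;

   /* The COMMAw.d informat strips the dollar sign and the comma. */
   value = input('$1,234.50', comma10.);
   put value=;
run;
```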
SAS Functions
SAS Date Functions
These date functions extract date information from the date value that SAS stores.
Date Function      Value Extracted       Value Returned
YEAR(SAS-date)     the year              a four-digit year
QTR(SAS-date)      the quarter           a number from 1 to 4
MONTH(SAS-date)    the month             a number from 1 to 12
DAY(SAS-date)      the day of the month  a number from 1 to 31
WEEKDAY(SAS-date)  the day of the week   a number from 1 to 7 (1=Sunday, 2=Monday, and so on)
These date functions create a SAS date value.
Date Function        SAS Date Value Created
TODAY()              the current date
MDY(month,day,year)  a date with numeric month, day, and year
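A minimal sketch of the date functions above. The date and the variable names are chosen only for illustration.

```sas
data _null_;
   d  = mdy(7, 4, 2010);     /* build a SAS date from month, day, year */
   yr = year(d);             /* 2010 */
   q  = qtr(d);              /* 3 */
   m  = month(d);            /* 7 */
   dd = day(d);              /* 4 */
   wd = weekday(d);          /* 1=Sunday, 2=Monday, and so on */
   now = today();            /* current date as a SAS date value */
   put d= date9. yr= q= m= dd= wd= now= date9.;
run;
```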
Statistical Functions
Function  Syntax                         Calculates
SUM       sum(argument, argument,...)    sum of values
MEAN      mean(argument, argument,...)   average of non-missing values
MIN       min(argument, argument,...)    minimum value
MAX       max(argument, argument,...)    maximum value
VAR       var(argument, argument,...)    variance of the values
STD       std(argument, argument,...)    standard deviation of the values
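One practical difference worth illustrating: these functions ignore missing arguments, whereas the arithmetic operators propagate them. The sketch below uses hypothetical variable names.

```sas
data _null_;
   a = 10;
   b = .;                     /* missing value */
   c = 20;
   total_fn = sum(a, b, c);   /* missing ignored: 30 */
   total_op = a + b + c;      /* missing propagates: result is missing */
   avg      = mean(a, b, c);  /* average of non-missing values: 15 */
   put total_fn= total_op= avg=;
run;
```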
Using PROC APPEND
Comparing PROC APPEND and the SET Statement
Criterion: Speed
   PROC APPEND: Is faster because it does not process observations in the BASE= data set.
   SET Statement: Is slower because it processes all observations in all input data sets.
Criterion: Number of data sets
   PROC APPEND: Is limited to two input data sets in one PROC APPEND step.
   SET Statement: Can concatenate any number of input data sets in one DATA step.
Criterion: Combining data sets that contain different variables
   PROC APPEND: Uses all variables in the BASE= data set. If necessary, assigns missing values to observations from the DATA= data set. Drops any variables found only in the DATA= data set.
   SET Statement: Uses all variables in all input data sets. If necessary, assigns missing values.
When to Use the FORCE Option in PROC APPEND
Situation: DATA= data set variables are not in the BASE= data set.
   What SAS Does: Drops the variable not present in the BASE= data set.
Situation: DATA= data set variables have a different type than the variables in the BASE= data set.
   What SAS Does: Replaces all values for the variable in the DATA= data set with missing values and keeps the variable type of the variable specified in the BASE= data set.
Situation: DATA= data set variables are longer than the variables in the BASE= data set.
   What SAS Does: Truncates values from the DATA= data set to fit them into the length that is specified in the BASE= data set.
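The following sketch sets up two of the situations described above: an extra variable in the DATA= data set and a longer character variable. The data set and variable names are hypothetical.

```sas
/* BASE= data set: Name has length 5. */
data work.base;
   length Name $ 5;
   input Name $ Score;
   datalines;
Ann 90
;
run;

/* DATA= data set: Name is longer and Bonus does not exist in BASE=. */
data work.new;
   length Name $ 10;
   input Name $ Score Bonus;
   datalines;
Alexander 85 5
;
run;

/* FORCE lets the append proceed: Bonus is dropped and Name values
   are truncated to the BASE= length. */
proc append base=work.base data=work.new force;
run;

proc print data=work.base;
run;
```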
PROC MEANS Statistic Keywords
Descriptive Statistics
Keyword        Description
CLM            two-sided confidence limit for the mean
CSS            corrected sum of squares
CV             coefficient of variation
KURTOSIS       kurtosis
LCLM           one-sided confidence limit below the mean
MAX            maximum value
MEAN           average
MIN            minimum value
N              number of observations with non-missing values
NMISS          number of observations with missing values
RANGE          range
SKEWNESS       skewness
STDDEV / STD   standard deviation
STDERR         standard error of the mean
SUM            sum
SUMWGT         sum of the weight variable values
UCLM           one-sided confidence limit above the mean
USS            uncorrected sum of squares
VAR            variance
Quantile Statistics
Keyword        Description
MEDIAN / P50   median or 50th percentile
P1             1st percentile
P5             5th percentile
P10            10th percentile
Q1 / P25       lower quartile or 25th percentile
Q3 / P75       upper quartile or 75th percentile
P90            90th percentile
P95            95th percentile
P99            99th percentile
QRANGE         difference between upper and lower quartiles: Q3-Q1
Hypothesis Testing
Keyword        Description
PROBT          probability of a greater absolute value for the t value
T              Student's t for testing the hypothesis that the population mean is 0
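As a sketch of how these keywords are requested, the step below lists several of them in the PROC MEANS statement. It assumes the SASHELP.CLASS sample data set is available. The choice of statistics is arbitrary.

```sas
/* Request specific statistics instead of the default set. */
proc means data=sashelp.class n nmiss mean median std min max maxdec=2;
   class Sex;             /* one set of statistics per value of Sex */
   var Height Weight;
run;
```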
PROC TABULATE Statistic Keywords
Descriptive Statistics
Keyword          Description
COLPCTN          percentage of a value in a single cell in relation to the total values in the column
COLPCTSUM        percentage of a sum in a single cell in relation to the total sum in the column
CSS              sum of squares corrected for the mean
CV               percent coefficient of variation
KURTOSIS | KURT  kurtosis
LCLM             one-sided confidence limit below the mean
MAX              maximum value
MEAN             average
MIN              minimum value
MODE             most frequent value
N                number of observations with non-missing values
NMISS            number of observations with missing values
PAGEPCTN         percentage of a value in a single cell in relation to the total of the values in the page
PAGEPCTSUM       percentage of a sum in a single cell in relation to the total of the values in the page
PCTN             percentage that one frequency represents of another frequency (can specify a denominator definition)
PCTSUM           percentage that one sum represents of another sum (can specify a denominator definition)
RANGE            range
REPPCTN          percentage of a value in a single cell in relation to the total of the value in the report
REPPCTSUM        percentage of a sum in a single cell in relation to the total of the value in the report
ROWPCTN          percentage of a value in a single cell in relation to the total values in the row
ROWPCTSUM        percentage of a sum in a single cell in relation to the total sum in the row
SKEWNESS | SKEW  skewness
STDDEV | STD     standard deviation
STDERR           standard error of the mean
SUM              sum
SUMWGT           sum of the weights
UCLM             one-sided confidence limit above the mean
USS              uncorrected sum of squares
VAR              variance
Quantile Statistics
Keyword          Description
MEDIAN | P50     median or 50th percentile
P1               1st percentile
P5               5th percentile
P10              10th percentile
Q1 | P25         lower quartile or 25th percentile
Q3 | P75         upper quartile or 75th percentile
P90              90th percentile
P95              95th percentile
P99              99th percentile
QRANGE           interquartile range (difference between upper and lower quartiles)
Hypothesis Testing
Keyword          Description
PROBT | PRT      probability of a greater absolute value for the t value
T                Student's t for testing the hypothesis that the population mean is 0
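A brief sketch of PROC TABULATE statistic keywords in a TABLE statement, assuming the SASHELP.CLASS sample data set is available. The layout is only one of many possible arrangements.

```sas
proc tabulate data=sashelp.class;
   class Sex Age;         /* classification variables */
   var Height;            /* analysis variable */
   /* Rows: each Age plus a total row.
      Columns: count and column percentage by Sex, then the mean
      and median of Height. */
   table Age all,
         Sex*(n colpctn) Height*(mean median);
run;
```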