
SOR 202

SAS Laboratory Session 2

SAS Libraries and Manipulating Datasets: Tutorial 2
SAS Libraries
As already indicated (see page 3 of Tutorial 1), you automatically have full access to:
• A temporary library - the Work library
• A permanent library - the Sasuser library.

Creating a new library
You can also create additional SAS data libraries in which to store and access your SAS datasets. For example, suppose that within your H drive you wish to store all your SAS datasets in one folder, namely “SOR202 SAS Datasets”, a sub-folder of the folder “SOR202” (i.e. H:\SOR202\SOR202 SAS Datasets).

1. Create the folder where you wish to store your SAS datasets e.g. create the folder “SOR202 SAS Datasets” as a sub-folder of the folder “SOR202” within your H drive.
2. Add the SAS datasets (available through Queen’s Online - in the subdirectory “SAS Session 2”) to the folder you have just created.

By adding a new library (within SAS) which accesses this folder, you will then be able to access these SAS datasets easily from SAS. A new SAS library (which could be called SOR202LB) can be created in one of the following ways:

a. Using the Explorer window:
   i. Select the “Explorer” Window and view the active libraries (within Libraries).
   ii. Right click on the blank space within the Explorer window and select “New…” A “New Library” Window should appear.
   iii. Input the name of the new SAS library within the name box (the name must be 8 characters or fewer, with no spaces) e.g. SOR202LB, and use the Browse button to select the folder within your home directory where you wish to store and access your SAS datasets. The option “Enable at startup” should be selected; this will ensure that this library will be available each time you open SAS.
   iv. The new library “SOR202LB” should now appear alongside the other active libraries.

b. Alternatively this new library could have been created by submitting the following piece of SAS code:
libname SOR202LB "H:\SOR202\SOR202 SAS Datasets"; run;
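
To check that the library was assigned as intended, a sketch along the following lines may help. It assumes that the folder path used above exists on your H drive. The LIST option asks SAS to write the library's attributes to the Log window, and the CLEAR option (commented out here) would remove the assignment again.

```sas
/* Assign the library (same path as in the text; assumed to exist) */
libname SOR202LB "H:\SOR202\SOR202 SAS Datasets";

/* Write the library's attributes (e.g. engine, physical path) to the Log */
libname SOR202LB list;

/* To remove the assignment later, uncomment the next line */
/* libname SOR202LB clear; */
```

If the path is mistyped, the Log window should show a note or error for the libname statement, so it is worth checking the Log after submitting.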

If you double-click on the library “Sor202lb” in the Explorer Window you should be able to see the SAS datasets you have just downloaded.


Accessing SAS Datasets within Libraries
In order to access SAS datasets found within SAS libraries (other than the “Work” library), you should use the general form of a SAS dataset name:
libref.SAS-dataset-name
e.g. Sor202lb.diabetes
• The first level, libref, is the SAS library in which the dataset can be found.
• The second level, SAS-dataset-name, is the actual name of the SAS dataset.

Browsing a SAS library/dataset: Proc Contents
A Proc Contents statement can be used to view the details of a SAS dataset. e.g. To view the contents of the SAS dataset “diabetes” submit the following portion of code:
proc contents data=Sor202lb.diabetes; run;

Information about the contents of this dataset should be written to the “Output” Window.

Notice in particular that the output includes information on the number of observations and variables within the dataset. Attributes for each of the variables within the dataset (e.g. whether they are character/numeric and their maximum length) are also listed at the bottom of the output.


A Proc Contents statement can also be used to browse a SAS library. e.g. Submit the following portion of code:
proc contents data=Sor202lb._all_ nods; run;

Information about the contents of the library should be written to the “Output” Window.

• The _all_ keyword tells SAS to consider all the SAS datasets in the library Sor202lb.
• The nods option suppresses the descriptor portion, i.e. the detailed information on each individual dataset, so that only the library directory is listed.
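
As a further illustration, the sketch below shows two other PROC CONTENTS options that may be useful when browsing a dataset. These options are not covered in the tutorial itself and are offered only as a suggestion: varnum lists the variables in the order in which they are stored rather than alphabetically, and short prints only a compact list of variable names.

```sas
/* List the variables in their stored order (default is alphabetical) */
proc contents data=Sor202lb.diabetes varnum;
run;

/* Print only a short list of the variable names */
proc contents data=Sor202lb.diabetes short;
run;
```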

Modifying contents of existing SAS datasets/creating new datasets from existing SAS datasets
So far, we have used the DATA statement to create a new SAS dataset, where the data was input from:
• Data files (e.g. Excel files, text files etc.)
• Keyboard.

We can also use the DATA statement to:
1. create a new SAS dataset from an existing SAS dataset,
2. modify the contents of an existing SAS dataset.

Creating a new dataset from an existing SAS dataset
The method of creating a new SAS dataset from an existing one is best shown through an example. e.g. To create a new SAS dataset called “diabetesv1”, to be stored in the library “SOR202LB”, which is a replica of the SAS dataset “diabetes”, submit the following code:
data sor202lb.diabetesv1;
  set sor202lb.diabetes;
run;

• The name of the SAS dataset being created (diabetesv1) is given in the data statement.
• The name of the SAS dataset being read (diabetes) is given in the set statement.
• By default, the set statement reads in all the rows and columns from the input SAS dataset. If you wish to include only a sub-set of the variables in the original dataset in the newly created dataset, the keep= or drop= dataset options can be used alongside the set statement.
  o e.g. If you wished to make a new SAS dataset called “diabetesv2” which only retained information on the variables “height” and “weight”, the following code should be submitted:
data sor202lb.diabetesv2;
  set sor202lb.diabetes (keep=height weight);
run;

o e.g. If you wished to make a new SAS dataset called “diabetesv3” which retained all the variables except “age”, the following code should be submitted:
data sor202lb.diabetesv3;
  set sor202lb.diabetes (drop=age);
run;
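
The keep= and drop= options can also be attached to the dataset named in the data statement rather than the one in the set statement. The sketch below, which is an alternative not shown in the tutorial, contrasts the two placements using the same “height” and “weight” variables; in this simple case both versions should produce the same “diabetesv2” dataset.

```sas
/* keep= on the SET statement:
   only height and weight are read in from diabetes */
data sor202lb.diabetesv2;
  set sor202lb.diabetes (keep=height weight);
run;

/* keep= on the DATA statement:
   all variables are read in, but only height and weight
   are written to the output dataset */
data sor202lb.diabetesv2 (keep=height weight);
  set sor202lb.diabetes;
run;
```

The placement matters once assignment statements are added: a variable dropped on the set statement is never read in, so it cannot be used in a later calculation within the same data step.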

Modifying existing variables
Additional assignment statements can be added to a data step, after the set statement, in order to modify existing variables and/or create new ones. They are of the form:
variable=expression;

Again, this is best illustrated through an example. e.g. To create a new SAS dataset called “diabetesv4” (to be stored in the library “SOR202LB”) which is a replica of the SAS dataset “diabetes” but with an additional variable “weightstones”, which records each person’s weight in stones, submit the following code:
data sor202lb.diabetesv4;
  set sor202lb.diabetes;
  weightstones=weight/14;
run;

• The above statements create a new SAS dataset called “diabetesv4” from the SAS dataset “diabetes”.
• SAS uses the variable “weight”, which is part of the original “diabetes” dataset, to compute a new variable called “weightstones”.
• Note that the variable “weight” in the dataset corresponds to a person’s weight in pounds, and needs to be divided by 14 to compute a person’s weight in stones, hence the expression used.
• Notice that both the variables “weight” and “weightstones” are part of the newly created dataset.
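
An assignment statement can also overwrite a variable that already exists. The sketch below is one possible illustration: the dataset name “diabetesv5” and the variable “weightkg” are hypothetical names chosen for this example, and the ROUND function is used here only to show an existing variable being replaced by a modified value.

```sas
data sor202lb.diabetesv5;                   /* diabetesv5 is a hypothetical name */
  set sor202lb.diabetes;
  weightstones = weight/14;                 /* pounds to stones, as in the text */
  weightkg = weight*0.45359237;             /* pounds to kilograms (hypothetical extra variable) */
  weightstones = round(weightstones, 0.1);  /* overwrite weightstones, rounded to 1 decimal place */
run;
```

Statements within a data step are executed in order for each observation, so the rounding line must come after the line that first creates “weightstones”.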


Operators are used in assignment statements to perform basic arithmetic calculations.

Operator   Action            Example          Priority
**         Exponentiation    Raise=x**2;      I
-          Negative prefix   Negative=-x;     I
*          Multiplication    Mult=x*y;        II
/          Division          Divide=x/y;      II
+          Addition          Sum=x+y;         III
-          Subtraction       Diff=x-y;        III

• Operations of priority I are performed before operations of priority II, and so on.
• Consecutive operations with the same priority are performed:
  o From right to left if priority I
  o From left to right if priority II or III.
• Parentheses or brackets can be used to control the order of operations e.g.
  o New=(x-10)*y;

This statement ensures that, for each observation in the dataset, 10 is first subtracted from the x value for that observation before the result is multiplied by the observation’s y value to create the variable new. If the brackets were not present, the variable new would instead correspond to 10 times the observation’s y value being subtracted from the observation’s x value.
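
The priority rules above can be checked with a small experiment. The sketch below is only an illustration: the values x=12 and y=3 and all variable names are made up for this example, and it uses data _null_ (a data step that creates no dataset) together with a put statement to write the results to the Log window.

```sas
data _null_;                        /* _null_: run the step without creating a dataset */
  x = 12;                           /* illustrative values only */
  y = 3;
  with_brackets    = (x-10)*y;      /* (12-10)*3 = 6 */
  without_brackets = x-10*y;        /* 12-(10*3) = -18 */
  right_to_left    = 2**3**2;       /* priority I, right to left: 2**(3**2) = 512 */
  left_to_right    = 20/5*2;        /* priority II, left to right: (20/5)*2 = 8 */
  put with_brackets= without_brackets= right_to_left= left_to_right=;
run;
```

Comparing the values written to the Log against the comments is a quick way to confirm how an expression is evaluated before using it on a real dataset.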

Proc Print Procedure
To print the data contained in a SAS dataset to the “Output” Window, a PROC PRINT statement can be used e.g. To print the data within the “diabetes” dataset, submit the following code:
proc print data=sor202lb.diabetes; run;
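
For a large dataset it may be more convenient to print only part of it. The sketch below is a suggestion that goes slightly beyond the tutorial: the obs= dataset option limits the number of observations read, and the var statement selects which variables are printed. The choice of 5 observations is arbitrary, and the variables “height” and “weight” are those referred to earlier.

```sas
/* Print only the first 5 observations, and only height and weight */
proc print data=sor202lb.diabetes (obs=5);
  var height weight;
run;
```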
