
ST 307 Activity 3

Due: Sunday 2 February 2020 at 11:55pm

For this activity you will create a SAS program and upload that program to Moodle.
● Everyone must submit their own code (any cheating will be reported to Student
Conduct).
● Be sure that your SAS file adheres to the SAS file submission guidelines
(available on Moodle).
● You MUST submit a .SAS file or you will receive a ZERO (no exceptions)!

You can save the program at any time. If you’d like to work on it more before submitting,
save it and email it to yourself (or use Google Drive, Dropbox, etc.). Alternatively,
upload it to Moodle, then download it again and replace it with the finished copy before
the deadline.

Dataset
For this activity, we will be using the built-in SAS help data set called Heart
(sashelp.heart). More information on this data set can be found at this website (p. 69):
https://support.sas.com/documentation/tools/sashelpug.pdf

Task 1: Conceptual questions (3 points)


After the header in your program, answer the following questions in comments. Answers
should be no more than 2 words!

1. True or False: SASHELP pages can assist in learning syntax for options in PROC
steps. (1pt)
2. True or False: Considering the variable type (quantitative/qualitative) is important
when deciding what type of graph to choose in PROC SGPLOT. (1pt)
3. Suppose you have a variable called ‘class’ that has five categories: 0 (Freshman), 1
(Sophomore), 2 (Junior), 3 (Senior), and 4 (Other). This variable would be classified
as “Num” in SAS. True or False: It makes sense to take the mean of this variable. (1pt)

Task 2: Analyze the dataset (29 points)

Write separate code for each step below; that is, do not modify your code for step 1 to
produce step 2 (you can copy and paste it so you don’t have to retype it, but leave the
answer to each step in your program). If you do not have code for every question (clearly
labeled with the question number above it), you will not receive points for that question:

1. Use an appropriate PROC step that allows you to see the list of variables in the
dataset. Use this PROC step on the sashelp.heart data set to answer the following
question. (2pt)

a) In a comment under your code, answer: What is the variable name that
corresponds to the label “Blood Pressure Status”? (1pt)

2. Print out the data set using an appropriate PROC step and the following options
(HINT: SAS Help - PROC PRINT): (2pt)
● No observation numbers will print (we learned this in Activity 1) (1pt)
● The option that specifies to use the variables' labels as column headings (1pt)

3. Create a graph of the variable WEIGHT using an appropriate PROC step. Since we
know that WEIGHT is quantitative, we want to create a horizontal boxplot (box and
whisker plot) of the WEIGHT variable with the following options (HINT: Search “SAS
9.4 PROC SGPLOT hbox statement options”. This will help you look at the syntax of
these options.): (3pt)
● Change the fill color of the box to any color (here is a list of all the SAS
Colors) (1pt)
● Hide the mean marker (1pt)
b) Answer in a comment: Based on the boxplot, roughly what is the median of the
weight variable? (1pt)

4. Now we want to look at a qualitative variable. Let's consider the variable
DEATHCAUSE. First, let's graph this variable using an appropriate PROC step to
make a vertical bar chart of the variable. Do this using the following options (you can
use the hint in 3 to help you again, just change hbox to vbar): (3pt)
● Change the fill color of the bars to any color (here is a list of all the SAS
Colors) (1pt)
● Display a label for each bar (1pt)

5. Copy and paste your code for number 4 into number 5. Then add the following
option to your graph. (1pt)
● Accept a missing value as a valid category value (1pt)
b) Write a comment below 5 that describes the difference in interpretation between your
graphs in questions 4 and 5. (1pt)

6. Using an appropriate PROC step, create a two-way contingency table of the
DEATHCAUSE and WEIGHT_STATUS variables (DEATHCAUSE should be the
rows and WEIGHT_STATUS should be the columns). Use the following options: (3pt)
● Treats missing values as nonmissing (1pt)
● Suppresses display of column percentages (1pt)
● Suppresses display of percentages (1pt)

Answer the following questions in comments under your code:

b) Looking at only those who have cancer, which category of weight_status has the
most observations? (1pt)
c) Based on your previous graphs, what do the blank row and column names
represent in the table? (1pt)

Make sure you have a comment prior to each step above that explains what the step
does and has the question number at the beginning! Don’t forget to change the header
as well and use proper spacing (see the SAS file submission guidelines on Moodle for all
requirements). Up to 10 points can be deducted for improper comments or spacing.
Save this program, ensure that it is a SAS file (*.SAS), and submit it to Moodle! If it is
NOT a .SAS file, it will NOT be graded!
