
2017 Practical Methods of Secondary Data Analysis

Assignment 1 (10% of the total grade)


Due date: April 13, 2017, 24:00

Complete the following tasks, and submit the SAS code and results.
Using the NHI teaching database for 2001-2005, compute the following for men and women
separately:
1. the distribution of the calendar year of the first outpatient visit during 2001 to 2005.
2. descriptive statistics of age at the first outpatient visit, including
(a) mean and standard deviation (SD)
(b) frequency distribution, i.e., number of subjects and percentage, of age
classified into the following categories: 0-<18, 18-<35, 35-<45, 45-<55,
55-<65, >=65.

Note:
1. Use sex and birth date (to calculate age) from the ID dataset.
2. A year is defined as 365.25 days.
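
As an illustration of Note 2, the following minimal sketch converts two character dates to SAS date values and divides the difference in days by 365.25. It assumes that id_birthday and func_date are stored as 8-character YYYYMMDD strings; that layout, the dataset name age_demo, and the data values are assumptions made for this example, so check the actual variable formats in the teaching database first.

```sas
/* Sketch: age in years from two character dates, with 1 year = 365.25 days. */
/* Assumption: both dates are 8-character YYYYMMDD strings.                  */
/* The records below are invented for illustration only.                     */
data age_demo;
    length id_birthday func_date $8;
    input id_birthday $ func_date $;
    birth_dt = input(id_birthday, yymmdd8.);  /* character -> SAS date value */
    visit_dt = input(func_date, yymmdd8.);
    age = (visit_dt - birth_dt) / 365.25;     /* difference in days -> years */
    format birth_dt visit_dt yymmdd10.;
    datalines;
19650315 20030610
20020101 20020615
;
run;

proc print data=age_demo;
run;
```

A subject whose first visit falls within a year of birth gets an age between 0 and 1, which belongs to the 0-<18 category.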

<Suggested steps>
(1)
1. SET the CD files, and use the INPUT, SUBSTR, or YEAR function to convert the
outpatient date (variable name: func_date) to a numeric value that allows
calculation, i.e., date or year values.
2. Sort the data, then use IF FIRST.xxx to identify the record of the first outpatient visit.
3. Use PROC FREQ to calculate the distribution of the year of the first outpatient visit.
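
The three steps above can be sketched as follows. This is only one possible approach, not the required solution. The dataset names cd_all, cd_dates, and first_visit, the person identifier id, and the derived variables visit_dt and visit_yr are assumptions for this example, as is the YYYYMMDD layout of func_date. The toy records stand in for the real CD files, which would be read with SET instead.

```sas
/* Toy stand-in for the stacked CD files (invented records).        */
/* With the real data, replace this step with: data ...; set ...;   */
data cd_all;
    length id $4 func_date $8;
    input id $ func_date $;
    datalines;
A001 20030610
A001 20010212
B002 20050101
B002 20040930
C003 20020720
;
run;

/* Step 1: convert the character date to a SAS date and extract the year. */
data cd_dates;
    set cd_all;
    visit_dt = input(func_date, yymmdd8.);
    visit_yr = year(visit_dt);
    format visit_dt yymmdd10.;
run;

/* Step 2: sort by person and visit date, then keep the earliest record. */
proc sort data=cd_dates;
    by id visit_dt;
run;

data first_visit;
    set cd_dates;
    by id;
    if first.id;   /* true only for the first record within each id */
run;

/* Step 3: distribution of the calendar year of the first visit. */
proc freq data=first_visit;
    tables visit_yr;
run;
```

Sorting by both id and visit_dt matters here: FIRST.id only picks the earliest visit if the records within each person are already in date order. Results by sex require id_sex, which is added from the ID files in part (2).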

(2)
1. SET the ID files.
2. Combine id_birthday and id_sex from the ID files with the file containing the first
outpatient visit. Use the INPUT, SUBSTR, or YEAR function to convert id_birthday
to date/year values.
MERGE statement:
ID files (id_sex and id_birthday) + the file that contains the record of the first
outpatient visit

3. Calculate the age (including age=0 years) at the first outpatient visit and group
the ages using IF-THEN syntax.
4. Use PROC MEANS and PROC FREQ to obtain descriptive statistics.
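
A hedged sketch of steps 2 to 4 is shown below. The dataset names id_file, first_visit, and analysis, the key variable id, the sex codes M and F, and the YYYYMMDD layout of id_birthday are all assumptions for illustration; the toy records are invented. The sketch also assumes one record per person in the ID data, so duplicate ID records would need to be removed before the merge.

```sas
/* Toy stand-ins for the ID files and the first-visit file (invented values). */
data id_file;
    length id $4 id_sex $1 id_birthday $8;
    input id $ id_sex $ id_birthday $;
    datalines;
A001 M 19650315
B002 F 19400101
C003 F 20020101
D004 M 19800505
;
run;

data first_visit;
    length id $4;
    input id $ visit_dt :yymmdd8.;
    format visit_dt yymmdd10.;
    datalines;
A001 20010212
B002 20040930
C003 20020720
;
run;

proc sort data=id_file;     by id; run;
proc sort data=first_visit; by id; run;

/* Merge by person, keep only people with a first visit, then derive age. */
data analysis;
    merge first_visit (in=has_visit)
          id_file (keep=id id_sex id_birthday);
    by id;
    if has_visit;
    birth_dt = input(id_birthday, yymmdd8.);
    age = (visit_dt - birth_dt) / 365.25;
    length age_grp $6;
    /* Missing values compare as less than any number, so test them first. */
    if missing(age)  then age_grp = ' ';
    else if age < 18 then age_grp = '0-<18';
    else if age < 35 then age_grp = '18-<35';
    else if age < 45 then age_grp = '35-<45';
    else if age < 55 then age_grp = '45-<55';
    else if age < 65 then age_grp = '55-<65';
    else                  age_grp = '>=65';
run;

/* Mean and SD of age by sex. */
proc means data=analysis n mean std;
    class id_sex;
    var age;
run;

/* Counts and column percentages, i.e., percentages within each sex. */
proc freq data=analysis;
    tables age_grp * id_sex / norow nopercent;
run;
```

The IN= flag keeps the analysis restricted to people who actually have an outpatient record; people who appear only in the ID data (D004 in the toy data) are dropped.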

Table. Year and age of the first outpatient visits during 2001 to 2005 among men and
women

                                               Men          Women
                                               (n= )        (n= )
                                               n (%)        n (%)
Calendar year of the first outpatient visit
  2000
  2001
  2002
  2003
  2004
  2005
Age at the first outpatient visit
  Mean (SD)
  0-<18
  18-<35
  35-<45
  45-<55
  55-<65
  >=65
