
Introduction to Statistical Computing
Methods in Clinical Research
July 2000
Health Services Cost Review Commission (HSCRC) Data
• Discharge data on patients who underwent
abdominal aortic surgery in one of 52
non-federal hospitals in Maryland.
• Data were also obtained on the ICU
organizational characteristics of the hospitals
in which patients were treated.
• Years covered: 1994-1996
• Subset includes 490 patients
Types of Data Collected
• Outcomes (e.g. length of stay, mortality)
• Patient Characteristics (e.g. age, race)
• Comorbid Diseases (e.g. dementia, diabetes)
• Complications (e.g. aspiration, septicemia)
• Surgeon and Hospital Volume
• Organizational Characteristics (e.g. nurse-
patient ratio, frequency of morbidity review)
Motivating Question:
How are patient characteristics related to

– length of stay (los)?

– total charges (totchg)?

– days in ICU (icuday)?

– mortality (death)?
What variables do we have to
work with?

describe

inspect
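A minimal sketch of how these two commands might be run, assuming the HSCRC subset is already loaded in memory. The variable names are the ones given on the Motivating Question slide.

```stata
* Assumes the HSCRC subset is already in memory
* List every variable with its storage type and label
describe

* Quick look at the outcome variables: counts of negative, zero,
* positive, and missing values, plus a small text histogram
inspect los totchg icuday death
```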
What are the distributions of the
outcomes we are considering?
summarize
centile
hist
graph
tab
dotplot
stem
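The commands above could be applied to the outcomes roughly as follows. This is a sketch that assumes the data are in memory and uses current Stata syntax (`histogram`). Older releases drew histograms through `graph` instead.

```stata
* Assumes the HSCRC subset is already in memory
* Means, percentiles, skewness and kurtosis for the continuous outcomes
summarize los totchg icuday, detail

* Quartiles of length of stay with confidence intervals
centile los, centile(25 50 75)

* Shape of the length-of-stay distribution
histogram los

* Frequency table for the binary outcome
tabulate death

* Dot plot and stem-and-leaf display as alternatives to the histogram
dotplot icuday
stem los
```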
What does the patient population
look like?

age (age)

race (nonwhite)

gender (sex)
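One way to describe the patient population with these three variables, again assuming the data are in memory:

```stata
* Assumes the HSCRC subset is already in memory
* Age is continuous: look at its full distribution
summarize age, detail

* Race and gender are categorical: look at frequencies
tabulate nonwhite
tabulate sex

* Cross-tabulate gender by race, with row percentages
tabulate sex nonwhite, row
```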
Do the outcomes differ by
gender?

boxplot

graph

table

by sex: summarize
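A sketch of the comparison by gender, assuming the data are in memory. Note that `by sex:` requires the data to be sorted by `sex` first, and `bysort` does the sort and the grouping in one step. The box plot uses current syntax (`graph box`), which may differ from the release used when these slides were written.

```stata
* Assumes the HSCRC subset is already in memory
* Summary statistics for each outcome, separately by gender
bysort sex: summarize los totchg icuday

* Side-by-side box plots of length of stay
graph box los, over(sex)

* Mortality by gender, with column percentages
tabulate death sex, column
```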
Do the outcomes differ by race?
boxplot

histo

table

by nonwhite: summarize
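The same comparison by race can be sketched with slightly different tools, assuming the data are in memory and current Stata syntax is used:

```stata
* Assumes the HSCRC subset is already in memory
* Mean, SD and count of length of stay within each race group
tabulate nonwhite, summarize(los)

* One histogram panel per race group
histogram los, by(nonwhite)

* Mortality by race, with column percentages
tabulate death nonwhite, column
```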
Generating New Variables
• Length of Stay (los) appears “skewed”
• We want to “normalize” it by taking the
natural log.
• How do we make a new variable: log(los)?

generate loglos=log(los)
or
gen loglos=log(los)
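A small check of what `log()` does at the edges, using toy values made up for illustration (not HSCRC records). It shows that a zero or missing `los` yields a missing `loglos`.

```stata
* Toy data, made up for illustration only
clear
input los
1
4
12
40
0
.
end

gen loglos = log(los)
label variable loglos "Natural log of length of stay"

* log() returns missing when los is zero or missing
count if loglos==. & los<.
list los loglos
```

On the real data, `histogram loglos` would show whether the transformation reduced the skew.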
Generating New Variables
• What if we want to create a categorical
variable of length of stay: short versus long
stay?
* Stata treats missing as larger than any number, so exclude it explicitly
gen longstay=1 if los>10 & los<.
replace longstay=0 if los<=10
or
gen longstay=cond(los>10,1,0)
replace longstay=. if los==.
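The reason for resetting `longstay` to missing can be seen on toy values (made up for illustration, not HSCRC records): Stata treats a missing value as larger than any number, so `los>10` is true when `los` is missing. The variable `naive` is a hypothetical name used only for this comparison.

```stata
* Toy data, made up for illustration only
clear
input los
3
10
11
25
.
end

* Unguarded version: a missing los is coded as a long stay
gen naive = 1 if los>10
replace naive = 0 if los<=10

* Guarded one-line version: the logical expression gives 0/1,
* and the if qualifier leaves longstay missing when los is missing
gen longstay = (los>10) if los<.

list los naive longstay
tabulate longstay, missing
```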
