Computing
Methods in Clinical Research
July 2000
Health Services Cost Review
Commission (HSCRC) Data
• Discharge data on patients who underwent
abdominal aortic surgery in one of 52
non-federal hospitals in Maryland.
• Data were also obtained on the ICU
organizational characteristics of the hospitals
in which patients were treated.
• Years covered: 1994-1996
• Subset includes 490 patients
Types of Data Collected
• Outcomes (e.g. length of stay, mortality)
• Patient Characteristics (e.g. age, race)
• Comorbid Diseases (e.g. dementia, diabetes)
• Complications (e.g. aspiration, septicemia)
• Surgeon and Hospital Volume
• Organizational Characteristics (e.g.
nurse-patient ratio, frequency of morbidity review)
Motivating Question:
How are patient characteristics related to
– mortality (death)?
What variables do we have to
work with?
describe
inspect
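A minimal sketch of these two commands. The dataset is made-up toy data standing in for the HSCRC subset; only the variable names age, sex, nonwhite and los come from these slides, and the 0/1 coding is an assumption.

```stata
* Toy data (invented values), not the real HSCRC file
clear
input age sex nonwhite los
71 0 0 8
64 1 0 12
80 1 1 .
58 0 1 5
end

* List variable names, storage types and labels
describe

* Count missing, negative, zero and positive values of one variable
inspect los
```

describe gives the overview of the whole dataset; inspect is a quick check on a single variable, for example to spot missing lengths of stay before analysis.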
What are the distributions of the
outcomes we are considering?
summarize
centile
hist
graph
tab
dotplot
stem
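The commands above might be applied to the outcomes roughly as follows. This is a sketch on invented toy data; the mortality indicator died is a hypothetical variable name, and histogram uses current Stata syntax, which may differ from the release used when these slides were written.

```stata
* Toy data (invented values); died is a hypothetical 0/1 mortality indicator
clear
input los died
3 0
5 0
6 0
7 0
9 0
12 0
21 1
40 1
end

* Mean, SD, percentiles, skewness and kurtosis of length of stay
summarize los, detail

* Selected percentiles with confidence intervals
centile los, centile(25 50 75)

* Frequency table for the binary outcome
tabulate died

* Quick looks at the shape of the distribution
stem los
histogram los, frequency
```

For a continuous outcome such as los, summarize with the detail option and the plots are the natural tools; for a binary outcome such as mortality, a frequency table is usually enough.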
What does the patient population
look like?
age (age)
race (nonwhite)
gender (sex)
Do the outcomes differ by
gender?
boxplot
graph
table
by sex: summarize
Do the outcomes differ by race?
boxplot
histo
table
by nonwhite: summarize
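A sketch of the group comparisons on invented toy data. The variable died and the 0/1 coding of sex and nonwhite are assumptions. Note that the by prefix requires the data to be sorted by the grouping variable; bysort does the sort and the grouped command in one step.

```stata
* Toy data (invented values); died is a hypothetical 0/1 mortality indicator
clear
input sex nonwhite los died
0 0 6 0
0 1 9 0
0 0 30 1
1 0 5 0
1 1 11 0
1 1 14 1
end

* Length of stay summarized separately within each group
bysort sex: summarize los
bysort nonwhite: summarize los

* Mortality cross-tabulated against each group, with row percentages
tabulate sex died, row
tabulate nonwhite died, row

* Side-by-side box plots of length of stay (current Stata syntax)
graph box los, over(sex)
```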
Generating New Variables
• Length of Stay (los) appears “skewed”
• We want to “normalize” it by taking the
natural log.
• How do we make a new variable: log(los)?
generate loglos=log(los)
or
gen loglos=log(los)
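A sketch of the transformation on invented toy data, comparing the two variables afterwards. In Stata, log() is the natural logarithm and returns missing for zero or negative arguments, so it is worth counting such values first.

```stata
* Toy data (invented values) with a long right tail
clear
input los
2
4
5
7
9
14
30
61
end

* Any zero or negative stays would become missing after the log
count if los <= 0

generate loglos = log(los)

* Compare the skewness reported for the raw and logged variables
summarize los loglos, detail
```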
Generating New Variables
• What if we want to create a categorical
variable of length of stay: short versus long
stay?
gen longstay=1 if los>10 & los<.
replace longstay=0 if los<=10
or
gen longstay=cond(los>10,1,0)
replace longstay=. if los==.
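Stata treats a missing value as larger than any number, so los>10 is true when los is missing; this is why the missing case needs explicit handling. The sketch below, on invented toy data, shows a one-line alternative and checks the result. The 10-day cutoff comes from the slide; everything else is illustrative.

```stata
* Toy data (invented values), including one missing length of stay
clear
input los
3
10
11
25
.
end

* (los>10) evaluates to 1 or 0; the if qualifier leaves missing los as missing
generate longstay = (los > 10) if los < .

* Check the coding: the missing stay must not be counted as long
tabulate longstay, missing
assert longstay == (los > 10) if los < .
assert longstay == . if los == .
```

The assert commands stop with an error if the new variable is coded incorrectly, which makes them a cheap safeguard after any recode.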