SAMUEL (MSc/STA/07/14)
6.30. Two Models for predicting the average length of patient stay in a hospital.
            length      age  infectio  service
length      1.0000
age         0.1889   1.0000
infectio    0.5334   0.0011   1.0000
service     0.3555  -0.0405   0.4126   1.0000
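The table above shows that length of stay is positively correlated with age (0.1889), infection
risk (0.5334) and service (0.3555). Among the predictors, infection risk and service are
moderately positively correlated (0.4126), while age is essentially uncorrelated with infection
risk (0.0011) and with service (-0.0405).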
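The signs and sizes in the first correlation table can be checked mechanically. The following is a minimal sketch, not part of the original Stata session: the correlations are copied from the table, while the 0.1 cutoff for calling a correlation "negligible" and the helper name `describe` are assumptions made purely for illustration.

```python
# Pairwise correlations copied from the first correlation table.
corr = {
    ("length", "age"): 0.1889,
    ("length", "infectio"): 0.5334,
    ("length", "service"): 0.3555,
    ("age", "infectio"): 0.0011,
    ("age", "service"): -0.0405,
    ("infectio", "service"): 0.4126,
}

# Hypothetical cutoff for a "negligible" correlation; not taken from the source.
NEGLIGIBLE = 0.1


def describe(r, cutoff=NEGLIGIBLE):
    """Label a correlation as negligible, positive or negative."""
    if abs(r) < cutoff:
        return "negligible"
    return "positive" if r > 0 else "negative"


labels = {pair: describe(r) for pair, r in corr.items()}
for (a, b), label in labels.items():
    print(f"{a:>8} vs {b:<8} r = {corr[(a, b)]:+.4f}  {label}")
```

With this cutoff, every pair involving length is labelled positive, and the two pairs of age with the other predictors are labelled negligible.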
            length     beds  infectio  service
length      1.0000
beds        0.4093   1.0000
infectio    0.5334   0.3598   1.0000
service     0.3555   0.7945   0.4126   1.0000
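The table above shows that all pairs among length, beds, infection risk and service are
positively correlated. The strongest correlation is between beds and service (0.7945), so these
two predictors carry overlapping information.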
A. Prepare a stem and leaf plot for each of the predictor variables.
. stem age
Stem-and-leaf plot for age (Average age of all patients (in yrs))
38* 8
39*
40*
41*
42* 0
43* 7
44* 2
45* 0257
46*
47* 12
48* 126
49* 01355579
50* 22456679
51* 0012355677789
52* 0011123458888
53* 02222788899
54* 00112245569
55* 00788
56* 0123355789999
57* 22668
58* 022
59* 0569
60* 9
61* 1
62* 2
63* 9
64* 1
65* 9
The stem-and-leaf plot shows that age is approximately normally distributed: the plot is roughly
symmetric and centred near the mean of 53.23, with a few isolated values in each tail.
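To make the layout of the display concrete, here is a minimal sketch of how a stem-and-leaf plot with one-digit leaves can be built. It is not Stata's implementation: the function name `stem_and_leaf` is hypothetical, and the sketch assumes non-negative values recorded to one decimal place. The sample values are the eight smallest ages read off the plot above.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display (one decimal place, values >= 0)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        tenths = round(v * 10)            # e.g. 45.7 -> 457
        stem, leaf = divmod(tenths, 10)   # stem 45, leaf 7
        leaves[stem].append(str(leaf))
    lo, hi = min(leaves), max(leaves)
    # Print empty stems as well, so gaps in the data stay visible.
    return [
        f"{stem}* {''.join(leaves.get(stem, []))}".rstrip()
        for stem in range(lo, hi + 1)
    ]


# The eight smallest ages, read from the first rows of the plot.
youngest = [38.8, 42.0, 43.7, 44.2, 45.0, 45.2, 45.5, 45.7]
rows = stem_and_leaf(youngest)
print("\n".join(rows))
```

Keeping the empty stems (39, 40, 41) is what makes the isolated low value at 38.8 stand out in the display.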
. stem infectio
1* 334
1. 678
2* 0013
2. 5677899999
3* 011244
3. 5577778999
4* 0111122222333333344444
4. 5555555666778888999
5* 0000112233344
5. 5555666777889
6* 12334
6. 56
7*
7. 678
The stem-and-leaf plot shows that infection risk is approximately normally distributed: the plot
is roughly symmetric and centred near the mean of 4.35.
. stem service
0** 57
1** 14,43
1** 71,71
2** 00,29,29,29,29,29,29,29
2** 57,57,57,57,57,86,86,86,86,86,86,86
3** 14,14,14,14,14,43,43,43,43,43,43,43
3** 71,71,71,71,71,71,71,71,71
4** 00,00,00,00,00,00,00,00,29,29,29
4** 57,57,57,57,57,57,57,57,57,57,86,86,86,86,86,86,86,86
5** 14,14,14,14,14,14,14,14,43,43,43,43,43,43
5** 71,71,71,71,71,71,71
6** 00,00,29,29,29,29,29
6** 57,57,57,86,86,86
7** 14,43
7** 71
8** 00
The stem-and-leaf plot shows that service is approximately normally distributed: the plot is
roughly symmetric about its centre, with most values between about 30 and 60.
. stem beds
0** 29
0** 52,56,60,64,68,70,72,72,73,76,76,80,83,85,87,90,91,92,92,95, ... (26)
1** 00,06,07,08,13,15,15,19,29,30,30,33,34,43,47
1** 50,54,57,63,65,66,67,67,70,75,76,80,82,84,86,90,91,95,95,95,96,96,97
2** 10,21,35,37,46,48
2** 52,65,66,70,79,81,97,98,98,98
3** 04,05,06,12,18,18,22
3** 53,56,62,87
4** 24,45
4** 61,77,87,89
5** 08,35,46
5** 68,71,93,95
6** 00,20,40
6**
7**
7** 52,68
8** 31,33,35
The stem-and-leaf plot shows that beds is not normally distributed. The distribution is skewed to
the right: most hospitals have fewer than 300 beds, and a long upper tail extends to 835 beds.
B. Obtain the scatter plot matrix and correlation matrix for the first model
. graph matrix length age infectio service, half
[Figure: scatter plot matrix (lower half) of Length of stay, Average age of all patients (in yrs), Infection risk, and Available facilities and services.]
The scatter plot matrix suggests roughly linear relationships between length of stay and the
predictors. Between infection risk and available facilities and services, however, infection
risk increases only slowly as available facilities and services increase.
length 1.0000
The table above shows that age, infection risk and services are statistically significant, since
all of their p-values are less than the 0.05 significance level.
Obtain the scatter plot matrix and correlation matrix for the second model
. graph matrix length beds infectio service, half
[Figure: scatter plot matrix (lower half) of Length of stay, Number of beds, Infection risk, and Available facilities and services.]
The plots show that there is positive linear correlation among the variables length, number of
beds, infection risk and available facilities and services.
length 1.0000
The table above shows that number of beds, infection risk and services are statistically
significant, since all of their p-values are less than the 0.05 significance level.
C. Fit first order regression models with three variables
. regress length age infectio service
[Figure: scatter plots against the linear prediction (axis range 6 to 12), one per model.]
. pnorm res1
[Figure: normal probability plot of res1, with Normal F[(res1-m)/s] on the vertical axis.]
Neither model is clearly more appropriate than the other, since both show the same trend: the
scatter plots with lines of best fit and the normal probability plots look very similar.
6.31.
A. First order regression models
B. The coefficients differ slightly across these models. However, a unit increase in the average
number of patients in the hospital per day during the study period makes little or no
contribution to the infection risk when the other variables (age, routine culturing ratio, and
available facilities and services) are held constant.
C. Calculation of MSE and R2 for each region.
Region 1 (NE):
√MSE = 1.0108
MSE = 1.0108*1.0108 = 1.02171664
R2 = 0.4613
Region 2 (NC):
√MSE = 1.1009
MSE = 1.1009*1.1009 = 1.21198081
R2 = 0.4115
Region 3 (S):
√MSE = 0.96784
MSE = 0.96784*0.96784 = 0.9367142656
R2 = 0.6088
Region 4 (W):
√MSE = 0.97663
MSE = 0.97663*0.97663 = 0.9538061569
R2 = 0.0896
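The MSE values for regions 1 and 2 are similar (about 1.02 and 1.21), as are those for regions 3
and 4 (about 0.94 and 0.95). The R2 values are similar for regions 1 and 2 (0.4613 and 0.4115)
but differ sharply between region 3 (0.6088) and region 4 (0.0896).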
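The arithmetic in part C can be reproduced in a few lines. This is only a sketch for checking the squares: the root MSE and R2 figures are copied from the regional results above, and the dictionary layout is an assumption made for illustration.

```python
# (root MSE, R2) per region, copied from the regional results in part C.
regions = {
    "NE": (1.0108, 0.4613),
    "NC": (1.1009, 0.4115),
    "S": (0.96784, 0.6088),
    "W": (0.97663, 0.0896),
}

# MSE is the square of the root MSE.
mse = {name: root ** 2 for name, (root, _) in regions.items()}

for name, (root, r2) in regions.items():
    print(f"{name:>2}: root MSE = {root:.5f}  MSE = {mse[name]:.4f}  R2 = {r2:.4f}")
```

Printed side by side, the four MSE values lie close together, while the R2 for region W is far below the other three.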
D. Obtain the residuals for each fitted model and prepare a box plot of the residuals for each
fitted model.
The box plot for region 1 shows no outliers. Most of the residuals fall between -1 and 1, which
is consistent with an approximately normal distribution.
. regress infectio age routine census service if region==2
The box plot shows one outlier, and most of the residuals lie below the average. The
distribution is therefore skewed, with the outlier being an extreme value on the positive side.
. regress infectio age routine census service if region==3
The box plot shows slight positive skewness but an approximately normal distribution, although
there is one extreme residual on the negative side.
. regress infectio age routine census service if region==4
The box plot shows an approximately normal distribution, with two residuals that fall well
outside the lower and upper quartiles and appear to be outliers.
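The outliers discussed for the regional box plots can also be flagged numerically. The sketch below applies the common 1.5 x IQR rule; it is an illustration rather than the procedure used in the original analysis. The function name `boxplot_outliers` and the residual values are hypothetical (they are not the residuals from any of the four regional fits), and the quartile method is an assumption that may differ slightly from Stata's.

```python
import statistics


def boxplot_outliers(residuals):
    """Return the residuals lying more than 1.5 * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(residuals, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [r for r in residuals if r < lower or r > upper]


# Hypothetical residuals, for illustration only.
example = [-0.9, -0.6, -0.4, -0.2, -0.1, 0.0, 0.1, 0.3, 0.5, 0.8, 3.2]
outliers = boxplot_outliers(example)
print(outliers)
```

In this made-up sample, only the single large positive residual falls outside the fences, which is the kind of point a box plot draws individually.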