MANG 434 Final Exam Instructions

Samar Alfalahi
READ ALL INSTRUCTIONS VERY, VERY CAREFULLY!!!!!!
1. Use the SPSS data set emailed to you with the exam. This data set is similar to the one used in Module 7
and is about hospitals. Each row represents a different hospital. The variables you will need to use are
described below:
a. Stay—the average number of days patients spent at that hospital.
b. Age—the average age of the patients at the hospital.
c. InfctRsk—a value indicating the hospital’s score on a scale measuring the risk patients have of acquiring
an infection while staying in the hospital. The higher the score, the more likely patients are to
experience an infection.
d. Culture—a value indicating the hospital’s score on a scale measuring cleanliness. Lower scores on this
scale indicate the hospital has very GOOD cleanliness standards and will not tolerate poor cleanliness
habits. A high score indicates the hospital has very BAD cleanliness standards and is more tolerant of
poor cleanliness habits. For example, if a hospital indicates they disinfect the bedrails and TV remote
controls between patients, they would have a LOWER score than a hospital that does not routinely do
so.
e. X-ray—the average number of x-rays done at the hospital each month over the past year.
f. Beds—the number of patient beds at that hospital.
g. MedSchool—indicates if the hospital is associated with a school of medicine. A value of 1 indicates they
ARE associated with a medical school. A value of 2 indicates they are NOT associated with a medical
school.
h. Region—indicates the region of the country in which the hospital is located as per the following:
 1 = NorthEast
 2 = SouthEast
 3 = NorthWest
 4 = SouthWest
i. Census—the number of individuals (in thousands) the hospital serves.
j. Nurses—the number of nurses employed at the hospital.
k. Facilities—the number of separate functioning units at the hospital (e.g., surgery, maternity)
l. BedsperNurse—the number of beds an individual nurse might be expected to serve at the hospital. This
number was derived by dividing the number of beds by the number of nurses at each hospital.

2. Conduct the appropriate analyses to answer each of the following. For each question, write a
comprehensive, thorough, and detailed report. Make certain you include all the pertinent values from
your SPSS output to give the reader a good understanding of your findings and to generate an
answer. IMPORTANT: be certain that you use the variable descriptions to make your write-up
specific to the question being asked and the data used. Do NOT include cut and paste
elements from your SPSS output UNLESS specifically asked to do so (only one place where
this happens).
a. Do a complete descriptive analysis of the variables Age and MedSchool. Be certain the values you
report are appropriate for the type of data involved.

Descriptive statistics for the variables Age and MedSchool are provided in Table A in the appendix
section. Age is the average age of the patients treated at each hospital. MedSchool indicates whether
the hospital is associated with a medical school: a value of 1 means that it is, and a value of 2 means
that it is not. Both variables have 100 valid observations and no missing values.
Age is a continuous variable, so the mean and standard deviation are appropriate summaries. The mean
age is 53.018 years, the median is 52.900 and the standard deviation is 4.4028. The values range from
38.8 to 65.9, a range of 27.1. The skewness is -0.113 (standard error 0.241), which indicates a nearly
symmetric distribution with only a very slight negative (left) skew.
MedSchool is a categorical variable coded 1 or 2, so its mean and standard deviation describe the coding
rather than a measured quantity, and the frequencies are the more meaningful summary. The mode is 2 and
the median is 2.00. The mean of 1.85 reflects that 85 of the 100 hospitals are not associated with a
medical school and 15 are (see also Table D). The standard deviation of 0.359 is not directly comparable
with that of Age because the two variables are on different scales. The skewness of -1.990 is strongly
negative, which again reflects that most hospitals fall in category 2. For further reference, see
Table A (Descriptive Statistics) provided in the appendix below.
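As a cross-check on the Table A values, the short Python sketch below recovers the two MedSchool category counts from the N and Sum rows, and divides each skewness by its standard error. It is only an illustration and not part of the SPSS output. The function names are made up for this example, and reading a ratio well beyond about 2 in magnitude as noticeable skew is a common rule of thumb assumed here, not something stated in the exam.

```python
def counts_from_sum(n, total):
    """For a variable coded only 1 or 2, return (count of 1s, count of 2s).

    Every case contributes at least 1 to the sum and each case coded 2
    contributes one extra, so the number of 2s is total - n.
    """
    twos = total - n
    ones = n - twos
    return ones, twos


def skew_ratio(skewness, std_error):
    """Skewness divided by its standard error (a rough z-like ratio)."""
    return skewness / std_error


# Values copied from Table A.
ones, twos = counts_from_sum(n=100, total=185)
age_ratio = skew_ratio(-0.113, 0.241)
med_ratio = skew_ratio(-1.990, 0.241)

print(ones, twos)           # hospitals with / without a medical school
print(round(age_ratio, 2))  # small in magnitude: Age is close to symmetric
print(round(med_ratio, 2))  # large in magnitude: MedSchool is very unbalanced
```

The counts agree with the group sizes listed in Table D.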
b. Generate a scatterplot of the variables Beds and Census using SPSS. To do this, select Graphs from the
top row of choices (where you usually select Analyze). Select Legacy Dialogs, then Scatter/Dot near
the bottom of the list. Simple Scatter is the default and this is what you want to use. Click on Define.
Put Beds on the Y-Axis and Census on the X-Axis. COPY AND PASTE YOUR SCATTERPLOT INTO YOUR
ANSWER. Then describe what this scatterplot is telling you about the relationship between these two
variables.

The scatterplot shows the variables Beds and Census. Beds is the number of patient beds at the hospital,
and Census is the number of individuals (in thousands) the hospital serves. The Y-axis of the graph
shows Beds, while the X-axis shows Census. Each dot on the graph represents one hospital, and a total
of 100 hospitals are plotted. There is a dense cluster of points in the area where both Beds and Census
are low. As both variables increase, the points become sparser. This shows that most of the hospitals
have fewer than 400 beds and a census below 200 (that is, 200,000 people). The hospitals outside this
range have comparatively high values on both variables and can be regarded as potential outliers. The
scatterplot also shows that the highest recorded census is nearly 600 (600,000 people), while the highest
number of beds recorded is nearly 1,000. Hospitals that serve more people tend to have more beds, which
indicates a positive relationship between the two variables. This agrees with the Pearson correlation
of .984 between Beds and Census reported in Table K.
c. The average number of X-rays taken per month in the known population of U.S. hospitals is 90. Is the
number of X-rays taken per month (using the variable Xray) at these hospitals significantly different
from that known population mean?

The variable Xray is the average number of x-rays done at a hospital each month over the past year.
The question states that the mean number of X-rays per month in the known population of U.S. hospitals
is 90. This population mean is to be compared with the mean of the same variable in the data set, so a
one-sample t-test is appropriate. The data set contains 100 hospitals. The mean of these 100 hospitals
is 81.833, which is lower than the population mean of 90. The standard deviation of the sample is
19.8673, so the number of X-rays varies considerably from hospital to hospital. The standard error of
the mean is 1.9867. This value indicates how precisely the sample mean estimates the population mean.
For further reference, consult Table B (One-Sample Statistics) provided in the appendix below.
The one-sample t-test was conducted for the variable Xray with a test value of 90. The value of t is
-4.111 with 99 degrees of freedom (the sample size minus one). The two-tailed significance is reported
as .000, that is, p < .001. The mean difference is -8.1670, and the lower and upper bounds of the 95%
confidence interval of the difference are -12.109 and -4.225 respectively. Because the significance is
below .05 and the confidence interval does not include zero, the number of X-rays taken per month at
these hospitals is significantly different from, and lower than, the known population mean of 90. For
further reference, see Table C (One-Sample Test) provided in the appendix below.
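The figures in Tables B and C can be reproduced from the summary statistics alone. The Python sketch below is an illustration under stated assumptions, not the SPSS procedure itself: the function name is made up for this example, and the critical value 1.9842 (two-tailed, 5%, 99 degrees of freedom) is taken from a standard t table rather than computed.

```python
import math


def one_sample_t(mean, sd, n, test_value):
    """Return (t statistic, standard error, mean difference)."""
    se = sd / math.sqrt(n)     # standard error of the mean
    diff = mean - test_value   # sample mean minus the known population mean
    return diff / se, se, diff


# Assumed constant: two-tailed 5% critical value of t with 99 df (standard t table).
T_CRIT_95_DF99 = 1.9842

# Summary statistics copied from Table B; test value from the question.
t_stat, se, diff = one_sample_t(mean=81.833, sd=19.8673, n=100, test_value=90)
ci_low = diff - T_CRIT_95_DF99 * se
ci_high = diff + T_CRIT_95_DF99 * se

print(round(t_stat, 3), round(se, 4), round(diff, 3))
print(round(ci_low, 3), round(ci_high, 3))
```

The printed values can be compared with the t, Mean Difference, Lower and Upper columns of Table C.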

d. Is there a significant difference in the number of beds at these hospitals for those hospitals who are
associated with a medical school as compared to those who are not associated with a medical school?
Remember that for the variable MedSchool, a 1 indicates the hospital is associated with a medical
school and a 2 indicates the hospital is not associated with a medical school.

In this case, the number of beds has to be compared between two groups of hospitals. The first group
is the hospitals that are associated with a medical school. The second group is the hospitals that are
not associated with any medical school.
The data set shows that 15 of the 100 hospitals are associated with a medical school and the other 85
are not. The mean number of beds for hospitals that are associated with a medical school is 493.13.
The mean number of beds for hospitals that are not associated with any medical school is 196.28. These
values show that the average number of beds is higher for hospitals that are associated with a medical
school. The standard deviation is also higher for the hospitals associated with a medical school
(192.650 compared with 142.949), and so is the standard error of the mean (49.742 compared with
15.505). For further reference, see Table D (Group Statistics) provided in the appendix below.
An independent-samples t-test was used to compare the mean number of beds in the two types of
hospitals. SPSS reports this test in two rows: equal variances assumed and equal variances not assumed.
Levene's test for equality of variances indicates which row to read. Here the value of F is 3.991 and
its significance is 0.049. As this is below .05, the variances of the two groups differ and the row for
equal variances not assumed applies. In that row, t is 5.697 with 16.826 degrees of freedom, and the
two-tailed significance is .000 (p < .001). With equal variances assumed, t is 7.017 with 98 degrees of
freedom. The mean difference is 296.851 beds in both rows, because the choice of row changes only the
standard error and the degrees of freedom, not the difference between the group means. The lower and
upper bounds of the 95% confidence interval are 212.901 and 380.801 with equal variances assumed, and
186.838 and 406.864 with equal variances not assumed. Since the significance is below .05 and neither
interval includes zero, hospitals that are associated with a medical school have significantly more
beds than hospitals that are not. For further reference, see Table E (Independent Samples Test)
provided in the appendix below.
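The row of Table E for equal variances not assumed can be checked from the group statistics in Table D. The Python sketch below uses the usual unequal-variance (Welch) formulas for the standard error and the degrees of freedom. It is an illustration only: the function name is made up for this example, and small differences from the SPSS output are expected because Table D reports rounded means.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance t-test from summary statistics.

    Returns (t statistic, Welch-Satterthwaite degrees of freedom,
    standard error of the difference).
    """
    v1 = sd1 ** 2 / n1   # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2   # squared standard error of group 2 mean
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se


# Group statistics copied from Table D (group 1: associated with a medical school).
t_stat, df, se = welch_t(493.13, 192.650, 15, 196.28, 142.949, 85)

print(round(t_stat, 3), round(df, 3), round(se, 3))
```

The fractional degrees of freedom (about 16.8 instead of 98) are what make the confidence interval in this row wider than the one with equal variances assumed.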

e. Is there a significant difference in number of BedsperNurse across the four different values of Region?
Be certain that you include a descriptive analysis, that you conduct a Levene’s test, and that you use a
Bonferroni’s post hoc analysis to fully answer this question.

The variable BedsperNurse is the number of beds an individual nurse might be expected to serve at a
hospital. As mentioned before, four geographic regions are under consideration here: NorthEast,
SouthEast, NorthWest and SouthWest. The question is whether the mean BedsperNurse differs across
these regions.
For this purpose, the descriptive statistics come first. The numbers of hospitals in NorthEast,
SouthEast, NorthWest and SouthWest are 26, 27, 28 and 19 respectively. The mean BedsperNurse values
for these regions are 1.44, 1.65, 1.92 and 1.35 respectively, and the overall mean for all 100
hospitals is 1.61. NorthWest therefore has the highest mean and SouthWest the lowest. These values
are summarized in Table F (Descriptives) in the appendix below.
Next comes the test of homogeneity of variances. The value of Levene's statistic is 1.530, with
degrees of freedom of 3 and 96, and its significance is 0.212. Because this is above .05, the
variances of the four regions do not differ significantly and the assumption of homogeneity of
variances is met. For further reference, see Table G (Test of Homogeneity of Variances) provided in
the appendix below.
A one-way ANOVA was then run. The ANOVA separates the variation between groups from the variation
within groups. The sum of squares is 4.860 between groups and 41.963 within groups, which brings the
total sum of squares to 46.824. The df is 3 between groups and 96 within groups, for a total of 99.
The mean squares are calculated by dividing each sum of squares by its df. This gives a mean square of
1.620 between groups and 0.437 within groups. The between-groups mean square is then divided by the
within-groups mean square to give the F statistic, which is 3.706. The sig value is 0.014. Because
this is below .05, there is a significant difference in mean BedsperNurse across the four regions. For
further reference, see Table H (ANOVA) given in the appendix section below.
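The arithmetic behind Table H is simple enough to check by hand. The Python sketch below repeats it from the sums of squares and degrees of freedom. It is an illustration only, and the function name is made up for this example.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F statistic)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and degrees of freedom copied from Table H.
ms_between, ms_within, f_stat = anova_f(4.860, 3, 41.963, 96)

print(round(ms_between, 3), round(ms_within, 3), round(f_stat, 3))
```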
Because the ANOVA is significant, a Bonferroni post hoc analysis is needed to find which regions
differ. This is provided in Table I (Multiple Comparisons) in the appendix section below. In this
table, each region is compared with each of the other three regions, and the mean difference, standard
error, sig value and 95% confidence interval are given for every pair. The significance level is kept
at 0.05. NorthWest has a significantly higher BedsperNurse than NorthEast (mean difference 0.48849,
sig .047) and than SouthWest (mean difference 0.57289, sig .027). No other pair of regions differs
significantly, as all of the remaining sig values are .736 or higher.
f. What are the relationships between every pair of the variables Beds, Census, Facilities, and Xray? Be
sure to include the pertinent values and describe the direction, strength, and significance of these
relationships.

g. Can you predict the variable Stay using any/all of the variables Xray, Culture, InfctRsk, and
BedsperNurse? Your final model must include ONLY significant predictors. Be certain that you fully
describe your final model and that you include the formula for your final model.

A regression analysis was conducted to identify the relationship between the dependent variable Stay
and the independent variables. The model reported here uses Xray and InfctRsk as predictors. R is the
multiple correlation coefficient. The higher this value is, the stronger the relationship is. In this
case, the value of R is 0.568 and R square is 0.323. This means that 32.3% of the variation in Stay can
be explained by this model. The adjusted R square is a version of R square corrected for the number of
predictors. Its value here is 0.309. The R square change of 0.323 is the improvement in R square when
the predictors are entered into the model. All of these values are shown in Table L (Model Summary)
provided in the appendix below.
The ANOVA for the regression is given in Table M. The value of F is 23.102 and the value of
significance is .000 (p < .001). As this is lower than 0.05, the model as a whole is significant. The
two sources of variation are Regression and Residual. The sum of squares is 72.209 for Regression and
151.594 for Residual, which brings the total to 223.803. The degrees of freedom are 2 for Regression
and 97 for Residual, which brings the total to 99. Each mean square is the sum of squares divided by
its degrees of freedom: 36.104 for Regression and 1.563 for Residual.
The results for the coefficients are provided in Table N (Coefficients). The unstandardized
coefficients (B) are the raw coefficients produced by the regression analysis. They are expressed in
the original units of each variable, and each has a standard error. The standardized coefficients
(Beta) are unit-free, which allows the predictors to be compared with each other. The t column shows
the t-test for each coefficient, and the next column gives its significance. InfctRsk is a significant
predictor (B = 0.538, Beta = .471, t = 5.099, sig .000). Xray has B = 0.013, Beta = .174 and
t = 1.888, with a significance of .062, which is above the .05 level. The regression formula reported
for this model is:

Y (Stay) = 6.074 (Constant) + 0.013 (Xray) + 0.538 (InfctRsk)
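
The model summary values in Table L follow directly from the sums of squares in Table M, and the formula above can be applied to any hospital. The Python sketch below shows both steps. It is an illustration only: the function names are made up for this example, and the input values of 80 X-rays and an infection risk score of 4.0 are hypothetical and are not taken from the data set.

```python
def r_squared(ss_regression, ss_total):
    """Share of the total variation explained by the model."""
    return ss_regression / ss_total


def adjusted_r_squared(r2, n, k):
    """R square corrected for k predictors and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def predict_stay(xray, infct_rsk):
    """Apply the reported formula with the coefficients from Table N."""
    return 6.074 + 0.013 * xray + 0.538 * infct_rsk


# Sums of squares copied from Table M; 100 hospitals, 2 predictors.
r2 = r_squared(72.209, 223.803)
adj_r2 = adjusted_r_squared(r2, n=100, k=2)
f_stat = (72.209 / 2) / (151.594 / 97)   # regression mean square / residual mean square

# Hypothetical hospital, for illustration only.
example_stay = predict_stay(xray=80, infct_rsk=4.0)

print(round(r2, 3), round(adj_r2, 3), round(f_stat, 3))
print(round(example_stay, 3))
```

The first line of output can be compared with the R Square, Adjusted R Square and F values in Tables L and M.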


Appendix

Table A: Descriptive Statistics

                          Age       MedSchool
N Valid                   100       100
N Missing                 0         0
Mean                      53.018    1.85
Median                    52.900    2.00
Mode                      53.2a     2
Std. Deviation            4.4028    .359
Variance                  19.385    .129
Skewness                  -.113     -1.990
Std. Error of Skewness    .241      .241
Range                     27.1      1
Minimum                   38.8      1
Maximum                   65.9      2
Sum                       5301.8    185

Table B: One-Sample Statistics

N Mean Std. Deviation Std. Error Mean

Xray 100 81.833 19.8673 1.9867

Table C: One-Sample Test

Test Value = 90

                                                          95% Confidence Interval of the Difference
        t        df    Sig. (2-tailed)   Mean Difference   Lower       Upper
Xray    -4.111   99    .000              -8.1670           -12.109     -4.225

Table D: Group Statistics

        MedSchool                                       N     Mean     Std. Deviation   Std. Error Mean
Beds    Hospital associated with a medical school       15    493.13   192.650          49.742
        Hospital not associated with a medical school   85    196.28   142.949          15.505

Table E: Independent Samples Test

Levene's Test for Equality of Variances (Beds)

                              F       Sig.
Equal variances assumed       3.991   .049

t-test for Equality of Means (Beds)

                                                                                                95% Confidence Interval of the Difference
                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower      Upper
Equal variances assumed       7.017   98       .000              296.851           42.303                  212.901    380.801
Equal variances not assumed   5.697   16.826   .000              296.851           52.103                  186.838    406.864

Table F: Descriptives
BedsperNurse

                                                         95% Confidence Interval for Mean
            N     Mean     Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
NorthEast   26    1.4363   .41175           .08075       1.2700        1.6026        .92       2.59
SouthEast   27    1.6471   .70634           .13593       1.3677        1.9266        .87       4.00
NorthWest   28    1.9248   .91631           .17317       1.5695        2.2801        1.32      6.05
SouthWest   19    1.3519   .34021           .07805       1.1880        1.5159        .95       2.00
Total       100   1.6140   .68773           .06877       1.4775        1.7504        .87       6.05

Table G: Test of Homogeneity of Variances


BedsperNurse

Levene Statistic df1 df2 Sig.

1.530 3 96 .212

Table H: ANOVA
BedsperNurse

Sum of Squares df Mean Square F Sig.

Between Groups 4.860 3 1.620 3.706 .014


Within Groups 41.963 96 .437
Total 46.824 99

Table I: Multiple Comparisons

Dependent Variable: BedsperNurse
Bonferroni

                                                                        95% Confidence Interval
(I) Region   (J) Region   Mean Difference (I-J)   Std. Error   Sig.     Lower Bound   Upper Bound
NorthEast    SouthEast    -.21081                 .18166       1.000    -.7002        .2786
             NorthWest    -.48849*                .18007       .047     -.9736        -.0034
             SouthWest    .08440                  .19955       1.000    -.4532        .6220
SouthEast    NorthEast    .21081                  .18166       1.000    -.2786        .7002
             NorthWest    -.27768                 .17833       .736     -.7581        .2027
             SouthWest    .29521                  .19798       .835     -.2381        .8286
NorthWest    NorthEast    .48849*                 .18007       .047     .0034         .9736
             SouthEast    .27768                  .17833       .736     -.2027        .7581
             SouthWest    .57289*                 .19651       .027     .0435         1.1023
SouthWest    NorthEast    -.08440                 .19955       1.000    -.6220        .4532
             SouthEast    -.29521                 .19798       .835     -.8286        .2381
             NorthWest    -.57289*                .19651       .027     -1.1023       -.0435

*. The mean difference is significant at the 0.05 level.


Table J: Descriptive Statistics

Mean Std. Deviation N

Beds 240.81 184.216 100

Census 182.07 143.766 100

Facilities 42.656 14.9603 100

Xray 81.833 19.8673 100

Table K: Correlations

                                   Beds      Census    Facilities   Xray
Beds         Pearson Correlation   1         .984**    .795**       .014
             Sig. (2-tailed)                 .000      .000         .889
             N                     100       100       100          100
Census       Pearson Correlation   .984**    1         .792**       .030
             Sig. (2-tailed)       .000                .000         .768
             N                     100       100       100          100
Facilities   Pearson Correlation   .795**    .792**    1            .092
             Sig. (2-tailed)       .000      .000                   .363
             N                     100       100       100          100
Xray         Pearson Correlation   .014      .030      .092         1
             Sig. (2-tailed)       .889      .768      .363
             N                     100       100       100          100

**. Correlation is significant at the 0.01 level (2-tailed).

Table L: Model Summaryb

                                                                          Change Statistics
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2   Sig. F Change
1       .568a   .323       .309                1.25013                      .323              23.102     2     97    .000

a. Predictors: (Constant), InfctRsk, Xray
b. Dependent Variable: Stay

Table M: ANOVAa

Model Sum of Squares df Mean Square F Sig.

1 Regression 72.209 2 36.104 23.102 .000b

Residual 151.594 97 1.563

Total 223.803 99

a. Dependent Variable: Stay

b. Predictors: (Constant), InfctRsk, Xray


Table N: Coefficientsa

                  Unstandardized Coefficients   Standardized Coefficients
Model             B        Std. Error           Beta                        t        Sig.
1   (Constant)    6.074    .574                                             10.580   .000
    Xray          .013     .007                 .174                        1.888    .062
    InfctRsk      .538     .105                 .471                        5.099    .000

a. Dependent Variable: Stay

Table O: Residuals Statisticsa

Minimum Maximum Mean Std. Deviation N

Predicted Value 7.4296 11.7567 9.4948 .85404 100


Std. Predicted Value -2.418 2.648 .000 1.000 100
Standard Error of Predicted Value .126 .359 .207 .063 100
Adjusted Predicted Value 7.4514 11.6667 9.4897 .85097 100
Residual -2.85966 2.76218 .00000 1.23744 100
Std. Residual -2.287 2.210 .000 .990 100
Stud. Residual -2.299 2.253 .002 1.005 100
Deleted Residual -2.88926 2.87306 .00515 1.27567 100
Stud. Deleted Residual -2.352 2.303 .002 1.014 100
Mahal. Distance .018 7.195 1.980 1.860 100
Cook's Distance .000 .138 .010 .018 100
Centered Leverage Value .000 .073 .020 .019 100

a. Dependent Variable: Stay
