
INSTRUCTIONS:

Imagine you are a statistician who has been tasked with assisting an investigator to determine if there are
differences in hospital experiences that lead to readmission after a hospitalization. You are given the
dataset: “Diabetes Readmission Data.sav,” and are responsible for determining the best types of analyses
to do to help the investigator answer their questions.

1. 6 POINTS Before analyzing any data, you need to make sure that you understand just what type of
data you have (i.e. categorical - nominal, binary, ordinal, or interval or ratio). Fill out the following
table (specify the data type and why you chose that type) for each of the following variables:

Variable | Data type | Explain the reason you chose data type
Race | Categorical-nominal | Race is categorical with 5 valid values. The categories have no inherent order, so they cannot be ranked from lowest to highest.
Gender | Categorical-nominal | Gender is categorical because patients are grouped as Female or Male. The categories have no inherent order, and a missing value is given its own code, so the variable is treated as categorical-nominal.
Age | Interval | Age is recorded in 10 intervals rather than as an exact value.
Time in hospital | Ratio | It is the number of days between admission and discharge, a count with a true zero.
Change | Categorical-binary | It records whether a change occurred, and the value is either 0 or 1.
Readmit30Days | Categorical-binary | It records whether or not the patient was readmitted to the hospital within 30 days.

2. 4 POINTS Using the graphing options in SPSS, choose two appropriate graphical display options to
represent the following variable: time in hospital. One option must show outliers on the graph. Copy
and paste your graphs/charts below and for each chart, explain why you chose that chart option.

ANSWER BELOW:

The two graphs below show the time in hospital variable. Both can be drawn with the Chart Builder
option in SPSS. The first is a histogram, which shows the distribution of the values of time in
hospital. The second is a boxplot, which also shows the outliers. A line graph can also be created by
checking the histogram option that draws the distribution as a line, as shown below.
3. 1 POINT Describe the shape, location, and spread of the data for the time in hospital variable and
chart you generated above.

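ANSWER BELOW: The shape of the distribution is skewed right. Location: the median is 4 days (mean
4.40). Spread: values range from 1 to 14 days, with an interquartile range of 4 days (Q1 = 2, Q3 = 6).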

4. 1 POINT Calculate a 5-point-summary for the time in hospital variable and describe how the 5-
point-summary relates to the graph in Question 2.

ANSWER BELOW:

Statistics
Length of Stay - Inpatient days between admission and discharge
N Valid: 101766
N Missing: 0
Mean: 4.40
Median: 4.00
Mode: 3
Minimum: 1
Maximum: 14
Sum: 447362
Percentile 25: 2.00
Percentile 50: 4.00
Percentile 75: 6.00

5-point summary: There are 5 data points of interest: Q0 (minimum), Q1 (25th percentile), Q2 (median,
50th percentile), Q3 (75th percentile), and Q4 (maximum). Based on the SPSS output above, the
minimum is 1, the 25th percentile is 2, the median is 4, the 75th percentile is 6, and the maximum is
14, so the 5-point summary is 1, 2, 4, 6, 14. On the boxplot, the box spans Q1 = 2 to Q3 = 6 with the
median line at 4. Going further, the IQR (interquartile range) is Q3 - Q1 = 6 - 2 = 4. To find outliers,
multiply the IQR by 1.5 (4 × 1.5 = 6), then subtract the result from Q1 and add it to Q3:
Q1 - 6 = 2 - 6 = -4 and Q3 + 6 = 6 + 6 = 12. Any values below -4 or above 12 are outliers. A stay
cannot last fewer than 0 days, so there are no lower outliers, but the maximum of 14 is above 12, so
there are upper outliers, which also appear on the boxplot in Question 2.
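
The outlier arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 1.5 × IQR rule using the five values reported in the SPSS output; the function name tukey_fences is a hypothetical helper, not part of SPSS.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Five-point summary taken from the SPSS output above.
minimum, q1, median, q3, maximum = 1, 2, 4, 6, 14

lower, upper = tukey_fences(q1, q3)
print(lower, upper)       # -4.0 12.0
print(minimum < lower)    # False: no lower outliers
print(maximum > upper)    # True: at least one upper outlier
```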

5. 4 POINTS Using the graphing options in SPSS, choose an appropriate graphical display option to
represent the following variables: race and age. Copy and paste your graphs/charts below and for
each chart (1 for each variable), explain why you chose that chart option.

ANSWER BELOW:

Age is recorded in intervals, so a bar graph is recommended for the age variable. The categories of
the race variable can be plotted as either a pie chart or a bar chart. All of the graphs are shown
below.

For age, either the count or the cumulative frequency can be plotted.
Bar plot

The second option for the race variable is a bar plot.


6. 8 POINTS For quality control (and reimbursement) purposes hospital administrators are generally
interested in whether a patient is readmitted to the hospital within 30 days of a hospital discharge.
One of the questions the investigator has is: Is there a difference in the number of inpatient visits in
the previous year between the group who has a readmission within 30 days and the group who does
not have a readmission within 30 days? Conduct a formal hypothesis test to answer this question
(choose the appropriate statistical test, explain why you chose it, write out your null and alternative
hypotheses, run the test, and interpret the results). Include appropriate output from SPSS to show
what you did.

ANSWER BELOW:

There are only two groups, and they are mutually exclusive: patients who were readmitted within 30
days and patients who were not. The outcome being compared is the number of inpatient visits in the
previous year. For two independent groups, an independent-samples t-test is applied.
Call (Patient was not readmitted to the hospital within 30 days) Group 1 and (Patient was
readmitted to the hospital within 30 days) Group 2.

1. Ho: μ1 = μ2, Ha: μ1 ≠ μ2


2. t-statistic = -35.384, taken from the second row of the t-test results table. One assumption of
the t-test is that the variances of the two groups are equal, so we run Levene's test first, where
Ho: σ1² = σ2², Ha: σ1² ≠ σ2². The significance value is p=0.000 (p<0.001), so we reject the null
hypothesis of equal variances and accept the alternative hypothesis that the variances are not
equal. We therefore use the second row (equal variances not assumed) of the t-test results table.
3. p-value = 0.000 (p<0.001), taken from the column labeled Sig. (2-tailed).
4. Significance: this is a statistically significant result at an alpha value of 0.05.
5. Conclusion: Reject the null hypothesis that the mean number of inpatient visits in the previous
year is the same for patients readmitted within 30 days and patients not readmitted within 30
days, and accept the alternative hypothesis that the two means are different.
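
As a rough illustration of what the "equal variances not assumed" row computes, the sketch below implements Welch's t statistic and its approximate degrees of freedom with the Python standard library. The two small samples are made-up toy values, not the assignment data, and welch_t is a hypothetical helper name.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)   # sample variances (n - 1 denominator)
    se2 = va / na + vb / nb             # squared standard error of the difference
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Toy data (assumed for illustration): prior-year inpatient visits per patient.
not_readmitted = [0, 0, 1, 0, 2, 0]   # Group 1
readmitted = [1, 3, 0, 2, 4, 2]       # Group 2

t, df = welch_t(not_readmitted, readmitted)
print(round(t, 3), round(df, 2))
```

A negative t here means that Group 1 has the lower mean, which matches the sign convention of the t-statistic reported above.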

7. 8 POINTS Sometimes hospital readmission is coded in different ways. For instance, data can be
collected that describes whether a patient was readmitted within 30 days, was readmitted beyond 30
days, or not readmitted at all. Instead of looking at readmission within 30 days or not, the investigator
changes their question to ask if there is a difference in the number of inpatient visits in the previous
year between the groups who were not readmitted, were readmitted within 30 days, and were
readmitted after 30 days. Conduct a formal hypothesis test to answer this question (choose the
appropriate statistical test, explain why you chose it, write out your null and alternative hypotheses,
run the test, and interpret the results). Include appropriate output from SPSS to show what you did.

ANSWER BELOW:
There are three mutually exclusive groups. Because there are more than 2 groups, a t-test is not
applicable to this data, so a one-way ANOVA is applied, which can compare the means of more than 2
mutually exclusive groups.
Let patients not readmitted be group 1, patients readmitted within 30 days be group 2, and patients
readmitted after 30 days be group 3.

1. Ho: μ1 = μ2 = μ3, Ha: at least one mean is different.


2. Test statistic: F = 2963.324 (from 5th column of ANOVA table)
3. P-value = 0.000 (report as p<0.001) (from 6th column of ANOVA table)
4. Significance: this is a statistically significant result based on an alpha value of 0.05.
5. Conclusion: reject the null hypothesis that the 3 means are equal and accept the alternative hypothesis
that the means are different. To determine which means are different, refer to the post hoc tests
reported in the Multiple Comparisons table and to the means plot shown above. The mean difference
column (3rd column) shows the difference in the mean number of inpatient visits between each group
and the others. Each significant difference is flagged with an asterisk, and the actual p-value for
each comparison is given in the "Sig." column (column 5). In this case, all of the p-values are less
than 0.05, so we reject all of these null hypotheses, accept all of the alternatives, and conclude
that each pair of group means is different.
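
To show where the F statistic in the ANOVA table comes from, here is a minimal standard-library sketch of the one-way ANOVA calculation (between-group mean square divided by within-group mean square). The three toy groups are assumed values chosen for easy arithmetic, not the assignment data, and one_way_anova_f is a hypothetical helper name.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Toy data (assumed): prior-year inpatient visits in three readmission groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, df_b, df_w = one_way_anova_f(groups)
print(round(f, 3), df_b, df_w)   # 13.0 2 6
```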

8. 8 POINTS When a person is admitted to the hospital, if they have uncontrolled diabetes it can greatly
affect the course of their hospital stay. One indicator of uncontrolled diabetes is a hemoglobin A1c
test. The investigator asks: Is there an association between being readmitted within 30 days (or not),
and whether an A1c test is normal (or not)? Conduct a formal hypothesis test to answer this question
(choose the appropriate statistical test, explain why you chose it, write out your null and alternative
hypotheses, run the test, and interpret the results). Include appropriate output from SPSS to show
what you did.

ANSWER BELOW:

A chi-square test of independence is applicable, because both variables are categorical and the data
can be summarized in a cross-tabulation (crosstabs).


1. Ho: There is no association between being readmitted within 30 days (or not) and whether an
A1c test is normal (or not). Ha: There is an association between being readmitted within 30 days
(or not) and whether an A1c test is normal (or not).
2. Test statistic: Chi-square = 0.284
3. P-value = 0.000 (report as p<0.001).
4. Significance: this test result is significant at an alpha value of 0.05.
5. Conclusion: Reject the null hypothesis of no association and accept the alternative hypothesis that
there is an association between being readmitted within 30 days (or not) and whether an A1c test is
normal (or not). Remember, in a chi-square test we are really trying to determine whether what we
observe in a distribution is different from what we expect. Because we accept the alternative
hypothesis of an association, we also use the cross-tabulated table to see where there are
differences between groups.
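
The observed-versus-expected comparison described above can be sketched as follows. This is a minimal Pearson chi-square calculation for a cross-tabulation, using only the Python standard library. The 2 × 2 counts are made-up toy values, not the assignment data, and the helper names are hypothetical. The p-value helper applies only to 1 degree of freedom (a 2 × 2 table) and uses no continuity correction.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Toy counts (assumed): rows = A1c normal / not normal,
# columns = readmitted within 30 days / not readmitted within 30 days.
toy = [[10, 20], [20, 10]]
stat, df = chi_square(toy)
print(round(stat, 3), df, round(p_value_df1(stat), 4))
```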

Grading: Grades are based on the following rubrics:


Question 1: 6 points total, with 1 potential point for each variable.
1 point: Correctly identifies data type AND explanation supports identification.
0.5 points: Correctly identifies data type OR explanation supports identification, BUT both are not present.
0 points: Incorrectly identifies data type AND explanation does not support identification.

Question 2: 4 points total, with 2 points maximum for each chart/graph.
2 points: Correct chart type included, explanation supports choice of chart.
1 point: Correct chart type included, explanation is not clear or not included.
0 points: Incorrect chart type.

Question 3: 1 point. Describe shape, location, and spread of data.
1 point: Correctly describes each of shape, location, and spread.
Partial credit: Partial point for correctly describing one or two of shape, location, or spread but not all.
0 points: Incorrectly describes shape, location, and spread, OR is not done.

Question 4: 1 point. 5-point summary and how it relates to one chart from Q2.
1 point: Correctly calculates 5-point summary and relates the values of the summary to points on a chart from Q2.
0 points: Incorrect 5-point summary or incorrect description of how it relates to a chart from Q2.

Question 5: 4 points total, with 2 points maximum for each graph.
2 points: Correct chart type included, explanation supports choice of chart.
1 point: Correct chart type included, explanation is not clear or not included.
0 points: Incorrect chart type.

Question 6: 8 points. Hypothesis test.
8 points: 1) Correct choice of test, 2) explanation of choice, included both 3) null and 4) alternative hypotheses, 5) ran the test correctly, 6) interpreted the results, 7) made conclusions about hypotheses, and 8) included appropriate output from SPSS.
0-7 points: Partial credit awarded for each of the 8 required components of the answer.

Question 7: 8 points. Hypothesis test.
8 points: 1) Correct choice of test, 2) explanation of choice, included both 3) null and 4) alternative hypotheses, 5) ran the test correctly, 6) interpreted the results, 7) made conclusions about hypotheses, and 8) included appropriate output from SPSS.
0-7 points: Partial credit awarded for each of the 8 required components of the answer.

Question 8: 8 points. Hypothesis test.
8 points: 1) Correct choice of test, 2) explanation of choice, included both 3) null and 4) alternative hypotheses, 5) ran the test correctly, 6) interpreted the results, 7) made conclusions about hypotheses, and 8) included appropriate output from SPSS.
0-7 points: Partial credit awarded for each of the 8 required components of the answer.
