
Q1.

a.
Stem-and-leaf of WEIGHT N = 60
1 36 3
1 36
1 37
2 37 7
3 38 0
5 38 89
7 39 03
7 39
7 40
8 40 9
10 41 13
12 41 58
15 42 001
19 42 5589
21 43 34
22 43 5
24 44 04
26 44 78
30 45 2224
30 45 5799
26 46 23
24 46 568
21 47 24
19 47 589
16 48 1234
12 48 569
9 49 12
7 49 5
6 50
6 50 9
5 51 4
4 51 8
3 52
3 52 8
2 53
2 53 8
1 54
1 54
1 55
1 55
1 56
1 56 6

Leaf Unit = 0.1

B.
The histogram is right-skewed, and the box in the boxplot is not centered between the
whiskers, so the distribution is not perfectly normal. However, the normal probability plot is
approximately linear, so the confidence interval procedure can still reasonably be applied.

C.
Descriptive Statistics
N Mean StDev SE Mean 95% CI for μ
60 45.297 4.161 0.581 (44.158, 46.435)

μ: population mean of WEIGHT


Known standard deviation = 4.5
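
As a rough cross-check of the interval above, here is a minimal Python sketch of the one-sample z-interval, assuming the known standard deviation of 4.5 and the summary values from the output. The function name z_interval is only illustrative and is not part of Minitab.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sigma, n, confidence=0.95):
    """Two-sided z-interval for a mean when sigma is treated as known."""
    # Critical value z* such that the central area equals the confidence level
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return mean - margin, mean + margin


# Summary values taken from the Minitab output for WEIGHT
low, high = z_interval(mean=45.297, sigma=4.5, n=60)
print(f"95% CI for mu: ({low:.3f}, {high:.3f})")
```

The result should agree with the Minitab interval up to rounding, since the sketch starts from the rounded mean rather than the raw data.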
Q2.
B.
Descriptive Statistics
N Mean StDev SE Mean 95% CI for μ
35 23.600 5.720 0.967 (21.635, 25.565)

μ: population mean of Random samples


C.
Statistics
Variable N N* Mean SE Mean StDev Minimum Q1 Median Q3 Maximum
MILEAGE 1076 0 24.823 0.167 5.476 11.000 21.000 25.000 28.000 66.000

The population mean is 24.823, which lies in the confidence interval (21.635, 25.565) found from the sample. It does
not necessarily have to lie in the confidence interval: a 95% confidence level means that about 95% of the
intervals constructed this way from repeated samples would contain the population mean, so some intervals miss it.
Q3.
A.
Descriptive Statistics
95% Lower Bound
N Mean StDev SE Mean for μ
20 0.2950 0.5000 0.0939 0.1405

μ: population mean of GAIN


Known standard deviation = 0.42

Test
Null hypothesis H₀: μ ≤ 0.2

Alternative hypothesis H₁: μ > 0.2

Z-Value P-Value
1.01 0.156

Since this is a one-tailed Z-test and Zcalc < Ztab (1.01 < 1.645), we fail to reject the null hypothesis;
there is not enough evidence at the 5% significance level that net % gain exceeds 0.2.
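
The Z-value and P-value above can be reproduced by hand. The following is a minimal Python sketch of an upper-tailed one-sample z-test, assuming the known standard deviation of 0.42 and a 5% significance level; z_test_upper is an illustrative name, not a Minitab command.

```python
from math import sqrt
from statistics import NormalDist


def z_test_upper(mean, mu0, sigma, n):
    """Upper-tailed one-sample z-test; returns (z, p_value)."""
    z = (mean - mu0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value


# Summary values from the GAIN output (all 20 observations)
z, p = z_test_upper(mean=0.2950, mu0=0.2, sigma=0.42, n=20)
z_crit = NormalDist().inv_cdf(0.95)  # about 1.645 for alpha = 0.05
print(f"z = {z:.2f}, p = {p:.3f}, reject H0: {z > z_crit}")
```

Comparing z with the critical value and comparing p with 0.05 are equivalent decisions, so either can be reported.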

B.
Stem-and-leaf of GAIN N = 20
1 -11 0
1 -10
1 -9
1 -8
1 -7
1 -6
2 -5 0
2 -4
2 -3
3 -2 0
4 -1 0
4 -0
5 0 0
6 1 0
8 2 00
10 3 00
10 4 0
9 5 0
8 6 000
5 7 0
4 8 000
1 9 0

Leaf Unit = 0.01

C.
Descriptive Statistics
95% Lower Bound
N Mean StDev SE Mean for μ
19 0.3684 0.3874 0.0964 0.2099

μ: population mean of GAIN w/o outliers


Known standard deviation = 0.42

Test
Null hypothesis H₀: μ = 0.2
Alternative hypothesis H₁: μ > 0.2
Z-Value P-Value
1.75 0.040

Since this is a one-tailed Z-test and Zcalc > Ztab (1.75 > 1.645), we reject the null
hypothesis that net % gain does not exceed 0.2 in favor of the alternative hypothesis
that net % gain exceeds 0.2.
D.
Because N < 30 and the population standard deviation is unknown, the Z-test is not appropriate
and a t-test is preferred.

Q4.
A.
Statistics
Variable N N* Mean SE Mean StDev Minimum Q1 Median Q3 Maximum
INTEGRATED 227 0 24.916 0.300 4.513 11.000 22.000 25.000 28.000 35.000
STANDARD 192 0 23.000 0.521 7.221 6.000 18.000 23.000 28.750 40.000

B. A non-pooled t-procedure is applied to compare the population means because the
sample standard deviations (4.513 and 7.221) are clearly not equal, and equal population
standard deviations are one of the conditions for applying pooled procedures.
C.
- Non-pooled
Method
μ₁: population mean of INTEGRATED
µ₂: population mean of STANDARD
Difference: μ₁ - µ₂

Equal variances are not assumed for this analysis

Descriptive Statistics
Sample N Mean StDev SE Mean
INTEGRATED 227 24.92 4.51 0.30
STANDARD 192 23.00 7.22 0.52

Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ > 0
T-Value DF P-Value
3.19 309 0.001

- Pooled
Method
μ₁: population mean of INTEGRATED
µ₂: population mean of STANDARD
Difference: μ₁ - µ₂

Equal variances are assumed for this analysis.

Descriptive Statistics
Sample N Mean StDev SE Mean
INTEGRATED 227 24.92 4.51 0.30
STANDARD 192 23.00 7.22 0.52

Estimation for Difference


99% Lower Bound
Difference Pooled StDev for Difference
1.916 5.910 0.563

Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ > 0
T-Value DF P-Value
3.31 417 0.001
The null hypothesis is that, on average, the integrated treatment is not preferred (μ₁ - µ₂ = 0).
Since Tcalc > Ttab at the 1% significance level (3.19 > 2.34 for the non-pooled test and
3.31 > 2.34 for the pooled test), we reject the null hypothesis in favor of the alternative
hypothesis that the integrated treatment is preferred.
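
To show where the two T-values and DF values come from, here is a minimal Python sketch that recomputes both statistics from the summary statistics in part A. The function names welch_t and pooled_t are illustrative, and the Welch degrees of freedom use the usual Satterthwaite approximation, which appears to be what the non-pooled output reports after truncating to a whole number.

```python
from math import sqrt


def welch_t(m1, s1, n1, m2, s2, n2):
    """Non-pooled (Welch) two-sample t statistic and approximate df."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Pooled two-sample t statistic, df, and pooled standard deviation."""
    df = n1 + n2 - 2
    sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    t = (m1 - m2) / (sp * sqrt(1 / n1 + 1 / n2))
    return t, df, sp


# Summary statistics for INTEGRATED and STANDARD from part A
args = (24.916, 4.513, 227, 23.000, 7.221, 192)
t_w, df_w = welch_t(*args)
t_p, df_p, sp = pooled_t(*args)
print(f"non-pooled: t = {t_w:.2f}, df = {int(df_w)}")
print(f"pooled:     t = {t_p:.2f}, df = {df_p}, pooled SD = {sp:.3f}")
```

Small differences in the last digit are expected because the inputs are rounded summary values rather than the raw data.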

D.

Method
μ₁: population mean of INTEGRATED
µ₂: population mean of STANDARD
Difference: μ₁ - µ₂

Equal variances are not assumed for this analysis.

Descriptive Statistics
Sample N Mean StDev SE Mean
INTEGRATED 227 24.92 4.51 0.30
STANDARD 192 23.00 7.22 0.52

Estimation for Difference


98% CI for
Difference Difference
1.916 (0.511, 3.322)
Q5.
Method
μ₁: population mean of Weight Method
µ₂: population mean of Groove Method
Difference: μ₁ - µ₂

Equal variances are not assumed for this analysis.

Descriptive Statistics
Sample N Mean StDev SE Mean
Weight Method 11 23.71 7.19 2.2
Groove Method 11 19.95 5.77 1.7

Estimation for Difference


95% CI for
Difference Difference
3.75 (-2.06, 9.57)

Test
Null hypothesis H₀: μ₁ - µ₂ = 0

Alternative hypothesis H₁: μ₁ - µ₂ ≠ 0

T-Value DF P-Value
1.35 19 0.193

Here, the null hypothesis is that there is no difference between the 2 measurement methods, i.e., μ₁ -
µ₂ = 0. At the 5% significance level Tcalc < Ttab (1.35 < 2.093), so we fail to reject the null
hypothesis; there is not enough evidence that the methods differ.
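
The confidence interval and the two-tailed test give the same conclusion, which a short calculation makes visible. This is a minimal Python sketch, assuming the critical value 2.093 quoted above (t with 19 degrees of freedom) and the rounded summary statistics; welch_interval is an illustrative name.

```python
from math import sqrt


def welch_interval(m1, s1, n1, m2, s2, n2, t_crit):
    """Non-pooled confidence interval for mu1 - mu2 given a critical t value."""
    diff = m1 - m2
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    return diff - t_crit * se, diff + t_crit * se


# Weight Method vs Groove Method, using t_crit = 2.093 from the answer above
low, high = welch_interval(23.71, 7.19, 11, 19.95, 5.77, 11, t_crit=2.093)
contains_zero = low <= 0 <= high
print(f"95% CI: ({low:.2f}, {high:.2f}), contains 0: {contains_zero}")
```

Because the interval contains 0, a difference of zero is plausible at the 5% level, which matches failing to reject the null hypothesis.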
Q6.
Stem-and-leaf of TEMP N = 93
3 96 789
31 97 0000122233444444666677788889
(52) 98 0000000000011122222222334444445555666666666778888888
10 99 0000122334

Leaf Unit = 0.1

b.
The probability plot is approximately linear, and the histogram and boxplot also indicate an
approximately normal distribution, so the one-standard-deviation χ² procedure can be applied.
C.
Descriptive Statistics
N Mean StDev SE Mean 95% CI for μ
93 98.1237 0.6468 0.0653 (97.9956, 98.2517)

μ: population mean of TEMP


Known standard deviation = 0.63

Test
Null hypothesis H₀: μ = 98.6

Alternative hypothesis H₁: μ ≠ 98.6

Z-Value P-Value
-7.29 0.000

The null hypothesis is that mean temperatures do not differ from 98.6.

Since |Zcalc| > Ztab (7.29 > 1.96), the test statistic falls in the critical region, so the
null hypothesis is rejected in favor of the alternative hypothesis that the mean temperature
differs from 98.6.

D.
95% CI for μ
(97.9956, 98.2517)

μ: population mean of TEMP


Known standard deviation = 0.63

The confidence interval is (97.9956, 98.2517), which means that we are 95% confident that the
population mean lies in this interval; that is, about 95% of intervals constructed this way from
similar samples would contain the population mean.
Q7.
Rows: C1   Columns: Worksheet columns

                Diabetes   No Diabetes    All

Less than HS          33           218    251
                   18.73        232.27
                  14.269       -14.269
                 10.8692        0.8765

HS grad               25           389    414
                   30.90        383.10
                  -5.896         5.896
                  1.1250        0.0907

Some college          20           393    413
                   30.82        382.18
                 -10.821        10.821
                  3.7991        0.3064

College grad          17           178    195
                   14.55        180.45
                   2.448        -2.448
                  0.4117        0.0332

All                   95          1178   1273

Cell Contents
Count
Expected count
Residual
Contribution to Chi-square

Chi-Square Test
Null hypothesis H₀: no association exists
Alternative hypothesis H₁: association exists

Chi-Square DF P-Value
Pearson 17.512 3 0.001
Likelihood Ratio 16.099 3 0.001
Since χ²calc > χ²tab (17.512 > 11.3449), the null hypothesis is rejected and the alternative
hypothesis is accepted, i.e., there is an association between educational attainment and
diabetic state at the 1% significance level.
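
The Pearson statistic can be rebuilt from the observed counts alone. Below is a minimal Python sketch, assuming the expected count for each cell is (row total × column total) / grand total, as in the "Expected count" rows of the table; chi_square_statistic is an illustrative name.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and df for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Observed counts: rows are education levels, columns are Diabetes / No Diabetes
counts = [[33, 218], [25, 389], [20, 393], [17, 178]]
stat, df = chi_square_statistic(counts)
CHI2_CRIT = 11.3449  # 1% critical value for 3 df, as quoted in the answer
print(f"chi-square = {stat:.3f}, df = {df}, reject H0: {stat > CHI2_CRIT}")
```

The largest single contribution comes from the "Less than HS" / Diabetes cell, which is why that row drives the rejection.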
Q8.
A.
Coefficients
Term Coef SE Coef T-Value P-Value VIF
Constant -12.87 2.56 -5.03 0.000
Age 0.7033 0.0496 14.18 0.000 1.76
Weight 0.9699 0.0631 15.37 0.000 8.42
BSA 3.78 1.58 2.39 0.033 5.33
Dur 0.0684 0.0484 1.41 0.182 1.24
Pulse -0.0845 0.0516 -1.64 0.126 4.41
Stress 0.00557 0.00341 1.63 0.126 1.83

B.
Correlations
BP Age Weight BSA Dur Pulse
Age 0.659
Weight 0.950 0.407
BSA 0.866 0.378 0.875
Dur 0.293 0.344 0.201 0.131
Pulse 0.721 0.619 0.659 0.465 0.402
Stress 0.164 0.368 0.034 0.018 0.312 0.506

C.
Model Summary
S R-sq R-sq(adj) R-sq(pred)
0.407229 99.62% 99.44% 99.08%

An R² value of 99.62% means that 99.62% of the variation in BP is explained by the
explanatory variables used in this model; the remaining 0.38% is unexplained and is due to
other variables or random error.

D.
Regression Equation
BP = -12.87 + 0.7033 Age + 0.9699 Weight + 3.78 BSA + 0.0684 Dur - 0.0845 Pulse + 0.00557 Stress
Prediction
Fit SE Fit 95% CI 95% PI
114.205 0.0931451 (114.004, 114.406) (113.303, 115.108)

Based on this multiple regression equation, the predicted value of BP is 114.205 for the
given values of the explanatory variables.
E.
Term VIF
Constant
Age 1.76
Weight 8.42
BSA 5.33
Dur 1.24
Pulse 4.41
Stress 1.83

Based on these VIF values, age, stress, and duration of hypertension are only
weakly correlated with the other explanatory variables. Pulse is moderately
correlated, whereas weight and BSA are highly correlated with the other variables
and contribute the most to variance inflation.
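
One way to read these VIF values is through the identity VIF = 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The following minimal Python sketch inverts the identity for the VIF values in the table; it uses only the reported VIFs, not the raw data.

```python
# VIF values copied from the coefficients table
vif = {"Age": 1.76, "Weight": 8.42, "BSA": 5.33,
       "Dur": 1.24, "Pulse": 4.41, "Stress": 1.83}

# R^2 of each predictor regressed on the remaining predictors: 1 - 1/VIF
r_squared = {term: 1 - 1 / value for term, value in vif.items()}

# List predictors from most to least explained by the others
for term, r2 in sorted(r_squared.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{term:<7} VIF = {vif[term]:>5.2f}  R^2 on other predictors = {r2:.3f}")
```

On this reading, a large share of the variation in weight and BSA is accounted for by the remaining predictors, while duration is mostly independent of them.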

Q9.

Descriptive Statistics
99% Lower Bound
N Mean StDev SE Mean for μ
40 81.44 8.09 1.20 78.64

μ: population mean of Price


Known standard deviation = 7.61

Test
Null hypothesis H₀: μ = 78.01

Alternative hypothesis H₁: μ > 78.01

Z-Value P-Value
2.85 0.002

The null hypothesis is that the mean remained the same or decreased.
Since Zcalc > Ztab (2.85 > 2.33) at the 1% significance level, the null hypothesis is rejected and the
alternative hypothesis, i.e., that the mean increased, is accepted.
Q10.
Method
μ₁: population mean of Psychotic
µ₂: population mean of Non psychotic
Difference: μ₁ - µ₂

Equal variances are not assumed for this analysis.

Descriptive Statistics

Sample N Mean StDev SE Mean


Psychotic 10 0.02426 0.00514 0.0016
Non psychotic 15 0.01643 0.00470 0.0012

Estimation for Difference


99% Lower Bound
Difference for Difference
0.00783 0.00266
Test
Null hypothesis H₀: μ₁ - µ₂ = 0
Alternative hypothesis H₁: μ₁ - µ₂ > 0
T-Value DF P-Value
3.86 18 0.001
In this one-tailed test, the null hypothesis is that there is no difference in dopamine activity
between psychotic and non-psychotic samples.
Since Tcalc > Ttab (3.86 > 2.55) at the 1% significance level, the null hypothesis is rejected in favor of
the alternative hypothesis that dopamine activity is higher in psychotic patients.
Q11.
Rows: C1   Columns: Worksheet columns

                 red      blue    purple    All

get better       169       191       196    556
               184.1     184.1     187.8
             -15.106     6.894     8.212
              1.2394    0.2582    0.3591

get worse        129        89       106    324
               107.3     107.3     109.4
              21.715   -18.285    -3.430
              4.3953    3.1163    0.1075

stay same        149       178       171    498
               164.9     164.9     168.2
             -15.901    13.099     2.801
              1.5332    1.0406    0.0467

don't know        53        42        37    132
                43.7      43.7      44.6
               9.291    -1.709    -7.583
              1.9751    0.0668    1.2897

All              500       500       510   1510

Cell Contents
Count
Expected count
Residual
Contribution to Chi-square

Chi-Square Test
Chi-Square DF P-Value
Pearson 15.428 6 0.017
Likelihood Ratio 15.360 6 0.018

Since χ²calc > χ²tab (15.428 > 12.592), the null hypothesis (no association, i.e., the views are
homogeneous across the groups) is rejected at the 5% significance level, and the alternative
hypothesis that the views differ between the groups is accepted.
Q12.
A. Area and population
Correlations
POPULATION
AREA 0.109

B. Plants and populations


Correlations
POPULATION
PLANTS 0.721

C. Plants and area


Correlations
AREA
PLANTS -0.309

D.
DF = 48
Area and population have a positive correlation, but it is weak.
Plants and population are significantly positively correlated.
Plants and area are moderately and negatively correlated.
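
With DF = 48, each correlation can be checked for significance using t = r·√DF / √(1 - r²). The minimal Python sketch below does this for the three coefficients, assuming a two-sided test at the 5% level with a critical value of about 2.011 taken from a t table; the significance level is an assumption here, since the answer above does not state one, and correlation_t is an illustrative name.

```python
from math import sqrt


def correlation_t(r, df):
    """t statistic for testing H0: rho = 0, where df = n - 2."""
    return r * sqrt(df) / sqrt(1 - r**2)


DF = 48
T_CRIT = 2.011  # assumed two-sided 5% critical value for t with 48 df

correlations = {
    "AREA vs POPULATION": 0.109,
    "PLANTS vs POPULATION": 0.721,
    "PLANTS vs AREA": -0.309,
}
results = {pair: correlation_t(r, DF) for pair, r in correlations.items()}
for pair, t in results.items():
    print(f"{pair:<21} t = {t:6.2f}  significant: {abs(t) > T_CRIT}")
```

Under these assumptions the area and population correlation is not distinguishable from zero, while the other two are, with plants and population by far the strongest.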
Q13.

Model Summary
S R-sq R-sq(adj) R-sq(pred)
2.32115 9.71% 6.49% 0.00%
Model Summary
S R-sq R-sq(adj) R-sq(pred)
2.69933 15.56% 12.55% 0.00%

B.
A regression line is not reasonable here because the R² value is low and the
variables have only a weak positive correlation, which the scatterplot also
indicates.
