
Last Name 1

Student’s Name

Professor’s Name

Course Number

Date

Eddie Database STATA Task

1. The variables used in this study are PHYSHLTH (number of days physical health was not
good in a month), MENTHLTH (number of days mental health was not good in a
month), POORHLTH (number of days health was poor in a month), RACE3
(classification of races and ethnicities), SEX (gender of the respondent) and UNHEALTH
(sum of physically and mentally unhealthy days).

2. The independent variables are RACE3 and SEX. The other variables are dependent.

3. The distribution of the data can be determined by examining its median, mean and
standard deviation. It can also be determined visually through boxplots or histograms.
From the distributions (screenshots in the appendix), we can see that the variables are not
normally distributed, but skewed. Non-parametric tests are therefore more appropriate for
them.

4. If the confidence interval contains zero, we can deduce that a non-significant
relationship/difference exists. This was only observed in the other race (non-Hispanic)
group, for mental health, 95% CI (-0.31, 8.13), and poor health, 95% CI (-0.42, 7.67).
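
The confidence-interval check in item 4 can be reproduced from the appendix summary alone. The Python sketch below rebuilds the 95% interval for MENTHLTH in the other race (non-Hispanic) group from the reported mean, standard error and number of observations. The critical value 2.0796 is the tabulated two-sided 95% value of Student's t for 21 degrees of freedom; it is hard-coded here as an assumption rather than computed, and the function names are illustrative only.

```python
# Rebuild a 95% confidence interval from a reported mean and standard error.
# Inputs are the MENTHLTH row for "Other race only, non-hispanic" in the appendix.
mean = 3.909091
std_err = 2.027846
n_obs = 22

# Tabulated two-sided 95% critical value of Student's t for n_obs - 1 = 21 df.
# Hard-coded assumption; a statistics package would normally compute this.
T_CRIT_DF21 = 2.0796


def confidence_interval(mean, std_err, t_crit):
    """Return (lower, upper) for mean +/- t_crit * std_err."""
    margin = t_crit * std_err
    return mean - margin, mean + margin


def contains_zero(lower, upper):
    """True when the interval spans zero."""
    return lower <= 0.0 <= upper


lower, upper = confidence_interval(mean, std_err, T_CRIT_DF21)
print(f"95% CI: ({lower:.2f}, {upper:.2f}); contains zero: {contains_zero(lower, upper)}")
```

The same two functions can be applied to any other row of the appendix tables by swapping in that row's mean, standard error and the critical value for its degrees of freedom.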

Results from Stata

Distribution plot of the dependent variables


[Figure: histograms of PHYSHLTH, MENTHLTH, POORHLTH and Sum of unhealthy days; vertical axis: Frequency]

Figure 1: Distribution of dependent variables


These variables are not normally distributed, so non-parametric tests are the safer choice.
Parametric tests can still be run, but their results may not be reliable.
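
One way to put a number on this kind of departure from normality is to compare the mean with the median and to compute moment-based skewness and kurtosis, the same quantities that appear in the detailed summaries in the appendix. The Python sketch below does this for a small made-up sample; the values in `days` are hypothetical and are not taken from the data set, and `moment_shape` is an illustrative helper name.

```python
import statistics


def moment_shape(values):
    """Return mean, median, and moment-based skewness and kurtosis of a sample."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments with divisor n (population-style moments).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "skewness": m3 / m2 ** 1.5,  # 0 for a symmetric distribution
        "kurtosis": m4 / m2 ** 2,    # 3 for a normal distribution
    }


# Hypothetical responses: days (out of 30) on which health was not good.
days = [0, 0, 0, 0, 0, 0, 2, 5, 10, 30]
shape = moment_shape(days)
for name, value in shape.items():
    print(f"{name}: {value:.3f}")
```

A mean well above the median together with a clearly positive skewness is the pattern of a right-skewed variable with many zeros and a few large values.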

Table 1: T-test on physical health by sex


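Two-sample t test with equal variances

   Group |     Obs        Mean   Std. Err.   Std. Dev.    [95% Conf. Interval]
---------+--------------------------------------------------------------------
    Male |     172     3.30814    .5863105     7.68939    2.150801    4.465478
  Female |     323    3.560372    .4513479    8.111715    2.672408    4.448335
---------+--------------------------------------------------------------------
combined |     495    3.472727    .3578025    7.960604    2.769725     4.17573
---------+--------------------------------------------------------------------
    diff |            -.252232    .7520966               -1.729942    1.225478

    diff = mean(Male) - mean(Female)                              t = -0.3354
Ho: diff = 0                                     degrees of freedom =     493

    Ha: diff < 0              Ha: diff != 0              Ha: diff > 0
 Pr(T < t) = 0.3687      Pr(|T| > |t|) = 0.7375      Pr(T > t) = 0.6313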

Conclusion:

The p-value obtained is 0.7375, which is greater than the alpha of 0.05. We fail to reject the null

hypothesis. There is no significant difference between the mean physical health of males and

females.

Table 2: T-test on mental health by sex


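Two-sample t test with equal variances

   Group |     Obs        Mean   Std. Err.   Std. Dev.    [95% Conf. Interval]
---------+--------------------------------------------------------------------
    Male |     171    2.070175    .4284819    5.603128    1.224345    2.916006
  Female |     319    4.163009    .4819193    8.607355    3.214856    5.111163
---------+--------------------------------------------------------------------
combined |     490    3.432653     .350191    7.751809    2.744588    4.120718
---------+--------------------------------------------------------------------
    diff |           -2.092834     .729321               -3.525831    -.659837

    diff = mean(Male) - mean(Female)                              t = -2.8696
Ho: diff = 0                                     degrees of freedom =     488

    Ha: diff < 0              Ha: diff != 0              Ha: diff > 0
 Pr(T < t) = 0.0021      Pr(|T| > |t|) = 0.0043      Pr(T > t) = 0.9979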

Conclusion:

The p-value obtained is 0.0043, which is less than the alpha of 0.05. We reject the null

hypothesis. There is a significant difference between the mean mental health of males and

females.

Table 3: T-test on poor health by sex


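Two-sample t test with equal variances

   Group |     Obs        Mean   Std. Err.   Std. Dev.    [95% Conf. Interval]
---------+--------------------------------------------------------------------
    Male |     173     1.67052    .4127057    5.428296    .8559002     2.48514
  Female |     323    2.136223    .3532168    6.348083    1.441319    2.831127
---------+--------------------------------------------------------------------
combined |     496     1.97379    .2712811    6.041717    1.440786    2.506795
---------+--------------------------------------------------------------------
    diff |           -.4657027    .5694062               -1.584459    .6530539

    diff = mean(Male) - mean(Female)                              t = -0.8179
Ho: diff = 0                                     degrees of freedom =     494

    Ha: diff < 0              Ha: diff != 0              Ha: diff > 0
 Pr(T < t) = 0.2069      Pr(|T| > |t|) = 0.4138      Pr(T > t) = 0.7931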

Conclusion:

The p-value obtained is 0.4138, which is greater than the alpha of 0.05. We fail to reject the null
hypothesis. There is no significant difference between the mean number of poor health days of
males and females.



Table 4: T-test on unhealthy days by sex


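Two-sample t test with equal variances

   Group |     Obs        Mean   Std. Err.   Std. Dev.    [95% Conf. Interval]
---------+--------------------------------------------------------------------
    Male |     169    4.828402    .6779136    8.812877    3.490075    6.166729
  Female |     316    6.338608    .5882854    10.45759    5.181142    7.496073
---------+--------------------------------------------------------------------
combined |     485    5.812371    .4510074    9.932409    4.926197    6.698545
---------+--------------------------------------------------------------------
    diff |           -1.510205    .9450234                -3.36707    .3466595

    diff = mean(Male) - mean(Female)                              t = -1.5981
Ho: diff = 0                                     degrees of freedom =     483

    Ha: diff < 0              Ha: diff != 0              Ha: diff > 0
 Pr(T < t) = 0.0553      Pr(|T| > |t|) = 0.1107      Pr(T > t) = 0.9447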

Conclusion:

The p-value obtained is 0.1107, which is greater than the alpha of 0.05. We fail to reject the null

hypothesis. There is no significant difference between the mean number of unhealthy days of

males and females.

2. An independent two-sample t-test was used in this problem because we were comparing the
means of two independent groups (males and females), under the assumption that the group
means are approximately normally distributed.
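
The equal-variance t statistic can be recomputed from the group summaries alone, which is a useful check on the pasted output. The Python sketch below uses the group sizes, means and standard deviations reported in Table 2 (mental health by sex) and the standard pooled-variance formula; `pooled_t_test` is an illustrative helper name, and the p-value is not computed here.

```python
import math


def pooled_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic assuming equal variances, from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference in means.
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / std_err
    return t_stat, df, std_err


# Table 2 inputs: Male (group 1) and Female (group 2), MENTHLTH.
t_stat, df, std_err = pooled_t_test(171, 2.070175, 5.603128,
                                    319, 4.163009, 8.607355)
print(f"t = {t_stat:.4f}, df = {df}, std. err. of diff = {std_err:.6f}")
```

The result can be compared against the t, degrees of freedom and standard error of the difference shown in Table 2.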



Table 5: ANOVA of physical health by race


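Race/Ethnicity |         Summary of PHYSHLTH
               |        Mean   Std. Dev.       Freq.
---------------+------------------------------------
     White onl |   2.9973545   7.3062956         378
     Black onl |   4.6981132   9.6627455          53
     Other rac |   3.2608696   6.9492798          23
      Hispanic |   6.2162162   10.656597          37
---------------+------------------------------------
         Total |   3.4358452   7.8992029         491

                        Analysis of Variance
        Source            SS     df            MS         F   Prob > F
----------------------------------------------------------------------
Between groups    443.856906      3    147.952302      2.39     0.0679
 Within groups    30130.8722    487    61.8703742
----------------------------------------------------------------------
         Total    30574.7291    490    62.3974064

Bartlett's test for equal variances: chi2(3) = 17.6397   Prob>chi2 = 0.001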

Conclusion:

The p-value obtained is 0.0679, which is greater than the alpha of 0.05. Also, the F-statistic is

2.39, at 3 degrees of freedom between groups. We fail to reject the null hypothesis. There is no

significant difference in the physical health variable according to races.



Table 6: ANOVA of mental health by race


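Race/Ethnicity |         Summary of MENTHLTH
               |        Mean   Std. Dev.       Freq.
---------------+------------------------------------
     White onl |   2.6461126   6.5838853         373
     Black onl |   7.1481481   11.093781          54
     Other rac |   3.9090909   9.5114422          22
      Hispanic |   5.2162162   9.1230329          37
---------------+------------------------------------
         Total |    3.399177   7.6848438         486

                        Analysis of Variance
        Source            SS     df            MS         F   Prob > F
----------------------------------------------------------------------
Between groups    1098.36954      3     366.12318      6.41     0.0003
 Within groups    27544.1901    482    57.1456227
----------------------------------------------------------------------
         Total    28642.5597    485    59.0568241

Bartlett's test for equal variances: chi2(3) = 38.5537   Prob>chi2 = 0.000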

Conclusion:

The p-value obtained is 0.0003, which is less than the alpha of 0.05. Also, the F-statistic is 6.41,

at 3 degrees of freedom between groups. We reject the null hypothesis. There is a significant

difference in the mental health variable according to races.



Table 7: ANOVA of Poor health by race


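Race/Ethnicity |         Summary of POORHLTH
               |        Mean   Std. Dev.       Freq.
---------------+------------------------------------
     White onl |   1.7792553   5.8548392         376
     Black onl |   2.0727273   5.5739292          55
     Other rac |       3.625   9.5862787          24
      Hispanic |   2.5405405   5.6499488          37
---------------+------------------------------------
         Total |   1.9593496   6.0357366         492

                        Analysis of Variance
        Source            SS     df            MS         F   Prob > F
----------------------------------------------------------------------
Between groups    91.9855203      3    30.6618401      0.84     0.4719
 Within groups    17795.2015    488    36.4655768
----------------------------------------------------------------------
         Total     17887.187    491    36.4301161

Bartlett's test for equal variances: chi2(3) = 15.2150   Prob>chi2 = 0.002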

The p-value obtained is 0.4719, which is greater than the alpha of 0.05. Also, the F-statistic is

0.84, at 3 degrees of freedom between groups. We fail to reject the null hypothesis. There is no

significant difference in the poor health variable according to races.



Table 8: ANOVA of unhealthy days by race


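Race/Ethnicity |    Summary of Sum of unhealthy days
               |        Mean   Std. Dev.       Freq.
---------------+------------------------------------
     White onl |   4.9946092   9.0610526         371
     Black onl |   8.8269231   12.189229          52
     Other rac |   5.7142857   11.109198          21
      Hispanic |   9.5945946   12.255292          37
---------------+------------------------------------
         Total |   5.7941788   9.9037686         481

                        Analysis of Variance
        Source            SS     df            MS         F   Prob > F
----------------------------------------------------------------------
Between groups    1249.98754      3    416.662514      4.34     0.0050
 Within groups    45830.6362    477    96.0809982
----------------------------------------------------------------------
         Total    47080.6237    480    98.0846327

Bartlett's test for equal variances: chi2(3) = 14.6910   Prob>chi2 = 0.002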

The p-value obtained is 0.005, which is less than the alpha of 0.05. Also, the F-statistic is 4.34, at

3 degrees of freedom between groups. We reject the null hypothesis. There is a significant

difference in the number of unhealthy days variable according to races.

4. One-way ANOVA was used because we wished to compare the means across a grouping
variable with more than two levels (the four RACE3 categories). As with the t-tests, the
variables are assumed to have approximately normal distributions.
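
The one-way ANOVA F statistic can also be rebuilt from the per-group summaries that precede each ANOVA table. The Python sketch below uses the group sizes, means and standard deviations from Table 6 (mental health by race); `anova_from_summaries` is an illustrative helper name, and the p-value is not computed here.

```python
def anova_from_summaries(groups):
    """One-way ANOVA from (n, mean, sd) tuples, one tuple per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-group sum of squares: recovered from each group's variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within, ss_between, ss_within


# Table 6 inputs (MENTHLTH): White, Black, Other race, Hispanic.
menthlth_groups = [
    (373, 2.6461126, 6.5838853),
    (54, 7.1481481, 11.093781),
    (22, 3.9090909, 9.5114422),
    (37, 5.2162162, 9.1230329),
]
f_stat, df_b, df_w, ss_b, ss_w = anova_from_summaries(menthlth_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}; SS between = {ss_b:.2f}; SS within = {ss_w:.2f}")
```

The sums of squares, degrees of freedom and F value can be checked against the Analysis of Variance block of Table 6.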

5. Post-hoc testing is used to determine which pairs of groups account for a significant
difference found by the ANOVA. The ANOVA can tell us whether the group means differ
significantly, but it does not tell us precisely which levels differ or in which direction.
Post-hoc testing is not required for ANOVAs that do not give a significant result. For the
significant ANOVAs in our analysis (mental health and unhealthy days), the post-hoc tests are
shown in the tables below:



Table 9: post-Hoc test of ANOVA of mental health by race


Pairwise comparisons of means with equal variances

over: RACE3

Number of comparisons, RACE3: 6

                                                          |                          Tukey               Tukey
MENTHLTH                                                  |   Contrast  Std. Err.       t   P>|t|    [95% Conf. Interval]
----------------------------------------------------------+---------------------------------------------------------------
RACE3                                                     |
Black only, non-hispanic vs White only, non-hispanic      |   4.502036   1.100662    4.09   0.000    1.664511     7.33956
Other race only, non-hispanic vs White only, non-hispanic |   1.262978   1.658534    0.76   0.872   -3.012747    5.538704
Hispanic vs White only, non-hispanic                      |   2.570104   1.302951    1.97   0.200   -.7889247    5.929132
Other race only, non-hispanic vs Black only, non-hispanic |  -3.239057    1.91201   -1.69   0.328   -8.168248    1.690133
Hispanic vs Black only, non-hispanic                      |  -1.931932   1.613297   -1.20   0.629   -6.091038    2.227174
Hispanic vs Other race only, non-hispanic                 |   1.307125   2.035192    0.64   0.918    -3.93963    6.553881

The significant difference in mental health is observed in [Black only, non-Hispanic vs White

only, non-Hispanic] (p < 0.01).
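
The contrast, standard error and t columns of the pairwise table follow directly from the group means, the group sizes and the within-groups mean square of the ANOVA. The Python sketch below recomputes them for mental health using the Table 6 values; it is a sketch only and does not compute the Tukey-adjusted p-values or intervals, which require the studentized range distribution. The short group labels and the helper name `pairwise_contrasts` are illustrative.

```python
import math
from itertools import combinations

# Within-groups mean square from the Table 6 ANOVA.
MS_WITHIN = 57.1456227

# Group label -> (n, mean) for MENTHLTH, taken from Table 6.
menthlth = {
    "White": (373, 2.6461126),
    "Black": (54, 7.1481481),
    "Other": (22, 3.9090909),
    "Hispanic": (37, 5.2162162),
}


def pairwise_contrasts(groups, ms_within):
    """Return {(later, earlier): (contrast, std_err, t)} for every pair of groups."""
    rows = {}
    for (name_a, (n_a, mean_a)), (name_b, (n_b, mean_b)) in combinations(groups.items(), 2):
        contrast = mean_b - mean_a
        # Equal-variance standard error using the pooled within-groups mean square.
        std_err = math.sqrt(ms_within * (1 / n_a + 1 / n_b))
        rows[(name_b, name_a)] = (contrast, std_err, contrast / std_err)
    return rows


rows = pairwise_contrasts(menthlth, MS_WITHIN)
for (later, earlier), (contrast, std_err, t_stat) in rows.items():
    print(f"{later} vs {earlier}: contrast = {contrast:.6f}, "
          f"std. err. = {std_err:.6f}, t = {t_stat:.2f}")
```

Each printed row can be compared with the Contrast, Std. Err. and t columns of Table 9.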



Table 10: post-Hoc test of ANOVA of unhealthy days by race


Pairwise comparisons of means with equal variances

over: RACE3

Number of comparisons, RACE3: 6

                                                          |                          Tukey               Tukey
UNHEALTH                                                  |   Contrast  Std. Err.       t   P>|t|    [95% Conf. Interval]
----------------------------------------------------------+---------------------------------------------------------------
RACE3                                                     |
Black only, non-hispanic vs White only, non-hispanic      |   3.832314   1.451444    2.64   0.042    .0903314    7.574296
Other race only, non-hispanic vs White only, non-hispanic |   .7196765   2.198696    0.33   0.988   -4.948804    6.388157
Hispanic vs White only, non-hispanic                      |   4.599985     1.6899    2.72   0.034    .2432372    8.956734
Other race only, non-hispanic vs Black only, non-hispanic |  -3.112637   2.534363   -1.23   0.609   -9.646505     3.42123
Hispanic vs Black only, non-hispanic                      |   .7676715   2.108197    0.36   0.983   -4.667493    6.202836
Hispanic vs Other race only, non-hispanic                 |   3.880309   2.678072    1.45   0.469   -3.024055    10.78467

Significant differences in the number of unhealthy days are observed in [Black only,
non-Hispanic vs White only, non-Hispanic] (p = 0.042) and in [Hispanic vs White only,
non-Hispanic] (p = 0.034).



-> RACE3 = White only, non-hispanic

PHYSHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 378
25% 0 0 Sum of Wgt. 378

50% 0 Mean 2.997354


Largest Std. Dev. 7.306296
75% 2 30
90% 10 30 Variance 53.38196
95% 25 30 Skewness 2.845068
99% 30 30 Kurtosis 10.06776

MENTHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 373
25% 0 0 Sum of Wgt. 373

50% 0 Mean 2.646113


Largest Std. Dev. 6.583885
75% 2 30
90% 7 30 Variance 43.34755
95% 15 30 Skewness 3.238909
99% 30 30 Kurtosis 12.99682

POORHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 376
25% 0 0 Sum of Wgt. 376

50% 0 Mean 1.779255


Largest Std. Dev. 5.854839
75% 0 30
90% 5 30 Variance 34.27914
95% 14 30 Skewness 4.007559
99% 30 30 Kurtosis 18.59005

Sum of unhealthy days

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 371
25% 0 0 Sum of Wgt. 371

50% 0 Mean 4.994609


Largest Std. Dev. 9.061053
75% 5 30
90% 24 30 Variance 82.10267
95% 30 30 Skewness 2.021349
99% 30 30 Kurtosis 5.669441

-> RACE3 = Black only, non-hispanic

PHYSHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 53
25% 0 0 Sum of Wgt. 53

50% 0 Mean 4.698113


Largest Std. Dev. 9.662745
75% 4 30
90% 29 30 Variance 93.36865
95% 30 30 Skewness 2.052754
99% 30 30 Kurtosis 5.555446

MENTHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 54
25% 0 0 Sum of Wgt. 54

50% 0 Mean 7.148148


Largest Std. Dev. 11.09378
75% 14 30
90% 30 30 Variance 123.072
95% 30 30 Skewness 1.272582
99% 30 30 Kurtosis 2.988067

POORHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 55
25% 0 0 Sum of Wgt. 55

50% 0 Mean 2.072727


Largest Std. Dev. 5.573929
75% 0 14
90% 5 15 Variance 31.06869
95% 15 20 Skewness 3.408357
99% 30 30 Kurtosis 14.94118

Sum of unhealthy days

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 52
25% 0 0 Sum of Wgt. 52

50% 0 Mean 8.826923


Largest Std. Dev. 12.18923
75% 18 30
90% 30 30 Variance 148.5773
95% 30 30 Skewness .9620966
99% 30 30 Kurtosis 2.156771

-> RACE3 = Other race only, non-hispanic

PHYSHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 23
25% 0 0 Sum of Wgt. 23

50% 0 Mean 3.26087


Largest Std. Dev. 6.94928
75% 1 10
90% 20 20 Variance 48.29249
95% 20 20 Skewness 1.913914
99% 20 20 Kurtosis 4.877515

MENTHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 22
25% 0 0 Sum of Wgt. 22

50% 0 Mean 3.909091


Largest Std. Dev. 9.511442
75% 0 6
90% 20 20 Variance 90.46753
95% 30 30 Skewness 2.187659
99% 30 30 Kurtosis 6.083538

POORHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 24
25% 0 0 Sum of Wgt. 24

50% 0 Mean 3.625


Largest Std. Dev. 9.586279
75% 0 2
90% 25 25 Variance 91.89674
95% 30 30 Skewness 2.29411
99% 30 30 Kurtosis 6.339464

Sum of unhealthy days

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 21
25% 0 0 Sum of Wgt. 21

50% 0 Mean 5.714286


Largest Std. Dev. 11.1092
75% 2 20
90% 30 30 Variance 123.4143
95% 30 30 Skewness 1.611376
99% 30 30 Kurtosis 3.794487

-> RACE3 = Hispanic

PHYSHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 37
25% 0 0 Sum of Wgt. 37

50% 0 Mean 6.216216


Largest Std. Dev. 10.6566
75% 7 30
90% 30 30 Variance 113.5631
95% 30 30 Skewness 1.563768
99% 30 30 Kurtosis 3.773317

MENTHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 37
25% 0 0 Sum of Wgt. 37

50% 0 Mean 5.216216


Largest Std. Dev. 9.123033
75% 6 25
90% 25 25 Variance 83.22973
95% 25 25 Skewness 1.594031
99% 30 30 Kurtosis 3.997652

POORHLTH

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 37
25% 0 0 Sum of Wgt. 37

50% 0 Mean 2.540541


Largest Std. Dev. 5.649949
75% 1 14
90% 14 15 Variance 31.92192
95% 20 20 Skewness 2.207911
99% 20 20 Kurtosis 6.49463

Sum of unhealthy days

Percentiles Smallest
1% 0 0
5% 0 0
10% 0 0 Obs 37
25% 0 0 Sum of Wgt. 37

50% 3 Mean 9.594595


Largest Std. Dev. 12.25529
75% 22 30
90% 30 30 Variance 150.1922
95% 30 30 Skewness .8504885
99% 30 30 Kurtosis 1.969164

-> RACE3 = White only, non-hispanic

Variable |     Obs        Mean   Std. Err.    [95% Conf. Interval]
---------+--------------------------------------------------------
PHYSHLTH |     378    2.997354    .3757953    2.258437    3.736272
MENTHLTH |     373    2.646113    .3409007    1.975779    3.316447
POORHLTH |     376    1.779255    .3019403    1.185547    2.372964
UNHEALTH |     371    4.994609    .4704264    4.069564    5.919654

-> RACE3 = Black only, non-hispanic

Variable |     Obs        Mean   Std. Err.    [95% Conf. Interval]
---------+--------------------------------------------------------
PHYSHLTH |      53    4.698113     1.32728    2.034731    7.361496
MENTHLTH |      54    7.148148    1.509672    4.120129    10.17617
POORHLTH |      55    2.072727    .7515885    .5658831    3.579571
UNHEALTH |      52    8.826923    1.690342     5.43342    12.22043

-> RACE3 = Other race only, non-hispanic

Variable |     Obs        Mean   Std. Err.    [95% Conf. Interval]
---------+--------------------------------------------------------
PHYSHLTH |      23     3.26087    1.449025    .2557756    6.265964
MENTHLTH |      22    3.909091    2.027846   -.3080463    8.126228
POORHLTH |      24       3.625    1.956791   -.4229305     7.67293
UNHEALTH |      21    5.714286    2.424226    .6574393    10.77113

-> RACE3 = Hispanic

Variable |     Obs        Mean   Std. Err.    [95% Conf. Interval]
---------+--------------------------------------------------------
PHYSHLTH |      37    6.216216    1.751934     2.66313    9.769303
MENTHLTH |      37    5.216216    1.499817    2.174446    8.257987
POORHLTH |      37    2.540541    .9288459    .6567538    4.424327
UNHEALTH |      37    9.594595    2.014758    5.508477    13.68071
