Comparing Academic Performance Between Groups

Altynai Kubatova Fin-18
Homework Assignment 2
1. T-test:
First, Cumulative GPA, an important grade in university/college, is chosen as the dependent variable
because it shows students' study performance.
[Figure: histogram of CumGPA. x-axis: CumGPA (0 to 4); y-axis: Density (0 to 1)]
As the histogram shows, Cum GPA is approximately normally distributed. Universities and colleges
usually require a cumulative GPA higher than 3.0.
Therefore, the null hypothesis is that the mean is greater than or equal to 3.0.
H0: μ ≥ 3.0 (null hypothesis)
H1: μ<3.0 (alternative hypothesis)
a. One-Sample Test for the Mean (σ Unknown)
ttest CumGPA ==3.0
One-sample t test
------------------------------------------------------------------------------
Variable | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
CumGPA | 732 2.080861 .0365773 .9896168 2.009052 2.15267
------------------------------------------------------------------------------
mean = mean(CumGPA) t = -25.1287
Ho: mean = 3.0 degrees of freedom = 731
Ha: mean < 3.0 Ha: mean != 3.0 Ha: mean > 3.0
Pr(T < t) = 0.0000 Pr(|T| > |t|) = 0.0000 Pr(T > t) = 1.0000
Our null hypothesis is μ ≥ 3.0, so the test is one-tailed (lower tail). The sample mean is 2.08, which is
less than 3.0.
Decision making: for a lower-tailed test at α = 0.05 the critical value is about -1.645 (the ±1.96 values apply to a two-tailed test).

Reject H0 if Tstat < -1.645. Our t value is -25.1287, which is far inside the rejection region.
So we reject the null hypothesis (μ ≥ 3.0): the mean Cum GPA is significantly less than 3.0.
P-value approach:
If P-value ≥ α, fail to reject H0; if P-value < α, reject H0.
The one-tailed p-value, Pr(T < t), is 0.0000, and alpha is 0.05 (95% confidence level). So we reject
the null hypothesis (μ ≥ 3.0): the true mean Cum GPA is below 3.0.
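The t statistic reported by Stata can be re-derived from the summary numbers in the output. The sketch below is not part of the original Stata analysis: it is a small Python check, and it assumes that with 731 degrees of freedom the standard normal is a close enough stand-in for the t distribution when computing the critical value and p-value.

```python
import math
from statistics import NormalDist

def one_sample_stat(mean, sd, n, mu0):
    """Return (standard error, test statistic) for H0: mu = mu0."""
    se = sd / math.sqrt(n)
    return se, (mean - mu0) / se

# Values copied from the ttest output above.
se, t_stat = one_sample_stat(mean=2.080861, sd=0.9896168, n=732, mu0=3.0)

# Assumption: normal approximation to the t distribution (df = 731).
alpha = 0.05
lower_crit = NormalDist().inv_cdf(alpha)  # lower-tail critical value
p_lower = NormalDist().cdf(t_stat)        # approximate Pr(T < t)

print(f"SE = {se:.7f}, t = {t_stat:.4f}")
print(f"critical value = {lower_crit:.3f}, reject H0: {t_stat < lower_crit}")
```

The statistic lands far below the lower-tail critical value, which agrees with the decision above.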
b. Comparing the Means of Two Independent Populations
H0: μ1-μ2=0
H1: μ1-μ2≠0
For the two independent groups, Rank in HS is compared between football players and non-players.
sdtest HSRank , by( FootballPlayer1 )
Variance ratio test

------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
0 | 494 100.5951 4.784699 106.3453 91.19422 109.9961
1 | 238 123.0756 7.453738 114.9907 108.3916 137.7597
---------+--------------------------------------------------------------------
combined | 732 107.9044 4.053144 109.6598 99.94718 115.8616
------------------------------------------------------------------------------
ratio = sd(0) / sd(1) f = 0.8553
Ho: ratio = 1 degrees of freedom = 493, 237
Ha: ratio < 1 Ha: ratio != 1 Ha: ratio > 1

Pr(F < f) = 0.0774 2*Pr(F < f) = 0.1548 Pr(F > f) = 0.9226
The variance ratio test compares the two group variances to decide between the equal-variance and
unequal-variance t test. The null hypothesis is that the ratio equals 1, meaning equal variances.
Dividing sd(0) by sd(1) gives 106.3453 / 114.9907 ≈ 0.925, and the F statistic is the square of this
ratio, f = 0.8553. The two-sided p-value is greater than alpha, 2*Pr(F < f) = 0.1548 > 0.05, so we fail
to reject H0. The data are consistent with equal variances in the two groups.
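The relation between the standard deviation ratio and the F statistic can be checked by hand. The following Python sketch is an illustration added for this purpose, not part of the Stata workflow; the function name is hypothetical, and it only reproduces the statistic (the F-distribution p-value is left to Stata).

```python
def variance_ratio(sd_a, sd_b):
    """Return (sd ratio, F statistic), where F is the ratio of variances."""
    sd_ratio = sd_a / sd_b
    return sd_ratio, sd_ratio ** 2

# Standard deviations of HSRank for groups 0 and 1, from the sdtest output.
sd_ratio, f_stat = variance_ratio(106.3453, 114.9907)
df_num, df_den = 494 - 1, 238 - 1  # degrees of freedom reported by Stata

print(f"sd ratio = {sd_ratio:.4f}, F = {f_stat:.4f}, df = ({df_num}, {df_den})")
```

The same function applied to the other sdtest tables in this assignment should reproduce their f values from the listed standard deviations.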
ttest HSRank , by( FootballPlayer1 )
Two-sample t test with equal variances

------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
0 | 494 100.5951 4.784699 106.3453 91.19422 109.9961
1 | 238 123.0756 7.453738 114.9907 108.3916 137.7597
---------+--------------------------------------------------------------------
combined | 732 107.9044 4.053144 109.6598 99.94718 115.8616
---------+--------------------------------------------------------------------
diff | -22.48049 8.618545 -39.40058 -5.560397
------------------------------------------------------------------------------
diff = mean(0) - mean(1) t = -2.6084
Ho: diff = 0 degrees of freedom = 730
Ha: diff < 0 Ha: diff != 0 Ha: diff > 0

Pr(T < t) = 0.0046 Pr(|T| > |t|) = 0.0093 Pr(T > t) = 0.9954
The t statistic is -2.6084. Reject H0 if Tstat > +1.96 or Tstat < -1.96. Since -2.6084 is less
than -1.96, we reject H0.
The two-sided p-value is 0.0093, which is less than alpha (0.05), so the p-value approach also rejects H0.
The difference between the two means is -22.48049, and its 95% confidence interval (-39.40, -5.56) does
not include zero. The mean HS rank of football players and non-players is therefore significantly different.
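Because the variance ratio test did not reject equal variances, Stata pools the two group variances. As a hedged illustration (Python, not part of the original Stata run; the function name is hypothetical), the pooled standard error and t statistic can be rebuilt from the group rows of the table:

```python
import math

def pooled_t(mean0, sd0, n0, mean1, sd1, n1):
    """Equal-variance two-sample t statistic from group summaries."""
    df = n0 + n1 - 2
    pooled_var = ((n0 - 1) * sd0 ** 2 + (n1 - 1) * sd1 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
    diff = mean0 - mean1
    return diff, se, diff / se, df

# Group 0 = non-players, group 1 = football players (HSRank).
diff, se, t_stat, df = pooled_t(100.5951, 106.3453, 494,
                                123.0756, 114.9907, 238)

print(f"diff = {diff:.4f}, SE = {se:.4f}, t = {t_stat:.4f}, df = {df}")
print("reject H0 at the two-tailed 5% level:", abs(t_stat) > 1.96)
```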
c. Comparing the Means of Two Related/Matched Populations
As two related variables, Cumulative GPA and Term GPA are chosen, because both are measured on the
same students.
ttest CumGPA = TermGPA
Paired t test
------------------------------------------------------------------------------
Variable | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
CumGPA | 732 2.080861 .0365773 .9896168 2.009052 2.15267
TermGPA | 732 2.330246 .0280262 .7582622 2.275225 2.385267
---------+--------------------------------------------------------------------
diff | 732 -.2493852 .0381588 1.032405 -.3242991 -.1744714
------------------------------------------------------------------------------
mean(diff) = mean(CumGPA - TermGPA) t = -6.5355
Ho: mean(diff) = 0 degrees of freedom = 731
Ha: mean(diff) < 0 Ha: mean(diff) != 0 Ha: mean(diff) > 0
Pr(T < t) = 0.0000 Pr(|T| > |t|) = 0.0000 Pr(T > t) = 1.0000
By the critical value approach, H0 is rejected because t = -6.5355 is less than the critical value -1.96.
The p-value is 0.0000, which is less than alpha (0.05), so H0 is also rejected by the p-value approach.
The mean difference is -0.2494, which is significantly different from the value under the null hypothesis (mean(diff) = 0).
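A paired test works on the per-student differences, so its statistic uses the standard deviation of the differences (1.032405), not the two separate standard deviations. A minimal Python sketch of that calculation, added here as a check and using only numbers from the output above (the function name is hypothetical):

```python
import math

def paired_t(mean_diff, sd_diff, n):
    """Paired t statistic from the mean and sd of the differences."""
    se = sd_diff / math.sqrt(n)
    return se, mean_diff / se, n - 1

# 'diff' row of the paired t test output (CumGPA - TermGPA).
se, t_stat, df = paired_t(mean_diff=-0.2493852, sd_diff=1.032405, n=732)

# Cross-check: the mean difference equals the difference of the two means.
mean_gap = 2.080861 - 2.330246

print(f"SE = {se:.7f}, t = {t_stat:.4f}, df = {df}")
print(f"mean(CumGPA) - mean(TermGPA) = {mean_gap:.6f}")
```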
2. Z-test:
[Figure: histogram of SAT Score (x-axis: SAT Score, 400 to 1400; y-axis: Density) and box plot of SAT Score (400 to 1,400)]
SAT score was chosen as the dependent variable because the SAT is an important and valued part of
admissions.
According to the histogram and box plot, the data are approximately normally distributed. The box plot
gives an easy way to see the median, quartiles and outliers, and together the plots suggest that the
data follow a symmetrical, bell-shaped curve.
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
SATScore | 732 898.9071 168.1912 450 1430
To set the null hypothesis, we use the reported average SAT score, which is around 1000. (Source:
blog.prepscholar.com › what-is-the-average-sat-score)
H0: μ ≥1000 (null hypothesis)

H1: μ <1000 (alternative hypothesis)
a. One-Sample Test for the Mean (σ known, use S given the N is large)
sdtest SATScore, by( SemesterF1S2 )
Variance ratio test
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
1 | 366 898.9071 8.797513 168.3063 881.6069 916.2073
2 | 366 898.9071 8.797513 168.3063 881.6069 916.2073
---------+--------------------------------------------------------------------
combined | 732 898.9071 6.216525 168.1912 886.7027 911.1115
------------------------------------------------------------------------------
ratio = sd(1) / sd(2) f = 1.0000

Pr(F < f) = 0.5000 2*Pr(F > f) = 1.0000 Pr(F > f) = 0.5000
To obtain the standard deviation, we ran sdtest on SAT score by Semester, which also shows the SAT
scores of students in the fall and spring semesters. The combined standard deviation is
Std. Dev. = 168.1912, and it is used as σ in the z test.
ztest SATScore==1000, sd(168.1912)
One-sample z test
------------------------------------------------------------------------------
Variable | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
SATScore | 732 898.9071 6.216526 168.1912 886.7229 911.0913
------------------------------------------------------------------------------
mean = mean(SATScore) z = -16.2620
Ho: mean = 1000
Ha: mean < 1000 Ha: mean != 1000 Ha: mean > 1000
Pr(Z < z) = 0.0000 Pr(|Z| > |z|) = 0.0000 Pr(Z > z) = 1.0000
Our null hypothesis is μ ≥ 1000, so the test is one-tailed; the relevant alternative
is Ha: mean < 1000. Stata gives the results above. According to the Stata output, the sample mean
is 898.9071, which is below the hypothesized value, and the z statistic is -16.2620.
H0: μ ≥1000 (null hypothesis)
H1: μ <1000 (alternative hypothesis)
Decision making: for a lower-tailed test at α = 0.05 the critical value is -1.645 (±1.96 applies to a two-tailed test).
Reject H0 if Zstat < -1.645. Our z value is -16.2620, which is in the rejection
region. So we reject the null hypothesis (μ ≥ 1000): the mean SAT score, 898.9071, is
significantly less than 1000.
P-value approach:
If P-value ≥ α, fail to reject H0; if P-value < α, reject H0.
The one-tailed p-value, Pr(Z < z), is 0.0000, and alpha is 0.05 (95% confidence level). So we
reject the null hypothesis (μ ≥ 1000): the true mean SAT score is below 1000.
b. Comparing the Means of Two Independent Populations

H0: μ1-μ2=0
H1: μ1-μ2≠0
VerdivMath is compared between two independent groups defined by race (White1 = 0 or 1); the groups are not related.
sdtest VerdivMath , by( White1 )
Variance ratio test

------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
0 | 178 .852694 .0120807 .1611768 .8288533 .8765348
1 | 554 .855089 .006456 .1519552 .8424078 .8677701
---------+--------------------------------------------------------------------
combined | 732 .8545066 .0056972 .1541396 .8433218 .8656913
------------------------------------------------------------------------------
ratio = sd(0) / sd(1) f = 1.1251

Pr(F < f) = 0.8400 2*Pr(F > f) = 0.3199 Pr(F > f) = 0.1600
The standard deviation ratio is about 1.06 (0.1612/0.1520), and the F statistic is its square,
f = 1.1251. The two-sided p-value (0.3199) is greater than alpha, so we fail to reject H0.
The data are consistent with equal variances.
For the analysis, we compute a two-sample z test.
ztest VerdivMath , by( White1 ) sd1(0.1611) sd2(0.1519)
Two-sample z test
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
0 | 178 .852694 .012075 .1611 .8290276 .8763605
1 | 554 .855089 .0064536 .1519 .8424401 .8677378
---------+--------------------------------------------------------------------
diff | -.0023949 .0136914 -.0292295 .0244397
------------------------------------------------------------------------------
diff = mean(0) - mean(1) z = -0.1749
Ho: diff = 0
Ha: diff < 0 Ha: diff != 0 Ha: diff > 0

Pr(Z < z) = 0.4306 Pr(|Z| > |z|) = 0.8611 Pr(Z > z) = 0.5694
The z statistic is -0.1749, which lies between -1.96 and +1.96 and is close to zero. By the critical
value approach, we fail to reject H0.
The two-sided p-value (0.8611) is greater than alpha (0.05), so we also fail to reject H0.
There is no evidence that the two means differ. Although the difference is negative
(-0.0024), it is not statistically significant.
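The two-sample z statistic treats the supplied standard deviations as known and does not pool them. As an illustrative check (Python, standard library only; not the Stata implementation, and the function name is hypothetical), the statistic and two-sided p-value can be recomputed from the table values:

```python
import math
from statistics import NormalDist

def two_sample_z(mean0, sd0, n0, mean1, sd1, n1):
    """Two-sample z statistic with known (supplied) standard deviations."""
    se = math.sqrt(sd0 ** 2 / n0 + sd1 ** 2 / n1)
    z = (mean0 - mean1) / se
    p_two_sided = 2 * NormalDist().cdf(-abs(z))
    return se, z, p_two_sided

# VerdivMath by White1, using the sd1() and sd2() values passed to ztest.
se, z, p = two_sample_z(0.852694, 0.1611, 178, 0.855089, 0.1519, 554)

print(f"SE = {se:.7f}, z = {z:.4f}, two-sided p = {p:.4f}")
print("reject H0 at the 5% level:", p < 0.05)
```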
c. Comparing the Means of Two Related/Matched Populations
SAT score and Season are treated as related here: each SAT score was obtained in one
season.
H0: μ1-μ2=0
H1: μ1-μ2≠0
. sdtest SATScore , by( Season1 )
Variance ratio test

------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
0 | 241 885.3942 10.75647 166.9854 864.205 906.5833
1 | 491 905.5397 7.606684 168.5529 890.594 920.4855
---------+--------------------------------------------------------------------
combined | 732 898.9071 6.216525 168.1912 886.7027 911.1115
------------------------------------------------------------------------------
ratio = sd(0) / sd(1) f = 0.9815

Pr(F < f) = 0.4385 2*Pr(F < f) = 0.8771 Pr(F > f) = 0.5615
The two-sided p-value (0.8771) is greater than alpha, so we fail to reject the null hypothesis of equal
variances. The difference between the two standard deviations is 168.5529 - 166.9854 = 1.5675.
ztest SATScore = Season1 , sddiff(1.5675)
Paired z test
------------------------------------------------------------------------------
Variable | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
diff | 250 20.1455 .0579365 1.5675 898.1228 898.3499
------------------------------------------------------------------------------
mean(diff) = mean(SATScore - Season1) z = 1.6e+04
Ho: mean(diff) = 0
Ha: mean(diff) < 0 Ha: mean(diff) != 0 Ha: mean(diff) > 0

Pr(Z < z) = 1.0000 Pr(|Z| > |z|) = 0.0000 Pr(Z > z) = 0.0000
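The Stata output reports a difference of 20.1455 and z = 1.6e+04, with a two-sided p-value
of 0.0000.
Reject H0 if z-stat < -1.96 or z-stat > +1.96.
The reported z statistic is far greater than 1.96 (rejection region), so on this output we reject the null
hypothesis (H0: mean(diff) is zero).
The p-value (0.0000) is less than alpha (0.05), therefore reject H0. On this output there is a significant
difference between SAT score and Season. Note, however, that Season1 is a 0/1 group indicator rather than a
second measurement on the same students, and 1.5675 is the difference between the two group standard
deviations, not the standard deviation of paired differences, so this result should be read with caution.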
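Since Season1 splits the students into two groups, one way to cross-check the comparison is to treat it as a two-sample problem using the group rows of the sdtest table. The Python sketch below is only an illustration under that assumption; it is not what the original ztest command computed, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def two_sample_z(mean0, sd0, n0, mean1, sd1, n1):
    """Two-sample z statistic from group means, sds and sizes."""
    se = math.sqrt(sd0 ** 2 / n0 + sd1 ** 2 / n1)
    z = (mean0 - mean1) / se
    return se, z, 2 * NormalDist().cdf(-abs(z))

# SATScore by Season1: rows 0 and 1 of the sdtest table above.
se, z, p = two_sample_z(885.3942, 166.9854, 241,
                        905.5397, 168.5529, 491)

print(f"diff = {885.3942 - 905.5397:.4f}, SE = {se:.4f}")
print(f"z = {z:.4f}, two-sided p = {p:.4f}")
print("reject H0 at the 5% level:", abs(z) > 1.96)
```

Under this two-sample reading the 20.1455-point gap is small relative to its standard error, which fits the low power estimated for the same means in Section 4.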
3. Confidence interval (σ unknown) - SAT score
x̄ ± z(α/2) * s/√n
n = 732
z(α/2) = 1.96
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
SATScore | 732 898.9071 168.1912 450 1430
898.9071 ± 1.96 * 168.1912/√732
886.7227 ≤ μ ≤ 911.0915
(886.7227; 911.0915) Thus, we are 95% confident that the mean SAT score is
somewhere between 886.72 and 911.09 points.
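The interval arithmetic can be verified with a few lines of Python. This is a sketch added as a check, not part of the Stata output; it uses the exact normal quantile instead of the rounded 1.96, so the last decimal can differ slightly from a hand calculation.

```python
import math
from statistics import NormalDist

# Summary statistics for SATScore from the table above.
mean, sd, n = 898.9071, 168.1912, 732
confidence = 0.95

z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96
se = sd / math.sqrt(n)
half_width = z_crit * se
lower, upper = mean - half_width, mean + half_width

print(f"SE = {se:.6f}, half-width = {half_width:.4f}")
print(f"95% CI: ({lower:.4f}; {upper:.4f})")
```

The bounds agree with the interval Stata printed in the one-sample z test output in Section 2a.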
4. Statistical Power
Above, the z test comparing the means for Season rejected the null hypothesis. Statistical power is the
probability that a test rejects a false null hypothesis.
. power twomeans (885.3942) (905.5397), sd1(166.9854) sd2(168.5529) n(732)
Estimated power for a two-sample means test

Satterthwaite's t test assuming unequal variances
Ho: m2 = m1 versus Ha: m2 != m1
Study parameters:
alpha = 0.0500
N = 732
N per group = 366
delta = 20.1455
m1 = 885.3942
m2 = 905.5397
sd1 = 166.9854
sd2 = 168.5529
Estimated power:
power = 0.3680
Statistical power equals 1 minus the probability of making a Type II error, so higher power means a lower
Type II error rate. A power of at least 0.80 is conventionally considered adequate.
The statistical power is 36.80%, which means there is a high probability (about 63%) of a Type II error.
Our sample size is not large enough to reach high statistical power for a difference of this size.
Now let's assume that the sample size is 2500. According to the Stata analysis, a sample size of 2500
gives a statistical power of 0.85. This exceeds 0.80, so the probability of a Type II error falls
to about 15%.
power twomeans (885.3942) (905.5397), sd1(166.9854) sd2(168.5529) n(2500)
Estimated power for a two-sample means test
Satterthwaite's t test assuming unequal variances
Ho: m2 = m1 versus Ha: m2 != m1
Study parameters:
alpha = 0.0500
N = 2500
N per group = 1250
delta = 20.1455
m1 = 885.3942
m2 = 905.5397
sd1 = 166.9854
sd2 = 168.5529
Estimated power:
power = 0.8510
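To see why the larger sample raises the power, the calculation can be approximated by hand. The sketch below is a Python illustration, not Stata's algorithm: it assumes a normal approximation with equal group sizes, whereas Stata uses Satterthwaite's t test, so the results should be close to, but not exactly, the values above.

```python
import math
from statistics import NormalDist

def approx_power(m1, m2, sd1, sd2, n_total, alpha=0.05):
    """Normal-approximation power of a two-sided two-sample means test."""
    n_per_group = n_total / 2
    se = math.sqrt(sd1 ** 2 / n_per_group + sd2 ** 2 / n_per_group)
    shift = abs(m2 - m1) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    cdf = NormalDist().cdf
    # Probability that the statistic falls in either rejection tail.
    return cdf(shift - z_crit) + cdf(-shift - z_crit)

# Same means and standard deviations as in the power twomeans commands.
args = (885.3942, 905.5397, 166.9854, 168.5529)
power_732 = approx_power(*args, n_total=732)
power_2500 = approx_power(*args, n_total=2500)

print(f"approximate power, N = 732:  {power_732:.4f}")
print(f"approximate power, N = 2500: {power_2500:.4f}")
```

The standard error shrinks with the square root of the group size, so the same 20.1455-point difference becomes about three standard errors wide at N = 2500 instead of about 1.6 at N = 732.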

Stata Commands:
ttest CumGPA ==3.0
sdtest HSRank , by( FootballPlayer1 )
ttest HSRank , by( FootballPlayer1 )
ttest CumGPA = TermGPA
sdtest SATScore, by( SemesterF1S2 )
ztest SATScore==1000, sd(168.1912)
sdtest VerdivMath , by( White1 )
ztest VerdivMath , by( White1 ) sd1(0.1611) sd2(0.1519)
sdtest SATScore , by( Season1 )
ztest SATScore = Season1 , sddiff(1.5675)
