
Name : Meca Angela L. Baliva
Program : MAED-ST

FINAL EXAM in
EFD 502 Advanced Statistics with Computer Applications

General Instruction: Do as indicated. Copy the questions for each item followed by your answers.
Attach only the tables you need from the SPSS to support your claims, exclude the unnecessary ones.
Arrange your tables and explanations as orderly and neatly as possible. No two papers must have the
same exact answers. In the event of identical answers, points will be divided as to how many shared
the same answers. Submit the pdf file of your PS in the Google Classroom with the filename in the
following format: EXAM - SURNAME, FIRST NAME, MI. Ensure submission on or before December
30, 2023 (Saturday).

1. (30 pts) Consider the data set in the excel file named Exam Data Set. Sheet 1 contains data on the years
of experience of teachers and their corresponding salaries. Implement the data in the SPSS and do the
following:
a. Show the scatter plot of the data. (5 pts)

b. Compute and show the sample size, the mean, and the standard deviation of all the variables. (5
pts)

Descriptive Statistics
N Minimum Maximum Mean Std. Deviation
YearsofExperience 30 1.10 10.50 5.3133 2.83789
Salary 30 37731.00 122391.00 76003.0000 27414.42978
Valid N (listwise) 30

c. Compute the Pearson’s correlation coefficient and interpret the result (use the general guidelines
in Chapter 6a lecture notes). (10 pts)

Correlations
                                        YearsofExperience   Salary
YearsofExperience   Pearson Correlation         1           .978**
                    Sig. (2-tailed)                         .000
                    N                          30            30
Salary              Pearson Correlation      .978**           1
                    Sig. (2-tailed)          .000
                    N                          30            30
**. Correlation is significant at the 0.01 level (2-tailed).

The Pearson correlation coefficient between years of experience and salary is r = .978 (p < .001,
N = 30). Since r is above 0.7, there is a very strong positive correlation between the two variables,
which means that salary tends to increase as years of experience increase.
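As a rough check on the significance reported above, the t statistic for a Pearson correlation can be recovered from r and N alone. The sketch below uses the rounded r = .978 from the table, so it only approximates what SPSS computes from the unrounded coefficient; the variable names are illustrative and not part of the SPSS output.

```python
import math

r = 0.978   # Pearson correlation, rounded as shown in the Correlations table
n = 30      # sample size from the same table

# t statistic for testing H0: rho = 0, with n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Proportion of variance in salary shared with years of experience
r_squared = r ** 2

print(f"t({n - 2}) = {t_stat:.2f}, r^2 = {r_squared:.3f}")
```

A t value this large on 28 degrees of freedom is consistent with the Sig. (2-tailed) value of .000 in the table.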
d. Compute using a simple linear regression, display the required tables, and interpret the results.
(You may do the same as in Chapter 6b in your interpretation) (10 pts)

Descriptive Statistics
Mean Std. Deviation N
Salary 76003.0000 27414.42978 30
YearsExperience 5.3133 2.83789 30

Correlations
Salary YearsExperience
Pearson Correlation Salary 1.000 .978
YearsExperience .978 1.000
Sig. (1-tailed) Salary . .000
YearsExperience .000 .
N Salary 30 30
YearsExperience 30 30
Variables Entered/Removeda
Model Variables Entered Variables Removed Method
1 YearsExperienceb . Enter
a. Dependent Variable: Salary
b. All requested variables entered.

Model Summaryb
Model R R Square Adjusted R Square Std. Error of the Estimate
1 .978a .957 .955 5788.31505
a. Predictors: (Constant), YearsExperience
b. Dependent Variable: Salary

ANOVAa
Model Sum of Squares df Mean Square F Sig.
1 Regression 20856849300.332 1 20856849300.332 622.507 .000b
Residual 938128551.668 28 33504591.131
Total 21794977852.000 29
a. Dependent Variable: Salary
b. Predictors: (Constant), YearsExperience

Coefficientsa
                        Unstandardized Coefficients    Standardized Coefficients
Model                   B            Std. Error        Beta        t         Sig.
1  (Constant)           25792.200    2273.053                      11.347    .000
   YearsExperience      9449.962     378.755           .978        24.950    .000
a. Dependent Variable: Salary
Residuals Statisticsa
Minimum Maximum Mean Std. Deviation N
Predicted Value 36187.1602 125016.8047 76003.0000 26817.93616 30
Residual -7958.00781 11448.02539 .00000 5687.64102 30
Std. Predicted Value -1.485 1.828 .000 1.000 30
Std. Residual -1.375 1.978 .000 .983 30
a. Dependent Variable: Salary
In the Unstandardized Coefficients columns, B shows that for every one-year increase in years of experience,
salary increases by 9449.962 on average. The standard error of this coefficient is 378.755, which measures the
uncertainty in the estimated slope and not how much an individual salary may vary per year of experience.
A simple linear regression was used to test if years of experience significantly predicts salary. The fitted
regression model was:
Salary = 25792.2 + (9449.962) (Years of Experience)
The overall regression was statistically significant (R² = .957, F(1, 28) = 622.507, p < 0.001), so about 95.7%
of the variance in salary is explained by years of experience. It was found that years of experience
(B = 9449.962, p < 0.001) significantly predicts salary.
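The reported statistics can be cross-checked against each other using only the numbers printed in the ANOVA and Coefficients tables. The following is a minimal sketch, not the SPSS procedure itself; `predict_salary` is a hypothetical helper added for illustration.

```python
# Values copied from the ANOVA and Coefficients tables above
ss_regression = 20856849300.332
ss_residual = 938128551.668
df_regression, df_residual = 1, 28
intercept, slope, slope_se = 25792.200, 9449.962, 378.755

# R squared is the share of the total sum of squares explained by the model
r_squared = ss_regression / (ss_regression + ss_residual)

# F is the ratio of the regression mean square to the residual mean square
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# The t statistic of the slope is the coefficient divided by its standard error
t_slope = slope / slope_se


def predict_salary(years):
    """Predicted salary from the fitted line (hypothetical helper).

    Only meaningful inside the observed range of 1.1 to 10.5 years.
    """
    return intercept + slope * years


print(round(r_squared, 3), round(f_stat, 3), round(t_slope, 3))
print(round(predict_salary(5.0), 2))
```

With a single predictor, the square of the slope's t statistic equals F up to rounding, which is why both tests give the same p-value here.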

2. (30 points) Consider the data set in Exam Data Set in Sheet 2 which is the Student Performance
Dataset, a dataset designed to examine the factors influencing academic student performance. The
dataset consists of 210 student records, with each record containing information about various
predictors and a performance index. The following are the independent variables:
• Hours Studied: The total number of hours spent studying by each student.
• Previous Scores: The scores obtained by students in previous tests.
• Sleep Hours: The average number of hours of sleep the student had per day.
• Sample Question Papers Practiced: The number of sample question papers the student
practiced.
While below is the dependent variable:
• Performance Index: A measure of the overall performance of each student. The
performance index represents the student's academic performance and has been rounded to
the nearest integer. The index ranges from 10 to 100, with higher values indicating better
performance.
Implement the data in the SPSS and do the following:
a. Show the scatter plot for each independent variable paired with the Performance index. (5 pts)
b. Compute and show the sample size, the mean and the standard deviation of all the variables. (5
pts)

Descriptive Statistics
N Minimum Maximum Mean Std. Deviation
HoursStudied 210 1.00 9.00 5.0762 2.66358
PreviousScore 210 40.00 99.00 69.6952 17.08164
SleepHours 210 4.00 9.00 6.4048 1.78858
SampleQPPracticed 210 .00 9.00 4.2905 2.87147
PerfIndex 210 15.00 100.00 55.3429 19.51117
Valid N (listwise) 210

c. Using the Pearson’s correlation coefficient for each independent variable paired with
Performance Index, which of the variable/s has/have significant relationship with Performance
Index. Display the required tables from SPSS and interpret only the results of the variables with
significant correlation to the Performance Index (use the general guidelines in Chapter 6a lecture
notes). (10 pts)
Correlations
                                         HoursStudied  PreviousScore  SleepHours  SampleQPPracticed  PerfIndex
HoursStudied       Pearson Correlation        1            .049          -.051         -.055           .432**
                   Sig. (2-tailed)                         .478           .465          .429           .000
                   N                         210            210            210           210            210
PreviousScore      Pearson Correlation      .049             1            .021         -.001           .915**
                   Sig. (2-tailed)          .478                          .763          .985           .000
                   N                         210            210            210           210            210
SleepHours         Pearson Correlation     -.051           .021             1          -.071           .039
                   Sig. (2-tailed)          .465           .763                         .309           .572
                   N                         210            210            210           210            210
SampleQPPracticed  Pearson Correlation     -.055          -.001          -.071            1            .007
                   Sig. (2-tailed)          .429           .985           .309                         .924
                   N                         210            210            210           210            210
PerfIndex          Pearson Correlation      .432**         .915**         .039          .007             1
                   Sig. (2-tailed)          .000           .000           .572          .924
                   N                         210            210            210           210            210
**. Correlation is significant at the 0.01 level (2-tailed).

Only Hours Studied and Previous Score have a significant relationship with Performance Index.
Hours Studied has a moderate positive correlation with Performance Index (r = .432, p < .001).
Previous Score has a very high positive correlation with Performance Index (r = .915, p < .001).
Sleep Hours (r = .039, p = .572) and Sample Question Papers Practiced (r = .007, p = .924) have no
significant correlation with Performance Index.
d. Compute using a multiple linear regression, display the required tables, and interpret the results.
(You may do the same as in Chapter 6b in your interpretation) (10 pts)

Model Summaryb
Model R R Square Adjusted R Square Std. Error of the Estimate
1 .995a .991 .991 1.88573
a. Predictors: (Constant), SampleQPPracticed, PreviousScore, SleepHours, HoursStudied
b. Dependent Variable: PerfIndex

ANOVAa
Model Sum of Squares df Mean Square F Sig.
1 Regression 78834.337 4 19708.584 5542.363 .000b
Residual 728.978 205 3.556
Total 79563.314 209
a. Dependent Variable: PerfIndex
b. Predictors: (Constant), SampleQPPracticed, PreviousScore, SleepHours, HoursStudied

Coefficientsa
                        Unstandardized Coefficients   Standardized Coefficients                        95.0% Confidence Interval for B
Model                   B          Std. Error         Beta         t          Sig.     Lower Bound    Upper Bound
1  (Constant)           -34.419    .793                            -43.429    .000     -35.982        -32.857
   HoursStudied         2.871      .049               .392         58.384     .000     2.774          2.968
   PreviousScore        1.023      .008               .895         133.718    .000     1.008          1.038
   SleepHours           .464       .073               .043         6.340      .000     .320           .609
   SampleQPPracticed    .219       .046               .032         4.804      .000     .129           .309
a. Dependent Variable: PerfIndex

Residuals Statisticsa
Minimum Maximum Mean Std. Deviation N
Predicted Value 14.8249 96.4251 55.3429 19.42158 210
Residual -5.32386 4.25761 .00000 1.86760 210
Std. Predicted Value -2.086 2.115 .000 1.000 210
Std. Residual -2.823 2.258 .000 .990 210
a. Dependent Variable: PerfIndex

In the Unstandardized Coefficients columns, B shows the expected change in Performance Index for every one-unit
increase in a predictor while the other predictors are held constant. The standard error measures the
uncertainty in each estimated coefficient.
For every additional hour studied, Performance Index increases by 2.871 (Std. Error = .049).
For every one-point increase in previous score, Performance Index increases by 1.023 (Std. Error = .008).
For every additional hour of sleep, Performance Index increases by 0.464 (Std. Error = .073).
For every additional sample question paper practiced, Performance Index increases by 0.219 (Std. Error = .046).
A multiple linear regression was used to test if hours studied, previous score, sleep hours, and sample
question papers practiced significantly predict Performance Index. The fitted regression model was:
Performance Index = (-34.419) + (2.871) (Hours Studied) + (1.023) (Previous Score) + (.464) (Sleep Hours)
+ (.219) (Sample Question Papers Practiced)
The overall regression was statistically significant (R² = .991, F(4, 205) = 5542.363, p < 0.001). It was found
that hours studied (B = 2.871, p < 0.001), previous score (B = 1.023, p < 0.001), sleep hours (B = .464,
p < 0.001), and sample question papers practiced (B = .219, p < 0.001) each significantly increase Performance
Index. Sleep hours and sample question papers practiced are significant in the regression model even though
their simple correlations with Performance Index are not, because the model controls for the other predictors.
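The fitted equation and the summary statistics above can be checked with a short script. This is a sketch built only from the values printed in the SPSS tables; `predict_performance` and the dictionary keys are illustrative names, not SPSS output.

```python
# Coefficients copied from the Coefficients table
intercept = -34.419
coefficients = {
    "HoursStudied": 2.871,
    "PreviousScore": 1.023,
    "SleepHours": 0.464,
    "SampleQPPracticed": 0.219,
}


def predict_performance(student):
    """Predicted Performance Index for one student (hypothetical helper)."""
    return intercept + sum(coefficients[name] * student[name] for name in coefficients)


# Sums of squares copied from the ANOVA table
ss_regression, ss_residual = 78834.337, 728.978
n, k = 210, 4  # sample size and number of predictors

r_squared = ss_regression / (ss_regression + ss_residual)
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))

# Plugging in the predictor means should land near the mean Performance Index
sample_means = {
    "HoursStudied": 5.0762,
    "PreviousScore": 69.6952,
    "SleepHours": 6.4048,
    "SampleQPPracticed": 4.2905,
}
prediction_at_means = predict_performance(sample_means)

print(round(r_squared, 3), round(adjusted_r_squared, 3), round(f_stat, 2))
print(round(prediction_at_means, 2))
```

The prediction at the predictor means differs slightly from the reported mean of 55.3429 only because the coefficients are rounded to three decimals.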

3. (50 points) Consider the data set in Exam Data Set in Sheet 3 which displays the data of the Junior High
School students together with their gender, ethnicity, and the exam score in mathematics. Implement the
data in the SPSS.
a. Show the sample size, the mean and the standard deviation of the exam score. (5 pts)

Exam Score
Cumulative
Frequency Percent Valid Percent Percent
Valid 75.00 11 5.2 5.6 5.6
76.00 10 4.8 5.1 10.7
77.00 6 2.9 3.1 13.8
78.00 16 7.6 8.2 21.9
79.00 11 5.2 5.6 27.6
80.00 9 4.3 4.6 32.1
81.00 12 5.7 6.1 38.3
82.00 5 2.4 2.6 40.8
83.00 8 3.8 4.1 44.9
84.00 8 3.8 4.1 49.0
85.00 7 3.3 3.6 52.6
86.00 6 2.9 3.1 55.6
87.00 8 3.8 4.1 59.7
88.00 3 1.4 1.5 61.2
89.00 5 2.4 2.6 63.8
90.00 13 6.2 6.6 70.4
91.00 5 2.4 2.6 73.0
92.00 4 1.9 2.0 75.0
93.00 9 4.3 4.6 79.6
94.00 4 1.9 2.0 81.6
95.00 8 3.8 4.1 85.7
96.00 7 3.3 3.6 89.3
97.00 4 1.9 2.0 91.3
98.00 10 4.8 5.1 96.4
99.00 7 3.3 3.6 100.0
Total 196 93.3 100.0
Missing System 14 6.7
Total 210 100.0
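
The frequency table gives the 196 valid scores but does not state the mean or the standard deviation that item 3a asks for. Assuming the table is complete, both can be recovered from the frequencies. The sketch below uses the sample (n - 1) standard deviation, which is the form SPSS reports; the variable names are illustrative.

```python
import math

# Exam score -> frequency, copied from the Exam Score frequency table
frequency = {
    75: 11, 76: 10, 77: 6, 78: 16, 79: 11, 80: 9, 81: 12, 82: 5, 83: 8,
    84: 8, 85: 7, 86: 6, 87: 8, 88: 3, 89: 5, 90: 13, 91: 5, 92: 4,
    93: 9, 94: 4, 95: 8, 96: 7, 97: 4, 98: 10, 99: 7,
}

# Valid sample size (the 14 system-missing cases are excluded)
n = sum(frequency.values())

# Weighted mean of the scores
mean = sum(score * count for score, count in frequency.items()) / n

# Sample standard deviation from the weighted sum of squared deviations
sum_of_squares = sum(count * (score - mean) ** 2 for score, count in frequency.items())
sd = math.sqrt(sum_of_squares / (n - 1))

print(n, round(mean, 4), round(sd, 4))
```

The mean obtained this way agrees with the one-sample test in item e, where the mean difference from the test value of 75 is 10.80102.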

b. Show the sample sizes, the means and the standard deviations of the exam score according to
gender. (5 pts)

Descriptives
Gender Statistic Std. Error
Exam Score 1.00 Mean 86.2268 .75989
95% Confidence Interval for Mean Lower Bound 84.7184
Upper Bound 87.7352
5% Trimmed Mean 86.1735
Median 85.0000
Variance 56.011
Std. Deviation 7.48402
Minimum 75.00
Maximum 99.00
Range 24.00
Interquartile Range 14.00
Skewness .151 .245
Kurtosis -1.347 .485
2.00 Mean 85.3838 .75083
95% Confidence Interval for Mean Lower Bound 83.8938
Upper Bound 86.8738
5% Trimmed Mean 85.2043
Median 84.0000
Variance 55.810
Std. Deviation 7.47063
Minimum 75.00
Maximum 99.00
Range 24.00
Interquartile Range 12.00
Skewness .348 .243
Kurtosis -1.102 .481

c. Show the sample sizes, the means and the standard deviations of the exam score according to
ethnicity. (5 pts)

Descriptives
Ethnicity Statistic Std. Error
Exam Score 1.00 Mean 82.7647 1.47661
95% Confidence Interval for Mean Lower Bound 79.6344
Upper Bound 85.8950
5% Trimmed Mean 82.3497
Median 82.0000
Variance 37.066
Std. Deviation 6.08820
Minimum 75.00
Maximum 98.00
Range 23.00
Interquartile Range 7.50
Skewness .906 .550
Kurtosis .980 1.063
2.00 Mean 86.6190 1.09319
95% Confidence Interval for Mean Lower Bound 84.4113
Upper Bound 88.8268
5% Trimmed Mean 86.6032
Median 86.5000
Variance 50.193
Std. Deviation 7.08469
Minimum 75.00
Maximum 99.00
Range 24.00
Interquartile Range 12.25
Skewness .033 .365
Kurtosis -1.221 .717
3.00 Mean 85.4828 1.01865
95% Confidence Interval for Mean Lower Bound 83.4429
Upper Bound 87.5226
5% Trimmed Mean 85.3314
Median 85.5000
Variance 60.184
Std. Deviation 7.75783
Minimum 75.00
Maximum 99.00
Range 24.00
Interquartile Range 15.00
Skewness .195 .314
Kurtosis -1.385 .618
4.00 Mean 85.5294 1.08514
95% Confidence Interval for Mean Lower Bound 83.3498
Upper Bound 87.7090
5% Trimmed Mean 85.3540
Median 84.0000
Variance 60.054
Std. Deviation 7.74946
Minimum 75.00
Maximum 99.00
Range 24.00
Interquartile Range 15.00
Skewness .417 .333
Kurtosis -1.192 .656
5.00 Mean 87.5714 1.41555
95% Confidence Interval for Mean Lower Bound 84.6670
Upper Bound 90.4759
5% Trimmed Mean 87.6190
Median 86.5000
Variance 56.106
Std. Deviation 7.49038
Minimum 75.00
Maximum 99.00
Range 24.00
Interquartile Range 14.00
Skewness .053 .441
Kurtosis -1.348 .858
d. Do the test for normality (Shapiro-Wilk) of the exam score and equal variances (Levene’s Test)
in the SPSS. What do the results of the tests have to say about the distribution? (5 pts each test)

Tests of Normality
                        Kolmogorov-Smirnova             Shapiro-Wilk
            Ethnicity   Statistic   df    Sig.          Statistic   df    Sig.
Exam Score  1.00        .121        17    .200*         .937        17    .285
            2.00        .136        42    .050          .950        42    .062
            3.00        .149        58    .003          .918        58    .001
            4.00        .127        51    .040          .913        51    .001
            5.00        .125        28    .200*         .937        28    .092
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

Test of Homogeneity of Variances
                                                   Levene Statistic   df1   df2       Sig.
Exam Score  Based on Mean                          1.419              4     191       .229
            Based on Median                        1.255              4     191       .289
            Based on Median and with adjusted df   1.255              4     182.235   .289
            Based on trimmed mean                  1.416              4     191       .230

Tests of Normality
Kolmogorov-Smirnova Shapiro-Wilk
Gender Statistic df Sig. Statistic df Sig.
Exam Score 1.00 .129 97 .000 .929 97 .000
2.00 .115 99 .002 .932 99 .000
a. Lilliefors Significance Correction

Test of Homogeneity of Variances
                                                   Levene Statistic   df1   df2       Sig.
Exam Score  Based on Mean                          .162               1     194       .688
            Based on Median                        .209               1     194       .648
            Based on Median and with adjusted df   .209               1     191.380   .648
            Based on trimmed mean                  .184               1     194       .668
The Shapiro-Wilk tests show that the exam scores for ethnicities 1, 2, and 5 do not depart significantly
from a normal distribution (p = .285, .062, and .092, all above 0.05), while the exam scores of ethnicities
3 and 4 are not normally distributed (p = .001 for both). When exam scores are grouped according to gender,
both male and female exam scores are not normally distributed (p < .001 for both groups).
Levene's test shows that according to ethnicity, the variances are not significantly different
from each other (F = 1.419, p = .229), so the assumption of equal variances is met. The same
interpretation holds when exam scores are grouped based on gender (F = .162, p = .688).
e. If your answer in 3c either coincides with the assumption of normality or violates it, please
proceed as it is and using 75 as the test value, compute the one-sample t-test for exam score.
Test whether there is a significant difference between the student performance and the average
performance. Show all the hypothesis testing steps. (10 pts)

One-Sample Test
Test Value = 75
95% Confidence Interval of the
Difference
t df Sig. (2-tailed) Mean Difference Lower Upper
Exam Score 20.243 195 .000 10.80102 9.7487 11.8533
Gender -2052.708 195 .000 -73.49490 -73.5655 -73.4243
Ethnicity -857.782 195 .000 -71.84184 -72.0070 -71.6767

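A. Hypotheses: H0: µ = 75 (the mean exam score is equal to the average performance of 75)
Ha: µ ≠ 75 (the mean exam score is different from the average performance of 75)
B. Test statistic: t = 20.243 (df = 195)
C. P-value: .000 (p < 0.001)
D. Significance Level: α = 0.05. Since p < 0.05, H0 is rejected and the test is statistically significant.
E. Conclusion: Therefore, there is a significant difference between the mean exam score of the students
and the average performance of 75 (t = 20.243, p < 0.001). The mean exam score is higher than 75 by 10.80.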
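
The one-sample t statistic can be reproduced from summary values printed elsewhere in this paper. Because no standard deviation for the exam score is reported directly, the sketch below derives it from the Total row of the one-way ANOVA table in item g (sum of squares 10881.240 on 195 degrees of freedom); that substitution, and the variable names, are assumptions of this example.

```python
import math

n = 196                       # valid cases (df + 1 from the One-Sample Test table)
mean_difference = 10.80102    # sample mean minus the test value of 75
total_ss, total_df = 10881.240, 195   # Total row of the ANOVA table in item g

# Sample standard deviation of the exam scores
sd = math.sqrt(total_ss / total_df)

# Standard error of the mean and the one-sample t statistic
standard_error = sd / math.sqrt(n)
t_stat = mean_difference / standard_error

print(round(sd, 4), round(standard_error, 4), round(t_stat, 3))
```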

f. If your answer in 3c either coincides with the assumptions of normality and equality of variances
or violates them, please proceed as it is and compute the independent t-test for exam score as the
dependent variable and gender as the grouping variable. Test whether there is a significant
difference between the gender. Show all the hypothesis testing steps. (10 pts)

Independent Samples Test
                                          Levene's Test for
                                          Equality of Variances   t-test for Equality of Means
                                                                                                                              95% Confidence Interval
                                                                                    Sig.         Mean         Std. Error     of the Difference
                                          F        Sig.           t       df        (2-tailed)   Difference   Difference     Lower       Upper
Exam Score  Equal variances assumed       .162     .688           .789    194       .431         .84297       1.06824        -1.26388    2.94981
            Equal variances not assumed                           .789    193.904   .431         .84297       1.06826        -1.26393    2.94986

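A. Hypotheses: H0: µ1 = µ2 (the mean exam scores of the two gender groups are equal)
Ha: µ1 ≠ µ2 (the mean exam scores of the two gender groups are different)
B. Test statistic: t = .789 (df = 194, equal variances assumed because Levene's test gives F = .162, p = .688)
C. P-value: .431
D. Significance Level: α = 0.05. Since p > 0.05, H0 is not rejected and the test is not statistically significant.
E. Conclusion: Therefore, there is no significant difference in exam score between the two gender
groups (t = .789, p = .431).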
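
The pooled-variance t statistic in the "Equal variances assumed" row can be rebuilt from the group descriptives in item b. The group sizes of 97 and 99 are taken from the df column of the normality tests by gender, since the Descriptives table does not list N; this is a sketch with illustrative variable names, not the SPSS routine.

```python
import math

# Group summaries: gender 1.00 and gender 2.00
n1, mean1, sd1 = 97, 86.2268, 7.48402
n2, mean2, sd2 = 99, 85.3838, 7.47063

# Pooled variance weights each group variance by its degrees of freedom
df = n1 + n2 - 2
pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df

# Standard error of the difference between the two means
standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

t_stat = (mean1 - mean2) / standard_error

print(df, round(standard_error, 5), round(t_stat, 3))
```

Note that the p-value for this test is the Sig. (2-tailed) value of .431, while .688 belongs to Levene's test and only decides which row of the table to read.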

g. If your answer in 3c either coincides with the assumptions of normality and equality of variances
or violates them, please proceed as it is and compute the analysis of variance for exam score as
the dependent variable and ethnicity as the factor. Test whether there is a significant difference
among the groups. Show all the hypothesis testing steps. (10 pts)

ANOVA
Exam Score
Sum of Squares df Mean Square F Sig.
Between Groups 282.230 4 70.558 1.271 .283
Within Groups 10599.009 191 55.492
Total 10881.240 195

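A. Hypotheses: H0: µ1 = µ2 = µ3 = µ4 = µ5 (the mean exam scores of the five ethnicity groups are equal)
Ha: at least one ethnicity group has a mean exam score different from the others
B. Test statistic: F = 1.271 (df = 4, 191)
C. P-value: .283
D. Significance Level: α = 0.05. Since p > 0.05, H0 is not rejected and the test is not statistically significant.
E. Conclusion: Therefore, there is no significant difference in exam score among the ethnicity
groups (F = 1.271, p = .283).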
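
The ANOVA table can be rebuilt from the group descriptives in item c. The group sizes (17, 42, 58, 51, 28) are taken from the df column of the normality tests by ethnicity, since the Descriptives table does not list N; the sketch below is an illustration with assumed variable names, and small differences from the SPSS values come from rounding in the printed means and variances.

```python
# (n, mean, variance) for ethnicity groups 1.00 to 5.00
groups = [
    (17, 82.7647, 37.066),
    (42, 86.6190, 50.193),
    (58, 85.4828, 60.184),
    (51, 85.5294, 60.054),
    (28, 87.5714, 56.106),
]

total_n = sum(n for n, _, _ in groups)
grand_mean = sum(n * mean for n, mean, _ in groups) / total_n

# Between-groups: spread of the group means around the grand mean
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
df_between = len(groups) - 1

# Within-groups: pooled spread of the scores inside each group
ss_within = sum((n - 1) * variance for n, _, variance in groups)
df_within = total_n - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)

print(round(ss_between, 2), round(ss_within, 2), round(f_stat, 3))
```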

h. Bonus Question (5 pts): Is a Post Hoc test necessary to be conducted based on the result in item
g? Why or why not?
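There is no need to conduct a Post Hoc test because the ANOVA showed no significant difference among the
ethnicity groups (F = 1.271, p = .283). A Post Hoc test is only used after a significant F to identify which
pairs of groups differ, so here there are no group differences to locate.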
