
Nepal Open University

Faculty of Science, Health and Technology


Master in e-Governance (3rd Semester)

Course Code: EGVN602 Course Title: Research Methodology


Roll No.: 77241093 Name: Arun Kumar Chhetri

Assignment - III

1. Using the given database, calculate the Mean and Standard Deviation (SD) for the variables:
scores of Mathematics and computer separately for both Male and Female, then interpret the
result in a Word file.
2. Using the given database, calculate the Percentage for the variables: Scores of Computer and
Science across all ethnicities and the interpretation of results in the same Word file.
3. Using the given database, calculate the ANOVA test for the variables of scores of Mathematics
across all ethnicities and interpret the result in the same word file.
4. Using the given database, you wish to observe the effect of Mathematics scores on Science scores.
Run appropriate statistical tests and interpret the result in the same Word file

1. Solution:

I used IBM SPSS Statistics 26 to determine the mean and standard deviation of the Mathematics and
Computer scores for Male and Female students. The steps are as follows:

• Open IBM SPSS Statistics 26.
• Import the dataset provided in the question.
• Click on Analyze > Descriptive Statistics > Explore.
• Select the variables "Math_Score" and "Computer_score" and move them to the "Dependent List" box.
• Select the variable "Sex" and move it to the "Factor List" box.
• Click on the "Statistics" button and make sure "Descriptives" is checked; this option reports the mean and standard deviation.
• Click on the "Continue" button and then click on the "OK" button to run the analysis.
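
For readers without SPSS, the grouping logic behind these steps can be sketched in Python. This is a minimal illustration only: the rows below are hypothetical toy values, not the assignment dataset, and the function name describe_by_group is an assumption made for this example.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy rows (sex, math_score); NOT taken from the assignment dataset.
rows = [
    ("Female", 50), ("Female", 60), ("Female", 40),
    ("Male", 55), ("Male", 45), ("Male", 65), ("Male", 35),
]


def describe_by_group(records):
    """Return {group: (n, mean, sample SD)} for (group, value) pairs."""
    groups = defaultdict(list)
    for group, value in records:
        groups[group].append(value)
    # stdev() divides by n - 1, the same sample SD that SPSS reports.
    return {g: (len(v), mean(v), stdev(v)) for g, v in groups.items()}


summary = describe_by_group(rows)
for group, (n, m, sd) in sorted(summary.items()):
    print(f"{group}: n={n}, mean={m:.2f}, SD={sd:.3f}")
```

Replacing the toy rows with the real (Sex, Math_Score) pairs should give the same means and standard deviations as the Explore procedure, since both use the n - 1 denominator.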

The output will show the mean and standard deviation of the Mathematics and Computer scores for
both Male and Female separately. Below is the output:

EXAMINE VARIABLES=Math_Score Computer_score BY Sex
  /PLOT BOXPLOT STEMLEAF
  /COMPARE GROUPS
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.

Explore

Notes
Output Created: 06-MAY-2023 11:51:36
Comments:
Input
  Data: C:\Users\Dell\Downloads\dataset for Assignment III.sav
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 200
Missing Value Handling
  Definition of Missing: User-defined missing values for dependent variables are treated as missing.
  Cases Used: Statistics are based on cases with no missing values for any dependent variable or factor used.
Syntax:
  EXAMINE VARIABLES=Math_Score Computer_score BY Sex
    /PLOT BOXPLOT STEMLEAF
    /COMPARE GROUPS
    /STATISTICS DESCRIPTIVES
    /CINTERVAL 95
    /MISSING LISTWISE
    /NOTOTAL.
Resources
  Processor Time: 00:00:01.19
  Elapsed Time: 00:00:00.36

Sex of Students

Case Processing Summary
                                              Cases
                                         Valid           Missing         Total
                       Sex of Students   N     Percent   N     Percent   N
Score in Mathematics   Female            91    100.0%    0     0.0%      91
                       Male              109   100.0%    0     0.0%      109
Score in Computer      Female            91    100.0%    0     0.0%      91
                       Male              109   100.0%    0     0.0%      109

Descriptives
                       Sex of Students                    Statistic   Std. Error
Score in Mathematics   Female            Mean             52.95       1.013
                                         Std. Deviation   9.665
                       Male              Mean             52.39       .877
                                         Std. Deviation   9.151
Score in Computer      Female            Mean             50.12       1.080
                                         Std. Deviation   10.305
                       Male              Mean             54.99       .779
                                         Std. Deviation   8.134

Interpretation: According to the output, the mean Mathematics score for Male students is 52.39 with
a standard deviation of 9.151, and for Female students it is 52.95 with a standard deviation of 9.665.
Similarly, the mean Computer score for Male students is 54.99 with a standard deviation of 8.134,
and for Female students it is 50.12 with a standard deviation of 10.305.

The mean Mathematics score is almost the same for Male and Female students (52.39 versus 52.95). The
mean Computer score is higher for Male students than for Female students (54.99 versus 50.12, a
difference of 4.87 points). The standard deviations also differ between the two groups. The standard
deviation measures the variability of the data: a higher standard deviation indicates that the data
points are spread out over a larger range, while a lower standard deviation indicates that they are
clustered closely around the mean.

In this case, we can see that the standard deviation for Female is higher than for Male, for both
Mathematics and Computer scores. This indicates that the data points for Female are more spread out
over a larger range, which means that there is more variability in the scores for Female as compared
to Male.
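
As a quick consistency check on the Descriptives output, the Std. Error column can be recomputed from the standard deviation and group size, since the standard error of a mean is SD / sqrt(n). The sketch below uses only the values printed in the output; the helper name standard_error is an assumption made for this example.

```python
from math import sqrt

# (n, mean, SD, reported Std. Error) copied from the Descriptives output.
descriptives = {
    ("Mathematics", "Female"): (91, 52.95, 9.665, 1.013),
    ("Mathematics", "Male"): (109, 52.39, 9.151, 0.877),
    ("Computer", "Female"): (91, 50.12, 10.305, 1.080),
    ("Computer", "Male"): (109, 54.99, 8.134, 0.779),
}


def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / sqrt(n)


for (subject, sex), (n, m, sd, se_reported) in descriptives.items():
    se = standard_error(sd, n)
    print(f"{subject:12s} {sex:7s} SE={se:.3f} (reported {se_reported:.3f})")

# Male minus Female difference in mean scores.
gap_math = descriptives[("Mathematics", "Male")][1] - descriptives[("Mathematics", "Female")][1]
gap_computer = descriptives[("Computer", "Male")][1] - descriptives[("Computer", "Female")][1]
print(f"Mathematics gap: {gap_math:.2f}, Computer gap: {gap_computer:.2f}")
```

The recomputed standard errors agree with the reported ones to within rounding, which suggests the table was transcribed correctly.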

2. Solution:
To calculate the percentage of Computer and Science scores across all ethnicities, we can also simply
use Excel: first find the total score for each subject, and then calculate the percentage for each
subject.

Total Scores for Computer = 4362


Total Scores for Science = 4043
Percentage Scores for Computer: (4362/400) *100 = 109.05%
Percentage Scores for Science: (4043/400) *100 = 101.075%
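
The arithmetic in the two percentage lines can be checked with a short sketch. The totals and the divisor of 400 are taken exactly as written above; the alternative divisor of 4000 is only an assumption used to see which denominator reproduces the stated results, and which one was intended is not known from the text.

```python
def percentage(part, whole):
    """Express part as a percentage of whole."""
    return part / whole * 100


# Totals and divisor exactly as written in the calculation above.
computer_total, science_total = 4362, 4043
divisor_as_written = 400
# Assumption for comparison only: a divisor of 4000 is NOT stated in the text.
divisor_alternative = 4000

for label, total in (("Computer", computer_total), ("Science", science_total)):
    as_written = percentage(total, divisor_as_written)
    alternative = percentage(total, divisor_alternative)
    print(f"{label}: /400 gives {as_written:.3f}%, /4000 gives {alternative:.3f}%")
```

With a divisor of 400 the expressions evaluate to 1090.5% and 1010.75%; the stated 109.05% and 101.075% match a divisor of 4000. Either way the result exceeds 100%, so the denominator cannot be the maximum attainable total.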

To calculate the percentages for the same data in IBM SPSS Statistics 26, we go through the steps
below.
• Open IBM SPSS Statistics 26 and import the dataset provided.
• Click on "Analyze" from the top menu and select "Descriptive Statistics" followed by "Frequencies".
• Select the variables "Computer_score" and "Science_Score" and move them to the "Variable(s)" box.
• Click on the "Statistics" button and, under "Percentile Values", check "Percentile(s)" and add the value 100. The Percent column is part of the default frequency table and needs no extra option.
• Click on the "Charts" button and select "None".
• Click on "OK" to generate the output.

The output displays the frequency and percentage of each Computer and Science score for all students,
with all ethnicities pooled. The output is shown below:

FREQUENCIES VARIABLES=Computer_score Science_Score
  /PERCENTILES=100.0
  /ORDER=ANALYSIS.

Frequencies

Notes
Output Created: 06-MAY-2023 16:23:27
Comments:
Input
  Data: C:\Users\Dell\Downloads\dataset for Assignment III.sav
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 200
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics are based on all cases with valid data.
Syntax:
  FREQUENCIES VARIABLES=Computer_score Science_Score
    /PERCENTILES=100.0
    /ORDER=ANALYSIS.
Resources
  Processor Time: 00:00:00.00
  Elapsed Time: 00:00:00.01

[DataSet1] C:\Users\Dell\Downloads\dataset for Assignment III.sav
Statistics
Score in Computer Score in Science
N Valid 200 200
Missing 0 0
Percentiles 100 67.00 74.00

Frequency Table
Score in Computer
Frequency Percent Valid Percent Cumulative Percent
Valid 31 4 2.0 2.0 2.0
33 4 2.0 2.0 4.0
35 2 1.0 1.0 5.0
36 2 1.0 1.0 6.0
37 3 1.5 1.5 7.5
38 1 .5 .5 8.0
39 5 2.5 2.5 10.5
40 3 1.5 1.5 12.0
41 10 5.0 5.0 17.0
42 2 1.0 1.0 18.0
43 1 .5 .5 18.5
44 12 6.0 6.0 24.5
45 1 .5 .5 25.0
46 9 4.5 4.5 29.5
47 2 1.0 1.0 30.5
49 11 5.5 5.5 36.0
50 2 1.0 1.0 37.0
52 15 7.5 7.5 44.5
53 1 .5 .5 45.0
54 17 8.5 8.5 53.5
55 3 1.5 1.5 55.0
57 12 6.0 6.0 61.0
59 25 12.5 12.5 73.5
60 4 2.0 2.0 75.5
61 4 2.0 2.0 77.5
62 18 9.0 9.0 86.5
63 4 2.0 2.0 88.5
65 16 8.0 8.0 96.5
67 7 3.5 3.5 100.0
Total 200 100.0 100.0

Score in Science
Frequency Percent Valid Percent Cumulative Percent
Valid 26 1 .5 .5 .5
29 1 .5 .5 1.0
31 2 1.0 1.0 2.0
33 2 1.0 1.0 3.0
34 5 2.5 2.5 5.5
35 1 .5 .5 6.0
36 4 2.0 2.0 8.0
39 13 6.5 6.5 14.5
40 2 1.0 1.0 15.5
42 12 6.0 6.0 21.5
44 11 5.5 5.5 27.0
45 1 .5 .5 27.5
46 1 .5 .5 28.0
47 11 5.5 5.5 33.5
48 2 1.0 1.0 34.5
49 2 1.0 1.0 35.5
50 21 10.5 10.5 46.0
51 2 1.0 1.0 47.0
53 15 7.5 7.5 54.5
54 3 1.5 1.5 56.0
55 18 9.0 9.0 65.0
56 2 1.0 1.0 66.0
57 1 .5 .5 66.5
58 19 9.5 9.5 76.0
59 1 .5 .5 76.5
61 14 7.0 7.0 83.5
63 12 6.0 6.0 89.5
64 1 .5 .5 90.0
65 1 .5 .5 90.5
66 9 4.5 4.5 95.0
67 1 .5 .5 95.5
69 6 3.0 3.0 98.5
72 2 1.0 1.0 99.5
74 1 .5 .5 100.0
Total 200 100.0 100.0

Interpretation: The percentages obtained from the Excel calculation (109.05% for Computer and
101.075% for Science) are above 100%, which is not possible for a percentage. This indicates an error
in the calculation itself, not a property of the data: as written, a divisor of 400 gives 1090.5% and
1010.75%, and the stated results correspond to a divisor of 4000. The totals also disagree with the
frequency tables above, which sum to 10,555 for Computer and 10,370 for Science over the 200
students. These figures should therefore be rechecked before any conclusion is drawn from them.

For the Computer scores, the output shows that the most frequent score is 59, which is achieved by
12.5% of the participants (25 of 200), followed by 62 (9.0%) and 54 (8.5%). The least frequent scores
are 38, 43, 45 and 53, each achieved by 0.5% of the participants (one student each).
For the Science scores, the output shows that the most frequent score is 50, which is achieved by
10.5% of the participants (21 of 200), followed by 58 (9.5%) and 55 (9.0%). The lowest score, 26, is
achieved by 0.5% of the participants, as are several other scores that occur only once.
Overall, the majority of participants scored in the middle range for both computer and science, with a
few high and low scores. It is worth noting that the percentage distribution of scores may vary
depending on the sample size and the characteristics of the population.
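
The Percent column of a frequency table is simply each count divided by the number of valid cases. The sketch below recomputes it for the Computer scores, using the counts copied from the frequency table above; the variable names are assumptions made for this example.

```python
# Frequency counts for "Score in Computer", copied from the frequency table.
computer_counts = {
    31: 4, 33: 4, 35: 2, 36: 2, 37: 3, 38: 1, 39: 5, 40: 3, 41: 10, 42: 2,
    43: 1, 44: 12, 45: 1, 46: 9, 47: 2, 49: 11, 50: 2, 52: 15, 53: 1, 54: 17,
    55: 3, 57: 12, 59: 25, 60: 4, 61: 4, 62: 18, 63: 4, 65: 16, 67: 7,
}

n_valid = sum(computer_counts.values())

# Percent column: share of students obtaining each score.
percent = {score: 100 * count / n_valid for score, count in computer_counts.items()}

# Three most frequent scores and the scores that occur least often.
top_three = sorted(computer_counts, key=computer_counts.get, reverse=True)[:3]
fewest = min(computer_counts.values())
rarest = [score for score, count in computer_counts.items() if count == fewest]

# Total and mean score implied by the frequency table.
total_score = sum(score * count for score, count in computer_counts.items())
mean_score = total_score / n_valid

print(f"N = {n_valid}, top three = {top_three}, rarest = {rarest}")
print(f"Percent for 59 = {percent[59]}%, total = {total_score}, mean = {mean_score:.3f}")
```

The same pattern applies to the Science scores by swapping in the counts from the second frequency table.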

3. Solution:
To perform ANOVA on the given data, we need to test if there is a significant difference in the mean
math scores across all ethnicities. Here, the null hypothesis is that the mean math score is the same
across all ethnicities, and the alternative hypothesis is that at least one group's mean math score is
different from the others.

To run the ANOVA test for the Mathematics scores across all ethnicities, we can follow these
steps:
• Open the IBM SPSS Statistics software and import the given dataset into it.
• Click on Analyze > Compare Means > One-Way ANOVA.
• In the One-Way ANOVA dialog box, select the variable "Math_Score" and move it to the Dependent List box.
• Select the variable "Ethnicity" and move it to the Factor box.
• Click on the Post Hoc button and select the tests for pairwise comparisons between the groups (here, Tukey and Tamhane's T2).
• Click on OK to run the ANOVA test.

The output will show the results of the ANOVA test, including the F-ratio, degrees of freedom, and p-
value. We can interpret the results by looking at the p-value. If the p-value is less than the alpha level
(usually set at 0.05), we reject the null hypothesis that there is no significant difference between the
groups and conclude that there is a significant difference between at least two groups. If the p-value is
greater than the alpha level, we fail to reject the null hypothesis and conclude that there is no significant
difference between the groups.

ONEWAY Math_Score BY Ethnicity
  /MISSING ANALYSIS.

Oneway

Notes
Output Created: 06-MAY-2023 13:34:13
Comments:
Input
  Data: C:\Users\Dell\Downloads\dataset for Assignment III.sav
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 200
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics for each analysis are based on cases with no missing data for any variable in the analysis.
Syntax:
  ONEWAY Math_Score BY Ethnicity
    /MISSING ANALYSIS.
Resources
  Processor Time: 00:00:00.00
  Elapsed Time: 00:00:00.01

ANOVA
Score in Mathematics
Sum of Squares df Mean Square F Sig.
Between Groups 1842.140 3 614.047 7.703 .000
Within Groups 15623.655 196 79.713
Total 17465.795 199
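
The F value in the ANOVA table can be reproduced from the sums of squares and degrees of freedom, since F is the between-groups mean square divided by the within-groups mean square. The sketch below uses only the numbers printed in the table; the variable names are assumptions made for this example.

```python
# Values copied from the ANOVA table for Score in Mathematics.
ss_between, df_between = 1842.140, 3
ss_within, df_within = 15623.655, 196

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F ratio = between-groups mean square / within-groups mean square.
f_ratio = ms_between / ms_within

# The two components should add up to the Total row.
ss_total = ss_between + ss_within
df_total = df_between + df_within

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.3f}")
print(f"Total SS = {ss_total:.3f}, total df = {df_total}")
```

A Sig. value printed as .000 means the p-value is below 0.0005 after rounding, so it is usually reported as p < 0.001.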

ONEWAY Math_Score BY Ethnicity
  /MISSING ANALYSIS
  /POSTHOC=TUKEY T2 ALPHA(0.05).

Oneway

Notes
Output Created: 06-MAY-2023 13:34:46
Comments:
Input
  Data: C:\Users\Dell\Downloads\dataset for Assignment III.sav
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 200
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics for each analysis are based on cases with no missing data for any variable in the analysis.
Syntax:
  ONEWAY Math_Score BY Ethnicity
    /MISSING ANALYSIS
    /POSTHOC=TUKEY T2 ALPHA(0.05).
Resources
  Processor Time: 00:00:00.00
  Elapsed Time: 00:00:00.01

ANOVA
Score in Mathematics
Sum of Squares df Mean Square F Sig.
Between Groups 1842.140 3 614.047 7.703 .000
Within Groups 15623.655 196 79.713
Total 17465.795 199

Post Hoc Tests


Multiple Comparisons
Dependent Variable: Score in Mathematics

            (I) Ethnicity     (J) Ethnicity     Mean Difference
            of Students       of Students       (I-J)             Std. Error   Sig.
Tukey HSD   Janajati          Dalit             -9.856*           3.251        .015
                              Madhesi           .667              2.703        .995
                              Chhetri/Bhramin   -6.556*           1.968        .006
            Dalit             Janajati          9.856*            3.251        .015
                              Madhesi           10.523*           3.351        .010
                              Chhetri/Bhramin   3.300             2.792        .639
            Madhesi           Janajati          -.667             2.703        .995
                              Dalit             -10.523*          3.351        .010
                              Chhetri/Bhramin   -7.222*           2.130        .005
            Chhetri/Bhramin   Janajati          6.556*            1.968        .006
                              Dalit             -3.300            2.792        .639
                              Madhesi           7.222*            2.130        .005
Tamhane     Janajati          Dalit             -9.856            3.368        .062
                              Madhesi           .667              2.034        1.000
                              Chhetri/Bhramin   -6.556*           1.625        .002
            Dalit             Janajati          9.856             3.368        .062
                              Madhesi           10.523*           3.379        .043
                              Chhetri/Bhramin   3.300             3.149        .898
            Madhesi           Janajati          -.667             2.034        1.000
                              Dalit             -10.523*          3.379        .043
                              Chhetri/Bhramin   -7.222*           1.647        .001
            Chhetri/Bhramin   Janajati          6.556*            1.625        .002
                              Dalit             -3.300            3.149        .898
                              Madhesi           7.222*            1.647        .001

Homogeneous Subsets

Score in Mathematics
                                                   Subset for alpha = 0.05
                  Ethnicity of Students   N        1        2        3
Tukey HSD (a,b)   Madhesi                 20       46.75
                  Janajati                24       47.42    47.42
                  Chhetri/Bhramin         145               53.97    53.97
                  Dalit                   11                         57.27
                  Sig.                             .995     .083     .627

Means for groups in homogeneous subsets are displayed.

a. Uses Harmonic Mean Sample Size = 21.111.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error
levels are not guaranteed.
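
Footnote a refers to the harmonic mean of the group sizes, which is the number of groups divided by the sum of the reciprocals of the sizes. A minimal sketch of that calculation, using the N column of the Homogeneous Subsets table, is shown below; the variable names are assumptions made for this example.

```python
from statistics import harmonic_mean

# Group sizes copied from the N column of the Homogeneous Subsets table.
group_sizes = {"Madhesi": 20, "Janajati": 24, "Chhetri/Bhramin": 145, "Dalit": 11}

sizes = list(group_sizes.values())

# Harmonic mean via the standard library.
h_library = harmonic_mean(sizes)

# The same quantity written out: k / (1/n1 + 1/n2 + ... + 1/nk).
h_manual = len(sizes) / sum(1 / n for n in sizes)

print(f"Total N = {sum(sizes)}")
print(f"Harmonic mean sample size = {h_library:.3f}")
```

The harmonic mean (about 21.1) is far below the arithmetic mean of 50 because it is pulled toward the smallest groups, which is why the footnote warns that Type I error levels are not guaranteed.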

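The ANOVA table shows that the F value is 7.703 with 3 and 196 degrees of freedom, and the
associated p-value (Sig.) is .000, that is, p < 0.001. Since the p-value is less than the significance
level of 0.05, we reject the null hypothesis. This means that there is a statistically significant
difference in the mean math scores across the ethnic groups.

The group means are 46.75 for Madhesi, 47.42 for Janajati, 53.97 for Chhetri/Bhramin and 57.27 for
Dalit. The Tukey HSD post hoc test shows significant differences between Janajati and Dalit
(p = .015), Janajati and Chhetri/Bhramin (p = .006), Madhesi and Dalit (p = .010), and Madhesi and
Chhetri/Bhramin (p = .005). The differences between Janajati and Madhesi (p = .995) and between
Dalit and Chhetri/Bhramin (p = .639) are not significant. Tamhane's T2, which does not assume equal
variances, agrees except for Janajati versus Dalit (p = .062).

In summary, the ANOVA results suggest that mean math scores differ significantly across ethnicities,
with the Dalit and Chhetri/Bhramin groups scoring higher on average than the Janajati and Madhesi
groups. Because the group sizes are very unequal (from 11 to 145 students), the post hoc results
should be interpreted with caution.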
4. Solution:
To observe the effect of Mathematics scores on Science scores, we can use a simple linear regression
analysis. Here, Science scores are the dependent variable, and Mathematics scores are the
independent variable. We can perform the following steps in IBM SPSS Statistics 26:
1. Open the software and create a new syntax file by clicking on File > New > Syntax.
2. Copy and paste the following syntax into the file:
   REGRESSION
     /MISSING LISTWISE
     /STATISTICS COEFF OUTS R ANOVA
     /CRITERIA=PIN(.05) POUT(.10)
     /NOORIGIN
     /DEPENDENT Science_Score
     /METHOD=ENTER Math_Score.
3. Click on Run > All to execute the syntax.

Output window: The output window shows the results of the linear regression analysis. The table
labeled "Coefficients" gives the values of the intercept (Constant) and the coefficient for
Math_Score. The coefficient represents the expected change in the dependent variable for a one-point
change in the independent variable. The output is shown below:

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT Science_Score
  /METHOD=ENTER Math_Score.

Regression

Notes
Output Created: 06-MAY-2023 13:37:58
Comments:
Input
  Data: C:\Users\Dell\Downloads\dataset for Assignment III.sav
  Active Dataset: DataSet1
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 200
Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics are based on cases with no missing values for any variable used.
Syntax:
  REGRESSION
    /MISSING LISTWISE
    /STATISTICS COEFF OUTS R ANOVA
    /CRITERIA=PIN(.05) POUT(.10)
    /NOORIGIN
    /DEPENDENT Science_Score
    /METHOD=ENTER Math_Score.
Resources
  Processor Time: 00:00:00.03
  Elapsed Time: 00:00:00.02
  Memory Required: 2560 bytes
  Additional Memory Required for Residual Plots: 0 bytes

Variables Entered/Removed (a)
Model   Variables Entered          Variables Removed   Method
1       Score in Mathematics (b)   .                   Enter

a. Dependent Variable: Score in Science
b. All requested variables entered.

Model Summary
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .631 (a)   .398       .395                7.702

a. Predictors: (Constant), Score in Mathematics

ANOVA (a)
Model            Sum of Squares   df    Mean Square   F         Sig.
1   Regression   7760.558         1     7760.558      130.808   .000 (b)
    Residual     11746.942        198   59.328
    Total        19507.500        199

a. Dependent Variable: Score in Science
b. Predictors: (Constant), Score in Mathematics
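
The Model Summary and the regression ANOVA table are linked by simple formulas: R Square is the regression sum of squares divided by the total, F is the ratio of the two mean squares, and the standard error of the estimate is the square root of the residual mean square. The sketch below checks these relations using only the numbers printed in the output; the variable names are assumptions made for this example.

```python
from math import sqrt

# Values copied from the regression ANOVA table.
ss_regression, df_regression = 7760.558, 1
ss_residual, df_residual = 11746.942, 198

ss_total = ss_regression + ss_residual
df_total = df_regression + df_residual

# R Square: share of the variance in Science scores explained by the model.
r_squared = ss_regression / ss_total
r = sqrt(r_squared)

# Adjusted R Square penalises for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * df_total / df_residual

# F ratio and standard error of the estimate.
ms_residual = ss_residual / df_residual
f_ratio = (ss_regression / df_regression) / ms_residual
std_error_estimate = sqrt(ms_residual)

print(f"R = {r:.3f}, R Square = {r_squared:.3f}, Adjusted = {adjusted_r_squared:.3f}")
print(f"F({df_regression}, {df_residual}) = {f_ratio:.3f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.3f}")
```

All five recomputed values match the Model Summary and ANOVA tables to within rounding.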

Coefficients (a)
                             Unstandardized Coefficients   Standardized Coefficients
Model                        B        Std. Error           Beta                        t        Sig.
1   (Constant)               16.758   3.116                                            5.378    .000
    Score in Mathematics     .667     .058                 .631                        11.437   .000

a. Dependent Variable: Score in Science

Interpretation: The unstandardized coefficient for Math_Score is 0.667 and the constant is 16.758, so
the fitted equation is Science_Score = 16.758 + 0.667 × Math_Score. This means that for every
one-point increase in Math_Score, Science_Score is expected to increase by 0.667 points. The p-value
for Math_Score is .000 (p < 0.001), which is less than 0.05, so the relationship is statistically
significant.

The ANOVA table gives F = 130.808 with 1 and 198 degrees of freedom and p < 0.001, so the
regression model as a whole is significant. The Model Summary shows R = .631 and R Square = .398,
which means that Mathematics scores explain about 39.8% of the variance in Science scores.

Therefore, we can conclude that there is a moderate positive relationship between Mathematics scores
and Science scores. As Mathematics scores increase, Science scores tend to increase as well. However,
we should keep in mind that correlation does not imply causation, and other variables could also be
influencing the relationship between Mathematics scores and Science scores.
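
To make the coefficients concrete, the fitted line can be used to predict a Science score from a Mathematics score. The sketch below takes the intercept and slope from the Coefficients table; the function name predict_science and the example inputs of 50 and 60 are assumptions chosen for illustration.

```python
# Intercept (Constant) and slope (Score in Mathematics) from the Coefficients table.
INTERCEPT = 16.758
SLOPE = 0.667


def predict_science(math_score):
    """Predicted Science score from the fitted simple linear regression."""
    return INTERCEPT + SLOPE * math_score


# Hypothetical example inputs, not rows from the dataset.
low, high = 50, 60
pred_low = predict_science(low)
pred_high = predict_science(high)

print(f"Math {low} -> predicted Science {pred_low:.3f}")
print(f"Math {high} -> predicted Science {pred_high:.3f}")
print(f"Difference for 10 extra Math points: {pred_high - pred_low:.2f}")
```

These are average predictions only: with a standard error of the estimate of 7.702, individual Science scores can lie well above or below the fitted line.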

The End
