

Data Analysis and Application (DAA)

Student Name

Capella University

Data Analysis and Application (DAA)

Correlation analysis is used to assess the strength and direction of a linear relationship between two

variables. When the two variables are normally distributed, Pearson’s correlation

coefficient is used; otherwise, Spearman’s correlation coefficient is used (Mukaka, 2012). It is

important to note that Pearson correlation analysis should only be used when the two variables

are quantitative, continuous and normally distributed. However, if the variables are of the ordinal

level of measurement, are continuous but non-normally distributed, or have relevant

outliers, then the Spearman correlation should be used instead (Schober, Boer, & Schwarte,

2018). The present analysis describes the data set and the correlation test used. It then

tests the assumptions of correlation and states the research question, hypotheses and level of

significance. Finally, the results of the analysis are interpreted and a conclusion is offered.
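
The difference between the two coefficients can be illustrated with a minimal sketch. The data below are hypothetical (they are not taken from grades.sav), and the helper names pearson_r, average_ranks and spearman_rho are illustrative only. Spearman’s coefficient is computed here as Pearson’s r applied to the ranks of the observations, which is why it is less affected by an extreme value.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: monotonic but non-linear, with one extreme value.
x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 400]

print("Pearson r:", round(pearson_r(x, y), 3))
print("Spearman rho:", round(spearman_rho(x, y), 3))
```

In this hypothetical example the ranks of x and y agree perfectly, so the Spearman coefficient is 1, whereas the extreme value pulls Pearson’s r well below 1.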

Section 1: Data File Description

For this analysis, the grades.sav data set was used, which included various variables such

as the names of the participants, their gender, ethnicity, GPA scores, scores in five different

quizzes, and score of the final test. The data also included the participants’ total score, total

percentage and the grade received as well as whether the total score was a pass or fail. However,

the present analysis makes use of only four variables: gender, GPA, total and final. Gender

describes the sex of a participant – male is represented by 1 and female is represented by 2. Final

describes the participants’ score in the final exam whereas total describes the participants’ total

score in the course (five quizzes and the final exam). Finally, GPA describes the participants’

cumulative grade point average scores.


All four variables are numeric but have different levels of measurement: nominal, ordinal,

interval or ratio. While gender is a nominal categorical variable, GPA, final and total

are all continuous scale variables. Specifically, GPA is a ratio measurement, total is an interval

measurement, and final is an ordinal measurement. Therefore, for the interval or ratio variables,

Pearson’s r is preferred, whereas Spearman’s correlation applies to the ordinal variable and the

point-biserial r applies where one of the variables is nominal and dichotomous. Additionally, the sample size for the

grades.sav data set was N = 105.
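
The point-biserial r mentioned above is numerically the same as Pearson’s r when one variable is a two-category code. The following sketch illustrates this under stated assumptions: the gender and GPA values are hypothetical (not taken from grades.sav), the 1/2 coding follows the data description above, and the function names are illustrative.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def point_biserial(groups, scores, low=1, high=2):
    """Point-biserial r from the two group means and the overall spread."""
    low_scores = [s for g, s in zip(groups, scores) if g == low]
    high_scores = [s for g, s in zip(groups, scores) if g == high]
    n = len(scores)
    spread = pstdev(scores)  # population standard deviation (divisor n)
    proportion_term = sqrt(len(low_scores) * len(high_scores) / n ** 2)
    return (mean(high_scores) - mean(low_scores)) / spread * proportion_term


# Hypothetical values for illustration only.
gender = [1, 1, 1, 1, 2, 2, 2, 2]
gpa = [3.4, 3.1, 2.9, 3.6, 2.5, 3.0, 2.2, 2.8]

print("Point-biserial r:", round(point_biserial(gender, gpa), 3))
print("Pearson r:       ", round(pearson_r(gender, gpa), 3))
```

Both calls return the same value, so a Pearson correlation requested for a dichotomous variable can be read as a point-biserial correlation. The sign depends only on which category receives the higher code.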

Section 2: Testing Assumptions

When performing a correlation analysis between final and GPA, four key assumptions

ought to be tested. First of all, the variables should occur either at the interval or ratio level.

Secondly, there is the assumption that the relationship between the variables is linear and, thirdly,

it is assumed that the data are normally distributed (Gogtay & Thatte, 2017). Finally, it is

assumed that there are no significant outliers within the data set (Hazra & Gogtay, 2016).

Histogram for Final and GPA


The SPSS output above represents the histogram of the final variable and the GPA

variable. The results show that the participants had a mean final exam score of M = 61.48 (SD = 7.94) and a

mean GPA score of M = 2.86 (SD = .67) with a sample size of N = 105. Additionally, the

histogram for GPA represents the number of participants who had a GPA score between 1.00 and

4.00 whereas the histogram for final represents the number of students who scored between 40

and 75 in the final exam. Visually analyzing the histograms shows that the data for both the final

and GPA variables are approximately normal, since the superimposed black curves on the histograms are bell-shaped.


Kurtosis and Skewness for Final and GPA

Descriptive Statistics

                      N           Skewness                  Kurtosis
                      Statistic   Statistic   Std. Error    Statistic   Std. Error
gpa                   105         -0.275      0.236         -0.390      0.467
final                 105         -0.335      0.236         -0.332      0.467
Valid N (listwise)    105

As can be seen in the table above, the two variables are slightly skewed to the left

(negatively skewed), which means that the data for the two variables are not perfectly

symmetrical. However, seeing that the skewness for both variables falls between -.5 and .5, it can

be argued that the data for both variables are fairly symmetrical (Lomax & Hahs-Vaughn, 2013).

With a skewness of -.275 and -.335, the sample data for the participants’ GPA score and final

exam scores respectively are roughly symmetric. Additionally, the kurtosis values for both variables are

less than 0, which means that both distributions are slightly platykurtic (Lomax & Hahs-

Vaughn, 2013).
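
The standard errors in the table can be checked by hand. The sketch below assumes the commonly used large-sample formulas for the standard error of sample skewness and kurtosis, which depend only on the sample size; the function names are illustrative. It also divides each statistic by its standard error, a common informal check in which ratios inside roughly ±1.96 are taken as consistent with normality.

```python
from math import sqrt


def se_skewness(n):
    """Standard error of sample skewness for a sample of size n."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Standard error of sample (excess) kurtosis for a sample of size n."""
    return 2 * se_skewness(n) * sqrt((n * n - 1) / ((n - 3) * (n + 5)))


n = 105
# Skewness and kurtosis statistics copied from the table above.
reported = {"gpa": (-0.275, -0.390), "final": (-0.335, -0.332)}

print("SE skewness:", round(se_skewness(n), 3))
print("SE kurtosis:", round(se_kurtosis(n), 3))

ratios = {}
for name, (skew, kurt) in reported.items():
    ratios[name] = (skew / se_skewness(n), kurt / se_kurtosis(n))
    print(name, "skew/SE:", round(ratios[name][0], 2),
          "kurtosis/SE:", round(ratios[name][1], 2))
```

For N = 105 these formulas give standard errors of about 0.236 and 0.467, matching the table, and all four ratios fall inside ±1.96, which is in line with the conclusion that both variables are roughly normally distributed.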

Scatter Plot for Final and GPA

The scatter plot for final versus GPA shows that as the participants’ score in the final exam

increases, so does their GPA score. Accordingly, this suggests that there is a positive linear

relationship between the two variables. Further, a visual inspection of the graph shows that there

were no existing outliers in the sample data set.

Conclusion of Assumptions

The results of the above analysis show that the correlation assumptions were met. For

example, the measurement of the two variables was at the ratio or interval level. The scatter plot

showed that there is a positive linear relationship between the two variables and a visual

inspection of the scatter plot diagram showed that there were no significant outliers. Finally, the

histograms for both variables were bell-shaped hence the two variables were roughly normally

distributed. As such, it was concluded that the assumptions of correlation were met.

Section 3: Research Question, Hypotheses, and Alpha Level

Research Question

Is there a relationship between final exam score and GPA score? Is the relationship


Null Hypothesis

There is no relationship between final exam score and GPA score.

Alternative Hypothesis

There is a relationship between final exam score and GPA score.

Alpha Level of Significance

In this case, the most commonly assumed level of significance was used (α = .05). This

means that there is a 5% chance of concluding that a linear relationship exists where there is no

actual linear relationship.

Section 4: Interpretation

Correlations

                                gender     gpa        final      total
gender   Pearson Correlation    1          -.319**    -0.140     -0.143
         Sig. (2-tailed)                   0.001      0.156      0.145
         N                      105        105        105        105
gpa      Pearson Correlation    -.319**    1          .524**     .466**
         Sig. (2-tailed)        0.001                 0.000      0.000
         N                      105        105        105        105
final    Pearson Correlation    -0.140     .524**     1          .891**
         Sig. (2-tailed)        0.156      0.000                 0.000
         N                      105        105        105        105
total    Pearson Correlation    -0.143     .466**     .891**     1
         Sig. (2-tailed)        0.145      0.000      0.000
         N                      105        105        105        105

**. Correlation is significant at the 0.01 level (2-tailed).

Lowest Magnitude

The analysis showed that for the 105 participants surveyed, there was a significant

negative relationship between gender and GPA score, r(103) = -.32, p = .001. An effect size (r)

of -.32 shows that this relationship is a moderate negative linear one. Additionally, since the p-

value (p = .001) is less than the assumed level of significance (α = .05), we reject the null

hypothesis that there is no linear relationship between gender and GPA score in favor of the

alternative hypothesis that a significant negative linear relationship exists between participants’

gender and GPA score.

Highest Magnitude

The analysis showed that for the 105 participants surveyed, there was a significant

positive relationship between total score in the course (five quizzes and the final exam) and the

participants’ score in the final exam, r(103) = .89, p < .001. An effect size (r) of .89 shows that

this relationship is a strong positive linear one. Additionally, since the p-value (p < .001) is less

than the assumed level of significance (α = .05), we reject the null hypothesis that there is no

linear relationship between total score in the course (five quizzes and the final exam) and the

participants’ score in the final exam in favor of the alternative hypothesis that a significant

positive linear relationship exists between participants’ total score in the course (five quizzes and

the final exam) and the participants’ score in the final exam.

GPA and Final

The analysis showed that for the 105 participants surveyed, there was a significant

positive relationship between GPA and final, r(103) = .52, p < .001. An effect size (r) of .52

shows that this relationship is a strong positive linear one. Additionally, since the p-value (p <

.001) is less than the assumed level of significance (α = .05), we reject the null hypothesis that

there is no linear relationship between GPA and final in favor of the alternative hypothesis that a

significant positive linear relationship exists between participants’ GPA score and their score in

the final exam.
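
The significance test behind this result can be sketched as follows. The code assumes the usual t test for a Pearson correlation, in which r is converted to a t statistic with N − 2 degrees of freedom; the function name is illustrative. It also reports r squared, the proportion of variance the two variables share.

```python
from math import sqrt


def t_from_r(r, n):
    """Convert a Pearson r from n pairs into a t statistic and its df."""
    df = n - 2
    t = r * sqrt(df) / sqrt(1 - r * r)
    return t, df


# r and N for GPA and final, taken from the Correlations table.
r, n = 0.524, 105
t, df = t_from_r(r, n)
r_squared = r * r

print("df:", df)
print("t:", round(t, 2))
print("r squared:", round(r_squared, 3))
```

With r = .524 and N = 105, the statistic is about t(103) = 6.24, far beyond the two-tailed critical value of roughly 1.98 at α = .05, which agrees with the reported p < .001. The r squared value of about .27 indicates that GPA and final exam score share roughly 27% of their variance.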

Section 5: Conclusion

Considering that a significant positive linear relationship exists between participants’

GPA score and their score in the final exam, it can be concluded that there is a relationship

between students’ final exam score and their GPA score. Correlational analysis was the preferred

statistical test in this case because of its strengths. Correlational analysis allows the researcher to

calculate the strength and direction of a linear relationship and whether the relationship is

significant or not. However, further statistical tests may be needed because a correlational

analysis does not establish cause and effect; hence, even a strong positive correlation between two

variables may lead a researcher to misleading conclusions. Also, further statistical tests may be

necessary because a lack of correlation does not mean that a relationship does not exist; it may be a

non-linear one.
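
The last point can be shown with a small hypothetical example (the values are invented for illustration, and pearson_r is an illustrative helper). Here y is completely determined by x through a quadratic rule, yet Pearson’s r is zero because the relationship is not linear.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: a perfect but non-linear (quadratic) relationship.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r = pearson_r(x, y)
print("Pearson r:", round(r, 3))
```

A scatter plot of such data would reveal the curved pattern immediately, which is why the linearity assumption is checked visually before the coefficient is interpreted.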


References

Gogtay, N. J., & Thatte, U. M. (2017). Principles of correlation analysis. Journal of the

Association of Physicians of India, 65(3), 78-81.

Hazra, A., & Gogtay, N. (2016). Biostatistics series module 6: Correlation and linear

regression. Indian Journal of Dermatology, 61(6), 593. doi: 10.4103/0019-5154.193662

Lomax, R., & Hahs-Vaughn, D. (2013). An Introduction to Statistical Concepts (3rd ed.). New

York: Routledge, Taylor and Francis.

Mukaka, M. M. (2012). A guide to appropriate use of correlation coefficient in medical research.

Malawi Medical Journal, 24(3), 69-71.

Schober, P., Boer, C., & Schwarte, L. (2018). Correlation Coefficients. Anesthesia &

Analgesia, 126(5), 1763-1768. doi: 10.1213/ane.0000000000002864
