
Unidad Educativa

Colegio Alberto Einstein

Statistical analysis of two quantitative variables

Relationship between the time students study and their final test scores

B.) INFORMATION AND MEASUREMENTS
Table #1 Correlation between minutes of study and final test score: The data presented in this table
show the relationship between the time that a student studies and the score they get on the final
test. (https://digitalcommons.odu.edu/cgi/viewcontent.cgi?article=1292&context=ots_masters_projects)
Graph #1:

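Looking at the graph, most points appear very close together, which at first glance suggests a strong
positive correlation between the two variables. The correlation coefficient calculated in section C.1
(r = -0.07) does not support this visual impression.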

C.) Mathematical Procedures

C.1) Analytical Method

To calculate the correlation between the variables x (minutes of study time) and y (final test scores), the
standard deviations and the covariance must be obtained. This is a process of several steps:

a) Arithmetic mean calculations:

Mean of x:

$$\bar{x} = \frac{\sum x}{n} = \frac{7715}{42} = 183.6904762$$

Mean of y:

$$\bar{y} = \frac{\sum y}{n} = \frac{3891.9}{42} = 92.66428571$$
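
The two means can be checked with a few lines of Python. This is only a sketch of the arithmetic: the totals and the sample size are taken from the text, and the variable names are illustrative assumptions.

```python
# Totals and sample size as reported in the text (names are illustrative).
sum_x = 7715     # sum of the minutes of study
sum_y = 3891.9   # sum of the final test scores
n = 42           # number of data points

# Arithmetic mean: sum of the values divided by the number of values.
mean_x = sum_x / n
mean_y = sum_y / n

print(f"mean of x = {mean_x:.7f}")
print(f"mean of y = {mean_y:.8f}")
```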
b.) Specific data calculations to obtain standard deviations and covariance

Standard deviation in x:

$$x - \bar{x} = 120 - 183.6904762 = -63.6904762$$

$$(x - \bar{x})^2 = 4056.47676$$

Standard deviation in y:

$$y - \bar{y} = 95.6 - 92.66428571 = 2.93571429$$

$$(y - \bar{y})^2 = 8.61841839$$

Covariance:

$$(x - \bar{x})(y - \bar{y}) = (-63.6904762)(2.93571429) = -186.977041$$
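
As a rough check, the same per-point quantities can be reproduced in Python for the data point used above (x = 120, y = 95.6). The variable names are assumptions made for this sketch, and the means are recomputed from the totals given in the text.

```python
# Means recomputed from the totals reported in the text.
mean_x = 7715 / 42
mean_y = 3891.9 / 42

# The single data point used in the worked example.
x, y = 120, 95.6

dx = x - mean_x        # deviation of x from its mean
dy = y - mean_y        # deviation of y from its mean
dx_sq = dx ** 2        # term that feeds the standard deviation of x
dy_sq = dy ** 2        # term that feeds the standard deviation of y
cross = dx * dy        # term that feeds the covariance

print(dx, dx_sq)
print(dy, dy_sq)
print(cross)
```

Repeating this for all 42 points and adding up each column gives the sums used in the next step.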
The following table shows these calculations for every data point. Title: Correlation
between minutes of study and final test score.
Table #2: Data summary table for calculation of correlation coefficient

c) Standard deviation and covariance calculations:

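Standard deviation of x:

$$S_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{1604652.98}{42}} = 195.46$$

Standard deviation of y:

$$S_y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}} = \sqrt{\frac{965.59}{42}} = 4.79$$

Covariance:

$$S_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n} = \frac{-2750.96}{42} = -65.5$$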
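
The step from the column sums to the standard deviations and the covariance can be sketched in Python as follows. The sums are the ones reported in the text, the variable names are illustrative, and the division is by n (the population form), which is what the reported results imply.

```python
import math

n = 42
sum_sq_dev_x = 1604652.98   # sum of (x - mean_x)^2, as reported
sum_sq_dev_y = 965.59       # sum of (y - mean_y)^2, as reported
sum_cross = -2750.96        # sum of (x - mean_x)(y - mean_y), as reported

# Population standard deviation: square root of the mean squared deviation.
s_x = math.sqrt(sum_sq_dev_x / n)
s_y = math.sqrt(sum_sq_dev_y / n)

# Population covariance: mean of the cross products.
s_xy = sum_cross / n

print(f"S_x = {s_x:.2f}, S_y = {s_y:.2f}, S_xy = {s_xy:.1f}")
```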

d.) Calculation of the correlation coefficient:

$$r = \frac{S_{xy}}{S_x \cdot S_y} = \frac{-65.5}{195.46 \cdot 4.79} = -0.07$$
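
The value of r can be verified in two ways in Python: from the rounded results used above, and directly from the unrounded sums. This is a sketch with illustrative variable names, and both routes are expected to agree to two decimal places.

```python
import math

# Route 1: rounded values used in the text.
s_xy, s_x, s_y = -65.5, 195.46, 4.79
r_rounded = s_xy / (s_x * s_y)

# Route 2: unrounded sums reported in the text (the factor n cancels out).
sum_sq_dev_x = 1604652.98
sum_sq_dev_y = 965.59
sum_cross = -2750.96
r_from_sums = sum_cross / math.sqrt(sum_sq_dev_x * sum_sq_dev_y)

print(round(r_rounded, 2), round(r_from_sums, 2))
```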

The correlation coefficient obtained is -0.07, which indicates a very weak negative correlation.
Next, the regression line is calculated.

e) Linear regression line


Slope:

$$m = \frac{S_{xy}}{(S_x)^2}$$

$$m = \frac{-65.5}{195.46^2} = -0.0017 \approx -0.002$$

Equation of the linear regression line

$$y - \bar{y} = \frac{S_{xy}}{(S_x)^2} \cdot (x - \bar{x})$$

$$y - 92.66 = -0.002(x - 183.69)$$

$$y = -0.002x + 0.36738 + 92.66$$

$$y = -0.002x + 93.02738$$
The slope of the regression line is negative, which means that the variables have a negative
(inverse) relationship: when the x variable increases, the y variable tends to decrease. The
y-intercept (the value of y when x = 0) is positive, meaning that the model predicts an average
final test score of about 93.03% for a student who studies 0 minutes.
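
The regression line can be rebuilt from the values above with a short Python sketch. It follows the document in rounding the slope to three decimal places before computing the intercept; the function name `predict` and the variable names are assumptions made for illustration.

```python
# Rounded values reported in the text.
s_xy = -65.5
s_x = 195.46
mean_x = 183.69
mean_y = 92.66

slope_exact = s_xy / s_x ** 2     # about -0.0017
slope = round(slope_exact, 3)     # rounded to -0.002, as in the text

# Point-slope form: y - mean_y = slope * (x - mean_x)
intercept = mean_y - slope * mean_x

def predict(minutes):
    """Predicted final test score for a given study time in minutes."""
    return slope * minutes + intercept

print(f"y = {slope}x + {intercept:.5f}")
print(predict(0))   # the y-intercept: predicted score for 0 minutes of study
```

Because the slope is so close to zero, the predicted score barely changes across the whole range of study times.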

C.2) Technological method

Method #2: Line of best fit between time of study and final test score.

$$r^2 = 0.0049$$

$$r = -0.07$$

$$y = -0.002x + 93.02738$$

As can be seen in the data gathered and the graph, the coefficient of determination is 0.0049,
so most of the data points are scattered across the Cartesian plane rather than lying close to the
regression line. Because 0.0049 is very close to 0, the regression line explains only about 0.49% of
the variation in y and cannot be used to predict the y variable reliably.
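
The link between the coefficient of determination and r can be sketched in Python. For a simple linear regression, r has the same sign as the slope, so r is recovered as the signed square root of the coefficient of determination. The variable names are illustrative assumptions.

```python
import math

r_squared = 0.0049   # coefficient of determination reported by the software
slope = -0.002       # slope of the line of best fit

# r takes the sign of the slope in simple linear regression.
r = math.copysign(math.sqrt(r_squared), slope)

# Share of the variation in y explained by the line, as a percentage.
explained_pct = r_squared * 100

print(f"r = {r:.2f}, explained variation = {explained_pct:.2f}%")
```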
Degree of association:

The association between the two variables is negative but very weak, so the final test score
cannot be reliably predicted from the study time with these data. The analytical and technological
methods gave the same correlation coefficient (r), and both indicate a very weak correlation.

Causation:

The data gathered suggest that there could be a causal link between these two variables, but I
think that the apparent link could just be a coincidence. Studying probably does improve the final
test score considerably, because the information is clear in your mind, so the people who studied a
lot and got a good score probably got that score because they studied. On the other hand, people
who feel that they already know the material can enter the test more confidently and get the same
results as the people who studied.
