
STATISTICAL RELATIONSHIPS AMONG VARIABLES

Statistical relationships between variables rest on the notion of correlation. This concept
describes the way in which variables relate to one another.
Relationships in probability and statistics are generally one of three kinds:
deterministic, random, or statistical. A statistical relationship is a mixture of
deterministic and random relationships.

CORRELATION

- Correlation tests are used to determine how strongly the scores of two variables are
associated, or correlated, with each other.
- Correlation is a measure of relationship strength. Pearson's r is one numerical measure of
correlation strength. A value of 0 means no linear relationship, and a value of +1 or -1 means
a perfect relationship.
- Correlation denotes a positive or negative association between the variables in a study.
- Two variables are positively associated when larger values of one tend to be accompanied
by larger values of the other.
- Two variables are negatively associated when larger values of one tend to be
accompanied by smaller values of the other.

Positive Correlation
A positive correlation occurs when two variables move in the same direction. For example, there
is a positive correlation between smoking and lung cancer.
Examples:
1. The longer people are alive, the more job experience they might have.
2. Temperature and cold soda sales: hotter temperatures correlate with more cold soda sales.

Negative Correlation
A negative correlation occurs when two variables move in opposite directions.
Examples:
1. There is a negative correlation between exercise and obesity.
2. Temperature and hot chocolate sales: hotter temperatures correlate with fewer hot chocolate sales.

No Correlation
When two variables are uncorrelated, there is no relationship between them.
For example, there is no correlation between shoe size and IQ.

HOW TO EVALUATE CORRELATION
Correlation is calculated as the level and direction of a relationship between two variables X and Y.
The range of values of a correlation coefficient is from -1 to +1.

Strength of correlation    Negative            Positive

Negligible                 0.00 to -0.20       0.00 to +0.20
Low correlation            -0.21 to -0.40      +0.21 to +0.40
Moderate correlation       -0.41 to -0.70      +0.41 to +0.70
High correlation           -0.71 to -0.90      +0.71 to +0.90
Very high correlation      -0.91 to -0.99      +0.91 to +0.99
Perfect correlation        -1.00               +1.00
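
As a rough sketch, the strength bands can be turned into a small lookup function. The cut-offs used here (0.20, 0.40, 0.70, 0.90) are an assumption: they are the conventional bands that agree with how the worked examples in these notes describe r = -0.36 as low and r = 0.845 as high. The function name `describe_strength` is hypothetical.

```python
def describe_strength(r: float) -> str:
    """Return a verbal label for a correlation coefficient r in [-1, +1].

    The band limits below are assumed conventional cut-offs, not a standard.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 1.0:
        label = "perfect"
    elif size > 0.90:
        label = "very high"
    elif size > 0.70:
        label = "high"
    elif size > 0.40:
        label = "moderate"
    elif size > 0.20:
        label = "low"
    else:
        # Negligible values are reported without a direction.
        return "negligible correlation"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction} correlation"


print(describe_strength(-0.36))   # low negative correlation
print(describe_strength(0.845))   # high positive correlation
```

The sign of r gives the direction and its absolute value gives the strength, so the two are handled separately.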

The Pearson product-moment correlation (r) is the most common correlation coefficient.
Interval or ratio data are required for both variables to calculate Pearson's r.

CORRELATION DOES NOT IMPLY CAUSATION

While correlation does not imply causation, it is often used as a starting point for investigating
causal relationships between variables.

For example, if we observe a positive correlation between smoking and lung cancer, we may
hypothesize that smoking causes lung cancer and design experiments to test this hypothesis.

STATISTICAL SIGNIFICANCE OF A CORRELATION

Table r is used to determine the significance of r. df represents the degrees of freedom:
df = N – 2
Steps to be followed:
Step 1. Find the degrees of freedom in the left column. df = N (pairs of data) minus 2.
Step 2. The heading at the top of each column indicates the odds of a chance occurrence (the
probability of error when declaring r to be significant). P = .10 is the 10%, P = .05 is the 5%, and
P = .01 is the 1% probability level.
Table r. Critical values of the correlation coefficient (r)

df P = .10 P = .05 P = .01


1 .9877 .9969 .999
2 .900 .950 .990
3 .805 .878 .959
4 .729 .811 .917
5 .669 .754 .875
6 .621 .707 .834
7 .582 .666 .798
8 .549 .632 .765
9 .521 .602 .735
10 .497 .576 .708
11 .476 .553 .684
12 .457 .532 .661
13 .441 .514 .641
14 .426 .497 .623
15 .412 .482 .606
16 .400 .468 .590
17 .389 .456 .575
18 .378 .444 .561
19 .369 .433 .549
20 .360 .423 .537
25 .323 .381 .487
30 .296 .349 .449
35 .275 .325 .418
40 .257 .304 .393
45 .243 .288 .372
50 .231 .273 .354
60 .211 .250 .325
70 .195 .232 .302
80 .183 .217 .283
90 .173 .205 .267
100 .164 .195 .254

PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENT


- It is the most widely used type of correlation coefficient, and is also called linear or
product-moment correlation.
- It is a measure of the strength of a linear association between two variables and is
denoted by r.
- In nontechnical language, one can say that the correlation coefficient determines
the extent to which values of the X and Y variables are "proportional" to each other. The
value of the correlation does not depend on the specific measurement units used.

For example:

The correlation between height and weight will be identical regardless of whether inches
and pounds, or centimeters and kilograms, are used as measurement units.

- Proportional means linearly related; that is, the correlation is high if the relationship can be
approximated by a straight line (sloped upwards or downwards).
The line is called the regression line or least-squares line, because it is determined such
that the sum of the squared distances of all data points from the line is the lowest
possible. Pearson correlation assumes that the two variables are measured on at least
interval scales.
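
The claim that r does not depend on measurement units can be checked with a short sketch. The heights and weights below are invented purely for illustration, and `pearson_r` is a hypothetical helper written in the deviation-score form, which is algebraically equivalent to the raw-score computation formula.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for illustration only.
height_in = [60, 62, 65, 68, 71]       # inches
weight_lb = [110, 120, 140, 155, 180]  # pounds

# The same five people measured in centimeters and kilograms.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]

r_imperial = pearson_r(height_in, weight_lb)
r_metric = pearson_r(height_cm, weight_kg)
print(round(r_imperial, 6), round(r_metric, 6))  # the two values agree
```

Multiplying every score by a positive constant rescales the numerator and the denominator by the same factor, which is why the two results match.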

Computation formula:

$$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}}$$

Where:
N = the number of pairs of data
Σ = the summation of the items indicated
ΣX = the sum of all X scores
ΣX² = each X score is squared and then those squares are summed
(ΣX)² = the X scores are summed and the total is squared (the square of
the sum of the scores)
ΣY = the sum of all Y scores
ΣY² = each Y score is squared and then those squares are summed
(ΣY)² = the Y scores are summed and the total is squared
ΣXY = each X score is first multiplied by its corresponding Y score and
the products (XY) are summed

Example:
1. Calculate the correlation between Mathematics (X) and Physics (Y) for the 10 students
whose scores appear in the table below.
Correlation between Mathematics and Physics

Student    Math (X)    Physics (Y)    X²      Y²      XY

1          3           11             9       121     33
2          7           1              49      1       7
3          2           19             4       361     38
4          9           5              81      25      45
5          8           17             64      289     136
6          4           3              16      9       12
7          1           15             1       225     15
8          10          9              100     81      90
9          6           15             36      225     90
10         5           8              25      64      40
Sum        55          103            385     1401    506

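Computation:

$$
\begin{aligned}
r &= \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}} \\
  &= \frac{(10)(506) - (55)(103)}{\sqrt{\left[(10)(385) - (55)^2\right]\left[(10)(1401) - (103)^2\right]}} \\
  &= \frac{5060 - 5665}{\sqrt{(3850 - 3025)(14010 - 10609)}} \\
  &= \frac{-605}{(28.723)(58.318)} \\
  &= \frac{-605}{1675.0679} \\
r &= -0.36
\end{aligned}
$$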
Statement: The correlation obtained is -0.36, which shows that there is a low negative
correlation between the Mathematics and Physics scores.
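
The hand computation above can be cross-checked with a minimal sketch of the raw-score formula. The function name `pearson_r_raw` is hypothetical; the scores are the ones in the Mathematics and Physics table.

```python
from math import sqrt


def pearson_r_raw(xs, ys):
    """Pearson r from the raw-score (computation) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)              # sum of squared X scores
    sum_y2 = sum(y * y for y in ys)              # sum of squared Y scores
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # sum of cross products
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


math_scores = [3, 7, 2, 9, 8, 4, 1, 10, 6, 5]
physics_scores = [11, 1, 19, 5, 17, 3, 15, 9, 15, 8]

r = pearson_r_raw(math_scores, physics_scores)
print(round(r, 2))  # -0.36
```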
2. Relationship between scores on 20-point English and Problem Solving quizzes of five (5)
students.

Student    English (X)    Problem Solving (Y)    X²      Y²     XY

A          11             11                     121     121    121
B          13             10                     169     100    130
C          18             17                     324     289    306
D          12             13                     144     169    156
E          16             14                     256     196    224
N = 5      70             65                     1014    875    937

$$
\begin{aligned}
r &= \frac{(5)(937) - (70)(65)}{\sqrt{\left[(5)(1014) - (70)^2\right]\left[(5)(875) - (65)^2\right]}} \\
  &= \frac{4685 - 4550}{\sqrt{(5070 - 4900)(4375 - 4225)}} \\
  &= \frac{135}{(13.038)(12.247)} \\
r &= 0.845
\end{aligned}
$$
Statement: There is a high positive correlation between the English and Problem Solving
scores.
Testing Statistical Hypothesis
Steps in Testing Statistical Hypothesis
1. State the null hypothesis and the alternative hypothesis based on your research
question.

H₀ : r = 0
H₁ : r > 0

Note: Our null hypothesis, for the Pearson r, states that r is 0. The alternative hypothesis states
that r has a significant positive value.
2. Set the alpha level.
α = .05
Note: As usual, we set our alpha level at .05, so we have 5 chances in 100 of making a Type I
error.
3. Calculate the value of the appropriate statistic. Also indicate the degrees of freedom for
the statistical test if necessary.

r = 0.845
df = N – 2 = 5 – 2 = 3

4. Write the decision rule for rejecting the null hypothesis.

Reject H₀ if r ≥ .878

5. Write a summary statement based on the decision.

Fail to reject H₀, p > .05, one-tailed

Note: Since our calculated value of r (.845) is less than the critical value of .878, we fail to
reject the null hypothesis.

6. Write a statement of results.


There is no significant correlation between the English and Problem Solving scores.
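
The decision steps can be sketched as a small function. The dictionary holds only a few P = .05 entries copied from Table r, and the name `apply_decision_rule` is hypothetical; this is an illustration of the table lookup, not a replacement for a full significance test.

```python
# A few P = .05 critical values copied from Table r (df -> critical r).
CRITICAL_R_05 = {3: 0.878, 8: 0.632, 10: 0.576}


def apply_decision_rule(r, n_pairs, critical_values=CRITICAL_R_05):
    """Return (df, critical value, reject H0?) for a positive-correlation test."""
    df = n_pairs - 2                    # degrees of freedom, df = N - 2
    critical = critical_values[df]      # look up the critical value in the table
    reject_h0 = r >= critical           # decision rule: reject H0 if r >= critical
    return df, critical, reject_h0


df, critical, reject_h0 = apply_decision_rule(r=0.845, n_pairs=5)
decision = "reject H0" if reject_h0 else "fail to reject H0"
print(f"df = {df}, critical r = {critical}, decision: {decision}")
# df = 3, critical r = 0.878, decision: fail to reject H0
```

With only five pairs of scores, even r = 0.845 falls short of the critical value, which is why a numerically high correlation is still not significant here.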

SPEARMAN RHO
The Spearman rho correlation tells you the magnitude and direction of the association
between two variables that are measured on at least an ordinal scale; interval or ratio
scores are first converted to ranks.
Hypothesis: a proposed explanation for a phenomenon.
Null: There is no association between the two variables.
Alternate: There is an association between the two variables.
How to describe output from a bivariate correlation (Spearman's rho)
A bivariate correlation is a statistic that measures how well two variables fit together. In this
case we are using Spearman's rho correlation coefficient, which is a nonparametric statistic. This
just means that the two variables compared are not assumed to have a normal distribution.

Correlation coefficients can vary from -1.0 to +1.0. The rho here of .382 is positive,
which means that a higher score on the commitment scale. Thus we can reject the null
hypothesis that these variables are not related. A rho of 0.0 would mean that there is no
correlation between the two variables. A rho of .382 means there is a moderate correlation
between the two scales. It appears that individuals who score high
Converting these two to ranks would result in the following:
X Y
2 1
1 2
3 4
4 3

The formula is:

$$r_S = 1 - \frac{6\sum D^2}{N^3 - N}$$

Where:
D = Rank of X – rank of Y ( difference score )
N = Number of cases
Σ = Summation symbol
Illustrations:
1. Compute the relationship between the accounting and mathematics performance of five
students given below.

Student    Accounting performance    Mathematics performance

A          3                         3
B          1 (highest)               2
C          2                         1 (highest)
D          5                         4
E          4                         5
N = 5

Student    Accounting rank (X)    Mathematics rank (Y)    D     D²

A          3                      3                       0     0
B          1                      2                       -1    1
C          2                      1                       1     1
D          5                      4                       1     1
E          4                      5                       -1    1
N = 5                                                     0     4

Computations:

$$
\begin{aligned}
r_S &= 1 - \frac{6\sum D^2}{N^3 - N} \\
    &= 1 - \frac{6(4)}{5^3 - 5} \\
    &= 1 - \frac{24}{125 - 5} \\
    &= 1 - \frac{24}{120} \\
    &= 1 - 0.20 \\
    &= 0.80
\end{aligned}
$$

r_S = 0.80; high positive correlation
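
A minimal sketch of the rank-difference formula, applied to the accounting and mathematics ranks above. The function name `spearman_rho` is hypothetical, and this simple formula assumes there are no tied ranks.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rho from two lists of ranks (assumes no tied ranks)."""
    n = len(rank_x)
    # D = rank of X minus rank of Y, squared for each case.
    d_squared = [(x - y) ** 2 for x, y in zip(rank_x, rank_y)]
    return 1 - (6 * sum(d_squared)) / (n ** 3 - n)


accounting_rank = [3, 1, 2, 5, 4]    # students A to E
mathematics_rank = [3, 2, 1, 4, 5]

rho = spearman_rho(accounting_rank, mathematics_rank)
print(round(rho, 2))  # 0.8
```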

Now we are looking at rankings on two variables and can use the Spearman rank-difference
correlation to test the significance of the relationship. The two sets of ranks, as well as the
difference between the pairs of ranks (D) and the difference squared (D²), are shown in the
following table.
2. Oral communication test and teacher's ranking on speaking ability
Oral communication test score ranking    Teacher's ranking on speaking ability    D     D²

1                                        3                                        -2    4
2                                        2                                        0     0
3                                        1                                        2     4
4                                        4                                        0     0
5                                        5                                        0     0
6                                        6                                        0     0
7                                        8                                        -1    1
8                                        7                                        1     1
9                                        10                                       -1    1
10                                       9                                        1     1
Total                                                                                   12

Computations:

$$
\begin{aligned}
r_S &= 1 - \frac{6\sum D^2}{N^3 - N} \\
    &= 1 - \frac{6(12)}{10^3 - 10} \\
    &= 1 - \frac{72}{1000 - 10} \\
    &= 1 - \frac{72}{990} \\
    &= 1 - 0.073 \\
    &\approx 0.93
\end{aligned}
$$

r_S = 0.93; very high positive correlation

df = N – 2 = 10 – 2 = 8
Testing Statistical Hypothesis
1. State the null hypothesis and the alternative hypothesis based on your research
question.

Note: Our null hypothesis states that there is no significant relationship
between the two variables. The alternative hypothesis states that there is a significant
positive correlation between the two variables.
2. Set the alpha level.
α = .05
As usual, we set our alpha level at .05, so we have 5 chances in 100 of
making a Type I error.
3. Calculate the value of the appropriate statistic. Also indicate the degrees of freedom for
the statistical test if necessary.
r_S = .93

df = N – 2 = 10 – 2 = 8
4. Write the decision rule for rejecting the null hypothesis.

To write the decision rule we need to know the critical value for r_S with an alpha
level of .05 and 8 degrees of freedom. We can find this by looking at the Appendix
Table.

Reject H₀ if r_S ≥ .632

5. Write a summary statement based on the decision.

Reject H₀, p < .05, one-tailed

Since our calculated value of r_S (.93) is greater than .632, we reject the null
hypothesis and accept the alternative hypothesis.

6. Write a statement of results.


There is a significant positive correlation between the students' ranks on an
oral communication test and their teacher's ranking of them on speaking
ability.
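
As a rough check of the whole procedure for this illustration, the sketch below recomputes ΣD², r_S, and the degrees of freedom from the two sets of ranks and compares r_S with the critical value .632 read from Table r (df = 8, P = .05). The variable names are hypothetical.

```python
oral_test_rank = list(range(1, 11))               # ranks 1 to 10
teacher_rank = [3, 2, 1, 4, 5, 6, 8, 7, 10, 9]

n = len(oral_test_rank)
# Sum of squared rank differences, the "Total" of the D² column.
sum_d2 = sum((x - y) ** 2 for x, y in zip(oral_test_rank, teacher_rank))
rho = 1 - 6 * sum_d2 / (n ** 3 - n)               # rank-difference formula
df = n - 2                                        # degrees of freedom

critical = 0.632                                  # Table r, df = 8, P = .05
reject_h0 = rho >= critical                       # decision rule

print(sum_d2, round(rho, 2), df, reject_h0)       # 12 0.93 8 True
```

Because r_S exceeds the critical value, the null hypothesis is rejected, in line with the decision reached by hand.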
