
Correlation Analysis

A Performance Task Presented to Ms. Ivy M. Geronimo (Subject Instructor)

Cafino, John Spain D.
Cordero, Kristine M.
Estefani, Louie Althea M.
Galdores, Gabriel Rhen J.
Guinto, Derik Connery F.
Padilla, Dennis Angelo M.
Paguio, Shania M.
Quijano, Anthony Lewi S.

11 STEM - A

INTRODUCTION

An inferential statistical test of correlation is used to ascertain whether there is
a statistically significant relationship between two variables. With the aid of
bivariate data, learners can determine, interpret, and analyze the relationship
between two quantitative variables. Such data can help to:

• Identify trends and patterns
• Suggest possible cause-and-effect relationships (correlation alone does not
  establish causation)
• Make predictions
• Inform decision-making

The following types of bivariate analysis are used to conduct correlation and
regression line analyses: 

• Scatter plots
• Correlation
• Regression

In line with this, listed below are the sets of bivariate data (a minimum of 70
pairs in the sample) that are relevant to everyday life:

I. Academic Performance of 70 Senior High School Students

The group wants to study the relationship between the number of hours
students spend reviewing and their scores on a standardized test.

Hours of Reviewing    Test Scores (over 100)

6 85

3 78

4 57

2 45

7 90

4 70
12 96

3 42

6 62

7 68

1 50

1 75

3 80

4 85

5 78

8 86

11 83

6 100

3 62

14 94

9 67

6 78

2 40

4 43

4 36

5 80

3 89

2 69

4 56

8 75

7 60

1 34
0.5 50

5 77

7 87

6 86

9 89

10 91

12 93

2 64

4 68

5 69

1 21

2 64

5 89

6 89

2 70

13 100

4 94

1 54

6 94

9 32

0.5 65

2 78

3 59

12 98

4 53

2 77
6 89

7 64

8 75

0.5 21

10 98

2 25

7 88

12 97

1.5 66

0.5 65

4 73

8 87

II. Allowance of 70 Senior High School Students

The group wants to study the relationship between age and the amount of
allowance that senior high school students receive from their parents.

Age    Money Earned (peso/week)

16 500

18 350

15 460

17 600

17 643

17 575

19 410

18 255
17 650

16 245

16 350

18 380

19 285

17 300

16 180

19 540

18 690

17 245

18 700

16 650

18 260

16 270

16 385

17 460

17 470

17 230

18 440

19 345

16 560

18 640

16 430

18 740

16 330

15 220
16 540

17 525

18 430

16 760

17 586

18 630

17 530

19 750

17 520

16 640

18 420

18 350

17 950

16 640

17 750

18 540

19 420

18 330

17 720

16 640

18 420

17 320

18 240

17 140

17 820

19 520
19 740

17 405

18 640

19 530

16 540

17 720

17 650

16 690

16 530

19 480

III. Time Spent by 70 STEM Students on Social Media

The group wants to study the relationship between the amount of time STEM
students spend on social media and their confidence level.

Time on Social Media (minutes)    Confidence Level (1-10)

60 7

120 4

40 6

140 3

70 6

80 7

140 5

40 9

60 6.5

75 7

95 4
105 6

180 3

80 7

20 6

80 9.5

50 6

80 7

60 5

30 9

40 4

55 7

70 7

85 6

115 5.5

60 8

30 6

170 3

140 2

80 7

75 3.5

105 1

45 4

50 8

85 7

65 6

60 5
40 5

20 4.5

10 9

60 8

80 8

95 7

85 6

74 5

56 6.5

90 4

25 8

60 9

45 6

60 4

45 5.5

50 6

40 7

110 3

130 2

60 8

30 4.5

55 8

40 6

30 5

45 4

30 5
100 4.5

105 4

240 1

55 3

75 6

60 6

90 9

The bivariate data that we will be using consist of three data sets: academic
performance, allowance, and social media. The academic performance set pairs the
hours a student spends reviewing with the student's test score. The allowance set
pairs a student's age with the weekly allowance the student receives from the family.
The social media set pairs the time a student spends on social media with the
student's confidence level.

By examining the relationship between these variables using correlation analysis, we
can gain insights into how they are related and better understand the factors that
influence students.

• Academic Performance - Hours of reviewing and test score
• Allowance - Age and money earned
• Social Media - Time on social media and confidence level

The importance of our bivariate data is that it lets the researchers determine and
describe the relationship between two variables. The analysis presented here either
supports or fails to support the hypothesized relationship. This is crucial in
understanding the underlying patterns and trends in the data, which can inform future
research and decision-making. Additionally, bivariate data analysis can help identify
potential confounding variables that may affect the relationship between the two
variables of interest. The value of a dependent variable can be predicted from the
value of an independent variable, which is useful in a variety of fields, including
social science, medicine, and marketing. We used scatter plots to show how the two
variables are related. The main purpose of a scatter plot is to examine and display
the correlation between two numerical variables. Beyond the values of the individual
data points, the pattern formed by the dots shows how the data behave when viewed as
a whole.
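The idea of predicting a dependent variable from an independent variable can be sketched in a few lines of code. This is only a minimal illustration, assuming an ordinary least-squares line with one predictor; the function name `fit_line` is hypothetical, and the five pairs are just the first five rows of the academic performance table, not the full sample of 70 and not the group's Excel workflow.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative subset only: first five (hours, score) pairs from data set I
hours = [6, 3, 4, 2, 7]
scores = [85, 78, 57, 45, 90]
slope, intercept = fit_line(hours, scores)
print(f"y = {slope:.3f}x + {intercept:.3f}")
```

Because only five pairs are used, the printed line will differ from the regression line fitted to all 70 pairs.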

The things we want to learn from the data are: (1) how much time students spend on
social media; (2) how much they compare themselves socially; (3) how the number of
hours students spend reviewing relates to how well they perform on tests; and (4) how
much allowance senior high school students receive from their parents, and at what
age they receive the most. By analyzing the scatter plots, we can identify any
correlation between time spent on social media and social comparison. This
information can be useful in understanding the impact of social media on individuals'
mental health and well-being.

HYPOTHESIS TESTING AND COMPUTATION

After creating the three sets of bivariate data, the group stated the null and
alternative hypotheses, drew the scatter plots, computed r, and fitted the regression
line. The results are presented below for academic performance, allowance, and social
media, respectively.

I. Academic Performance of 70 Senior High School Students

a) Null and Alternative Hypothesis


H0: There is no correlation between the hours of reviewing and the test
scores of 70 Senior High School students.
HA: There is a correlation between the hours of reviewing and the test
scores of 70 Senior High School students.
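One common way to test a hypothesis of no correlation is to convert r into a t statistic with n - 2 degrees of freedom, t = r·sqrt(n - 2) / sqrt(1 - r²). The sketch below applies that formula to the r and n reported for this data set. The function name `t_statistic` is hypothetical, and the critical value of 2.0 is an assumed, rounded two-tailed cutoff for a 0.05 significance level with 68 degrees of freedom; a t-table should be consulted for the exact value.

```python
import math


def t_statistic(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = 0.607944602  # Pearson r reported for hours of reviewing vs. test scores
n = 70           # number of pairs in the sample
t = t_statistic(r, n)

# ASSUMPTION: rounded two-tailed critical value for alpha = 0.05, df = 68
T_CRITICAL = 2.0

print(f"t = {t:.4f}")
if abs(t) > T_CRITICAL:
    print("reject the hypothesis of no correlation")
else:
    print("fail to reject the hypothesis of no correlation")
```

The computed value agrees with the t Stat that Excel reports for the Hours of Reviewing coefficient (6.314067244), as expected for a regression with a single predictor.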

b) Scatter Plot

[Scatter plot: Academic Performance of 70 Students. Horizontal axis: Hours of
Reviewing; vertical axis: Test Scores (over 100). Linear trendline:
f(x) = 3.55279010301454x + 52.5286858207314, R² = 0.369596638718595]

c) The computation for r

Excel Formula for Pearson r: 0.607944602

Correlation Data Analysis Table:

                         Hours of Reviewing    Test Scores (over 100)
Hours of Reviewing       1
Test Scores (over 100)   0.607944602           1

d) The computation for the regression line

Regression Statistics

Multiple R 0.607944602

R Square 0.369596639

Adjusted R Square 0.360326001

Standard Error 16.19347792

Observations 70

ANOVA
             df   SS            MS            F             Significance F
Regression   1    10454.3894    10454.3894    39.86744516   2.38361E-08
Residual     68   17831.55345   262.2287273
Total        69   28285.94286

                     Coefficients   Standard Error   t Stat        P-value
Intercept            52.52868582    3.511512081      14.95899334   3.41734E-23
Hours of Reviewing   3.552790103    0.562678534      6.314067244   2.38361E-08

                     Lower 95%     Upper 95%     Lower 95.0%   Upper 95.0%
Intercept            45.52157256   59.53579908   45.52157256   59.53579908
Hours of Reviewing   2.429982569   4.675597637   2.429982569   4.675597637

y = 3.55279010301455x + 52.5286858207314
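As a rough illustration of how this regression line can be used, the sketch below plugs a number of review hours into the reported equation and squares the reported r to get the share of score variation the line accounts for. The names `predict_score`, `SLOPE`, `INTERCEPT`, and `R` are hypothetical labels for the values reported above.

```python
SLOPE = 3.55279010301455       # reported slope (score points per hour of reviewing)
INTERCEPT = 52.5286858207314   # reported intercept (predicted score at 0 hours)
R = 0.607944602                # reported Pearson r


def predict_score(hours):
    """Predicted test score for a given number of review hours."""
    return SLOPE * hours + INTERCEPT


r_squared = R ** 2  # proportion of variation in scores explained by the line

print(f"predicted score after 5 hours: {predict_score(5):.2f}")
print(f"share of score variation explained: {r_squared:.1%}")
```

For 5 hours this gives a predicted score of about 70.29, and the line explains about 37.0% of the variation in scores. The prediction should be read with care near the edge of the data: at 14 hours, the largest value in the sample, the line already predicts a score above 100, which is not possible on this test.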

II. Allowance of 70 Senior High School Students

a) Null and Alternative Hypothesis


H0: There is no relationship between age and the amount of allowance
70 Senior High School students receive from their parents.
HA: There is a relationship between age and the amount of allowance
70 Senior High School students receive from their parents.

b) Scatter Plot

[Scatter plot: Allowance of 70 Senior High School Students. Horizontal axis: Age;
vertical axis: Money Earned (peso/week). Linear trendline:
f(x) = 3.61770981507823x + 435.443456614509, R² = 0.000486221370092577]

c) The computation for r

Excel Formula for Pearson r: 0.022050428

Correlation Data Analysis Table:

                           Age           Money Earned (peso/week)
Age                        1
Money Earned (peso/week)   0.022050428   1

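The Pearson r reported by Excel can also be computed directly from the raw pairs. The following is a minimal sketch of the usual computational formula, r = (nΣxy - ΣxΣy) / sqrt((nΣx² - (Σx)²)(nΣy² - (Σy)²)). The function name `pearson_r` is hypothetical, and the five pairs are only the first five rows of the allowance table, so the printed value will not equal the r for the full sample of 70.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("one of the variables is constant; r is undefined")
    return numerator / denominator


# Illustrative subset only: first five (age, allowance) pairs from data set II
ages = [16, 18, 15, 17, 17]
allowances = [500, 350, 460, 600, 643]
print(round(pearson_r(ages, allowances), 4))
```

Passing all 70 ages and all 70 allowance values to the same function should reproduce the value reported above, up to rounding.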
d) The computation for the regression line

Regression Statistics

Multiple R 0.022050428

R Square 0.000486221

Adjusted R Square -0.014212511

Standard Error 178.2913523

Observations 70

ANOVA
             df   SS            MS            F             Significance F
Regression   1    1051.513199   1051.513199   0.033079137   0.85622057
Residual     68   2161570.83    31787.80632
Total        69   2162622.343

            Coefficients   Standard Error   t Stat        P-value
Intercept   435.4434566    343.3554575      1.268200191   0.209051307
Age         3.617709815    19.89100118      0.181876708   0.85622057

            Lower 95%      Upper 95%     Lower 95.0%    Upper 95.0%
Intercept   -249.7116912   1120.598604   -249.7116912   1120.598604
Age         -36.07416505   43.30958468   -36.07416505   43.30958468

y = 3.61770981507823x + 435.443456614509
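A quick way to read the Lower 95% and Upper 95% columns is to check whether the interval for the slope contains zero: if it does, the data are consistent with no linear relationship at the 0.05 level. The sketch below applies that check to the slope intervals reported in the three regression outputs of this report. The helper name `interval_contains_zero` and the dictionary labels are hypothetical.

```python
def interval_contains_zero(lower, upper):
    """True if zero lies inside the closed interval [lower, upper]."""
    return lower <= 0 <= upper


# 95% confidence intervals for the slope, copied from the regression outputs
slope_intervals = {
    "Hours of Reviewing vs. Test Scores": (2.429982569, 4.675597637),
    "Age vs. Money Earned": (-36.07416505, 43.30958468),
    "Time on Social Media vs. Confidence Level": (-0.03563065, -0.015312721),
}

for label, (lower, upper) in slope_intervals.items():
    if interval_contains_zero(lower, upper):
        verdict = "slope not significantly different from zero"
    else:
        verdict = "slope significantly different from zero"
    print(f"{label}: {verdict}")
```

Only the age and allowance interval contains zero, which matches its large P-value of 0.85622057.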
III. Time Spent by 70 STEM Students on Social Media

a) Null and Alternative Hypothesis

H0: There is no connection between the amount of time 70 STEM students
spend on social media and their confidence level.
HA: There is a connection between the amount of time 70 STEM
students spend on social media and their confidence level.

b) Scatter Plot

[Scatter plot: Time Spent of 70 STEM Students on Social Media. Horizontal axis:
Minutes Spent; vertical axis: Confidence Level (1-10). Linear trendline:
f(x) = -0.0254716855137361x + 7.58631983225568, R² = 0.269073629443458]

c) The computation for r

Excel Formula for Pearson r: -0.518723076

Correlation Data Analysis Table:

                                 Time on Social Media (minutes)   Confidence Level (1-10)
Time on Social Media (minutes)   1
Confidence Level (1-10)          -0.518723076                     1

d) The computation for the regression line

Regression Statistics

Multiple R 0.518723076

R Square 0.269073629

Adjusted R Square 0.258324712

Standard Error 1.706164918

Observations 70

ANOVA
             df   SS            MS            F             Significance F
Regression   1    72.86994374   72.86994374   25.03262646   4.22041E-06
Residual     68   197.9479134   2.910998727
Total        69   270.8178571

                                 Coefficients   Standard Error   t Stat         P-value
Intercept                        7.586319832    0.424873109      17.85549536    2.26363E-27
Time on Social Media (minutes)   -0.025471686   0.005091016      -5.003261582   4.22041E-06

                                 Lower 95%     Upper 95%      Lower 95.0%   Upper 95.0%
Intercept                        6.738498743   8.434140922    6.738498743   8.434140922
Time on Social Media (minutes)   -0.03563065   -0.015312721   -0.03563065   -0.015312721

y = -0.0254716855137361x + 7.58631983225568
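The entries of the ANOVA table are tied together by simple arithmetic, which gives a way to double-check the Excel output. The sketch below recomputes the mean squares, the F statistic, R Square, and the Standard Error from the sums of squares and degrees of freedom reported for the social media data. The variable names are hypothetical; the input numbers are copied from the table.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table (data set III)
ss_regression = 72.86994374
ss_residual = 197.9479134
ss_total = 270.8178571
df_regression = 1
df_residual = 68

ms_regression = ss_regression / df_regression  # mean square for regression
ms_residual = ss_residual / df_residual        # mean square for residual
f_statistic = ms_regression / ms_residual      # F = MS regression / MS residual
r_squared = ss_regression / ss_total           # share of variation explained
standard_error = math.sqrt(ms_residual)        # standard error of the regression

print(f"F = {f_statistic:.4f}")
print(f"R Square = {r_squared:.6f}")
print(f"Standard Error = {standard_error:.6f}")
```

The recomputed values agree with the reported F (25.03262646), R Square (0.269073629), and Standard Error (1.706164918) up to rounding.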
RESULTS AND IMPLICATIONS
(you may include the answer to the essential question in this part)
(add whether the result of each scatter plot is a strong positive, strong negative,
moderately positive, moderately negative, weak, or no correlation)
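The labels listed above can be assigned mechanically once cutoffs are chosen. The sketch below is one possible helper, not a standard rule: the function name `describe_correlation` and the cutoffs of 0.3 and 0.7 are assumptions, and different textbooks draw these lines differently, so the cutoffs used in class should replace them.

```python
def describe_correlation(r, weak=0.3, strong=0.7):
    """Label r by direction and strength using ASSUMED cutoffs (0.3 and 0.7)."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size < weak:
        return "weak or no correlation"
    strength = "strong" if size >= strong else "moderate"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


# Pearson r values reported in the three computations of this report
reported_r = {
    "Academic Performance": 0.607944602,
    "Allowance": 0.022050428,
    "Social Media": -0.518723076,
}

for name, r in reported_r.items():
    print(f"{name}: r = {r:.4f} -> {describe_correlation(r)}")
```

Changing the `weak` and `strong` arguments shows how sensitive the labels are to the chosen cutoffs.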

DOCUMENTATION
