
Box Plot or Five-Number Summary

How to create a Box Plot: An example

https://support.microsoft.com/en-us/office/create-a-box-and-whisker-chart-62f4219f-db4b-4754-aca8-4743f6190f0d
4.3 Linear Correlation
Question: Is there a relationship between IQ and GPA?

Stud #   IQ    GPA
1        110   1.4
2        112   1.8
3        127   1.2
:
n

• The correlation, denoted by r, measures the amount of linear association between two
variables. The value of r is always between -1 and 1 inclusive.

• The R-squared value, denoted by R², is called the coefficient of determination.

• R² measures the proportion of variation in the dependent variable that can be attributed
to the independent variable. The value of R² is always between 0 and 1 inclusive.
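
For reference, the "proportion of variation" in the bullet above is commonly written as follows. The symbols here are assumptions not used in the slides: y_i are the observed values, \hat{y}_i the values predicted by the fitted line, and \bar{y} the mean of the observed values. The second identity holds for simple linear regression with an intercept, which is the only case these notes cover.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
R^2 = r^2 \quad \text{(simple linear regression with an intercept)}
```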
Scatter diagrams and correlation

Figure 3-5 p61


How to interpret Pearson r

Positive            Negative            Interpretation

+ 1.00              – 1.00              Perfect
+ 0.80 to + 0.99    – 0.80 to – 0.99    Very strong
+ 0.60 to + 0.79    – 0.60 to – 0.79    Strong
+ 0.40 to + 0.59    – 0.40 to – 0.59    Moderate
+ 0.20 to + 0.39    – 0.20 to – 0.39    Weak
+ 0.01 to + 0.19    – 0.01 to – 0.19    Very weak / negligible
Source: Adapted from Salkind (2009)

So, for example, a correlation coefficient of -0.59 would be considered a moderate negative
relationship, while an r value of 0.15 would be considered a very weak positive relationship.
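
The lookup in the table above can be sketched in a few lines of Python. This is only an illustration: the function name `interpret_r` is made up here, and because the table lists bands to two decimals, the sketch assumes each band starts at its lower bound (so 0.595 falls in "moderate"). That reading is an assumption, not something the source states.

```python
def interpret_r(r):
    """Return a verbal label for a Pearson r, following the table's bands.

    Assumption: each band is treated as starting at its lower bound,
    since the table only lists values to two decimal places.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1 inclusive")

    size = abs(r)
    if size == 0:
        return "no association"

    direction = "positive" if r > 0 else "negative"
    if size == 1.0:
        strength = "perfect"
    elif size >= 0.80:
        strength = "very strong"
    elif size >= 0.60:
        strength = "strong"
    elif size >= 0.40:
        strength = "moderate"
    elif size >= 0.20:
        strength = "weak"
    else:
        strength = "very weak"
    return f"{strength} {direction}"


# The two examples given in the text
print(interpret_r(-0.59))
print(interpret_r(0.15))
```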
Correlation r = 0.0; R-squared = 0.0. No association.
There is no association between the variables.
Correlation r = -0.3. Small negative association.

The correlation coefficient can be computed by hand from a formula, but the quicker
way to calculate it is to use the Data Analysis ToolPak in Excel:
https://www.excel-easy.com/examples/correlation.html
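
For readers who want to see what the formula does, here is a minimal Python sketch of the standard Pearson formula, r = Sxy / sqrt(Sxx · Syy), where Sxy, Sxx and Syy are sums of products of deviations from the means. The function name `pearson_r` is made up for this sketch, and the three data points are only the rows visible in the IQ/GPA table earlier, so the resulting value says nothing about the full dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Only the three rows shown in the IQ/GPA table; the rest of the data is not given.
iq = [110, 112, 127]
gpa = [1.4, 1.8, 1.2]
print(round(pearson_r(iq, gpa), 2))
```

Excel's ToolPak and its CORREL worksheet function report the same quantity, so a sketch like this is mainly useful for checking a result.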
Sample data

Suppose we have the following dataset that has the information for 1,000 students:
A graph showing a positive r

p61
Q: Is there a relationship between Sit-ups and Push-ups?
Q: Is there a relationship between Cost of a Fill-up and Tank Capacity?

p59
ACTIVITY 1.

Is there a relationship between body weight and water?

Use Excel Analysis ToolPak to compute r. Interpret the value.

Obtain the scatterplot of the data. Comment on it.
ACTIVITY 2. Comment about this graph.

Source: https://covid19stats.ph/stats/by-location/cebu-city
4.4 Simple Linear Regression

Regression analysis: Finds the equation of the line that best describes the relationship
between two variables to help make accurate / reliable predictions

Regression equation or “line of best fit”: y = b0 + b1 x, found by using the
“method of least squares”, where y is the predicted value of y, b0 is the
y-intercept, and b1 is the slope.

Formulas also exist for the least-squares estimates of the y-intercept and slope.
A quicker way to calculate them is to use the Data Analysis ToolPak in Excel:
https://www.excel-easy.com/examples/regression.html
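
As a rough illustration of what the least-squares method computes, the sketch below uses the standard closed-form estimates b1 = Sxy / Sxx and b0 = mean(y) − b1 · mean(x). The function names `fit_line` and `predict` and the height/weight numbers are hypothetical, chosen only so the arithmetic is easy to follow. They are not the data from the activities.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")

    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # y-intercept
    return b0, b1


def predict(b0, b1, x):
    """Predicted value of y at x on the fitted line."""
    return b0 + b1 * x


# Hypothetical heights (inches) and weights (pounds), for illustration only
heights = [60, 62, 64, 66, 68]
weights = [110, 120, 130, 140, 150]

b0, b1 = fit_line(heights, weights)
print(f"y = {b0:.1f} + {b1:.1f} x")
print(predict(b0, b1, 70))
```

Predictions are most trustworthy for x values inside the range of the observed data. Estimating far outside that range is extrapolation and should be treated with caution.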
ACTIVITY 1.

Refer to the same data under Correlation, and using Excel Analysis ToolPak, do the
following:

1) Obtain the scatterplot.
2) Determine the line of best fit that passes through the scattered points.
3) Write the equation of the regression line.
ACTIVITY 2.
Use Excel Analysis ToolPak to obtain the scatterplot of the height and weight data
given here. Compare it with the scatterplot presented.

1) Determine the line of best fit that passes through the scatterplot produced in Excel.
2) Write the equation of the regression line.
3) Estimate weight when height is 55 inches, and when height is 5’10” (70 inches).
Table 3-12 p67
