Correlation Analysis
SCATTER PLOT AND CORRELATION
• A scatter plot (or scatter diagram) is used to show the relationship
between two variables
• Correlation analysis is used to measure the strength of the association
(linear relationship) between two variables
SCATTER PLOT EXAMPLES
CORRELATION COEFFICIENT
FEATURES OF CORRELATION COEFFICIENT
a. Unit free
b. A correlation coefficient of -1.00 or +1.00 indicates a perfect linear relationship
c. The closer to -1.00, the stronger the negative linear relationship
d. The closer to +1.00, the stronger the positive linear relationship
e. The closer to 0, the weaker the linear relationship
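The features listed above can be checked numerically. The following is a minimal sketch, not part of the original slides: the height and weight figures are made-up illustrative data, and `pearson_r` is a hypothetical helper name. It shows that r does not change when the units of measurement change, and that points lying exactly on a line give r = +1 or r = -1.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (assumption, not from the slides)
height_cm = [150, 160, 170, 180, 190]
weight_kg = [52, 58, 69, 75, 88]

# Unit free: converting cm to inches and kg to pounds leaves r unchanged
r_metric = pearson_r(height_cm, weight_kg)
height_in = [h / 2.54 for h in height_cm]
weight_lb = [w * 2.20462 for w in weight_kg]
r_imperial = pearson_r(height_in, weight_lb)

# Points exactly on a rising or falling line give r = +1 or r = -1
r_pos = pearson_r([1, 2, 3, 4], [3, 5, 7, 9])
r_neg = pearson_r([1, 2, 3, 4], [9, 7, 5, 3])

print(round(r_metric, 4), round(r_imperial, 4), r_pos, r_neg)
```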
EXAMPLES OF APPROXIMATE r VALUES
CALCULATION OF THE CORRELATION COEFFICIENT
• Sample correlation coefficient:

$$r = \frac{S_{xy}}{\sqrt{S_{xx}\, S_{yy}}}$$

With:

$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

Where:
r : sample correlation coefficient
n : sample size
$x_i$ : value of observation i of the independent variable
$y_i$ : value of observation i of the dependent variable
$\bar{x}$ : average value of the independent variable
$\bar{y}$ : average value of the dependent variable
EXAMPLE OF CORRELATION COEFFICIENT CALCULATION
We want to evaluate the relationship between the number of
sales calls and the number of products sold.
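[Table: correlation matrix of Calls and Sales. Only the column headers and the diagonal entry Calls-Calls = 1 survive.]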
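The calculation for this example can be sketched in a few lines of Python. The ten (calls, sales) pairs are taken from the table in the regression example later in this deck, which reuses the same data. The variable names are illustrative choices, not from the slides.

```python
import math

# (calls, sales) pairs from the deck's sales-call example
calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
sales = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]

n = len(calls)
x_bar = sum(calls) / n   # average number of calls
y_bar = sum(sales) / n   # average number of sales

# Sums of squares and cross-products, as defined in the formula slide
sxx = sum((x - x_bar) ** 2 for x in calls)
syy = sum((y - y_bar) ** 2 for y in sales)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(calls, sales))

r = sxy / math.sqrt(sxx * syy)
print(f"Sxx={sxx:.0f}, Syy={syy:.0f}, Sxy={sxy:.0f}, r={r:.3f}")
```

With these data the sums match the totals in the deck's table (Sxx = 760, Syy = 1850, Sxy = 900), and r comes out at about 0.759, a fairly strong positive linear relationship.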
SIGNIFICANCE TEST OF THE CORRELATION COEFFICIENT
$H_0: \rho = 0$ (no correlation)
$H_1: \rho \neq 0$ (correlation exists)
Test statistic:

$$T = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

with n - 2 degrees of freedom
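As a rough illustration of the test statistic, the sketch below wraps the formula in a small function. The function name `correlation_t_statistic` and the sample values r = 0.5, n = 10 and n = 50 are assumptions chosen for illustration only. It shows that the same r gives a larger T when the sample is larger.

```python
import math

def correlation_t_statistic(r, n):
    """T = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than 2 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)

# Hypothetical values: the same r with two different sample sizes
t_small = correlation_t_statistic(0.5, 10)
t_large = correlation_t_statistic(0.5, 50)
print(round(t_small, 3), round(t_large, 3))
```

The guard on r avoids a division by zero when the points lie exactly on a line.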
[Figure: two-tailed rejection regions, with area α/2 in each tail]
EXAMPLE OF SIGNIFICANCE TEST OF r
1. State the hypotheses
$H_0: \rho = 0$ (no correlation)
$H_1: \rho \neq 0$ (correlation exists)
5. Conclusion
At the 5% significance level, there is evidence of a positive correlation between the number of calls and the number of sales in the population.
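The intermediate steps of this example did not survive in the slides, so the following is only a sketch of how the test could be carried out, not the author's worked solution. It uses the sums from the deck's data table (n = 10, Sxx = 760, Syy = 1850, Sxy = 900) and assumes the two-tailed critical value 2.306 from a standard t table for α = 0.05 with 8 degrees of freedom.

```python
import math

# Summary values from the deck's sales-call data table
n = 10
sxx, syy, sxy = 760.0, 1850.0, 900.0

# Sample correlation coefficient
r = sxy / math.sqrt(sxx * syy)

# Test statistic with n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)

# Assumption: critical value taken from a standard t table
# (two-tailed, alpha = 0.05, df = 8)
t_critical = 2.306

# Two-tailed decision rule: reject H0 when |T| exceeds the critical value
reject_h0 = abs(t_stat) > t_critical
print(f"r = {r:.3f}, T = {t_stat:.3f}, reject H0: {reject_h0}")
```

Under these assumptions T is about 3.30, which exceeds 2.306, so H0 is rejected. This is consistent with the conclusion stated above.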
REGRESSION ANALYSIS
INTRODUCTION TO REGRESSION ANALYSIS
Regression analysis is used to:
▪ Predict the value of a dependent variable based on the value of at least one
independent variable
▪ Explain the impact of changes in an independent variable on the dependent variable
Independent variable:
• The variable we use to explain the dependent variable
• Also called the predictor variable: it is used to predict the expected value of the dependent variable
Dependent variable:
• The variable we wish to explain
• The variable that is being predicted or estimated
TYPES OF REGRESSION MODELS
SIMPLE LINEAR REGRESSION MODEL
The linear relationship between the response Y and the regressor x has the form:

$$Y = \alpha + \beta x + \epsilon$$

Interpretation:
✓ The quantity Y is a random variable because the error term $\epsilon$ is random
✓ The value of the regressor x is not random
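A small simulation can make this interpretation concrete. The sketch below is built on assumptions: the parameter values α = 2.0, β = 0.5 and the normally distributed error with standard deviation 1.0 are hypothetical and do not come from the slides. With x held fixed, repeated draws of Y differ only because ε differs, and their average settles near α + βx.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population parameters (assumptions for illustration)
ALPHA, BETA, SIGMA = 2.0, 0.5, 1.0

def draw_y(x):
    """One observation of Y = alpha + beta*x + eps, with eps ~ Normal(0, SIGMA)."""
    eps = random.gauss(0.0, SIGMA)
    return ALPHA + BETA * x + eps

x_fixed = 10.0  # the regressor value is fixed, not random
ys = [draw_y(x_fixed) for _ in range(10_000)]

mean_y = sum(ys) / len(ys)
print(f"distinct values of Y: {len(set(ys))}, mean of Y: {mean_y:.2f}")
```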
SIMPLE LINEAR REGRESSION ASSUMPTIONS
POPULATION AND SAMPLE REGRESSION MODEL
Population model (unknown relationship): $y = \alpha + \beta x$
Sample (estimated) model: $\hat{y} = a + b x$
ESTIMATED REGRESSION MODEL
LINEAR REGRESSION MODEL
In regression analysis, the objective is to use the data to position a line
that best represents the relationship between the two variables
→ How do we find the best-fitting line for the data?
SCATTER DIAGRAM
1. Plots all $(x_i, y_i)$ pairs
2. Suggests how well the model will fit
How would you draw a line through the points? How do you determine which line ‘fits best’?
LEAST SQUARES PRINCIPLE
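• We would like to choose the line that minimizes the error between the actual
data and the line. This error is called the residual, and the least squares
principle chooses the line that minimizes the sum of the squared residuals.
Error in fit:
Given a set of regression data $\{(x_i, y_i);\ i = 1, 2, \ldots, n\}$ and a fitted model
$\hat{y}_i = a + b x_i$, the i-th residual $e_i$ is given by

$$e_i = y_i - \hat{y}_i$$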
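To see what "minimizing the error" means in practice, the sketch below computes the residuals and their sum of squares for two candidate lines on the deck's sales-call data. Line A uses the coefficients reported in the deck's worked example (a = 18.95, b = 1.18). Line B (a = 10, b = 1.5) is a hypothetical alternative chosen only for comparison.

```python
# (calls, sales) pairs from the deck's sales-call example
calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
sales = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]

def residuals(a, b, xs, ys):
    """Residuals e_i = y_i - (a + b * x_i) for a candidate line."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared residuals, the quantity least squares minimizes."""
    return sum(e ** 2 for e in residuals(a, b, xs, ys))

# Line A: coefficients from the deck's worked example (rounded)
sse_a = sum_squared_residuals(18.95, 1.18, calls, sales)
# Line B: hypothetical alternative line, for comparison only
sse_b = sum_squared_residuals(10.0, 1.5, calls, sales)

print(f"SSE line A: {sse_a:.2f}, SSE line B: {sse_b:.2f}")
```

Line A has the smaller sum of squared residuals, which is what the least squares criterion rewards.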
LEAST SQUARES EQUATION
Given a set of regression data $\{(x_i, y_i);\ i = 1, 2, \ldots, n\}$, the least squares estimates a and
b of the regression coefficients α and β are computed from the formulas:

$$b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}}$$

$$a = \frac{\sum_{i=1}^{n} y_i - b \sum_{i=1}^{n} x_i}{n} = \bar{y} - b\bar{x}$$
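For readers who want to see where these formulas come from, here is a brief sketch of the standard derivation. It is not part of the original slides, and the symbol SSE for the sum of squared residuals is introduced here for convenience. Setting the partial derivatives of SSE to zero gives two equations in a and b:

```latex
\begin{aligned}
\mathrm{SSE}(a, b) &= \sum_{i=1}^{n} (y_i - a - b x_i)^2 \\[4pt]
\frac{\partial\,\mathrm{SSE}}{\partial a} &= -2 \sum_{i=1}^{n} (y_i - a - b x_i) = 0
  \;\Longrightarrow\; a = \bar{y} - b\bar{x} \\[4pt]
\frac{\partial\,\mathrm{SSE}}{\partial b} &= -2 \sum_{i=1}^{n} x_i (y_i - a - b x_i) = 0 \\[4pt]
\text{substituting } a = \bar{y} - b\bar{x}: \quad
  & \sum_{i=1}^{n} x_i (y_i - \bar{y}) = b \sum_{i=1}^{n} x_i (x_i - \bar{x}) \\[4pt]
\text{since } \sum_{i=1}^{n} (y_i - \bar{y}) = 0 \text{ and } \sum_{i=1}^{n} (x_i - \bar{x}) = 0: \quad
  & S_{xy} = b\, S_{xx}
  \;\Longrightarrow\; b = \frac{S_{xy}}{S_{xx}}
\end{aligned}
```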
INTERPRETATION OF LEAST SQUARES MODEL
➢ a is the estimated average value of y when the value of x is zero
➢ b is the estimated change in the average value of y as a result of a one-unit
change in x
Note:
The regression equation should generally not be used to predict for values of x
outside the range of the sample values
SIMPLE LINEAR REGRESSION EXAMPLE
Recall the previous example.
We want to evaluate whether the number of sales calls affects the number of
products sold.
Calls (x_i)  Sales (y_i)  x_i - x̄  (x_i - x̄)^2  y_i - ȳ  (y_i - ȳ)^2  (x_i - x̄)(y_i - ȳ)
    20           30          -2          4         -15        225              30
    40           60          18        324          15        225             270
    20           40          -2          4          -5         25              10
    30           60           8         64          15        225             120
    10           30         -12        144         -15        225             180
    10           40         -12        144          -5         25              60
    20           40          -2          4          -5         25              10
    20           50          -2          4           5         25             -10
    20           30          -2          4         -15        225              30
    30           70           8         64          25        625             200
  x̄ = 22      ȳ = 45         0     Sxx = 760        0     Syy = 1850       Sxy = 900

$$b = \frac{S_{xy}}{S_{xx}} = \frac{900}{760} = 1.18$$

$$a = \bar{y} - b\bar{x} = 45 - 1.18(22) = 18.95$$

(The value 18.95 is obtained with the unrounded slope b = 1.1842.)
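The hand calculation in the table can be reproduced with a short script. This is a minimal sketch using only the standard library, and the function name `least_squares_fit` is an illustrative choice, not something defined in the slides.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the fitted line y_hat = a + b * x.

    b = Sxy / Sxx and a = y_bar - b * x_bar, as in the formulas above.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# (calls, sales) pairs from the table
calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
sales = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]

a, b = least_squares_fit(calls, sales)
print(f"a = {a:.2f}, b = {b:.2f}")
```

Rounded to two decimals, the result agrees with the values in the table (b = 1.18, a = 18.95).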
INTERPRETATION OF THE MODEL
Linear regression equation:
number of sales = 18.95 + 1.18 (number of calls)
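One way to use the fitted equation is sketched below. The function name `predict_sales` and the choice to raise an error outside the sample range are assumptions made for illustration. The range of 10 to 40 calls is the minimum and maximum of the calls column in the data table, in line with the earlier note about not predicting outside the sample range.

```python
# Coefficients from the fitted equation above
A, B = 18.95, 1.18
# Smallest and largest number of calls observed in the sample
X_MIN, X_MAX = 10, 40

def predict_sales(calls):
    """Estimated average number of sales for a given number of calls."""
    if not X_MIN <= calls <= X_MAX:
        # Outside the observed range the line may not describe the data
        raise ValueError(f"calls={calls} is outside the sample range "
                         f"[{X_MIN}, {X_MAX}]")
    return A + B * calls

pred_25 = predict_sales(25)
pred_26 = predict_sales(26)
# The difference between consecutive predictions equals the slope b
print(f"{pred_25:.2f} {pred_26:.2f} {pred_26 - pred_25:.2f}")
```

Each additional call is associated with an estimated increase of 1.18 in the average number of sales. The intercept 18.95 is the estimated average number of sales at zero calls, a value that lies outside the observed range.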
The End