Chapter 6 Correlation and Regression QM
www.globsyn.com
Session 6
Quantitative Methods 1
www.globsyn.edu.in
Linear Regression
Fitting the Regression Line using the Least Squares Method
Correlation
Correlation Coefficient
Coefficient of Determination
To be able to compute
Correlation Coefficient
Coefficient of Determination
Regression Analysis
Regression and correlation analysis show us how to determine both
the nature and the strength of the relationship between two variables.
The term regression was introduced in 1877 by Francis Galton. His
study showed that the heights of children born to tall parents tend to
regress towards the mean height of the population.
Dependent and independent variables: the known variable is called
the independent variable. The variable we are trying to predict is
the dependent variable.
Examples: sales depending on advertising expenditure;
GNP depending on final consumption spending.
About Regression
Regression refers to the statistical technique of
modeling the relationship between variables
In simple linear regression, we model the
relationship between two variables
One of the variables, denoted by Y, is called
the dependent variable and the other, denoted
by X, is called the independent variable
The model we will use to depict the
relationship between X and Y will be a
straight-line relationship
A graphical sketch of the pairs (X, Y) is called a
scatter plot
A Scatter Plot
This scatter plot locates pairs of observations of advertising
expenditures on the x-axis and sales on the y-axis. We notice that:
Larger (smaller) values of sales tend to be associated with larger
(smaller) values of advertising
The scatter of points tends to be distributed around a positively
sloped straight line
A Scatter Plot
The pairs of values of advertising expenditures and sales are not
located exactly on a straight line
The scatter plot reveals a more or less strong tendency rather than a
precise linear relationship
The line represents the nature of the relationship on average
The scatter of points tends to be distributed around a positively
sloped straight line
Example
Student               A   B   C   D   E   F   G   H
Entrance Exam Score   74  69  85  63  82  60  79  91
Direct Relationship
Y = a + bX
a is the Y intercept
b = (Y2 − Y1) / (X2 − X1)
  = (7 − 5) / (2 − 1) = 2
a = 3
Equation
Y = 3 + 2X
Y = a + bX
a is the Y intercept
b = (Y2 − Y1) / (X2 − X1)
  = (3 − 6) / (1 − 0) = −3
a = 6
Equation
Y = 6 − 3X
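The two-point calculation used in both examples can be sketched in a few lines of Python. This is only an illustration: the points (1, 5), (2, 7) and (0, 6), (1, 3) are read back from the arithmetic shown on the slides, and the function name `line_through` is a hypothetical helper, not part of the course material.

```python
def line_through(p1, p2):
    """Return (a, b) for the line Y = a + bX through two points."""
    (x1, y1), (x2, y2) = p1, p2
    b = (y2 - y1) / (x2 - x1)   # slope: change in Y per unit change in X
    a = y1 - b * x1             # intercept: value of Y when X = 0
    return a, b

direct = line_through((1, 5), (2, 7))    # positively sloped line
inverse = line_through((0, 6), (1, 3))   # negatively sloped line
print(direct)   # (3.0, 2.0)
print(inverse)  # (6.0, -3.0)
```

The first pair reproduces Y = 3 + 2X and the second reproduces Y = 6 − 3X.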
Y = a + bX
Example
Truck No.   Age in years   Repair expense in            XY    X²
(n = 4)     (X)            thousands of rupees (Y)
101         5              7                            35    25
102         3              7                            21     9
103         3              6                            18     9
104         1              4                             4     1
∑X = 12   ∑Y = 24   ∑XY = 78   ∑X² = 44
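Year    X     Y     XY     X²
1983    5     31    155    25
1982    11    40    440    121
1981    4     30    120    16
1980    5     34    170    25
1979    3     25    75     9
1978    2     20    40     4
∑X = 30   ∑Y = 180   ∑XY = 1000   ∑X² = 200
n = 6, X̄ = 30/6 = 5, Ȳ = 180/6 = 30
b = (∑XY − n·X̄·Ȳ) / (∑X² − n·X̄²) = (1000 − 6 × 5 × 30) / (200 − 6 × 5²) = 2
a = Ȳ − b·X̄ = 30 − 2 × 5 = 20
Y = 20 + 2X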
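The least squares calculation above, and the truck example before it, can be checked with a short Python sketch. It assumes the standard formulas b = (∑XY − n·X̄·Ȳ) / (∑X² − n·X̄²) and a = Ȳ − b·X̄; the function name `least_squares` is a hypothetical helper.

```python
def least_squares(xs, ys):
    """Fit Y = a + bX by least squares; return (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    a = y_bar - b * x_bar
    return a, b

# Truck example: age in years (X), repair expense in thousands of rupees (Y)
truck = least_squares([5, 3, 3, 1], [7, 7, 6, 4])
# Six-row example: the X and Y columns of the table above
six_year = least_squares([5, 11, 4, 5, 3, 2], [31, 40, 30, 34, 25, 20])
print(truck)     # (3.75, 0.75)
print(six_year)  # (20.0, 2.0)
```

The truck data give the line 3.75 + 0.75X, and the six-row table gives 20 + 2X.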
Errors in Regression
[Figure: an observed data point (Xi, Yi) and the fitted regression line Ŷ = b0 + b1X]
Ŷi = the predicted value of Y for Xi
Error: ei = Yi − Ŷi
Standard error of estimate: se = √( ∑(Y − Ŷ)² / (n − 2) )
Y - values of the dependent variable
Ŷ - estimated values from the estimating equation that
correspond to each Y value
n - number of data points
X    Y    Ŷ = 3.75 + 0.75X           Individual error (Y − Ŷ)    (Y − Ŷ)²
5    7    3.75 + (0.75)(5) = 7.5     7 − 7.5 = −0.5              0.25
3    7    3.75 + (0.75)(3) = 6.0     7 − 6.0 = 1.0               1.00
3    6    3.75 + (0.75)(3) = 6.0     6 − 6.0 = 0.0               0.00
1    4    3.75 + (0.75)(1) = 4.5     4 − 4.5 = −0.5              0.25
∑(Y − Ŷ)² = 1.50
se = √( ∑(Y − Ŷ)² / (n − 2) )
   = √(1.50 / 2) = 0.866
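As a rough check of the table above, the standard error of estimate can be computed in Python. This sketch assumes the fitted line Ŷ = 3.75 + 0.75X and the four truck observations; `standard_error` is a hypothetical helper name.

```python
import math

def standard_error(xs, ys, a, b):
    """Standard error of estimate for the line Y-hat = a + bX."""
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sse = sum(e * e for e in residuals)      # sum of squared errors
    return math.sqrt(sse / (len(xs) - 2))    # divide by n - 2, then take the root

se = standard_error([5, 3, 3, 1], [7, 7, 6, 4], a=3.75, b=0.75)
print(round(se, 3))  # 0.866
```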
Correlation Analysis
Correlation analysis is the statistical tool that describes the degree
to which one variable is linearly related to another. It measures the
strength of the association between two variables. Two measures are used:
a) Coefficient of Determination
b) Coefficient of Correlation
Interpretation of r²
r² = 1 − ∑(Y − Ŷ)² / ∑(Y − Ȳ)² = 1 − 0/672 = 1
PERFECT CORRELATION
BETWEEN X AND Y
Interpretation of r²
Here the Y values alternate between 6 and 12, so the mean is Ȳ = 9
and the fitted line is the horizontal line Ŷ = 9.
POINT   Y     (Y − Ŷ)²          (Y − Ȳ)²
1st     6     (6 − 9)² = 9      (6 − 9)² = 9
2nd     12    (12 − 9)² = 9     (12 − 9)² = 9
3rd     6     (6 − 9)² = 9      (6 − 9)² = 9
4th     12    (12 − 9)² = 9     (12 − 9)² = 9
5th     6     (6 − 9)² = 9      (6 − 9)² = 9
6th     12    (12 − 9)² = 9     (12 − 9)² = 9
7th     6     (6 − 9)² = 9      (6 − 9)² = 9
8th     12    (12 − 9)² = 9     (12 − 9)² = 9
∑(Y − Ŷ)² = 72   ∑(Y − Ȳ)² = 72
r² = 1 − ∑(Y − Ŷ)² / ∑(Y − Ȳ)² = 1 − 72/72 = 0
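The two extreme values of r² can be reproduced with a small Python sketch. It assumes the definition r² = 1 − ∑(Y − Ŷ)² / ∑(Y − Ȳ)². For the perfect-correlation case it further assumes the data are Y = 4X for X = 1 to 8, which gives the total variation of 672 quoted earlier; `r_squared` is a hypothetical helper name.

```python
def r_squared(ys, y_hats):
    """Coefficient of determination: 1 - (variation around line) / (variation around mean)."""
    y_bar = sum(ys) / len(ys)
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))  # around the fitted line
    sst = sum((y - y_bar) ** 2 for y in ys)                # around the mean of Y
    return 1 - sse / sst

# No correlation: Y alternates between 6 and 12 and the fitted line is flat at 9.
ys_flat = [6, 12] * 4
print(r_squared(ys_flat, [9] * 8))   # 0.0

# Perfect correlation (assumed data Y = 4X, X = 1..8): every point lies on the line.
ys_line = [4 * x for x in range(1, 9)]
print(r_squared(ys_line, ys_line))   # 1.0
```

When the line explains none of the variation in Y, r² is 0; when it explains all of it, r² is 1.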
NO CORRELATION
Coefficient of Correlation
Illustrations of Correlation
[Figure: six scatter plots of Y against X, showing correlations of −1, 0, 1, −0.8, 0 and 0.8]
Caselet 1
American Express Company has long believed that
its cardholders tend to travel more extensively than
others—both on business and for pleasure. As part
of a comprehensive research effort undertaken by a
New York market research firm on behalf of
American Express, a study was conducted to
determine the relationship between travel and
charges on the American Express card. The research
firm selected a random sample of 25 cardholders
from the American Express computer file and
recorded their total charges over a specified period.
Caselet 1
For the selected cardholders, information was
also obtained, through a mailed questionnaire, on
the total number of miles traveled by each
cardholder during the same period. The data for
this study are given in the following table
Continued
Miles (X) Dollars (Y) Miles (X) Dollars (Y)
1849 2332 3466 4244
2026 2305 3643 5298
2133 3016 3852 4801
2253 3385 4033 5147
2400 3090 4267 5738
2468 3694 4498 6420
2699 3371 4533 6059
2806 3998 4804 6426
3082 3555 5090 6321
3209 4692 5233 7026
1211 1802 5439 6964
1345 2405 1422 2005
1687 2511
Continued
n Miles (X) Dollars (Y) X² Y² XY
1 1849 2332
2 2026 2305
3 2133 3016
4 2253 3385
5 2400 3090
6 2468 3694
7 2699 3371
8 2806 3998
9 3082 3555
10 3209 4692
11 3466 4244
12 3643 5298
13 3852 4801
14 4033 5147
15 4267 5738
16 4498 6420
17 4533 6059
18 4804 6426
19 5090 6321
20 5233 7026
21 5439 6964
22 1211 1802
23 1345 2405
24 1422 2005
25 1687 2511
Sum
Mean
Continued
n Miles (X) Dollars (Y) X² Y² XY
1 1849 2332 3418801 5438224 4311868
2 2026 2305 4104676 5313025 4669930
3 2133 3016 4549689 9096256 6433128
4 2253 3385 5076009 11458225 7626405
5 2400 3090 5760000 9548100 7416000
6 2468 3694 6091024 13645636 9116792
7 2699 3371 7284601 11363641 9098329
8 2806 3998 7873636 15984004 11218388
9 3082 3555 9498724 12638025 10956510
10 3209 4692 10297681 22014864 15056628
11 3466 4244 12013156 18011536 14709704
12 3643 5298 13271449 28068804 19300614
13 3852 4801 14837904 23049601 18493452
14 4033 5147 16265089 26491609 20757851
15 4267 5738 18207289 32924644 24484046
16 4498 6420 20232004 41216400 28877160
17 4533 6059 20548089 36711481 27465447
18 4804 6426 23078416 41293476 30870504
19 5090 6321 25908100 39955041 32173890
20 5233 7026 27384289 49364676 36767058
21 5439 6964 29582721 48497296 37877196
22 1211 1802 1466521 3247204 2182222
23 1345 2405 1809025 5784025 3234725
24 1422 2005 2022084 4020025 2851110
25 1687 2511 2845969 6305121 4236057
Sum 79448 106605 293426946 521440939 390185014
Mean 3177.9 4264.2
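b = (∑XY − n·X̄·Ȳ) / (∑X² − n·X̄²) = 51402852.4 / 40947557.84 = 1.2553
a = Ȳ − b·X̄ = 4264.2 − 1.255334 × 3177.92 = 274.85
Y = 274.85 + 1.2553X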
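The slope and intercept for the caselet can be recomputed from the column sums in the table (n = 25, ∑X = 79448, ∑Y = 106605, ∑X² = 293426946, ∑XY = 390185014). The Python sketch below assumes the usual least squares formulas; the variable names are illustrative only.

```python
n = 25
sum_x, sum_y = 79448, 106605            # column sums of Miles and Dollars
sum_x2, sum_xy = 293426946, 390185014   # column sums of X squared and XY

x_bar = sum_x / n   # mean miles travelled
y_bar = sum_y / n   # mean dollars charged
b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
a = y_bar - b * x_bar
print(round(b, 4), round(a, 2))  # 1.2553 274.85
```

From these sums, each additional mile travelled is associated with about 1.26 dollars of additional charges on the card.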
Rank Correlation
rs = 1 − 6∑d² / (n(n² − 1))
   = 1 − (6 × 58) / (11 × (121 − 1))
   = 1 − 0.264 = 0.736
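The rank correlation arithmetic above can be sketched in Python. This assumes the Spearman formula rs = 1 − 6∑d² / (n(n² − 1)), which applies when there are no tied ranks, with ∑d² = 58 and n = 11 taken from the calculation shown. The individual ranks are not available in the slides, so the sketch starts from the sum of squared rank differences; `spearman_from_d2` is a hypothetical helper name.

```python
def spearman_from_d2(sum_d2, n):
    """Rank correlation from the sum of squared rank differences (no ties)."""
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

rs = spearman_from_d2(sum_d2=58, n=11)
print(round(rs, 3))  # 0.736
```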
THANK YOU…
All information, including graphical representations, etc provided in this presentation is for exclusive use of current GBS
students and faculty. No part of the document may be reproduced in any form or by any means, electronic or otherwise, without
written permission of the owner.