Instructor:
Dr. Md. Sohel Rana
Associate Professor of Statistics,
Department of Mathematical & Physical Sciences, EWU
Email: srana@ewubd.edu
Correlation Analysis
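$$S_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}, \qquad S_{yy} = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}$$
$$S_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}$$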
Example
• A major airline wants to estimate the relationship
between the number of reservations and the
actual number of passengers who show up for
flight ABC.
• Information gathered over 12 randomly selected
days for flight ABC is given in the table below:
Dr. Md. Sohel Rana, Stat, EWU
Airline Data
Day   No. of Reservations   No. of Passengers
 1            250                  210
 2            548                  405
 3            156                  120
 4            121                   89
 5            416                  304
 6            450                  320
 7            462                  319
 8            508                  410
 9            307                  275
10            311                  289
11            265                  236
12            189                  170
Example
[Scatter plot: No. of Passengers (vertical axis) against No. of Reservations (horizontal axis)]
$$r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} = \frac{154483.2}{\sqrt{(223736.9)(113564.2)}} = 0.97$$
Comment: ?
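The correlation calculation can be checked with a few lines of Python. This is a minimal sketch that uses only the airline table and the computational formulas for S_xx, S_yy and S_xy; the function name `corrected_sums` and the variable names are illustrative choices, not part of the slides.

```python
import math

# Airline data from the table (12 randomly selected days)
reservations = [250, 548, 156, 121, 416, 450, 462, 508, 307, 311, 265, 189]  # x
passengers = [210, 405, 120, 89, 304, 320, 319, 410, 275, 289, 236, 170]     # y

def corrected_sums(x, y):
    """Return (S_xx, S_yy, S_xy) from the raw sums of x, y, x^2, y^2 and xy."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(v * v for v in x) - sum_x ** 2 / n
    s_yy = sum(v * v for v in y) - sum_y ** 2 / n
    s_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    return s_xx, s_yy, s_xy

s_xx, s_yy, s_xy = corrected_sums(reservations, passengers)

# Sample correlation coefficient r = S_xy / sqrt(S_xx * S_yy)
r = s_xy / math.sqrt(s_xx * s_yy)
print(f"S_xx = {s_xx:.1f}, S_yy = {s_yy:.1f}, S_xy = {s_xy:.1f}, r = {r:.2f}")
```

The printed values should agree with the slide figures up to rounding.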
• We choose $\hat{\beta}_0$ and $\hat{\beta}_1$ to make the errors as small as possible.
• To do this, we choose $\hat{\beta}_0$ and $\hat{\beta}_1$ to minimize the sum of the squares of the errors (SSE):
$$SSE = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left[ y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i) \right]^2$$
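The minimization step can be written out explicitly. The following is a short sketch of the standard calculus argument (setting both partial derivatives of SSE to zero), added here as a bridge to the estimator formulas; it is not taken from the slides themselves.

```latex
\begin{align*}
\frac{\partial\, SSE}{\partial \hat{\beta}_0}
  &= -2 \sum_{i=1}^{n} \left[ y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i) \right] = 0
  &&\Rightarrow\quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}, \\
\frac{\partial\, SSE}{\partial \hat{\beta}_1}
  &= -2 \sum_{i=1}^{n} x_i \left[ y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i) \right] = 0
  &&\Rightarrow\quad \sum x_i y_i - \hat{\beta}_0 \sum x_i - \hat{\beta}_1 \sum x_i^2 = 0 .
\end{align*}
% Substituting \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} into the second equation:
\begin{align*}
\hat{\beta}_1 \left( \sum x_i^2 - \frac{(\sum x_i)^2}{n} \right)
  = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}
  \quad\Rightarrow\quad
  \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} .
\end{align*}
```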
Estimating Regression Parameters
• The Least Squares Method: Minimize SSE
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$
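Estimating Regression Parameters
• The least squares estimates of β0 and β1, which we call $\hat{\beta}_0$ and $\hat{\beta}_1$, are given by
$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \qquad \text{and} \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$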
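As an illustration of these formulas, the sketch below fits the line to the airline data and checks numerically that moving either estimate away from its least squares value increases SSE. The helper names `least_squares` and `sse` are hypothetical, and S_xx and S_xy are computed in the equivalent deviation form (sums of products of deviations from the means).

```python
# Airline data from the table (12 randomly selected days)
reservations = [250, 548, 156, 121, 416, 450, 462, 508, 307, 311, 265, 189]  # x
passengers = [210, 405, 120, 89, 304, 320, 319, 410, 275, 289, 236, 170]     # y

def least_squares(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sse(x, y, intercept, slope):
    """Sum of squared errors for the line intercept + slope * x."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

b0, b1 = least_squares(reservations, passengers)
best = sse(reservations, passengers, b0, b1)

# Perturbing either estimate should give a strictly larger SSE.
for d0, d1 in [(1, 0), (-1, 0), (0, 0.01), (0, -0.01)]:
    assert sse(reservations, passengers, b0 + d0, b1 + d1) > best

print(f"intercept = {b0:.2f}, slope = {b1:.2f}, SSE = {best:.1f}")
```

The perturbation check is only a spot check at four nearby points, not a proof of the minimum.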
Example
Refer to the Airline Example
a) Which is the dependent variable (Y) and which
is the explanatory variable (X) in this problem?
b) Draw a scatter diagram with X and Y. Does the
relationship look linear?
c) Fit the regression line of Y on X. Or, fit a linear
regression model to these data with no. of
Passengers being the response variable and
no. of Reservations the explanatory variable.
d) From the output, identify and interpret the slope
and the intercept.
Example
Solution:
a) Since the no. of Passengers depends on the
no. of Reservations, the dependent variable
(Y) in this example is the no. of Passengers
and the explanatory (or independent) variable
(X) is the no. of Reservations.
Example
b) The Scatter Plot of Reservations (X) and no. of
Passengers (Y) is shown below.
It is clear from the diagram that there is a positive
relationship between the variables. It is also
reasonably clear that there is a linear trend in the
data.
[Scatter plot: No. of Passengers (vertical axis) against No. of Reservations (horizontal axis)]
Example - 1
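• For these data (n = 12):
$$\sum x_i = 3983, \quad \sum y_i = 3147, \quad \sum x_i^2 = 1545761, \quad \sum y_i^2 = 938865, \quad \sum x_i y_i = 1199025$$
$$S_{xx} = 223736.9, \quad S_{yy} = 113564.2, \quad S_{xy} = 154483.2$$
$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{154483.2}{223736.9} = 0.69$$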
Example
d) Interpretation of intercept (β0) and slope (β1):
(i) Slope β1:
• In general this tells us how we expect Y to
change, on average, if X is increased by 1 unit.
• In this example, the estimate of β1 is 0.69. Thus, for every
additional reservation, the no. of
passengers increases by 0.69 on average.
• Since the slope is positive, we expect Y to
increase as X increases.
• If the slope were negative, we would expect
Y to decrease as X increases.
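The slope interpretation can be illustrated numerically. In this sketch the fitted line is recomputed from the airline table, and the starting value of 300 reservations is a hypothetical input chosen inside the observed range; `predict` is an illustrative helper name.

```python
# Airline data from the table (12 randomly selected days)
reservations = [250, 548, 156, 121, 416, 450, 462, 508, 307, 311, 265, 189]  # x
passengers = [210, 405, 120, 89, 304, 320, 319, 410, 275, 289, 236, 170]     # y

n = len(reservations)
x_bar = sum(reservations) / n
y_bar = sum(passengers) / n
s_xx = sum((x - x_bar) ** 2 for x in reservations)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(reservations, passengers))

slope = s_xy / s_xx                # estimate of beta_1
intercept = y_bar - slope * x_bar  # estimate of beta_0

def predict(num_reservations):
    """Fitted average number of passengers for a given number of reservations."""
    return intercept + slope * num_reservations

base = 300  # hypothetical number of reservations, within the observed range
increase = predict(base + 1) - predict(base)

# One extra reservation changes the fitted value by exactly the slope.
print(f"fitted passengers at {base} reservations: {predict(base):.1f}")
print(f"change for one more reservation: {increase:.2f}")
```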
Example
Coefficient of determination
$$R^2 = r_{xy}^2$$
• $R^2$ greater than 0.80 usually indicates a well-fitted model
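For the airline data, the coefficient of determination can be computed two ways as a cross-check: as the squared correlation, and as 1 − SSE/S_yy. The second form is a standard identity for simple linear regression that the slides do not state, so treat it as an added assumption of this sketch; all variable names are illustrative.

```python
# Airline data from the table (12 randomly selected days)
reservations = [250, 548, 156, 121, 416, 450, 462, 508, 307, 311, 265, 189]  # x
passengers = [210, 405, 120, 89, 304, 320, 319, 410, 275, 289, 236, 170]     # y

n = len(reservations)
x_bar = sum(reservations) / n
y_bar = sum(passengers) / n
s_xx = sum((x - x_bar) ** 2 for x in reservations)
s_yy = sum((y - y_bar) ** 2 for y in passengers)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(reservations, passengers))

# R^2 as the squared correlation coefficient
r_squared = s_xy ** 2 / (s_xx * s_yy)

# R^2 as the proportion of variation in y explained by the fitted line
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
sse = sum((y - (intercept + slope * x)) ** 2
          for x, y in zip(reservations, passengers))
r_squared_from_sse = 1 - sse / s_yy

print(f"R^2 = {r_squared:.2f}")
print("above the 0.80 rule of thumb:", r_squared > 0.80)
```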