Estimation of Causal Relationships I: Illustration 1

Regression Analysis & Diagnosis

Illustration 1
• How are salary and expenditure related?
• How does the GDP of a country depend on various economic parameters?
• Salary vs. years of experience
• Salary vs. level of educational attainment

Illustration 1
The product manager in charge of a particular brand of children's breakfast cereal would like to predict the demand for the cereal during the next year. To use a forecasting technique, she and her staff list the following variables as likely to affect sales:

Illustration 2
A gold speculator is considering a major purchase of gold bullion. He would like to forecast the price of gold 2 years from now (his planning horizon) using a forecasting technique. In preparation, he produces the following list of variables:

Learning Objectives
• Simple Linear Regression Model
• Least Squares Method
• Coefficient of Determination
• Model Assumptions
• Testing for Significance

Simple Linear Regression
• Managerial decisions often are based on the relationship between two or more variables.
• Regression analysis can be used to develop an equation showing how the variables are related.
• The variable being predicted is called the dependent variable and is denoted by y.
• The variables being used to predict the value of the dependent variable are called the independent variables and are denoted by x.

Simple Linear Regression
• Simple linear regression involves one independent variable and one dependent variable.
• The relationship between the two variables is approximated by a straight line.
• Regression analysis involving two or more independent variables is called multiple regression.

Simple Linear Regression Model
• The equation that describes how y is related to x and an error term is called the regression model.
• The simple linear regression model is:

$$y = \beta_0 + \beta_1 x + \varepsilon$$

where:
  $\beta_0$ and $\beta_1$ are called parameters of the model,
  $\varepsilon$ is a random variable called the error term.

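To make the roles of the parameters and the error term concrete, here is a minimal sketch that draws observations from the model. The parameter values and the choice of a normally distributed error are assumptions made for illustration only; `simulate` is a hypothetical helper, not something defined on the slides.

```python
import random

def simulate(beta0, beta1, xs, sigma, seed=0):
    """Draw one y per x from y = beta0 + beta1*x + eps, with eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

xs = [1, 2, 3, 4, 5]

# Illustrative (assumed) parameter values: intercept 10, slope 5, error s.d. 2.
ys = simulate(beta0=10.0, beta1=5.0, xs=xs, sigma=2.0)

# With sigma = 0 the error term vanishes and every point lies exactly on the line.
on_line = simulate(beta0=10.0, beta1=5.0, xs=xs, sigma=0.0)
print(on_line)  # [15.0, 20.0, 25.0, 30.0, 35.0]
```

The scatter of `ys` around `on_line` is what the error term contributes; the straight line itself is determined entirely by the two parameters.
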
Simple Linear Regression Equation
• The simple linear regression equation is:

$$E(y) = \beta_0 + \beta_1 x$$

  - The graph of the regression equation is a straight line.
  - $\beta_0$ is the y intercept of the regression line.
  - $\beta_1$ is the slope of the regression line.
  - E(y) is the expected value of y for a given x value.

Simple Linear Regression Equation
• Positive Linear Relationship
[Figure: E(y) plotted against x; the regression line has intercept $\beta_0$ and a positive slope $\beta_1$.]

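Simple Linear Regression Equation
• Negative Linear Relationship
[Figure: E(y) plotted against x; the regression line has intercept $\beta_0$ and a negative slope $\beta_1$.]

Simple Linear Regression Equation
• No Relationship
[Figure: E(y) plotted against x; the regression line is horizontal, with intercept $\beta_0$ and slope $\beta_1$ equal to 0.]
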
Estimated Simple Linear Regression Equation
• The estimated simple linear regression equation is:

$$\hat{y} = b_0 + b_1 x$$

  - The graph is called the estimated regression line.
  - $b_0$ is the y intercept of the line.
  - $b_1$ is the slope of the line.
  - $\hat{y}$ is the estimated value of y for a given x value.

Estimation Process
1. Regression model: $y = \beta_0 + \beta_1 x + \varepsilon$; regression equation: $E(y) = \beta_0 + \beta_1 x$; unknown parameters: $\beta_0$, $\beta_1$.
2. Sample data: pairs $(x_1, y_1), \ldots, (x_n, y_n)$.
3. Estimated regression equation: $\hat{y} = b_0 + b_1 x$; sample statistics: $b_0$, $b_1$.
4. $b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.

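The idea that sample statistics estimate unknown parameters can be illustrated by simulation. The sketch below repeatedly draws samples from a model with known parameters and averages the resulting estimates. The "true" values, the sample size, and the normal error are all assumptions chosen for illustration; the formulas inside `least_squares` are the least squares estimates defined on the slides that follow.

```python
import random

def least_squares(xs, ys):
    """Return (b0, b1) computed from the standard least squares formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical "true" parameters, used only for this illustration.
BETA0, BETA1, SIGMA = 10.0, 5.0, 2.0
xs = list(range(1, 21))
rng = random.Random(42)

estimates = []
for _ in range(2000):
    # One simulated sample from y = beta0 + beta1*x + eps.
    ys = [BETA0 + BETA1 * x + rng.gauss(0.0, SIGMA) for x in xs]
    estimates.append(least_squares(xs, ys))

mean_b0 = sum(b0 for b0, _ in estimates) / len(estimates)
mean_b1 = sum(b1 for _, b1 in estimates) / len(estimates)
print(f"average b0 = {mean_b0:.3f}, average b1 = {mean_b1:.3f}")
```

Individual samples give different values of b0 and b1, but their averages should land close to the parameters used to generate the data.
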
Least Squares Method
• Least Squares Criterion

$$\min \sum (y_i - \hat{y}_i)^2$$

where:
  $y_i$ = observed value of the dependent variable for the ith observation
  $\hat{y}_i$ = estimated value of the dependent variable for the ith observation

Least Squares Method
• Slope for the Estimated Regression Equation

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$

where:
  $x_i$ = value of independent variable for ith observation
  $y_i$ = value of dependent variable for ith observation
  $\bar{x}$ = mean value for independent variable
  $\bar{y}$ = mean value for dependent variable

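A brief sketch of why the least squares criterion leads to this slope formula. The symbol S for the sum of squared residuals is introduced here for convenience and does not appear on the slides.

$$S(b_0, b_1) = \sum (y_i - b_0 - b_1 x_i)^2$$

Setting both partial derivatives to zero gives

$$\frac{\partial S}{\partial b_0} = -2 \sum (y_i - b_0 - b_1 x_i) = 0 \quad\Rightarrow\quad b_0 = \bar{y} - b_1 \bar{x}$$

$$\frac{\partial S}{\partial b_1} = -2 \sum x_i (y_i - b_0 - b_1 x_i) = 0$$

Substituting $b_0 = \bar{y} - b_1 \bar{x}$ into the second equation:

$$\sum x_i \big[(y_i - \bar{y}) - b_1 (x_i - \bar{x})\big] = 0 \quad\Rightarrow\quad b_1 = \frac{\sum x_i (y_i - \bar{y})}{\sum x_i (x_i - \bar{x})} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$

The last step uses $\sum \bar{x}(y_i - \bar{y}) = 0$ and $\sum \bar{x}(x_i - \bar{x}) = 0$, since deviations from a mean sum to zero.
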
Least Squares Method
• y-Intercept for the Estimated Regression Equation

$$b_0 = \bar{y} - b_1 \bar{x}$$

Simple Linear Regression
• Example: Reed Auto Sales
Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.

Simple Linear Regression
• Example: Reed Auto Sales

Number of TV Ads (x)    Number of Cars Sold (y)
         1                        14
         3                        24
         2                        18
         1                        17
         3                        27

$\sum x = 10$, $\sum y = 100$, $\bar{x} = 2$, $\bar{y} = 20$

Estimated Regression Equation
• Slope for the Estimated Regression Equation

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{20}{4} = 5$$

• y-Intercept for the Estimated Regression Equation

$$b_0 = \bar{y} - b_1 \bar{x} = 20 - 5(2) = 10$$

• Estimated Regression Equation

$$\hat{y} = 10 + 5x$$

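The slope and intercept calculations can be checked with a few lines of code. This is a minimal sketch using only the five Reed Auto observations from the table; the function and variable names are illustrative.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the estimated regression equation y_hat = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

tv_ads = [1, 3, 2, 1, 3]          # x: number of TV ads
cars_sold = [14, 24, 18, 17, 27]  # y: number of cars sold

b0, b1 = least_squares(tv_ads, cars_sold)
print(f"y_hat = {b0:.0f} + {b1:.0f}x")  # y_hat = 10 + 5x
```

Under this fitted line, each additional TV ad is associated with an estimated 5 additional cars sold.
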
Regression Model Diagnosis
Diagnosis helps to justify the regression model from every aspect of the study. The diagnostic quantities to be checked are:
• Coefficient of determination (R-square value)
• Sign of the correlation coefficient
• Significance of $\beta_1$ (F test)
• Whether $\beta_1 = 0$, and the confidence interval for $\beta_1$ (t test)
• P-values for the explanatory variable(s)

Coefficient of Determination
• Relationship Among SST, SSR, SSE

$$\text{SST} = \text{SSR} + \text{SSE}$$

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

where:
  SST = total sum of squares
  SSR = sum of squares due to regression
  SSE = sum of squares due to error

Coefficient of Determination
• The coefficient of determination is:

$$R^2 = \text{SSR}/\text{SST}$$

where:
  SSR = sum of squares due to regression
  SST = total sum of squares

Coefficient of Determination

$$R^2 = \text{SSR}/\text{SST} = 100/114 = .8772$$

The regression relationship is very strong; 87.72% of the variability in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.

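The three sums of squares and the coefficient of determination can be reproduced directly from the Reed Auto data. A minimal sketch, taking the estimates b0 = 10 and b1 = 5 from the slides as given; variable names are illustrative.

```python
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated intercept and slope from the slides

y_bar = sum(cars_sold) / len(cars_sold)
y_hat = [b0 + b1 * x for x in tv_ads]  # fitted values

sst = sum((y - y_bar) ** 2 for y in cars_sold)               # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)                 # due to regression
sse = sum((y - yh) ** 2 for y, yh in zip(cars_sold, y_hat))  # due to error

r_squared = ssr / sst
print(f"SST = {sst:.0f}, SSR = {ssr:.0f}, SSE = {sse:.0f}, R^2 = {r_squared:.4f}")
# SST = 114, SSR = 100, SSE = 14, R^2 = 0.8772
```

The printed values also confirm the decomposition SST = SSR + SSE for this sample (114 = 100 + 14).
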
Sample Correlation Coefficient

$$r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}}$$

$$r_{xy} = (\text{sign of } b_1)\sqrt{R^2}$$

where:
  $b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$

Sample Correlation Coefficient

$$r_{xy} = (\text{sign of } b_1)\sqrt{R^2}$$

The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".

$$r_{xy} = +\sqrt{.8772} = +.9366$$

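As a cross-check, the correlation obtained from the sign of the slope and the coefficient of determination can be compared with the correlation computed directly from the data. The direct formula used below is the usual Pearson product-moment definition, which is not stated on the slides; the sketch assumes the Reed Auto values SSR = 100, SST = 114, and b1 = 5.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b1 = 5.0                  # slope from the estimated regression equation
r_squared = 100.0 / 114.0  # SSR / SST from the slides

# Route 1: sign of the slope times the square root of R^2.
r_from_r2 = math.copysign(math.sqrt(r_squared), b1)

# Route 2: Pearson correlation computed directly from the data.
n = len(tv_ads)
x_bar = sum(tv_ads) / n
y_bar = sum(cars_sold) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tv_ads, cars_sold))
sxx = sum((x - x_bar) ** 2 for x in tv_ads)
syy = sum((y - y_bar) ** 2 for y in cars_sold)
r_direct = sxy / math.sqrt(sxx * syy)

print(f"{r_from_r2:.4f} {r_direct:.4f}")  # 0.9366 0.9366
```

Both routes agree, which holds for simple linear regression with a single independent variable.
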
Assumptions About the Error Term $\varepsilon$
1. The error $\varepsilon$ is a random variable with mean of zero.
2. The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
3. The values of $\varepsilon$ are independent.
4. The error $\varepsilon$ is a normally distributed random variable.

Testing for Significance
To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.
Two tests are commonly used: the t test and the F test.
Both the t test and F test require an estimate of $\sigma^2$, the variance of $\varepsilon$ in the regression model.

Testing for Significance
• An Estimate of $\sigma^2$
The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.

$$s^2 = \text{MSE} = \text{SSE}/(n - 2)$$

where:

$$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$$

Testing for Significance
• An Estimate of $\sigma$
  - To estimate $\sigma$ we take the square root of $s^2$, the estimate of $\sigma^2$.
  - The resulting s is called the standard error of the estimate.

$$s = \sqrt{\text{MSE}} = \sqrt{\frac{\text{SSE}}{n - 2}}$$

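For the Reed Auto data, the mean square error and the standard error of the estimate can be computed as follows. A minimal sketch, taking b0 = 10 and b1 = 5 from the earlier slides as given.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated intercept and slope

n = len(tv_ads)
# Sum of squared residuals around the estimated regression line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(tv_ads, cars_sold))
mse = sse / (n - 2)  # two degrees of freedom are used up by b0 and b1
s = math.sqrt(mse)   # standard error of the estimate

print(f"SSE = {sse:.0f}, MSE = {mse:.3f}, s = {s:.4f}")
# SSE = 14, MSE = 4.667, s = 2.1602
```
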
Testing for Significance: t Test
• Hypotheses

$$H_0: \beta_1 = 0$$
$$H_a: \beta_1 \neq 0$$

• Test Statistic

$$t = \frac{b_1}{s_{b_1}} \quad \text{where} \quad s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$$

Testing for Significance: t Test
• Rejection Rule
Reject $H_0$ if p-value < $\alpha$, or $t < -t_{\alpha/2}$, or $t > t_{\alpha/2}$
where: $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom

Testing for Significance: t Test
1. Determine the hypotheses.
   $H_0: \beta_1 = 0$
   $H_a: \beta_1 \neq 0$
2. Specify the level of significance. $\alpha = .05$
3. Select the test statistic. $t = b_1 / s_{b_1}$
4. State the rejection rule. Reject $H_0$ if p-value < .05 or |t| > 3.182 (with 3 degrees of freedom)

Testing for Significance: t Test
5. Compute the value of the test statistic.

$$t = \frac{b_1}{s_{b_1}} = \frac{5}{1.08} = 4.63$$

6. Determine whether to reject $H_0$.
   t = 4.541 provides an area of .01 in the upper tail. Hence, the p-value is less than .02. (Also, t = 4.63 > 3.182.) We can reject $H_0$.

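The six steps of the t test can be traced in code for the Reed Auto data. This is a sketch only: the critical value 3.182 is copied from the slides rather than computed, and the variable names are illustrative.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated intercept and slope

n = len(tv_ads)
x_bar = sum(tv_ads) / n
sxx = sum((x - x_bar) ** 2 for x in tv_ads)

# Standard error of the estimate, s = sqrt(SSE / (n - 2)).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(tv_ads, cars_sold))
s = math.sqrt(sse / (n - 2))

s_b1 = s / math.sqrt(sxx)  # estimated standard deviation of b1
t = b1 / s_b1              # test statistic

T_CRIT = 3.182  # t value for alpha/2 = .025 with 3 d.f., as given on the slides
reject_h0 = abs(t) > T_CRIT

print(f"s_b1 = {s_b1:.2f}, t = {t:.2f}, reject H0: {reject_h0}")
# s_b1 = 1.08, t = 4.63, reject H0: True
```
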
Confidence Interval for $\beta_1$
• We can use a 95% confidence interval for $\beta_1$ to test the hypotheses just used in the t test.
• $H_0$ is rejected if the hypothesized value of $\beta_1$ is not included in the confidence interval for $\beta_1$.

Confidence Interval for $\beta_1$
• The form of a confidence interval for $\beta_1$ is:

$$b_1 \pm t_{\alpha/2}\, s_{b_1}$$

where $b_1$ is the point estimator, $t_{\alpha/2}\, s_{b_1}$ is the margin of error, and $t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 2 degrees of freedom.

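A minimal sketch of this interval for the Reed Auto data, reusing the t value 3.182 quoted in the t test slides (3 degrees of freedom, 95% confidence). The helper name `slope_confidence_interval` is hypothetical.

```python
import math

def slope_confidence_interval(xs, ys, b0, b1, t_value):
    """Return (lower, upper, margin) for the interval b1 +/- t_value * s_b1."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # standard error of the estimate
    s_b1 = s / math.sqrt(sxx)     # estimated standard deviation of b1
    margin = t_value * s_b1       # margin of error
    return b1 - margin, b1 + margin, margin

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]

lower, upper, margin = slope_confidence_interval(
    tv_ads, cars_sold, b0=10.0, b1=5.0, t_value=3.182
)
print(f"5 +/- {margin:.2f}: {lower:.2f} to {upper:.2f}")  # 5 +/- 3.44: 1.56 to 8.44
print("0 inside interval:", lower <= 0.0 <= upper)        # 0 inside interval: False
```
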
Confidence Interval for $\beta_1$
• Rejection Rule
Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.
• 95% Confidence Interval for $\beta_1$

$$b_1 \pm t_{\alpha/2}\, s_{b_1} = 5 \pm 3.182(1.08) = 5 \pm 3.44$$

or 1.56 to 8.44
• Conclusion
0 is not included in the confidence interval. Reject $H_0$.

Testing for Significance: F Test
• Hypotheses

$$H_0: \beta_1 = 0$$
$$H_a: \beta_1 \neq 0$$

• Test Statistic

$$F = \text{MSR}/\text{MSE}$$

Testing for Significance: F Test
• Rejection Rule
Reject $H_0$ if p-value < $\alpha$ or $F > F_\alpha$
where: $F_\alpha$ is based on an F distribution with 1 degree of freedom in the numerator and n - 2 degrees of freedom in the denominator

Testing for Significance: F Test
1. Determine the hypotheses.
   $H_0: \beta_1 = 0$
   $H_a: \beta_1 \neq 0$
2. Specify the level of significance. $\alpha = .05$
3. Select the test statistic. F = MSR/MSE
4. State the rejection rule. Reject $H_0$ if p-value < .05 or F > 10.13 (with 1 d.f. in numerator and 3 d.f. in denominator)

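With the test set up in steps 1 to 4, the F statistic for the Reed Auto data can be computed as sketched below. The critical value 10.13 is taken from the slides rather than computed, and MSR is SSR divided by its single degree of freedom.

```python
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated intercept and slope

n = len(tv_ads)
y_bar = sum(cars_sold) / n
y_hat = [b0 + b1 * x for x in tv_ads]

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((y - yh) ** 2 for y, yh in zip(cars_sold, y_hat))

msr = ssr / 1        # one independent variable: 1 numerator d.f.
mse = sse / (n - 2)  # n - 2 denominator d.f.
f_stat = msr / mse

F_CRIT = 10.13  # F value for alpha = .05 with (1, 3) d.f., as given on the slides
reject_h0 = f_stat > F_CRIT

print(f"MSR = {msr:.0f}, MSE = {mse:.3f}, F = {f_stat:.2f}, reject H0: {reject_h0}")
# MSR = 100, MSE = 4.667, F = 21.43, reject H0: True
```

In simple linear regression the F statistic equals the square of the t statistic for the slope, so the two tests lead to the same conclusion here.
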
Testing for Significance: F Test
5. Compute the value of the test statistic.

$$F = \text{MSR}/\text{MSE} = 100/4.667 = 21.43$$

6. Determine whether to reject $H_0$.
   F = 17.44 provides an area of .025 in the upper tail. Thus, the p-value corresponding to F = 21.43 is less than .025. Hence, we reject $H_0$.
   The statistical evidence is sufficient to conclude that we have a significant relationship between the number of TV ads aired and the number of cars sold.

Some Cautions about the Interpretation of Significance Tests
• Rejecting $H_0: \beta_1 = 0$ and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.
• Just because we are able to reject $H_0: \beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.

Illustration 3
Car dealers across North America use the so-called Blue Book to help
them determine the value of used cars that their customers trade in
when purchasing new cars. The book, which is published monthly,
lists the trade-in values for all basic models of cars. It provides
alternative values for each car model according to its condition and
optional features. The values are determined on the basis of the
average paid at recent used-car auctions, the source of supply for
many used-car dealers. However, the Blue Book does not indicate the
value determined by the odometer reading and color, despite the fact
that a critical factor for used-car buyers is how far the car has been
driven and color choice. To examine this issue, a used-car dealer
randomly selected 100 three-year-old Toyota Camrys that were sold at
auction during the past month. Each car was in top condition and
equipped with all the features that come standard with this car. The
dealer recorded the price (in $1,000s), the number of miles (in
thousands) on the odometer, and the color. The dealer wants to establish
the relationship through a regression model. #DATA