Session 30

Simple Linear Regression

Assumptions About the Error Term $\varepsilon$
1. The error $\varepsilon$ is a random variable with a mean of zero.
2. The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
3. The values of $\varepsilon$ are independent.
4. The error $\varepsilon$ is a normally distributed random variable.
11/14/2021 Confidential - Prof. Vasanth Kamath, TAPMI 2

Testing for Significance
• To test for a significant regression relationship, we must conduct a
hypothesis test to determine whether the value of $\beta_1$ is zero.
• Two tests are commonly used: the t test and the F test.
• Both the t test and the F test require an estimate of $\sigma^2$, the variance of $\varepsilon$ in
the regression model.

• An Estimate of $\sigma^2$
The mean square error (MSE) provides the estimate of $\sigma^2$; the
notation $s^2$ is also used.
$$s^2 = \text{MSE} = \frac{\text{SSE}}{n - 2}$$
where:
$$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$$

• An Estimate of $\sigma$
• To estimate $\sigma$, we take the square root of $s^2$.
• The resulting $s$ is called the standard error of the estimate.
$$s = \sqrt{\text{MSE}} = \sqrt{\frac{\text{SSE}}{n - 2}}$$
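The estimate of $\sigma$ can be traced numerically with a short sketch. The slides do not list the raw data, so the ten (x, y) pairs below are an assumption: they are taken to be the textbook sample for the Armand's example (student population and quarterly sales, both in thousands), chosen because they reproduce the summary figures quoted later in this deck (MSE = 191.25, s = 13.829, $\sum (x_i - \bar{x})^2 = 568$). All variable names are illustrative.

```python
import math

# ASSUMED data (not given in the slides): x = student population (1000s),
# y = quarterly sales ($1000s) for 10 restaurants.
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# SSE = sum of squared residuals; MSE = SSE / (n - 2); s = sqrt(MSE).
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)
s = math.sqrt(mse)

print(f"b1 = {b1:.2f}")
print(f"SSE = {sse:.2f}, MSE = {mse:.2f}, s = {s:.3f}")
```

The divisor is n - 2 rather than n because two parameters (intercept and slope) are estimated from the data before the residuals are formed.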

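Testing for Significance: t Test
• Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
• Test Statistic
$$t = \frac{b_1}{s_{b_1}} \quad \text{where} \quad s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$$

• Rejection Rule
Reject $H_0$ if p-value $< \alpha$, or if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
where:
$t_{\alpha/2}$ is based on a t distribution with $n - 2$ degrees of freedom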

1. Determine the hypotheses. $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
2. Specify the level of significance. $\alpha = .01$
3. Select the test statistic. $t = b_1 / s_{b_1}$
4. State the rejection rule. Reject $H_0$ if p-value $< .01$ or $|t| > 3.355$ ($t_{.005}$ with
$n - 2 = 8$ degrees of freedom)

5. Compute the value of the test statistic.
The estimated regression equation is $\hat{y} = 60 + 5x$, so $b_1 = 5$.
$$s = \sqrt{\text{MSE}} = \sqrt{191.25} = 13.829$$
$$s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}} = \frac{13.829}{\sqrt{568}} = 0.5803$$
$$t = \frac{b_1}{s_{b_1}} = \frac{5}{0.5803} = 8.62$$
6. Determine whether to reject $H_0$. Since $t = 8.62 > 3.355$, we reject $H_0$.
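The arithmetic of the t test can be checked with a minimal sketch that uses only the summary figures from the slide (slope 5, MSE = 191.25, $\sum (x_i - \bar{x})^2 = 568$) and the tabled critical value 3.355. The variable names are illustrative, and the critical value is typed in from the slide rather than computed.

```python
import math

# Summary figures quoted in the slides.
b1 = 5.0        # estimated slope
mse = 191.25    # mean square error
sxx = 568.0     # sum of squared deviations of x about its mean
t_crit = 3.355  # t value with area .005 in the upper tail, 8 d.f. (from the slide)

s = math.sqrt(mse)          # standard error of the estimate
s_b1 = s / math.sqrt(sxx)   # estimated standard deviation of b1
t = b1 / s_b1               # test statistic

# Two-tailed test: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t) > t_crit

print(f"s = {s:.3f}, s_b1 = {s_b1:.4f}, t = {t:.2f}, reject H0: {reject_h0}")
```

Because the test is two-tailed, the comparison uses the absolute value of t against the critical value for an upper-tail area of $\alpha/2$.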

Confidence Interval for $\beta_1$
• The form of a confidence interval for $\beta_1$ is as follows:
$$b_1 \pm t_{\alpha/2}\, s_{b_1}$$
where
$b_1$ is the point estimator,
$t_{\alpha/2}\, s_{b_1}$ is the margin of error, and
$t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a
t distribution with $n - 2$ degrees of freedom
• We can use a 99% confidence interval for $\beta_1$ to test the hypotheses just
used in the t test at $\alpha = .01$.
• $H_0$ is rejected if the hypothesized value of $\beta_1$ is not included in the
confidence interval for $\beta_1$.
Confidence Interval for $\beta_1$
• Rejection Rule
Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.
• 99% Confidence Interval for $\beta_1$
$$b_1 \pm t_{\alpha/2}\, s_{b_1} = 5 \pm 3.355(0.5803) = 5 \pm 1.95, \text{ or } 3.05 \text{ to } 6.95$$
• Conclusion
0 is not included in the confidence interval. Reject $H_0$.
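The interval above can be reproduced with a few lines. This is a sketch using the slide's rounded inputs (slope 5, standard deviation of the slope 0.5803, tabled t value 3.355); the variable names are illustrative.

```python
# Inputs quoted in the slides.
b1 = 5.0        # point estimate of the slope
s_b1 = 0.5803   # estimated standard deviation of b1
t_crit = 3.355  # t value with area .005 in the upper tail, 8 d.f. (99% interval)

margin = t_crit * s_b1
lower = b1 - margin
upper = b1 + margin

# H0: beta_1 = 0 is rejected when 0 lies outside the interval.
contains_zero = lower <= 0.0 <= upper

print(f"margin = {margin:.2f}, interval = ({lower:.2f}, {upper:.2f})")
print(f"0 inside interval: {contains_zero}")
```

Checking whether 0 falls inside the interval gives the same decision as the two-tailed t test at the matching significance level.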

Testing for Significance: F Test
• Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
• Test Statistic
$$F = \frac{\text{MSR}}{\text{MSE}}$$

• Rejection Rule
Reject $H_0$ if p-value $< \alpha$ or $F > F_\alpha$
where:
$F_\alpha$ is based on an F distribution with
1 degree of freedom in the numerator and $n - 2$ degrees of
freedom in the denominator

1. Determine the hypotheses. $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
2. Specify the level of significance. $\alpha = .05$
3. Select the test statistic. $F = \text{MSR}/\text{MSE}$, where
MSR = SSR / (number of independent variables)
4. State the rejection rule. Reject $H_0$ if p-value $< .05$ or $F > 5.32$ (with 1 d.f. in the
numerator and $n - 2 = 8$ d.f. in the denominator)

5. Compute the value of the test statistic.
For Armand’s Pizza Parlor, MSR = SSR/1 = 14,200 (there is only one
independent variable, student population), so
$$F = \frac{\text{MSR}}{\text{MSE}} = \frac{14200}{191.25} = 74.25$$
6. Determine whether to reject $H_0$.
Since $F = 74.25 > 5.32$, we reject $H_0$. The F distribution table also shows that, with
1 numerator d.f. and $10 - 2 = 8$ denominator d.f., $F = 11.26$ provides an area of .01 in the
upper tail, so the area in the upper tail beyond $F = 74.25$ (the p-value) is much
less than .01.
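The F computation can be sketched from the slide's figures (SSR = 14,200 with one independent variable, MSE = 191.25). The cutoff used here is the tabled value 11.26 quoted above (upper-tail area .01 with 1 and 8 d.f.), which is stricter than a .05 test requires. The sketch also checks a property of simple linear regression, that F equals the square of the t statistic for the slope; the inputs for t (slope 5, $\sum (x_i - \bar{x})^2 = 568$) are the values used in the t test slides. Names are illustrative.

```python
import math

# Figures quoted in the slides.
ssr = 14200.0      # sum of squares due to regression
mse = 191.25       # mean square error
k = 1              # number of independent variables
f_crit_01 = 11.26  # tabled F value: area .01 in upper tail, 1 and 8 d.f.

msr = ssr / k
f_stat = msr / mse
reject_h0 = f_stat > f_crit_01

# With a single independent variable, F equals t squared for the slope test.
b1, sxx = 5.0, 568.0
t_stat = b1 / (math.sqrt(mse) / math.sqrt(sxx))

print(f"F = {f_stat:.2f}, t^2 = {t_stat ** 2:.2f}, reject H0: {reject_h0}")
```

Because F and t squared coincide here, the t test and the F test lead to the same conclusion when there is only one independent variable.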

Using the Estimated Regression
Equation for Estimation and Prediction
• A confidence interval is an interval estimate of the mean value of y for a
given value of x.
• A prediction interval is used whenever we want to predict an individual
value of y for a new observation corresponding to a given value of x.
• The margin of error is larger for a prediction interval, because it must also
account for the variability of an individual value of y about its mean.

Using the Estimated Regression
Equation for Estimation and Prediction
• Confidence Interval Estimate of $E(y^*)$
$$\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*}$$
• Prediction Interval Estimate of $y^*$
$$\hat{y}^* \pm t_{\alpha/2}\, s_{\text{pred}}$$
where:
the confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with
$n - 2$ degrees of freedom

Confidence Interval for E(y*)
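• Estimate of the Standard Deviation of $\hat{y}^*$
$$s_{\hat{y}^*} = s \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
$$s_{\hat{y}^*} = 13.829 \sqrt{\frac{1}{10} + \frac{(10 - 14)^2}{568}}$$
$$s_{\hat{y}^*} = 13.829 \sqrt{0.1282} = 4.95$$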

Confidence Interval for E(y*)
The 95% confidence interval estimate of the mean quarterly sales for all Armand’s
restaurants located near campuses with 10,000 students is:
$$\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*}$$
110 ± 2.306(4.95) (in thousands of dollars)
110,000 ± 11,415
$98,585 to $121,415
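The confidence interval for the mean response can be reproduced with a small helper. This is a sketch: the function name `mean_response_interval` and its parameters are hypothetical, the inputs are the slide's figures (point estimate 110, s = 13.829, n = 10, x* = 10, mean of x = 14, $\sum (x_i - \bar{x})^2 = 568$), and the t value 2.306 is the tabled value used above.

```python
import math

def mean_response_interval(y_hat, s, n, x_star, x_bar, sxx, t_crit):
    """Return (standard error, lower, upper) for the mean of y at x_star."""
    se = s * math.sqrt(1.0 / n + (x_star - x_bar) ** 2 / sxx)
    margin = t_crit * se
    return se, y_hat - margin, y_hat + margin

# Slide inputs; sales are in thousands of dollars.
se_mean, lower, upper = mean_response_interval(
    y_hat=110.0, s=13.829, n=10, x_star=10.0, x_bar=14.0, sxx=568.0, t_crit=2.306
)

print(f"s_yhat = {se_mean:.2f}")
print(f"95% CI for mean sales: {lower:.3f} to {upper:.3f} (in $1000s)")
```

The endpoints differ from the slide's in the third decimal place because the slide rounds the standard error to 4.95 before multiplying.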

Prediction Interval for y*
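• Estimate of the Standard Deviation of an Individual Value of $y^*$
$$s_{\text{pred}} = s \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
$$s_{\text{pred}} = 13.829 \sqrt{1 + \frac{1}{10} + \frac{(10 - 14)^2}{568}}$$
$$s_{\text{pred}} = 13.829 \sqrt{1.1282} = 14.69$$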

Prediction Interval for y*
The 95% prediction interval for quarterly sales for the new Armand’s
restaurant located near Talbot College, a campus with 10,000 students, is:
$$\hat{y}^* \pm t_{\alpha/2}\, s_{\text{pred}}$$
110 ± 2.306(14.69) (in thousands of dollars)
110,000 ± 33,875
$76,125 to $143,875
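The prediction interval can be checked the same way, and placing it beside the standard error for the mean response shows where the extra width comes from. This is a sketch with the slide's figures; the variable names are illustrative.

```python
import math

# Slide inputs; sales are in thousands of dollars.
y_hat = 110.0   # point estimate at x* = 10
s = 13.829      # standard error of the estimate
n = 10
x_star, x_bar, sxx = 10.0, 14.0, 568.0
t_crit = 2.306  # t value with area .025 in the upper tail, 8 d.f.

leverage = 1.0 / n + (x_star - x_bar) ** 2 / sxx

se_mean = s * math.sqrt(leverage)        # for the mean of y at x*
se_pred = s * math.sqrt(1.0 + leverage)  # for an individual y at x*

margin = t_crit * se_pred
lower, upper = y_hat - margin, y_hat + margin

print(f"s_pred = {se_pred:.2f} (versus {se_mean:.2f} for the mean)")
print(f"95% prediction interval: {lower:.3f} to {upper:.3f} (in $1000s)")
```

The additional 1 under the square root represents the variance of an individual observation about its mean, which is why the prediction interval is always the wider of the two.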

Residual Analysis
• If the assumptions about the error term $\varepsilon$ appear questionable, the
hypothesis tests about the significance of the regression relationship
and the interval estimation results may not be valid.
• The residuals provide the best information about $\varepsilon$.

• Residual for observation i
$$y_i - \hat{y}_i$$
• Much of the residual analysis is based on an examination of graphical plots.
• If the assumption that the variance of $\varepsilon$ is the same for all values of x is
valid, and the assumed regression model is an adequate representation of
the relationship between the variables, then the residual plot should give
an overall impression of a horizontal band of points.
[Figure: Residual Plot Against x. Three panels, each plotting the residual $y - \hat{y}$ against x: Good Pattern; Non-constant Variance; Model Form Not Adequate.]
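The residuals that feed such a plot can be listed with a short sketch. The raw data are not given in these slides, so the ten (x, y) pairs below are an assumption: they are taken to be the textbook sample for the Armand's example, and they reproduce the deck's summary figures (slope 5, MSE = 191.25). The text output stands in for a plot; any plotting library could be used instead.

```python
# ASSUMED data (not given in the slides): x = student population (1000s),
# y = quarterly sales ($1000s).
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares fit.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual for observation i: y_i minus the fitted value.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(" x    residual")
for xi, ri in zip(x, residuals):
    print(f"{xi:2d}   {ri:8.2f}")
```

When reading the listing, look for the same features as in the figure: a roughly constant spread across x suggests constant variance, while a funnel shape or a curved trend suggests one of the two problem patterns.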

17-11-2021 (Wednesday) 2.15 PM - 4.15 PM
Thank you
