
Chapter 15 Multiple Regression Model Building

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.


Learning Objectives

In this chapter, you learn:

- To use quadratic terms in a regression model
- To use transformed variables in a regression model
- To measure the correlation among independent variables
- To build a regression model, using either the stepwise or best-subsets approach
- To avoid the pitfalls involved in developing a multiple regression model


Nonlinear Relationships

- The relationship between the dependent variable and an independent variable may not be linear
- Can review the scatter plot to check for nonlinear relationships
- Example: Quadratic model

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$$

- The second independent variable is the square of the first variable


**Quadratic Regression Model**

Model form:

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$$

where:

- β0 = Y intercept
- β1 = regression coefficient for linear effect of X on Y
- β2 = regression coefficient for quadratic effect on Y
- εi = random error in Y for observation i


**Quadratic Regression Model**

[Figure: two panels, each pairing a scatter plot of Y against X with a plot of residuals against X. In one panel, a linear fit does not give random residuals. In the other, a nonlinear fit gives random residuals.]

**Quadratic Regression Model**

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$$

Quadratic models may be considered when the scatter diagram takes on one of the following shapes:

[Figure: four panels of Y against X1, one for each sign combination: β1 < 0, β2 > 0; β1 > 0, β2 > 0; β1 < 0, β2 < 0; β1 > 0, β2 < 0.]

- β1 = the coefficient of the linear term
- β2 = the coefficient of the squared term


**Quadratic Regression Equation**

Collect data and use Excel to calculate the regression equation:

$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{1i}^2$$

Test for Overall Relationship

- H0: β1 = β2 = 0 (no overall relationship between X and Y)
- H1: β1 and/or β2 ≠ 0 (there is a relationship between X and Y)

$$F\text{-test statistic} = \frac{MSR}{MSE}$$
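As a worked reminder of what goes into this ratio, the mean squares can be written out in terms of the sums of squares. The symbols SSR, SSE, n and k (number of independent variables) follow the usual regression conventions and are assumed here rather than defined on this slide; for the quadratic model k = 2.

$$F = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)} = \frac{SSR / 2}{SSE / (n - 3)}$$

Under H0 this statistic is compared with an F distribution having 2 and n − 3 degrees of freedom.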


**Testing for Significance: Quadratic Effect**

Testing the Quadratic Effect

Compare the quadratic regression equation

$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{1i}^2$$

with the linear regression equation

$$\hat{Y}_i = b_0 + b_1 X_{1i}$$


**Testing for Significance: Quadratic Effect**

Testing the Quadratic Effect

Consider the quadratic regression equation

$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{1i}^2$$

Hypotheses

- H0: β2 = 0 (The quadratic term does not improve the model)
- H1: β2 ≠ 0 (The quadratic term improves the model)


**Testing for Significance: Quadratic Effect**

Testing the Quadratic Effect

Hypotheses

- H0: β2 = 0 (The quadratic term does not improve the model)
- H1: β2 ≠ 0 (The quadratic term improves the model)

The test statistic is

$$t = \frac{b_2 - \beta_2}{S_{b_2}}, \qquad \text{d.f.} = n - 3$$

where:

- b2 = estimated slope
- β2 = hypothesized slope (zero)
- Sb2 = standard error of the slope

**Testing for Significance: Quadratic Effect**

Testing the Quadratic Effect

If the t test for the quadratic effect is significant, keep the quadratic term in the model.
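The decision rule can be sketched in a few lines of Python. This is only an illustration: the values of b2 and its standard error are the Time-squared figures reported in this chapter's purity vs. filter-time example (n = 14), while the 0.05 significance level and the critical value 2.2010 (two-tail, 11 d.f., from a standard t table) are assumptions of the sketch, not values given on the slides.

```python
# Time-squared row reported in the chapter's purity vs. filter-time example.
b2 = 0.24516        # estimated coefficient of the squared term
s_b2 = 0.03258      # standard error of b2
beta2_null = 0.0    # hypothesized value of beta_2 under H0
n = 14              # number of observations in the example

t_stat = (b2 - beta2_null) / s_b2
df = n - 3          # n minus (two slope coefficients + intercept)

# Assumed: two-tail critical value for alpha = 0.05 with 11 d.f.,
# read from a standard t table (not provided in the slides).
t_crit = 2.2010

keep_quadratic_term = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, d.f. = {df}, keep quadratic term: {keep_quadratic_term}")
```

Because the inputs are rounded, the computed t differs slightly in the third decimal from the 7.52406 shown in the Excel output.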


**Quadratic Regression Example**

Purity increases as filter time increases:

| Purity | Filter Time |
|---|---|
| 3 | 1 |
| 7 | 2 |
| 8 | 3 |
| 15 | 5 |
| 22 | 7 |
| 33 | 8 |
| 40 | 10 |
| 54 | 12 |
| 67 | 13 |
| 70 | 14 |
| 78 | 15 |
| 85 | 15 |
| 87 | 16 |
| 99 | 17 |

[Figure: scatter plot "Purity vs. Time", with Purity (0 to 100) on the vertical axis and Time (0 to 20) on the horizontal axis.]

**Quadratic Regression Example**

Simple (linear) regression results:

$$\hat{Y} = -11.283 + 5.985\,\text{Time}$$

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | -11.28267 | 3.46805 | -3.25332 | 0.00691 |
| Time | 5.98520 | 0.30966 | 19.32819 | 2.078E-10 |

| Regression Statistics | |
|---|---|
| R Square | 0.96888 |
| Adjusted R Square | 0.96628 |
| Standard Error | 6.15997 |
| F | 373.57904 |
| Significance F | 2.0778E-10 |

The t statistic, F statistic, and adjusted r² are all high, but the residuals are not random:

[Figure: "Time Residual Plot", residuals (-10 to 10) against Time (0 to 20).]

**Quadratic Regression Example**

Quadratic regression results:

$$\hat{Y} = 1.539 + 1.565\,\text{Time} + 0.245\,(\text{Time})^2$$

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 1.53870 | 2.24465 | 0.68550 | 0.50722 |
| Time | 1.56496 | 0.60179 | 2.60052 | 0.02467 |
| Time-squared | 0.24516 | 0.03258 | 7.52406 | 1.165E-05 |

| Regression Statistics | |
|---|---|
| R Square | 0.99494 |
| Adjusted R Square | 0.99402 |
| Standard Error | 2.59513 |
| F | 1080.7330 |
| Significance F | 2.368E-13 |

[Figure: "Time Residual Plot", residuals (-5 to 10) against Time (0 to 20).]

[Figure: "Time-squared Residual Plot", residuals (-5 to 10) against Time-squared (0 to 400).]

The quadratic term is significant and improves the model: adjusted r² is higher and SYX is lower, and the residuals are now random.
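The comparison can be reproduced outside Excel. The following is a minimal sketch in plain Python with no external libraries; the helper names `solve` and `fit` are illustrative and do not come from the slides. It refits both models to the 14 purity and filter-time observations from this example by solving the least-squares normal equations, then reports the coefficients, adjusted r² and SYX for each.

```python
import math

# Data from the purity vs. filter-time example in this chapter.
time = [1, 2, 3, 5, 7, 8, 10, 12, 13, 14, 15, 15, 16, 17]
purity = [3, 7, 8, 15, 22, 33, 40, 54, 67, 70, 78, 85, 87, 99]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(columns, y):
    """Least-squares fit of y on an intercept plus the given predictor columns.

    Returns (coefficients, adjusted r-squared, standard error of the estimate).
    """
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    n, k = len(y), len(columns)
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    s_yx = math.sqrt(sse / (n - k - 1))
    return coef, adj_r2, s_yx


linear = fit([time], purity)
quadratic = fit([time, [t ** 2 for t in time]], purity)

for name, (coef, adj_r2, s_yx) in [("linear", linear), ("quadratic", quadratic)]:
    print(name, [round(c, 5) for c in coef], round(adj_r2, 5), round(s_yx, 5))
```

Solving the normal equations directly is adequate for a small, well-conditioned problem like this one; for larger or nearly collinear designs a QR-based solver would be the safer choice.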


**Using Transformations in Regression Analysis**

Idea:

- Non-linear models can often be transformed to a linear form
  - Can be estimated by least squares if transformed
- Transform X or Y or both to get a better fit or to deal with violations of regression assumptions
- Can be based on theory, logic or scatter plots


**The Square Root Transformation**

The square-root transformation

$$Y_i = \beta_0 + \beta_1 \sqrt{X_{1i}} + \varepsilon_i$$

Used to

- overcome violations of the equal variance assumption
- fit a non-linear relationship


**The Square Root Transformation**

$$Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$$

$$Y_i = \beta_0 + \beta_1 \sqrt{X_{1i}} + \varepsilon_i$$

[Figure: plots of Y against X for b1 > 0 and for b1 < 0, in two columns: "Shape of original relationship" and "Relationship when transformed".]

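**The Log Transformation**

The Multiplicative Model:

- Original multiplicative model

$$Y_i = \beta_0 X_{1i}^{\beta_1} \varepsilon_i$$

- Transformed multiplicative model

$$\log Y_i = \log \beta_0 + \beta_1 \log X_{1i} + \log \varepsilon_i$$

The Exponential Model:

- Original exponential model

$$Y_i = e^{\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}} \varepsilon_i$$

- Transformed exponential model

$$\ln Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ln \varepsilon_i$$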
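To see the multiplicative transformation at work, here is a minimal Python sketch. The data are hypothetical and noise-free, generated from Y = 2·X^1.5, so the log-log least-squares fit should recover β1 = 1.5 and β0 = 2; none of these numbers come from the slides. Natural logs are used, which is an arbitrary choice: the slope estimate is the same for any log base.

```python
import math

# Hypothetical, noise-free data generated from Y = 2 * X**1.5.
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [2.0 * xi ** 1.5 for xi in x]

# Transform both variables, then fit a simple linear regression
# of log(Y) on log(X) by least squares.
log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]

n = len(x)
mean_lx = sum(log_x) / n
mean_ly = sum(log_y) / n
sxy = sum((a - mean_lx) * (b - mean_ly) for a, b in zip(log_x, log_y))
sxx = sum((a - mean_lx) ** 2 for a in log_x)

b1 = sxy / sxx                    # estimate of the exponent beta_1
log_b0 = mean_ly - b1 * mean_lx   # intercept on the log scale
b0 = math.exp(log_b0)             # back-transform to estimate beta_0

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```

With real data the error term is not zero, so the estimates would only approximate the underlying parameters.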


Interpretation of Coefficients

For the multiplicative model:

$$\log Y_i = \log \beta_0 + \beta_1 \log X_{1i} + \log \varepsilon_i$$

When both dependent and independent variables are transformed:

- The coefficient of the independent variable Xk can be interpreted as follows: a 1 percent change in Xk leads to an estimated bk percentage change in the mean value of Y.
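A short derivation shows where the percentage interpretation comes from. It is a sketch that ignores the error term and assumes the change in X1 is small; the labels "new" and "old" are introduced here for illustration. If X1 increases by 1 percent, the multiplicative model gives

$$\frac{Y_{\text{new}}}{Y_{\text{old}}} = \frac{\beta_0 (1.01\,X_{1})^{\beta_1}}{\beta_0 X_{1}^{\beta_1}} = 1.01^{\beta_1} \approx 1 + 0.01\,\beta_1$$

so Y changes by approximately β1 percent. The approximation is good when 0.01·β1 is small.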


Collinearity

- Collinearity: High correlation exists among two or more independent variables
- The correlated variables contribute redundant information to the multiple regression model


Collinearity

Including two highly correlated independent variables can adversely affect the regression results:

- No new information provided
- Can lead to unstable coefficients (large standard error and low t-values)
- Coefficient signs may not match prior expectations


**Some Indications of Strong Collinearity**

- Incorrect signs on the coefficients
- Large change in the value of a previous coefficient when a new variable is added to the model
- A previously significant variable becomes insignificant when a new independent variable is added


**Detecting Collinearity: Variance Inflationary Factor**

VIFj is used to measure collinearity:

$$VIF_j = \frac{1}{1 - R_j^2}$$

where $R_j^2$ is the coefficient of determination from a regression model that uses Xj as the dependent variable and all other X variables as the independent variables

If VIFj > 5, Xj is highly correlated with the other independent variables
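The VIF definition translates directly into code: regress each Xj on the remaining predictors and plug the resulting R² into the formula. The sketch below uses plain Python; the helper names (`solve`, `r_squared`, `vif`) and the two small predictor columns are hypothetical and chosen only for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R-squared from a least-squares fit of y on an intercept plus the given columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


def vif(predictors):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the other X's."""
    result = []
    for j, target in enumerate(predictors):
        others = predictors[:j] + predictors[j + 1:]
        result.append(1.0 / (1.0 - r_squared(others, target)))
    return result


# Hypothetical predictor columns (not from the slides).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]

vifs = vif([x1, x2])
for name, value in zip(["X1", "X2"], vifs):
    print(name, round(value, 4), "high" if value > 5 else "ok")
```

With exactly two predictors both VIFs are equal, because each auxiliary R² is the squared correlation between the pair. The function accepts any number of predictor columns, but it will fail with a division by zero if one column is an exact linear combination of the others.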


Model Building

- Goal is to develop a model with the best set of independent variables
  - Easier to interpret if unimportant variables are removed
  - Lower probability of collinearity
- Stepwise regression procedure
  - Provide evaluation of alternative models as variables are added
- Best-subset approach
  - Try all combinations and select the model with the highest adjusted r²


Stepwise Regression

- Idea: develop the least squares regression equation in steps, adding one independent variable at a time and evaluating whether existing variables should remain or be removed
- The coefficient of partial determination is the measure of the marginal contribution of each independent variable, given that other independent variables are in the model


**Best Subsets Regression**

- Idea: estimate all possible regression equations using all possible combinations of independent variables
- Choose the best model by looking for the highest adjusted r²
  - The model with the largest adjusted r² will also have the smallest SYX

Stepwise regression and best subsets regression can be performed using Excel with the PHStat add-in
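For readers working outside Excel, the best-subsets idea can be sketched in plain Python: enumerate every non-empty subset of predictors, fit each by least squares, and rank by adjusted r². The data set, the predictor names X1 to X3, and the helpers `solve` and `adjusted_r2` are hypothetical; this illustrates the enumeration and is not PHStat's implementation.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def adjusted_r2(columns, y):
    """Adjusted r-squared from a least-squares fit of y on an intercept plus columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    n, k = len(y), len(columns)
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - (sse / sst) * (n - 1) / (n - k - 1)


# Hypothetical data (not from the slides).
predictors = {
    "X1": [1, 2, 3, 4, 5, 6, 7, 8],
    "X2": [2, 1, 4, 3, 6, 5, 8, 7],
    "X3": [1, 0, 0, 1, 1, 0, 1, 0],
}
y = [3.1, 3.9, 7.2, 7.8, 11.1, 10.8, 15.2, 14.9]

# Fit every non-empty subset of predictors and rank by adjusted r-squared.
names = sorted(predictors)
results = []
for size in range(1, len(names) + 1):
    for subset in combinations(names, size):
        cols = [predictors[name] for name in subset]
        results.append((adjusted_r2(cols, y), subset))
results.sort(reverse=True)

for score, subset in results:
    print(f"{score:.4f}  {', '.join(subset)}")
```

The number of candidate models doubles with each added predictor (2^p − 1 in total), which is why exhaustive enumeration is practical only for a modest number of independent variables.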


**Alternative Best Subsets Criterion**

- Calculate the value Cp for each potential regression model
- Consider models with Cp values close to or below k + 1
  - k is the number of independent variables in the model under consideration


**Alternative Best Subsets Criterion**

The Cp Statistic

$$C_p = \frac{(1 - R_k^2)(n - T)}{1 - R_T^2} - \bigl(n - 2(k + 1)\bigr)$$

where:

- k = number of independent variables included in a particular regression model
- T = total number of parameters to be estimated in the full regression model
- $R_k^2$ = coefficient of multiple determination for model with k independent variables
- $R_T^2$ = coefficient of multiple determination for full model with all T estimated parameters
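The Cp formula is simple enough to evaluate by hand, but a small function makes the "close to or below k + 1" screening easy to apply across many candidate models. The sketch below is illustrative: the function name `mallows_cp` and all numeric inputs (n = 30, a full model with 4 predictors so T = 5, and the R² values) are hypothetical.

```python
def mallows_cp(r2_k, r2_full, n, k, t):
    """C_p for a model with k independent variables.

    r2_k    : R-squared of the candidate model
    r2_full : R-squared of the full model
    n       : number of observations
    t       : number of parameters in the full model (including the intercept)
    """
    return (1 - r2_k) * (n - t) / (1 - r2_full) - (n - 2 * (k + 1))


# Hypothetical inputs: 30 observations, full model with 4 predictors (T = 5).
n, t, r2_full = 30, 5, 0.80
candidates = {2: 0.78, 3: 0.795, 4: 0.80}   # k -> R-squared of that candidate

for k, r2_k in candidates.items():
    cp = mallows_cp(r2_k, r2_full, n, k, t)
    print(f"k = {k}: Cp = {cp:.3f}, k + 1 = {k + 1}, at or below k + 1: {cp <= k + 1}")
```

For the full model itself (k = T − 1 and the same R²), the formula always returns exactly k + 1, so Cp is informative only for comparing the smaller candidate models.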


**8 Steps in Model Building**

1. Choose independent variables to include in the model
2. Estimate full model and check if any VIFs > 5
   - If no VIF > 5, go to step 3
   - If one VIF > 5, remove this variable
   - If more than one, eliminate the variable with the highest VIF and repeat step 2
3. Perform best subsets regression with remaining variables.
4. List all models with Cp close to or less than (k + 1).


**8 Steps in Model Building**

5. Choose the best model.
   - Consider parsimony.
   - Do extra variables make a significant contribution?
6. Perform complete analysis with chosen model, including residual analysis.
7. Transform the model if necessary to deal with violations of linearity or other model assumptions.
8. Use the model for prediction and inference.


Model Validation

The final step in the model-building process is to validate the selected regression model.

- Collect new data and compare the results.
- Compare the results of the regression model to previous results.
- If the data set is large, split the data into two parts and cross-validate the results.


**Model Building Flowchart**

1. Choose X1, X2, …, Xk
2. Run regression to find VIFs
3. Any VIF > 5?
   - Yes: More than one? If yes, remove variable with highest VIF; if no, remove this X. Then run the regression again.
   - No: continue
4. Run best subsets regression to obtain "best" models in terms of Cp
5. Do complete analysis
6. Add quadratic term and/or transform variables as indicated
7. Perform predictions

**Pitfalls and Ethical Considerations**

To avoid pitfalls and address ethical considerations:

- Understand that interpretation of the estimated regression coefficients is performed holding all other independent variables constant
- Evaluate residual plots for each independent variable
- Evaluate interaction terms


**Pitfalls and Ethical Considerations**

To avoid pitfalls and address ethical considerations:

- Obtain VIFs for each independent variable before determining which variables should be included in the model.
- Examine several alternative models using best-subsets regression.
- Use other methods when the assumptions necessary for least-squares regression have been seriously violated.


Chapter Summary

In this chapter, we have

- Developed the quadratic regression model
- Discussed using transformations in regression models
  - The multiplicative model
  - The exponential model
- Described collinearity
- Discussed model building
  - Stepwise regression
  - Best subsets
- Addressed pitfalls in multiple regression and ethical considerations



This material is teaching material that supplements the e-materials for the business statistics course.

Levine, D. M., Stephan, D. F., Krehbiel, T. C. & Berenson, M. L. (2008). Statistics for Managers Using Microsoft Excel. Pearson.
