
Marketing Research

Ninth Edition

Chapter 15

Understanding
Regression Analysis
Basics

Learning Objectives (1 of 2)
In this chapter you will learn
15.1 What bivariate linear regression analysis is, including
basic concepts such as terms, assumptions, and equations
15.2 What multiple regression analysis is, including the basic
underlying conceptual model, terms, assumptions, and
computations, including how to do it with SPSS
15.3 Special uses of multiple regression, including “dummy”
variables, standardized betas, and using multiple regression
as a screening device
15.4 What stepwise multiple regression is, including how to
do it with SPSS
Learning Objectives (2 of 2)
In this chapter you will learn
15.5 Some warnings regarding the use of multiple regression
analysis
15.6 How to report multiple regression analysis insights to
clients

Where We Are
1. Establish the need for marketing research.
2. Define the problem.
3. Establish research objectives.
4. Determine research design.
5. Identify information types and sources.
6. Determine methods of accessing data.
7. Design data collection forms.
8. Determine the sample plan and size.
9. Collect data.
10. Analyze data.
11. Communicate insights.

Bivariate Linear Regression Analysis (1 of 2)
• Regression analysis is a predictive analysis technique in
which one or more variables are used to predict the level of
another by use of the straight-line formula.
• Bivariate regression means only two variables are being
analyzed, and researchers sometimes refer to this case as
“simple regression”.

Bivariate Linear Regression Analysis (2 of 2)
• With bivariate analysis, one variable is used to predict another
variable.
• The straight-line equation is the basis of regression analysis.
• Regression is directly related to correlation.
y = a + bx

where
y = the predicted variable
x = the variable used to predict y
a = the intercept, or point where the line cuts the y axis when x = 0
b = the slope or the change in y for any 1-unit change in x.
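As a minimal illustration of this straight-line prediction in Python, with hypothetical intercept, slope, and x values chosen only for the example:

# Hypothetical bivariate regression line: y = a + b*x
a = 3.0   # intercept: value of y when x = 0 (assumed for illustration)
b = 0.5   # slope: change in y for a 1-unit change in x (assumed)
x = 10.0  # a value of the predictor variable

y_hat = a + b * x   # predicted y
print(y_hat)        # 8.0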

Figure 15.1

Basic Regression Analysis Concepts
• Independent variable: used to predict the dependent
variable (x in the regression straight-line equation)
• Dependent variable: that which is predicted (y in the
regression straight-line equation)

Computing the Slope and the Intercept
• Least squares criterion: used in regression analysis;
guarantees that the “best” straight-line slope and intercept
will be calculated

Formula for the Slope, b
Formula for b, the slope, in bivariate regression
b = \frac{n\sum_{i=1}^{n} x_i y_i \;-\; \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n\sum_{i=1}^{n} x_i^{2} \;-\; \left(\sum_{i=1}^{n} x_i\right)^{2}}

where
xi = an x variable value
yi = a y value paired with each xi value
n = the number of pairs
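A minimal Python sketch of this least squares calculation, using the slope formula above together with the standard intercept formula a = mean(y) − b × mean(x); the data pairs are hypothetical:

# Hypothetical paired data (x, y) for illustration
xs = [1, 2, 3, 4, 5]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Slope from the least squares formula shown above
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Intercept: a = mean(y) - b * mean(x)
a = sum_y / n - b * sum_x / n

print(round(b, 3), round(a, 3))  # approximately 0.99 and 1.05 for this data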

Improving Regression Analysis
• Identify any outlier: a data point that is substantially
outside the normal range of the data points being
analyzed (one way to flag outliers is sketched below).
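One simple way to flag outliers, sketched in Python under the assumption that a line has already been fitted; the data and the two-standard-deviation cutoff are hypothetical choices:

import statistics

# Hypothetical data and an already-fitted line y = a + b*x
a, b = 1.0, 2.0
xs = [1, 2, 3, 4, 5, 6]
ys = [3.1, 5.2, 6.9, 9.1, 20.0, 13.0]   # the 20.0 sits far off the line

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
cutoff = 2 * statistics.stdev(residuals)   # illustrative cutoff, not a fixed rule
outliers = [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) > cutoff]
print(outliers)   # points flagged for inspection before rerunning the regression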

Figure 15.2

Multiple Regression Analysis
• Multiple regression analysis uses the same concepts as
bivariate regression analysis, but uses more than one
independent variable.
• A general conceptual model identifies independent and
dependent variables and shows their basic relationships to
one another.

Figure 15.3

Multiple Regression Analysis Described
• Multiple regression means that you have more than one
independent variable to predict a single dependent
variable.
• With multiple regression, the result is a regression plane
rather than a line; the plane describes how the dependent
variable is shaped by the combined independent variables.

Basic Assumptions in Multiple
Regression (1 of 2)
Multiple regression equation
y = a + b1x1 + b2x2 + b3x3 + ... + bmxm

where
y = the dependent, or predicted, variable
xi = independent variable i
a = the intercept
bi = the slope for independent variable i
m = the number of independent variables in the equation
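A minimal Python sketch of how this equation produces a prediction; the function below is a hypothetical generic helper (not from the text) that the Lexus example on the following slides could reuse:

# y = a + b1*x1 + b2*x2 + ... + bm*xm  (generic multiple regression prediction)
def predict(a, bs, xs):
    """Return predicted y given intercept a, slopes bs, and variable values xs (hypothetical helper)."""
    assert len(bs) == len(xs), "one slope per independent variable"
    return a + sum(b * x for b, x in zip(bs, xs))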

Example of Multiple Regression (1 of 5)
• We wish to predict customers’ intentions to purchase a
Lexus automobile.
• We performed a survey that included an attitude-toward-
Lexus variable, a word-of-mouth variable and an income
variable.

Example of Multiple Regression (2 of 5)
The resultant Lexus purchase intention multiple regression equation:

Intention to purchase a Lexus = 2
  + 1.0 × attitude toward Lexus (1–5 scale)
  − .5 × attitude toward current auto (1–5 scale)
  + 1.0 × income level (1–10 scale)

Notes: a = 2, b1 = 1.0, b2 = −.5, b3 = 1.0

Example of Multiple Regression (3 of 5)
• This multiple regression equation means that we can
predict a consumer’s intention to buy a Lexus if we
know three variables:
– Attitude toward Lexus
– Friends’ negative comments about Lexus
– Income level, using a scale with 10 income levels.

Example of Multiple Regression (4 of 5)
• Calculation of one buyer’s Lexus purchase intention using
the multiple regression equation:

Intention to purchase a Lexus = 2
  + 1.0 × 4
  − .5 × 3
  + 1.0 × 5
  = 9.5

Notes: intercept = 2; attitude toward Lexus (x1) = 4; attitude toward current auto (x2) = 3; income level (x3) = 5
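A quick Python check of the 9.5 figure, plugging the example’s hypothetical coefficients and this buyer’s values into the multiple regression equation:

# Coefficients from the example equation: a = 2, b1 = 1.0, b2 = -0.5, b3 = 1.0
a = 2.0
bs = [1.0, -0.5, 1.0]
xs = [4, 3, 5]   # attitude toward Lexus, attitude toward current auto, income level

intention = a + sum(b * x for b, x in zip(bs, xs))
print(intention)  # 9.5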

Example of Multiple Regression (5 of 5)
• Multiple regression is a powerful tool because it tells us:
– Which factors predict the dependent variable
– Which way (the sign) each factor influences the
dependent variable
– How much (the size of bi) each factor influences it

Multiple R (1 of 3)
• Multiple R, also called the coefficient of determination,
is a measure of the strength of the overall linear
relationship in multiple regression.
• It indicates how well the independent variables can
predict the dependent variable.

Multiple R (2 of 3)
• Multiple R ranges from 0 to +1 and represents the
proportion of the variation in the dependent variable that
is “explained,” or accounted for, by the combined
independent variables.

Multiple R (3 of 3)
• Researchers convert the Multiple R into a percentage: a
Multiple R of .75 means that the regression findings
explain 75% of the variation in the dependent variable (a
small computational sketch follows).
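A minimal Python sketch of the “variance explained” idea using hypothetical observed and predicted values; here R Square is one minus the ratio of unexplained to total variation, and Multiple R is taken as its square root:

import math

# Hypothetical observed and model-predicted values of the dependent variable
y_obs  = [3.0, 5.0, 4.0, 6.0, 7.0]
y_pred = [3.2, 4.6, 4.3, 6.1, 6.8]

mean_y = sum(y_obs) / len(y_obs)
ss_total    = sum((y - mean_y) ** 2 for y in y_obs)             # total variation in y
ss_residual = sum((y - p) ** 2 for y, p in zip(y_obs, y_pred))  # unexplained variation

r_square   = 1 - ss_residual / ss_total   # proportion of variation explained
multiple_r = math.sqrt(r_square)
print(round(r_square, 2), round(multiple_r, 2))   # about 0.97 and 0.98 for this data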

Basic Assumptions in Multiple
Regression (2 of 2)
• Each independent variable included must preserve the
assumptions of multiple regression analysis. This is
sometimes known as additivity because each new
independent variable is added to the regression equation.

Basic Assumptions of Multiple
Regression (1 of 2)
• Independence assumption: the independent variables
must be statistically independent and uncorrelated with
one another (the presence of strong correlations among
independent variables is called multicollinearity)

Basic Assumptions of Multiple
Regression (2 of 2)
• Variance inflation factor (VIF): can be used to assess
and eliminate multicollinearity
– VIF is a statistical value that identifies what
independent variable(s) contribute to multicollinearity
and should be removed
– Any variable with VIF of greater than 10 should be
removed
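A sketch of how a VIF can be computed by hand, assuming numpy and hypothetical data: regress each independent variable on the others and use VIF = 1 / (1 − R²). Statistical packages report VIF directly, so this only shows the mechanics:

import numpy as np

# Hypothetical independent-variable data: columns x1, x2, x3 (x3 is close to x1 + x2, so VIFs are inflated)
X = np.array([
    [1.0, 2.0, 3.1],
    [2.0, 1.0, 3.0],
    [3.0, 4.0, 7.2],
    [4.0, 3.0, 6.9],
    [5.0, 6.0, 11.1],
    [6.0, 5.0, 11.0],
])

def vif(X, j):
    """VIF for column j: 1 / (1 - R^2) from regressing x_j on the other columns."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    others = np.column_stack([np.ones(len(y)), others])   # add an intercept column
    coef, *_ = np.linalg.lstsq(others, y, rcond=None)
    resid = y - others @ coef
    r2 = 1 - resid.dot(resid) / ((y - y.mean()) ** 2).sum()
    return 1.0 / (1.0 - r2)

for j in range(X.shape[1]):
    print(j, round(vif(X, j), 1))   # columns with VIF above 10 are candidates for removal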

Figure 15.4

Figure 15.5

“Trimming” the Regression
• A trimmed regression means that you eliminate the
nonsignificant independent variables and then rerun the
regression.
• Run trimmed regressions iteratively until all betas are
significant (a sketch of this loop follows).
• The resultant regression model expresses the salient
independent variables.
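A sketch of the trimming loop, assuming the statsmodels and pandas libraries and hypothetical variable names; each pass drops the least significant independent variable and reruns the regression until every remaining beta is significant:

import pandas as pd
import statsmodels.api as sm

def trim_regression(y, X, alpha=0.05):
    """Iteratively drop the least significant predictor until all remaining betas have Sig. <= alpha."""
    X = X.copy()
    while True:
        model = sm.OLS(y, sm.add_constant(X)).fit()
        pvals = model.pvalues.drop("const")       # Sig. levels of the predictors only
        worst = pvals.idxmax()
        if pvals[worst] <= alpha or X.shape[1] == 1:
            return model                          # all betas significant (or one predictor left)
        X = X.drop(columns=[worst])               # trim the weakest predictor and rerun

Here y would be a pandas Series holding the dependent variable and X a DataFrame of candidate independent variables.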

Figure 15.6

Example of Trimmed Regression (1 of 2)
(Beta and Sig. values from the untrimmed and trimmed multiple regressions)

Independent variables        Untrimmed Beta   Untrimmed Sig.   Trimmed Beta   Trimmed Sig.
Conference affiliation
  ACC                          .37              .25              –*             –*
  BigTen                       .95              .01              .89            .01
  Big12                        .92              .01              .82            .01
  SEC                          .76              .02              .70            .01
  PAC                          .82              .01              .70            .01
Tickets Included               .22              .08              –              –
Other Events Included          .04              .79              –              –
Parking Included              −.23              .26              –              –
Food & Beverage Included      −.03              .81              –              –
Private/Public Institution    −.22              .47              –              –
Facility Age                   .27              .45              –              –
Renovation                    −.04              .52              –              –

Example of Trimmed Regression (2 of 2)
Table [Continued]
Independent variables            Untrimmed Beta   Untrimmed Sig.   Trimmed Beta   Trimmed Sig.
Number of Suites                  −.06              .60              –              –
Suite Capacity                     .43              .04              .45            .01
Winning Percentage                 .44              .04              .40            .01
County Population                 −.17              .02             −.13            .03
Per Capita County Income           .94              .03              .78            .02
College Basketball Competition     .21              .09              .17            .04
Institution Enrollment            −.12              .61              –              –
Constant                          0.66                              2.55

*Nonsignificant betas (sig > .05) and significance levels not reported

Special Uses of Multiple Regression
• Dummy independent variable: scales with a nominal
0-versus-1 coding scheme
• Using standardized betas to compare independent
variables: allows direct comparison of each
independent variable
• Using multiple regression as a screening device:
identify variables to exclude
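A brief Python sketch of two of these special uses with hypothetical data: 0-versus-1 dummy coding of a nominal variable, and a standardized beta computed as the raw beta times the ratio of the standard deviations of x and y:

import statistics

# Dummy coding: turn a nominal variable (buyer type) into a 0-versus-1 independent variable
buyer_type = ["online", "in-store", "online", "online", "in-store"]   # hypothetical values
online_dummy = [1 if b == "online" else 0 for b in buyer_type]

# Standardized beta: raw slope times (std. dev. of x / std. dev. of y), hypothetical numbers
b_income = 1.0                      # raw beta for income from a fitted equation (assumed)
income   = [3, 5, 4, 7, 6]          # hypothetical income-level values
purchase = [4, 9, 6, 12, 10]        # hypothetical dependent-variable values
std_beta = b_income * statistics.stdev(income) / statistics.stdev(purchase)
print(online_dummy, round(std_beta, 2))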

Stepwise Multiple Regression (1 of 2)
• Stepwise regression is useful when there are many
independent variables, and a researcher wants to narrow
the set down to a smaller number of statistically significant
variables.

Stepwise Multiple Regression (2 of 2)
• The one independent variable that is statistically
significant and explains the most variance is entered first
into the multiple regression equation.
• Then, each statistically significant independent variable is
added in order of variance explained.
• All insignificant independent variables are excluded.
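A sketch of the forward-selection logic behind stepwise regression, assuming statsmodels and pandas and hypothetical column names; statistical packages such as SPSS automate this, so the code only illustrates the ordering described above:

import pandas as pd
import statsmodels.api as sm

def stepwise_forward(y, X, alpha=0.05):
    """Enter significant predictors one at a time, best-fitting first; stop when none qualifies."""
    chosen, final_fit = [], None
    remaining = list(X.columns)
    while remaining:
        best_var, best_fit = None, None
        for var in remaining:
            fit = sm.OLS(y, sm.add_constant(X[chosen + [var]])).fit()
            if fit.pvalues[var] <= alpha and (best_fit is None or fit.rsquared > best_fit.rsquared):
                best_var, best_fit = var, fit
        if best_var is None:              # no remaining variable enters significantly
            break
        chosen.append(best_var)
        remaining.remove(best_var)
        final_fit = best_fit              # model containing all variables entered so far
    return chosen, final_fit              # may be ([], None) if nothing is significant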

Table 15.2 Step-By-Step Procedure for Multiple
Regression Analysis Using SPSS (1 of 2)
Step 1. Choose the dependent variable and independent variables.
The dependent variable (y) is the predicted variable, and the independent variables (xi's) are used to predict y. Typically, both y and x variables are scale variables (interval or ratio scales), although some dummy independent variables are allowable.

Step 2. Determine if a linear relationship exists in the population (using 95% level of confidence).
From the initial SPSS output, the ANOVA table reports a computed F value and associated Sig. level.
a. If the Sig. value is .05 or less, there is a linear relationship among the chosen variables in the population. Go to Step 3.
b. If the Sig. value is more than .05, there is no linear relationship among the chosen variables in the population. Return to Step 1 with a new set of variables, or stop.

Step 3. Determine if the chosen independent variables are statistically significant (using 95% level of confidence).
Also look at the Sig. level for the computed beta coefficient for each associated independent variable.
a. If the Sig. level is .05 or less, it is permissible to use the associated independent variable to predict the dependent variable with the y = a + bx linear equation.
b. If the Sig. level is more than .05, it is not permissible to use the associated independent variable to predict the dependent variable.
c. If you find a mixture of a. and b., you should do “trimmed” or stepwise multiple regression analysis (see the text on these techniques).

Table 15.2 Step-By-Step Procedure for Multiple
Regression Analysis Using SPSS (2 of 2)
Step 4. Determine the strength of the relationship(s) in the linear model.
In the SPSS output Model Summary table, R Square is the square of the correlation coefficient, and the Adjusted R Square reduces the R Square by taking into account the sample size and number of parameters estimated. Use Adjusted R Square as a measure of the “percent variance explained” in the y variable using the linear equation to predict y.

Step 5. Interpret the findings.
With a result where only statistically significant independent variables are used in the analysis, use the standardized betas’ magnitudes and signs. Then assess each independent variable’s relative importance and relationship direction with the dependent variable.
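For readers working outside SPSS, a rough statsmodels sketch of Steps 2 through 4 with a hypothetical data set: the overall F-test Sig., each beta’s Sig., and Adjusted R Square can all be read from one fitted model:

import pandas as pd
import statsmodels.api as sm

# Hypothetical survey data with a dependent variable and three predictors
df = pd.DataFrame({
    "intention":     [3, 6, 4, 7, 8, 5, 9, 6],
    "attitude":      [2, 4, 3, 4, 5, 3, 5, 4],
    "word_of_mouth": [4, 2, 3, 1, 1, 3, 2, 2],
    "income":        [3, 5, 4, 6, 7, 4, 8, 6],
})

model = sm.OLS(df["intention"],
               sm.add_constant(df[["attitude", "word_of_mouth", "income"]])).fit()

print(model.f_pvalue)        # Step 2: Sig. of the overall F test (linear relationship?)
print(model.pvalues)         # Step 3: Sig. of each beta coefficient
print(model.rsquared_adj)    # Step 4: Adjusted R Square, the "percent variance explained"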

Three Warnings Regarding Multiple
Regression Analysis
• Regression is a statistical tool, not a cause-and-effect
statement.
• Regression analysis should not be applied outside the
boundaries of data used to develop the regression model.
• Chapter 15 is simplified…regression analysis is complex
and requires additional study.

Table 15.3 Regression Analysis Basic
Concepts (1 of 2)
Concept Explanation
Regression analysis An analytic technique using the straight-line relationship of y = a + bx
Intercept The constant, or a, in the straight-line relationship that is the value of y when x = 0
Slope The b, or the amount of change in y for a one-unit change in x
Dependent variable y, the variable that is being estimated by the x(s) or independent variable(s)
Independent variable(s) The x variable(s) used in the straight-line equation to estimate y
Least squares criterion A statistical procedure that assures that the computed regression equation is the
best one possible for the data being used

R Square A number ranging from 0 to 1.0 that reveals how well the straight-line model fits the
scatter of data points; the higher, the better

Multiple regression analysis A powerful form of regression where more than one x variable is in the regression
equation

Additivity A statistical assumption that allows more than one x variable to be used in a multiple regression
equation by adding them in the form y = a + b1x1 + b2x2 + ... + bmxm

Independence assumption A statistical requirement that when more than one x variable is used, no pair of x
variables has a high correlation

Table 15.3 Regression Analysis Basic
Concepts (2 of 2)
Concept Explanation
Multicollinearity The term used to denote a violation of the independence assumption that causes
regression results to be in error
Variance inflation factor (VIF) A statistical value that identifies what x variable(s) contribute to multicollinearity
and should be removed from the analysis to eliminate it. Any variable with a VIF
of 10 or greater should be removed.

Multiple R Also called the coefficient of determination, a number that ranges from 0 to 1.0
that indicates the strength of the overall linear relationship in a multiple
regression; the higher, the better

Trimming The process of iteratively removing x variables in multiple regression which are
not statistically significant, rerunning the regression, and repeating until all the
remaining x variables are significant

Beta coefficients and Beta coefficients are the slopes (b values) determined by multiple regression for
standardized beta coefficients each independent variable x. These are normalized to be in the range of .00 to .99,
so they can be compared directly to determine their relative importance in y’s
prediction.

Dummy independent variable Use of an x variable that has a 0, 1, or similar nominal coding, used sparingly
when nominal variables must be in the independent variables set
Stepwise multiple regression A specialized multiple regression that is appropriate when there is a large number
of independent variables that need to be trimmed down to a small, significant set
and the researcher wishes the statistical program to do this automatically

Reporting Findings to Clients
• Most important when used as a screening device:
– Dependent variable
– Statistically significant independent variables
– Signs of beta coefficients
– Standardized beta coefficients for significant variables

Reporting Example

Copyright

This work is protected by United States copyright laws and is


provided solely for the use of instructors in teaching their
courses and assessing student learning. Dissemination or sale of
any part of this work (including on the World Wide Web) will
destroy the integrity of the work and is not permitted. The work
and materials from it should never be made available to students
except by instructors using the accompanying text in their
classes. All recipients of this work are expected to abide by these
restrictions and to honor the intended pedagogical purposes and
the needs of other instructors who rely on these materials.

