Topic 5

Agenda

Background

Example with Real Data

Some Considerations

Key Terms

Summary

Background

Multiple Linear Regression is widely used in academia and also in market research (MR)

For our course, we can consider it the start of multivariate analysis

Any idea what the following are:

Multivariate analysis

Multiple Linear Regression (MLR)

Background

Multivariate analysis is hard to define well

Some say anytime you have more than 2 variables, it is multivariate

Some say that you need to have many combinations of variables, i.e. variates

Some say that you need to have multiple dependent variables

Techniques generally considered multivariate:

MLR

Factor analysis

Background

Discriminant analysis

Cluster Analysis

Conjoint analysis

Canonical Correlation

Structural Equation Modeling

discriminant and cluster analyses

Linear regression involves finding a linear relationship between an independent variable and a dependent variable

Background

Different levels of an independent variable are associated with corresponding changes in the dependent variable

What is an IV? What is a DV?

IV is denoted by X, while DV is denoted by Y

We can loosely say X causes Y

principle behind it? On what scale is the IV measured, and the DV?

Assume one X, one Y

Background

Normally, the IV and DV are continuous, not discrete

Meaning?

A straight line is fitted to the scatter plot of X and Y

The line of best fit is the regression line

[Figure: scatter plot with X on the horizontal axis]

Background

Let us plot the points

Drawing a line of best fit is child's play

The association is perfectly linear

Real data are rarely this perfect

We may instead find data like the following

There is some error

But the idea is to minimise this error; how is this done?

Background

The method of least squares is used

Different lines are fitted and their errors squared; the line with the smallest sum of squared errors is finally chosen

MLR fitted this way is sometimes called OLS, or ordinary least squares, regression
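The least-squares idea can be sketched in a few lines of Python. This is a minimal illustration for one X and one Y, using made-up data; the function names `fit_line` and `sum_squared_errors` are hypothetical helpers, not part of SPSS or any library. Note that software does not literally try every possible line: a closed-form formula gives the line with the smallest sum of squared errors directly.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for one X and one Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """Square each vertical error (actual Y minus predicted Y) and add them up."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data, not from the lecture
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
best_sse = sum_squared_errors(xs, ys, intercept, slope)

# Nudging the fitted line in any direction makes the sum of squared errors larger
shifted_sse = sum_squared_errors(xs, ys, intercept + 0.5, slope)
tilted_sse = sum_squared_errors(xs, ys, intercept, slope + 0.1)

print(f"intercept={intercept:.2f}, slope={slope:.2f}")
print(f"SSE best={best_sse:.3f}, shifted={shifted_sse:.3f}, tilted={tilted_sse:.3f}")
```

Comparing `best_sse` with the shifted and tilted versions shows what "line of best fit" means in practice: no other straight line gives a smaller total.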

Why square the errors and then add? Why not just add them up?

Background

The idea is 2-fold

Squaring stops +ve and -ve errors from cancelling each other out

We penalise large errors
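A small sketch of why the errors are squared before adding. The data and the two candidate lines are made up for illustration. A deliberately poor flat line has raw errors that add up to zero, so the plain sum cannot tell it apart from a perfect fit; the squared sum can.

```python
# Hypothetical data lying exactly on the line Y = 2X
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]


def errors(intercept, slope):
    """Actual Y minus predicted Y for each data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


flat_line = errors(6, 0)   # poor fit: a flat line at the mean of Y
true_line = errors(0, 2)   # perfect fit: Y = 2X

raw_sum_flat = sum(flat_line)                   # +ve and -ve errors cancel
squared_sum_flat = sum(e ** 2 for e in flat_line)
squared_sum_true = sum(e ** 2 for e in true_line)

print("flat line errors:", flat_line)
print("raw sum:", raw_sum_flat, "| squared sum:", squared_sum_flat)
print("perfect line squared sum:", squared_sum_true)
```

The penalty on large errors is also visible here: in the flat line, a single error of 4 contributes 16 to the total, twice as much as two errors of 2 put together.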

Impossible to show on the board

and perform a regression

Some Considerations

MLR can also handle non-metric or categorical IVs, e.g. gender influences shopping time

This is done through dummy coding

Basically, dummy regression is the same as an ANOVA

Both are forms of the General Linear Model
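To make the dummy-coding idea concrete, here is a minimal sketch with made-up shopping times. One gender is coded 0 (the reference group) and the other 1; the choice of reference group and all numbers are assumptions for illustration. The fitted intercept comes out as the mean of the reference group and the slope as the difference between the two group means, which is exactly the comparison a one-way ANOVA with two groups makes.

```python
# Hypothetical shopping times in minutes, for illustration only
gender = ["M", "M", "M", "F", "F", "F", "F"]
shopping_time = [30, 35, 40, 50, 55, 60, 55]

# Dummy coding: "M" is the reference group (0), "F" is coded 1
dummy = [1 if g == "F" else 0 for g in gender]

# Ordinary least-squares fit of shopping_time on the dummy variable
n = len(dummy)
mean_x = sum(dummy) / n
mean_y = sum(shopping_time) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dummy, shopping_time))
sxx = sum((x - mean_x) ** 2 for x in dummy)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Group means computed directly, for comparison with the regression output
mean_m = sum(t for g, t in zip(gender, shopping_time) if g == "M") / gender.count("M")
mean_f = sum(t for g, t in zip(gender, shopping_time) if g == "F") / gender.count("F")

print(f"intercept={intercept:.2f} (mean of M = {mean_m:.2f})")
print(f"slope={slope:.2f} (mean of F minus mean of M = {mean_f - mean_m:.2f})")
```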

prerequisites and limitations

Some Considerations

There should not be collinearity between the IVs

Collinearity makes the estimates unstable and inflates their standard errors

The first step is therefore to get the correlation matrix in Excel/SPSS

How to remove this collinearity?

research on likely relationships

Otherwise, we may end up doing sample-specific data mining

No guarantee about robustness of results
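The correlation-matrix check for collinearity mentioned on this slide can be sketched as follows. The three IVs and their values are made up, and the 0.8 cut-off is only a common rule of thumb assumed here, not a threshold given in the lecture.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


# Hypothetical IVs: X2 is almost exactly twice X1, X3 is unrelated
ivs = {
    "X1": [1, 2, 3, 4, 5, 6],
    "X2": [2.1, 3.9, 6.2, 8.1, 9.8, 12.2],
    "X3": [5, 3, 6, 2, 7, 4],
}

THRESHOLD = 0.8  # assumed rule-of-thumb cut-off for "too highly correlated"

names = list(ivs)
flagged = []
for i, first in enumerate(names):
    for second in names[i + 1:]:
        r = pearson(ivs[first], ivs[second])
        print(f"r({first}, {second}) = {r:.3f}")
        if abs(r) > THRESHOLD:
            flagged.append((first, second))

print("Pairs to look at before running the regression:", flagged)
```

A pairwise check like this only catches collinearity between two IVs at a time; it will not reveal an IV that is a combination of several others.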

Some Considerations

The shot-gun approach should be avoided

MR firms may not agree

The IVs should be linearly related to the DV

2 marks bonus for saying this orally in the final!

This can be got around by transforming the data using log, inverse or square root

Consider the following data

Some Considerations

X    1    2    3    4    5    6    7    8
Y    1    4    9   16   25   36   49   64

Some Considerations

SPSS will give you a decent linear regression, but it misses the point: here Y is X squared

We would have to use polynomial regression, which is beyond our scope
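Using the X and Y data above, a short sketch shows what "decent but misses the point" looks like. A straight line fitted to the raw data has a high R², yet its errors follow a clear pattern (positive at both ends, negative in the middle), which signals a curve. Taking the square root of Y, one of the transformations mentioned earlier, makes the relationship exactly linear. The helper `fit_and_r2` is a hypothetical name used only for this illustration.

```python
import math


def fit_and_r2(xs, ys):
    """Least-squares line for one X: returns (intercept, slope, r_squared, residuals)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((y - mean_y) ** 2 for y in ys)
    return intercept, slope, 1 - sse / sst, residuals


# Data from the slide: Y is X squared
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 4, 9, 16, 25, 36, 49, 64]

# Straight line on the raw data
a_raw, b_raw, r2_raw, resid_raw = fit_and_r2(xs, ys)
print(f"raw: Y = {a_raw:.1f} + {b_raw:.1f} X, R^2 = {r2_raw:.3f}")
print("raw residuals:", [round(e, 1) for e in resid_raw])

# Same fit after a square-root transformation of Y
sqrt_ys = [math.sqrt(y) for y in ys]
a_t, b_t, r2_t, resid_t = fit_and_r2(xs, sqrt_ys)
print(f"transformed: sqrt(Y) = {a_t:.1f} + {b_t:.1f} X, R^2 = {r2_t:.3f}")
```

Looking at the residuals, not just R², is what exposes the problem in the untransformed fit.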

All relevant IVs should be put in, else we may reach utterly erroneous conclusions, e.g. regressing Sales on Ad while leaving out Price, SP
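The danger of leaving out a relevant IV can be shown with simulated data. Everything below is an assumption made up for illustration: in this artificial world advertising truly raises sales (coefficient +2), price truly lowers sales (coefficient -3), and firms that advertise more also tend to charge more. Regressing Sales on Ad alone then gives a negative coefficient, suggesting that advertising hurts sales.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Simulated (not real) data: price moves together with ad spend
ad, price, sales = [], [], []
for _ in range(n):
    a = rng.gauss(0, 1)
    p = 0.8 * a + rng.gauss(0, 1)
    s = 2 * a - 3 * p + rng.gauss(0, 1)
    ad.append(a)
    price.append(p)
    sales.append(s)


def centred_sum(u, v):
    """Sum of products of deviations from the means."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((x - mean_u) * (y - mean_v) for x, y in zip(u, v))


# Regression of Sales on Ad only (Price left out)
b_ad_only = centred_sum(ad, sales) / centred_sum(ad, ad)

# Regression of Sales on Ad and Price (two-IV least squares, solved by hand)
s11 = centred_sum(ad, ad)
s22 = centred_sum(price, price)
s12 = centred_sum(ad, price)
s1y = centred_sum(ad, sales)
s2y = centred_sum(price, sales)
det = s11 * s22 - s12 ** 2
b_ad = (s22 * s1y - s12 * s2y) / det
b_price = (s11 * s2y - s12 * s1y) / det

print(f"Ad coefficient, Price left out: {b_ad_only:.2f}")
print(f"Ad coefficient, Price included: {b_ad:.2f}")
print(f"Price coefficient:              {b_price:.2f}")
```

With Price included, the Ad coefficient comes back close to the +2 built into the simulation.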

Some Considerations

Ideally have some likely results in mind before going in for data collection

MR firms screw up here

We academics score big here

Why is this important?

there, use stepwise regression

It will give you the order of importance

Some Considerations

In exploratory research, it is OK to use it

Not a big fan of stepwise

Coefficient of Determination, R², gives the extent of variation in Y explained by X (or X1, X2 and so on)

Also called variance explained

Adjusted R² would be better

Beta is the standardised weight

It is needed since different IVs may be in different units

F-value and t-value must be looked at too
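A minimal sketch of these output statistics for the one-IV case, with made-up data. It computes R², adjusted R² (using the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1), where k is the number of IVs) and the standardised weight, often labelled Beta in software output. The same X is then re-expressed in thousands to show that the raw slope depends on the units while the standardised weight does not. The function name `describe_fit` is hypothetical.

```python
import math


def describe_fit(xs, ys, k=1):
    """Return (slope, standardised_weight, r2, adjusted_r2) for a one-IV regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    # Standardised weight: slope rescaled by the spread of X relative to Y
    beta = slope * math.sqrt(sxx / syy)
    r2 = sxy ** 2 / (sxx * syy)
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return slope, beta, r2, adjusted_r2


# Made-up illustrative data, not from the lecture
xs_thousands = [1, 2, 3, 4, 5]                  # X measured in thousands
xs_units = [x * 1000 for x in xs_thousands]     # the same X measured in single units
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope_a, beta_a, r2_a, adj_a = describe_fit(xs_thousands, ys)
slope_b, beta_b, r2_b, adj_b = describe_fit(xs_units, ys)

print(f"X in thousands: slope={slope_a:.4f}, Beta={beta_a:.4f}")
print(f"X in units:     slope={slope_b:.6f}, Beta={beta_b:.4f}")
print(f"R^2={r2_a:.4f}, adjusted R^2={adj_a:.4f}")
```

Adjusted R² is always a little lower than R² because it charges a penalty for each IV added to the model.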

Any doubts?

Do you want to learn how regression can handle:

Categorical data

Interaction effects? What problems will

come here?

Need demo?

Summary

MLR is a very useful tool

It has wide applications

fundamental assumptions, mainly multicollinearity

Especially in MR

