
Australian School of Business

MARK 3054: Market Analysis / MR-II

Module 1:

Introduction, Recap and Housekeeping

Rahul Govind

University of Pittsburgh
Carnegie Mellon University
University of Mississippi

Ford Motor Company


Hewlett Packard
J.D. Power

Rahul Govind
The University of New South Wales
Recap MARK 2052


The Market Research Process

1. Establish need for information
2. Establish information needs
3. Determine research design/data sources
4. Develop data collection procedures
5. Sample design
6. Survey design
7. Collect data
8. Process data
9. Analyze data
10. Write report/present findings
11. Action

Step 1: Establish Need for information

(Why) Do we need to conduct Marketing research?

1) Check out what the market is like! (SWOT analysis: Proactive Research)
Explore a new opportunity
Check for any threats in the market
Identify our strengths
Identify our weaknesses

2) Check out if a strategy makes sense! (Dry run/Concept Testing)
Introducing a new product: will it make money?


Step 1 contd.: Establish Need for Information

3) I see changes. Why do I see changes? (Reactionary Research)

My product used to sell very well but its sales are now declining!

4) Have things changed from the past? (Tracking Study)

I had a 30% Market Share last year. What is it now?

5) The law wants me to! (Mandatory Studies, location based)

Insurance and medicine need to check for consumer satisfaction

Step 2: Establish Information Needs
What are the questions that we need to ask (the consumers)?
Will they give us answers to what we want to know in step 1?
Show me the numbers!
Don't use too much intuition!
What percentage of the time do you think you smell bad amongst friends/co-workers?
What percentage of the time do you think your friends/co-workers smell bad?
You think: < 10%
Everybody else thinks: 35%

Step 3: Determine Design and Data Sources
Can I get by using less money and time? (consider FMC)
Use Secondary Data!
General Motors had a similar problem and they conducted a survey.
Since GM and FMC are perceived similarly, that data might be used.
Can we just get by observing what is going on around us?
Observational Research
Ask people who come to the showroom and do not buy the product
Observe what people who buy a competing brand (Honda) focus on when they buy the product.
OK, we do need to spend Time AND Money!
Survey Research
Ask subjects questions that will help us in answering our questions
Are you satisfied with the reliability of Ford cars?
Are you satisfied with the looks of Ford cars?

Secondary and Primary Data


Step 4: Develop data collection procedure


What types of data do we need?
Attitudes: how do people think/feel about our products?
What is the first thing that comes to mind when you think of FMC?

Behaviors: how do people behave in the market?


Which was your last vehicle purchase?

Demographics: company, people, industry


Where do you live?
How many children under 12 do you have in your family?

How should we collect the data? Get it the right way!


Phone/Personal/Mail/E-mail (Surveys)
Secondary research
Observation

Step 5: Sample Design (for survey research)
Who should be interviewed? (Population and Sample)
Want to sell to the entire US. How do I survey EVERYBODY?
Decision maker/influencer
Dad who buys the car (purchaser) or
Child that drives it to school (user)

How should these respondents be contacted?


Telephone interviews (Can respondent visualize color, design etc.?)
Personal interviews (very expensive)
Mail interviews (too slow? Can we wait that long?)

How many people should we interview?


At what level of confidence can we project the results to the
population?
What possible amount of error can we live with?
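
The last two questions (confidence level and tolerable error) feed the textbook sample-size formula for a proportion, n = z^2 * p(1 - p) / e^2. The sketch below assumes simple random sampling from a large population; the function name and the default p = 0.5 are illustrative choices, not something prescribed on the slide.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(confidence=0.95, margin_of_error=0.05, p=0.5):
    """Respondents needed to estimate a proportion (simple random sample)."""
    # z value that leaves (1 - confidence) / 2 in each tail of the normal curve
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # p = 0.5 is the most conservative guess because it maximises p * (1 - p)
    n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    return math.ceil(n)

n_95 = sample_size_for_proportion(confidence=0.95, margin_of_error=0.05)
n_99 = sample_size_for_proportion(confidence=0.99, margin_of_error=0.05)
print(n_95, n_99)
```

Asking for more confidence or a smaller error both push the required sample size up, which is the trade-off behind the two questions.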


Step 6: Survey Design


Ask as few questions as possible and in a logical fashion! Humans are lazy (Cognitive Misers)
Keep it simple and stupid (KISS)
Use language that is convenient and comfortable for the respondent
Speak in the respondent's language
"Would you consider yourself a turophile?" (actual survey question by Kraft)
Ask the hard questions first and easy ones last.
Humans are not only lazy, they also get tired easily (Survey Fatigue)

Step 7/8: Collect and Process Data
Interview quality control procedures
Interviewer quality
Make sure that you haven't hired a lazy interviewer
Confirm that the interviewer is doing the job
Data processing quality
Data cleaning procedures
Qn 6 asked - Do you prefer an automatic transmission?
Respondent did not answer
TRASH
Logic checks
Qn 5 asked - Do you own a car? Respondent said No.
Qn 6 asked - Is it an American car? Respondent said Yes.
TRASH
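
The two cleaning rules above (unanswered items and failed logic checks) can be sketched as a small screening function. The field names, the record layout and the sample records are hypothetical; this is one possible way to flag records, not the procedure used in the course.

```python
# Hypothetical coded responses: None marks an unanswered question.
responses = [
    {"id": 1, "q5_owns_car": "yes", "q6_american_car": "yes"},
    {"id": 2, "q5_owns_car": "no", "q6_american_car": None},   # consistent: Q6 skipped
    {"id": 3, "q5_owns_car": "no", "q6_american_car": "yes"},  # fails the logic check
    {"id": 4, "q5_owns_car": "yes", "q6_american_car": None},  # item non-response
]

def find_problems(record):
    """Return the reasons (if any) why a record should be flagged."""
    problems = []
    owns_car = record["q5_owns_car"]
    american = record["q6_american_car"]
    if owns_car is None:
        problems.append("Q5 not answered")
    if owns_car == "yes" and american is None:
        problems.append("Q6 not answered")
    if owns_car == "no" and american is not None:
        problems.append("Q6 answered although respondent owns no car")
    return problems

# Map each flagged respondent id to the list of reasons.
flagged = {r["id"]: find_problems(r) for r in responses if find_problems(r)}
clean = [r for r in responses if r["id"] not in flagged]
print(flagged)
```

Whether a flagged record is discarded entirely or only the affected answers are set to missing is a judgment call for the analyst.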

Step 9: Analyze Data


Does the data tell you anything without using Math?
Go back and review information needs
Rural people like FMC better
Everybody hates our reliability
Visualize results: a picture is worth a 1000 words
Think, use your brains!

Now use Math Techniques


Cross-tabulations
Multivariate analysis (regression, semantic scales, conjoint)
More complex designs and analyses

Don't make the analysis too simple, but don't go crazy with math either

Step 10/11: Write a Report/Take Action
Writing reports
User friendly reports
Bullets and dashes format
Picture is worth a 1000 words
BLOT strategy (bottom line on top)

Presenting findings
Be sensitive to the audience's level of knowledge
Be a manager, not a statistician
Sometimes numbers don't mean a thing! But most of the time nothing means anything without numbers

Action
Design changes in the marketing mix
Go back and see if you need more data to make better decisions


What is Market Analysis?


It is NOT a course in statistical formulae
The focus is on understanding the bases of the techniques
It is NOT about plugging numbers into formulae
It is about gaining hands-on experience using analytical software (SPSS)
It is about how output can be used: interpretation and communication

Outcomes of this course
Use SPSS to analyze a variety of data typically collected by marketers.

Explain when and how a range of statistical techniques may be applied to marketing situations.

Translate the output from statistical analyses into a language that is understandable to marketing managers.

Competently and confidently communicate (verbally and in writing) the true meaning of statistical output.

Adequately self-reflect and self-assess behavior in teamwork situations.

But most importantly


Predict and hone your needs for answering certain
business questions
Visualize numbers

And both of these translate to what we call....

the ability to think

The two stories of Marketing
"Build a better mousetrap and the world will beat a path to your door" - Ralph Waldo Emerson

"Before we build a mousetrap, we should check if there are any mice out there" - Yogi Berra


Why do new products fail?

Begin 3054


Statistical Analyses
Used to find out
What people have in common
How they differ
Predict how they will act in the future

(Burns & Bush 2000)

Types of statistical analyses
Descriptive Analysis (who are they?)
Used to describe the data to reveal the general pattern of responses
To portray the typical respondent

What are the demographics of the sample?
What percentage of people have purchased a new car in the last 2 years?
Are people satisfied with their bank (if not, which ones?)
What cause-related products do people buy?


Types of statistical analyses


Inferential Analysis
To draw conclusions about
1. the population
2. previous samples
3. future samples
based on the current sample

Is the satisfaction level of current students the same as that of students in past years?

Types of statistical analyses
Differences Analyses
To determine if differences exist between groups

Is there a difference between males and females in what they want from a holiday destination?
Is there a difference between local and international students in their motivation to purchase cause-related products?


Types of statistical analyses


Associative Analyses
To determine the strength and direction of
relationships between 2 or more variables

Is there an association between the importance of quality of interior fittings of a car and the importance of comfort?
Are certain types of students more satisfied with their degree choices than others?
What are the broad types of motivations towards buying products? Can you classify people based on these broad types? If so, what are these classifications?

Types of statistical analyses
Predictive Analyses
Allows forecasts of future events
Estimate the level of Y given the amount of X

Does the provision of certain characteristics help predict the likelihood of purchasing a product?
Which features of a holiday resort have the greatest impact on satisfaction?


Within each type of analysis there is a range of techniques:

Descriptive Analysis
E.g., Means, medians, frequency, standard deviation

Inferential and Difference Analysis
E.g., T-tests, ANOVA

Associative Analysis
E.g., Correlation, Crosstabs with chi-square, Factor
analysis, Cluster analysis

Predictive Analysis
E.g., Regression

How do you know what method to use?

To answer this you need to:


Identify the variables of interest
How many variables are there?
Determine whether you are dividing the variables of
interest into dependent and independent variables
Determine the scale of each of these variables, i.e.
type of data


Choice of Technique: one variable
(Kinnear & Taylor 1991; Churchill 1999)

Single Analysis Variable

Nominal Data     Ordinal Data               Scale(d) Data
1. Mode          1. Median                  1. Mean
2. Frequency     2. Inter-quartile range    2. St dev
3. Chi-square                               3. z test
                                            4. t test

Choice of Technique: two or more variables
(Kinnear & Taylor 1991; Churchill 1999)

[Flowchart. The first branch asks "Do you have Dependent (DV) & Independent Variables (IV)?" (Yes/No); later branches ask for the scale of the DV or of the variables (N = nominal, O = ordinal, S = scaled). Techniques named in the chart: chi-square, rank correlation, conjoint, independent t-test, paired t-test, ANOVA, dummy variables, regression, discriminant analysis, correlation, factor analysis, cluster analysis, MDS.]

Overview of the Stages of Data Analysis

Editing
Coding
Data Entry
Data Analysis (descriptive analysis; univariate and multivariate analysis)
Interpretation

Some Editing issues
Preliminary questionnaire screening

Checks of Completed Questionnaires


Unsystematic/Systematic

What to Look For in Questionnaire Inspection


Incomplete Questionnaires
Non-responses to Specific Questions/Item Omissions
Yea- or Nay-Saying Patterns
Middle-of-the-Road Patterns
Unreliable Responses


Coding
Aim of coding:
Retain as much information in the data file as on the hard
copies of the questionnaire
Facilitate data analysis

Coding and Data Entry: some issues
How often do you visit a dentist? (Tick ONE box only)
Every 6 months
Every year
Every 2 years
Only once in the last 5 years

CODE: One variable, coded as 1, 2, 3, 4


Why have you not used the Optometry Clinic? (May tick
more than one option)

Too far from where I live


Don't have time
Didn't know there was one
Don't trust student examinations

CODE: 4 variables (one for each alternative), coded as 0/1

What is the likelihood you would go to a show at
the Opera House within the next year?

Very unlikely / Unlikely / Neither likely nor unlikely / Likely / Very likely

CODE: One variable, coded as one of:
0, 1, 2, 3, 4
-2, -1, 0, 1, 2
1, 2, 3, 4, 5
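
The three coding rules on these slides (one variable for a tick-one question, one 0/1 variable per alternative for a tick-many question, one ordered variable for a rating scale) can be sketched as follows. The variable names and the choice of the 1 to 5 coding are illustrative assumptions.

```python
# Tick-ONE question: a single variable coded 1-4.
DENTIST_CODES = {
    "Every 6 months": 1,
    "Every year": 2,
    "Every 2 years": 3,
    "Only once in the last 5 years": 4,
}

# Tick-MANY question: one 0/1 variable per alternative.
CLINIC_REASONS = [
    "Too far from where I live",
    "Don't have time",
    "Didn't know there was one",
    "Don't trust student examinations",
]

# Rating scale: a single ordered variable (the 1-5 option is assumed here).
LIKELIHOOD_CODES = {
    "Very unlikely": 1,
    "Unlikely": 2,
    "Neither likely nor unlikely": 3,
    "Likely": 4,
    "Very likely": 5,
}

def code_questionnaire(dentist, reasons_ticked, likelihood):
    """Turn one respondent's answers into a row of numeric codes."""
    row = {"dentist": DENTIST_CODES[dentist]}
    for i, reason in enumerate(CLINIC_REASONS, start=1):
        row[f"reason_{i}"] = 1 if reason in reasons_ticked else 0
    row["opera"] = LIKELIHOOD_CODES[likelihood]
    return row

row = code_questionnaire(
    "Every year",
    {"Don't have time", "Didn't know there was one"},
    "Likely",
)
print(row)
```

Keeping the code books as explicit dictionaries means the data file retains as much information as the paper questionnaire, which is the stated aim of coding.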


Measurement:
What are the various types of data?

Type             Example question         Response options
Nominal          What is your gender?     Male / Female
Ordinal          What is your age?        <18yrs / 18-29yrs / >29yrs
Interval/Scale   How satisfied are you?   V unsat / Unsat / Neither / Sat / V sat

Australian School of Business
MARK 3054: Market Analysis / MR-II

Module 2:
Data Preparation & Customer Profiling

Outline

Getting to know your data
Starting to build a profile of your sample
Cross tabulation and Chi-square

Rahul Govind
Australian School of Business
Question..
Once you have the data coded and entered in a
data set, what do you do first?


Dataset for analysis

Objective of that research:
To understand how satisfied their customers were with the company, and customers' perceptions of the company's performance.

Objectives
The research objective can be broken down into sub-objectives, for example:
To characterise customers.
To understand what the customers want from a customer service company.
To understand the performance of the regions and whether they differ in any way.
To evaluate the company's performance.
To identify the factors that impact on customers' satisfaction.

In following weeks we will cover a range of techniques which will help us obtain this insight.


Getting to Know
the Data

Initial examination of the data


Data Reduction

Getting to know your data
Frequencies
Graphical examination
Measures of central tendency
Measures of dispersion

These help to:
Detect outliers
Detect errors
Test assumptions


Frequencies
What are they?
Type of data used?
Common uses of frequency tables
Data cleaning:
determine the degree of non-response
locate blunders
locate outliers
Determine empirical distribution of a variable
relates to graphing
Calculate summary statistics
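
A frequency table used for data cleaning can be sketched with the standard library. The region codes below are hypothetical (not the course data set); valid codes are assumed to be 1 to 6, with None standing for "no answer".

```python
from collections import Counter

# Hypothetical region codes; valid codes are 1-6, None = question not answered.
region = [1, 2, 2, 3, 6, 6, None, 4, 5, 66, 1, 6, 3, 2, None, 7, 5, 4, 6, 1]
VALID_CODES = range(1, 7)

counts = Counter(region)
n_missing = counts.pop(None, 0)          # degree of non-response
n_valid = len(region) - n_missing        # sample size for this question
blunders = {code: c for code, c in counts.items() if code not in VALID_CODES}

# Print code, count and percentage of answered cases.
for code in sorted(counts):
    pct = 100 * counts[code] / n_valid
    flag = "  <- not a valid code" if code in blunders else ""
    print(f"{code:>3} {counts[code]:>3} {pct:5.1f}%{flag}")
print(f"missing: {n_missing}, n: {n_valid}")
```

Codes such as 66 or 7 stand out immediately in the table; after correcting them the frequencies are re-run to obtain the right percentages.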


Example: Customer service data

From the frequency table:
4 (2%) people did not answer the question (coded as missing)
sample size for the question (n) was 196 (200 - 4)
19% of respondents are from Region 6
There are 3 errors

How to use this information

Thus, if there is a coding error, correct the error and re-run the frequency to obtain the correct percentages

Regroup the categories, if appropriate
WHY?


Graphical Examination
Highlights:
the nature of the variable - the shape of the distribution
relationships between variables
unusual values
SPSS examples: What can you say from these
graphs?
bar chart and histogram

Assumptions
Common Assumptions for tests
Normality (Kurtosis and Skewness)
Linearity
Equal Variance


Data Reduction
As you get to know your data, you start the process
of data reduction

Why?
Summarise
Communicate

To help summarise and communicate:
MEASURES OF CENTRAL TENDENCY
Mode - the value that occurs most frequently in a data set or a probability distribution

Median - the numerical value separating the higher half of a sample, a population, or a probability distribution, from the lower half

Mean
Arithmetic Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$


To help summarise and communicate:


MEASURES OF VARIABILITY/DISPERSION

Frequency Distribution

Range

Mean Absolute Deviation


Standard Deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Measures of Variability
Standard deviation
How much do the responses vary?
Do most respondents answer the same?
Low variation in responses = low variance in opinion
Do survey participants respond all over the scale?
Represents the typical difference of any one value from
the mean


Standard Deviation
It shows how much variation or "dispersion" there is
from the average (mean, or expected value). A low
standard deviation indicates that the data points tend
to be very close to the mean, whereas high standard
deviation indicates that the data are spread out over
a large range of values.
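
A small sketch of why the standard deviation is reported alongside the mean: the two hypothetical sets of ratings below share the same mode, median and mean, yet differ sharply in spread. The data are invented for illustration.

```python
import statistics

# Hypothetical ratings on a 1-7 scale.
agree = [4, 4, 5, 4, 3, 4, 4, 5, 4, 3]    # most respondents answer alike
spread = [1, 7, 2, 6, 4, 1, 7, 4, 4, 4]   # answers all over the scale

def describe(values):
    """Central tendency plus the sample standard deviation (n - 1 denominator)."""
    return {
        "mode": statistics.mode(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }

d_agree = describe(agree)
d_spread = describe(spread)
print(d_agree)
print(d_spread)
```

Both groups average 4, but a manager reading only the mean would miss that opinion in the second group is far more divided.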

If I don't have SD


Further investigation of s.d.


Investigate the variability in responses across
questions
Compare the s.d. for importance of helpful staff with that for importance of a courteous local representative:

2.778 vs 2.063

What does this imply?

Other measures
Skewness: how much a distribution of responses may be skewed to the left side or to the right side

Kurtosis: how peaked or flat in shape the distribution of responses is

These, as well as the other descriptive measures, can be calculated in SPSS
(e.g. Analyse/Descriptive Statistics/Descriptives)
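
For intuition, skewness and kurtosis can be computed from the central moments of the data. This sketch uses the simple moment-based definitions; statistical packages typically apply small-sample adjustments, so their reported values may differ somewhat. The function name and data are illustrative.

```python
def shape_measures(values):
    """Moment-based skewness and excess kurtosis (0 and 0 for a normal curve)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis

symmetric = [1, 2, 3, 4, 5, 6, 7]    # evenly spread: no skew, flat shape
right_tail = [1, 1, 1, 2, 2, 3, 7]   # a few high values pull the tail right

skew_sym, kurt_sym = shape_measures(symmetric)
skew_right, kurt_right = shape_measures(right_tail)
print(skew_sym, kurt_sym)
print(skew_right, kurt_right)
```

A negative excess kurtosis indicates a flatter (platykurtic) shape, a positive one a more peaked (leptokurtic) shape.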

Leptokurtic vs Platykurtic


Information gathered from getting to know your data helps you to begin to understand (profile or characterise) your customer. For example:

How they answered questions on average
Range of responses
Frequency of different alternatives

So what do we now know about
our customer service data?
Can gain a description of who is in our sample (age, gender etc.)
Describe the average and pattern of response to key variables


So far...
Started to summarise the data: gain some facts about respondents
E.g., average response, variability in response, commonly used categories etc.

But it also starts to raise more questions:
E.g. Do all the respondents have the same views? Do different groups have varying views? Are there differences between males and females, or the age groups, in what is important? Or in their assessment of the company's performance?

For this we need more techniques!


Gaining Further Insight

Cross tabulations


Cross tabulations
Extends the frequency table to 2 variables
Variables are nominal / ordinal
Counts the number of observations in each
possible sub-group or cell
Need to be on the lookout for too many cells with very small counts (<5): this reduces reliability
SPSS: Analyse/Descriptive Statistics/Crosstabs

Let's follow up on a question:
The overarching questions here:
How can we characterise customers? What can we find out about them? Are they the same or not?

This can be investigated from different angles; let's look at one - Age:
Is there an association between age groups and the
type of transaction they undertake?
OR
Are younger age groups less likely than older groups to
use the company for private transactions?


Questions from the questionnaire:


Age group
17-30, 30-40, 40-50, 50-60 and > 60

Type of transaction
Private
Business

Interpreting the output:

Look at percentages (not frequencies) to gain an understanding
Direction of percentages?
From the crosstab we might say:
More people in the older groups use the company for private transactions than in the younger ones.
70% (>60) vs approx 51% for younger groups
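
Reading a crosstab by percentages rather than raw counts can be sketched as below. The counts are hypothetical (they are not the course data set) and were chosen only so the oldest group shows a visibly higher share of private transactions.

```python
# Hypothetical counts (NOT the course data set): age group x type of transaction.
crosstab = {
    "17-30": {"Private": 20, "Business": 20},
    "30-40": {"Private": 26, "Business": 24},
    "40-50": {"Private": 21, "Business": 19},
    "50-60": {"Private": 22, "Business": 18},
    "> 60": {"Private": 21, "Business": 9},
}

def row_percentages(table):
    """Convert each row of counts into percentages of that row's total."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: round(100 * count / total, 1) for col, count in cells.items()}
    return result

row_pct = row_percentages(crosstab)
for age_group, cells in row_pct.items():
    print(age_group, cells)
```

Row percentages make groups of different sizes comparable; the raw count of 21 private transactions appears in two rows here but means 52.5% in one and 70% in the other.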


Conclusion?
There appears to be some association between age and type of transaction. However...

Is this tendency, or association, significant?

That is, is there enough evidence to conclude that the ages differ in the type of transaction they use the company for?
If so, how strong is that evidence?

Implications

Need a benchmark or bar to help us decide when there is enough evidence to say "This situation could not have occurred by chance".


Hypothesis Testing
From past experience we may have some
assumptions about the relationships between
variables or on how consumers may rate certain
aspects of our products
Aim: To examine whether a particular proposition
(hypothesis) concerning the population is likely to
hold
Is there enough evidence from the sample to
reject the hypothesis?
If not, we say that from this sample we cannot reject the
hypothesis - we do not say that we accept it!

Hypothesis testing: creating a benchmark
From our sample we can calculate a test statistic to
help us test our hypothesis.
This test statistic can take on a range of values,
depending on the sample we have drawn from the
population.
All these values together give you the distribution for
the statistic.


Values in the tails are possible if our hypothesis is true, but very unlikely! Therefore, if the value of our test statistic falls in these regions, we reject our hypothesis.

The cut-off points for the tails are usually defined such that the probability of getting this value or a more extreme one, if the hypothesis is true, is 0.05.
This 0.05 is known as the significance level (the probability of us incorrectly rejecting a hypothesis that is true), and the cut-off point is the critical value.
Therefore, we have established our benchmark or bar to judge if the evidence from the sample is sufficient to say that there is a statistical relationship.
I.e., for a particular test statistic, if the p value (the probability of obtaining a test statistic at least this extreme when the null hypothesis is true) is < 0.05, there IS evidence to REJECT the null hypothesis.
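
The benchmark can be made concrete with a short sketch. It assumes, purely for illustration, a test statistic that follows the standard normal distribution; the function names are hypothetical.

```python
from statistics import NormalDist

ALPHA = 0.05  # significance level

def two_tailed_critical_value(alpha=ALPHA):
    """Cut-off that leaves alpha / 2 in each tail of the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def two_tailed_p_value(z):
    """Probability of a statistic at least this extreme if H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha=ALPHA):
    return "reject H0" if p_value < alpha else "cannot reject H0"

critical = two_tailed_critical_value()
print(round(critical, 2))
print(decide(two_tailed_p_value(2.5)))   # statistic far out in the tail
print(decide(two_tailed_p_value(1.0)))   # statistic well inside the allowable range
```

Comparing the statistic with the critical value and comparing the p value with 0.05 are two views of the same decision rule.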


Let's now return to our crosstabs

How is this knowledge translated to the current situation?

Chi-square Test
Statistical test (and test statistic) used in
connection with cross-tabulations

Tests the presence of an association between 2 nominal or ordinal variables

Based on the frequencies i.e. counts


Chi-Square cont.

Compares what you have observed (from your sample) with what you would expect if there were no relationship between the variables

If the difference between the observed and expected frequencies is too large, then you have evidence to reject the null hypothesis
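
The comparison of observed and expected frequencies is usually written as the Pearson chi-square statistic. The notation below is the standard textbook form and is an addition, not taken from the slides: $O_{ij}$ and $E_{ij}$ are the observed and expected counts in row $i$, column $j$; $R_i$ and $C_j$ are the row and column totals; $n$ is the number of valid responses.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{n},
\qquad
df = (r - 1)(c - 1)
```

The larger the gaps between observed and expected counts, the larger the statistic and the stronger the evidence against "no association".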

Is this association significant?
Hypothesis test - variables are
independent, i.e., there is no
association between them.
Null: There is no association between age and
type of transaction
Alternative: There is an association between
age and type of transaction.
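
A minimal sketch of the test for these hypotheses, using hypothetical counts for a 2 x 2 table (age 60 or under vs over 60, by Private vs Business). The critical value 3.841 is the standard table value for 1 degree of freedom at the 0.05 level; SPSS would report a p value instead.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under H0
            chi2 += (o - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical counts (NOT the course data set).
# Rows: age 60 or under, age over 60. Columns: Private, Business.
observed = [[60, 60], [28, 12]]
CRITICAL_05_DF1 = 3.841  # chi-square table value, df = 1, significance level 0.05

chi2, df = chi_square_statistic(observed)
decision = "reject H0" if chi2 > CRITICAL_05_DF1 else "cannot reject H0"
print(round(chi2, 3), df, decision)
```

With these made-up counts the statistic exceeds the critical value, so the null hypothesis of no association would be rejected; with real data the outcome depends entirely on the observed table.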


A small exercise to try at home


Level: Easy - Divide the population based on gender
Conduct chi-square tests to see if there is a difference in their preference

Level: Medium - Divide the population into two groups based on age
Age >60
Age <60

Main Points from this module

Examination of your data and data reduction help you, the analyst, to get a feel for what the data is about. A range of simple techniques helps with this task.

Cross tabulation, with an associated Chi-square test, allows you to find whether the association between the two (or more) variables is significant.
i.e. Chi-square tests for the significance of the cross tab.

Profiling the Customer:
Gaining Further Understanding of the Target Market

t-Tests
One Sample, Independent & Paired

Outline
Extending to techniques that allow us to test
assumptions about the data and differences between
groups of respondents.
One sample t-test
Independent t-test
Paired t-test

Key understanding from today:
So, from the techniques we cover today and next week, we can:
Understand our target market more
Understand similarity of thoughts, behaviours etc. between different basic groups within our target market,
and hence start to understand whether there are patterns in the data

Are men and women similar in their liking for _______?
Do older people like ______ more than younger ones?
Do people in ____ spend more time travelling than the ones in _____?


Going from the sample to the population
From the survey (i.e. the sample) we obtain descriptive statistics
However, we often need more information: we need to extend our findings from the sample to gain an understanding of the population

How is this done?

Sample vs Population

            Sample (statistics)   Population (parameters)
mean        x̄                     μ
st dev      s                     σ
percentage  p                     π
slope       b                     β


Using our sample we can ...


Estimate a Parameter
How well does the sample information reflect the true
population?
Uses the sample information to compute an interval
which describes the range of the parameter

Test a Hypothesis
Does the sample reflect (a manager's) prior belief about
the population?
Uses information/evidence from the sample to infer about
the population parameter
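
Estimating a parameter can be sketched as a confidence interval for the population mean. The version below uses the normal approximation, which is an assumption that is reasonable for large samples only; the summary figures are hypothetical.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Large-sample interval: mean +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sample_sd / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Hypothetical summary figures, not taken from the course data set.
low, high = mean_confidence_interval(sample_mean=5.0, sample_sd=2.0, n=400)
print(f"95% CI for the population mean: {low:.3f} to {high:.3f}")
```

The interval narrows as the sample grows and widens when responses vary more, which ties the estimate back to the sample design decisions in Step 5.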

Hypothesis Testing
From past experience we may have some
assumptions about the relationships between
variables or on how consumers may rate certain
aspects of our products

Aim: To examine whether a particular proposition


(hypothesis) concerning the population is likely to
hold


A Sampling Distribution: 2-tail test, α = 0.05
[Figure: sampling distribution centred on 0, with a rejection region of α/2 = 0.025 in each tail.]

Developing Hypotheses

Null Hypothesis
Statement that asserts the status quo.
One always tested by statisticians & market researchers.

Alternative Hypothesis
Statement that is the opposite of the null hypothesis.
Difference is not simply due to random error.


Let's put this in context

All techniques are conducted to help find out information on our questions about the data.
Currently, we are aiming to understand the data: how respondents answered the questions, their views etc.
So, let's look at one of the questions in the data: views on performance of staff

Example:
Past evidence suggests that on average, people are indifferent to using the car as a source of enjoyment (i.e. they answer in the middle of the scale: neither agree nor disagree).
However, we think that this may not be so (our hypothesis).


Example:
Null hypothesis:
H0: μ = 4
i.e. On average, customers neither agree nor disagree to the car being a major source of entertainment
Alternative hypothesis:
H1: μ ≠ 4
i.e. On average, customers agree or disagree to the car being a major source of entertainment

Example
From the SPSS output, we found
sample size: 154
sample mean: 3.84
test statistic (t value): -1.12
significance level: 5% (0.05)
probability of obtaining a test statistic at least this extreme
(sig (2-tailed) - p value): 0.264
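
How a t value like the one above arises can be sketched by hand: t = (sample mean - hypothesised mean) / (s / sqrt(n)). The ratings below are hypothetical (the course data are not reproduced here), and the critical value 2.262 is the standard t-table value for 9 degrees of freedom at the two-tailed 0.05 level.

```python
import math
import statistics

def one_sample_t(values, hypothesised_mean):
    """t statistic and degrees of freedom for H0: population mean = hypothesised_mean."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    t = (statistics.mean(values) - hypothesised_mean) / standard_error
    return t, n - 1

# Hypothetical ratings on a 1-7 scale (4 = neither agree nor disagree).
ratings = [2, 3, 3, 4, 4, 4, 5, 3, 2, 4]
T_CRITICAL_DF9 = 2.262  # two-tailed, significance level 0.05, df = 9

t_value, df = one_sample_t(ratings, hypothesised_mean=4)
decision = "reject H0" if abs(t_value) > T_CRITICAL_DF9 else "cannot reject H0"
print(round(t_value, 3), df, decision)
```

Here the sample mean (3.4) is below 4, but the statistic stays inside the allowable range, so the sample does not provide enough evidence to reject H0.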


(Student's) t-distribution

General Interpretation
Do NOT reject H0 if:
test statistic is in the allowable range, i.e. the p value is greater than 0.05
If you do not reject H0 this means:
based on the sample there is NO evidence to indicate that the population parameter is not equal to the hypothesised (in H0) value
In other words, the sample is consistent with the population parameter being equal to the hypothesised value (we cannot reject H0, but we do not say that we accept it)


Example
From the SPSS output, we found
sample size: 155
Sample mean: 3.35
test statistic (t value): -4.02
significance level: 5% (0.05)
probability of obtaining test statistic or greater value
(sig (2-tailed) - p value): 0.00

General Interpretation
Reject H0 if:
test statistic is not in the allowable range, i.e. the p value is less than 0.05
If you reject H0 this means:
based on the sample there IS evidence to indicate that the population parameter is not equal to the hypothesised value
In other words, a sample result like this would be very unlikely if the population parameter were equal to the hypothesised value.


Implications from our example


Since our test statistic is not in the allowable
range, we Reject H0
There is evidence which indicates that for
average customers buying a luxury car is about
being good to oneself.
How do we know this?

But do we stop here with our question?
We now have some understanding of people's views on the performance of the company, but do all groups of people think the same?


Further Understanding of the Target Market
What we have already seen:
do respondents care about a certain attribute (diff from mean?)
Not necessarily the mean

The next question: How do sub-groups in the target market differ? A very important question for the STP process and the first step in marketing strategy.
Sex
Education
Income

Some Techniques for Understanding the Target Market
No. of groups/samples?
2: Groups/samples related?
   No: Independent t test (two groups of responses that are tested as though they may come from different populations)
   Yes: Paired t test (two groups of responses that originated from the same population (people))
>2: ANOVA (more than 2 groups of responses)

Independent samples t test


Examines differences between 2 groups
E.g. differences in behaviour, attitude
Assumes samples are independent, i.e. cannot
belong to both groups
Uses the information in the samples to test whether
the 2 populations are distinct or not -
Is there a difference in the population averages?

Independent samples t test (2)
Variables required:
One to describe the groups (e.g., gender, usage)
Needs to be nominal (preferably)
One variable you are interested in, to see if there are differences between the groups
Needs to be scale (preferably)


Question ... Hypothesis
So, continuing with the overarching question (what are people's views of the company?):
Are there differences between males and females in their thoughts on issue number ________?

Break down into more specific ANALYSIS questions:
Enables you to look at the broader question from different angles
Do males and females differ in how important comfort in a car is?
Do males and females differ in the weight they place on their family's opinion?

Let's follow this through...
Analysis Question:
Do males and females differ in the weight they place on
their family's opinion?

Need to translate each Analysis question into a Hypothesis


First... look at the questionnaire
Q: What is your sex?
Male/Female

"When buying a new luxury car, my family's opinion is very important to me." (Issue 12 on Survey)
Scale: Strongly disagree ... Strongly agree

Samples distinct or not?


Developing the Hypothesis
First step in testing hypotheses is to develop the
hypotheses to be tested.

Hypotheses are developed prior to the collection of data & are part of a research plan.

Hypotheses allow a researcher to make comparisons between two groups of respondents and to determine if
there are important differences between the groups.


Null hypothesis
No difference between the groups - in terms of
their average value
No difference between the population
parameters
i.e. $\mu_1 - \mu_2 = 0$ OR $\mu_1 = \mu_2$
In other words, for our example:
H0: Males and females do not differ in the importance
of their family's opinion

Alternative Hypothesis
There is a difference between the groups - in terms of
their average value
There is a difference between the population
parameters
i.e. $\mu_1 - \mu_2 \neq 0$ OR $\mu_1 \neq \mu_2$
In other words, for our example:
H1: Males and females DO differ in the importance of
their family's opinion


Understanding the results


By looking at the sample means you may see a
difference
Is this difference significant (ie is it statistically
meaningful)?
Large enough to say that the average value of these 2
groups is different
Implying that the 2 populations are distinct.
Depends on sample size and confidence level

Output from Example
From the SPSS output we get
sample sizes: 56 and 87
variances equal or not? Equal (p=0.09)
means: 3.66 and 3.34
t value (test statistic): -1.04
p value (sig value): 0.3
Decision: Cannot Reject H0


Interpretation (1)
When the test is significant (p value below 0.05), there is enough evidence from the sample to reject
the null hypothesis
Males and females do differ in their average opinion
on a car being a source of fun.
Go back to the sample descriptives to describe
what the difference is, i.e. which group, on average,
agrees more strongly?

Interpretation (2)
There is a significant difference between Males
and Females in their opinion of a car being a
source of fun and excitement. Females tend to
agree more strongly with this statement
compared to males (average 4.27 compared to 3.5,
p=0.009).


Paired t test
Examines differences between 2 groups of
responses
One set of people, 2 sets of answers
before and after experiment
same respondent answering 2 questions
Check on consistency of answers
Matching of questions
(eg difference in importance and performance of a
range of attributes)
Assumes some connection between the questions

Overarching question: what do customers want in a
service company?
What attributes are important?

Analysis questions:
Are all attributes equally important?
What are customers' views on the importance of pleasant
interiors and importance of staff being helpful?

Analysis question -> Hypothesis


Question to investigate
Analysis question: Is there a difference in how
customers rate the importance of speed and
mileage?

1 sample, 2 sets of responses

Both variables are scale (preferably)

First... go back to the data
Q: On the following scale, please indicate your view:

(1 = Critical importance, 7 = Minor importance)
Car Attribute Safety    1 2 3 4 5 6 7
Car Attribute Mileage   1 2 3 4 5 6 7


Null Hypothesis
No difference in the responses to the 2 questions
Mean of the differences is 0
i.e. $\mu_{diff} = 0$

In other words, for our example:

H0: On average, there is no difference in the way customers rate the importance of speed and mileage.

Alternative hypothesis
There is a difference in the responses to the 2
questions
Mean of the differences is not 0
i.e. $\mu_{diff} \neq 0$

In other words, for our example:

HA: On average, there is a difference in the way customers rate the importance of speed and mileage.


Understanding the results


Look at the mean of the differences (i.e. the
difference between a respondent's answer to QA and
answer to QB).

Is this mean significantly different from 0?

Variable used in the paired t-test

ID Speed Mileage Diff

1 3 4 -1
2 7 7 0
3 2 5 -3
4 5 5 0
5 6 7 -1
6 4 4 0
7 2 7 -5
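
To make the role of the Diff column concrete, here is a small sketch that recomputes the differences for the seven respondents listed above and forms the paired t statistic (mean of the differences divided by its standard error). It only illustrates the arithmetic; the p value would still be read from SPSS or a t table with n - 1 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Ratings for the seven respondents shown in the table above
speed = [3, 7, 2, 5, 6, 4, 2]
mileage = [4, 7, 5, 5, 7, 4, 7]

# One difference per respondent: this is the variable the paired t test analyses
diffs = [s - m for s, m in zip(speed, mileage)]

mean_diff = mean(diffs)
std_error = stdev(diffs) / math.sqrt(len(diffs))
t_stat = mean_diff / std_error
df = len(diffs) - 1
print(diffs, round(mean_diff, 2), round(t_stat, 2), df)
```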

First .
Look at sample descriptives to gain an
understanding of
what to expect, e.g. does the difference appear large or
small?
How to interpret, e.g. what variable may be rated higher
(or lower)

Output from the Example
From the SPSS output we get:
sample size: 154
Correlation: NS
Mean of the differences: -0.26
t value: -0.181
p value (sig value): 0.856
Decision: Cannot Reject H0


Interpretation
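If the p value is below the significance level, there is enough
evidence from the sample to reject the null hypothesis:
there is a difference in the way people rate the
importance of the two attributes
(With p = 0.856, as in the output above, H0 cannot be rejected.)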
Always check: Significant - but is it a meaningful
difference?
Is difference large enough to act upon or is it just
significant due to a large sample size?

The three tests from this module
One sample t-test:
Tests assumption about the mean of a variable
Independent sample t test:
tests for a difference between means of 2 unrelated
samples
Paired t test:
tests whether there is a difference in the way one sample
has answered 2 questions.

ANOVA Setup and Analysis

Examining Multiple Groups

Understanding sub-groups within your target market

In the previous module we looked at the target market in terms of 2 distinct groups
What if there are more than 2 distinct groups of
people?
Do a number of 2 group tests? No.
Leads to an increase in the overall error (significance level)

Use ANOVA (Analysis of Variance)
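
The increase in overall error can be sketched numerically. The snippet below assumes the pairwise tests are independent, which is a simplification, and uses a hypothetical helper name; it shows how the chance of at least one false positive grows with the number of 2 group tests.

```python
from math import comb

def familywise_error(n_groups, alpha=0.05):
    """Number of pairwise tests and the chance of at least one false positive.

    Assumes the tests are independent (a simplifying assumption).
    """
    n_tests = comb(n_groups, 2)
    return n_tests, 1 - (1 - alpha) ** n_tests

n_tests, fwer = familywise_error(4)  # e.g. 4 age groups
print(n_tests, round(fwer, 3))
```

With 4 groups there are 6 pairwise comparisons, so the overall error is well above the nominal 5% used for each single test.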

Example:
Broad question:
Let us analyse, one at a time, what attributes are
important to different types of customers?

Analysis Question: Does the importance of various car attributes differ according to age?
Age recoded into 4 groups or constant at 5:
17-35yrs
36-45yrs
46-55yrs
55yrs+


One-Way ANOVA
Let's take one attribute
Importance of gas mileage
rated from 1 (critical importance) to 7 (minor importance)

Therefore our question is now:
Does the average rating of importance of
mileage differ by age?

One-Way ANOVA
Only one categorical variable (a single factor)

Several levels (categories) for that factor

The typical hypothesis tested through ANOVA is that the factor is irrelevant to explain differences in the dependent
variable (i.e. the means are equal, as in t-tests)

Apart from the tested factor(s), the groups should be safely considered homogeneous
with each other


How does ANOVA work?


Tests to see whether the groups have the same
average values (ie come from the same
population)
Null hypothesis (Ho): all the means are equal
(Ha) : at least one mean is different

2 variables:
One defines the groups (indept var or factor) variable is
..
The other defines what you have measured (dependent
var) variable is ..

Based on the variation between and within the groups

[Figure: responses to a question (1 to 5) for three age groups (Age <35, Age 35-55, Age >55). Variation between the groups is the between variance (between levels of respondents); variation inside each group is the within variance.]

[Figure: within-group and between-group variation (Sudaman & Blair 1998)]

What does this mean for us?
Interested in whether the between groups variation is
much greater than the within groups variation.

If it is, then we have evidence that the groups do have different average values

Then interested in discovering which group, or groups, have different average values


The basic principle of the ANOVA


If the variation explained by the different factor
between the groups is significantly more relevant
than the variation within the groups, then the factor is
assumed to be statistically relevant in explaining the
differences

The test statistic
The test statistic is computed as:

$F = \dfrac{s_B^2}{s_W^2} = \dfrac{\text{Variance between groups}}{\text{Variance within groups}}$

This test statistic compares the weight of the
variance explained by the factors to the weight of the
variance not explained by the factors
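
A minimal sketch of how this ratio is formed, using made-up ratings for three groups (not the survey data). The between-groups variance is the between sum of squares divided by (number of groups - 1), and the within-groups variance is the within sum of squares divided by (number of observations - number of groups).

```python
from statistics import mean

def one_way_f(groups):
    """F statistic for a one-way ANOVA, with its two degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each response around its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k

# Hypothetical importance ratings for three age groups
ratings = [[2, 3, 2, 3], [3, 4, 4, 5], [5, 6, 5, 6]]
f_stat, df_between, df_within = one_way_f(ratings)
print(round(f_stat, 2), df_between, df_within)
```

A large F means the group means are spread out much more than the responses inside each group, which is the pattern ANOVA looks for.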


Hypotheses
Null Hypothesis:
The average rating of the importance of ... does not
differ by age
i.e. Average (17-35yrs) = average (36-45yrs) = average (46-55yrs) = average (55yrs+)
$\mu_1 = \mu_2 = \mu_3 = \mu_4$
Alternate Hypothesis:
The average rating of the importance of ... differs by age
i.e. At least one average of the subgroups is different from the other
averages
E.g. $\mu_1 \neq \mu_2 \neq \mu_3$ OR $\mu_1 = \mu_2$, however $\mu_3$ and $\mu_4$ are
different etc.

Underlying Assumptions
Normality of the dependent variable
Plots (primarily) and other tests
Homogeneity (equality) of variance across the
groups
This can be relaxed.


Example
SPSS output:
Test statistic - F:
p value (sig value):

Decision: ..H0

Interpretation
What does this mean?
There is evidence from our sample indicating that
the importance of .. differs by age.

Where do we go from here?


Question ...
Which group or groups are possibly different?
This is not given through the hypothesis test!
For our case, we can examine the sample means - is
this reliable?

Need more information than your own ability to
distinguish between numbers!
Especially if you have many sub-groups, as is the case
here!


Use of Post Hoc Tests


Allows you to understand which group or set of
groups are different in average value to the rest.
The test you choose depends on whether the
groups can be assumed to have equal variance
or not.
Equal var: Tukey, Duncan, Scheffe
Unequal var: Tamhane's T2, Dunnett's T3
Re-examine the SPSS output
Conclusion now?

Levene's test
It tests the null hypothesis that the population
variances are equal across all the sub-groups being
examined.

If p<0.05, then population variances are significantly different.


Interpretation
The importance of speed of repairs does differ by
age. We just identified which ones are different!

Warning!!
Do NOT do a series of 2 sample tests to discover
these differences!
You will artificially (and incorrectly) increase the chance of
finding a statistically significant difference in your sample!

YOU CAN HOWEVER COMPARE THE MAXIMUM AND MINIMUM MEANS to test if a difference exists.


NOTE:

Only go to the Post Hoc tests if the ANOVA result is significant
(i.e. p value < 0.05)

Main Points from ANOVA
ANOVA helps understand differences between
more than 2 independent groups
Post Hoc tests will identify which group(s) are
different
Choice of post hoc test depends on whether equal
variances can be assumed or not

Repeated Measures
ANOVA

Examining Multiple Groups - 2

Outline
Further Expanding our toolbox of techniques
Extension beyond 2 sets of responses:
Repeated Measures ANOVA

Extension beyond 2 sets of
responses
i.e., looking beyond the paired t-test
Difference in more than 2 sets of responses by the
same individuals
Need to account for the fact that the responses are
not independent
i.e., each respondent has provided information on each
question
Need to use Repeated Measures ANOVA
Through General Linear Models


What is GLM
The general linear model (GLM) is a statistical linear
model. It may be written as
Y = XB + U

where
Y is a matrix with series of multivariate responses
X is a design matrix
B is a parameter matrix (to be estimated)
U is a matrix containing errors

The errors are usually assumed to follow a multivariate normal distribution.
Examples of questions
Has consumers' attitude towards brand X changed
over time (measured monthly for the last 6 months)?
Does on-going training improve participants' skill
development?
Is there a difference in respondents' liking of the four
brands of soft drink?


Question -> Hypothesis
Broad Question:
What attributes are important to consumers?
Analysis Q:
Are all the attributes equally important?
H0: There is no difference in the average rating of
importance of the listed attributes.
H1: There is a difference in the average rating of
importance of the listed attributes.

Steps to RMA

General Linear Model -> Repeated Measures
Define within-subject factor. In our case it is _____?
Give the measure a name: what does the scale
signify? Importance of the attribute.
Click Define and identify all the variables being
compared.
Click on the options button.
Display means for attributes
Compare main effects: Bonferroni


Interpretation
There is a significant difference in people's average
importance ratings of the various attributes of a customer service
company (p<.05).

Interpretation
Attribute 1 is statistically different in importance from
attributes 3,4,5,7..


Main Points from Repeated ANOVA


Repeated measures ANOVA helps you to
understand differences between more than 2
related answers from the ________
Once again, post hoc tests will identify which
group(s) are different

Think about what the results are telling you:
try to group the like variables together.

Module 6 - Exploring Relationships

Relationships in general
Revisiting Crosstabs and Chi-square
Introduction to Correlation

Outline

Where are we: what can we currently say about our data?
Types of relationships between variables
Relationships between 2 nominal variables
Revisiting cross tabs and Chi-square
Correlation
Pearson correlation
Spearman rank correlation
Introduction to Regression

Recap

What types of questions can we answer so far?


Examples:
What is the demographic profile of our sample?
Do they have definite opinions on topics?
On average, do different subgroups (eg
demographic, geographic, usage) differ in their
response or behaviour?
Are the general patterns consistent or not?


Let's expand our investigation further:

What other questions may we ask ourselves about the data set we are investigating?
What attributes influence overall performance
perceptions of airlines?
What factors are associated with reading more
books?
What factors impact on satisfaction or likelihood of
purchase?
These are examples of questions for testing the
association/relationship between variables

Recall our types of analysis

Descriptive Analysis
E.g., Means, medians, frequency, standard deviation
Inferential and Difference Analysis
E.g., T-tests, ANOVA

Associative Analysis
E.g., Correlation, Crosstabs with chi-square

Predictive Analysis
E.g., Regression


Associative Analysis

Used to determine systematic relationships among variables

Are the variables related?
If so, how are they related?

Types of relationships

Non-monotonic
Monotonic
Linear
Curvilinear and non-linear


1. Types of relationships - Non-monotonic

Presence or absence of a variable is associated with presence or absence of another variable
No discernible direction to relationship (or not interested in
exploring), but a relationship exists
Variables:
type of data: at least one is nominal (or made
nominal)

Examples

Is there an association between gender and type of information source used?
Is there an association between type of student and the
type of environmentally friendly product they would
purchase?


2. Types of relationships - Monotonic

Can only assign a general direction to the association between two variables
Monotonic increasing
one variable increases as the other also increases
Monotonic decreasing
one variable decreases as the other variable
increases
No indication of the amount of change
Variables: type of data: both ordinal (subject to some trivial
exceptions)

Examples

Is there a relationship between age and amount of influence a child has on choosing their clothes?

Is there a relationship between education level and interest in different sports at the Olympics?

Is there a relationship between age and how often people purchase a cause-related product?

This is also Monotonic


3. Types of relationships - Linear

A straight-line association between two variables
More precise and more information than a monotonic
relationship
The amount of change is able to be calculated
y = a + bX
Variables:
type of data: both scale (metric); we will learn how to deal
with other variable types later.
E.g. Does likelihood of purchasing brand X increase
with an increase in attractiveness of the package?

Example

[Figure: Sales vs Price. Sales (y axis, 0 to 40) plotted against price (x axis, 0 to 6)]


4. Types of relationships - Curvilinear

The association between the 2 variables is described by a curve rather than a straight line.

E.g. U shape, J shape
Variables:
type of data: both scale (metric)

Kuznets curve


J Shaped Curve

Example - curvilinear relationship

[Figure: Income vs Age. Income (y axis, 0 to 70) plotted against Age (x axis, 0 to 100)]


Describing relationships:- Characteristics

Presence
a relationship exists between 2 variables
Direction
is the relationship positive or negative
Strength of association
strong, moderate, weak or nonexistent
how consistent is the relationship

Summary of relationship characteristics

               Presence   Direction   Strength
Nonmonotonic   Yes        No          No
Monotonic      Yes        Yes         No
Linear         Yes        Yes         Yes
Non-linear     Yes        Yes         Yes


So, How do we test for these associations?

We can use crosstabs with Chi-square and Correlation.

The actual procedure (test) depends on the type of data we have!

Examining Non-monotonic relationships -
Recap on Crosstabs & Chi square
Crosstabs with Chi-square are used to assess the
presence or not of a non-monotonic relationship

Recall
Variables are nominal/ordinal
Counts the number of observations in each possible sub-
group or cell
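
As a hedged sketch of what the Chi-square procedure does with those cell counts, the snippet below compares the observed counts in a small crosstab with the counts expected if the two variables were unrelated. The 2 x 2 counts are invented for illustration only.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a crosstab of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if there were no association
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical crosstab: rows are two regions, columns are two transaction types
counts = [[30, 10], [20, 40]]
stat, df = chi_square(counts)
print(round(stat, 2), df)
```

The larger the gap between observed and expected counts, the larger the statistic and the smaller the p value reported by SPSS.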


Example:

Question: Is there any association between region and type of transaction?

Is this association significant?
Type of association?
Null: There is no association between region and type of
transaction.
Alternative: There is an association between region and
type of transaction.

From the example

SPSS output
p level: 0.005

Reject null

There is evidence to say that there is an association.
Need to now go and describe the association. How?


Remember

Chi-square only tells if there is an association or not


It does NOT describe the association or say how strong it is
Need to gain this information through other means
Revisit the crosstab
Use other statistics

Linear relationships - Correlation Analysis

Measured through the correlation coefficient: r


Range from -1 to +1
Absolute size of coefficient indicates the amount of
association - its strength
Sign indicates the direction
Correlation based on the degree of covariation among the
variables


The three steps of correlation and regression


1. Look at the scatter plot (maybe run a correlation)
2. Conduct the regression
3. Interpret the results

What different levels of correlation look like.

(Churchill 2000, Fig 21.6)


Rules of thumb

0.81 - 1.0 strong


0.61 - 0.8 medium
0.41 - 0.6 weak
0.21 - 0.4 very weak
0.00 - 0.2 none
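
The sketch below computes a Pearson correlation coefficient from first principles and then labels its strength using the rules of thumb above. The data are invented, and the helper names are assumptions made for this example rather than anything from SPSS.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)

def strength(r):
    """Label the absolute size of r using the rules of thumb listed above."""
    size = abs(r)
    if size > 0.8:
        return "strong"
    if size > 0.6:
        return "medium"
    if size > 0.4:
        return "weak"
    if size > 0.2:
        return "very weak"
    return "none"

# Hypothetical scale ratings from five respondents
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
print(round(r, 2), strength(r))
```

The sign of r gives the direction and its absolute size gives the strength, so the label is based on abs(r).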

Pearson Correlation

Variables need to be scale (very few exceptions to this).


Measures the degree of association between 2 variables
Correlation
NOT cause and effect
ONLY relationship between the 2 variables
ONLY LINEAR relationship
Remember there are a number of possible meanings
of a correlation of 0.


Example

Broad question: Is there any relationship between perceptions of the customer service company?

Analysis Q: Is there an association between importance of staff competence and value for money?

Null: There is no relationship between the importance of staff competence and the importance of value for money.
i.e. Null: $\rho = 0$

Alternate: There is a relationship between the importance of staff competence and the importance of value for money.

Interpretation

Correlation coefficient: r = 0.391


Strength of association?
Significant: Yes, p value = 0.000

There is a positive relationship between the importance of staff competence and importance of value for money
(r=0.391, p=0.000); however, by the rules of thumb (0.21 - 0.4) the relationship is very weak.
The association is linear.


Relating Correlation and Regression

In many situations, people do not just want to describe the relationship between two variables; they want to go
deeper to predict the effects of one variable on the
other.
Therefore you can extend beyond correlation to
regression!
Let's look at a plot of the relationship between 2 variables

Scatter Plot
What can it tell us?
How precisely can one describe the relationship?

[Figure: scatter plot of Number of Cars (y axis, variable RECMD46E) against Number of Kids (x axis, variable TPERF46C)]

Regression

What is it?

Specification of the relationship (linear)
Y = a + b*X
Terminology

Predictor variables: the independent or X variables
metric variables
Criterion variables: the dependent or Y variable
MUST be a scale variable (in simple regression)
Error or residual: difference between the Y value you have
observed and the Y value predicted from the regression line.

Principles of regression

Equation: $Y = a + bX + \varepsilon$
$\varepsilon$ is the error
Estimation through the method of least squares:

i.e. finds the line which minimises the sum of the squared
errors over all the x values (minimises $\sum e^2$)
Assumptions
error ($\varepsilon$) has mean of 0
variance of the error terms is constant
variance of errors is independent of the values of X
errors are normally distributed
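
A minimal sketch of least squares estimation for one predictor, using invented data. It finds the slope and constant that minimise the sum of squared errors, then checks two consequences: the residuals sum to (approximately) zero, and R2 is the share of variation in Y that the line accounts for.

```python
def fit_line(x, y):
    """Least squares estimates of a (constant) and b (slope) for Y = a + b*X."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    return a, b

# Hypothetical predictor (X) and criterion (Y) ratings
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)

# Residual: observed Y minus the Y predicted from the regression line
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
mean_y = sum(y) / len(y)
r_squared = 1 - sum(e ** 2 for e in residuals) / sum((yi - mean_y) ** 2 for yi in y)
print(round(a, 2), round(b, 2), round(r_squared, 2))
```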


[Figure: scatter plot, simple case (RECMD46E against TPERF46C). The error e = Yactual - Ypred is the vertical distance between an observed point (Y actual) and the regression line (Y pred).]

Principles of regression (assumptions)


Example
regression with 1 independent variable

Analysis question:
Is the overall satisfaction with the car usefully predicted by
satisfaction with the attribute comfort?
Dependent (Criterion) variable (Y):
Overall satisfaction
Predictor variable (X):
Comfort of the car

First:
Look at:
correlations
scatter plot
Why?


Output

Look at:

R2: coefficient of determination
percentage of the variation in the Y variable explained by the
X variables
Results of ANOVA
significance of the whole regression procedure
Significance of coefficient
Does this variable make a significant contribution in explaining
the Y variable?
Size of coefficient

Equation -

Resulting equation:
Ov Sat = 1.878 + 0.65 * Comfort Satisfaction
Interpretation of equation and coefficient:
For each unit increase in satisfaction with comfort, overall satisfaction
increases by 0.65 units

Satisfaction with comfort is significant in predicting overall satisfaction (p level for this variable: p=0.000); this
relationship is moderate (R2 = 23.2%)

Module 7 - Exploring
Relationships

Multiple Regression (Multi-variate Analysis)


Explore a couple of procedures for doing this
Understand possible problems, in particular multi-collinearity,
and how to deal with them
Extending the applicability of regression
Incorporation of non-metric independent variables

Recap: simple regression

Dept var (Y): what you are interested in
predicting/estimating
Indept var (X): have control over these

Want: correlation between DV and IV

Regression: Finds line of best fit
Precisely describes linear relationship between DV & IV
Is the IV useful in predicting Y (does it impact Y)?

Extend this to Multiple Regression
Marketing relationships are complex: need to go
beyond simple regression
Equation:
Y = a + b1X1 + b2X2 + b3X3 + ...
Additional assumption now:
Predictor or independent variables (Xs) are uncorrelated
If this is not obeyed, you have the problem of
Multicollinearity


Multicollinearity
Why is this a concern?
Affects the significance of the coefficients
Reduces the efficiency of the estimates (of the
coefficients)
Creates problems interpreting the coefficients
Subsequent use of the coefficients
Problems if want to use as a basis of what to use in
strategy - Since here you are interested in the
importance of each variable

Multicollinearity

How do you test this?


Correlation between pairs of variables
Condition Index
If CI >15, there is possibly a problem
If CI >30, there is definitely a problem
Tolerance (amount variability in selected IV not explained
by other IVs)
If Tolerance <0.1 or 0.2
VIF >10 or 5
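
Tolerance and VIF carry the same information, since VIF = 1 / Tolerance, which is why the cut-offs above mirror each other (0.1 with 10, 0.2 with 5). The sketch below covers only the simplest case of two predictors, where the variability in one predictor explained by the other is the squared correlation between them; the function name and correlation values are hypothetical.

```python
def tolerance_and_vif(r_between_predictors):
    """Tolerance and VIF for a model with exactly two predictors.

    With two predictors, the R2 from regressing one on the other equals
    the squared correlation between them.
    """
    tolerance = 1 - r_between_predictors ** 2
    return tolerance, 1 / tolerance

for r in (0.5, 0.95):
    tol, vif = tolerance_and_vif(r)
    print(r, round(tol, 4), round(vif, 2))
```

With more than two predictors the tolerance of each IV comes from regressing it on all the other IVs, which SPSS reports in its collinearity diagnostics.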


Procedures for Multiple Regression


1. ENTER method:
You have control over variables put in and taken out of regression equation
2. STEPWISE methods:
Variables enter and leave regression equation according to predefined rules

Procedure 1
ENTER Method

We go through this so that you understand the process behind multiple regression.


Example - ENTER method:


Research Question
What impacts on customers' overall satisfaction?
Analysis question:
Are perceptions of the customer service company's
performance useful in predicting overall satisfaction?
If so, what aspects of performance are the most
influential?
Criterion/Explained/Dependent variable?
Predictor/Independent/Explanatory variables?
Method:
Enter method

Output

Fit of equation?
Usefulness of the regression procedure?
Which variables significant?
Problem of multi-collinearity?
Look at VIF>8
What do you do now?
Remove the offending variable(s) and re-run the
regression


Interpretation of equation
Ov Sat = Constant + b1*X1 + b2*X2 + ...

Interpretation of coefficients

The average change in Y for a unit change in the X, given that all other X variables are held constant
For our example:
Overall satisfaction will increase by ..

Interpretation

Relative importance of the variables


variables may be measured on different scales
Use standardised betas (and/or t-values) to
compare importance of the different X variables
What variable is more influential in our equation?


Report Interpretation - an illustration


Certain perceptions of the performance of the
customer service company have been found to
impact on a customer's overall satisfaction. For
instance, positive perceptions of X1 and X2
increase a person's overall satisfaction (R2 = 0.xx),
with perception of X1(b1=0.xx) having the strongest
impact. X3 and X4 were some factors that did not
impact on overall satisfaction (p>0.05).
Conducting Regression Analysis
1. Scatter plots
2. Choose variables
3. Run regression
4. Significance of overall procedure
5. Overall fit of equation
6. Assumptions obeyed?
7. All predictor variables significant?
8. Use equation
Where a check gives NO: Rethink variables

Procedure 2
STEPWISE REGRESSION

This is the procedure you would be expected to understand and use.

What is Stepwise Regression?

Used for sorting through a number of independent variables

From your set of indept variables, it will provide you with a subset of variables which are useful (i.e.
significant) in predicting your dependent variable


Stepwise regression
Various types:
Forward - one in, then adds one at a time
Backwards - all in, then eliminates one at a time
Stepwise - variables can enter or leave the
equation at each step depending on their
contribution to explaining the dependent variable

Stepwise regression

Warning:
may not produce the best equation
Variables in equation are related to the
multicollinearity present

Therefore, does not replace common sense!


Example:

Analysis Q: What combination of the following variables is useful in predicting the overall
satisfaction with the customer service company?

How do the results compare with our previous results (using Enter method)?

Testing Assumptions of Regression


Assumptions of regression

Variables normally distributed


Errors have equal (constant) variance
Errors uncorrelated
Independent variables uncorrelated

Definitions and testing

Heteroscedasticity
variance of the errors is not constant
Testing: plot residuals (y axis) against predicted y (x axis)
Autocorrelation
the size of the error is related to time; errors are not independent
Testing: plot residuals (y axis) against time (x axis)
Normality
Testing: normal probability plot
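
The quantities behind the first two plots can be computed directly. The sketch below fits a one-predictor regression by hand and lists, for each observation, the time index, the predicted y and the residual. The data are made up, and the rows are assumed to be in time order.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


# Hypothetical data, assumed to be listed in the order it was collected.
quality = [1, 2, 3, 4, 5, 6, 7, 8]
satis = [1.5, 1.5, 2.5, 4.5, 5.5, 5.5, 6.5, 8.5]

intercept, slope = fit_line(quality, satis)
predicted = [intercept + slope * q for q in quality]
residuals = [s - p for s, p in zip(satis, predicted)]

# Columns to plot: residual against predicted (heteroscedasticity check)
# and residual against time (autocorrelation check).
print("time  predicted  residual")
for t, (p, e) in enumerate(zip(predicted, residuals), start=1):
    print(f"{t:>4}  {p:9.2f}  {e:8.2f}")
```

In the residual-against-predicted plot, a fan or funnel shape suggests non-constant variance; in the residual-against-time plot, long runs of same-signed residuals suggest autocorrelation.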


Testing the Assumptions

Plots of the residuals in SPSS

Plot standardised residual against standardised predicted y
Plot standardised residual against time

Heteroscedasticity


Autocorrelation

Ideal situation


Splitting data based on levels of IV


Does the relationship being studied differ for
Men and women?
The four income levels?
The five education levels?
The five age categories?

Should we look at the aggregate as well as the case scenarios?
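
One way to explore these questions is to fit the same regression to the whole sample and to each subgroup, then compare the equations. The sketch below does this for gender with a one-predictor regression. The records are made up, and the grouping label "All" is just a name chosen here for the aggregate.

```python
from collections import defaultdict


def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


# Hypothetical records: (gender, quality rating, satisfaction rating).
records = [
    ("M", 1, 2.0), ("M", 2, 4.0), ("M", 3, 6.0), ("M", 4, 8.0),
    ("F", 1, 3.0), ("F", 2, 3.5), ("F", 3, 4.0), ("F", 4, 4.5),
]

# Each record goes into its own group and into the aggregate group.
groups = defaultdict(list)
for gender, quality, satis in records:
    groups[gender].append((quality, satis))
    groups["All"].append((quality, satis))

fits = {}
for name, rows in groups.items():
    fits[name] = fit_line([q for q, _ in rows], [s for _, s in rows])
    intercept, slope = fits[name]
    print(f"{name}: Satis = {intercept:.2f} + {slope:.2f}*Quality")
```

With these made-up numbers the aggregate slope sits between the two group slopes, so the aggregate equation describes neither group well. That is the situation in which looking at the separate cases as well as the aggregate pays off.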

Expanding the Applicability of Regression


Incorporating non-metric Independent Variables
Usual scale for independent variable?
Non-metric variables - incorporated via dummy variables
Dummy variable is a variable that only takes the value 0 or 1
Dependent variable must always be metric!

Defining a dummy variable

Number of dummy variables needed equals the number of categories minus 1
i.e. No. of dummy variables = No. of categories - 1
Therefore, if you wish to incorporate gender into a regression, you would only need 1 dummy variable, D1, such that:
D1 = 1, if the respondent was male
D1 = 0, if the respondent was female
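
The "categories minus 1" rule can be sketched as a small Python helper. The function name make_dummies, the D_ prefix and the income categories are hypothetical choices for illustration. The category left out (the reference) is the one coded 0 on every dummy.

```python
def make_dummies(values, reference):
    """Return one 0/1 column per category, except the reference category."""
    categories = sorted(set(values) - {reference})
    return {f"D_{c}": [1 if v == c else 0 for v in values] for c in categories}


# Gender: 2 categories, so 1 dummy (female is the reference, coded 0).
gender = ["M", "M", "F", "F", "M"]
gender_dummies = make_dummies(gender, reference="F")
print(gender_dummies)

# Hypothetical income variable: 4 categories, so 3 dummies.
income = ["low", "mid", "high", "top", "mid"]
income_dummies = make_dummies(income, reference="low")
for name, column in income_dummies.items():
    print(name, column)
```

A respondent in the reference category scores 0 on every dummy, which is why one fewer dummy than categories is enough.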


X variables: Gender (coded as Dummy_gender), Quality
Y variable: Satis

Resp   Gender   Dummy_gender   Quality   Satis
1      M        1              4         5
2      M        1              4         5
3      F        0              3         3
4      F        0              2         1
5      M        1              5         4

Dummy variable regression

Overall Equation:
Satis = a + b1*Dummy_gender + b2*Quality

This reduces to -
Males: Satis = a + b1 + b2*Quality
Females: Satis = a + b2*Quality

The non-metric variable has the effect of changing the constant for each category.
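
To see the two constants appear, the overall equation can be fitted to the five respondents in the table using ordinary least squares. The sketch below solves the normal equations by hand so it needs no extra libraries. Five cases are far too few for any real inference, so treat the numbers purely as an illustration of the mechanics.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares coefficients: the constant first, then one per column."""
    rows = [[1.0] + [c[i] for c in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# The five respondents from the table.
dummy_gender = [1, 1, 0, 0, 1]
quality = [4, 4, 3, 2, 5]
satis = [5, 5, 3, 1, 4]

a, b1, b2 = ols([dummy_gender, quality], satis)
print(f"Overall: Satis = {a:.3f} + {b1:.3f}*Dummy_gender + {b2:.3f}*Quality")
print(f"Males:   Satis = {a + b1:.3f} + {b2:.3f}*Quality")
print(f"Females: Satis = {a:.3f} + {b2:.3f}*Quality")
```

For these five cases the fit gives roughly a = 1.29, b1 = 2.14 and b2 = 0.29. The male and female lines share the slope on Quality and differ only in the constant, by b1.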


Creating Binary Dummy variables

[Chart: Satis-female and Satis-male series; vertical axis 0 to 5, horizontal axis 0 to 8]


Incorporating dummy variables in regression
Covers the case when you have more than 2 categories for your nominal/ordinal variable (Remember we are talking about INDEPENDENT variables!!)

Main Points from Module

Regression provides the specification (i.e. equation) of the relationship between variables
for our cases, linear association
Breaking of assumptions leads to inaccurate interpretation and unreliable results
Check for multicollinearity
Know how to identify the key predictors and how to interpret a regression equation

Concentrate on understanding Stepwise regression