
For Class Circulation ONLY

22-12-2014

Session 1: Basic Scales and Data Preparation
Advanced Marketing Research
(with SPSS)

Dr. Vikas Goyal

Indian Institute of Management Indore


AMR @ Dr. Vikas Goyal

Key Concepts
Construct (or Concept or Variable)

A generalized idea about a class of objects, attributes, occurrences, or processes.
Relatively concrete constructs
Age, gender, number of children, education, income
Relatively abstract constructs
Preference, brand loyalty, satisfaction, happiness, etc.

Measurement

Deciding how to measure what you want to measure
Which (and how many) measures to keep? Ask open-ended or closed-ended questions?

Scaling

The generation of a continuum upon which measured objects are located.
Ask for actual income, or ask for bucketed income?


Scale Characteristics
Description
By description, we mean the unique labels or
descriptors that are used to designate each
value of the scale. All scales possess
description.
Order
By order, we mean the relative sizes or
positions of the descriptors. Order is denoted
by descriptors such as greater than, less than,
and equal to.

Scale Characteristics
Distance
The characteristic of distance means that
absolute differences between the scale
descriptors are known and may be expressed
in units.
Origin
The origin characteristic means that the scale
has a unique or fixed beginning or true zero
point.

Scaling
Type of scale depends on type of data!

Type of Scale (information content increases from top to bottom)
Nominal
Ordinal
Interval
Ratio
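As a rough illustration of how information content grows from nominal to ratio, the sketch below maps each scale type to the characteristics (description, order, distance, origin) defined earlier. The variable names and their assigned scale types are hypothetical examples, and treating a 1-to-7 rating as interval is a common convention rather than a rule.

```python
# Characteristics each scale type possesses, per the definitions above.
SCALE_CHARACTERISTICS = {
    "nominal":  {"description"},
    "ordinal":  {"description", "order"},
    "interval": {"description", "order", "distance"},
    "ratio":    {"description", "order", "distance", "origin"},
}


def supports(scale: str, characteristic: str) -> bool:
    """Return True if the given scale type possesses the characteristic."""
    return characteristic in SCALE_CHARACTERISTICS[scale]


# Hypothetical survey variables, tagged with an assumed level of measurement.
variables = {
    "gender": "nominal",
    "brand_rank": "ordinal",
    "satisfaction_1_to_7": "interval",
    "income": "ratio",
}

for name, scale in variables.items():
    print(name, scale, sorted(SCALE_CHARACTERISTICS[scale]))
```

Each step up the list adds exactly one characteristic, which is why a statistic that needs distance (such as a mean) is not meaningful on a purely nominal or ordinal variable.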


Classifying Scaling Techniques


Scaling Techniques
  Comparative Scales
    Paired Comparison
    Rank Order
    Constant Sum
  Noncomparative Scales
    Continuous Rating Scales
    Itemized Rating Scales
      Likert
      Semantic Differential
      Stapel


SPSS-Variable View
Column: What it Means

Name: This column provides the name of the variable. Unlike older versions, newer versions of SPSS are not limited to 8 characters, but lengthy descriptions should not be included in the Name; they go in the Label column.
Type: This column indicates the type of variable that is reflected in this particular row. There are 8 options: Numeric, Comma, Dot, Scientific notation, Date, Dollar, Custom currency, and String. Most variables beginning users will encounter are either Numeric or String variables. Numeric variables are numbers that represent a value. String variables are text and can only be treated as such; as a result, very few manipulations can be performed on them in SPSS.
Width: This column indicates the number of characters available for the variable values.

SPSS-Variable View
Column: What it Means

Decimals: This column allows you to control the number of digits displayed after the decimal point.
Label: This column allows you to provide a more extensive description of the variable.
Values: This column allows you to provide a key for what the numbers of a numeric variable represent (e.g., 1=Female, 2=Male).
Missing: This column allows you to indicate which values of a variable are to be treated as missing. Values marked as missing are excluded from analyses in SPSS.
Columns: This column indicates the display width of the variable's column in the Data View.
Align: This column indicates the alignment of the variable in the Data View.
Measure: This last column indicates the level of measurement of the variable. There are three from which you can choose: Nominal, Ordinal, and Scale (Scale covers both interval and ratio data).

Data Preparation
Missing Value Treatment
  User-defined missing values
  System-missing values

Coding
  Pre-coded
  Coding open-ended questions
  Re-coding

Compute Variable
Sub-setting data
  Select-if
  Split file
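The data preparation steps listed above are menu operations in SPSS; the sketch below mirrors their logic in plain Python so the intent of each step is visible. The records, the missing code 9, and the variable names (usage, cost, income, profit) are all hypothetical, and None stands in for a system-missing value.

```python
from collections import defaultdict

# Hypothetical raw survey records; 9 is an assumed user-defined missing code
# for 'usage', and None plays the role of a system-missing value.
MISSING_CODE = 9
records = [
    {"id": 1, "gender": 1, "usage": 1, "cost": 10.0, "income": 14.0},
    {"id": 2, "gender": 2, "usage": 9, "cost": 12.0, "income": 11.0},
    {"id": 3, "gender": 2, "usage": 2, "cost": None, "income": 9.0},
    {"id": 4, "gender": 1, "usage": 2, "cost": 8.0, "income": 15.0},
]

# Missing value treatment: convert the user-defined code to None.
for r in records:
    if r["usage"] == MISSING_CODE:
        r["usage"] = None

# Re-coding: map numeric codes to labels (1 = Female, 2 = Male).
gender_labels = {1: "Female", 2: "Male"}
for r in records:
    r["gender_label"] = gender_labels[r["gender"]]

# Compute Variable: profit = income - cost, left missing if an input is missing.
for r in records:
    if r["cost"] is None or r["income"] is None:
        r["profit"] = None
    else:
        r["profit"] = r["income"] - r["cost"]

# Select-if: keep only the cases with a valid usage value.
valid_usage = [r for r in records if r["usage"] is not None]

# Split file: group case ids by gender so each group can be analysed separately.
by_gender = defaultdict(list)
for r in records:
    by_gender[r["gender_label"]].append(r["id"])

print([r["id"] for r in valid_usage], dict(by_gender))
```

Keeping missing values as None rather than as a numeric code means a later mean or sum cannot silently include the code 9 as if it were real data.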

Session 2-3: Basic Analysis (Frequency Distribution & Cross-Tabulation)


Frequency Distribution
In a frequency distribution, one variable is
considered at a time.
A frequency distribution for a variable produces
a table of frequency counts, percentages, and
cumulative percentages for all the values
associated with that variable.
A frequency distribution (a single variable, all of its levels) is used to
obtain descriptive statistics of the data.
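A minimal sketch of the table described above (counts, percentages, cumulative percentages), using only the Python standard library. The ratings list is hypothetical data, not taken from the course dataset.

```python
from collections import Counter


def frequency_distribution(values):
    """Return rows of (value, count, percent, cumulative percent), sorted by value."""
    counts = Counter(values)
    n = len(values)
    rows, cumulative = [], 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((value, counts[value],
                     100 * counts[value] / n, 100 * cumulative / n))
    return rows


# Hypothetical familiarity ratings (1 = not familiar ... 5 = very familiar).
ratings = [2, 3, 3, 4, 5, 3, 2, 4, 4, 3]
table = frequency_distribution(ratings)

print("value count percent cum_pct")
for value, count, pct, cum_pct in table:
    print(f"{value:>5} {count:>5} {pct:>7.1f} {cum_pct:>7.1f}")
```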

Statistics Associated with Frequency Distribution:
Measures of Location (Central Tendency)
Mean, Median (the middle value), Mode (the most frequent value)

Measures of Variability
Range (largest value minus smallest value)
Deviation from the mean
Variance: the mean squared deviation from the mean
Std. Deviation (s): the square root of the variance
Coefficient of Variation (s/mean)
Unitless and expressed as a %
Measure of relative variability (can be used in segmentation)

Measures of Shape
Skewness
Kurtosis (excess kurtosis is zero for a normal distribution)
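The measures listed above can be sketched with the standard library as follows. The data are hypothetical, and the skewness and kurtosis here use simple moment-based formulas, which is an assumption of this sketch; SPSS applies small-sample adjustments, so its values can differ slightly.

```python
import statistics


def describe(values):
    """Location, variability and shape measures for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample std. deviation (n - 1 divisor)
    # Central moments with an n divisor, used only for the shape measures.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),
        "std_dev": s,
        "cv_percent": 100 * s / mean,          # unitless, expressed as a %
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,   # zero for a normal distribution
    }


# Hypothetical familiarity ratings.
ratings = [2, 3, 3, 4, 5, 3, 2, 4, 4, 3]
summary = describe(ratings)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```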

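Symbols for Population and Sample Variables

Variable                            Population                  Sample
Mean                                $\mu$                       $\bar{X}$
Proportion                          $\pi$                       $p$
Variance                            $\sigma^2$                  $s^2$
Standard deviation                  $\sigma$                    $s$
Size                                $N$                         $n$
Standard error of the mean          $\sigma_{\bar{X}}$          $S_{\bar{X}}$
Standard error of the proportion    $\sigma_p$                  $S_p$
Standardized variate (z)            $(X - \mu)/\sigma$          $(X - \bar{X})/S$
Coefficient of variation (CV)       $\sigma/\mu$                $S/\bar{X}$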


Cross-Tabulation
While a frequency distribution describes one variable at a
time, a cross-tabulation describes two or more variables
simultaneously.
Cross-tabulation results in tables that reflect the joint
distribution of two or more variables with a limited number
of categories or distinct values.
A cross-tab (multiple variables, all of their levels) is used to explore the interdependence of variables, for example:
How many brand-loyal customers are males?
Is product ownership related to income levels?
Is familiarity with the new product related to age and
education levels?
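A question such as "how many brand-loyal customers are males?" comes down to counting joint occurrences. The sketch below builds a small cross-tabulation with row and column totals; the respondent data and category names are hypothetical.

```python
from collections import Counter


def cross_tab(pairs):
    """Count joint occurrences of (row_value, column_value) pairs."""
    cells = Counter(pairs)
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    table = {r: {c: cells[(r, c)] for c in cols} for r in rows}
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    return table, row_totals, col_totals


# Hypothetical respondents: (brand loyalty, gender).
respondents = [
    ("loyal", "male"), ("loyal", "female"), ("not loyal", "male"),
    ("loyal", "male"), ("not loyal", "female"), ("loyal", "female"),
    ("loyal", "male"),
]
table, row_totals, col_totals = cross_tab(respondents)

for row_label, cells in table.items():
    print(row_label, cells, "row total:", row_totals[row_label])
print("column totals:", col_totals)
```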

Gender and Internet Usage


Gender
Internet Usage

Male

Female

Row
Total

Light Users (1)

10

15

Heavy Users (2)

10

15

Column Total

15

15


Pet Adoption

                 Gender
Pet              Male    Female    Row Total
Dog              10      50        60
Cat              30      10        40
Column Total     40      60        100

What can be concluded based on this?
How generalizable is this?
How reliable is this for the population?

Chi-Square
Chi-Squared Test
Tests whether an observed association reflects a systematic relationship rather than random chance, i.e., it tests the statistical significance of the observed association.

It is based on actual counts and not on percentages.
The chi-square statistic ($\chi^2$) is used to test the
statistical significance of the observed association
in a cross-tabulation.


Chi-Square stats
Expected cell frequency:
$f_e = \frac{n_r n_c}{n}$
where
$n_r$ = total number in the row
$n_c$ = total number in the column
$n$ = total sample size

$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$
where $f_o$ = observed cell frequency

Contingency coefficient:
$C = \sqrt{\frac{\chi^2}{\chi^2 + n}}$
Measure of the strength of association between the
variables
Ranges from 0 to 1

Chi-Square stats
Cramer's V: $V = \sqrt{\frac{\chi^2 / n}{\min(r-1,\ c-1)}}$
Measures strength of association, for a table of any size
Ranges from 0 to 1
Phi coefficient: $\phi = \sqrt{\chi^2 / n}$
Degrees of freedom: $df = (r-1)(c-1)$
where $r$ = number of rows and $c$ = number of columns


Chi-Square stats
For the Pet Adoption cross-tabulation:
Chi-Square ($\chi^2$) = 34.0278
Contingency coefficient = 0.5039
Cramer's V = 0.5833
Phi coefficient = 0.5833
df = (2-1)*(2-1) = 1
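The figures above can be reproduced from the cell counts. The sketch below assumes the Pet Adoption table reads Dog: 10 male, 50 female and Cat: 30 male, 10 female (a reading that is consistent with the reported chi-square of 34.0278); the function name chi_square_stats is hypothetical.

```python
import math


def chi_square_stats(observed):
    """Chi-square and association measures for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / n  # expected count
            chi2 += (f_o - f_e) ** 2 / f_e
    r, c = len(observed), len(observed[0])
    return {
        "chi2": chi2,
        "df": (r - 1) * (c - 1),
        "contingency_c": math.sqrt(chi2 / (chi2 + n)),
        "phi": math.sqrt(chi2 / n),
        "cramers_v": math.sqrt((chi2 / n) / min(r - 1, c - 1)),
    }


# Assumed layout: rows = Dog, Cat; columns = Male, Female.
pet_adoption = [[10, 50],
                [30, 10]]
stats = chi_square_stats(pet_adoption)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

For a 2 x 2 table min(r-1, c-1) is 1, which is why Cramer's V and the phi coefficient coincide here.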


Degrees of     Probability (p)
Freedom (df)   0.95    0.90    0.80    0.70    0.50    0.30    0.20    0.10    0.05    0.01    0.001
1              0.004   0.02    0.06    0.15    0.46    1.07    1.64    2.71    3.84    6.64    10.83
2              0.10    0.21    0.45    0.71    1.39    2.41    3.22    4.60    5.99    9.21    13.82
3              0.35    0.58    1.01    1.42    2.37    3.66    4.64    6.25    7.82    11.34   16.27
4              0.71    1.06    1.65    2.20    3.36    4.88    5.99    7.78    9.49    13.28   18.47
5              1.14    1.61    2.34    3.00    4.35    6.06    7.29    9.24    11.07   15.09   20.52
6              1.63    2.20    3.07    3.83    5.35    7.23    8.56    10.64   12.59   16.81   22.46
7              2.17    2.83    3.82    4.67    6.35    8.38    9.80    12.02   14.07   18.48   24.32
8              2.73    3.49    4.59    5.53    7.34    9.52    11.03   13.36   15.51   20.09   26.12
9              3.32    4.17    5.38    6.39    8.34    10.66   12.24   14.68   16.92   21.67   27.88
10             3.94    4.86    6.18    7.27    9.34    11.78   13.44   15.99   18.31   23.21   29.59

Nonsignificant: columns p = 0.95 to 0.10
Significant: columns p = 0.05 to 0.001


Conclusion
As the calculated Chi-Square (34.03) > the critical value (3.84, at p = 0.05 and df = 1),
there is a statistically significant relationship between
gender and pet adoption behaviour.
The strength of this relationship (Cramer's V, which equals phi for a 2 x 2 table) is about
0.58.
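Instead of comparing against a tabulated critical value, the decision can also be expressed as a p-value. The sketch below covers only the df = 1 case, where a chi-square variate is the square of a standard normal variate; the function name is hypothetical and the 5% level is the conventional choice, not a requirement.

```python
import math


def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    With df = 1 the statistic is the square of a standard normal variate,
    so P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))


alpha = 0.05
p_critical = chi2_p_value_df1(3.84)     # p-value at the tabulated critical value
p_observed = chi2_p_value_df1(34.0278)  # p-value of the observed statistic
reject_h0 = p_observed < alpha

print(f"p at 3.84 = {p_critical:.4f}, p at 34.0278 = {p_observed:.2e}, reject H0: {reject_h0}")
```

Both routes give the same decision: a statistic above the critical value always has a p-value below the chosen significance level.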


Session 4: Statistical Hypothesis Testing


Statistical Hypothesis Testing


A hypothesis is a statement that tries to explain
the observed results of a phenomenon.
A hypothesis may be testable or non-testable:
Sales promotions lead to higher sales
There are other earth-like planets; etc.

In marketing research projects we mostly deal
with hypotheses that can be tested statistically
against observable evidence.

The Null Hypothesis


The null hypothesis refers to a specified value of
the population parameter (e.g., $\mu$, $\sigma$, $\pi$), not a
sample statistic (e.g., $\bar{X}$).
A null hypothesis may be rejected, but it can
never be accepted based on a single test.
In classical hypothesis testing, there is no way to
determine whether the null hypothesis is true.

A null hypothesis is a statement of the status
quo, one of no difference or no effect. If the null
hypothesis is not rejected, no changes will be
made.

The Alternative Hypothesis


An alternative hypothesis is the one that is
proposed to explain the observed results
when the null hypothesis is rejected (i.e., when the
null hypothesis cannot explain the observed results).
The alternative hypothesis is framed in such a way
that rejection of the null hypothesis implies that
the alternative hypothesis can be accepted.


The Alternative Hypothesis


Different alternative hypotheses can be drawn
for the same null hypothesis:
H0: Higher prices will lead to lower sales
H1: Price has no effect on sales
H1: Higher prices lead to higher sales, due to a
higher price-quality perception


Steps Involved in Hypothesis Testing


Formulate H0 and H1
Select Appropriate Test
Choose Level of Significance, $\alpha$
Collect Data and Calculate Test Statistic
Then follow one of two equivalent routes:
  Determine Probability Associated with Test Statistic, then Compare with Level of Significance, $\alpha$
  Determine Critical Value of Test Statistic ($TS_{CR}$), then Determine if $TS_{CAL}$ Falls into (Non) Rejection Region
Reject or Do not Reject H0
Draw Marketing Research Conclusion

Example
A firm is testing the effect of a new kind of Sales
Promotion (SP) on sales. The firm offered
the SP in 100 stores and recorded their levels of
sales.
It is known that the mean level of sales without
the SP = 61 Cr./month
The obtained results:
The mean sales with SP across 100 stores = 66.7 Cr./month
Sample Std. Deviation = 18.69 Cr./month
Did the SP have an effect on sales?

H0: SP had no effect on sales, i.e.
mean sales = 61 (even when the SP is offered)

H1 (two-sided): SP has an effect on sales, i.e.
mean sales is not equal to 61 (when the SP is offered)

H1 (one-sided alternative): SP has a positive effect on sales, i.e.
mean sales > 61 (when the SP is offered)
Assuming that H0 is true, what is the
probability of obtaining the observed results?
This probability is the p-value.

If this p-value is very small (traditionally below 5% or
1%), it suggests that the observed data are
inconsistent with the assumption of H0 (i.e., of
no relationship).
Thus H0 can be rejected and it can be said that
a relationship does exist.


The Error
When we draw inference about population based
on the sample, there is risk of making two types
of errors.
Type I Error
Type I error occurs when the sample results lead
to the rejection of the null hypothesis when it is
in fact true.
The probability of type I error ($\alpha$) is also called
the level of significance.

The Error
Type II Error
Type II error occurs when, based on the sample
results, the null hypothesis is not rejected when it
is in fact false.
The probability of type II error is denoted by $\beta$.
Unlike $\alpha$, which is specified by the researcher, the
magnitude of $\beta$ depends on the actual value of
the population parameter (proportion).
The risk of both $\alpha$ and $\beta$ can be controlled by
increasing the sample size.
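To illustrate the last point, the sketch below holds the type I error rate fixed at 5% and shows how the type II error rate shrinks as the sample size grows. It assumes a one-sided z-test with a known standard deviation, and the "true" mean of 65 is a hypothetical value chosen only for illustration; the function name type_ii_error is also hypothetical.

```python
import math
from statistics import NormalDist


def type_ii_error(mu_0, mu_true, sigma, n, alpha=0.05):
    """Beta for a one-sided (upper-tail) z-test of H0: mu = mu_0.

    H0 is rejected when the sample mean exceeds
    mu_0 + z_(1 - alpha) * sigma / sqrt(n); beta is the probability that
    the sample mean stays below that cut-off when the true mean is mu_true.
    """
    std_error = sigma / math.sqrt(n)
    cutoff = mu_0 + NormalDist().inv_cdf(1 - alpha) * std_error
    return NormalDist(mu_true, std_error).cdf(cutoff)


# Hypothetical scenario: true mean sales of 65 against H0: mu = 61.
betas = {n: type_ii_error(61, 65, 18.69, n) for n in (25, 100, 400)}
for n, beta in betas.items():
    print(f"n = {n:>3}: beta = {beta:.3f}, power = {1 - beta:.3f}")
```

With the type I error rate held constant, a larger sample narrows the sampling distribution of the mean, so a real difference is missed less often.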

Hypothesis Testing
Parametric tests assume that the variables of interest are
measured on at least an interval scale.
Nonparametric tests assume that the variables are
measured on a nominal or ordinal scale.
These tests can be further classified based on whether one
or two or more samples are involved.

Hypothesis Testing Related to Differences


Hypothesis Tests
  Parametric Tests (Metric Tests)
    One Sample
      * t test
      * Z test
    Two or More Samples
      Independent Samples
        * Two-Group t test (Mean)
        * Z test (Proportion)
      Paired Samples
        * Paired t test
  Non-parametric Tests (Nonmetric Tests)
    One Sample
      * Chi-Square
      * K-S
      * Runs
      * Binomial
    Two or More Samples
      Independent Samples
        * Chi-Square
        * Mann-Whitney
        * Median
        * K-S
      Paired Samples
        * Sign
        * Wilcoxon
        * McNemar
        * Chi-Square

One Sample
H0: SP had no effect on sales, i.e.
mean sales = 61 (even when the SP is offered)

H1: SP has an effect on sales, i.e.
mean sales is not equal to 61 (when the SP is offered)

$H_0: \mu = 61.0$
$H_1: \mu \neq 61.0$

$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}$, where $s_{\bar{X}} = \frac{s}{\sqrt{n}}$
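Plugging the sales promotion example into this formula gives the sketch below. The t statistic follows directly from the figures on the earlier slide; the p-value is only an approximation, because this sketch assumes that with 99 degrees of freedom the t distribution is close enough to the standard normal to use the normal CDF from the standard library.

```python
import math
from statistics import NormalDist

mu_0 = 61.0    # mean sales without the SP (Cr./month)
x_bar = 66.7   # mean sales with the SP across the sampled stores
s = 18.69      # sample standard deviation
n = 100        # number of stores

std_error = s / math.sqrt(n)         # s_xbar = s / sqrt(n)
t_stat = (x_bar - mu_0) / std_error  # t = (x_bar - mu) / s_xbar

# Assumption: normal approximation to the t distribution (df = n - 1 = 99).
z = NormalDist()
p_two_sided = 2 * (1 - z.cdf(abs(t_stat)))
p_one_sided = 1 - z.cdf(t_stat)

print(f"standard error = {std_error:.3f}")
print(f"t = {t_stat:.3f}")
print(f"approx. two-sided p = {p_two_sided:.4f}, one-sided p = {p_one_sided:.4f}")
```

Under this approximation the two-sided p-value falls well below 5%, so H0 (mean sales = 61) would be rejected at that level; an exact t-based p-value from SPSS would be slightly larger but lead to the same decision here.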

Two Independent Sample


The samples are independent if they are drawn
randomly from different populations. For the
purpose of analysis, data pertaining to different
groups of respondents, e.g., males and females,
are generally treated as independent samples.
Suppose we wanted to determine whether the effect
of the SP is different for Mumbai compared to Delhi. A two-independent-samples t-test would be conducted.
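A minimal sketch of the two-independent-samples t statistic, in both its pooled (equal variances assumed) and Welch (equal variances not assumed) forms. The Mumbai and Delhi figures are hypothetical, and the function name two_sample_t is an illustrative choice.

```python
import math
from statistics import mean, variance


def two_sample_t(a, b):
    """Return (pooled t, Welch t) for two independent samples."""
    n1, n2 = len(a), len(b)
    m1, m2 = mean(a), mean(b)
    v1, v2 = variance(a), variance(b)  # sample variances (n - 1 divisor)
    # Equal variances assumed: pool the two variance estimates.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Equal variances not assumed (Welch): use each group's own variance.
    t_welch = (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)
    return t_pooled, t_welch


# Hypothetical sales lift (Cr./month) from the SP in stores of each city.
mumbai = [7.0, 5.5, 6.0, 8.0, 6.5]
delhi = [4.0, 5.0, 3.5, 4.5, 5.5]
t_pooled, t_welch = two_sample_t(mumbai, delhi)
print(f"pooled t = {t_pooled:.4f}, Welch t = {t_welch:.4f}")
```

With equal group sizes, as here, the two statistics coincide and only their degrees of freedom differ; with unequal group sizes and unequal variances they can diverge noticeably.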


F-test
Two sample test:
F-test: this is performed when it is not known whether the two
groups have equal variances. For example, it tests whether the
variance of male and female respondents on the variable in
question (e.g., internet usage) is the same.
H0: the two groups have equal variances
H1: the two groups DO NOT have equal variances
If the F-test is significant, H0 can be rejected, i.e., the two
groups do not have equal variances, and the t-test results should
be read from the "Equal variances not assumed" row of the SPSS
output. Otherwise, read the "Equal variances assumed" row.
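The idea behind an equal-variance check can be sketched as a simple variance ratio. This is an illustration under stated assumptions, not the exact procedure SPSS runs: the SPSS independent-samples output reports Levene's test, which is a different statistic. The usage data and the function name variance_ratio_f are hypothetical.

```python
from statistics import variance


def variance_ratio_f(a, b):
    """F statistic as larger sample variance over smaller, with its (df1, df2)."""
    va, vb = variance(a), variance(b)
    if va >= vb:
        return va / vb, (len(a) - 1, len(b) - 1)
    return vb / va, (len(b) - 1, len(a) - 1)


# Hypothetical weekly internet usage (hours) for male and female respondents.
male_usage = [9.0, 12.0, 7.0, 14.0, 8.0, 10.0]
female_usage = [6.0, 7.0, 5.0, 8.0, 6.0, 7.0]
f_stat, (df1, df2) = variance_ratio_f(male_usage, female_usage)
print(f"F = {f_stat:.4f} with df = ({df1}, {df2})")
```

The resulting F value would then be compared with a critical value from an F table for the given degrees of freedom; a ratio close to 1 is consistent with equal variances.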

Related / Paired Sample


The samples are paired when the data for the two
samples relate to the same group of respondents.
Determine if the stores differed in the cost of running the SP and
the income from the SP.
Determine if running the SP will be profitable (compare cost
versus income) for the firm.
Determine if the SP will have different profitability (compare
cost versus income) across Mumbai and Delhi.
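For paired data the test works on the per-store differences rather than on the two columns separately. The sketch below computes the paired t statistic for income versus cost; the store figures are hypothetical and the function name paired_t is an illustrative choice.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """t statistic and degrees of freedom for paired observations x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical per-store figures (Cr./month): income from the SP and its cost.
income = [14.0, 11.0, 9.0, 15.0, 12.0, 13.0]
cost = [10.0, 12.0, 8.0, 11.0, 9.0, 10.0]
t_stat, df = paired_t(income, cost)
print(f"paired t = {t_stat:.4f} with df = {df}")
```

Because each store serves as its own control, store-to-store variation drops out of the comparison, which is what distinguishes this from the independent-samples case.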


SPSS Example
Please Run the following Parametric Tests:
Determine if the mean level of familiarity with internet
is more than 4.0
Determine if the internet usage is significantly different
for male and female
Determine if the mean level of familiarity with internet
is significantly different for male and female
Determine if the respondents significantly differed in
their attitude toward the Internet and attitude toward
technology.
Determine if the male and female respondents
significantly differed in their attitude toward the
Internet and attitude toward technology.