
Data Analysis: Testing for Significant Differences
By Hair, J.F., Bush, R.P., Ortinau, D.J.
Edited by Paul Ducham

SAMPLE GROUPS
To split the sample into two groups so you can compare them, use the options under the Data pull-down menu. For example, to compare the customers of the Santa Fe Grill and the customers of Jose's Southwestern Café, the click-through sequence is: DATA → SPLIT FILE. Click on Compare Groups. Now highlight the fourth screening question (Favorite Mexican restaurant, x_s4) and move it into the Groups Based on: window, and then click OK. Your results will now be computed for each restaurant separately. The same procedure can be used with any variable: insert the variable of choice into the Groups Based on: window and click OK. A word of caution, however: until you remove this instruction, all data analysis will be based on the separate groups defined in the Groups Based on: window.
SMALLER SUBSET
Sometimes you may wish to select a smaller subset of your total sample to analyze. This can be done using the Select Cases option under the Data pull-down menu. For example, to select customers from only the Santa Fe Grill, the click-through sequence is DATA → SELECT CASES → IF CONDITION IS SATISFIED → IF. Next, highlight x_s4 Favorite Mexican restaurant and move it into the window; click the = sign and then 1. This instructs the SPSS software to select only questionnaires coded 1 in the x_s4 column (the fourth screening question on the survey), which is the Santa Fe Grill. If you wanted to analyze only the Jose's Southwestern Café respondents, you would do the same except put a 0 after the = sign.
MEAN
The mean is the average value within the distribution and is the most commonly used measure of
central tendency. The mean tells us, for example, the average number of cups of coffee the
typical student may drink during finals to stay awake. The mean can be calculated when the data
scale is either interval or ratio. Generally, the data will show some degree of central tendency,
with most of the responses distributed close to the mean.
The mean is a very robust measure of central tendency. It is fairly insensitive to data values being
added or deleted. The mean can be subject to distortion, however, if extreme values are included
in the distribution. For example, suppose you ask four students how many cups of coffee they

drink in a single day. Respondent answers are as follows: Respondent A = 1 cup; Respondent B =
10 cups; Respondent C = 5 cups; and Respondent D = 6 cups. Let's also assume we know that respondents A and B are males and respondents C and D are females, and we want to compare coffee consumption between males and females. Looking at the males first (Respondents A and B), we calculate the mean number of cups to be 5.5 ((1 + 10)/2 = 5.5). Similarly, looking at the females (Respondents C and D), we calculate the mean number of cups to be 5.5 ((5 + 6)/2 = 5.5). If we look only at the mean number of cups of coffee consumed by males and females, we would conclude there are no differences between the two groups. If we consider the underlying distributions, however, we must conclude there are some differences, and the mean in fact distorts our understanding of coffee consumption patterns among males and females.
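The distortion described above is easy to verify. The sketch below (plain Python, using the values from the four-respondent example) shows that both groups share the same mean even though the male responses are far more spread out:

```python
from statistics import mean, stdev

# Cups of coffee per day, from the four-respondent example
males = [1, 10]    # Respondents A and B
females = [5, 6]   # Respondents C and D

print(mean(males), mean(females))    # both means are 5.5
print(stdev(males), stdev(females))  # but the spread differs sharply
```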
MODE
The mode is the value that appears in the distribution most often. For example, the average number of cups of coffee students drink per day during finals may be 5 (the mean), while the number of cups of coffee that most students drink is only 3 (the mode). The mode is the value that represents the highest peak in the distribution's graph. The mode is especially useful as a measure for data that have been grouped into categories. The mode of the data distribution in Exhibit 15.2 is "Occasionally" because, looking in the Frequency column, the largest number of responses is 111 for the "Occasionally" label, which has a value of 3.
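As a quick illustration (plain Python, with hypothetical coded responses rather than the Exhibit 15.2 data), the most frequent value can be pulled out with the standard library:

```python
from statistics import mode

# Hypothetical coded responses on the 5-point frequency scale,
# where 3 = "Occasionally"
responses = [2, 3, 3, 4, 3, 5, 1, 3, 2, 3]
print(mode(responses))  # 3, the most common response
```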

MEDIAN
The median is the middle value of the distribution when the distribution is ordered in either an
ascending or a descending sequence. For example, if you interviewed a sample of students to
determine their coffee-drinking patterns during finals, you might find that the median number of
cups of coffee consumed is 4. The number of cups of coffee consumed above and below this
number would be the same (the median number is the exact middle of the distribution). If the
number of data observations is even, the median is generally considered to be the average of the
two middle values. If there are an odd number of observations, the median is the middle value.
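The even/odd rule can be demonstrated directly (plain Python, hypothetical cup counts):

```python
from statistics import median

odd = [2, 4, 7, 3, 5]        # odd count: the middle value of the sorted list
even = [2, 4, 7, 3, 5, 6]    # even count: the average of the two middle values

print(median(odd))   # sorted: 2 3 4 5 7 -> 4
print(median(even))  # sorted: 2 3 4 5 6 7 -> (4 + 5) / 2 = 4.5
```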

The median is especially useful as a measure of central tendency for ordinal data and for data
that is skewed to either the right or left. For example, income data is skewed to the right because
there is no upper limit on income.
Each measure of central tendency describes a distribution in its own manner, and each measure
has its own strengths and weaknesses. For nominal data, the mode is the best measure. For
ordinal data, the median is generally best. For interval or ratio data, the mean is generally used. If
there are extreme values within the interval or ratio data, however, the mean can be distorted. In
those cases, the median and the mode should be considered. SPSS and other statistical software
packages are designed to perform such types of analysis.
MEASURES OF CENTRAL TENDENCY
The Santa Fe Grill database can be used with the SPSS software to calculate measures of central
tendency. The SPSS click-through sequence is ANALYZE → DESCRIPTIVE STATISTICS → FREQUENCIES. Let's use X25 (Frequency of Eating) as the variable to examine. Click on X25 to highlight it, and then on the arrow box to move it into the Variables window for your analysis. Next open the Statistics box, click on Mean, Median, and Mode, and then click Continue. Recall that if you want to create charts, open the Charts box; your choices are Bar, Pie, and Histograms. For the Format box we will use the defaults, so click on OK to execute the program. The dialog boxes for this sequence are shown in Exhibit 15.1.
Let's look at the output for the measures of central tendency shown in Exhibit 15.2. In the Statistics table we see the mean is 3.24, the median is 3.00, and the mode is 3. Recall that this variable is measured on a 5-point scale, with lower numbers indicating lower frequency of patronage and larger numbers indicating higher frequency. The three measures of central tendency can all be different within the same distribution, as described above in the coffee-drinking example. But it also is possible for all three measures to be the same. In our example here the median and the mode are the same, but the mean is different.

RANGE
The range defines the spread of the data. It is the distance between the smallest and largest values of the variable; in other words, the range identifies the endpoints of the distribution of values. For variable X25 (Frequency of Eating), the range is the difference between response category 5 (the largest value) and response category 1 (the smallest value), so the range is 4 (5 - 1 = 4). In this example, since we defined a narrow range of response categories in our survey, the range doesn't tell us much. However, many questions have a much wider range. For example, if we asked how often in a month respondents rent DVDs, or how much they would pay to buy a DVD player that also records songs, the range would be quite informative. In this case the respondents, not the researchers, would be defining the range by their answers. For this reason, the range is more often used to describe the variability of open-ended questions such as our DVD example.

STANDARD DEVIATION
The estimated standard deviation describes the average distance of the distribution values from
the mean. The difference between a particular response and the distribution mean is called a
deviation. Since the mean of a distribution is a measure of central tendency, there should be
about as many values above the mean as there are below it (particularly if the distribution is
symmetrical). Consequently, if we subtracted each value in a distribution from the mean and
added them up, the result would be close to zero (the positive and negative results would cancel
each other out).
The solution to this difficulty is to square the individual deviations before we add them up (squaring a negative number produces a positive result). To calculate the estimated standard deviation, we use the formula below, where xᵢ is an individual value, x̄ is the sample mean, and n is the number of respondents:

s = √( Σ (xᵢ - x̄)² / (n - 1) )

Once the sum of the squared deviations is determined, it is divided by the number of respondents
minus 1. The number 1 is subtracted from the number of respondents to help produce an
unbiased estimate of the standard deviation. The result of dividing the sum of the squared
deviations is the average squared deviation. To convert the result to the same units of measure as
the mean, we take the square root of the answer. This produces the estimated standard deviation
of the distribution. Sometimes the average squared deviation is also used as a measure of
dispersion for a distribution. The average squared deviation, called the variance, is used in a
number of statistical processes.
Since the estimated standard deviation is the square root of the average squared deviations, it
represents the average distance of the values in a distribution from the mean. If the estimated
standard deviation is large, the responses in a distribution of numbers do not fall very close to the
mean of the distribution. If the estimated standard deviation is small, you know that the
distribution values are close to the mean.
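The calculation described above can be sketched in a few lines of plain Python (the ratings are hypothetical; `statistics.stdev` applies the same n - 1 formula, so the two results agree):

```python
import math
from statistics import stdev

# Hypothetical 7-point ratings
ratings = [4, 5, 6, 3, 5, 7, 4, 5]

n = len(ratings)
mean = sum(ratings) / n
squared_deviations = [(x - mean) ** 2 for x in ratings]
variance = sum(squared_deviations) / (n - 1)   # the average squared deviation
s = math.sqrt(variance)                        # the estimated standard deviation

print(round(s, 4), round(stdev(ratings), 4))   # the two values agree
```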
Another way to think about the estimated standard deviation is that its size tells you something
about the level of agreement among the respondents when they answered a particular question.
For example, in the Santa Fe Grill database, respondents were asked to rate the restaurant on the
friendliness and knowledge of its employees (X12 and X19). We will use the SPSS program later
to examine the standard deviations for these questions.
Together with the measures of central tendency, these descriptive statistics can reveal a lot about
the distribution of a set of numbers representing the answers to an item on a questionnaire. Often,
however, marketing researchers are interested in more detailed questions that involve more than
one variable at a time.
MEASURES OF DISPERSION

The Santa Fe Grill database can be used with the SPSS software to calculate measures of
dispersion, just as we did with the measures of central tendency. Note that to calculate the
measures of dispersion we will be using the database with a sample size of 405 so we have
eliminated all respondents with missing data. The SPSS click-through sequence is ANALYZE → DESCRIPTIVE STATISTICS → FREQUENCIES. Let's use X22 (Satisfaction) as the variable to examine. Click on X22 to highlight it and then on the arrow box to move X22 to the Variables box. Next open the Statistics box, go to the Dispersion box in the lower-left-hand corner, and click on Standard deviation, Variance, Range, Minimum and Maximum, and then Continue. If you would like to create charts, open the Charts box; your choices are Bar, Pie, and Histograms. For the Format box we will use the defaults, so click on OK to execute the program.
Let's look at the output for the measures of dispersion shown in Exhibit 15.3 for variable X22. First, the highest response on the 7-point scale is 7 (maximum) and the lowest response is 3 (minimum). The range is 4 (7 - 3 = 4), the standard deviation is 1.118, and the variance is 1.251. A standard deviation of 1.118 on a 7-point scale tells us the responses are dispersed fairly widely around the mean.

SAMPLE STATISTICS AND POPULATION PARAMETERS


The purpose of inferential statistics is to make a determination about a population on the basis of
a sample from that population. A sample is a subset of the population. For example, if we wanted
to determine the average number of cups of coffee consumed per day by students during finals at
your university, we would not interview all the students. This would be costly, take a long time,
and might be impossible since we may not be able to find them all or some would decline to
participate. Instead, if there are 16,000 students at your university, we may decide that a sample
of 200 females and 200 males is sufficiently large to provide accurate information about the
coffee-drinking habits of all 16,000 students.
You may recall that sample statistics are measures obtained directly from the sample or calculated from the data in the sample. A population parameter is a variable or some sort of measured characteristic of the entire population. Sample statistics are useful in making inferences regarding the population's parameters. Generally, the actual population parameters are unknown, since the cost to perform a true census of almost any population is prohibitive.
A frequency distribution displaying the data obtained from the sample is commonly used to
summarize the results of the data collection process. When a frequency distribution displays a variable in terms of percentages, the distribution represents proportions within the population. For example, a frequency distribution showing that 40 percent of the people patronize Burger King indicates the percentage of the population that meets the criterion (eating at Burger King). The proportion may be expressed as a percentage, a decimal value, or a fraction.
UNIVARIATE STATISTICAL TESTS
Marketing researchers often form hypotheses regarding population characteristics based on
sample data. The process typically begins by calculating frequency distributions and averages,
and then moves on to actually test the hypotheses. When the hypothesis testing involves
examining one variable at a time, it is referred to as a univariate statistical test. When the
hypothesis testing involves two variables it is called a bivariate statistical test. We first discuss
univariate statistical tests.
Suppose the owners of the Santa Fe Grill believe customers think their menu prices are very
reasonable. Respondents have answered this question using a 7-point scale where 1 = Strongly
Disagree and 7 = Strongly Agree. The scale is assumed to be an interval scale, and previous
research using this measure has shown the responses to be approximately normally distributed.
A couple of tasks must be completed before answering the question posed above. First, the
hypotheses to be compared (the null and alternative hypotheses) have to be developed. Then the
level of significance for rejecting the null hypothesis and accepting the alternative hypothesis
must be selected. At that point, the researcher can conduct the statistical test and determine the
answer to the research question.
In this example, the owners think the customers consider the prices of food at the Santa Fe Grill
to be very reasonable. The question is measured using a 7-point scale with 7 = Strongly Agree.
The marketing research consultant has indicated that expecting a 7 on a 7-point scale is
unreasonable. Therefore, the owners have defined "reasonable prices" by saying that perceptions of the prices at the Santa Fe Grill will not be significantly different from 6 = Very Favorable. The null hypothesis is that the mean of X16 (Reasonable Prices) will not be significantly different from 6. Recall that the null hypothesis asserts the status quo: any difference from what is thought to be true is due to random sampling error. The alternative hypothesis is that the mean response to X16 (Reasonable Prices) will not be 6; there is in fact a true difference between the sample mean and the mean we think it is (6).
Assume also the owners want to be 95 percent certain the mean is not different from 6. Therefore, the significance level will be set at .05. Using this significance level means that if the survey of Santa Fe Grill customers were conducted many times, the probability of incorrectly rejecting the null hypothesis when it is true would be less than 5 times out of 100 (.05).

HYPOTHESIS TEST
Using the SPSS software, you can test the responses in the Santa Fe Grill database to find the
answer to the research question posed above. Before running this test, however, you must split
the sample into two groups: the customers of the Santa Fe Grill and the customers of Jose's Southwestern Café. Recall that to do this, the click-through sequence is DATA → SPLIT FILE. Click on Compare Groups. Now highlight the fourth screening question (Favorite Mexican Restaurant, x_s4) and move it into the Groups Based on: window, and then click OK. Your results will now be computed for each restaurant separately.
To complete this test, the click-through sequence is ANALYZE → COMPARE MEANS → ONE-SAMPLE T-TEST. When you get to the dialog box, click on X16 (Reasonable Prices) to highlight it. Then click on the arrow to move X16 into the Test Variables box. In the box labeled Test Value, enter the number 6. This is the number you want to compare the respondents' answers against, because your null hypothesis is that the mean of X16 will not be significantly different from 6. Click on the Options box and enter 95 in the confidence interval box. This is the same as setting the significance level at .05. Then click on the Continue button and OK to execute the program.
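For readers who want to see the arithmetic behind this output, here is a minimal sketch of the one-sample t statistic in plain Python. The ratings are hypothetical, not the Santa Fe Grill responses; SPSS performs the equivalent computation:

```python
import math
from statistics import mean, stdev

def one_sample_t(values, test_value):
    """t = (sample mean - test value) / (s / sqrt(n))."""
    n = len(values)
    return (mean(values) - test_value) / (stdev(values) / math.sqrt(n))

# Hypothetical 7-point agreement ratings compared against a test value of 6
ratings = [4, 5, 3, 6, 4, 5, 4, 3, 5, 4]
t = one_sample_t(ratings, 6)
print(round(t, 3))  # a large negative t indicates the mean is well below 6
```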
The SPSS output is shown in Exhibit 15.4. The top table is labeled One-Sample Statistics and shows the mean, standard deviation, and standard error for X16 (Reasonable Prices) for the two restaurants (for the Santa Fe Grill, a mean of 4.47 and a standard deviation of 1.384). The One-Sample Test table below shows the results of the t-test for the null hypothesis that the average response to X16 is not significantly different from 6 (Test Value = 6). The t-test statistic is 25.613, and the significance level is .000. This means the null hypothesis can be rejected and the alternative hypothesis accepted with a high level of confidence from a statistical perspective.
From a practical standpoint, the results of the univariate hypothesis test indicate that Santa Fe Grill respondents perceived menu prices as significantly below a 6 (defined as very reasonable by the owners). The mean of 4.47 is substantially below 6 (where 7 = Strongly Agree that prices are reasonable). Thus, the Santa Fe Grill owners can conclude that their prices are not perceived very favorably. Indeed, there is a lot of room to improve between the mean of 4.47 on the 7-point scale and the highest value of 7. This is definitely an area that needs to be examined. Of course, compared to Jose's restaurant the Santa Fe Grill is perceived slightly more favorably.


BIVARIATE STATISTICAL TESTS


In many instances marketing researchers test hypotheses that compare the characteristics of two
groups or two variables. For example, the marketing researcher may be interested in determining
whether there is a difference between older and younger new car purchasers in terms of the
importance of a 6-disk DVD player. In situations where more than one group is involved, bivariate tests are needed. In the following section we first explain the concept of cross-tabulation, which examines two variables. We then describe three bivariate hypothesis tests: Chi-square, which is used with nominal data; and the t-test (to compare two means) and analysis of variance (to compare three or more means), both of which are used with either interval or ratio data.
Cross-Tabulation


We introduced one-way frequency tables to report the findings for a single variable. The next
logical step in data analysis is to perform cross-tabulation using two variables. Cross-tabulation
is useful for examining relationships and reporting the findings for two variables. The purpose of
cross-tabulation is to determine if differences exist between subgroups of the total sample. In
fact, cross-tabulation is the primary form of data analysis in some marketing research projects.
To use cross-tabulation you must understand how to develop a cross-tabulation table as well as
how to interpret the outcome.
Note that to simplify this example we will run this crosstab only for customers of the Santa Fe Grill. To select just these customers, the click-through sequence is DATA → SELECT CASES → IF CONDITION IS SATISFIED → IF. Highlight x_s4 Favorite Mexican restaurant and move it into the window. Then click the = sign and next the 1. This instructs the SPSS software to select only questionnaires coded 1 in the x_s4 column, which is the Santa Fe Grill. If you wanted to analyze only the Jose's Southwestern Café respondents, do the same except put a 0 after the = sign.
To run the crosstab using SPSS, the click-through sequence is ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS. This will bring up the set of dialog boxes shown in Exhibit 15.5. Insert X31 in the Rows window and X32 in the Columns window. Now click on the Cells box and check the Row box under Percentages, and then the Expected box under Counts. Then click Continue and OK to get the results.
Exhibit 15.6 shows the cross-tabulation between X31 (Ad Recall) and X32 (Gender) for the Santa Fe Grill customers (N = 253). The cross-tabulation shows frequencies and percentages, with percentages shown only for rows. One way to interpret this table is to compare the Observed Count with the Expected Count. As you can see, the numbers are not very different. Thus, our preliminary interpretation suggests that males and females do not differ in their recall of Santa Fe Grill ads.
In constructing a cross-tabulation table, the researcher selects the variables to use when
examining relationships. Selection of variables should be based on the objectives of the research
project. Demographic variables typically are the starting point in developing cross-tabulations. These variables usually are the columns of the cross-tabulation table, and the rows are variables like purchase intention, usage, or other categorical response questions. Cross-tabulation tables show percentage calculations based on column or row totals. Thus, the researcher can make
comparisons of behaviors and intentions for different categories of predictor variables such as
income, sex, and marital status.
As a preliminary technique, cross-tabulation provides the market researcher with a powerful tool
to summarize survey data. It is easy to understand and interpret, and can provide a description of
both total and subgroup data. Yet the simplicity of this technique can create problems. Analysis
can result in an endless variety of cross-tabulation tables. In developing these tables, the analyst
must always keep in mind both the project objectives and specific research questions the study is
designed to answer.
Chi-Square Analysis

Marketing researchers often analyze survey data by means of one-way frequency counts and
cross-tabulations. One purpose of cross-tabulations is to study relationships among variables.
The research question is: do the numbers of responses that fall into different categories differ from what is expected? The null hypothesis is always that the two variables are not related. Thus, the null hypothesis in the previous example would be that the number of men and women customers who recall Santa Fe Grill ads is the same. The alternative hypothesis is that the two variables are related, or that men and women differ in their recall of Santa Fe Grill ads. This question and similar ones can be answered using Chi-square analysis. Below are some other examples of research questions that could be examined using Chi-square statistical tests:

Is usage of the Internet (low, moderate, and high) related to gender?

Does frequency of eating out (infrequent, moderately frequent, and very frequent) differ
between males and females?

Do part-time and full-time workers differ in terms of how often they are absent from
work (seldom, occasionally, frequently)?

Do college students and high school students differ in their preference for Coke versus
Pepsi?

Chi-square (χ²) analysis enables researchers to test for statistical significance between the frequency distributions of two (or more) nominally scaled variables in a cross-tabulation table to determine if there is any association. Categorical data from questions about gender, education, or other nominal variables can be examined with this statistic. Chi-square analysis compares the observed frequencies (counts) of the responses with the expected frequencies. The Chi-square statistic tests whether or not the observed data are distributed the way we expect them to be, given the assumption that the variables are not related. The expected cell count is a theoretical value, while the observed cell count is the actual count from your study. For example, if we observe that women recall ads more than men do, we would compare the observed value with the frequency we would expect to find if there were no difference between women's and men's ad recall. Thus, the Chi-square statistic helps to answer questions about nominally scaled data that cannot be analyzed with other types of statistical analysis, such as ANOVA or t-tests.
Calculating the χ² Value
To help you better understand the Chi-square statistic, we will show you how to calculate it. The formula is shown below, where Oᵢ is the observed frequency in cell i and Eᵢ is the expected frequency in that cell:

χ² = Σ (Oᵢ - Eᵢ)² / Eᵢ

As the equation indicates, the expected frequency is subtracted from the observed frequency and then squared to eliminate any negative values before the results are used in further calculations. After squaring, the resulting value is divided by the expected frequency to take cell size differences into consideration. These calculations, performed for each cell of the table, are then summed over all cells to arrive at the Chi-square value. The Chi-square value tells you how far the observed frequencies are from the expected frequencies.
Conceptually, the larger the Chi-square is, the more likely it is that the two variables are related.
This is because Chi-square is larger whenever the number actually observed in a cell is much
different than what we expected to find, given the assumption that the two variables are not
related. The computed Chi-square statistic is compared to a table of Chi-square values to
determine if the differences are statistically significant. If the calculated Chi-square is larger than
the Chi-square reported in standard statistical tables, then the two variables are related for a
given level of significance, typically .05.
Some marketing researchers call Chi-square a goodness of fit test. That is, the test evaluates
how closely the actual frequencies fit the expected frequencies. When the differences between
observed and expected frequencies are large, you have a poor fit and you reject your null
hypothesis. When the differences are small, you have a good fit.
One word of caution is necessary, however, in using Chi-square. The Chi-square results will be
distorted if more than 20 percent of the cells have an expected count of less than 5, or if any cell
has an expected count of less than 1. In such cases, you should not use this test. SPSS will tell
you if these conditions have been violated. One solution to small counts in individual cells is to
collapse them into fewer cells to get larger counts.
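The cell-by-cell calculation can be sketched in a few lines of plain Python. The 2 x 2 counts below are hypothetical, not the Santa Fe Grill data:

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over every cell of the table."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical observed counts for a 2 x 2 table, flattened cell by cell
observed = [30, 20, 20, 30]
# Expected counts under "no relationship" (equal row and column totals here)
expected = [25, 25, 25, 25]

print(chi_square(observed, expected))  # 4.0
```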
SPSS Application: Chi-Square
Based on their conversations with customers, the owners of the Santa Fe Grill believe that female customers are coming to the restaurant from farther away than are male customers. The Chi-square statistic can be used to determine if this is true. The null hypothesis is no difference in distance driven (X30) between male and female customers of the Santa Fe Grill.
To conduct this analysis we examine only the responses for the Santa Fe Grill (N = 253). The SPSS click-through sequence is ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS. Click on X30 (Distance Traveled) for the Row variable and on X32 (Gender) for the Column variable. Click on the Statistics button and the Chi-square box, and then Continue. Next click on the Cells button and on Expected frequencies (Observed frequencies is usually already checked). Then click Continue and OK to execute the program.
The SPSS results are shown in Exhibit 15.7. The top table shows the actual number of responses (count) for males and females for each of the categories of X30 (Distance Driven): less than 1 mile, 1-5 miles, and more than 5 miles. For example, 74 males drove a distance of less than 1 mile while 12 females drove this same distance. The expected frequencies (count) are also shown in this table, right below the actual count.
The expected count is calculated on the basis of the proportion of the sample represented by a particular group. For example, the total sample of Santa Fe Grill customers is 253, of whom 176 are males and 77 are females. This means 69.6 percent of the sample is male and 30.4 percent is female. When we look in the Total column for the distance-driven category labeled "Less than 1 mile," we see there are a total of 86 male and female respondents. To calculate an expected frequency, you multiply the proportion a particular group represents by the total number of respondents in that response category. For example, for males you calculate 69.6 percent of 86, and the expected frequency is 59.8. Similarly, females are 30.4 percent of the sample, so the expected number of females is 26.2 (.304 x 86). The other expected frequencies are calculated in the same way.
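The expected counts can be reproduced directly from the marginal totals. This sketch in plain Python uses the figures quoted in the text:

```python
males, females = 176, 77     # group totals for the Santa Fe Grill sample
total = males + females      # 253 customers in all
category_total = 86          # respondents driving "Less than 1 mile"

# Expected count = group proportion x category total
expected_males = males / total * category_total
expected_females = females / total * category_total

print(round(expected_males, 1), round(expected_females, 1))  # 59.8 26.2
```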
Look again at the observed frequencies and note that a higher count than expected of female
customers of Santa Fe Grill drive more than 5 miles. That is, we would expect only 27.7 women
to drive to the Santa Fe Grill from more than 5 miles, but actually 34 women drove from this far
away. Similarly, there are fewer male customers than expected who drive from more than five
miles away (expected = 63.3, actual only 57). This pattern is similar for the distance of 1-5 miles. That is, a higher proportion of females are driving from this distance than would be
expected.
Information in the Chi-Square Test table shows the results for this test. The Pearson Chi-Square
value is 16.945 and it is significant at the .000 level. Since this level of significance is much less
than our standard criterion of .05, we can reject the null hypothesis of no difference in distance
driven with a high degree of confidence. The interpretation of this finding suggests that female
customers are indeed driving from farther away than expected to get to the Santa Fe Grill. At the
same time, the males are driving shorter distances than expected to get to the Santa Fe Grill.


INDEPENDENT AND RELATED SAMPLES


In addition to examining frequencies, marketing researchers often want to compare the means of
two groups. There are two possible situations when means are compared. The first is when the
means are from independent samples, and the second is when the samples are related. An
example of an independent sample comparison would be the results of interviews with male and
female coffee drinkers. The researcher may want to compare the average number of cups of
coffee consumed per day by male students with the average number of cups of coffee consumed
by female students. An example of the second situation, related samples, is when the researcher
compares the average number of cups of coffee consumed per day by male students with the
average number of soft drinks consumed per day by the same sample of male students.
In a related sample situation, the marketing researcher must take special care in analyzing the
information. Although the questions are independent, the respondents are the same. This is called
18

a paired sample. When testing for differences in related samples the researcher must use what is
called a paired samples t-test. The formula to compute the t-value for paired samples is not
presented here; students are referred to more advanced texts for the actual calculation of the
t-value for related samples. The SPSS package contains options for both the related-samples
and the independent-samples situations.
T-TEST TO COMPARE TWO MEANS
Just as with the univariate t-test, the bivariate t-test requires interval or ratio data. Also, the t-test
is especially useful when the sample size is small (n < 30) and when the population standard
deviation is unknown. Unlike the univariate test, however, we assume that the samples are drawn
from populations with normal distributions and that the variances of the populations are equal.
Essentially, the t-test for differences between group means can be conceptualized as the
difference between the means divided by the variability of the means. The t-value is a ratio of the
difference between the two sample means and the standard error. The t-test provides a
mathematical way of determining if the difference between the two sample means occurred by
chance. The formula for calculating the t-value is:

t = (x̄1 − x̄2) / s(x̄1 − x̄2)

where x̄1 and x̄2 are the two sample means and s(x̄1 − x̄2) is the standard error of the difference
between the two means.
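The computation just described can be sketched in Python: the equal-variances t-value is the difference between the two sample means divided by the pooled standard error of that difference. The data below are hypothetical and the helper name pooled_t is ours; SciPy's ttest_ind performs the identical test.

```python
# Minimal sketch of the pooled (equal-variances) t-value computation.
# Data are hypothetical; pooled_t is our own helper, not an SPSS or
# SciPy function.
import math
from scipy.stats import ttest_ind

group1 = [4, 5, 5, 6, 4, 5, 6, 5]
group2 = [3, 4, 4, 5, 3, 4, 4, 3]

def pooled_t(a, b):
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variance, group a
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)   # sample variance, group b
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)  # pooled variance
    se = math.sqrt(sp2 * (1 / na + 1 / nb))  # standard error of the difference
    return (ma - mb) / se

t_manual = pooled_t(group1, group2)
t_scipy, p_value = ttest_ind(group1, group2)  # equal_var=True is the default
```

The manually computed t-value and the SciPy result agree, which confirms that the formula and the packaged test are the same calculation.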

INDEPENDENT SAMPLES T-TEST


To illustrate the use of a t-test for the difference between two group means, let's turn to the Santa
Fe Grill database. The Santa Fe Grill owners want to find out if there are differences in the level
of satisfaction between male and female customers. To do that we can use the SPSS Compare
Means program.
The SPSS click-through sequence is Analyze → Compare Means → Independent-Samples
t-Test. When you get to this dialog box, move variable X22 (Satisfaction) into the Test Variables
box and variable X32 (Gender) into the Grouping Variable box. For variable X32 you must
define the range in the Define Groups box. Enter a 0 for Group 1 and a 1 for Group 2 (males
were coded 0 in the database and females were coded 1) and then click Continue. For the
Options we will use the defaults, so just click OK to execute the program.
Results are shown in Exhibit 15.8. The top table shows the Group Statistics. Note that 176 male
customers and 77 female customers were interviewed. The mean satisfaction level for males was
a bit higher at 4.70, compared with 4.18 for the female customers. Also, the standard deviation
for females was smaller (.823) than for the males (1.034).
To find out if the two means are significantly different, we look at the information in the
Independent Samples Test table. The statistical significance of the difference in two means is
calculated differently depending on whether the variances of the two groups are equal or unequal. In the column
labeled Sig. (2-tailed) you will note that the two means are significantly different (p < .001),
whether we assume equal or unequal variances. Thus, there is no support for the null hypothesis
that the two means are equal, and we conclude that male customers are significantly more
satisfied than female customers. There is other information in this table, but we do not need to
concern ourselves with it at this time.
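As a check on this reading of Exhibit 15.8, the equal-variances t-test can be recomputed from the summary statistics quoted above (means 4.70 and 4.18, standard deviations 1.034 and 0.823, group sizes 176 and 77) using SciPy; this is a sketch of the same calculation, not SPSS output.

```python
# Recompute the equal-variances t-test from the summary statistics
# reported in the text for Exhibit 15.8.
from scipy.stats import ttest_ind_from_stats

t_stat, p_value = ttest_ind_from_stats(
    mean1=4.70, std1=1.034, nobs1=176,   # male customers
    mean2=4.18, std2=0.823, nobs2=77,    # female customers
    equal_var=True,
)
# p_value falls well below .05, matching the Sig. (2-tailed) column
```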

PAIRED SAMPLES T-TEST


Sometimes marketing researchers want to test for differences in two means for variables in the
same sample. For example, the owners of the Santa Fe Grill noticed that the taste of their food
was rated 4.78 while the food temperature was rated only 4.38. Since the two food variables are
obviously related, they want to know if the ratings for taste really are significantly higher (more
favorable) than for temperature. To examine this, we use the paired samples test for the
difference in two means. This test examines whether two means from two different questions
using the same scaling and answered by the same respondents are significantly different. The
null hypothesis is that the mean ratings for the two food variables (X18 and X20) are equal. Note
that in this example we are looking only at the responses of the Santa Fe Grill customers.
To test this hypothesis we use the SPSS paired-samples t-test. The click-through sequence is
Analyze → Compare Means → Paired-Samples t-Test. When you get to this dialog box,
highlight both X18 (Food Taste) and X20 (Food Temperature) and then click on the arrow
button to move them into the Paired Variables box. For the Options we will use the defaults, so
just click OK to execute the program.
Results are shown in Exhibit 15.9. The top table shows the Paired Samples Statistics. The mean
for food taste is 4.78 and for food temperature is 4.38. The t-value for this comparison is 8.421
(see Paired Samples Test table) and it is significant at the .000 level. Thus we can reject the null
hypothesis that the two means are equal and conclude that Santa Fe Grill customers definitely
have more favorable perceptions of food taste than food temperature.
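Outside SPSS, the same paired-samples logic can be sketched with SciPy's ttest_rel. The ten paired ratings below are hypothetical; what matters is that each respondent supplies both values, so the two lists are aligned row by row.

```python
# Paired-samples t-test sketch: each position in the two lists is one
# respondent rating both food taste and food temperature. Data are
# hypothetical, not the Santa Fe Grill responses.
from scipy.stats import ttest_rel

taste       = [5, 5, 4, 6, 5, 4, 5, 6, 5, 4]
temperature = [4, 4, 4, 5, 4, 3, 4, 5, 4, 4]

t_stat, p_value = ttest_rel(taste, temperature)
# A small p_value means we reject the null hypothesis that the two
# mean ratings are equal.
```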

ANALYSIS OF VARIANCE (ANOVA)


Analysis of variance (ANOVA) is used to determine whether statistically significant differences
exist among three or more means. For example, if a sample finds that the average number of cups of coffee consumed
per day by freshmen during finals is 3.7, while the average number of cups of coffee consumed
per day by seniors and graduate students is 4.3 cups and 5.1 cups, respectively, are these
observed differences statistically significant? The ability to make such comparisons can be quite
useful for the marketing researcher.
The technique is really quite straightforward. In this section we describe a one-way ANOVA.
The term one-way is used since there is only one independent variable. ANOVA can be used in
cases where multiple independent variables are considered, which enables the analyst to estimate
both the individual and joint effects of the several independent variables on the dependent
variable.
An example of an ANOVA problem may be to compare light, medium, and heavy drinkers of
Starbucks coffee on their attitude toward a particular Starbucks advertising campaign. In this
instance there is one independent variable, consumption of Starbucks coffee, but it is divided into
three different levels. Our earlier t-statistics won't work here since we have more than two
groups to compare.
ANOVA requires that the dependent variable, in this case the attitude toward the Starbucks
advertising campaign, be metric. That is, the dependent variable must be either interval or ratio
scaled. A second data requirement is that the independent variable, in this case the coffee
consumption variable, be categorical.
The null hypothesis for ANOVA always states that there is no difference between the dependent
variable groups: in this situation, the ad campaign attitudes of the groups of Starbucks coffee
drinkers. In specific terminology, the null hypothesis would be:
μ1 = μ2 = μ3
ANOVA examines the variance within a set of data. Recall from the earlier discussion of
measures of dispersion that the variance of a variable is equal to the average squared deviation
from the mean of the variable. The logic of ANOVA is that if the variance between the groups is
compared to the variance within the groups, we can make a logical determination as to whether
the group means (attitudes toward the advertising campaign) are significantly different.
Determining Statistical Significance in ANOVA
In ANOVA, the F-test is used to statistically evaluate the differences between the group means.
For example, suppose the heavy users of Starbucks coffee rate the advertising campaign 4.4 on a
five-point scale, with 5 = Very favorable. The medium users of Starbucks coffee rate the
campaign 3.9, and the light users of Starbucks coffee rate the campaign 2.5. The F-test in
ANOVA tells us if these observed differences are meaningful.
The total variance in a set of responses to a question is made up of between-group and
within-group variance. The between-group variance measures how much the sample means of the
groups differ from one another. In contrast, the within-group variance measures how much the
observations within each group differ from one another. The F-ratio is the ratio of these
two components of total variance and can be calculated as follows:
F-ratio = Variance between groups/Variance within groups
The larger the difference in the variance between groups, the larger the F-ratio. Since the total
variance in a data set is divisible into between and within components, if there is more variance
explained or accounted for by considering differences between groups than there is within
groups, then the independent variable probably has a significant impact on the dependent
variable. Larger F-ratios imply significant differences between the groups. The larger the F-ratio,
the more likely it is that the null hypothesis will be rejected.
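The F-ratio just described can be sketched directly: compute the between-group and within-group mean squares and divide. The three groups of ratings are hypothetical light, medium, and heavy coffee-drinker attitudes, and the helper name f_ratio is ours; SciPy's f_oneway runs the same one-way ANOVA.

```python
# One-way ANOVA F-ratio: between-group variance (mean square between)
# divided by within-group variance (mean square within). Data are
# hypothetical attitude ratings for three consumption groups.
from scipy.stats import f_oneway

light  = [2, 3, 2, 3, 2]
medium = [4, 4, 3, 4, 4]
heavy  = [4, 5, 5, 4, 5]

def f_ratio(*groups):
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)   # between-group variance
    ms_within = ss_within / (n - k)     # within-group variance
    return ms_between / ms_within

f_manual = f_ratio(light, medium, heavy)
f_scipy, p_value = f_oneway(light, medium, heavy)
```

The hand-computed ratio matches the packaged result, and a large F with a small p leads us to reject the null hypothesis that the three group means are equal.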
ANOVA, however, is able to tell the researcher only that statistical differences exist between at
least one pair of the group means. The technique cannot identify which pairs of means are
significantly different from each other. In our example of Starbucks coffee drinkers' attitudes
toward the advertising campaign, we could conclude that differences in attitudes toward the
advertising campaign exist among light, medium, and heavy coffee drinkers, but we would not
be able to determine if the differences are between light and medium, or between light and
heavy, or between medium and heavy, and so on. We would be able to say only that there are
significant differences somewhere among the groups. Thus, the marketing researcher still must
determine where the mean differences lie. Follow-up post-hoc tests have been designed for just
that purpose.
There are several follow-up tests available in statistical software packages such as SPSS and
SAS. All of these methods involve multiple comparisons, or simultaneous assessment of
confidence interval estimates of differences between the means. All means are compared two at a
time. The differences between the techniques lie in their ability to control the error rate. We shall
briefly describe the Scheffé procedure, although a complete discussion of these techniques is
well beyond the scope of this book. Relative to the other follow-up tests mentioned, however, the
Scheffé procedure is a more conservative method of detecting significant differences between
group means.
The Scheffé follow-up test establishes simultaneous confidence intervals, which hold the entire
experiment's error rate to a specified level. The test compares the difference between each pair
of means to a high and low confidence interval range. If the difference between a pair of means
falls outside the range of the confidence interval, then we reject the null hypothesis and conclude
that the pair of means falling outside the range is statistically different. The Scheffé test might
show that one, two, or all three pairs of means in our Starbucks example are different. The
Scheffé test is equivalent to simultaneous two-tailed hypothesis tests, and the technique holds the
specified experiment-wide significance level (α). Because the technique holds the experimental
error rate to α, the confidence intervals tend to be wider than in the other methods, but the
researcher has more assurance that true mean differences exist. Recall that the Scheffé test is
very conservative, so you may wish to look at one of the other tests available in your statistical
software.
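A rough sketch of the Scheffé comparison logic, under the usual one-way ANOVA assumptions: each pair of means is converted to an F-like statistic and compared against (k − 1) times the critical F-value, which is what holds the experiment-wide error rate at α. The group data and the helper name scheffe_pairs are ours, not from the text.

```python
# Scheffe follow-up test sketch: every pair of group means is tested
# against one simultaneous critical value, (k - 1) * F(alpha; k-1, n-k).
# Groups are hypothetical coffee-drinker attitude ratings.
from itertools import combinations
from scipy.stats import f

def scheffe_pairs(groups, alpha=0.05):
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    ss_within = sum(sum((x - m) ** 2 for x in g)
                    for g, m in zip(groups, means))
    ms_within = ss_within / (n - k)
    crit = (k - 1) * f.ppf(1 - alpha, k - 1, n - k)  # Scheffe critical value
    results = {}
    for i, j in combinations(range(k), 2):
        diff = means[i] - means[j]
        f_pair = diff ** 2 / (ms_within * (1 / len(groups[i]) + 1 / len(groups[j])))
        results[(i, j)] = f_pair > crit   # True = significantly different pair
    return results

light  = [2, 3, 2, 3, 2]
medium = [4, 4, 3, 4, 4]
heavy  = [4, 5, 5, 4, 5]
pairs = scheffe_pairs([light, medium, heavy])
```

With these numbers the light group differs from both the medium and heavy groups, but the medium-heavy pair does not clear the conservative Scheffé threshold, illustrating why a significant overall F does not mean every pair of means differs.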
SPSS ApplicationANOVA
To help you understand how ANOVA is used to answer research questions, we refer to the Santa
Fe Grill database to answer a typical question. The Santa Fe Grill owners want to know how
their restaurant compares to their major competitor, Jose's Southwestern Café. They are
particularly interested in comparing satisfaction and related variables as well as gender. The
purpose of the ANOVA analysis is to see if the differences that do exist are statistically
significant. To examine the differences, an F-ratio is used. The larger the F-ratio, the more
difference there is among the means of the various groups with respect to their likelihood of
recommending the restaurant. Note that this application of ANOVA examines only two groups:

the two restaurant competitors, Santa Fe Grill and Jose's Southwestern Café. But ANOVA can be
used to examine three, four, or more groups, to identify statistical differences if they exist.
SPSS can conduct the statistical analysis to test the null hypothesis. To compare male and female
customers from the two restaurants, we first must split the sample into two groups: the male and
female customers. To do this, the click-through sequence is DATA → SPLIT FILE; then click on
Compare Groups. Now highlight the variable X32 (Gender) and move it into the Groups
Based on: window, and then click OK. Your results will now be computed for male and female
customers separately.
Next we want to test whether the two restaurants are viewed differently on selected variables.
The click-through sequence is ANALYZE → COMPARE MEANS → ONE-WAY ANOVA.
Highlight X22 (Satisfaction), X23 (Likely to Return), and X24 (Likely to Recommend) and
move them to the Dependent List window. Next, highlight x_s4 (Favorite Mexican
Restaurant) and move it to the Factor window. This tells the SPSS software to
statistically test the differences in the responses on the three variables selected. Next click on the
Options box, then on Descriptive (to get group means), and then click Continue. Now click OK to run
the test.
The results for the ANOVA are shown in Exhibit 15.10. The two restaurants differ significantly
on four of the six variables compared (see Sig. column). In the top of the table are the
comparisons for males, and they differ on only one variable: X24 (Likely to Recommend). That is,
the mean perceptions of males between the two restaurants do not differ significantly on
satisfaction or likelihood of returning. But the male customers of the Santa Fe Grill are more
likely to recommend (X24) the restaurant (mean = 3.78) than are the male customers of Jose's
Southwestern Café (mean = 3.23).
The female customers feel very differently about the two restaurants, and indeed there are
significant differences on all three variables (.000). The female customers are more favorable
about Jose's Southwestern Café, particularly in terms of satisfaction and likelihood of returning.
The females are likely to recommend Jose's restaurant (mean = 5.23 on a 7-point scale), but not
likely to recommend the Santa Fe Grill (mean = 3.22).
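A side note the two-restaurant comparison makes easy to verify: with exactly two groups, one-way ANOVA and the independent-samples t-test are the same test, because the F-ratio equals the squared t-value. A sketch with hypothetical ratings:

```python
# With two groups, one-way ANOVA is equivalent to the independent-
# samples t-test: F = t squared. Ratings below are hypothetical.
from scipy.stats import f_oneway, ttest_ind

santa_fe = [4, 5, 4, 3, 4, 5, 4]
joses    = [3, 3, 4, 2, 3, 3, 4]

f_stat, p_f = f_oneway(santa_fe, joses)
t_stat, p_t = ttest_ind(santa_fe, joses)
# f_stat equals t_stat ** 2, and the two p-values coincide
```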
n-Way ANOVA
Discussion of ANOVA to this point has been devoted to one-way ANOVA in which there is only
one independent variable. In the examples, the usage category (consumption of Starbucks coffee)
or the restaurant competitors (Santa Fe Grill and Jose's Southwestern Café) was the single
independent variable. It is not at all uncommon, however, for the researcher to be interested in
several independent variables simultaneously. In such cases an n-way ANOVA would be used.
Often researchers are interested in the region of the country where a product is sold as well as
consumption patterns. Using multiple independent factors creates the possibility of an interaction
effect. That is, the multiple independent factors can act together to affect group means. For
example, heavy consumers of Starbucks coffee in the Northeast may have different attitudes

about advertising campaigns than heavy consumers of Starbucks coffee in the West, and there
may be still further differences between the various coffee-consumption-level groups.
Another situation that may require n-way ANOVA is the use of experimental designs, where the
researcher uses different levels of a stimulus (for example, different prices or ads) and then
measures responses to those stimuli. For example, a marketer may be interested in finding out
whether consumers prefer a humorous ad to a serious one and whether that preference varies
across gender. Each type of ad could be shown to different groups of customers (both male and
female). Then, questions about their preferences for the ad and the product it advertises could be
asked. The primary difference between the groups would be the difference in ad execution
(humorous or nonhumorous) and customer gender. An n-way ANOVA could be used to find out
whether the ad execution differences helped cause differences in ad and product preferences, as
well as what effects might be attributable to customer gender.
From a conceptual standpoint, n-way ANOVA is similar to one-way ANOVA, but the
mathematics is more complex. However, statistical packages such as SPSS will conveniently
allow the marketing researcher to perform n-way ANOVA.

PERCEPTUAL MAPPING
While our fast-food example illustrates how perceptual mapping grouped pairs of restaurants
together based on perceived ratings, perceptual mapping has many other important applications
in marketing research. Other applications include

• New-product development. Perceptual mapping can identify gaps in perceptions and thereby
help to position new products.
• Image measurement. Perceptual mapping can be used to identify the image of a company and
help position it relative to the competition.
• Advertising. Perceptual mapping can assess advertising effectiveness in positioning the brand.
• Distribution. Perceptual mapping can be used to assess similarities of brands and channel
outlets.