
Revision on Surveys and Multivariate Analysis
BY ISB ACADEMIC TEAM

For further information and step-by-step guides to solving problems, please refer to the tutoring videos uploaded
to the ISB Academic Team Facebook fan page.

Under no circumstances should one copy this document without the author’s permission
Chapter 1,2,4 - ISB Academic Team

CHAPTER 1, 2, 4

MARKETING RESEARCH - PROBLEM IDENTIFICATION

1. Marketing Research

 Marketing research is the systematic and objective identification, collection,
analysis, dissemination, and use of information for the purpose of improving
decision making related to the identification and solution of problems and
opportunities in marketing.

 A classification of marketing research: Problem-identification research and
problem-solving research go hand in hand, and a given marketing research project
may combine both types of research.
o Problem-identification research is undertaken to help identify problems
that are, perhaps, not apparent on the surface and yet exist or are likely to
arise in the future. Research of this type provides information about the
marketing environment and helps diagnose a problem.
o Problem-solving research is undertaken to arrive at a solution once a
problem or opportunity has been identified. The findings of problem-
solving research are used in making decisions that will solve specific
marketing problems.


2. Marketing Research Process

 A set of six steps that defines the tasks to be accomplished in conducting a
marketing research study.
 Step 1: Problem Definition
o The researcher should take into account:
 The purpose of the study
 The relevant background information
 The information needed
 How it will be used in decision making.
o Involves discussion with the decision makers, interviews with industry
experts, analysis of secondary data, and qualitative research.


 Step 2: Development of an Approach to the Problem


o Includes formulating an objective or theoretical framework, analytical
models, research questions, and hypotheses and identifying the
information.
o Guided by discussions with management and industry experts, analysis of
secondary data, qualitative research, and pragmatic considerations.

 Step 3: Research Design Formulation


o A framework for conducting the marketing research project that details the
procedures necessary for obtaining the required information.
o Its purpose is to design a study that will:
 Test the hypotheses of interest
 Determine possible answers to the research questions
 Provide the information needed for decision making.
o Formulating the research design involves:
 Definition of the information needed
 Secondary data analysis
 Qualitative research
 Methods of collecting quantitative data
 Measurement and scaling procedures
 Questionnaire design
 Sampling process and sample size
 Plan of data analysis

 Step 4: Fieldwork or Data Collection


o Involves a field force or staff that operates either in the field, from an office
by telephone, through mail, or electronically.
o Proper selection, training, supervision, and evaluation of the field force
help minimize data-collection errors.


 Step 5: Data Preparation and Analysis


o Data preparation includes the editing, coding, transcription, and
verification of data.
o Editing: Each questionnaire or observation form is inspected or edited
and, if necessary, corrected.
o Coding: Number or letter codes are assigned to represent each response to
each question in the questionnaire.
o Transcription: The data from the questionnaires are transcribed onto
magnetic tape or disks, or input directly into the computer.
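o Verification: The transcribed data are checked to ensure that they have been
accurately transcribed from the original questionnaires.
o Data analysis: The data are analyzed to derive information related
to the components of the marketing research problem and to provide input
into the management decision problem.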
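
The coding step described above can be sketched in a few lines of Python. This is only an illustration: the question, the answer labels, the codebook, and the missing-value code are all assumptions made up for the example.

```python
# Hypothetical codebook for one question: "How often do you shop online?"
CODEBOOK = {"never": 1, "rarely": 2, "sometimes": 3, "often": 4}
MISSING = 9  # assumed code for blank or unrecognised answers


def code_response(raw):
    """Return the number code for one raw answer, ignoring case and spaces."""
    return CODEBOOK.get(raw.strip().lower(), MISSING)


raw_answers = ["Often", " never", "sometimes", "", "RARELY"]
coded = [code_response(answer) for answer in raw_answers]
print(coded)
```

Keeping an explicit code for missing answers makes blank responses visible during editing and verification instead of silently dropping them.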

 Step 6: Report Preparation and Presentation


o The entire project should be documented in a written report that:
 Addresses the specific research questions identified.
 Describes the approach, the research design, data collection, and
data analysis procedures adopted.
 Presents the results and the major findings.
o The findings should be presented in a comprehensible format so that
management can readily use them in the decision-making process.
o An oral presentation should be made to management using tables, figures,
and graphs to enhance clarity and impact.
o The Internet is also being used to disseminate marketing research results
and reports, which can be posted on the Web and made available to
managers on a worldwide basis.
o Notes: These steps are interdependent and iterative. Thus, at each step, the
researcher should not only look back at the previous steps but also look
ahead to the following steps.


 The role of Marketing Research in Decision Making


o The task of marketing research is to assess the information needs and
provide management with relevant, accurate, reliable, valid, current, and
actionable information.

 Competitive Intelligence
o The process of enhancing marketplace competitiveness through a greater
understanding of a firm’s competitors and the competitive environment.
o This process is unequivocally ethical. It involves the legal collection and
analysis of information regarding the capabilities, vulnerabilities, and
intentions of business competitors, conducted by using information
databases and other “open sources” and through ethical marketing
research inquiry.
o CI enables senior managers in companies of all sizes to make informed
decisions about everything from marketing, research and development
(R&D), and investing tactics to long-term business strategies.


3. MDP (Management Decision Problem) vs MRP (Marketing Research Problem)

Management decision problem (MDP)
 The problem confronting the decision maker. It asks what the decision maker needs to do.
 Action oriented.
 Focuses on symptoms.
 Examples:
o Should a new product be introduced?
o Should the advertising campaign be changed?
o Should the price of the brand be increased?

Marketing research problem (MRP)
 A problem that entails determining what information is needed and how it can be
obtained in the most feasible way.
 Information oriented.
 Focuses on underlying causes.
 Examples:
o To determine consumer preferences and purchase intentions for the proposed new
product
o To determine the effectiveness of the current advertising campaign
o To determine the price elasticity of demand and the impact on sales and profits of
various levels of price changes

 While distinct, the marketing research problem has to be closely linked to the
management decision problem.
 Conceptual map: A way to link the broad statement of the marketing research
problem to the management decision problem.
o Management wants to (take action): the rationale for the question and the
project (the MDP).
o Therefore, we should study (topic): the broader topic being
investigated.
o So that we can explain (question): the who/how/why that needs to be
explained.
o Notes: The second and third lines define the broad marketing research
problem (MRP).


4. Data in research

                     Primary Data               Secondary Data
Collection purpose   For the problem at hand    For other problems
Collection process   Very involved              Rapid and easy
Collection cost      High                       Relatively low
Collection time      Long                       Short

 Primary Data
o Originated by a researcher for the specific purpose of addressing the
problem at hand.
o Obtaining primary data can be expensive and time consuming.
 Secondary Data
o Data that has already been collected for purposes other than the problem at
hand. These data can be located quickly and inexpensively.
o Advantages
 Secondary data are easily accessible, relatively inexpensive, and
quickly obtained.
 Secondary data can help you:
 Identify the problem.
 Better define the problem.
 Develop an approach to the problem.
 Formulate an appropriate research design (for example, by
identifying the key variables).
 Answer certain research questions and test some
hypotheses.
 Interpret primary data more insightfully.


o Disadvantages
 The objectives, nature, and methods used to collect the
secondary data may not be appropriate to the present
situation.
 Secondary data may be lacking in accuracy, or they may not
be completely current or dependable.
o Rule for using secondary data
 Start with secondary data. Proceed to primary data only when the
secondary data sources have been exhausted or yield marginal
returns.

o Classification of Secondary Data


 Internal data are those generated within the organization for
which the research is being conducted. This information may be
available in a ready-to-use format, or it may require considerable
processing before it is useful to the researcher.
 Database marketing involves the use of computers to
capture and track customer profiles and purchase details.
This secondary information serves as the foundation for
marketing programs or as an internal source of information
related to customer behavior.
 External data are those generated by sources outside the
organization. These data may exist in the form of published
material, computerized databases, or information made available
by syndicated services.


5. Published External Secondary Sources

 Published External Secondary Sources include federal, state, and local
governments, nonprofit organizations, trade associations and professional
organizations, commercial publishers, investment brokerage firms, and
professional marketing research firms.
o General business sources consist of guides, directories, indexes, and
statistical data.
o Government sources may be broadly categorized as census data and other
publications.
 Computerized databases consist of information that has been made available in
computer-readable form for electronic distribution.
 Syndicated Sources of Secondary Data
o Information services offered by marketing research organizations that
provide information from a common database to different firms that
subscribe to their services.

Chapter 2, 8, 9, 10 – ISB Academic Team

CHAPTER 2, 8, 9, 10
APPROACH DEVELOPMENT, MEASUREMENT & QUESTIONNAIRE

1. Developing the approach


 The outputs of the approach development process should include the following
components: objective/theoretical framework, analytical models, research
questions, hypotheses, and specification of information needed.


 Objective/ Theoretical Framework


o Research should be based on objective evidence and supported by theory.
o A theory is a conceptual scheme based on foundational statements called
axioms, which are assumed to be true.
o Objective evidence (evidence that is unbiased and supported by empirical
findings) is gathered by compiling relevant findings from secondary
sources.

 Analytical Model
o An analytical model is a set of variables and their interrelationships
designed to represent, in whole or in part, some real system or process.
o Models can have many different forms. The most common are verbal,
graphical, and mathematical structures.
 Verbal Models: Analytical models that provide a written
representation of the relationships between variables.
 Graphical Models: Analytical models that provide a visual picture
of the relationships between variables.
 Mathematical Models: Analytical models that explicitly describe
the relationships between variables, usually in the equation form.
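
As an illustration only, a mathematical model might take a simple linear form. The symbols below are assumptions introduced for this sketch and do not come from a specific study:

```latex
y = a_0 + \sum_{i=1}^{n} a_i x_i
```

Here y could stand for the degree of preference for a store, the x_i for the factors thought to drive that preference, and a_0 and the a_i for parameters to be estimated from data.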
 Research Questions
o Research questions are refined statements of the specific components of
the problem.
 To develop an approach, each component of the problem may have
to be broken down into subcomponents or research questions.
o Research questions ask what specific information is required with respect
to the problem components.
o The formulation of the research questions should be guided not only by
the problem definition but also by the theoretical framework and the
analytical model adopted.


 Hypotheses
o A hypothesis (H) is an unproven statement or proposition about a factor
or phenomenon that is of interest to the researcher.
o Hypotheses go beyond research questions because they are statements of
relationships or propositions rather than merely questions to which
answers are sought.
o An important role of a hypothesis is to suggest variables to be included in
the research design.

 Specification of Information Needed


o By focusing on each component of the problem and the analytical
framework and models, research questions, and hypotheses, the researcher
can determine what information should be obtained in the marketing
research project.


2. Measurement & Scales


 Measurement means assigning numbers or other symbols to characteristics of
objects according to certain pre-specified rules.
o The most important aspect of measurement is the specification of rules for
assigning numbers to the characteristics.
o The assignment process must be isomorphic: There must be a one-to-one
correspondence between the numbers and the characteristics being
measured.
 Scaling may be considered an extension of measurement. Scaling involves creating
a continuum upon which measured objects are located.
o To illustrate, consider a scale from 1 to 100 for locating consumers
according to the characteristic “attitude toward department stores.” Each
respondent is assigned a number from 1 to 100 indicating the degree of
(un)favorableness, with 1 extremely unfavourable, and 100 extremely
favourable. Scaling is the process of placing the respondents on a
continuum with respect to their attitude toward department stores.
 Primary Scales of Measurement
o There are four primary scales of measurement: nominal, ordinal, interval,
and ratio.


 Nominal Scale
o A nominal scale is a figurative labelling scheme in which the numbers
serve only as labels or tags for identifying and classifying objects.
o When used for identification, there is a strict one-to-one correspondence
between the numbers and the objects.
 Common examples include Social Security numbers and numbers
assigned to football players. In marketing research, nominal scales
are used for identifying respondents, brands, attributes, stores, and
other objects.
 When used for classification purposes, the nominally scaled
numbers serve as labels for classes or categories.
 For example, you might classify the control group as group
1 and the experimental group as group 2.
 The numbers in a nominal scale do not reflect the amount of the
characteristic possessed by the objects.
 The only permissible operation on the numbers in a nominal scale is
counting.
 Only a limited number of statistics, all of which are based on
frequency counts, are permissible. These include percentages,
mode, chi-square, and binomial tests.
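
A minimal sketch of the frequency-based statistics that are meaningful for nominal data. The brand codes and their labels are hypothetical; the point is that counting, percentages, and the mode make sense, while an average of the codes would not.

```python
from collections import Counter

# Hypothetical brand codes: 1 = Brand A, 2 = Brand B, 3 = Brand C.
brand_codes = [1, 3, 2, 1, 1, 3, 2, 1]

# Counting is the only permissible operation on nominal codes.
counts = Counter(brand_codes)
mode_code, mode_count = counts.most_common(1)[0]
percentages = {code: 100 * n / len(brand_codes) for code, n in counts.items()}

print(counts)
print(mode_code)
print(percentages)
```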


 Ordinal Scale
o An ordinal scale: A ranking scale in which numbers are assigned to objects
to indicate the relative extent to which some characteristic is possessed.
o An ordinal scale allows you to determine whether an object has more or
less of a characteristic than some other object, but not how much more or
less.
 Indicates relative position, not the magnitude of the differences
o Common examples of ordinal scales include quality rankings, rankings of
teams in a tournament, socioeconomic class, and occupational status.
 The ordinal scales possess description and order characteristics
but do not possess distance (or origin).
 In an ordinal scale, as in a nominal scale, equivalent objects receive
the same rank. Any series of numbers can be assigned that
preserves the ordered relationships between the objects.
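
The last point can be checked with a short sketch. The ranks and the recoding rule below are hypothetical; any other increasing recoding would work equally well.

```python
# Hypothetical preference ranks for four brands (1 = most preferred).
ranks = {"A": 1, "B": 2, "C": 3, "D": 4}

# An order-preserving (monotonically increasing) recoding of the same ranks.
recoded = {brand: r ** 2 + 10 for brand, r in ranks.items()}


def ordering(scores):
    """Return the brands sorted from lowest to highest score."""
    return sorted(scores, key=scores.get)


print(ordering(ranks))
print(ordering(recoded))
# The gaps between codes change, so differences carry no meaning here.
print(ranks["B"] - ranks["A"], recoded["B"] - recoded["A"])
```

Both codings give the same ordering of brands, but the size of the gap between A and B depends on the coding chosen, which is why an ordinal scale indicates relative position only.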

 Interval Scale
o A scale in which the numbers are used to rate objects such that numerically
equal distances on the scale represent equal distances in the characteristic
being measured.
o An interval scale contains all the information of an ordinal scale, but it also
allows you to compare the differences between objects.
o A common example in everyday life is a temperature scale. In marketing
research, attitudinal data obtained from rating scales are often treated as
interval data.
o In an interval scale, the location of the zero point is not fixed, i.e., these
scales do not possess the origin characteristic. Both the zero point and
the units of measurement are arbitrary.
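
A small sketch using the temperature example. Converting Celsius to Fahrenheit is a linear transformation of the form y = a + bx, which changes both the zero point and the unit; the temperature values are made up for illustration.

```python
def c_to_f(celsius):
    """Linear transformation y = a + b*x with a = 32 and b = 9/5."""
    return 32 + 9 * celsius / 5


temps_c = [10.0, 20.0, 40.0]
temps_f = [c_to_f(t) for t in temps_c]

# Ratios of scale values depend on the arbitrary zero point.
ratio_c = temps_c[1] / temps_c[0]
ratio_f = temps_f[1] / temps_f[0]

# Ratios of differences survive the change of zero point and unit.
diff_ratio_c = (temps_c[2] - temps_c[1]) / (temps_c[1] - temps_c[0])
diff_ratio_f = (temps_f[2] - temps_f[1]) / (temps_f[1] - temps_f[0])

print(ratio_c, ratio_f)
print(diff_ratio_c, diff_ratio_f)
```

Because the two value ratios disagree, a statement such as "20 degrees is twice as hot as 10 degrees" has no meaning on an interval scale, whereas comparisons of differences do.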

 Ratio Scale
o A ratio scale possesses all the properties of the nominal, ordinal, and
interval scales and, in addition, an absolute zero point.
o Ratio scales possess the characteristic of origin (and distance, order, and
description).
o With ratio scales, we can identify or classify objects, rank the objects, and
compare intervals or differences. It is also meaningful to compute ratios of
scale values.
 Common examples of ratio scales include height, weight, age, and
money.
 In marketing, sales, costs, market share, and number of customers
are variables measured on a ratio scale.


 Ratio scales allow only proportionate transformations of the form
y = bx, where b is a positive constant. One cannot add an arbitrary
constant, as in the case of an interval scale.
o All statistical techniques can be applied to ratio data. These include
specialised statistics such as geometric mean, harmonic mean, and
coefficient of variation.
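
As a rough sketch, the snippet below applies a proportionate transformation y = bx to hypothetical sales figures and shows that ratios and the coefficient of variation are unaffected. The data, the factor b, and the helper name `coefficient_of_variation` are assumptions for illustration.

```python
import statistics

sales = [100.0, 200.0, 400.0]            # hypothetical sales figures
b = 2.0                                  # assumed positive rescaling constant
sales_rescaled = [b * x for x in sales]  # proportionate transformation y = bx


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


ratio_before = sales[2] / sales[0]
ratio_after = sales_rescaled[2] / sales_rescaled[0]
cv_before = coefficient_of_variation(sales)
cv_after = coefficient_of_variation(sales_rescaled)
geo_mean = statistics.geometric_mean(sales)

print(ratio_before, ratio_after)
print(cv_before, cv_after)
print(geo_mean)
```

Adding a constant instead of multiplying would change both the ratios and the coefficient of variation, which is why only y = bx is permitted for ratio data.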

 Itemised Rating Scale


o Itemized rating scale: A measurement scale having numbers and/or brief
descriptions associated with each category. The categories are ordered in
terms of scale position.
o Itemised rating scales are widely used in marketing research and form the
basic components of more complex scales, such as multi-item rating scales.
o The commonly used itemised rating scales are the Likert scale, semantic
differential, and Stapel scales.

 Likert Scale
o Likert scale is a widely used rating scale that requires the respondents to
indicate the degree of agreement or disagreement with each of a series of
statements about the stimulus object.


 The data are typically treated as interval data. Thus, the Likert scale
possesses the characteristics of description, order, and distance.
o Advantages
 Easy to construct and administer.
 Suitable for mail, telephone, personal, or electronic
interviews.
o Disadvantages
 Takes longer to complete than other itemised rating scales
because respondents have to read each statement.
 It may be difficult to interpret the response to a Likert item, especially
if it is an unfavourable statement.
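
One common way to handle unfavourable statements when totalling Likert items is to reverse their scores first, so that a high total always means a more favourable attitude. The sketch below assumes a 5-point scale and four made-up items; the reverse-scoring rule is a standard convention, not something specified in these notes.

```python
SCALE_MAX = 5  # assumed 5-point scale: 1 = strongly disagree, 5 = strongly agree

# Hypothetical items; True marks an unfavourable (negatively worded) statement.
items_unfavourable = [False, True, False, True]
responses = [4, 2, 5, 1]  # one respondent's raw answers


def summated_score(raw, unfavourable, scale_max=SCALE_MAX):
    """Sum the item scores, reversing the unfavourable items first."""
    total = 0
    for answer, is_unfavourable in zip(raw, unfavourable):
        total += (scale_max + 1 - answer) if is_unfavourable else answer
    return total


score = summated_score(responses, items_unfavourable)
print(score)
```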

 Semantic Differential Scale


o The semantic differential is a 7-point rating scale with endpoints
associated with bipolar labels that have semantic meaning.
 The respondents mark the blank that best indicates how they would
describe the object being rated.
o In a typical application, respondents rate objects on a number of itemised,
7- point rating scales bounded at each end by one of two bipolar adjectives,
such as “cold” and “warm.”

 Stapel Scale
o Stapel Scale is a scale for measuring attitudes that consists of a single
adjective in the middle of an even-numbered range of values, from -5 to +5,
without a neutral point (zero).
o Respondents are asked to indicate how accurately or inaccurately each
term describes the object by selecting an appropriate numerical response
category.
o The higher the number, the more accurately the term describes the object, as
shown in the department store project.


 Scale Evaluation
o A multi-item scale should be evaluated for accuracy and applicability,
involving an assessment of reliability, validity, and generalizability of the
scale.

 Reliability
o Reliability refers to the extent to which a scale produces consistent results
if repeated measurements are made.
o Reliability is assessed by determining the proportion of systematic
variation in a scale.


 This is done by determining the association between scores
obtained from different administrations of the scale.
 If the association is high, the scale yields consistent results and is
therefore reliable.
o Approaches to assessing reliability include test-retest reliability,
alternative-forms reliability, and internal consistency reliability.
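
A minimal sketch of the test-retest idea: the same scale is administered twice to the same respondents and the two sets of scores are correlated. The scores are invented, and using the Pearson correlation as the measure of association is an assumption of this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scale scores for five respondents at two points in time.
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

r = pearson_r(first_administration, second_administration)
print(round(r, 3))
```

A coefficient close to 1 suggests that the scale gives consistent results across administrations; a low value points to a large share of random error.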

 Validity
o The validity of a scale may be defined as the extent to which differences in
observed scale scores reflect true differences among objects on the
characteristic being measured, rather than systematic or random error.
o Validity can be assessed by examining content validity, criterion validity,
and construct validity.
 CONTENT VALIDITY
 Content validity, sometimes called face validity, is a
subjective but systematic evaluation of how well the content
of a scale represents the measurement task at hand.
 CRITERION VALIDITY
 Criterion validity reflects whether a scale performs as
expected in relation to other variables selected as
meaningful criteria (criterion variables).
 CONSTRUCT VALIDITY
 Construct validity addresses the question of what
construct or characteristic the scale is, in fact, measuring.
 Construct validity includes convergent, discriminant, and
nomological validity.
 Convergent validity is the extent to which the scale
correlates positively with other measures of the same
construct.

 Generalizability
o Generalizability refers to the extent to which one can generalise from the
observations at hand to a universe of generalisations.
o The set of all conditions of measurement over which the investigator
wishes to generalise is the universe of generalisation.
 In generalizability studies, measurement procedures are designed
to investigate the universes of interest by sampling conditions of
measurement from each of them.
 For each universe of interest, an aspect of measurement called a
facet is included in the study.


3. Questionnaire:
 Objectives: Any questionnaire has three specific objectives.
o First, it must translate the information needed into a set of specific
questions that the respondents can and will answer.
o Second, a questionnaire must uplift, motivate, and encourage the
respondent to become involved in the interview, to cooperate, and to
complete the interview.
o Third, a questionnaire should minimise response error.

 Questionnaire Design Process:


o Questionnaire design will be presented as a series of steps (see Figure
10.1).


 Double-barreled & Filter Question


o Double-barreled question: A single question that attempts to cover two
issues.
o Such questions can be confusing to respondents and result in ambiguous
responses. Sometimes, several questions are needed to obtain the required
information in an unambiguous manner.
o Example: Consider the question: “Do you think Coca-Cola is a tasty and
refreshing soft drink?” (Incorrect)
 What if the answer is “no”? -> Does this mean that the respondent
thinks that Coca-Cola is not tasty, that it is not refreshing, or that it
is neither tasty nor refreshing?
 To obtain the required information unambiguously, two distinct
questions should be asked: “Do you think Coca-Cola is a tasty soft
drink?” and “Do you think Coca-Cola is a refreshing soft drink?”
(Correct).

 Filter question: An initial question in a questionnaire that screens potential
respondents to ensure they meet the requirements of the sample.
o In situations where not all respondents are likely to be informed about the
topic of interest, filter questions that measure familiarity, product use, and
past experience should be asked before questions about the topics
themselves.

 Structured & Unstructured Question


o Unstructured Questions: are open-ended questions that respondents
answer in their own words.
 Advantages:
 Enable the respondents to express general attitudes and
opinions.
 Have a much less biasing influence on response than
structured questions.
 Respondents are free to express any views. Their comments
and explanations can provide the researcher with rich
insights.
 Disadvantages:
 The potential for interviewer bias is high.
 The coding of responses is costly and time-consuming.


o Structured Questions: specify the set of response alternatives and the
response format. A structured question may be multiple choice,
dichotomous, or a scale.
 MULTIPLE-CHOICE QUESTIONS
 In multiple-choice questions, the researcher provides a
choice of answers and respondents are asked to select one
or more of the alternatives given.
 DICHOTOMOUS QUESTIONS
 A dichotomous question has only two response alternatives:
yes or no, agree or disagree, and so on.

 Question-Wording: is the translation of the desired question content and structure
into words that respondents can clearly and easily understand.
o To avoid poorly worded questions, follow these guidelines:
(1) define the issue
(2) use ordinary words
(3) use unambiguous words
(4) avoid leading questions
(5) avoid implicit alternatives
(6) avoid implicit assumptions
(7) avoid generalizations and estimates
(8) use positive and negative statements.

 Order of Questions
o Opening Questions: can be crucial in gaining the confidence and
cooperation of respondents. The opening questions should be interesting,
simple, and nonthreatening.
o Type of Information: The type of information obtained in a questionnaire
may be classified as

(1) basic information

(2) classification information

(3) identification information


o Difficult Questions: Questions that are difficult, sensitive,
embarrassing, complex, or dull should be placed late in the sequence.
o Effect on Subsequent Question: Questions asked early in a sequence can
influence the responses to subsequent questions. As a rule of thumb,
general questions should precede specific questions. This prevents specific
questions from biasing responses to general questions.

o Logical Order: Questions should be asked in a logical order. All of the questions
that deal with a particular topic should be asked before beginning a new topic.

Chapter 6, 11, 13 – ISB Academic Team

CHAPTER 6,11,13
SAMPLING, SURVEY, AND FIELDWORK

1. SAMPLING DESIGN: involves several basic questions.

1. Should a sample be taken?

2. If so, what process should be followed?

3. What kind of sample should be taken?

4. How large should it be?

5. What can be done to control and adjust for nonresponse errors?

 Population: The aggregate of all the elements, sharing some common set of
characteristics, that comprises the universe for the purpose of the marketing
research problem.
o Census: involves a complete enumeration of the elements of a population.
o Sample: a subgroup of the population selected for participation in the study.
 Sample characteristics (sample statistics) are used to make
inferences (estimation procedures and tests of hypotheses) about
the population parameters.


Sample vs Census: Conditions Favoring the Use of

                                       Sample         Census
1. Budget                              Small          Large
2. Time available                      Short          Long
3. Population size                     Large          Small
4. Variance in the characteristic      Small          Large
5. Cost of sampling errors             Low            High
6. Cost of nonsampling errors          High           Low
7. Nature of measurement               Destructive    Nondestructive
8. Attention to individual cases       Yes            No

 The Sampling Design Process: 5 steps


 Define the target population


o Target population: The collection of elements or objects that possess the
information sought by the researcher and about which inferences are to be
made.
 Defined in terms of elements, sampling units, extent, and time.
 Element: the object about which or from which the information is
desired. In survey research, the element is usually the respondent.
 Sampling Unit: an element, or a unit containing the element
available for selection at some stage of the sampling process.

 Determine the sampling frame


o Sampling Frame: a representation of the elements of the target population.
 Consists of a list or set of directions for identifying the target
population.
o Sampling frame error: the sampling frame may omit some elements of the
population or include elements that do not belong to it.
 Three ways to recognize and treat the sampling frame error:
 Redefine the population in terms of the sampling frame.
 Screen the respondents in the data-collection phase.
 Adjust the data collected by a weighting scheme to
counterbalance the sampling frame error.
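
One simple weighting scheme, shown here only as a sketch, gives each group a weight equal to its population share divided by its sample share. The groups, shares, and counts are hypothetical, and other weighting schemes are possible.

```python
# Hypothetical case: the sampling frame under-represents renters.
population_share = {"owner": 0.60, "renter": 0.40}
sample_counts = {"owner": 150, "renter": 50}

n = sum(sample_counts.values())

# Weight = population share / sample share for each group.
weights = {
    group: population_share[group] / (sample_counts[group] / n)
    for group in sample_counts
}

# After weighting, the group totals match the population proportions.
weighted_counts = {g: sample_counts[g] * weights[g] for g in sample_counts}

print(weights)
print(weighted_counts)
```

Renters receive a weight above 1 because they are under-represented in the sample, and owners a weight below 1.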

 Select a sampling technique(s)


o Bayesian approach: the elements are selected sequentially.
 Explicitly incorporates prior information about population
parameters and the costs and probabilities associated with making
wrong decisions.
o Sampling with replacement: an element can be included in the sample
more than once.
o Sampling without replacement: an element cannot be included in the
sample more than once.
o The most important decision about the choice of sampling technique is
whether to use probability or nonprobability sampling.


 Determine the sample size


o Sample size: The number of elements to be included in a study.
 Important qualitative factors that should be considered in
determining the sample size include:
(1) the importance of the decision
(2) the nature of the research
(3) the number of variables
(4) the nature of the analysis
(5) sample sizes used in similar studies
(6) incidence rates
(7) completion rates
(8) resource constraints.
o Sampling efficiency reflects a trade-off between sampling cost and precision: as the
sample size increases, each unit of information is obtained at greater cost.
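
Incidence and completion rates matter because not everyone contacted qualifies for the study or finishes the interview. A common rule of thumb, used here as an assumption rather than something stated in these notes, is: initial sample size = final sample size / (incidence rate x completion rate). The function name and the figures are hypothetical.

```python
import math


def initial_sample_size(final_n, incidence_rate, completion_rate):
    """Number of contacts needed to end up with final_n completed interviews."""
    if not (0 < incidence_rate <= 1 and 0 < completion_rate <= 1):
        raise ValueError("rates must be greater than 0 and at most 1")
    return math.ceil(final_n / (incidence_rate * completion_rate))


# Hypothetical study: 400 completes needed, 25% of contacts qualify,
# and half of the qualified respondents complete the interview.
contacts_needed = initial_sample_size(400, incidence_rate=0.25, completion_rate=0.5)
print(contacts_needed)
```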

 Execute the sampling process



 A Classification of Sampling Techniques


o Nonprobability Sampling relies on the personal judgment of the
researcher rather than chance to select sample elements.
 Convenience sampling: attempts to obtain a sample of convenient
elements. The selection of sampling units is left primarily to the
interviewer.
 Judgmental Sampling: the population elements are purposely
selected based on the judgment of the researcher.
 Quota Sampling: two-stage restricted judgmental sampling.
 First stage: developing control categories or quotas of
population elements.
 Second stage: sample elements are selected based on
convenience or judgment.
 Snowball Sampling: an initial group of respondents is selected
randomly. Subsequent respondents are selected based on the
referrals or information provided by the initial respondents.

o Probability Sampling: sampling units are selected by chance.


 Simple Random Sampling (SRS): each element in the population
has a known and equal probability of selection.
 Every element is selected independently of every other
element and the sample is drawn by a random procedure
from a sampling frame (aka lottery system)
 Systematic Sampling: the sample is chosen by selecting a random
starting point and then picking every i-th element in succession from
the sampling frame.
 As in SRS, each population element has a known and equal
probability of selection, but elements are not selected
independently of one another: once the starting point is
chosen, the rest of the sample is determined.
 Only the permissible samples of size n that can be drawn
have a known and equal probability of selection.
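
The difference between simple random and systematic sampling can be sketched as follows. The sampling frame of 100 numbered elements is hypothetical, and taking the sampling interval as i = N // n (frame size divided by sample size) is an assumption of this example.

```python
import random


def simple_random_sample(frame, n, rng):
    """Draw n distinct elements, each with an equal chance of selection."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Pick a random start, then every i-th element, with i = len(frame) // n."""
    i = len(frame) // n          # sampling interval (assumes len(frame) >= n)
    start = rng.randrange(i)     # random starting position within the first interval
    return [frame[start + k * i] for k in range(n)]


rng = random.Random(42)          # fixed seed so the sketch is repeatable
frame = list(range(1, 101))      # hypothetical frame of 100 numbered elements

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
print(sorted(srs))
print(sys_sample)
```

With this frame only 10 distinct systematic samples are possible (one per starting point), whereas SRS can produce any combination of 10 elements.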
 Stratified Sampling: uses a two-step process to partition the
population into subpopulations, or strata (mutually exclusive and
collectively exhaustive).
 Elements are selected from each stratum by a random
procedure.
 Stratification variables: used to partition the population
into strata.
 Criteria for selection: homogeneity, heterogeneity,
relatedness, and cost.

 Common variables: demographic characteristics, type of
customer, size of firm, or type of industry.

 Proportionate or disproportionate sampling?


o Proportionate stratified sampling: the size of the sample drawn from
each stratum is proportionate to the relative size of that stratum in the total
population.
o Disproportionate stratified sampling: the size of the sample from each
stratum is proportional to the relative size of that stratum and to the
standard deviation of the distribution of the characteristic of interest
among all the elements in that stratum.

 Cluster Sampling: multistage process.

(1) The target population is divided into mutually exclusive and
collectively exhaustive subpopulations or clusters.
(2) A random sample of clusters is selected, based on a probability
sampling technique such as SRS.
o For each selected cluster, either all the elements are included in the
sample or a sample of elements is drawn probabilistically.
 One-stage cluster sampling: If all the elements in each selected
cluster are included.
 Two-stage cluster sampling: If a sample of elements is drawn
probabilistically from each selected cluster.
o Area sampling: A common form of cluster sampling, in which the clusters
consist of geographic areas, such as counties, housing tracts, or blocks.
o In probability proportionate to size sampling, the clusters are sampled with
probability proportional to size.
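The probability sampling techniques above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame (units numbered 1 to 100), the strata, the clusters, the stratum standard deviations, and all function names are hypothetical, and the systematic sample assumes the population size is a multiple of the sample size.

```python
import random


def allocate(stratum_sizes, n, stratum_sds=None):
    """Split a total sample size n across strata.

    Proportionate: weight = stratum size N_h.
    Disproportionate: weight = N_h * standard deviation s_h.
    The results are fractional; in practice they are rounded.
    """
    weights = {
        h: size * (stratum_sds[h] if stratum_sds else 1)
        for h, size in stratum_sizes.items()
    }
    total = sum(weights.values())
    return {h: n * w / total for h, w in weights.items()}


def simple_random_sample(frame, n, rng):
    # Every element has the same chance; drawn without replacement.
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    interval = len(frame) // n        # sampling interval i
    start = rng.randrange(interval)   # random starting point
    return [frame[start + k * interval] for k in range(n)]


def stratified_sample(strata, n, rng):
    # Proportionate allocation, then a random draw inside every stratum.
    sizes = allocate({h: len(units) for h, units in strata.items()}, n)
    sample = []
    for h, units in strata.items():
        sample.extend(rng.sample(units, round(sizes[h])))
    return sample


def one_stage_cluster_sample(clusters, n_clusters, rng):
    # Randomly pick clusters, then keep ALL elements of each chosen cluster.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(42)               # fixed seed so the draw is repeatable
frame = list(range(1, 101))           # hypothetical sampling frame

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
strata = {"students": frame[:60], "employees": frame[60:]}
stratified = stratified_sample(strata, 10, rng)
clusters = {f"block_{b:02d}": frame[b * 10:(b + 1) * 10] for b in range(10)}
cluster = one_stage_cluster_sample(clusters, 2, rng)

# Allocation example with hypothetical stratum sizes and standard deviations.
sizes = {"small": 600, "medium": 300, "large": 100}
sds = {"small": 1.0, "medium": 2.0, "large": 6.0}
proportionate = allocate(sizes, 100)
disproportionate = allocate(sizes, 100, sds)
```

In the allocation example, the small but highly variable "large" stratum receives 10 of 100 units under proportionate allocation and about 33 under disproportionate allocation, which is the point of weighting by the standard deviation.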


 Differences Between Stratified and Cluster Sampling:

Factor | Stratified | Cluster (One-stage)
Objective | Increase precision | Decrease cost
Subpopulations | All strata are included | A sample of clusters is chosen
Within subpopulations | Each stratum should be homogeneous | Each cluster should be heterogeneous
Across subpopulations | Strata should be heterogeneous | Clusters should be homogeneous
Sampling frame | Needed for the entire population | Needed only for the selected clusters
Selection of elements | Elements selected from each stratum randomly | All elements from each selected cluster are included

 Strengths and Weaknesses of Basic Sampling Techniques

Category | Technique | Strengths | Weaknesses
Nonprobability Sampling | Convenience sampling | Least expensive, least time-consuming, most convenient | Selection bias, sample not representative, not recommended for descriptive or causal research
Nonprobability Sampling | Judgmental sampling | Low cost, convenient, not time-consuming | Does not allow generalization, subjective
Nonprobability Sampling | Quota sampling | Sample can be controlled for certain characteristics | Selection bias, no assurance of representativeness
Nonprobability Sampling | Snowball sampling | Can estimate rare characteristics | Time-consuming
Probability Sampling | Simple random sampling (SRS) | Easily understood, results projectable | Difficult to construct sampling frame, expensive, lower precision, no assurance of representativeness
Probability Sampling | Systematic sampling | Can increase representativeness, easier to implement than SRS, sampling frame not necessary | Can decrease representativeness if there are cyclical patterns
Probability Sampling | Stratified sampling | Includes all important subpopulations, precision | Difficult to select relevant stratification variables, not feasible to stratify on many variables, expensive
Probability Sampling | Cluster sampling | Easy to implement, cost-effective | Imprecise, difficult to compute and interpret results


2. SURVEY
 Survey method: A structured questionnaire given to a sample of a population and
designed to elicit specific information from respondents.
o The questioning is structured: "structured" refers to the degree of
standardization imposed on the data collection process.
 Structured data collection: Use of a formal questionnaire that presents questions
in a prearranged order.
 The structured-direct survey involves administering a questionnaire. In a typical
questionnaire, most questions are fixed-alternative questions.
o Fixed-alternative questions: Questions that require respondents to
choose from a set of predetermined answers.
 Survey Methods Classification:

o Survey questionnaires may be administered in four major modes:
 Telephone Interviewing
 Personal Interviewing
 Mail Interviewing
 Electronic Interviewing

 Telephone Interviewing: can be classified as traditional or
computer-assisted.
o Traditional telephone interviews involve phoning a sample of
respondents and asking them a series of questions.
o Computer-Assisted Telephone Interviewing (CATI): uses a
computerized questionnaire administered to respondents over the
telephone.
 Personal Interviewing: may be categorized as in-home, mall-intercept,
or computer-assisted.
o Personal In-Home Interviews: respondents are interviewed face-to-face
in their homes. The interviewer's task is to contact the respondents,
ask the questions, and record the responses.
o Mall-Intercept Personal Interviews: respondents are intercepted while
they are shopping in malls and brought to test facilities in the malls.
o Computer-Assisted Personal Interviewing (CAPI): the respondent sits in
front of a computer terminal and answers a questionnaire on the
computer screen by using the keyboard or a mouse.

 Mail Interviewing: can be conducted via ordinary mail or the mail panel.
o Mail Interviews: interview questionnaires are mailed to preselected
potential respondents.
o Mail Panels: consist of a large, nationally representative sample of
households that have agreed to participate in periodic mail
questionnaires and product tests.
 Electronic Interviewing: can be conducted by e-mail or administered on
the Internet or the Web.
o E-Mail Interviews: a list of e-mail addresses is obtained. The survey is
written within the body of the e-mail message. The e-mails are sent out
over the Internet.
o Internet Interviews: Internet or Web surveys use hypertext markup
language (HTML), the language of the Web, and are posted on a Web site.

 Improving the Response Rates


 Refusals:
o Result from the unwillingness or inability of people included in the sample
to participate.
o Result in lower response rates and increased potential for nonresponse
bias.

 How to lower refusal rates?


o Prior notification: sending a letter notifying respondents of the imminent
survey => reduces surprise and uncertainty and creates a more cooperative
atmosphere.
o Motivating the respondents: the use of sequential requests.
 Foot-in-the-door:
 Starts with a relatively small request.
 Followed by a larger request, the critical request
 Door-in-the-face: reverse strategy.
 The initial request is relatively large and a majority of
people refuse to comply.
 Followed by a smaller request, the critical request.
o Incentives: offering monetary as well as non-monetary incentives
(commonly premiums and rewards) to potential respondents.
 Prepaid Incentive: included with the survey or questionnaire =>
more effective in increasing response rates than promised incentive.
 Promised Incentive: sent to only those respondents who complete
the survey.
o Good questionnaire design and administration: can decrease the overall
refusal rate as well as refusals to specific questions.
 Trained interviewers are skilled in refusal conversion or
persuasion.
o Follow-up: contacting the non-respondents periodically after the initial
contact.
 Can also be done by telephone, e-mail, or personal contacts.
o Other facilitators: Personalization, or sending letters addressed to specific
individuals, is effective in increasing response rates.


3. FIELDWORK PROCESS

 Fieldwork involves the selection, training, and supervision of persons who
collect data. The validation of fieldwork and the evaluation of fieldworkers are
also parts of the process.

 Selection: The researcher should:


o Develop job specifications for the project, taking into account the mode of
data collection.
o Decide what characteristics the fieldworkers should have.
o Recruit appropriate individuals.

 Training: should cover


o Making the Initial Contact: interviewers should make opening remarks that will
convince potential respondents to participate. Interviewers should also be
instructed on handling objections and refusals.
o Asking the Questions:
 Guidelines for asking questions:

1. Be thoroughly familiar with the questionnaire.


2. Ask the questions in the order in which they appear in the
questionnaire.
3. Use the exact wording given in the questionnaire.
4. Read each question slowly.
5. Repeat questions that are not understood.
6. Ask every applicable question.
7. Follow instructions and skip patterns, probing carefully.

 Probing: intended to motivate respondents to enlarge on, clarify, or explain their
answers. Probing should not introduce any bias.
o Common probing:
 Repeating the question.
 Repeating the respondent’s reply: Respondents can be stimulated
to provide further comments.

 Using a pause or silent probe (the silence should not become
embarrassing).
 Boosting or reassuring the respondent (“There are no right
or wrong answers,” “Just whatever it means to you.”)
 Eliciting clarification.
 Using objective/neutral questions or comments.
 Recording the Answers: All interviewers should use the same
format and conventions to record the interviews and edit
completed interviews.
 The General Rule: check the box that reflects the
respondent’s answer (structured), record the responses
verbatim (unstructured).
 Terminating the Interview: The interview should not be closed
before all the information is obtained.
 The respondent should be left with a positive feeling about
the interview.
 Thank the respondent and express appreciation

Chapter 14 – ISB Academic Team

CHAPTER 14
DATA PREPARATION & RELIABILITY TEST

1. Data preparation (Excel)

 Delete blank answers.
 Check skip patterns (check whether respondents followed the skip pattern and
answered the right question).
o e.g. Have you ever drunk beer? (If yes/no, go to Q6/Q7, respectively.)
o If a respondent answers yes but goes to Q7 -> wrong (the respondent may have
answered at random without reading the skip pattern) -> delete the answer.
(There is no skip pattern in ISB questionnaires because it is hard for
beginners.)
 Little variance: the respondent gives the same answer to all of the questions
(strict research might also count respondents who use only 2 different answers).
o e.g. choosing all 1 or all 5 for all of the questions (Likert scale).
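The three cleaning rules can also be expressed as a small filter. The sketch below is a minimal illustration in Python rather than the Excel workflow itself; the column names (`drank_beer`, `Q6`, `Q7`, `L1` to `L3`) and the sample rows are hypothetical, and `None` stands for a blank answer.

```python
def clean_responses(rows, likert_items, min_distinct=2):
    """Keep only responses that pass the three data-preparation checks.

    min_distinct=2 drops respondents who gave one single value to every
    Likert question; min_distinct=3 is the stricter rule that also drops
    respondents who used only 2 different values.
    """
    kept = []
    for row in rows:
        answers = [row[q] for q in likert_items]

        # 1. Blank answers
        if any(a is None for a in answers):
            continue

        # 2. Skip pattern: "yes" must continue with Q6, "no" with Q7
        if row["drank_beer"] == "yes":
            followed = row["Q6"] is not None and row["Q7"] is None
        else:
            followed = row["Q7"] is not None and row["Q6"] is None
        if not followed:
            continue

        # 3. Little variance (same answer everywhere)
        if len(set(answers)) < min_distinct:
            continue

        kept.append(row)
    return kept


# Hypothetical responses
rows = [
    {"id": 1, "drank_beer": "yes", "Q6": 3, "Q7": None, "L1": 4, "L2": 5, "L3": 3},
    {"id": 2, "drank_beer": "yes", "Q6": None, "Q7": 2, "L1": 4, "L2": 2, "L3": 3},
    {"id": 3, "drank_beer": "no", "Q6": None, "Q7": 1, "L1": 5, "L2": 5, "L3": 5},
    {"id": 4, "drank_beer": "no", "Q6": None, "Q7": 4, "L1": None, "L2": 2, "L3": 3},
    {"id": 5, "drank_beer": "no", "Q6": None, "Q7": 4, "L1": 1, "L2": 2, "L3": 1},
]
items = ["L1", "L2", "L3"]

kept_ids = [r["id"] for r in clean_responses(rows, items)]
strict_ids = [r["id"] for r in clean_responses(rows, items, min_distinct=3)]
```

Here respondent 2 breaks the skip pattern, respondent 3 gives the same answer everywhere, and respondent 4 has a blank; respondent 5 survives the default rule but not the stricter one.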


2. Reliability Test
 Definition:
o Reliability refers to the consistency of a test, often measured through the
internal consistency of a scale.
 Internal consistency means to what extent various parts of the
scale truly capture the desired characteristics in a similar direction.
o Only constructs have a reliability test. If we don’t have constructs, we don’t
have a reliability test.
o Constructs: Variables that are measured by more than one question.
 Example: Lecturer, Subject Complexity, Subjective Knowledge are
constructs (each measured by more than one question in the survey).
 Purpose: make sure that each group of questions measures the same construct =
how “focused” your questions are.
 Technique: run a Reliability Test for each construct using Cronbach’s Alpha.
o Cronbach’s Alpha should be between 0 and 1.
 Cronbach’s Alpha = 0 -> the questions are not focused at all.
 Cronbach’s Alpha = 1 -> the questions are totally focused.
 The best Cronbach’s Alpha ranges from 0.5 to 0.95 (too close to 1 means
the questions are “too close” to each other).
o If Cronbach’s Alpha is < 0.5 -> the questions are not focused enough.
o If Cronbach’s Alpha is > 0.95 -> it is like asking 1 question 3 times; it
is not asking 3 questions about the same thing.
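For readers who want to see what the statistic is made of, Cronbach's Alpha for k questions is k/(k-1) × (1 − sum of the item variances / variance of the total score). The sketch below is a minimal Python illustration of that formula; the answers are hypothetical, and the function name and data layout are assumptions of this example, not part of SPSS.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per question, respondents in the same order."""
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least 2 questions")
    # Total score of each respondent across the k questions
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers of 5 respondents to 3 questions of one construct
q1 = [4, 5, 3, 4, 2]
q2 = [4, 4, 3, 5, 2]
q3 = [5, 5, 2, 4, 3]

alpha = cronbach_alpha([q1, q2, q3])
in_range = 0.5 <= alpha <= 0.95          # the working range used in this guide

# Asking literally the same question 3 times gives an alpha of 1
same = [1, 2, 3, 4, 5]
alpha_identical = cronbach_alpha([same, same, same])
```

With these made-up answers the alpha is about 0.89, inside the 0.5 to 0.95 range, while three identical questions give exactly the "too close" case of 1.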


 SPSS Running and Results:


o How to run SPSS:
 Analyze -> Scale -> Reliability Analysis
 Select all of the questions used to measure your construct.
 Example: Questions PBQ1 -> PBQ4 were designed to
measure the Perceived Brand Quality -> select PBQ1 ->
PBQ4 when checking for Perceived Brand Quality.
 For Statistics -> Check “Scale if item deleted”
 For Model -> Select Alpha
 Click OK
 Note: you must run the Reliability Test for all of your constructs.


o Results and Interpretation:


 For Reliability Statistics table:
 The Cronbach’s Alpha of the construct should fall between
0.5 and 0.95 -> the Alpha is “good enough”.
 For the “Cronbach’s Alpha if Item Deleted” column:
 Remove the item/question whose deletion will improve the overall
Cronbach’s Alpha (which must still fall between 0.5 and 0.95) -> a better
set of questions for the construct.
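The “if item deleted” logic can be reproduced by recomputing the alpha once per question, each time leaving that question out. The following Python sketch assumes hypothetical answers for four questions named PBQ1 to PBQ4; it illustrates the idea and is not SPSS's own routine.

```python
from statistics import variance


def cronbach_alpha(items):
    # k/(k-1) * (1 - sum of item variances / variance of total scores)
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(c) for c in items) / variance(totals))


def alpha_if_item_deleted(items_by_name):
    """Alpha of the construct recomputed with each question left out in turn."""
    names = list(items_by_name)
    if len(names) < 3:
        # With 2 questions, deleting one leaves a single question: no alpha.
        raise ValueError("need at least 3 questions to delete one")
    return {
        name: cronbach_alpha([items_by_name[n] for n in names if n != name])
        for name in names
    }


# Hypothetical answers of 5 respondents; PBQ4 does not move with the others
answers = {
    "PBQ1": [4, 5, 3, 4, 2],
    "PBQ2": [4, 4, 3, 5, 2],
    "PBQ3": [5, 5, 2, 4, 3],
    "PBQ4": [3, 4, 3, 2, 4],
}

overall = cronbach_alpha(list(answers.values()))
if_deleted = alpha_if_item_deleted(answers)
best_to_delete = max(if_deleted, key=if_deleted.get)
```

In this made-up data the overall alpha is about 0.65, and deleting PBQ4 raises it to about 0.89, so PBQ4 is the candidate for removal; deleting any other question lowers the alpha.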


o Example:
 The Cronbach’s Alpha of the construct is .776, which is between 0.5
and 0.95, so we can continue.
 Running for Perceived Brand Quality: Cronbach’s Alpha if PBQ4
is deleted is bigger than the Cronbach’s Alpha in Reliability Statistics
(.813 > .776), so we delete PBQ4 from the data set. Deleting
any other question would result in a lower Cronbach’s Alpha.
 Note: DO NOT delete any question from a construct that has only 2
questions (if you do so, only 1 question remains for the construct =>
reliability cannot be tested).
 Do the same for the other constructs.


 Running for Brand Image: deleting any question would lower the
Cronbach’s Alpha of Reliability Statistics
(.705>.669>.637>.634>.631), so we are keeping the same questions.
 After deleting a question, run the test again without it to see whether
the Cronbach’s Alpha can be improved further.

 Running for Perceived Price: According to the table, if PP4 is
eliminated, the Cronbach's Alpha will be higher than the Cronbach's
Alpha in Reliability Statistics (.573>.489), thus we removed it. If
any other question is removed, the reliability will suffer, resulting in
a lower Cronbach's Alpha.

 Running for Subjective Influence: We are retaining the same
questions for Subjective Influence since eliminating any will reduce
the Cronbach's Alpha of Reliability Statistics
(.862>.863>.657>.830).

 Running for Subjective Knowledge of Product: Because deleting
any will reduce Cronbach's Alpha of Reliability Statistics
(.802>.793>.774>.771>.737), we're keeping the same questions for
Subjective Knowledge of Product.

Chapter 19 – ISB Academic Team

CHAPTER 19: FACTOR ANALYSIS

1. Definition:
 Factor analysis (or exploratory factor analysis - EFA) is a class of procedures
primarily used for data reduction and summarization. Factor analysis is an
interdependence technique in that an entire set of interdependent relationships
is examined.
o In marketing research, there may be a large number of variables, most of
which are correlated and which must be reduced to a manageable level.
o Relationships among sets of many interrelated variables are examined and
represented in terms of a few underlying factors.
 Purpose: Factor analysis is used in the following circumstances.
o To identify underlying dimensions, or factors, that explain the correlations
among a set of variables.
o To reduce a large number of correlated variables into a smaller and
uncorrelated set of variables for subsequent multivariate analysis
o Example: How do you reduce your data from 30+ variables into a smaller
and more useful set of variables?
 Applications in Marketing:
o Market Segmentation: identifying the underlying variables on which to
group the customers.
o Product Research: to determine the brand attributes that influence
consumer choice.
o Advertising Studies: used to understand the media consumption habits of
the target market.
o Pricing Studies: used to identify the characteristics of price-sensitive
consumers.

2. SPSS and Interpretation:


 Technique: Run factor analysis for each level (excluding the filter questions).
o Independent Variables (IV): Run Level 2 -> Level 1
o Dependent Variables (DV): Satisfaction, Purchase Intention, ...
 SPSS Running and Interpretation:
o For Independent Variables (IVs):
o Level 2 Factor Analysis:
 How to run SPSS:
 Step 1: Analyze -> Dimension Reduction -> Factor
 Step 2: Select all of your variables (minus the filter
questions and dependent variables).

 Step 3: For “Descriptives” -> Click “KMO and Bartlett’s test of
sphericity”
 Step 4: For “Extraction”, make sure the method is “Principal
Component”
 Step 5: For “Rotation”, check “Varimax”
 Step 6: Click OK

 Results & Interpretation:


 For KMO and Sig.
o KMO should be from 0.5 - 0.95 (a value below 0.5 suggests
that factor analysis may not be appropriate), but should
not be too close to 1.
o The Sig. number (Bartlett’s test of sphericity) should be
less than 5% (0.05). The smaller the Sig. is (near .000),
the better the results we get.
o Example: The KMO here is 0.77 (0.5 < 0.77 < 0.95),
and the Sig. is 0.000 < 0.05, which are good signs to
continue.

 Total Variance Explained table
o Initial Eigenvalues: the amount of variance that each
representative accounts for, measured in units of one
question's variance (roughly, the number of questions that
the representative represents).
o Each representative can be called a component.
o We only care about the factors that can represent
more than 1 question (eigenvalue greater than 1).
o Cumulative %: the percentage of the total variance that
can be explained by the components.
o Example:
 Component 1 represents 4.988 questions.
 To represent all of the variables, SPSS
recommends using a maximum of 5
factors/constructs. At least one question can
be explained by each of components 1 to 5, and
these 5 constructs would cover 62.762
percent of the model.
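The columns of the Total Variance Explained table follow directly from the eigenvalues: because the eigenvalues of a correlation matrix add up to the number of questions, each component's % of variance is its eigenvalue divided by that total. The Python sketch below is a minimal illustration; the eigenvalues are hypothetical (they are not the ones in the SPSS output above), and the function name is an assumption of this example.

```python
def variance_explained(eigenvalues):
    """Rebuild the % of Variance and Cumulative % columns from eigenvalues.

    Returns one tuple per component:
    (component number, eigenvalue, % of variance, cumulative %, retained?)
    A component is retained when its eigenvalue is greater than 1.
    """
    total = sum(eigenvalues)      # equals the number of questions
    rows = []
    cumulative = 0.0
    for number, eigenvalue in enumerate(eigenvalues, start=1):
        percent = eigenvalue / total * 100
        cumulative += percent
        rows.append((number, eigenvalue, percent, cumulative, eigenvalue > 1))
    return rows


# Hypothetical eigenvalues for 8 questions (they sum to 8)
eigenvalues = [3.2, 1.6, 1.1, 0.8, 0.5, 0.4, 0.25, 0.15]

table = variance_explained(eigenvalues)
retained = [row[0] for row in table if row[4]]
cumulative_retained = table[len(retained) - 1][3]
```

With these made-up eigenvalues, 3 components have an eigenvalue above 1 and together they cover about 73.75 percent of the variance.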


o Level 1 Factor Analysis:


 How to run SPSS:
 Step 1: Analyse -> Dimension Reduction -> Factor
 Step 2: Select all of your independent variables (minus the
filter questions)
 Step 3: For “Descriptives” -> Click “KMO and Bartlett’s test of
sphericity”
 Step 4: For “Extraction”, make sure the method is “Principal
Component”
 Step 5: For “Rotation”, check “Varimax”
 Step 6: For “Options”, check “Suppress small coefficients” ->
Absolute value below 0.50
 Step 7: Click Ok


 Results & Interpretation:


 Rotated Component Matrix table:
o The number in each component column (the factor loading)
shows how “close” the component is to the question.
o Delete the questions that are not represented by
any factor.
o To avoid multicollinearity, delete the questions that
are represented by two or more components.
o We also skip any factor that represents only 1 question
for the sake of reliability.

o Example: Factor number 4 is close to questions SI1
-> SI5 -> Factor 4 represents these 5 questions. We
skip the factor that contains only PP5 because it
represents only 1 question.
 Naming the new factors:
o The factor takes the name of the construct that it
represents.
o If the factor represents 2 constructs, choose the
name of the construct that has more questions.
o If the questions in 1 construct are represented by
2 factors, choose the factor with the higher number (the
one closest to the questions). This is called a
“surrogate factor”.
o Example: Factor 1 represents questions PBQ1,
PBQ2, PBQ3. We name this factor Perceived Brand
Quality (PBQ).
o The value of each of these 5 factors will be the average
of the values of the questions it represents.
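Computing a factor's value as the average of its questions is simple arithmetic. The Python sketch below shows it for one respondent; the answers and the question-to-factor mapping are hypothetical, chosen to match the PBQ and SI examples named above.

```python
from statistics import mean


def factor_scores(respondent, factor_items):
    """Value of each factor = average of the questions it represents."""
    return {
        factor: mean(respondent[question] for question in questions)
        for factor, questions in factor_items.items()
    }


# Which questions each factor represents (after dropping unusable questions)
factor_items = {
    "PBQ": ["PBQ1", "PBQ2", "PBQ3"],
    "SI": ["SI1", "SI2", "SI3", "SI4", "SI5"],
}

# Hypothetical answers of one respondent
respondent = {
    "PBQ1": 4, "PBQ2": 5, "PBQ3": 3,
    "SI1": 3, "SI2": 4, "SI3": 4, "SI4": 5, "SI5": 5,
}

scores = factor_scores(respondent, factor_items)
```

For this respondent the PBQ factor is 4 and the SI factor is 4.2; repeating the calculation for every respondent gives the new factor columns used in later analyses.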

o For Dependent Variable (DV):


 How to run SPSS:
 Step 1: Analyse -> Dimension Reduction -> Factor
 Step 2: Select all of your dependent variables
 Step 3: For “Descriptives” -> Click “KMO and Bartlett’s test of
sphericity”
 Step 4: For “Extraction”, make sure the method is “Principal
Component”
 Step 5: For “Rotation”, check “Varimax”
 Step 6: For “Options”, check “Suppress small coefficients” ->
Absolute value below 0.50
 Step 7: Click Ok


 Results and Interpretation:


 Total Variances Explained table
o Initial Eigenvalues: the number of questions that
the representative represents.
o Example: SPSS recommends using 1 factor to
represent the dependent variables, and it would
cover 75.513 percent of the model.


 Naming the factor:


o The factor takes the name of the construct that it
represents.
o Example: The factor (PI1, PI2, PI3, PI4) will be
named Purchase Intention (PI).

Chapter 15 – ISB Academic Team

CHAPTER 15
FREQUENCY DISTRIBUTION, CROSS-TABULATION & HYPOTHESIS
TESTING

1. Frequency distribution:

 Objective: To obtain a count of the number of responses associated with different
values of one variable and to express these counts in percentage terms.
o In a frequency distribution, one variable is considered at a time.
 A frequency distribution for a variable produces a table of frequency counts,
percentages, and cumulative percentages for all the values associated with that
variable.
 The most commonly used statistics associated with frequencies are:
o Measures of location - describes a location within a data set (mean, mode,
and median).
o Measures of variability - indicates the distribution’s dispersion (range,
interquartile range, standard deviation, and coefficient of variation).
o Measures of shape - help in understanding the nature of the distribution
(skewness and kurtosis).

Category | Statistic | Definition
Measures of location | Mean | The average; that value obtained by summing all elements in a set and dividing by the number of elements.
Measures of location | Mode | A measure of central tendency given as the value that occurs the most in a sample distribution.
Measures of location | Median | A measure of central tendency given as the value above which half of the values fall and below which half of the values fall.
Measures of variability | Range | The difference between the largest and smallest values of a distribution.
Measures of variability | Interquartile range | The range of a distribution encompassing the middle 50 percent of the observations.
Measures of variability | Variance | The mean squared deviation of all the values from the mean.
Measures of variability | Standard deviation | The square root of the variance.
Measures of variability | Coefficient of variation | A useful expression in sampling theory for the standard deviation as a percentage of the mean.
Measures of shape | Skewness | A characteristic of a distribution that assesses its symmetry about the mean.
Measures of shape | Kurtosis | A measure of the relative peakedness or flatness of the curve defined by the frequency distribution.
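The statistics in the table can be computed with Python's standard library, as in the minimal sketch below. The answers are hypothetical Likert responses; the variance uses the sample (n − 1) form, and the skewness and kurtosis use simple moment formulas, which is an assumption of this example and may differ slightly from the adjusted values that SPSS reports.

```python
from collections import Counter
from statistics import mean, median, mode, quantiles, stdev, variance


def frequency_table(values):
    """Rows of (value, count, percent, cumulative percent)."""
    counts = Counter(values)
    n = len(values)
    rows, cumulative = [], 0.0
    for value in sorted(counts):
        percent = counts[value] / n * 100
        cumulative += percent
        rows.append((value, counts[value], percent, cumulative))
    return rows


def shape(values):
    """Moment-based skewness and excess kurtosis (one common convention)."""
    n, m = len(values), mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


# Hypothetical answers to one 5-point Likert question
answers = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

table = frequency_table(answers)

# Measures of location
location = (mean(answers), mode(answers), median(answers))

# Measures of variability
value_range = max(answers) - min(answers)
q1, _, q3 = quantiles(answers, n=4)      # quartile cut points
interquartile_range = q3 - q1
sample_variance = variance(answers)      # divides by n - 1
coefficient_of_variation = stdev(answers) / mean(answers)

# Measures of shape
skewness, kurtosis = shape(answers)
```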


2. Cross-Tabulations:
 Definition:
o A statistical technique that describes two or more variables
simultaneously and results in tables that reflect the joint distribution of
two or more variables that have a limited number of categories or distinct
values.
o Purpose: To describe the relationship between two categorical variables.
o Contingency table: a cross-tabulation table. It contains a cell for every
combination of categories of the two variables.
o Bivariate Cross-Tabulation: Cross-tabulation with two variables.
o Example: Do people of different genders tend to have different jobs?

 SPSS Running:
o Step 1: Analyze -> Descriptive Statistics -> Crosstabs
o Step 2: Put your variables into Row(s) and Column(s), e.g. Age into Row,
Occupation into Column
o Step 3: For Statistics, check Chi-square
o Step 4: Click OK

 SPSS Results & Interpretation:


o Cross Tabulation Table: There are 8 participants who are under 18 and
none of them are employed, while there are 124 students and 2 employees
in the 18-25 age group.
 This is not an important table; it only shows the sample statistics,
not the relationship.
o Chi-square Tests: We look at the Sig. (2-sided) of the Pearson Chi-square
 If the Sig. is near 0.00 (< 0.05) -> There is a relationship between the 2
variables -> The 18-25 age group is more independent than the
under 18 group.
 If the Sig. is higher than 0.05 -> there is no statistically significant
relationship between the 2 variables.
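The Pearson Chi-square value behind the Sig. compares each observed cell count with the count expected if the two variables were unrelated (row total × column total / sample size). The Python sketch below is a minimal illustration on a hypothetical 2 × 2 table; the closed-form p-value shown is valid only for 1 degree of freedom, which is why it is kept in a separate, clearly named function.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom of a crosstab."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


def p_value_df1(statistic):
    """Sig. of a chi-square statistic; this formula holds ONLY for df = 1."""
    return math.erfc(math.sqrt(statistic / 2))


# Hypothetical crosstab: rows = gender, columns = occupation type
observed = [
    [30, 10],
    [15, 25],
]

statistic, df = chi_square(observed)
sig = p_value_df1(statistic)          # valid here because df == 1
related = sig < 0.05
```

For this made-up table the statistic is about 11.43 with 1 degree of freedom and the Sig. is well below 0.05, so the two variables would be judged to be related.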

3. Sample T-test:
 A univariate hypothesis test using the t distribution, which is used when the
standard deviation is unknown and the sample size is small.
 One sample: A hypothesis test using the t-test or the z-test to test a
hypothesis about a single variable against a known or given standard.
 T-test: testing whether the population mean conforms to a given hypothesis (𝐻0).
 Independent Samples T-test (for a metric test variable):
o Independent samples: The measurement of one sample has no effect on the
values of the second sample.
o Purpose: to check whether the means of two independent groups are
statistically different.
o By contrast, in a one-sample t-test, the variable we test is compared with
a known average value based on the hypothesis.
o Example: Do female participants have a higher purchase intention than male
participants?
o SPSS Running:
 Step 1: Analyze -> Compare Means -> Independent Samples T-test
 Step 2: For Test Variables, pick the factor that we want to test
(numeric/metric).
 Step 3: For Grouping Variable, pick the factor of the group that we
are comparing (nominal/non-metric).
 Step 4: In Define Groups, define group 1 and group 2 (1 for
male and 0 for female).
 Step 5: Click OK


o SPSS Results:
 Test Variables: Purchase Intention
 Grouping Variables: Gender (1 for male, 0 for female)

o Interpretation: Read the 2nd table -> 1st table
 Independent Samples Test Table:
 Levene’s Test for Equality of Variances: compare the Sig. to 0.05 (5%).
=> If Sig. is high (> 5%) -> the 2 groups have the same variances,
so we read the “Equal variances assumed” row. Otherwise, we read
the “Equal variances not assumed” row.
 For Sig. (2-tailed):
o If Sig. (2-tailed) is high (> 5%) -> the two groups
are the same (skip the 1st table).
o If Sig. (2-tailed) is lower than 5% -> there is a
difference between the 2 groups -> go back to the 1st
table.
 Group Statistics Table: compare the means to determine the
higher one
 N is the sample size -> There are 92 male participants and
54 female participants.
 The mean of male participants is higher than that of female
participants (3.4457 > 3.3333) -> the purchase intention of
males is higher than that of females.
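The t value that SPSS reports in the Independent Samples Test table can be reproduced by hand. The sketch below is a minimal Python illustration with hypothetical purchase-intention scores; it returns the t statistic and degrees of freedom for both rows of the table (equal variances assumed and not assumed) and leaves the Sig. to a t table or to SPSS.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group1, group2, equal_variances=True):
    """t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = variance(group1), variance(group2)
    difference = mean(group1) - mean(group2)
    if equal_variances:
        # "Equal variances assumed": pooled variance
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # "Equal variances not assumed": Welch's approximation
        standard_error = sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return difference / standard_error, df


# Hypothetical purchase-intention scores
male = [4, 3, 5, 4, 4]
female = [3, 2, 4, 3, 3]

t_pooled, df_pooled = independent_t(male, female)
t_welch, df_welch = independent_t(male, female, equal_variances=False)
```

In this made-up example t is about 2.24 with 8 degrees of freedom. The two-tailed 5% critical value for 8 degrees of freedom is 2.306, so the difference in means (4 versus 3) would not be significant at the 5% level.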

 Paired Samples T-test / Dependent T-test (for two metric variables measured on the same respondents):


o Paired samples t-test: A test for differences in the means of paired
samples.
o Purpose: to determine whether there is statistical evidence that the mean
difference between paired observations is significantly different from zero.
o Example: Is purchase intention of the participants more affected by
perceived price or subjective knowledge of product?
o SPSS Running:
 Step 1: Analyze -> Compare Means -> Paired Samples T-test

 Step 2: For Paired Variables, specify the paired variables
 Step 3: Click OK

o SPSS Results:
 Paired Variables: we pick Perceived Price for Variable 1 and
Subjective Knowledge of Product for Variable 2.

o Interpretation: Read backward from the last table


 Paired Samples Test: compare the Sig. (2-tailed) to 0.05 (5%).
 If Sig. (2-tailed) > 5% -> there is no difference between the 2
samples; skip the 1st table.
 If Sig. (2-tailed) is close to 0.00 (< 5%) -> the samples
are different -> go back to the 1st table to determine the
higher mean.
 Paired Samples Statistics: Compare the means of PerPrice
and SubKnowProduct.
 Mean of PerPrice is 4.2534 > 3.6699 mean of
SubKnowProduct -> Perceived Price affects the customers
more.
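A paired samples t-test works on the difference between the two answers of each respondent: t is the mean difference divided by its standard error. The Python sketch below is a minimal illustration; the two score lists are hypothetical and only borrow the variable names PerPrice and SubKnowProduct from the example above.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """t statistic and degrees of freedom for paired observations."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical scores of the same 5 respondents on two factors
per_price = [5, 4, 4, 5, 3]
sub_know_product = [4, 4, 3, 3, 3]

t_value, df = paired_t(per_price, sub_know_product)
```

Here t is about 2.14 with 4 degrees of freedom, below the two-tailed 5% critical value of 2.776, so with these made-up scores the two means would not be judged different.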

Chapter 16 – ISB Academic Team

CHAPTER 16:
ANALYSIS OF VARIANCE & COVARIANCE

1. Analysis of Variance (ANOVA)


 Definition
o A statistical technique for examining the differences among means for two
or more populations.
o Factors:
 Categorical independent variables.
 The independent variables must be all categorical (nonmetric) to
use ANOVA.
 Example: Age, gender, income level, occupation…
o Treatment: In ANOVA, treatment is a particular combination of factor
levels or categories.
o One-way ANOVA involves only one categorical variable or a single factor.
⇒ Purpose: Test the relationship between a non-metric variable and
metric variable by comparing the means of different groups within the non-
metric variable (≥ 3 levels)
o Example: Does age affect the purchase intention of participants?
o Two-way ANOVA involves two factors.
⇒ Purpose: Compare the impact of two non-metric variables on a metric
variable.

 SPSS and Interpretation


o One-way ANOVA
 How to run SPSS:
 Step 1: Analyse → Compare Means → One-Way ANOVA
 Step 2: Put a metric variable into dependent list and a
categorical into factor
 Step 3: Click Post Hoc → Tick LSD → Continue
 Step 4: Click OK


 Results & Interpretation:


 Look at the ANOVA table: if the significance level (Sig.
Between Groups) is lower than 0.05 (<5%)
⇒ There is a relationship between those 2 variables.
 Take the Sum of Squares Between Groups → divide it by
the Total Sum of Squares → multiply by 100% →
the percentage of the metric variable that is explained by
the non-metric variable:
$\dfrac{SS_{\text{between groups}}}{SS_{\text{total}}} \times 100\%$
 Post Hoc Test: compare each level (I) under the non-metric
variable with the others (J):
o If the significance level between I and J < 0.05 ⇒
There is a difference between the impacts of I and J
on the metric variable.
→ Mean difference (I-J) > 0: Impact of I > impact of J
→ Mean difference (I-J) < 0: Impact of I < impact of J
o Otherwise, the impacts of I and J on the metric
variable are the same.
o Example: Testing whether different family income
levels lead to different satisfaction levels of
university students.

 Significance level ≈ 0 ⇒ Family income is related to
students’ satisfaction.
 SS between groups / Total SS = 25.308/714.425 = 0.0354 =>
About 3.5% of students’ satisfaction is explained by family
income.

 The significance between the lowest income level (1) and
the highest income level (5) is 0.383 > 0.05 ⇒ Very poor
people and very rich people have the same satisfaction.
 The significance between the middle-income level (3) and
the low-income level (2) ≈ 0 ⇒ Middle-income people and
poor people have different satisfaction. Specifically, middle-
income people are more satisfied than poor people (mean
difference > 0).
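The numbers in the one-way ANOVA table come from splitting the total variation into a between-groups part and a within-groups part. The Python sketch below is a minimal illustration with hypothetical satisfaction scores for three income groups; the last line simply redoes the division from the example above using the two sums of squares quoted there.

```python
from statistics import mean


def one_way_anova(groups):
    """Sums of squares, F statistic and explained share for one factor."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    ss_within = sum(
        (v - mean(values)) ** 2 for values in groups.values() for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "F": (ss_between / df_between) / (ss_within / df_within),
        "explained_share": ss_between / (ss_between + ss_within),
    }


# Hypothetical satisfaction scores by family income level
groups = {
    "low": [2, 3, 4],
    "middle": [4, 5, 6],
    "high": [3, 4, 5],
}
result = one_way_anova(groups)

# Share explained in the family income example (values taken from the text)
share_from_example = 25.308 / 714.425
```

With the made-up groups, half of the variation is between groups (F = 3), whereas in the family income example the share is only about 3.5%.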

o Two-way ANOVA:
 How to run SPSS:
 Step 1: Analyze → General Linear Model → Univariate
 Step 2: Input the dependent variable and fixed factors (non-
metric)
 Step 3: Click OK


 Results & Interpretation:


 Ignore the “Between-Subjects Factors” table
 In the “Tests of Between-Subjects Effects” table, look at the
interacting power (interaction effect) first (X1 * X2: When X1
changes, the relationship between X2 and the dependent variable
changes too)
 Significance < 0.05: Not good because there is an
interacting power ⇒ The relationship of each non-metric
variable with the dependent variable cannot be measured
individually

 Otherwise (Significance ≥ 0.05), look at the significance of each
non-metric variable
o Significance < 0.05: The non-metric variable does
affect the dependent variable.
o The non-metric variable with a higher Type III Sum
of Squares affects the dependent variable more.

 Example: Compare the effect of Gender and Income on


Student’s Satisfaction

 Read the Gender*Income first (interacting power between


gender and income) → affect the DV (purchase intention)
 The interacting power should be small (smaller is better) →
can analyse 2 factors separately (they do not “mix up”)

73
Chapter 16 – ISB Academic Team

 Look at the Sig. of Gender*Income and compare it to 5% →


Significance of Gender*Income = 0.487 > 0.05 ⇒ The
interacting power does not matter.
 If Sig. is higher than 5% (.285 > .05) → no interacting
power → we can analyse them individually
 Significance of Gender = 0.598 > 0.05 ⇒ There is no
relationship between Gender and satisfaction

 Significance of Income = 0.001 < 0.05 ⇒ Income does affect


satisfaction
 In case both gender and income affect satisfaction, we
compare the sum of squares of them. Which one is higher,
that one affects the DV more.
 If Sig. of Gender*Income is lower than 5% → they mix
together too much. We cannot say which one affects
satisfaction more.
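The reading order above (interaction first, then each factor, then the Type III Sum of Squares) can be sketched in Python. This is only an illustration of the decision rule, not SPSS output: the function name `read_two_way_anova` and the Sum of Squares values are hypothetical, while the Sig. values are the ones from the example.

```python
ALPHA = 0.05  # 5% significance level used throughout this chapter


def read_two_way_anova(sig_interaction, effects):
    """effects maps a factor name to (Sig., Type III Sum of Squares)."""
    # Step 1: the interaction row (e.g. Gender*Income) is read first.
    if sig_interaction < ALPHA:
        return "Interaction is significant: do not rank the factors individually."
    # Step 2: keep only the factors whose own Sig. is below 5%.
    significant = {name: ss for name, (sig, ss) in effects.items() if sig < ALPHA}
    if not significant:
        return "No factor affects the dependent variable."
    # Step 3: among significant factors, the larger Type III SS matters more.
    strongest = max(significant, key=significant.get)
    return f"{strongest} affects the dependent variable the most."


# Sig. values from the example; the Sum of Squares values are hypothetical.
result = read_two_way_anova(0.487, {"Gender": (0.598, 0.21), "Income": (0.001, 9.84)})
print(result)
```

With the example's values, only Income passes the 5% test, so the comparison of Sums of Squares is not needed here.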

2. Analysis of Covariance (ANCOVA):


 Definition
o An advanced analysis of variance procedure in which the effects of one or
more metric-scaled extraneous variables are removed from the
dependent variable before conducting the ANOVA.
⇒ ANOVA with control variables (metric) that are outside the model but
may affect the model.
o Control a variable: Put the variable into the test → The interacting power
of that variable is taken out.

 SPSS and Interpretation


o How to run SPSS:
 Step 1: Analyse → General Linear Model → Univariate
 Step 2: Input the dependent variable and fixed factors (non-
metric)
 Step 3: Input control variables (metric independent variables) into
covariate
 Step 4: Click OK


 Results & Interpretation:


o In the “Tests of Between-Subjects Effects” table, look at the control
variable first.
 Significance < 0.05: The control variable does affect the dependent
variable.
 After that, we interpret the same as a Two-way ANOVA.
 However, the results can be different from One-way ANOVA and
Two-way ANOVA because of the power of the control variable.
o Example:
 Compare the effect of Gender and Income on students’ satisfaction
under the impact of perception towards lecturers.


 Significance of Lecturers ≈ 0 ⇒ Lecturers do affect satisfaction.


 Both significances of Gender and Income are higher than 0.05.
⇒ Gender and Income have no effect on students’ satisfaction.

Chapter 17 – ISB Academic Team

CHAPTER 17
CORRELATION AND REGRESSION

1. Correlation Analysis
 Definition:
o The product moment correlation coefficient, r, measures the linear
association between two metric (interval or ratio scaled) variables.
 Its square, r², measures the proportion of variation in one variable
explained by the other.
o The partial correlation coefficient measures the association between two
variables after controlling, or adjusting for, the effects of one or more
additional variables.
o The order of a partial correlation indicates how many variables are being
adjusted or controlled. Partial correlations can be very helpful for
detecting spurious relationships.
o Nonmetric correlation: a correlation measure for two nonmetric variables
that relies on rankings to compute the correlation.


 SPSS Technique:
o Correlation Test: Test the relationships between 2 metric variables.
o Correlation (r): the degree to which a pair of variables is linearly related.
 Range: -1 ≤ r ≤ 1
 |r| close to 1 = a strong relationship between the two variables
 |r| close to 0 = the variables are weakly related
 Special Values
 r = 0 => no linear relationship between the variables
 r = 1 => perfect positive correlation (2 variables move in the same
direction)
 r = -1 => perfect negative correlation (2 variables move in the
opposite direction)
o How to run SPSS:
 Step 1: Select Analyze → Correlate → Bivariate
 Step 2: Put all of the metric variables in (both DVs and IVs)
 Step 3: Make sure that you check Pearson (in Correlation
Coefficients), Two-tailed and Flag significant correlation
 Step 4: Click Ok
o SPSS Results


o Interpretation:
 The diagonal of the table is 1 ← Correlation between 1 variable
and itself is always 1 (PerBrandQuality - PerBrandQuality).
 The diagonal is also a “mirror” → the above and below triangle is
symmetric.
 The Sig. values here are not interpreted (whether < 5% or not).
 Compare the correlation of all IVs to each other
 Look for correlations that are higher than 0.7 → IVs are too
close to each other (multicollinearity). It’s not the case here, but if you
have it, go back to Factor Analysis.
 Compare the correlation of all IVs to DV:
 PerBrandQuality has the best correlation (.625 is the
highest) → However, we cannot conclude anything since the
correlation matrix does not account for the interacting power
between IVs (“fake relationships”).
 Things are fine even if 1 IV is too close to the DV.
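To make the correlation rules concrete, here is a minimal Python sketch of the product moment correlation and of the 0.7 screen between IVs. The scores are hypothetical and the helper names `pearson_r` and `too_close` are illustrative only; SPSS produces the same coefficient in its Correlations table.

```python
from math import sqrt


def pearson_r(x, y):
    """Product moment correlation between two metric variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def too_close(ivs, threshold=0.7):
    """Return the IV pairs whose |r| exceeds the multicollinearity threshold."""
    names = list(ivs)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if abs(pearson_r(ivs[a], ivs[b])) > threshold]


# Hypothetical scores for three IVs (five respondents).
ivs = {
    "BrandImage": [1, 2, 3, 4, 5],
    "PerBrandQuality": [2, 4, 6, 8, 10],
    "SocInfluence": [3, 1, 4, 1, 5],
}
r = pearson_r(ivs["BrandImage"], ivs["SocInfluence"])
r_squared = r ** 2            # share of variation in one variable explained by the other
flagged = too_close(ivs)      # pairs that would send you back to Factor Analysis
print(round(r, 3), round(r_squared, 3), flagged)
```

In this made-up data BrandImage and PerBrandQuality move together perfectly, so that pair is the only one flagged.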

2. Simple Regression
 Definition:
o Regression analysis is a powerful and flexible procedure for analyzing
associative relationships between a metric dependent variable and one or
more independent variables.
o Purposes:
 Determine how much of the variation in the dependent variable can
be explained by the independent variables: strength of the
relationship.
 Determine the structure or form of the relationship: the
mathematical equation relating the independent and dependent
variables.
 Predict the values of the dependent variable.
 Control for other independent variables when evaluating the
contributions of a specific variable or set of variables.
o Bivariate regression derives a mathematical equation between a single
metric criterion variable and a single metric predictor variable. The
equation is derived in the form of a straight line by using the least-
squares procedure.
o When the regression is run on standardized data, the intercept assumes a
value of 0, and the regression coefficients are called beta weights.

o The strength of association is measured by the coefficient of determination,
r², which is obtained by computing a ratio of SSreg to SSy.
o The standard error of estimate is used to assess the accuracy of
prediction and may be interpreted as a kind of average error made in
predicting Y from the regression equation.
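The definitions above (least-squares line, r² as SSreg / SSy, standard error of estimate, and beta weights on standardized data) can be checked with a small Python sketch. The data are hypothetical and `fit_line` and `standardize` are illustrative helper names, not SPSS functions.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares line; returns slope, intercept, r squared, standard error of estimate."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * a for a in x]
    ss_y = sum((b - mean_y) ** 2 for b in y)                     # total variation
    ss_reg = sum((p - mean_y) ** 2 for p in predicted)           # explained variation
    ss_res = sum((b - p) ** 2 for b, p in zip(y, predicted))     # unexplained variation
    return slope, intercept, ss_reg / ss_y, sqrt(ss_res / (n - 2))


def standardize(values):
    """Convert raw scores to z-scores (mean 0, standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


x = [1, 2, 3, 4, 5]   # hypothetical predictor scores
y = [2, 3, 5, 4, 6]   # hypothetical criterion scores
slope, intercept, r_squared, see = fit_line(x, y)
beta, z_intercept, _, _ = fit_line(standardize(x), standardize(y))
print(slope, intercept, r_squared, see, beta, z_intercept)
```

On standardized data the intercept comes out as 0 (up to rounding) and the slope is the beta weight, which in the bivariate case equals r.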


 SPSS Technique:
o Regression Test: test the relationship between independent variable X
(metric) and dependent variable Y (metric).
o Purpose: Used to predict the value of a variable based on the value of
another variable.
o Function: D: Y = aX + b + e

 SPSS Running:
o Step 1: Analyze → Regression → Linear
o Step 2: Put the DV into the Dependent box and the IV into the Independent box
o Step 3: Click Ok

 SPSS Results:


 Interpretation: read from the last table → 2nd table


o Coefficient Table: The Sig. of SocInfluence is .001 (<5% and close to .000)
→ Social Influence leads to PurIntention
o Strength of the Relationship: Look at B, not Beta (single relationship) → B
is .295 → When Social Influence increases by 1 unit, PurIntention
increases by .295 unit
o Model Summary: Adjusted R Square = 0.063 → 6.3% of PurIn is explained
by SocialInfluence (93.7% explained by errors).
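As a quick arithmetic check of how B and Adjusted R Square are read, the short Python sketch below uses the two figures reported in the example. The constant and function names are illustrative only.

```python
B_SOC_INFLUENCE = 0.295    # unstandardized slope (B) from the Coefficients table
ADJUSTED_R_SQUARE = 0.063  # from the Model Summary table


def predicted_change(delta_x, slope=B_SOC_INFLUENCE):
    """Expected change in PurIntention when Social Influence changes by delta_x."""
    return slope * delta_x


# Share of PurIntention that the model leaves unexplained.
unexplained_share = 1 - ADJUSTED_R_SQUARE
print(predicted_change(1), round(unexplained_share, 3))
```

A 1-unit rise in Social Influence goes with a 0.295-unit rise in PurIntention, and the unexplained share is 1 minus the Adjusted R Square.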

3. Multiple Regression
 Definition:
o Multiple regression involves a single dependent variable and two or more
independent variables.
o The partial regression coefficient, b1, represents the expected change in
Y when X1 is changed by one unit and X2 through Xk are held constant.
o The strength of association is measured by the coefficient of multiple
determination, R2.
o The significance of the overall regression equation may be tested by the
overall F test.
o Individual partial regression coefficients may be tested for significance
using the t test or the incremental F test.

 SPSS Technique:
o Regression Test for 1 IV and 1 DV:
 Purpose: used when we want to predict the value of a variable
based on the value of another variable.
 Function: D: Y = aX + b + e
 Example: Propose that SocialInfluence leads to PurIntention -> Test
that (single relationship).
 SPSS Running:
 Analyze -> Regression -> Linear
 Put the DV into the Dependent box and the IV into the Independent box
 Click Ok
 SPSS Results:


 Interpretation: read from the last table -> 2nd table


 Coefficient Table: The Sig. of SocInfluence is .001 (<5% and
close to .000) -> Social Influence leads to PurIntention
 Strength of the relationship: Look at B, not Beta (single
relationship) -> B is .295 -> When Social Influence increases
by 1 unit, PurIntention increases by .295 unit
 Model Summary: Adjusted R Square = 0.063 -> 6.3% of
PurIn is explained by SocialInfluence (93.7% explained by
errors)

o Regression Test for 2 IVs and 1 DV


 Purpose: Adding 1 more variable into the regression to increase
the “reality fit”.


 SPSS Running:
 Step 1: Analyze → Regression → Linear
 Step 2: Put the new variable (BrandImage) into the
Independent box
 Step 3: Click Ok
 SPSS Results:

 Interpretation:
 If all Sig. are lower than 5% → both factors really lead to
Purchase Intention → Which one is stronger?
 B is unstandardized but Beta is already standardized
(both represent the slope) → Look at the Beta in multiple
regression and B in single regression
 Model Summary: Adjusted R Square is .163 → higher than
single regression (.063) → reduce the error
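The two reading rules for multiple regression (compare standardized Betas among significant IVs, and use Adjusted R Square to compare models) can be sketched in Python. The adjusted R² formula is the standard one; the Sig. and Beta values below are hypothetical, and the function names are illustrative.

```python
def adjusted_r_square(r_square, n_cases, n_predictors):
    """Adjusted R squared: penalizes R squared for the number of predictors."""
    return 1 - (1 - r_square) * (n_cases - 1) / (n_cases - n_predictors - 1)


def strongest_predictor(coefficients, alpha=0.05):
    """coefficients maps an IV name to (Sig., standardized Beta)."""
    significant = {name: abs(beta)
                   for name, (sig, beta) in coefficients.items() if sig < alpha}
    return max(significant, key=significant.get) if significant else None


# Hypothetical Coefficients table for the two-IV model.
coefficients = {"SocialInfluence": (0.004, 0.21), "BrandImage": (0.000, 0.33)}
print(strongest_predictor(coefficients))
print(adjusted_r_square(0.5, n_cases=11, n_predictors=2))
```

Absolute values are compared because a negative Beta can be just as strong as a positive one.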

Chapter 18 – ISB Academic Team

CHAPTER 18
DISCRIMINANT AND LOGIT ANALYSIS

1. Discriminant
 Definition
o A technique for analysing marketing research data when the dependent
variable is non-metric and the independent variables are metric.
o The purpose of Discriminant analysis is to test the effect of the metric IVs
on a non-metric DV that has 2 or 3 levels.
 Examples
o If subscribing intention DV only has 2 levels: low and high, and attitude is
the most important among the others, is that right?
o Is Brand Image the most important factor affecting the 2-level purchase
intention?
 SPSS Running and interpretation
o 2- groups discriminant
 Example: Is Brand Image the most important factor affecting the 2-
level purchase intention? (2 levels are high and low)
 How to run SPSS
 Step 1: Transform the dependent variable into a new
variable that has 2 levels
 Transform → Recode into Different variables → Purchase
Intention
 Name the new variable PI2lev
 For the Old -> New box:
o Tick Range: type in 1 through 3 → New value: type
in Value: 1
o Type in 3 through 5 → New value: type in Value: 2


 Step 2: Analyse → Classify → Discriminant


 Step 3: PI2lev → Grouping Variable
 Step 4: Define range (1-2)
 Step 5: Independent: all of independent variables
 Step 6: Classify → tick Leave-one-out classification
 Step 7: Click continue → OK
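The recode in Step 1 can be pictured with a few lines of Python. This is a sketch under one stated assumption: the Old -> New rules are applied in the order listed, so a score of exactly 3 (which falls in both ranges) is caught by the first rule. The function name `recode` and the scores are hypothetical.

```python
def recode(value, rules):
    """Apply the first matching (low, high, new_value) rule, like an Old -> New list."""
    for low, high, new_value in rules:
        if low <= value <= high:
            return new_value
    return None  # value not covered by any rule


# Ranges from Step 1: 1 through 3 -> 1 (low), 3 through 5 -> 2 (high).
TWO_LEVEL_RULES = [(1, 3, 1), (3, 5, 2)]

purchase_intention = [1.5, 3.0, 3.25, 4.75]  # hypothetical mean scores
pi2lev = [recode(v, TWO_LEVEL_RULES) for v in purchase_intention]
print(pi2lev)
```

The same helper works for the 3-level case by passing three ranges instead of two.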


 SPSS Interpretation

 Eigenvalues table
 We have 1 function because we have 2 groups (1 function to
categorize the results into 2 groups).
 Eigenvalue= xxx, function 1 explains 100% of variance in
dependent variable.
 Canonical Correlation = 0.xx → (0.xx)² = 0.yy = yy% of the
dependent variable is explained by the independent variables.
 Example: Canonical Correlation = 0.551 → (0.551)² ≈ 0.303
→ 30.3% of the dependent variable is explained by the independent
variables.

 Wilks' Lambda
 Sig < 5% --> It's good. It means that SPSS can create the
discriminant line.
 Sig > 5% --> Groups are mixed together too much --> SPSS
cannot create the Discriminant line (maybe can but it's not
effective).


 Standardised canonical discriminant function coefficient:


 Compare the Beta of each IV (its standardised coefficient in the
discriminant function D): the IV with the highest Beta affects the DV the most
→ Thus, the claim in your question that a given IV is the most
important among the others is correct/incorrect.
 Examples:
o Beta of Perceived Brand Quality is 0.804, which is
the highest beta → means that Perceived Brand
Quality has the highest effect/the most important
among others on the purchase intention.

 Classification results

 Look at the cross validated grouped


 Number of people predicted in group 1 and actually in
group 1: x
 Number of people predicted in group 2 and actually in
group 2: y
 To check the validity of the estimated discriminant function, we
calculate the share of correctly classified cases in the sample
→ Correct discrimination ratio = (x + y)/N = %, where N
is the total number of cases in the sample collected by the researchers.
If the ratio is > 62.5%, it is good.
 The simple random rate is 50 percent for 2 groups.
Researchers believe that a ratio that is 25% greater than
simple random (50% × 1.25), which translates to a ratio over
62.5%, is favorable.


 Example:
o Number of people predicted in group 1 and actually
in group 1: 40
o Number of people predicted in group 2 and actually
in group 2: 65
→ Correct discrimination ratio = (40+65)/
(55+94) = 105/149 ≈ 0.70 → 70% > 62.5% → it is good (62.5% is
the benchmark because there are only 2 groups being classified).
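The two calculations used when reading the 2-group output (squaring the canonical correlation, and comparing the correct discrimination ratio with the chance benchmark) can be sketched in Python. The figures are the ones from the examples above; the function names are illustrative, and the benchmark assumes equal-sized groups, as the "simple random rate" does.

```python
def hit_ratio(correct_per_group, total_cases):
    """Share of cases on the diagonal of the cross-validated classification table."""
    return sum(correct_per_group) / total_cases


def chance_benchmark(n_groups, margin=0.25):
    """Simple random rate (1 / number of groups) raised by 25%."""
    return (1 + margin) / n_groups


explained = 0.551 ** 2                  # canonical correlation squared
ratio = hit_ratio([40, 65], 55 + 94)    # correctly classified / all cases
benchmark = chance_benchmark(2)         # 50% * 1.25 for 2 groups
print(round(explained, 3), round(ratio, 3), benchmark, ratio > benchmark)
```

Calling `chance_benchmark(3)` gives the corresponding benchmark for a 3-group discriminant.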

o 3- groups discriminant
 Example: Is Brand Image the most important factor affecting the 3-
level purchase intention?
 How to run SPSS:
 Step 1: Transform the dependent variable into a new
variable that has 3 levels
 Transform → Recode into Different variables → Purchase
Intention
 Name: PI3lev
 In the Old -> New table
o Tick range from 1 through 2.3 → New value: value:
1
o Tick range 2.3 to 3.7 → New value: value : 2
o Tick range 3.7 to 5 → New value: value : 3


 Step 2: Analyse → Classify → Discriminant


 Step 3: PI3lev → Grouping Variable
 Step 4: Define range (1-3)
 Step 5: Independent: all of independent variables
 Step 6: Classify → tick Leave-one-out classification and Combined-groups;
also tick All groups equal and Within-groups
 Step 7: Click continue → OK


 SPSS Interpretation

 Eigenvalues table
 We have 2 functions because we have 3 levels, so 2
discriminant lines are needed.
 Do NOT interpret the canonical correlation here: there are two R
values (one per function), and we cannot square a single R as in the
2-level discriminant.
 Look at the % of variance column:
o Function 1 can explain xx% of variance.
o Function 2 can explain xx% of variance.
 Example
o Function 1 can explain 91.2% of variance
o Function 2 can explain 8.8% of variance


 Wilks' Lambda
 Sig. of Function 1 through 2 < 5% → We should use both
functions to explain the model.
 Sig. of Function 2 > 5% → Do not use function 2 alone
to explain the model.
 If both Sig. < 5% → We can explain the model with both
functions or with function 2 only. However, with function 1
only, we don’t know.

 Standardised Canonical Discriminant Function Coefficient:


 Beta of var x is xx, which is the highest beta in Function 1 →
Function 1 has a good relationship with var x.
 Beta of var y is yy, which is the highest beta in Function 2 →
Function 2 has a good relationship with var y.
 Example:
o Beta of Perceived Brand Quality is 0.807, which is
the highest beta in Function 1 → Function 1 has a
strong relationship with Perceived Brand
Quality.
o Beta of Brand Image, which is the highest beta in
Function 2 → Function 2 has a good relationship
with Brand Image.

 Canonical Discriminant Function table:


 Look at the Group Centroids. We draw lines horizontally and
vertically from each group centroid to the axes.
 We want to improve DV (purchase intention) → Group 3 is
a group of high purchase intention consumers → we want
the number of them to increase.
 Group 3 has a good relationship with function 1.
 Want to improve DV → Improve Group 3 —> Improve
Function 1→ Improve 2 highest betas in function 1 (look at
Standardised Canonical Discriminant Function
Coefficient).


 Classification results


 Look at the cross validated grouped:


 Number of people predicted in group 1 and actually in
group 1: x
 Number of people predicted in group 2 and actually in
group 2: y
 Number of people predicted in group 3 and actually in
group 3: z
 To check the validity of the estimated discriminant function, we
calculate the share of correctly classified cases in the sample
→ Correct discrimination ratio = (x + y + z)/N = %, where
N is the total number of cases in the sample collected by the
researchers. If ratio > 41.6%, it is good.
 The simple random rate is 33.3 percent for 3 groups.
Researchers believe that a ratio that is 25% greater than
simple random (33.3% × 1.25), which translates to a ratio over
41.6%, is favourable.
 Example
o Number of people predicted in group 1 and actually
in group 1: 12
o Number of people predicted in group 2 and actually
in group 2: 26
o Number of people predicted in group 3 and actually
in group 3: 52
→ Correct discrimination ratio = (12+26+52)/149 ≈
0.60 → 60% > 41.6% → it is good.


2. Logistic regression analysis


 Definition
o Logistic regression is used to predict the relationship between
independent variables and the dependent variable where the
dependent variable is binary.
o The purpose of logistic regression is to test whether or not each independent
variable affects the dependent variable.
o Logistic regression gives the Sig. but not a comparable Beta, whereas Discriminant
gives the Beta but no Sig. → we cannot compare the effect of independent
variables on the dependent variable in logit analysis.
o Examples:
 Does interface quality really matter (is it important at all) in the whole
model?
 If subscribing intention only has 2 levels: low and high, frequency of
past behaviour is not important at all among other factors, is that
right?

 SPSS Running and Interpretation


o Example: If purchase intention only has 2 levels: low and high, social
influence is not important at all among other factors, is that right?
o How to run SPSS
 Step 1: Transform the dependent variable into a new variable
that has 2 levels
 Transform → Recode into Different variables → Purchase Intention
 Name: PI2lev
 In the Old -> New Table:
 Tick range from 1 through 3 → New value: value: 1
 Tick range 3 to 5 → New value: value : 2


 Step 2: Analyse → Regression → Binary Logistic
 Step 3: Dependent variable: PI2lev
 Step 4: Independent: all of independent variables
 Step 5: Click continue → OK


o SPSS Interpretation

 Do not care about the block 0


 In block 1:
 The omnibus Tests of Model Coefficients table in Model row:
Sig. < 0.05
→ Block 1 model is a significant improvement to the block 0
model.
 Model Summary
 Nagelkerke R squared = 0.XYZ → XY.Z% of the DV is explained
by this model.
 Example: Nagelkerke R squared is 0.415 → 41.5% of the DV is
explained by this model.


 Variable in the Equation


 Sig. < 5% --> good sig --> IV affects DV
 Sig. > 5% → IV does not affect DV
 We do not compare the B of the independent variables to see
which has the stronger effect on the dependent variable, as the B
in this case is a log-odds coefficient and is not directly comparable across IVs.
 Example:
 Sig of Social influence is 0.019, which is lower than 5%,
which means that social influence really affects the
purchase intention → the statement is wrong.
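For readers who want to see where the Nagelkerke R squared in the Model Summary comes from, the Python sketch below computes it from the -2 Log likelihood of the block 0 model and of the block 1 model. The formula is the standard Cox and Snell value rescaled by its maximum; the input numbers and the function name are hypothetical, not taken from the example output.

```python
from math import exp


def nagelkerke_r_square(neg2ll_null, neg2ll_model, n_cases):
    """Nagelkerke R squared from the two -2 Log likelihood values."""
    # Cox & Snell R squared: 1 - (L_null / L_model) ** (2 / n)
    cox_snell = 1 - exp((neg2ll_model - neg2ll_null) / n_cases)
    # Largest value Cox & Snell can reach for this sample
    max_cox_snell = 1 - exp(-neg2ll_null / n_cases)
    return cox_snell / max_cox_snell


# Hypothetical values: block 0 has -2LL = 138.63, block 1 has -2LL = 100, n = 100.
value = nagelkerke_r_square(138.63, 100.0, 100)
print(round(value, 3))
```

A model that does not improve on block 0 at all gives 0, and a lower -2 Log likelihood in block 1 pushes the value towards 1.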

Chapter 20 – ISB Academic Team

CHAPTER 20: CLUSTER ANALYSIS

1. Definition:
 Cluster Analysis (classification analysis, or numerical taxonomy) is a class of
techniques used to classify objects or cases into relatively homogeneous groups
called clusters.
o Objects in each cluster tend to be similar to each other and dissimilar to
objects in the other clusters.
 Cluster analysis examines an entire set of interdependent relationships, which
means it makes no distinction between dependent and independent variables.
 Purpose: to classify objects into relatively homogeneous groups based on the set of
variables considered.
 Both cluster analysis and discriminant analysis are concerned with classification.
o In cluster analysis, there is no a priori information about the group or
cluster membership for any of the objects -> You don’t know who or what
belongs in which group. You often don’t even know the number of groups.
o In discriminant analysis, prior knowledge of the cluster or group
membership is required for each object or case included to develop the
classification rule -> You can derive a rule for classifying other cases
based on the available cases
 SPSS has three different procedures that can be used to cluster data: hierarchical
cluster analysis, k-means cluster, and two-step cluster.
o A large data file, a mixture of continuous and categorical variables -> two-
step procedure
o A small data set and want to easily examine solutions with increasing
numbers of clusters -> hierarchical clustering
o You know how many clusters you want and you have a moderately sized
data set -> k-means clustering
 Applications in Marketing:
o Segmenting the Market: For example, consumers may be clustered on the
basis of benefits sought from the purchase of a product. Each cluster would
consist of consumers who are relatively homogeneous in terms of the
benefits they seek
o Example: how many segments should we divide the market into and what
are the characteristics of those segments.


2. SPSS and Interpretation:


 Hierarchical Clustering:
o How to run SPSS:
 Step 1: Analyse -> Classify -> Hierarchical cluster
 Step 2: Put all your IVs and DV into the Variable(s)
 Step 3: For Statistics, tick the Agglomeration schedule
 Step 4: For Plots, tick Dendrogram
 Step 5: For Method, tick Ward’s method
 Step 6: Make sure the Measure section is Interval and Squared
Euclidean distance
 Step 7: Click Ok


o Results & Interpretation:


 Notes: Pretend that your sample represents the market for the
segmentation to be “useful” (since the sample size is small).
 For Agglomeration Schedule:
 An agglomeration schedule gives information on the
objects or cases being combined at each stage of a
hierarchical clustering process.
 The numbers in the Cluster 1 and Cluster 2 columns are the
cases (or clusters) that got combined at the nth stage.
 The Coefficients column represents the distance between the 2
clusters that got combined.
 If Coefficient is .000 -> they got combined because 2
participants gave the same answers to every single question.
 Look for the biggest jump in Coefficients to determine the
number of clusters
 Example:
o SPSS combines clusters 42 and 45 at stage 121.
o In this case, SPSS suggests dividing into two
segments based on the chart. (Because the distance
between the last two clusters is the greatest -> keep
2 clusters for the most different groups).


 Ward’s Linkage Dendrogram: To determine the number of
clusters.
 A dendrogram, or tree graph, is a graphical device for
displaying clustering results.
 Read the dendrogram from left to right, and draw a line on the
graph wherever you see a vertical line (each vertical line marks two
clusters being joined).
 The distances indicate how separated the individual pairs of
clusters are. Clusters that are widely separated are distinct,
and therefore desirable -> Look for the largest width
between the lines -> Count the number of horizontal lines
from the top in the largest segment
 Example: In this case, the widest gap between lines has 2
horizontal lines -> there should be 2 segments.
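The "biggest jump in Coefficients" rule for the agglomeration schedule can be sketched in Python. The logic assumes one Coefficients value per stage and that a schedule with n cases has n - 1 stages; the coefficient values and the function name are hypothetical.

```python
def clusters_from_schedule(coefficients):
    """coefficients[i] is the Coefficients value at stage i + 1 of the schedule."""
    n_cases = len(coefficients) + 1          # n cases are merged in n - 1 stages
    jumps = [later - earlier
             for earlier, later in zip(coefficients, coefficients[1:])]
    # Stage (1-based) just before the biggest jump: stop merging there.
    stage_before_jump = jumps.index(max(jumps)) + 1
    # After stage s, n - s clusters remain.
    return n_cases - stage_before_jump


# Hypothetical schedule for 6 cases: the last merge is by far the most costly.
schedule = [1.0, 2.0, 3.0, 4.5, 12.0]
print(clusters_from_schedule(schedule))
```

When the largest jump happens at the very last stage, as here, the rule keeps 2 clusters, which matches the situation described in the example.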


 K-Means Cluster Analysis (“Quick Cluster”):


o The K Means Cluster Analysis is used to determine the properties or
characteristics of each cluster.
o Example: what are the characteristics of the segments that we divided?
o SPSS Running:
 Step 1: Analyze -> Classify -> K-means Clusters
 Step 2: Put all your IVs and DV into the Variable(s)
 Step 3: Type in the Number of Clusters you desire (get it from running
the Hierarchical Clustering)
 Step 4: Click Ok


o Results & Interpretation:


 For Final Cluster Centers and Number of Cases in each Cluster:


 Cluster 1 contains 70 cases while Cluster 2 contains 76
cases -> Divide by the total number of participants to know
the percentage of the market each cluster represents
 Conclusion:
 Cluster 1 is more satisfied than Cluster 2 in every way (the
numbers are always higher in Cluster 1).
 Cluster 2 is made up of choosy clients, whereas Cluster 1 is
made up of easygoing customers.
 This solution, however, is only for mathematical purposes and has
no application to managerial choices.
 To avoid this issue, we should apply the K-means technique to
divide the market into more segments (keep the number under 6 so
the clusters are not too close).
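K-means itself can be sketched in a few lines of plain Python. This is a minimal illustration of the assign-and-update loop, not SPSS's Quick Cluster procedure: the respondent scores, the starting centers, and the function names are hypothetical. The last lines show the market-share calculation using the 70 and 76 cases from the example.

```python
def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centers, max_iter=100):
    """Assign each case to its nearest center, then move centers to the cluster means."""
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda j: squared_euclidean(p, centers[j]))
                      for p in points]
        if new_labels == labels:      # no case changed cluster: converged
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:               # final cluster center = mean of its members
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Hypothetical scores of 7 respondents on 2 variables.
points = [[5, 4], [4, 5], [5, 5], [2, 1], [1, 2], [2, 2], [1, 1]]
centers, labels = k_means(points, initial_centers=[points[0], points[3]])
print(centers, labels)

# Share of the market each cluster represents, using the example's case counts.
cases = {"Cluster 1": 70, "Cluster 2": 76}
shares = {name: n / sum(cases.values()) for name, n in cases.items()}
print({name: round(100 * s, 1) for name, s in shares.items()})
```

The final centers play the role of the "Final Cluster Centers" table: comparing them variable by variable is what gives each segment its description.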

Chapter 21 – ISB Academic Team

CHAPTER 21
MULTIDIMENSIONAL SCALING & CONJOINT ANALYSIS

1. Multidimensional Scaling
 Internal
o Definition
 A technique for positioning, which means making a spatial map (the
visual display of the positions of our brand and our competitors), with
the purpose of comparing and knowing about the positioning of
current brands on these dimensions.
 The purpose of this technique is to determine direct and indirect
competitors and their strengths.
 Example:
 Who are your direct and indirect competitors, what are the
strengths of you and your competitors?

o SPSS Running and Interpretation


 Example: Our firm is Samsung, so who are your direct and indirect
competitors, what are the strengths of you and your competitors?
 How to run SPSS:
 Step 1: Analyze → Scale → Multidimensional Scaling
(PROXSCAL)
 Step 2: Multidimensional Scaling Data: Data format →
Create proximities from data
 Step 3: Put our firm and the competitors into the variables
box
 Step 4: Check the following:
o Model: Interval
o Output: check the distances
o Measure: Squared Euclidean distance


 SPSS Interpretation:

 Define the direct and indirect competitors → we have 2
methods
 Using the Distance tables:
o Direct competitors: competitors that have a small
distance to our firm are the direct competitors.
o Indirect competitors: competitors that have a big
distance to our firm are indirect competitors.
 Examples:
o Our firm is Samsung → look at the first column
o Xiaomi is the most direct competitor with a distance
of 0.257, followed by Apple (0.454).
o Indirect competitors: Nokia, Huawei, Oppo, Vivo.

 Using the object points (internal spatial map) → what we want is
the map; it cannot tell us about the characteristics
o Direct competitors: firms that are near our firm on the map
are the direct competitors.
o Indirect competitors: firms that are far from our
firm are the indirect competitors.

 Define the strength of each competitor and our firm →
Based on the excel file, the firm that has the highest score in
each dimension has that strength.
 Example:
→ Strength of each brand:


o Oppo: Brand Image
o Vivo: Social Influence
o Realme: Subjective Knowledge of Product
o Huawei: Perceived Price
o Apple: Perceived Brand, Purchase Intention
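The two readings above (distance to our firm, and highest score per dimension) can be sketched in Python. This is a simplification: it uses squared Euclidean distances between hypothetical mean scores per brand, whereas PROXSCAL builds its proximities from the full data set and reports distances on the fitted map. All scores and names below are made up for illustration.

```python
def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


OUR_FIRM = "Samsung"
DIMENSIONS = ["BrandImage", "SocialInfluence", "PerceivedPrice"]
scores = {  # hypothetical mean score of each brand on each dimension
    "Samsung": [4.0, 3.5, 3.0],
    "Xiaomi": [3.8, 3.4, 3.3],
    "Apple": [4.5, 3.9, 2.6],
    "Nokia": [2.5, 2.8, 3.6],
}

# Competitors sorted from most direct (smallest distance) to most indirect.
ranking = sorted((b for b in scores if b != OUR_FIRM),
                 key=lambda b: squared_euclidean(scores[b], scores[OUR_FIRM]))

# Strength: the brand with the highest score on each dimension.
strengths = {dim: max(scores, key=lambda b: scores[b][i])
             for i, dim in enumerate(DIMENSIONS)}
print(ranking)
print(strengths)
```

The first brand in `ranking` is the most direct competitor; brands at the end of the list are the indirect ones.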

 External:
o Definition
 The technique shows how far each brand is from the perfect brand, and
which brand is the nearest to the perfect brand compared to the
others.
 Example:
 Between you and your competitors, who is the nearest to being
the perfect competitor in this market?
o SPSS Running and Interpretation
 Example: As your firm is Samsung, between you and your
competitors, who is the nearest to the perfect competitor in this
market?
 How to run SPSS:
 Step 1: At the bottom left of the Data View -> Variable View ->
Create a new variable (named Perfect, PerfectUni,
PerfectCompany, ...) in the Name column. After that, give the
PerfectFirm variable a value of 5 in every row -> Run the
Multidimensional Scaling again.
 Step 2: Analyze → Scale → Multidimensional Scaling
(PROXSCAL)
 Step 3: Multidimensional Scaling Data: Data format →
Create proximities from data
 Step 4: Put our firm, the competitors and the Perfect into
the variables box
 Step 5: Check the following:
o Model: Interval
o Output: check the distances
o Measure: Squared Euclidean distance


 SPSS Interpretation
 Object Points (Common Space) plot: which point is nearest to the
Perfect point. → external spatial map.
 Distance Table: which brand has the shortest/smallest
distance to the Perfect Brand. --> Brand X is the nearest to being
the perfect competitor in the market
 Example: Based on the external spatial map, we can see
that Apple is the nearest to the Perfect point and based on
the distance table, we can see that Apple is closest to the
Perfect Firm (1.140).
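The idea of the Perfect variable (a brand that scores 5 on everything) can be sketched in Python. As with any such sketch, the brand scores are hypothetical and raw squared Euclidean distances stand in for the distances that PROXSCAL reports.

```python
def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


scores = {  # hypothetical mean scores on three 1-5 dimensions
    "Samsung": [4.0, 3.5, 3.0],
    "Xiaomi": [3.8, 3.4, 3.3],
    "Apple": [4.6, 4.2, 3.9],
}
scores["Perfect"] = [5.0] * 3   # the ideal brand gets all 5s

distance_to_perfect = {brand: squared_euclidean(profile, scores["Perfect"])
                       for brand, profile in scores.items() if brand != "Perfect"}
nearest = min(distance_to_perfect, key=distance_to_perfect.get)
print(nearest, distance_to_perfect)
```

The brand with the smallest distance to Perfect is the one read off as "nearest to the perfect competitor".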


2. Conjoint Analysis:
 Definition
o Conjoint analysis attempts to determine the relative importance
consumers attach to salient attributes and the utilities they attach to the
levels of attributes
o Conjoint tests estimate which attribute is the most and the least important
that leads to respondents’ preference.
o Example:
 Brainstorm for ideas of a new product with 3 attributes, 2 levels
each. Collect preferences of all members in your own group to test
the new product.

 SPSS Running and Interpretation


o We start by preparing Plancards, breaking down your dimensions into
attributes.
o Examples: For our product being Samsung phone, we have 3 main
attributes, each having 2 levels
 Price: 2 levels - low and high
 Size of the smartphone: 2 levels, medium (4.5-5 inches) and large
(5-5.5 inches)
 Screen: Foldable or unfoldable


o How to run SPSS:


 Step 1: Create profiles
 Step 1.1: From the above 3 attributes, we will create several
smartphone profiles using Orthogonal Design in SPSS
o Data → Orthogonal Design → Generate
o Factor name: Add your attribute → Add
o Define value: 1 → label name; 2 → label name (see
the image below)

 Step 1.2: After entering all attributes → Create new data file →
File, and type your file name
 Step 1.3: Save and click OK
 Step 1.4: Display the Plancards:
o Open that data file → Data → Orthogonal Design → Display, and
choose Profiles for subjects → we will have a card list
like this (see the image below)

 Step 2: Collect users’ preferences


 The preferences help SPSS understand the taste of users to
create the best profile that can attract most customers. The
larger the data, the more precise the result.
 Step 2.1: Open new data → variable → create the ranks from
1 to N (N is the number of competitors that you have
collected the data).
 Step 2.2: Ask the respondents to rank the cards, where ID
is the card’s ID in the Card list that you have created (see the
image below).
 Step 2.3: Save your file name


 Step 3: Run the conjoint


 Step 3.1: Open → syntax (will be provided)
 Step 3.2: Go to the Library to find the correct address of
each input file.
 Step 3.3: Enter the name of Factors correctly
 Step 3.4: Click run (the green play button)
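To see what the profiles created in Step 1 look like, the Python sketch below lists every combination of the three attributes from the smartphone example. Note the assumption: this is the full set of 2 × 2 × 2 = 8 cards, whereas the Orthogonal Design procedure may keep only a subset of them; the card numbering here is illustrative.

```python
from itertools import product

# Attributes and levels from the smartphone example.
attributes = {
    "Price": ["Low", "High"],
    "Size": ["Medium (4.5-5 inches)", "Large (5-5.5 inches)"],
    "Screen": ["Foldable", "Unfoldable"],
}

# One profile (card) per combination of levels.
profiles = [dict(zip(attributes, levels)) for levels in product(*attributes.values())]

for card_id, profile in enumerate(profiles, start=1):
    print(card_id, profile)
```

Each printed line corresponds to one card that respondents would be asked to rank in Step 2.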


o SPSS Interpretation


 Warnings
 No reversals occurred → a good thing, as there are no mistakes

 Utilities
 Utility estimate indicates which option is the best for your
new product.
 In each attribute, which level has the highest utility estimate
is the best for that attribute.
 The highest Utility Estimate → the best choice for each
attribute.
 Example: With the results shown in the table, we can
conclude that the manufacturer should create a smartphone with
the following characteristics:
o High price (utility estimate is .750)
o Medium size (utility estimate is .563)
o Foldable phone (utility estimate is .375)

 Importance value
 The highest value → the most important attribute that leads
to people’s preference.
 The lowest value → the least important attribute that leads
to people’s preference.
 Example: It is the price that the manufacturer needs to pay
most attention to as it has the highest importance value
score.
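The link between utilities and importance values can be sketched in Python. The positive utility estimates are the ones from the example; the mirrored negative values for the other level are an assumption (the part-worths of a two-level factor sum to zero). Importance is computed here as each attribute's utility range divided by the sum of ranges, which may differ slightly from the SPSS table because SPSS averages the importance over respondents.

```python
# Utility estimates from the example; the negative values for the other
# level of each attribute are assumed (two-level part-worths sum to zero).
utilities = {
    "Price": {"High": 0.750, "Low": -0.750},
    "Size": {"Medium": 0.563, "Large": -0.563},
    "Screen": {"Foldable": 0.375, "Unfoldable": -0.375},
}

# Best level per attribute = the level with the highest utility estimate.
best_profile = {attr: max(levels, key=levels.get) for attr, levels in utilities.items()}

# Importance = range of utilities within an attribute, as a share of all ranges.
ranges = {attr: max(levels.values()) - min(levels.values())
          for attr, levels in utilities.items()}
total_range = sum(ranges.values())
importance = {attr: 100 * r / total_range for attr, r in ranges.items()}

print(best_profile)
print({attr: round(value, 1) for attr, value in importance.items()})
```

The attribute with the widest utility range gets the highest importance, which is why Price comes out on top in this sketch.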

Summary Table - ISB Academic Team

Each entry lists: No., Technique, Purpose, Data Type, Interpretation, Example.

1. Reliability test
 Purpose: Test the reliability of the items in a construct of the questionnaire when we have multiple Likert scale questions.
 Data Type: Metric
 Interpretation: Cronbach's Alpha should fall between 0.5 and 0.95
 Example: Are your questions related or suitable for your questionnaire?

2. Factor Analysis
 Purpose: A method to test the main constructs, find a representative for each construct, and make sure groups are not merged together.
 Data Type: Dimension Reduction
 Interpretation:
o KMO should be from 0.5 to 1
o The Sig. number should be less than 5% (0.05)
 Example: How to reduce 35+ variables to only 4 or 5 conclusive constructs?

3. Independent sample t-test
 Purpose: Compare the means of a metric variable between two independent groups.
 Data Type: 2-levels Nonmetric > Metric
 Interpretation:
o Sig. is high (> 5%): the two groups are the same
o Sig. is lower than 5%: there is a difference between the 2 groups
 Example: Are there differences in perceived brand quality between the two genders?

4. Paired sample t-test
 Purpose: Compare the means of 2 variables (both metric) in 1 sample.
 Data Type: Metric > Metric
 Interpretation:
o Sig. (2-tailed) > 5%: there is no difference between the 2 samples
o Sig. (2-tailed) is close to 0.00 (< 5%): the samples are different
 Example: Is it right to say that people with different jobs have different perceptions on price?

5. Cross Tabulation
 Purpose: Test the relationship between two categorical (non-metric) variables.
 Data Type: Nonmetric > Nonmetric
 Interpretation: Sig. is lower than 0.05 (< 5%): there is a relationship between the 2 variables
 Example: People with different genders have different jobs, is that right?


6. One-way ANOVA
 Purpose: Comparing means between 3 or more samples.
 Data Type: 3-levels Nonmetric > Metric
 Interpretation: Sig. is lower than 0.05 (< 5%): there is a relationship between those 2 variables
 Example: Does income (low income, medium income and high income) affect purchase intention?

7. Two-way ANOVA
 Purpose: Test the relationship between two non-metric variables and a metric variable to see which one has more effect on the DV.
 Data Type: 2 Nonmetric > Metric
 Interpretation: Significance < 0.05: the non-metric variable does affect the dependent variable
 Example: Between age and gender, which one has more effect on purchase intention?

8. ANCOVA
 Purpose: 2-way ANOVA with 1 controlled independent variable.
 Data Type: 2 Nonmetric > Metric
 Interpretation: Significance < 0.05: the control variable does affect the dependent variable
 Example: Between age and gender, which one has more effect on purchase intention, with control on the brand image?

9. Correlation
 Purpose: Test the relationship between 2 metric variables.
 Data Type: All Metric
 Interpretation: Correlations Table
o If the Pearson correlation between 2 IVs is > 0.7 → multicollinearity → return to Factor Analysis
 Example: Do your questions face the problem of multicollinearity?

10. Regression
 Purpose: Test the relationship between metric variables > metric variables.
 Data Type: All Metric
 Interpretation: Coefficient Table
o Sig. < 0.05 → there is a relationship
o B: strength of the relationship
o Adjusted R-squared
 Example: Propose that Social Influence leads to Purchase Intention? How do we test this?


11. Discriminant Analysis
 Purpose: Test the effect of metric variables (IVs) on a non-metric variable (DV) with 2 or 3 levels.
 Data Type: Metric > Nonmetric (Intention, Satisfaction)
 Interpretation:
o Canonical Correlation (only for 2-level)
o Wilks' Lambda
o Beta: which one has the biggest effect
o No Sig. provided → cannot determine if the relationship is true
 Example: Is Brand Image the most important factor affecting the 2-level purchase intention?

12. Logit Analysis
 Purpose: Test the relationship between metric and nonmetric variables.
 Data Type: Metric > Nonmetric (Purchase Intention, Satisfaction)
 Interpretation:
o Nagelkerke R-square
o Sig. < 5%: the IV affects the DV
o Sig. > 5%: the IV does not affect the DV
o Do not read B (it is not Beta) → cannot know which one affects the DV the most
 Example: Can Subjective Knowledge lead to purchase intention?

13. Hierarchy Cluster Analysis
 Purpose: Segmentation.
 Interpretation:
o Dendrogram using Ward Linkage
o Coefficient
 Example: How many segments should the market be divided into?

14. K Mean Cluster Analysis
 Purpose: Identify the characteristics of the clusters.
 Interpretation:
o Final Cluster Centers Table
o Number of cases in each cluster
 Example: What is the characteristic of each segment?

16. Multidimensional Scaling
 Purpose: Provide a "spatial map" of the positions of you and your competitors.
 Data Type: Distance Table, Object points
 Interpretation:
o Internal Spatial Map
o External Spatial Map
 Example: What are your direct and indirect competitors?


17. Conjoint Analysis
 Purpose: Testing new products with many attributes.
 Data Type: Orthogonal Design
 Interpretation:
o Utility Estimate
o Importance Values
 Example: How important are my product's attributes in the customers' view?
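
As an illustration of the reliability test listed as technique 1 in the summary table, Cronbach's Alpha can be computed by hand with the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The sketch below is a minimal example, not the SPSS procedure itself; the five respondents and their three Likert-scale answers are hypothetical data invented for the illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Compute Cronbach's Alpha.

    responses: list of rows, one per respondent; each row holds that
    respondent's scores on the k items of a single construct.
    """
    k = len(responses[0])
    # Transpose rows into per-item columns.
    items = list(zip(*responses))
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in items)
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# HYPOTHETICAL data: 5 respondents answering 3 Likert items (1 to 5).
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's Alpha = {alpha:.3f}")
# Decision rule taken from the summary table: alpha between 0.5 and 0.95.
print("Acceptable" if 0.5 <= alpha <= 0.95 else "Review the items")
```

The function expects complete answers from every respondent; missing values would need to be handled before calling it.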

Flowchart - ISB Academic Team

REFERENCES

Malhotra, N. K. (2009). Marketing research: An applied orientation (6th ed.). Pearson College
Division.
