Causality: X causes Y.
The scientific concept of causality differs from the commonsense one. The
commonsense view suggests that an event has a single cause and that the
relationship is completely deterministic. The scientific view holds that X only
makes Y more probable, and that causation can never be proved, only inferred
(and the inference may be wrong).
What kinds of evidence can be used to support scientific inferences?
1. Concomitant variation
It is the extent to which X and Y occur together in the way predicted by the
hypothesis. There are 2 cases (qualitative or quantitative).
The relationship between the 2 variables is never perfect, because other factors
intrude on it.
Qualitative: we wouldn’t expect the relationship to be perfect; there will always be
other variables that cause variation.
2. Time order of occurrence of variables
The sequential ordering of the occurrence of variables X and Y helps provide
evidence of a causal relationship between the two. If X causes Y, typically X
precedes Y.
3. Elimination of other possible causal factors
To conclude that X was responsible for the increase in Y, we need to eliminate
other explanatory variables (such as price, size of store…).
Method knowledge is not a substitute for conceptual knowledge.
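
A minimal sketch of how concomitant variation can be quantified, assuming the quantitative case and using the Pearson correlation as the measure (the notes do not name a specific statistic). The store data and the names `ad_spend` and `sales` are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: one way to quantify how closely X and Y vary together."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: advertising spend (X) and unit sales (Y) in six stores.
ad_spend = [1, 2, 3, 4, 5, 6]
sales = [12, 15, 14, 19, 21, 22]

r = pearson_r(ad_spend, sales)
print(f"r = {r:.2f}")  # strong but not perfect association
```

A high r only supports the first type of evidence. Time order and the elimination of other factors still have to be established separately.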

• Experimentation
An experiment can provide more convincing evidence of causal relationships than
exploratory or descriptive designs. In fact, experiments are often called “causal
research.” An experiment can provide evidence of causality because of the control it
affords researchers.
An experimental design, then, is one in which the investigator manipulates at least one
independent variable.
There are 2 types of experiments: Laboratory and Field Experiments.
o Laboratory
The researcher creates a situation with the desired conditions and then
manipulates some variables while controlling others. The investigator is
thus able to observe and measure the effect of the manipulation of the
independent variables on the dependent variable holding the other
variables constant.
o Field Experiments
It is a research study in a realistic or natural situation and involves the
manipulation of independent variables under conditions that are as
carefully controlled as the field situation will permit. It takes place in a natural setting.
Internal and External Validity
- Laboratory experiment: has greater internal validity because of the greater
control (we are able to eliminate the effects of other factors).
- Field experiment: has greater external validity.
Internal Validity: the ability to attribute the observed effect to the
experimental variable and not to other factors.
External Validity: the extent to which the effect can be generalized.
Extraneous Variables
We need to rule out extraneous factors as possible causes. Extraneous factors fall into
several categories:
- History: specific events (external to the experiment but occurring at the same
time) that can affect the criterion variable.
- Maturation: changes occurring within the test units (consumers) that are not
due to the effect of the experimental variable but result from the passage of
time. The type of maturation depends on the duration of the experiment.
- Testing: the experiment itself can affect the responses. There are 2 kinds: the main
testing effect (a prior observation affects a later observation) and the
interactive testing effect (a prior measurement affects perceptions of the
experimental variable).
- Instrument variation: includes any and all changes in measuring instruments
that can account for differences. When many interviewers participate,
significant instrument variation can occur because it is difficult to ensure that
all the interviewers will ask the same questions with the same voice
- Statistical regression: is the tendency of “extreme” cases to move closer to the
average during the course of an experiment. Ex. People may be chosen for the
experiment because they drink a lot, so they will answer all the time the same
- Selection bias: it arises from the way in which test units are selected and
assigned in an experiment. Ex. We interview people who watched the
presidential interview (we can’t decide who watches) and others who did not. The first
group has been exposed to external variables that we cannot control.
- Experimental mortality: the loss of test units during the course of an
experiment.
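
Statistical regression is easy to see in a small simulation. The sketch below assumes each person has a stable "true" level plus random measurement noise; all numbers are invented. The most extreme cases on a first measurement are selected and measured again with no treatment at all.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Each person has a stable true level; every measurement adds independent noise.
true_levels = [random.gauss(50, 10) for _ in range(1000)]
first = [t + random.gauss(0, 10) for t in true_levels]
second = [t + random.gauss(0, 10) for t in true_levels]

# Select the 100 most extreme cases on the first measurement (no treatment applied).
top = sorted(range(1000), key=lambda i: first[i], reverse=True)[:100]

mean_first = sum(first[i] for i in top) / len(top)
mean_second = sum(second[i] for i in top) / len(top)
print(f"first: {mean_first:.1f}  second: {mean_second:.1f}")
```

The second mean of the selected group falls back toward the overall average even though nothing was done to the group, which is why a change observed in an extreme group cannot be credited to the experimental variable without a control.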

Specific Designs
1. Pre-experimental designs
The researcher has very little control over when participants are exposed to the
experimental stimuli. There’s little control of the measurement, the timing or the
- The ONE-SHOT case study: a single group of consumers is exposed to an
experimental variable.
exposed to a variable before and after the study.
- The STATIC-GROUP COMPARISON: a design using 2 groups, one that has
experienced X and another that has not. These groups haven’t been created
2. True Experimental designs
The true experimental design is distinguished by the fact that the experimenter can
randomly assign treatments to randomly selected test units. The experimenter can
control who gets the experimental condition and who doesn’t, and when the
intervention (X) and measurement (O) occur. To distinguish the true experiment,
we denote a random assignment of test units to treatments by (R).
- BEFORE-AFTER WITH CONTROL GROUP DESIGN: The researcher decides which
test units receive the experimental stimulus and which do not. (It is not up to
the test units to self-select whether they will be members of the control or
experimental groups, as they did in the pre-experimental designs.) Further, the
experimenter must assign test units to the experimental and control groups
randomly. (The experimenter may first match the test units on some external
criterion and then assign one member from each of the matched pairs to the
experimental and control groups, but this final assignment is made randomly).
Finally, each of the test units in both groups is measured before and after the
introduction of the experimental stimulus.
- FOUR-GROUP SIX-STUDY DESIGN: When an interactive testing effect is likely to
be present, the four-group six-study design is a good choice. This offers the
researcher much control.
We’d first select a random sample of the firm’s employees. Then we’d
randomly divide the sample into four groups. Those designated for the first
experimental and control groups would be measured for their knowledge of
the credit union. The brochure would then be mailed to those designated as
belonging to the first and second experimental groups. Finally, all four groups
would be measured on their knowledge of the credit union. Six measurements
are made in all, per the name of the design.
- AFTER-ONLY WITH CONTROL GROUP DESIGN: studying the last two groups of
the four-group six-study design in an after-only with control group design
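
The effect estimates behind the true experimental designs can be sketched as simple differences of group means. The numbering O1 to O6 (experimental group 1: O1, O2; control group 1: O3, O4; experimental group 2: O5; control group 2: O6), the additive model and the scores are assumptions made for illustration, not taken from the notes.

```python
def before_after_with_control(o1, o2, o3, o4):
    """Effect of X: change in the experimental group minus change in the control group."""
    return (o2 - o1) - (o4 - o3)

def after_only_with_control(o5, o6):
    """Effect of X when no pre-measurement is taken."""
    return o5 - o6

def interactive_testing_effect(o2, o4, o5, o6):
    """Extra effect that appears only when X follows a pre-measurement."""
    return (o2 - o4) - (o5 - o6)

# Hypothetical mean knowledge scores (0-100) for the credit union brochure example.
o1, o2 = 40, 62   # experimental group 1: before, after
o3, o4 = 40, 44   # control group 1: before, after
o5 = 56           # experimental group 2: after only
o6 = 42           # control group 2: after only

print(before_after_with_control(o1, o2, o3, o4))   # X plus interactive testing effect
print(after_only_with_control(o5, o6))             # X alone
print(interactive_testing_effect(o2, o4, o5, o6))  # interactive testing effect
```

With these made-up scores the before-after estimate is larger than the after-only estimate, and the gap equals the interactive testing effect. That gap is what the four-group design is meant to isolate.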
3. Quasi-Experimental designs
The investigator is not able to schedule the experimental stimuli, and perhaps more
problematic is that the researcher isn’t able to randomly assign test units to
- TIME-SERIES EXPERIMENT: (O1 O2 O3 O4 X O5 O6 O7 O8): This diagram indicates
that a group of test units is observed over time, that an experimental stimulus
is introduced, and that the test units are again observed for their reaction. A
change in the previous pattern of observations is taken as the effect of the
experimental stimulus.
Experimental vs. Nonexperimental Designs
Internal validity: need random assignment.
External validity: need random sampling.
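
The two randomization steps are different operations and can be sketched separately. The population of 500 employee IDs, the sample size and the seed below are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed for a reproducible sketch

# Hypothetical sampling frame: 500 employee IDs.
population = list(range(1, 501))

# Random sampling (external validity): every unit has the same chance of being selected.
sample = rng.sample(population, 40)

# Random assignment (internal validity): shuffle the sample, then split it into groups.
rng.shuffle(sample)
experimental, control = sample[:20], sample[20:]

print(len(experimental), len(control))
```

Sampling decides who is in the study; assignment decides who receives the stimulus. A study can have one without the other.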
Experimentation in Marketing Research
One of the most important growth areas has been in market testing or test marketing
(experiment done in a small section of the marketplace).
Problems of Experimentation
3 of the more critical problems with experimentation in general and test marketing in
particular are:
- Cost
- Time (recommended 1 year minimum)
- Control
Types of Test Markets:
1. Standard Test Markets: One in which companies sell the product through their
normal distribution channels.
2. Controlled Test Market: The entire experiment is conducted by an outside
service (the service pays retailers for shelf space and therefore can guarantee
distribution to those stores).
3. Electronic Test Market: a panel of households in the test market area is
recruited and their demographic information is collected. People in these
households are given ID cards, which they show when checking out at grocery
stores. Everything they purchase is automatically recorded and associated with
the household through scanners found in all supermarkets in the area.
4. Simulated Test Market: prelude to a full-scale market test. Most STMs operate
similarly: consumers are interviewed in shopping malls, exposed to the new
product, and asked to rate its features. They are then shown commercials for it
and for competitors’ products. In a simulated store environment, they are given
the opportunity to buy the product using seed money. Those not purchasing
the test product are typically given free samples. After a usage period, follow-
up phone interviews are conducted with the participants to assess their
reactions to the product and their repeat-purchase intentions. All the
information is fed into a computer model, which has equations for the repeat
purchase and market share likely to be achieved by the test model. The key to
the simulation is the equations built into the computer model.

The emphasis in this chapter was on the third basic type of research design—causal
design. The notion of causality was reviewed, and, according to the scientific
interpretation of the statement “X causes Y,” it was found that (1) we could never prove
that X caused Y, and (2) if the inference that it did was supported by the evidence, X was
one factor that made the occurrence of Y more probable, but it did not make it certain.
Three types of evidence support the establishment of causal linkages. Concomitant
variation implies that X and Y must vary together in the way predicted by the hypothesis.
The time order of occurrence of variables suggests that X must precede Y. The
elimination of other factors requires the analyst to design the investigation so that the
results do not lend themselves to a number of conflicting interpretations. Experiments
provide the most convincing evidence of causal linkages. An experiment is a scientific
study in which an investigator manipulates and controls one or more predictor variables
and observes the response of a criterion variable. There are two general types of
experiments: the laboratory experiment, in which an investigator creates an artificial
situation for the manipulation of the predictor variables; and the field experiment,
which allows these manipulations to take place in a natural setting. The greater control
of a lab experiment allows more precise determination of the effect of the experimental
stimulus, but there is a greater danger of generalizing the results because of its artificial
nature. In either type of experiment, we must be on guard against extraneous sources
of error that may confound interpretation: history, maturation, testing (both main and
interactive), instrument variation, statistical regression, selection bias, and experimental
mortality. True experimental designs are useful in minimizing the impact of these errors.
They are distinguished from pre-experimental and quasi-experimental designs by the
fact that the researcher decides who is to be exposed to the experimental stimulus and
when the exposure is to occur.
The growth of experiments in marketing has been steady. The market test has become
standard practice for some companies to establish the sales potential of new products,
and increasingly to determine the effectiveness of contemplated changes of any
elements of the marketing mix. Two popular variations are the controlled test, in which
the distribution of the product is guaranteed by the service provider, and the simulated
test market (STM), in which reactions from users of the product are used in a series of
equations to predict the repeat-purchase behavior and market share likely to be realized
by the test product. Despite causal designs’ growing use, descriptive designs are still the
dominant form of marketing research investigations. This is partly due to tradition, but
it also reflects the cost, time, and control problems associated with experimental

Questionnaire Design
- Avoid ambiguous questions.
Procedure for developing a questionnaire:

Questions to ask when creating the survey:

1. Is the question necessary?
2. Are several questions needed instead of one?
3. Do respondents have the necessary information? (or do they know it?) →
Solution: use filter questions to determine whether they are likely to know what
you’re going to ask. Ex. Who does the shopping in your home?
4. Will respondents give the information?
5. Form of Response (Open-ended or fixed-alternative?)
a. MULTICHOTOMOUS QUESTIONS: a fixed-alternative question in which
respondents are asked to choose the alternative that most closely
corresponds to their position on the subject.
b. DICHOTOMOUS QUESTIONS: also a fixed-alternative question, but one
in which there are only 2 alternatives. Ex. Yes / No
c. SCALES: another fixed-alternative question. Ex. Never / Occasionally /
Sometimes / Often
6. Decide on Question Wording:
a. Use simple words
b. Avoid ambiguous words and questions
c. Avoid leading questions
d. Avoid implicit assumptions
e. Avoid generalizations and estimates
f. Avoid Double-barreled questions
7. Decide on question sequence (the order of the questions)
a. Use simple, interesting opening questions
b. Use the Funnel Approach (we start with broad questions and
progressively narrow the scope).
c. Design Branching Questions with care (questions of the kind: answer only if
your previous answer was “yes”).
d. Ask for classification information last (data we collect to classify
respondents, ex. Demographic variables).
e. Place difficult or sensitive questions late in the questionnaire.
8. Determine Physical Characteristics
9. Re-examination and Revision of the Questionnaire
10. Presenting the Questionnaire
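
Filter and branching questions can be sketched as a small routing table. The question wording, the IDs `q1` to `q3` and the `route` helper are hypothetical and only illustrate the idea that the answer to one question decides which question comes next.

```python
# Each question names the next question for each answer; "*" means any answer.
QUESTIONS = {
    "q1": {"text": "Do you do the grocery shopping for your household?",  # filter question
           "next": {"yes": "q2", "no": "q3"}},
    "q2": {"text": "How often do you shop for groceries?",
           "next": {"*": "q3"}},
    "q3": {"text": "What is your age group?",  # classification question, asked last
           "next": {"*": "END"}},
}

def route(answers, start="q1"):
    """Return the list of question IDs a respondent actually sees."""
    path, current = [], start
    while current != "END":
        path.append(current)
        branches = QUESTIONS[current]["next"]
        answer = answers.get(current)
        # Follow the branch for this answer, else the catch-all, else stop.
        current = branches.get(answer, branches.get("*", "END"))
    return path

print(route({"q1": "yes", "q2": "weekly", "q3": "25-34"}))
print(route({"q1": "no", "q3": "35-44"}))
```

Writing the routes down explicitly makes it easier to check that no respondent is sent to a question they cannot answer.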
Observational Forms
The decision about what to observe requires that the researcher specify the following:
- Who should be observed?
- What aspects of the purchase should be reported?
- When should the observation be made?
- Where should the observation be made?

A researcher wishing to collect primary data needs to tackle the task of designing the
data-collection instrument. Typically this means designing a questionnaire, although it
may mean framing an observational form. Questionnaire design is still very much of an
art rather than a science, and there are many admonitions of things to avoid when
doing the designing. A nine-step procedure (Figure 9.1) was offered as a guide: What
information will be sought? What type of questionnaire will be used? How will that
questionnaire be administered? What will be the content of the individual questions?
What will be the form of response—dichotomous, multichotomous, or open-ended—
to each question? How will each question be phrased? How will the questions be
sequenced? What will the questionnaire look like physically? Researchers should not
be surprised to find themselves repeating the various steps when designing a
questionnaire. Further, although the temptation is sometimes great, one should never
omit a pretest of the questionnaire. Regardless of how good it looks in the abstract,
the pretest provides the real test of the questionnaire and its mode of administration.
At least two pretests should be conducted. The first should use personal interviews,
and after all the troublesome spots have been smoothed over, a second pretest using
the normal mode of administration should be conducted. The data collected in the
pretest should then be subjected to the analyses planned for the full data set, as this
will reveal serious omissions or other shortcomings while it is still possible to correct
these deficiencies. Observational forms generally present fewer problems of
construction than questionnaires because the researcher no longer needs to be
concerned with the fact that the question itself, and the way it is asked, will affect the
response. Observational forms do, however, require a precise statement of who or
what is to be observed, what actions or characteristics are relevant, and when and
where the observations will be made.


Some definitions of ATTITUDE:
1. Attitude represents a predisposition to respond to an object (not yet the actual
behavior toward the object). Attitude thus possesses the quality of readiness.
2. Attitude is persistent over time and changing a strongly held attitude requires
substantial pressure.
3. Attitude is a latent variable that produces consistency in verbal and physical
4. Attitude has a directional quality. It connotes a preference regarding the outcomes
involving the object, evaluations of the object, or positive/neutral/negative feelings for
the object.
Scales of Measurement:
- Measurement: the assignment of numbers to objects (e.g., consumers) in a
way that reflects the quantity of the attribute that the object possesses (e.g.,
preference for a brand).
1. Nominal Scale: A person’s social security number is a nominal scale (it
has numbers, but these numbers simply identify the individual). Ex.
Women are identified with “1” and men with “2”.
2. Ordinal Scale: We say that number 3 is greater than 2 and 1. In this case,
the larger the number, the greater the amount of the attribute we’re
3. Interval Scale: Here we can say that 80ºF is warmer than 40ºF, and the
difference in “heat” between 80ºF and 120ºF is the same as the
difference between 40ºF and 80ºF.
4. Ratio Scale: We can say that a person weighing 200 pounds is
twice as heavy as one weighing 100.

- Scaling of Psychological Attributes:

The more powerful scales (ratio, interval) allow stronger comparisons and
conclusions to be made than simpler levels of measurement (ordinal or
nominal).
1. Attitude-Scaling Procedures: The most common approach to measuring
attitudes is self-reports, in which people are asked directly for their
beliefs or feelings toward some stimulus. But they can also be measured
in other ways (observation of overt behavior, indirect techniques…).
▪ INDIRECT TECHNIQUES: Word association tests, sentence
completion test, storytelling…
memorize a number of facts about the extent of pollution and
then ask them (the things they remember are more consistent
with their own position).
▪ PHYSIOLOGICAL REACTIONS: electrical or mechanical means are
used to monitor a person’s response to some stimulus.

- Self-Report Attitude Scales:

1. Equal-Appearing Intervals: We want to compare the opinion that
people have of 5 different banks. Equal-appearing interval scaling
develops values for the statements (“a convenient location”,
“discourteous service” …) so we can assess the person’s attitude toward
these banks.
▪ SCALE CONSTRUCTION: Collect all the statements that people
have made into a list. A panel of judges then sorts them by
degree of favorability into 11 categories (A to K). Column A holds the
most unfavorable statements, F the neutral ones, and K the most
favorable statements.

The Q value is the interquartile range, which provides a measure of

▪ SCALE USE: We randomly select a few of the above statements
and prepare another survey for other respondents to tell us if
they agree or disagree with the statements.

2. Summated Ratings:
▪ SCALE CONSTRUCTION: We ask respondents whether they strongly disagree,
disagree, agree… with each statement, and we assign numbers to these levels: 1 if they
strongly disagree, 2 if they disagree…

3. Semantic Differential: It was found that the ratings tended to be
correlated and that three basic uncorrelated dimensions accounted for
most of the variation in ratings: an evaluation dimension represented by
adjective pairs such as good–bad, helpful–unhelpful; a potency
dimension represented by bipolar items like powerful–powerless,
strong–weak; and an activity dimension captured by pairs like fast–slow,
alive–dead, noisy–quiet. The same three dimensions tended to emerge
regardless of the object being evaluated. The general idea in semantic
differentials is to form scales that cover each of the evaluation, potency,
and activity dimensions.

4. Stapel Scale: It differs from the semantic differential scale in that: (1)
Adjectives are tested separately instead of simultaneously, (2) Points on
the scale are identified by number, and (3) There are 10 scale positions
rather than 7.

5. Importance Judgments: We might wish to assess how important each
attribute is to that person. For example, even though someone believes
that a bank has convenient hours, if “convenience” isn’t valued by that
person, then it wouldn’t affect his or her overall attitude toward the
bank. We can use the “Quadrant Analysis”.

6. Graphic: Graphic rating scales are becoming more important with
online surveys. Individuals indicate their rating by marking a point on a
line that runs from one extreme of the attribute to the other with the
7. Comparative rating scales: they involve relative judgments. Raters
judge each attribute with direct reference to the other attributes being
evaluated. For example, ask respondents to share 100 points among the
most important attributes a bank should have (they can give 50 points
for courteous service and 50 for low interest rates) → Constant sum
scaling method.
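
Three of the self-report scales above reduce to short calculations. The sketch below uses invented judge and respondent data. It assumes that the scale value of an equal-appearing-interval statement is the median pile chosen by the judges (a common convention, not stated in the notes) and that the summated scale has 5 points with the labels shown.

```python
from statistics import median, quantiles

# 1. Equal-appearing intervals: judges place each statement in one of 11 piles (A=1 ... K=11).
judge_piles = {
    "convenient location": [9, 10, 9, 8, 10, 9, 9, 10],
    "discourteous service": [2, 1, 2, 3, 2, 1, 2, 2],
}
scale = {}
for statement, piles in judge_piles.items():
    q1, _, q3 = quantiles(piles, n=4)            # quartile cut points
    scale[statement] = {"value": median(piles),  # assumed: scale value = median pile
                        "Q": q3 - q1}            # Q = interquartile range

# 2. Summated ratings: code each answer, then add the codes across statements.
LIKERT = {"strongly disagree": 1, "disagree": 2, "neither": 3,
          "agree": 4, "strongly agree": 5}
answers = ["agree", "strongly agree", "neither", "agree"]
summated_score = sum(LIKERT[a] for a in answers)

# 7. Constant sum: the points shared among attributes must add up to 100.
allocation = {"courteous service": 50, "low interest rates": 50}
allocation_is_valid = sum(allocation.values()) == 100

print(scale)
print(summated_score, allocation_is_valid)
```

A small Q is usually read as a sign that the judges agreed on where the statement belongs, so statements with a large Q are natural candidates to drop.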

Gendered languages assign masculine and feminine grammatical gender to all nouns,
including nonhuman entities. In French and Spanish, the name of the disease resulting
from the virus (COVID-19) is grammatically feminine, whereas the virus that causes the
disease (coronavirus) is masculine. In this research, we test whether the grammatical
gender mark affects judgments. In a series of experiments with French and Spanish
speakers, we show that grammatical gender affects virus-related judgments consistent
with gender stereotypes: feminine- (vs. masculine-) marked terms for the virus lead
individuals to assign lower stereotypical masculine characteristics to the virus, which in
turn reduces their danger perceptions. The effect generalizes to precautionary
consumer behavior intentions (avoiding restaurants, movies, public transportation,
etc.) as well as to other diseases and is moderated by individual differences in chronic
gender stereotyping. These effects occur even though the grammatical gender
assignment is semantically arbitrary.
They tested all this in several studies (experiments):
- Study 1: tested if grammatical gender of the virus affects danger perceptions
and precautionary consumer behavioral intentions. YES.
- Study 2: tested if the findings generalize to diseases other than COVID. YES.
- Studies 3 and 4: tested the process and theoretically relevant boundary
They measured mood and demographics in all studies, but their inclusion as covariates
did not materially affect the results, and participant gender did not interact with
grammatical gender.


Basic Considerations (to choose the perfect method):
The appropriate technique depends on the type of data, the research design, and the
assumptions underlying the test statistic and its related consideration, the power of
the test.
1. Type of Data
- NOMINAL (each number represents a distinct category, ex. “1” girls, and “2”
boys). Here the only measure of central tendency that is appropriate is the
mode. Ex. In a sample of 60 men and 40 women we cannot say that the
average gender is 1.6; we can say that 60% of the sample is “male”.
- ORDINAL (The numerals assigned reflect the order). The median and the mode
are now both legitimate measures of central tendency.
- INTERVAL (We can determine how much more one category is than another).
We still cannot state that “A is five times larger than B”, because the interval
scale contains an arbitrary zero. The mean, the median, and the mode are all
appropriate measures of central tendency.
- RATIO (it has a natural zero point, so we can sensibly say that “A is twice as
heavy as B”). All statistics appropriate for the interval scale are also appropriate for
a ratio scale (mean, median, mode).
Whether the data are nominal, ordinal, interval or ratio helps determine which
methods and statistics are meaningful for your data.
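
The rule "which measures of central tendency are allowed for which scale" can be written as a lookup table. This is a sketch using Python's standard `statistics` module; the `PERMISSIBLE` table restates the list above, and the helper name `central_tendency` and the satisfaction ranks are hypothetical.

```python
from statistics import mean, median, mode

PERMISSIBLE = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean"],
}
STATS = {"mode": mode, "median": median, "mean": mean}

def central_tendency(values, scale):
    """Compute only the measures that are meaningful for the given scale type."""
    return {name: STATS[name](values) for name in PERMISSIBLE[scale]}

# Nominal example from the notes: codes 1 = female, 2 = male, 60 men and 40 women.
gender = [2] * 60 + [1] * 40
print(central_tendency(gender, "nominal"))   # only the mode is reported
print(gender.count(2) / len(gender))         # share of men in the sample

# Hypothetical ordinal data: satisfaction ranks from five respondents.
ranks = [1, 2, 2, 3, 4]
print(central_tendency(ranks, "ordinal"))
```

The software will happily compute a mean of gender codes. The table is what stops us from reporting it.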
2. Research Design
- SAMPLE INDEPENDENCE (Dependent or independent samples?) → 2 samples
made, one for those who watched the advert, and the other for the ones who
didn’t (independent samples), or 2 samples, one for before watching the
advert, and the other for after (dependent).
- NUMBER OF VARIABLES (the number of measures per object also affects the
choice of analytical procedure). Ex. We might examine how the advert affects
consumers’ attitudes (variable 1) and also sales (variable 2).
- VARIABLE CONTROL (the control of variables that can affect the result) → ex.
One variable that would determine attitudes after or before watching the
advert is a previous use of the product. We must try to minimize this effect.
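
Why sample independence matters for the choice of test can be seen by analysing the same numbers both ways. The sketch assumes a t statistic is the tool of choice (the notes do not name one), uses the Welch form for independent samples, and works on invented attitude scores.

```python
from math import sqrt
from statistics import mean, stdev, variance

def independent_t(a, b):
    """Welch t statistic for two independent samples."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def paired_t(before, after):
    """t statistic for dependent (paired) samples, based on per-person differences."""
    diffs = [y - x for x, y in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical attitude scores (1-7) for the same five people before and after the advert.
before = [3, 4, 2, 5, 4]
after = [4, 5, 4, 6, 5]

t_paired = paired_t(before, after)
t_independent = independent_t(after, before)
print(f"paired: {t_paired:.2f}  treated as independent: {t_independent:.2f}")
```

The paired statistic is much larger here because differences between people cancel out within each pair. Treating dependent samples as independent throws that information away.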

3. Assumptions Underlying Test Statistic

The choice of a statistical method is linked to the assumptions and considerations of
our variables and samples.
Our point is to illustrate the fact that statistical tests depend on certain
assumptions for their validity. If the assumptions are not met, sometimes they can
be satisfied through a transformation on the data (e.g., change to log units).
Sometimes a different test statistic should be chosen that relies on different
assumptions, e.g., a distribution-free test.
Overview of Statistical Procedures
In the graphic you can see the methods you can use, answering the 3 questions we
explained above.
- Null Hypothesis:
A hypothesis may be rejected but can never be accepted except tentatively, because
further evidence may prove it wrong. In other words, one rejects the hypothesis or
does not reject the Hypothesis on the basis of the evidence at hand. It is wrong to
conclude, though, that since the hypothesis was not rejected, it can necessarily be
accepted as valid.
1. Type I error: when we reject a true null hypothesis.
2. Type II error: when we don’t reject a false null hypothesis, which we
should have, given that it’s false.
The type II errors are more difficult to control.
We need to frame the null hypothesis in such a way that its rejection leads
to the acceptance of the desired conclusion, the statement or condition
that we wish to verify.
The one-tailed test:

- Types of Errors:
1. Type I error: wrongly reject the null.
2. Type II error: wrongly do not reject the null.
Sample information will always be somewhat incomplete, so there will always be
some α error (the risk of a Type I error). The only way it can be avoided is by
never rejecting the null hypothesis. The confidence level of a statistical test is
1 - α, and the more confident we want to be, the lower we must set the α error.
One-tailed tests are more powerful than two-tailed tests because, for the same
α error, they’re simply more likely to lead to a rejection of a false null
hypothesis.
- Procedure:

- Power: It’s very important to specify correctly the risk of error.

The difference between the assumed value under the null hypothesis and the
true, but unknown value is known as the effect size.
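
Power can be computed directly once the effect size, the standard deviation, the sample size and the Type I error risk (alpha) are fixed. The sketch below assumes a z test on a mean with known standard deviation, which the notes do not specify. The function names and the numbers are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power_one_tailed(effect_size, sigma, n, alpha=0.05):
    """Power of an upper-tailed z test when the true mean exceeds the null value by effect_size."""
    shift = effect_size / (sigma / n ** 0.5)  # effect size in standard-error units
    return 1 - Z.cdf(Z.inv_cdf(1 - alpha) - shift)

def power_two_tailed(effect_size, sigma, n, alpha=0.05):
    """Power of a two-tailed z test for the same effect."""
    shift = effect_size / (sigma / n ** 0.5)
    z_crit = Z.inv_cdf(1 - alpha / 2)
    return (1 - Z.cdf(z_crit - shift)) + Z.cdf(-z_crit - shift)

# Hypothetical case: effect size 2, standard deviation 10, sample of 100.
one = power_one_tailed(2, 10, 100)
two = power_two_tailed(2, 10, 100)
print(f"one-tailed power: {one:.3f}  two-tailed power: {two:.3f}")
```

Power is 1 minus the Type II error risk, so the same functions show how that risk falls as the effect size or the sample size grows.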

Case Study: Cerenity Sanitizer

Cerenity: a toilet seat sanitizer.
They conducted 2 studies: first a qualitative one (interviews, observation) and then
The short-term aim of the strategy was to maximize trials leading to the product’s
acceptance. The long-term objective was to position Cerenity as a high-quality brand
in the market and build customer loyalty.
Aim of the study: testing consumers’ acceptance of the product. The findings were
expected to help the company introduce Cerenity in the market and design an
effective communication plan for its launch, and to help in deciding the target
segment for the product and developing an appropriate positioning strategy.
- Findings in the qualitative research:
1. Meaning of toilet hygiene (= clean toilets comfortable to use)
2. Perception of shared common toilets (infections)
3. Willingness to try the product
4. Low awareness of the product concept (they were not aware whether such a
product really exists in the market).
5. Doubts regarding product effectiveness
6. Need for product at workplace
7. Locations for product use (in public restrooms)
8. Spray as preferred product form
9. Experience with product trial among pilot users
10. Packaging and portability (small spray sizes)
- Important variables identified during the exploratory research:
1. Germ-killing effectiveness
2. Ease of use
3. Fragrance: strength and type of fragrance
4. Time of action: Instant dryness after application
5. Product form: spray, gel and liquid
6. Price
- Conclusive research: the team distributed a total of 40 product samples within
the IIMA campus. They prepared different questionnaires for pilot users and
non-users to capture similarities and differences in their evaluation of the
product. They also conducted surveys to collect primary data.
▪ Non-user group survey: data was used to understand the
purchase decision making process, to identify the target
segment and the optimal combination of features in the
▪ Pilot test survey: the data was used to analyze product
performance-related features and post-purchase decision
- Data Analysis: the team performed a regression analysis to understand the
factors driving customer’s willingness to purchase the product. Variables
▪ Likelihood of purchase
▪ Frequency of public toilet usage
▪ Health problems experienced in general public toilets
▪ Health problems experienced in toilets at workplaces
▪ Health problems experienced in toilets at malls
▪ Health problems experienced in toilets at airports
Dependent variable: likelihood of purchase
Independent variables: the others.
All the health problems (except the ones at airports) increase the likelihood of
The exploratory research identified 8 attributes (likelihood, comfort, ease of
use, confidence, carry, fragrance, price, germ-killing, dryness) that could
possibly affect the purchase behavior of the customer.
The effectiveness dimension didn’t play a significant role in the purchase
decision, but the convenience dimension (comprising comfort, ease of use,
confidence and ease of carrying) did.
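
The regression step in the data analysis can be illustrated with a one-predictor least-squares fit. This is only a sketch: the case used several independent variables, and the numbers below are invented, not the case data. The variable names `visits` and `likelihood` are hypothetical.

```python
def ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical respondents: public toilet visits per week and likelihood of purchase (1-7).
visits = [1, 2, 3, 4, 5, 6]
likelihood = [2, 3, 3, 5, 5, 6]

intercept, slope = ols(visits, likelihood)
print(f"likelihood = {intercept:.2f} + {slope:.2f} * visits")
```

A positive slope means the independent variable pushes the dependent variable (likelihood of purchase) up. With several predictors the same idea applies, but the fit needs matrix algebra or a statistics package.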

- Identifying Customer Segments: through a Cluster Analysis

Data was collected from respondents on the 6 attributes they looked for in the
product on a scale of 1-5 and their favorability toward the product. They
identified 3 different clusters to decide on the customer target segment:
▪ Cluster 1: the ones that frequently visited malls (their ratings on
the importance of price, ease of use, and ease of carrying were
the highest).
▪ Cluster 2: the premium sub-segment, which didn’t frequent
malls but rather planned to use this product at home (their
ratings on the importance of price, use, carrying were the
▪ Cluster 3: the middle sub-segment. They visited malls regularly
and weren’t price-sensitive; they gave more importance to the
other factors.
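
The notes do not say which clustering method the team used. As one possible illustration, the sketch below runs a plain k-means with 3 clusters on invented importance ratings (price, ease of carrying) on a 1-5 scale. The starting centroids are fixed by hand so the result is reproducible.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    labels = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        labels = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            k = dists.index(min(dists))
            labels.append(k)
            clusters[k].append(p)
        # New centroid = mean of the cluster; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return labels, centroids

# Hypothetical respondents: (importance of price, importance of ease of carrying).
ratings = [(5, 5), (5, 4), (4, 5),   # rate both attributes as very important
           (1, 1), (2, 1), (1, 2),   # rate both attributes as unimportant
           (3, 3), (3, 2), (2, 3)]   # in between
labels, centroids = kmeans(ratings, centroids=[(5, 5), (1, 1), (3, 3)])
print(labels)
```

Each respondent ends up with a cluster label, and the centroids summarise what each segment cares about, which is the information needed to pick a target segment.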

- Deciding Product Formulation and Pricing

The most important attribute of the product: the form.

- The Pilot: Product Performance Analysis

They distributed 40 samples, and the consumers then participated in a survey
and rated various attributes (likelihood of purchase, effectiveness of the
product…) on a scale of 1 to 7.
They found that “convenience” and “effect” play an important role in consumers’
repeat product purchase decisions, but money does not. The results suggested that
the best place to purchase the product would be medical shops / pharmacies.
About 70% of the users found the product easy and comfortable to use.

