This document is authorized for use only in IILM University's UG; BRM I course by Jasdeep Chadha from Jul 29, 2020 to Dec 31, 2022.

Indian Institute of Management, Ahmedabad

IIMA/MAR0258TEC

Technical Note

Note on Sampling in Marketing Research

A judicious application of marketing research findings to a real life situation requires some
knowledge about the extent of accuracy of data collected through the research study vis-a-vis
the underlying reality about which data were collected. Data are generally collected by the use
of some method of data collection from a number of respondents or situations. Respondents or
situations might be all those available in the target group or a few belonging to the target
group, as specified by the design of the study. In a very broad sense, therefore, collected data
could be erroneous on three counts: i) the method of data collection vis-a-vis the underlying
reality, ii) definition of the target group for research study vis-a-vis the actual desire, and iii)
the actual respondent/situation selected not representing the target group for the study. An
understanding of these types of potential errors in data collected through a research study
(whether based on secondary or primary sources of information) helps the researcher in two
phases: designing the research study and analysing and interpreting the data/findings. The
following example would make this clear.

Suppose a researcher is interested in finding out the percentage of households owning a TV
set in Ahmedabad. The researcher would first have to define the limits of Ahmedabad and the
term "household" as closely as possible to the meaning the decision maker had in mind. He
would then have to operationalize these definitions so that he could distinguish a household
from other kinds of establishment (would a shop-cum-household be classified as a household?)
and determine whether it falls within Ahmedabad or not. This step is called "frame" preparation
(discussed later). Even if the researcher was able to enumerate all households, the number and
percentage of households owning a TV set might still be wrong because of the method of data
collection (What does owning mean? How should this be ascertained?).

Finally, cost, time and other constraints might force the researcher to get information from
only a limited number of households instead of the enumeration (census) mentioned above. The
group of households from whom the information is collected might not be representative of all
the households in Ahmedabad: neither the number of households nor their composition might be
adequate to arrive at a reliable value of the percentage of households owning TV sets.
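
The gap between a census value and a sample value in the TV ownership example can be illustrated with a small simulation. This is only a sketch under stated assumptions: the frame of 50,000 households, the 30 per cent ownership rate and the sample of 500 are hypothetical figures, not data from any study, and the sketch covers only error (iii) (the sample not representing the target group).

```python
import random


def simulate_households(n_households, true_share, seed=0):
    """Build a hypothetical frame: True means the household owns a TV set."""
    rng = random.Random(seed)
    return [rng.random() < true_share for _ in range(n_households)]


def ownership_percentage(households):
    """Percentage of households in the list that own a TV set."""
    return 100.0 * sum(households) / len(households)


# Hypothetical figures, chosen only for illustration.
frame = simulate_households(n_households=50_000, true_share=0.30, seed=1)
census_value = ownership_percentage(frame)      # value from full enumeration

rng = random.Random(2)
sample = rng.sample(frame, 500)                 # simple random sample of 500
sample_value = ownership_percentage(sample)     # value from the sample only

print(f"census: {census_value:.1f}%   sample of 500: {sample_value:.1f}%")
```

The two printed values will generally differ a little; that difference is the sampling error which the rest of this note is concerned with controlling.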

This note would discuss the methods of selecting respondents/situations which have a
bearing on the understanding of errors (ii) and (iii) above. The methods of data collection
would be dealt with in a separate note. A perusal of the above example reveals the essential
steps in a method of selecting respondents/situations in the study. These are:

Prepared by Professors Abhinandan K. Jain and M.N. Vora with the assistance of Ms. Preeta Vyas,
Research Associate, Marketing Area, Indian Institute of Management, Ahmedabad.

© 1980 by the Indian Institute of Management, Ahmedabad.



- defining the target group(s) and sampling frame(s)
- choice of a basic approach, i.e. obtaining information from all members of a target group
(census) or using some method of sampling: non-probability methods, methods
based on classical statistics, or the Bayesian approach
- choice of a specific method of sampling from among the ones available under the basic
approach
- size and composition of the sample.

There are some important considerations for sampling in marketing research. These would be
discussed in Section I of this note. Section II would deal with defining the target group of the
study and a sampling frame. Pros and cons of different basic approaches along with their
applicability in specific situations would form the subject matter of Section III. Sections IV, V
and VI would provide a non-technical description and pros and cons of the specific sampling
methods under the three basic approaches to sampling, i.e. non-probability sampling, classical
statistical methods, and Bayesian methods respectively. Section VII would provide some
guidelines for choosing a method of sampling once the broad approach has been decided.
Section VIII will deal with special problems of sampling for experimental research designs. A
separate section (Section IX) would discuss issues relating to pretesting and field training
which would help in effectively implementing the sampling plan in the field. The note would
conclude with a discussion of some important issues in sampling.

I. IMPORTANT CONSIDERATIONS

A research project in marketing deals with information from respondents/situations as
prevailing in the marketing context. If the research project is intended to help a decision maker
in resolving his problem, it should be conducted within the time and cost available for the
project and should provide valid and reliable information for the specific purpose of research.
A marketing researcher would, therefore, be able to devise a much better sampling plan for his
research design if he was aware of the important considerations arising out of a)
characteristics of marketing situations, b) cost and time aspects of sampling, and c) purpose of
research. Each one of these would be discussed briefly in sub-sections A, B, and C. Sub-section D
would provide conclusions on the topic.

A. Characteristics of Marketing Situations

A number of important characteristics of marketing situations need to be noted. Marketing
populations tend to be somewhat clustered in limited geographic locations as well as finite
and limited in size. They are also characterized by a relatively high degree of mobility and
lower degree of cooperation. Marketing executives, in general, have considerable prior
knowledge about the population of interest. All these characteristics, and probably some more,
have important implications for sampling in marketing research.

Clustering and small size of marketing populations imply the use of sampling methods
which take care of these characteristics, as against general statistical methods which assume
infinite size and highly diffused populations. Availability of prior knowledge about parameters
and relationships generally implies that a small sample would suffice. High degree of mobility
and/or low level of cooperation indicates a strong need for effective sample substitution rules
at the field level (number of callbacks in case the respondent is not available, etc.). Designing a
probability sample, therefore, requires special attention to the implications of such marketing
situations. In case methods are not available to take care of such implications, even a
probability sample might turn out to be a non-probability sample.
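
One way to see why the finite size of marketing populations matters is the finite population correction, a standard textbook adjustment that this note itself does not spell out. The sketch below assumes simple random sampling without replacement; the function name and the figures (a sample of 100 from a population of 400) are hypothetical.

```python
import math


def standard_error_of_proportion(p, n, population_size=None):
    """Standard error of a sample proportion under simple random sampling.

    If population_size (N) is given, the finite population correction
    sqrt((N - n) / (N - 1)) is applied; otherwise an infinite population
    is assumed, as in general statistical methods.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


# Hypothetical case: 100 units sampled, observed proportion 0.5.
se_infinite = standard_error_of_proportion(0.5, 100)
se_finite = standard_error_of_proportion(0.5, 100, population_size=400)

print(f"infinite population: {se_infinite:.4f}   population of 400: {se_finite:.4f}")
```

With a small population the same sample gives a smaller standard error, which is one reason a method built for infinite populations can overstate the sample size a marketing study needs.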

B. Time and Cost Estimates for Sampling

There are no hard and fast rules about the proportion of total project time and money spent on
field work. However, as a rough guide, field work takes approximately 40 to 50 per cent of
both these resources. An estimate of these should be made by the researcher. The accuracy of
the estimate generally depends on the kind and the duration of experience a researcher has. It
has often been seen in practice that the time and money budgeted for field work are
underestimates. The researcher should, therefore, carefully estimate the time and
money to be spent on planning and executing a desired sample plan.

In terms of planning, the question of whether to use a census or a probability or
non-probability method itself takes considerable thought and time (to be discussed later). Much
more time goes into selection of a specific method and working out its details. Time spent on
executing a sampling plan includes i) planning to locate respondents in the field, ii) time to
travel to meet respondents, and iii) callbacks in case respondents are not available. In terms of
both planning and execution of a sampling plan, probability sampling methods are more time
consuming and costlier than non-probability sampling methods.
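
The three components of execution time listed above can be combined into a rough field-time estimate. The sketch below is an illustrative assumption, not a method given in this note: it treats every visit (first call or callback) as taking the same time, assumes the same not-at-home rate on every attempt, and uses hypothetical figures throughout.

```python
def estimate_field_days(sample_size, interviews_per_day, not_at_home_rate,
                        max_callbacks, planning_days):
    """Rough field-work duration in investigator-days.

    Each callback round revisits the respondents still not reached,
    assuming the same not-at-home rate on every attempt.
    """
    visits = 0.0
    pending = float(sample_size)
    for _ in range(1 + max_callbacks):      # first call plus callbacks
        visits += pending
        pending *= not_at_home_rate         # those still to be revisited
    return planning_days + visits / interviews_per_day


# Hypothetical plan: 400 respondents, 8 visits a day, 25% not at home,
# up to 2 callbacks, 5 days of planning to locate respondents.
days = estimate_field_days(sample_size=400, interviews_per_day=8,
                           not_at_home_rate=0.25, max_callbacks=2,
                           planning_days=5)
print(f"estimated field work: {days:.1f} investigator-days")
```

Even this crude calculation shows how callbacks alone add about 30 per cent to the number of visits in the hypothetical plan, which is one reason field budgets tend to be underestimated.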

C. Purpose of Research

The purpose of research influences sampling decisions quite significantly. Implications of each
of the purposes of research for sample designs are described below.

1. Exploratory Research

Exploratory research is concerned with identification of variables and/or the broad nature of
relationships among the variables. It is not concerned with estimation and/or building
inferences about variable values or relationships among variables. Implications of such
purposes in terms of sampling are i) study of a relatively small number of
respondents/situations and ii) sample respondents/situations which, taken together, represent
the extreme views of the phenomenon under study, so that the largest possible number of
variables and kinds of relationships among variables is discovered.

The task of the researcher in sampling for exploratory purposes reduces to identification of
specific sources which could represent the extremes of the phenomenon under study. Having
identified such sources, the desired judgmental sample could be obtained by assessing the
accessibility and willingness/feasibility of such sources providing information.

In terms of specific research designs, the above method could be directly used for case studies,
in-depth personal interviews, and motivation research designs for exploratory purposes. The
group discussion research design puts a further limitation: a group is generally more effective
if the number of members is between 8 and 16.

2. Descriptive Research

Descriptive research studies are concerned with descriptions and inference building about
population parameters and the relationships among two or more variables. Descriptions and
inferences could be qualitative or quantitative in nature. In either case, drawing inferences
about population parameters on the basis of sample information implies that the sample
design, as far as possible, be probabilistic and should represent a fair cross-section of the
population.

The kinds of research designs used for descriptive research, as indicated in the Note on
Marketing Research Designs, include case study, in-depth personal interviews, group
discussions, motivation research and surveys. All these research designs except the last
require that the sample size be rather small. This is because of time and cost implications as
well as the non-availability of skilled personnel to conduct such research. The sample size,
however, would have to be large enough to give the researcher confidence in drawing
inferences. Also, the aspect of fair cross-section implies that a few respondents representative
of different segments of the population be included for collecting information. In the case of a
survey design, sample size could be large as the cost and time requirements per respondent are
relatively small and personnel for conducting such research are generally available.

Qualitative inferences are to be drawn on the basis of rigorous qualitative logic. However,
quantitative inferences about parameter values and degree of association are drawn by the use
of certain statistical techniques. The use of specific techniques has some implications for
sample size and composition decisions.

For example, the researcher may want to estimate several parameter values with some
degree of confidence. This requirement, through a formula, determines sample size. In the case
of drawing inferences about relationships among two or more variables, the statistical
technique influences the sample size and its composition. For example, the use of the Chi-square
technique to infer whether there is any significant association between two or more variables
requires that the expected number of observations in individual cells of the cross-tabulations
of variables be at least 5. Similarly, the number of observations in a multiple regression
analysis (one dependent variable and several independent variables) should be significantly
larger than the number of variables under study (including the dependent variable).
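
The Chi-square requirement can be checked before or after field work by computing the expected cell counts of a cross-tabulation under independence (row total times column total, divided by the grand total). The following is a minimal sketch; the function names and both tables are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence for a cross-tabulation.

    table is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def meets_minimum_expected_count(table, minimum=5):
    """True if every expected cell count is at least `minimum`."""
    return all(cell >= minimum
               for row in expected_counts(table)
               for cell in row)


# Hypothetical cross-tabulations (e.g. ownership by income group).
large_sample = [[30, 10], [20, 40]]   # 100 respondents
small_sample = [[3, 1], [2, 4]]       # 10 respondents

print(expected_counts(large_sample), meets_minimum_expected_count(large_sample))
print(expected_counts(small_sample), meets_minimum_expected_count(small_sample))
```

If the check fails, the usual remedies are a larger sample or a sample composition that puts more respondents into the sparse categories.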

3. Causative Research

Studies conducted for drawing causal inferences in marketing could use rigorous
non-quantified logic (generally in the case of in-depth, case, and other such research designs)
and degree of association among variables (generally in the case of survey research and
experimentation). The implications of causal research designs other than experimental research
designs have already been indicated under descriptive research above. The experimental
designs have two components of sampling. One of the components is the number of sample
groups and the other is the size and composition of each such group. The first determines the
nature of the experimental design itself (already dealt with in the Note on Marketing Research
Designs). The second is dealt with here. One of the important characteristics of experiments is
that the groups (individuals) subjected to different experimental treatments (as also the
group/individual(s) being used as control) should be matched and each group should have a
sample of respondents which represents the population. However, the implications are different
in the case of the two major types of experimental designs, i.e. laboratory and field.

The researcher has some flexibility in selecting respondents for laboratory experiments; he
requests/recruits the sample either personally or by advertisements or any other means. He
could most probably succeed in getting a sufficient number of potential respondents from whom
he can select those with characteristics similar to the target population. However, he must bear
in mind that i) the selection process might introduce some bias (e.g. advertising for
co-operation, particularly with payment, could draw a number of potential respondents who are
more interested in curiosity value/money and who differ from respondents randomly drawn
from the population) and ii) formation of matching groups might not be possible.

Field experiments could be instore tests (for a number of decision areas like
product/package/price/point of purchase, etc.) or non-instore tests (for decisions like media
advertising, etc.). In the former case, selection of matched stores and, in the latter, selection
of matched towns/areas become important. Moreover, those by themselves would not
necessarily be representative of target store/town characteristics. In such cases, the
researcher could try for matching to the extent possible. Whatever differences might
remain would have to be taken care of either outside the experimental design or by a suitable
experimental design.
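
Matching of stores (or towns) can be sketched as follows. This is one simple approach assumed for illustration, not a procedure prescribed by this note: stores are sorted on a single matching variable (here a hypothetical monthly sales figure), adjacent stores are paired, and a random draw decides which member of each pair receives the treatment.

```python
import random


def assign_matched_pairs(stores, seed=None):
    """Pair stores with similar sales and randomly assign test/control.

    stores: list of (name, monthly_sales) tuples. Stores are sorted by sales
    and adjacent stores are paired; with an odd number of stores the one
    with the highest sales is left unassigned.
    """
    rng = random.Random(seed)
    ordered = sorted(stores, key=lambda store: store[1])
    pairs = []
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)                   # random choice of the test store
        pairs.append({"test": pair[0], "control": pair[1]})
    return pairs


# Hypothetical stores and sales figures.
stores = [("A", 100), ("B", 300), ("C", 110), ("D", 290), ("E", 500)]
pairs = assign_matched_pairs(stores, seed=3)
for pair in pairs:
    print(pair)
```

In practice matching would use several characteristics (location, clientele, store size), and any remaining differences would still have to be handled in the design or analysis, as noted above.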

D. Conclusion

The three types of considerations were described separately only to provide a better
understanding. However, their implications for sample design should be viewed by
considering them together. It is left to the reader to operationalize this in the context of
the specific research project that he is engaged in. However, some additional observations are
in order.

At times researchers would knowingly violate some sampling principles or sacrifice some
sampling precision to avoid more serious non-sampling errors. High degrees of precision and
fidelity to sampling principles tend to characterize academic research projects and
government projects that are not under time pressure. Applied projects which facilitate
business decisions often face serious time perishability, in which late results have little value.
Executives typically do not require high statistical confidence limits for making their decisions.
Such considerations imply that, in a decision making situation, although the researcher might
attempt an ideal sampling plan, he might be forced to use simplified procedures to make the
results meaningful for decision-making.

II. POPULATION DEFINITION AND FRAME PREPARATION

A. Universe: Sampling Frame

Once the researcher has determined his information needs - especially of primary information
to be collected - he must identify the relevant source from where such information could be
generated. All units of such a source of information are termed the universe or population
in the literature on sampling methods. It is from this universe that sample units are selected. If
the units of the universe (i.e. units of source of information) are very few, the researcher might
decide to approach all these units, i.e. use a census approach to generate the required
information. More often, however, the researcher decides to approach only a fraction of these
units - a sample - and in such cases a clear definition of the universe and an arrangement of the
universe in a manner that facilitates selection of sample members (sampling frame preparation)
will have to be carried out.

B. Need for Preparation of Sampling Frame

Defining the universe and preparation of a sampling frame could be a simple affair in the case
of many industrial marketing research situations or a very difficult and time-consuming task
in the case of marketing research for consumer products. For example, an organization whose
products are used by airlines can make a list of all airlines operating in India in a matter of
minutes. Of course, the researcher will still have to decide "who" in the selected airline
organization should be contacted for information generation. For many other specialized/
sophisticated industrial products, users could be identified and listed very easily, because they
are very few and well-known. For other situations a somewhat larger effort might become
necessary to prepare the frame, i.e. listing all relevant units of the source of information and
serializing the list. Many a time readymade lists might be available which could approximate the
universe quite satisfactorily. Hence the researcher might decide to acquire and use such a list
rather than spend effort in preparation of such a list/frame.

In terms of industrial organizations as a source of information, a census of manufacturers -
large and small - could provide a master list. The Director-General of Trade and Development
prepares such a list because all DGTD units are supposed to report certain data regularly to
DGTD. There is, however, no such requirement in the case of small scale manufacturers and
hence no up-to-date list of such organizations is available. There was, however, a census of
small scale industries carried out in 1972, and a list from that should be available. Such a master
list is broken down into a number of parts, probably using a certain kind of industry-wise
classification. Availability of such sub-lists (arrangements) could help in putting together a
frame of all relevant kinds of manufacturers. For many industries, directories or association
members' lists will be available.

An important thing which should be noted about the universe or population for marketing
research purposes is the natural clustering of such units, as compared to the atomistic or highly
diffused population which is generally assumed in sampling literature. For example, if one was
interested in the population of Indian composite textile mills, it would be apparent that nearly
80% of all textile mills are clustered in two or three cities of India. Similarly, if one was trying
to prepare a frame for heavy and medium users of toothpastes, etc., the four metropolitan cities
of Bombay, Calcutta, Delhi and Madras might account for more than 50% of users/usage.
However, a master list of heavy and medium users of toothpastes will not be available. In
another study a researcher was interested in collecting information from textile wholesalers in
India and he was told by knowledgeable persons that they are located only in Ahmedabad and
Bombay, and even there clustered in a few physical locations. However, there was no master
list of all of these in one place which could be used as a frame. There were wholesalers'
associations who had booklets listing all their members but these did not include all
wholesalers. Hence, an actual census of these physically clustered (into a few markets)
wholesale organizations was carried out to prepare the frame.

Researchers have to consider the trade-off between using an incomplete or slightly less
representative available list (made for some other purpose by some other persons/
organizations) and the cost and time required to prepare a master list afresh. More
often researchers prefer to use available ready lists like voters' list, radio owners' list (from P &
T department licences), house owners/tenants list (from municipal records), automobile
owners' list (from regional transport office), telephone directory (i.e. telephone owners/users),
a trade or industry directory, etc. Even while using such lists the researcher will have to take a
few steps to make them ready for drawing a sample.

A researcher was interested in taking a random sample of Ahmedabad households. He
wanted to use the municipal records. He found that these records were kept in a number
of big registers. First, he looked through some of these registers to examine whether they were
complete. He found that some entries were cancelled, some of the registers were missing,
there were inconsistencies in the way things were recorded in different registers, etc. He found
to his surprise that if he wanted to select the sample members using a random sampling
method, he would have to put in a considerable amount of effort to straighten out these
registers (frame). In fact it took two research assistants three man-days each to prepare the
frame and identify the selected sample units from the registers.

C. Steps/Activities Required in Preparation of Frame

In preparing a sampling frame the researcher has to carry out various activities and make
various decisions.

a) He will have to make a search to locate available lists/frames which could be relevant
for his study purpose. Contextual familiarity and prior knowledge of relevant
population would be very useful for this purpose.

b) He will have to decide whether to prepare a sampling frame afresh or to use some
available list/frame. Again prior knowledge/experience is very valuable.

c) If he decides not to prepare a sampling frame afresh but to use an available list, he will
have to choose a list out of the alternative lists available. The important considerations for
this are (i) representativeness of the list with respect to the relevant population/universe, (ii)
up-to-dateness of the list (most of the lists are historical documents prepared at a point
of time), (iii) the reliability of the list in terms of images and perceptions about the
organizations/persons who have prepared such lists, (iv) the purposes for which the
lists were prepared, and (v) the ease or difficulty of using the list for actual selection.

d) The researcher will have to work upon the available list to make it ready for sample
selection. This might involve activities like updating the list, serializing the list, and, if
special sublists or supplementary sublists (as in the case of a telephone directory)
form part of the list, deciding how to handle them or the way in which they
should be combined, and so on.
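
The work described in step (d) can be sketched in a few lines. The record layout below (a "name" field and an optional "cancelled" flag) and the sample entries are hypothetical; real registers, like the municipal records mentioned earlier, would need rules suited to their own inconsistencies.

```python
def prepare_frame(raw_entries):
    """Turn a raw list into a serialized sampling frame.

    raw_entries: list of dicts with the hypothetical keys "name" and
    "cancelled". Cancelled entries and duplicate names are dropped; the
    surviving units are given serial numbers starting from 1.
    """
    seen = set()
    frame = []
    for entry in raw_entries:
        key = entry["name"].strip().lower()     # crude duplicate detection
        if entry.get("cancelled") or key in seen:
            continue
        seen.add(key)
        frame.append({"serial": len(frame) + 1, "name": entry["name"].strip()})
    return frame


# Hypothetical raw list with a duplicate and a cancelled entry.
raw_list = [
    {"name": "Shah Traders"},
    {"name": "shah traders "},
    {"name": "Patel & Co", "cancelled": True},
    {"name": "Mehta Textiles"},
]
frame = prepare_frame(raw_list)
for unit in frame:
    print(unit["serial"], unit["name"])
```

Serial numbers matter because a random selection procedure picks numbers, not names; the frame is what turns a drawn number back into a unit to be approached.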

D. Impact of Characteristics of Marketing Situations and Research Purposes on Sampling Frame Preparation

In an earlier section we had described important characteristics of marketing situations which
would influence various sampling decisions. We had mentioned that, as compared to the
assumption of diffused and dispersed location of elements of the population or universe,
marketing studies' populations are generally clustered (composite textile mills, textile
wholesalers, etc.). This feature makes the frame preparation task much easier. Similarly, a
marketing researcher's prior experience and understanding of various marketing populations
could help in defining, with considerable ease, the relevant population/universe for a specific
study or even a stratum of the population very relevant for the study. This
prior experience might also include knowledge of alternative lists which could possibly be
used for this purpose.

Research purposes also would influence frame preparation. For studies which are aimed at
identifying new variables (exploratory studies), small samples would be enough. What might be
required is a fair cross-section of sample units. For such studies frame preparation might be
comparatively easy. If no random process was to be used for selection, frame preparation
might require much less care and effort. It is only in those studies whose purposes are to draw
inferences, whether for description, explanation or action decisions, that reliability of a higher
order becomes necessary, which requires some kind of probabilistic sampling method. Much
more care and effort would be required for preparation of the sampling frame for such studies
because the researcher wants to know the "chance" of selection of each individual element for
his estimation purposes - to draw the required inferences.

III. SELECTION OF BASIC APPROACH TO SAMPLING

The decision about the sampling procedure to be used for a particular study is generally a
two-step process. First, the researcher has to make up his mind about the basic sampling
approach he needs to use, i.e. whether probability, non-probability or Bayesian. Second, he has
to select a specific sampling method out of the alternative methods available under the basic
approach selected. This latter choice is discussed in another section after the section on
description and explanation of various sampling methods. In this section attention will be
given to the first, namely, selection of the basic sampling approach.

The earlier section outlined some important considerations which influence the choice of the
sampling approach and method to be used, as well as the sample size which would be necessary.
The influence of these considerations on the selection of the basic sampling approach is
described now.

The kinds of choices which are available are:

a) Should one approach all the units in the population or only a fraction (sample) of units?

b) If a fraction of units was to be selected, should one use a probability or a non-probability
(purposive) approach for sample selection?

c) If the probability sampling approach is to be used, should one use a traditional
probability sampling approach or a Bayesian one?

Once the decision about the basic approach to be used is made, the next step will be to
consider pros and cons of several alternative methods available under each one of the
probability and non-probability basic approaches and then make the choice of the specific
method.

a) Census vs. Sample: Whether one should use a census approach or a sample (fraction of
units) approach is a somewhat easy question to answer. If the units of source of information
(respondents, marketing situations) are very few and easily accessible, the researcher might
decide to use all of these units and hence the sample selection question is redundant. However,
in many marketing research situations (both industrial and consumer products), there are
either a few (50-100) units of source of information or a large number (a few lakhs) of units in
the population. In the first case, the researcher might feel that he can get the needed
information with the desired level of accuracy without necessarily approaching all units, even
though reaching all of them is not impossible. The question of sample selection then becomes
relevant and he has to decide which units he would use and which he would not. In the second
case, he invariably has to use a sample approach, because the census approach is impossible,
too costly, or too time-consuming.

He generally uses the criteria of costs, time, and desired accuracy level in deciding whether to
use the census or sample approach and decide how large the sample should be. He knows that
cost and time required to approach all the units would be higher (in the second case beyond
limits) than using only a portion of these units. In terms of desired accuracy level of
information to be collected he might find out that approaching only a portion of units is not
necessarily going to affect unfavourably the accuracy level of information collected.

In some situations, the researcher might decide to use an approach which approximates a
census approach. For example, if in a marketing situation about 80 to 90% of potential
respondents are located in one or two cities, he might approach all units in these cities,
disregarding the units which are sparsely scattered over several other cities. Unless there are
strong reasons to believe that the units which are excluded from the frame are significantly
different from the units included in the frame for these cities, approaching all units in these
cities would be tantamount to taking a census.
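
The near-census idea can be expressed as a small calculation: take cities in descending order of the number of units until the desired share of the population is covered. The city names and unit counts below are hypothetical, and the sketch deliberately leaves out the judgement about whether the excluded units differ from the included ones.

```python
def cities_for_coverage(units_by_city, target_share):
    """Fewest cities (largest first) whose units reach target_share of the total.

    Returns the chosen cities and the share of units they actually cover.
    """
    total = sum(units_by_city.values())
    chosen, covered = [], 0
    ranked = sorted(units_by_city.items(), key=lambda item: item[1], reverse=True)
    for city, units in ranked:
        if covered / total >= target_share:
            break
        chosen.append(city)
        covered += units
    return chosen, covered / total


# Hypothetical distribution of 100 potential respondents over five cities.
units_by_city = {"City A": 45, "City B": 40, "City C": 6, "City D": 5, "City E": 4}
chosen, share = cities_for_coverage(units_by_city, target_share=0.80)
print(chosen, f"{share:.0%}")
```

Here two cities cover 85 per cent of the units, so a census of those two cities approximates a census of the whole population at a fraction of the travel cost.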

It should be noted that the census approach does not always result in more valid and
reliable information than a sample approach because, in a way, one increases
the chance of enlarging the non-sampling error by approaching a larger number of units.

b) Probability vs. Non-probability Sampling Approaches: Once the decision is made that
only a fraction (sample) of units of source of information is to be approached, two questions
become relevant:

i) How many units should be approached (sample size)?

ii) How should the sampling units to be approached be identified (i.e. selected)?

The size of the sample has a serious implication for the cost of the study, because the larger the
size of the sample to be approached, the greater the cost of the study. Even the method of
identifying (selecting) sample members could have an impact on the cost even though the
sample size is the same. For example, if the method leads to selecting units which are
scattered over a wide geographical area, travel costs increase sharply, and the number of
interviews an average investigator can complete in a normal day goes down significantly,
thus having time implications too.

Therefore, in terms of cost, time, and resource considerations, one would choose such a size
of the sample and use such a method of selection as would bring equally accurate results at the
lowest cost/time/resources.

In non-probability sampling, sample units are sometimes selected without identifying or
preparing any sampling frame. These are selected either using the researcher's or interviewer's
judgement, or specific sample units are selected because of convenience or ease of study, or
explicit guidelines (to interviewers) are given as to how sample units are to be selected.

In probability sampling the researcher does not delegate the task of selection to interviewers at
all. He also uses a rigorous random selection process which avoids any bias of the researcher
or interviewers. He will ensure that he has serialized the sampling frame available and then
use random number tables to select sample elements. He would know the "chance"
(probability) of selection of any sample unit. He can later on use well-known statistical
methods for making precise quantitative estimates of population characteristics. He would be
able to estimate the error due to sampling and therefore would be able to make inferential
statements about the population values to be falling within certain limits with certain
confidence levels. In addition, he can use a well accepted statistical formula to arrive at
optimal sample size, if he uses the probabilistic sampling approach.
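
As an illustration of the inferential statements described above, the following minimal sketch estimates a population proportion from a probability sample and attaches limits at a stated confidence level. It assumes a simple random sample, uses the normal approximation, and ignores any finite population correction; the function name and the survey figures are hypothetical.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Return (estimate, lower, upper) for a population proportion.

    Uses the normal approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)  # error due to sampling
    margin = z * std_error
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# Hypothetical survey: 120 of 400 randomly selected households own a TV set.
estimate, low, high = proportion_confidence_interval(120, 400)
print(f"Estimated ownership: {estimate:.1%} (limits {low:.1%} to {high:.1%})")
```

No comparable calculation is available for a non-probability sample, because the chance of selection of each unit is unknown.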

The purpose of any research study is to collect relevant, reliable and useful information.
Selection of the basic sampling approach affects the reliability of the information collected.
When one uses only a sample of units of source of information to generate/collect required
information, one would like to ensure that the sample is adequately representative of the
population so as to result in collection of information of the desired level of accuracy.

The question of probability vs. non-probability sampling relates to the question of
"representativeness" and its consequences in terms of the error in population characteristics'
estimates which are made using sample information. It is generally believed that the
probability sampling approach would lead to more accurate estimates than non-probability
methods. In fact, if a sample is selected using a non-probability sampling approach the
researcher does not have any statistical methods (tools) to arrive at estimates of population
characteristics using sample information.

As mentioned in the note on research designs, if the major thrust of the research is exploratory,
namely, identifying important dimensions, etc. the researcher is not interested in making
precise quantitative estimates of the value of population parameters. In such situations, a
small sample size is accepted as quite satisfactory and non-probability or purposive sampling
approach is used. Whatever inferences (if at all) are to be drawn about relevant dimensions or
formulation of plausible hypotheses are not based on quantitative estimation of values but use
only qualitative logic.

Non-probability sampling is used not only for exploratory research but also for other kinds of
research, namely, descriptive and causative research. But the researcher in many such
situations is ready to tolerate a considerable degree of error in his findings and/or estimates, or
is not bothered by the lack of knowledge of the extent of error due to sampling. What he
probably wants is a believable basis for the judgement he wants to make.

The probability sampling approach involves more cost and time, and the trade-off is between
this cost and time and the accuracy requirements. It is, therefore, used when the researcher is
interested in drawing precise inferences and making quantitative estimates of population
characteristics. It is therefore used more in descriptive and causative research studies.

In summary, we can state that (i) decisions of great economic importance require the use of
information of known accuracy (using probability sampling) even though the total cost of
obtaining it may appear to be high by comparison with the cost of securing a comparable
number of interviews (same sample size) with non-probabilistic sampling; probability
sampling will have to be used in such circumstances; (ii) the relatively high fixed cost of
probability sampling is a disadvantage of the approach if the budget available is extremely
limited or if sampling is resorted to infrequently; (iii) the use of probability sampling may
sometimes be completely out of the question, i.e. there are some investigations that probably
cannot be conducted without selecting sample units by methods other than probability
sampling; (iv) when the required information is of relatively minor importance
and great accuracy is not required, non-probability sampling would work well.

Non-probability sampling seems to have arisen when practical men, confronted with the need
for information, selected samples by rough and ready standards which seemed feasible and
plausible. As this is a frequently used approach it needs to be given its due importance and
respect and is not to be derided. Most uses of non-probability sampling in practice have
contained a tinge of probability, randomness, and absence of bias (objectiveness). The
purpose is to get a fair cross-section of sample members. The bias in non-probability
sampling can be reduced if discretion (in selection) is lessened or controlled. This can be done
through use of stratification, specifying explicit procedures to interviewers for selection and
replacement, etc.

c) Traditional Probabilistic Sampling vs. Bayesian Approach: The choice of traditional vs.
Bayesian probabilistic sampling approach has a major impact on sample size. Under the
Bayesian approach, the sample size required would generally be smaller. If the problem is to
decide on how much to spend on information gathering (i.e. study) this can be meaningfully
handled only using the Bayesian approach, which gives explicit considerations to the costs of
sampling and the costs of wrong decisions, in addition to the risks of making wrong decisions.
Under the Bayesian approach prior and posterior analysis is carried out to mesh the probables
and take advantage of prior experience and judgement.

Traditional probability sampling does not permit the utilization of a prior probability
distribution over the parameter(s) of interest. Instead, the parameter is considered to be fixed.

The confidence interval within which a parameter's value is supposed to fall is generally
stated when traditional probability sampling is used. Even though traditional statistics do not
attach much meaning to such statements of confidence intervals, etc. the decision maker is
likely to use them as if he believed in these confidence level statements. Traditional statistical
inference utilizes only "known" conditional distributions of sample statistics, given specified
parameter values. No probabilities are assigned (either before or after) to the possible
parameter values themselves.

It is easy to make a choice in terms of Bayesian vs. traditional probabilistic approach because
the Bayesian approach is much superior from the viewpoint of the decision maker in terms of
costs and drawing useful inferences. However, the calculations¹ involved in the Bayesian
approach are cumbersome and difficult and require a lot more effort and understanding from
the researcher as well as the executives. Because of the ease and familiarity of the traditional
probabilistic approach, it is used much more widely than the Bayesian approach, even
though the latter is superior.

IV. NON-PROBABILITY METHODS OF SAMPLING

In the case of non-probability sampling techniques, the judgement of the researcher plays an
important role. As sample selection is not based on any statistical tools, estimation of the error
becomes difficult. Judgement sampling, convenience sampling and quota sampling are the
methods used for sample selection. Each of these is described briefly below.

A. Convenience Sampling

The researcher selects a sample purely on the basis of convenience. Determination of the
sample size is on the basis of availability of time and cost. This technique is suitable when the
time and the resources available for a study are very limited, quick decisions are to be
made, the elements of the universe are quickly accessible and homogeneous, and the desired level of
accuracy is not high. The major limitation of this method is that bias of the investigator plays a
very important role in sample selection. An objective assessment of the results is not possible
as the sampling error cannot be estimated. The only advantage of this method is that it is
operationally viable when cost and time are major constraints. Execution of this method is
quite easy.

B. Judgement Sampling

This method tries to overcome the bias of convenience sampling to some extent by using the
expert's judgement for selecting a sample. A specialist in the subject matter of the study
chooses what he believes to be the best sample for that particular study. This method is
suitable when the population under study is not very large, the parameters of the study are
known, time and money are the major constraints, and the required level of accuracy is not
very high.

The major drawbacks of this method are (i) bias of the judges cannot be avoided, (ii) errors
cannot be estimated, and (iii) sample is not necessarily representative of the population. The
advantage is that it is better than convenience sampling as bias is somewhat reduced. It has a
good chance of providing useful information if the sample represents a fair cross-section of the
population.

¹ See Green and Tull, Research for Marketing Decisions, New Delhi: Prentice-Hall of India, 3rd edition, 1975, pp. 228-50.

C. Quota Sampling

Quota sampling is the most commonly used sampling method in marketing research studies.
In this method, the researcher first selects the basis on which the population is divided into a
number of mutually exclusive and collectively exhaustive strata. Each stratum is assigned a
number of samples on some basis. Definition of each stratum and the number of sample
respondents to be interviewed are communicated well to investigators. The task of actually
selecting individual respondents in the field, then, is left to the individual investigator to
whom particular strata are assigned.

The researcher plays a much larger role in designing the sampling plan in this method of non-
probability sampling compared to other methods. The determination of dimensions of strata is
the first task of the researcher. He selects a few critical parameters of the population about
which information on population is available. In the case of consumer research these could be
socio-economic-demographic variables on which information is generally available. In the
case of industrial marketing research situations, these variables could be size, industry type,
sector, etc. After selecting the key dimensions, the researcher finds two/three/more cut-off
values of each variable. The distribution of the target population is then obtained from known
sources. A simple strata classification for a consumer product on two dimensions – sex and
age – might be as follows:

| Age / Sex | Male | Female |
| --- | --- | --- |
| Below 15 years | A% | B% |
| Above 15 years | C% | D% |

The top left cell of the table reads: A% of the total population of interest is male and
below the age of 15 years.

The researcher then proceeds to determine the total sample size as well as the sample to be
interviewed in each stratum. The total sample size is generally determined on the basis of time
and cost availability. Individual cell size sample is either determined on the basis of
proportion of total population in each cell or a minimum number required in each cell.
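
The two ways of fixing cell sizes mentioned above can be combined in a short sketch: allocate the total sample in proportion to each cell's population share, then raise any cell that falls below a required minimum. The function name, the population shares standing in for A%, B%, C% and D%, the total of 200, and the minimum of 40 are all illustrative assumptions.

```python
def allocate_quota(cell_shares, total_sample, minimum_per_cell=0):
    """Split total_sample across cells in proportion to population shares.

    cell_shares maps a cell label to its share of the population (shares sum to 1).
    Any cell falling below minimum_per_cell is raised to that minimum, so the
    final total may exceed total_sample.
    """
    quotas = {}
    for cell, share in cell_shares.items():
        proportional = round(share * total_sample)
        quotas[cell] = max(proportional, minimum_per_cell)
    return quotas

# Hypothetical shares standing in for A%, B%, C%, D% in the table above.
shares = {
    ("Male", "Below 15"): 0.18,
    ("Female", "Below 15"): 0.17,
    ("Male", "Above 15"): 0.33,
    ("Female", "Above 15"): 0.32,
}
quotas = allocate_quota(shares, total_sample=200, minimum_per_cell=40)
for cell, quota in quotas.items():
    print(cell, quota)
```

When a minimum is imposed, the small cells are over-represented relative to the population, which is the unequal distribution across cells that the text refers to.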

The researcher, therefore, is trying to get a sample of the population on the basis of known
probability of a population member being selected in the sample, though there could be equal
or unequal distribution for all the cells.

The actual selection of respondents is left to the investigator once the limits of the control
variable for each cell and the number to be interviewed are explained to him. The bias of the
investigator could introduce error in this respect. It is not uncommon to find actual respondents
in a cell clustering around the limits of control variables. For example, in the above illustration,
the respondents in the cell "Male - below 15 years of age" could cluster around 13 to 15 years
and may not well represent the entire range of age below 15 years. Cross-checks on values of
the range of control variables within each cell could provide better validity. The quota sample
could also be partially validated by estimating values of those parameters which are not used
as controls but about which population estimates are available from published sources.

Planning and execution of quota sampling is less costly and less time consuming compared to
probability sampling methods. Partial validation of quota samples is possible through the
methods described above. They are fairly easy to execute also. Given the requirements of
decision situation (cost, time, and little less concern for high degree of precision), quota
sampling seems to be the most appropriate method.

V. METHODS BASED ON CLASSICAL PROBABILISTIC METHODS

There are a number of methods of sampling based on classical (traditional) statistics. Some of
the better known and used methods in marketing research are described below:

a) Simple random sampling
b) Systematic sampling
c) Stratified sampling
d) Cluster sampling
e) Area sampling

A. Simple Random Sampling

This is a method of sample selection in which each member of the universe (list of
respondents/units) has an equal chance of being selected as a sample unit. The chance (or
probability) can be determined by dividing the sample size by the population size. There are
two decisions required in this method. One is the determination of sample size and the other
the specific method of selecting the sample from a sampling frame (list).

For determining the required sample size, it is necessary to have an estimate of the mean and
standard error of the population parameter and the confidence interval acceptable to the
decision maker. If the parameter of interest is a proportion of the population, the mean value
of the expected proportion is enough to provide both mean and variance of the population. If
the mean and standard error are not known, a pilot test might have to be conducted. However, an
experienced marketing research executive might be able to specify these. The confidence
that the researcher wants in the estimation of the mean value also needs to be specified.
Standard formulae exist in books on statistics to estimate sample size for several
types of sampling methods.
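
As an illustration of the kind of standard formulae referred to above, the sketch below uses the common normal-approximation forms n = z²p(1-p)/e² for a proportion and n = (zs/e)² for a mean, where e is the acceptable margin of error and s the standard deviation. These are general textbook forms for simple random sampling, not formulae quoted from this note; the function names, the z-value of 1.96 (roughly 95% confidence), and the input figures are assumptions.

```python
import math

def sample_size_for_proportion(expected_p, margin, z=1.96):
    """Sample size so that the estimated proportion falls within +/- margin."""
    return math.ceil(z**2 * expected_p * (1 - expected_p) / margin**2)

def sample_size_for_mean(std_dev, margin, z=1.96):
    """Sample size so that the estimated mean falls within +/- margin."""
    return math.ceil((z * std_dev / margin) ** 2)

# Hypothetical inputs: expected TV ownership of 30%, estimated to within 5 points;
# monthly spend with a standard deviation of Rs 400, estimated to within Rs 50.
n_prop = sample_size_for_proportion(expected_p=0.30, margin=0.05)
n_mean = sample_size_for_mean(std_dev=400, margin=50)
print(n_prop, n_mean)
```

For a proportion, the expected value p alone fixes the variance p(1-p), which is why no separate variance estimate is needed in that case.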

In terms of actually selecting the sample respondents, given that the sample size has been
determined, two extreme methods are drawing of lots and use of a random number table.
Drawing of lots becomes cumbersome if the list size is large. Use of random number tables is a
preferred mode (not described here, as the reader is likely to be familiar with random number
table use).
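
Where a computer is available, the random number table can be replaced by a pseudo-random generator. The following minimal sketch draws a simple random sample without replacement from a serialized frame; the frame of 5000 numbered households, the sample size, and the seed are hypothetical.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Draw sample_size units without replacement; every unit has an equal chance."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, sample_size)

# Hypothetical serialized frame: households numbered 1 to 5000.
frame = list(range(1, 5001))
sample = simple_random_sample(frame, sample_size=200, seed=42)
selection_probability = len(sample) / len(frame)  # sample size / population size
print(sorted(sample)[:10], selection_probability)
```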

Simple random sampling could be applied in case a serialized list of the population members
is available. If such a list does not exist it may have to be prepared, which in the case of a large
population is a costly and time-consuming task. The method would provide unbiased and
valid estimates of parameters if the population is somewhat homogeneous with respect to the
characteristic of interest in the study. The method has several difficulties. The first difficulty
pertains to generation of a list of population members if one does not exist. The second difficulty
is the generation of a large number of random numbers from tables (this is overcome
nowadays by use of computers). The third difficulty arises out of the method being relatively
inefficient in case the population is not homogeneous on the characteristic being studied.
Lastly, a number of call backs, resulting in loss of time and money, might have to be made to
contact those respondents who are not available at the time the interviewer makes the call.

B. Systematic Random Sampling

This method is slightly different from simple random sampling in that once the first unit is
selected randomly (with the help of a random number generated from the table) from the
population, all succeeding units will be selected at a regular interval (i.e. other random
numbers get pre-specified once one number is generated). It assumes that the population
elements are ordered in some fashion, say in alphabetical listing. If the population contains 'N'
ordered elements and the sample size is 'n', then N/n is the interval at which the units are
chosen.
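
The selection rule just described can be sketched as follows: a random start within the first interval, then every k-th unit, where k = N/n. The sketch assumes N is at least n and rounds the interval down when N is not an exact multiple of n; the frame and seed are hypothetical.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Pick a random start within the first interval, then every k-th unit."""
    interval = len(frame) // sample_size  # k = N/n, rounded down
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start position in 0..k-1
    return [frame[start + i * interval] for i in range(sample_size)]

# Hypothetical ordered list with N = 1000 and n = 50, so the interval is 20.
frame = list(range(1, 1001))
sample = systematic_sample(frame, sample_size=50, seed=7)
print(sample[:5])
```

Only one random number is needed, which is the source of the method's simplicity.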

The main advantage of this technique is its simplicity. It is faster and less subject to error than
simple random selection. It is also often found to be statistically more efficient than simple
random sampling. But a difficulty arises if there is periodicity in the population (i.e.
clustering of elements in a manner that certain kinds of elements will not appear in the choice).
It is also difficult to estimate the variance of the universe from the variance of the sample.

In very large samples, the method would cut down cost and time considerably.

C. Stratified Random Sampling

This method involves dividing the entire population of interest into a number of mutually
exclusive and collectively exhaustive strata. Within each stratum, simple random samples of
appropriate size (whether proportionate or disproportionate to the population size of the
strata) are drawn. Stratification of the population is done using the guideline that parameter
variance within a stratum should be small and across strata large. Such stratification leads to
better estimate of the parameter value. The required sample size too would be smaller
compared to the method of simple random sampling.
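
A minimal sketch of proportionate stratified random sampling follows: the total sample is split across strata in proportion to their population sizes, and a simple random sample is then drawn within each stratum. The frame of retail outlets, the stratum names, and the sample size are hypothetical; a disproportionate design would replace the allocation line with its own rule.

```python
import random

def stratified_sample(strata, total_sample, seed=None):
    """Draw a proportionate simple random sample from each stratum.

    strata maps a stratum name to the list of its units.
    """
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_h = round(total_sample * len(units) / population_size)  # stratum share
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical frame of retail outlets, stratified by size class.
strata = {
    "large": [f"L{i}" for i in range(100)],
    "medium": [f"M{i}" for i in range(300)],
    "small": [f"S{i}" for i in range(600)],
}
sample = stratified_sample(strata, total_sample=50, seed=1)
sizes = {name: len(units) for name, units in sample.items()}
print(sizes)
```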

The major difficulty in executing this method is non-availability of population lists having
some information about the possible stratification dimensions in several marketing research
situations. However, in certain populations like industries and trade establishments, such
information may be available and the method could be used to considerable advantage in
terms of economy in sample size and representativeness of the sample.

D. Cluster Sampling

Sometimes the researcher, even though he is interested in marketing characteristics of
elementary elements in the population (an individual, or a family), may decide to select
primary sample units larger than such elementary units (i.e. a cluster of elementary units
together), for example, a block of a city or a district in a state. Therefore, for cluster sampling
the researcher uses available divisions of the population. These divisions are on some natural
basis like geographic area, political boundaries, etc. First a few clusters out of the available
clusters are selected using the random process and then either all the elementary units in a
selected cluster are studied or a few elements from the selected cluster are selected using the
random process.
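
The two-step procedure above can be sketched as follows: clusters are first drawn at random, and then either every unit in a selected cluster is taken or a random sub-sample is drawn within it. The city of 40 blocks with 25 households each, the function name, and the parameter names are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, units_per_cluster=None, seed=None):
    """Select clusters at random, then take all or a random few units in each.

    clusters maps a cluster name (e.g. a city block) to its elementary units.
    units_per_cluster=None means every unit in a selected cluster is studied.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # first stage: clusters
    sample = {}
    for name in chosen:
        units = clusters[name]
        if units_per_cluster is None or units_per_cluster >= len(units):
            sample[name] = list(units)
        else:
            sample[name] = rng.sample(units, units_per_cluster)  # sub-sampling
    return sample

# Hypothetical city with 40 blocks of 25 households each.
blocks = {f"block_{b}": [f"b{b}_h{h}" for h in range(25)] for b in range(40)}
one_stage = cluster_sample(blocks, n_clusters=5, seed=3)
two_stage = cluster_sample(blocks, n_clusters=5, units_per_cluster=10, seed=3)
print(list(one_stage), [len(u) for u in two_stage.values()])
```

Only a list of blocks is needed at the first stage; households need to be listed only within the blocks that are selected.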

There are a number of issues which the researcher has to decide in cluster sampling: How
should clusters be formed? How large should the clusters be? Should clusters be of equal size or
unequal size? Should he select all the elements in the selected cluster or should he resort to
sub-sampling (i.e. selecting a few) after selecting the clusters?

The advantages of cluster sampling are: (i) probability sampling could be applied even though
master lists - ready sampling frames - are not available but only a list of larger primary units
(blocks, towns, districts, etc.) is available, (ii) field expenses of collecting data are significantly
reduced because clustering brings a large reduction in travel cost, (iii) cluster sampling is
particularly useful in test marketing situations.

As this is a probability based sampling method, each element of the population has a known
positive probability of appearing in the selected sample.

Compared to simple random sampling, the primary benefit of cluster sampling lies in cost
savings and not in greater reliability. The more homogeneous the elements within the selected
clusters are, the lower the efficiency of the cluster sample.

Stratified random sampling and cluster sampling appear to be similar because in both
breaking up of a larger group (population) into smaller groups takes place. However, an
important difference is that all strata of the population are automatically included in a
stratified sample, while in the case of cluster sampling only a few (sample) of the clusters are
selected.

E. Area Sampling

Clustering the population by geographical area amounts to area sampling. It could be done by
breaking the population down by states, cities, towns, etc. Such a process of breaking down the
population yields subsets which are exhaustive, and the units in a subset are nearby
in a geographical sense. The sample size required, as in cluster sampling, would be smaller
compared to the one required under simple random sampling.

Area sampling is useful for surveys of consumer households on a national basis, as people living
nearby can be interviewed easily, which results in savings in travel cost and time. The
major limitation of this method is that people living nearby tend to have similar characteristics
and hence there are chances of getting somewhat less representative and inadequate results.

It should be noted that in area sampling practical requirements modify the actual working of
the theoretical design. Under this method, it is necessary that each element could be associated
as a part of specific area but practical problems arise due to transients, occasional travellers
and people with no established dwelling units. Also, there are problems of people who are not
at home (needing call backs) and those who refuse to cooperate.

VI. BAYESIAN PROBABILISTIC METHODS

Use of Bayesian methods in sampling is essentially for determination of optimal sample sizes,
whether for the entire population or for different strata (defined in the same manner as
defined in quota sampling or stratified random sampling method).

The Bayesian approach is the only one which combines sampling and non-sampling errors
and helps in determining an economic optimal sample size for the research study.

The analysis of a decision situation in a Bayesian framework provides critical information
about the upper limit on the amount of money to be spent on collecting additional information
(known as the Expected Value of Perfect Information - EVPI - in Bayesian terminology). A
knowledge of the cost of sampling can easily provide the upper limit on the sample size in a
research situation. Most often, given the fact that marketing executives have some knowledge
of the marketing phenomenon under study, this upper limit on the sample size is considerably
smaller than the one obtained through application of traditional approaches. Further analysis
should, in several situations, show that there actually is no need of a sample (i.e. further
information).
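
A small worked sketch of this upper limit follows. EVPI is computed as the expected payoff when the true state is known before acting, minus the expected payoff of the best action under the prior alone; dividing it by the cost per interview gives a ceiling on the sample size. The launch decision, the prior probabilities, the payoffs, and the cost per interview are all hypothetical figures chosen for illustration.

```python
def expected_value_of_perfect_information(priors, payoffs):
    """EVPI = expected payoff with perfect information - best prior expected payoff.

    priors maps each state of nature to its prior probability.
    payoffs maps each action to a dict of payoff by state.
    """
    best_without = max(
        sum(priors[s] * payoff[s] for s in priors) for payoff in payoffs.values()
    )
    with_perfect = sum(
        priors[s] * max(payoff[s] for payoff in payoffs.values()) for s in priors
    )
    return with_perfect - best_without

# Hypothetical launch decision, payoffs in rupees.
priors = {"high demand": 0.75, "low demand": 0.25}
payoffs = {
    "launch": {"high demand": 500_000, "low demand": -300_000},
    "do not launch": {"high demand": 0, "low demand": 0},
}
evpi = expected_value_of_perfect_information(priors, payoffs)
cost_per_interview = 200  # assumed field cost per respondent
max_sample_size = int(evpi // cost_per_interview)
print(evpi, max_sample_size)
```

With these figures no study costing more than the EVPI could be justified, however large a sample the traditional formulae might suggest.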

In determining the sample size for the total population, as against stratified sampling, a decision
rule which would provide the optimal sample size is as follows. If the upper limit on the
sample size is 'n', work out the expected monetary value for each sample size up to that limit and
select the one which gives the maximum expected monetary value. The method consists of
conducting pre-posterior analysis for each candidate total sample size. The sample size so arrived
at would generally be much smaller than the upper limit arrived at earlier by use of EVPI.
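
The decision rule above can be sketched under some simplifying assumptions that are not part of this note: two states of nature, each respondent giving a favourable or unfavourable answer with a probability that depends on the state (a binomial model), and a constant cost per respondent. For each candidate sample size the sketch computes the expected payoff of acting optimally after seeing the sample, subtracts the sampling cost, and keeps the size with the highest net value. All names and figures are hypothetical.

```python
from math import comb

def expected_value_with_sample(n, priors, likelihoods, payoffs):
    """Expected payoff when the best action is chosen after seeing n responses.

    likelihoods maps each state to the chance one respondent answers favourably.
    """
    total = 0.0
    for k in range(n + 1):  # k = number of favourable responses observed
        joint = {
            s: priors[s] * comb(n, k)
            * likelihoods[s] ** k * (1 - likelihoods[s]) ** (n - k)
            for s in priors
        }
        # Best action for this outcome, weighted by the outcome's probability.
        total += max(
            sum(joint[s] * payoff[s] for s in priors) for payoff in payoffs.values()
        )
    return total

def optimal_sample_size(max_n, cost_per_unit, priors, likelihoods, payoffs):
    """Return (n, net expected monetary value) with the highest net value."""
    candidates = (
        (n, expected_value_with_sample(n, priors, likelihoods, payoffs)
         - cost_per_unit * n)
        for n in range(max_n + 1)
    )
    return max(candidates, key=lambda pair: pair[1])

# Hypothetical inputs, payoffs in rupees.
priors = {"high": 0.5, "low": 0.5}
likelihoods = {"high": 0.6, "low": 0.3}  # assumed favourable-response rates
payoffs = {
    "launch": {"high": 500_000, "low": -300_000},
    "do not launch": {"high": 0, "low": 0},
}
best_n, best_net = optimal_sample_size(60, 2000, priors, likelihoods, payoffs)
print(best_n, round(best_net))
```

A sample size of zero is among the candidates, so the same calculation also reveals the cases in which no sample is worth taking.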

On the whole, therefore, the Bayesian approach could be used generally with the help of a
computer as the calculations are too numerous in conducting the pre-posterior analysis to
arrive at optimal sample size.

VII. CHOICE OF SAMPLING METHOD

In Sections III, IV, V, and VI the issues of selecting basic sampling approach as well as various
alternative specific methods of sample selection were described. If the researcher chose the
Bayesian probabilistic approach, there are no alternative methods to be considered. However,
if the choice of non-probability or traditional probabilistic approach is made, a further choice
will have to be made from among alternative methods available under each of the two
approaches.

a) Criteria for Selection

In earlier sections various criteria or considerations which influence choice of sampling
approach/method were mentioned. These are equally relevant for this second step choice too.
These are:

i) Population size
ii) Clear definition of population
iii) Availability of ready universe/population frame
iv) Extent of known homogeneity in the population
v) Some prior knowledge of population parameters
vi) Desired level of accuracy required (depending upon research purposes and type of research
design)
vii) Time available/constraints for sample selection
viii) Cost aspects/constraints
ix) Sample size required
x) Favourable features
xi) Unfavourable features

b) Selection for Non-Probability Sampling Approach

The three most used non-probability methods are: (i) convenience sampling, (ii) judgement
sampling, and (iii) quota sampling. The following table gives the relative evaluation of these
three methods on the above criteria.

| Criterion | Convenience Sampling | Judgement Sampling | Quota Sampling |
| --- | --- | --- | --- |
| a) Population size | Small | Small | Large |
| b) Clear definition of population | Not required | Partially required | Required |
| c) Availability of ready universe list | Not required | Not required | Not required |
| d) Homogeneity in population | Homogeneous | Not necessary | Homogeneous |
| e) Knowledge of population parameters | Not required | Not required | Not required |
| f) Desired level of accuracy | Low | Low | High |
| g) Time required for selection | Low | Little higher | High |
| h) Cost aspects | Low | Low | Little larger |
| i) Sample size required | Small | Small | Little larger |
| j) Favourable features | Operationally easy | Less bias than convenience sampling | Best among non-probability sampling: least bias and can be validated as probabilistic |
| k) Unfavourable features | Bias of selector; error cannot be estimated | Bias of judges; error cannot be estimated | Some bias of investigator; error might be estimated if validated |

Depending upon the purposes of research whether exploratory or inferential, the choice
would be from convenience to judgement to quota sampling. If qualitative (motivational)
research is to be used for action purposes, an attempt is made to get a fair size cross-section of
sample similar to quota type but no estimation is made. For group discussion study choice
would be made on convenience or judgement basis. For descriptive and causative kind of
studies where somewhat higher degree of reliability of inferences is to be drawn, quota
sampling is the obvious choice.

c) Choice among the Traditional Probability Sampling Methods


The important probability methods discussed in an earlier section were (i) simple random
sampling, (ii) systematic sampling, (iii) stratified sampling, (iv) cluster sampling, and (v) area
sampling. The actual choice of the specific method will be influenced by the criteria for
selection.

| Criterion | Simple Random Sampling | Systematic Sampling | Stratified Sampling | Cluster Sampling | Area Sampling |
| --- | --- | --- | --- | --- | --- |
| a) Population size | Large | Large | Large | Large | Large |
| b) Clear definition of population | Required | Required | Required | Required | Required |
| c) Availability of ready universe list | Required | Required | Sub-list required | Natural cluster lists required | Area lists required |
| d) Homogeneity in population | Required | No periodicity | Within strata | Not required | Not required |
| e) Knowledge of population parameters | Not required | Not required | Not required | Not required | Not required |
| f) Desired level of accuracy | High | High | High | High | High |
| g) Time required for selection | High | Medium | Medium | Medium | Medium |
| h) Cost aspects | High | High | Medium | Medium | Medium |
| i) Sample size required | Large | Large | Less than large | Large | Large |
| j) Favourable features | Known accuracy | Known accuracy | Known accuracy | Known accuracy | Known accuracy |
| k) Unfavourable features | Costly, time-consuming | Little less time | Less costly than stratified random sampling | Less costly than stratified random sampling | Less costly than stratified random sampling |

In addition to the above listed specific methods, books on statistical sampling² describe a few
other methods like sequential sampling, multi-stage sampling, double sampling, sampling for
panels (of consumers or stores), etc. These are all probability sampling methods.

In the choice of the method researchers compare statistical and economic efficiencies of
alternative methods. One sampling method is said to be statistically more efficient than another
if, for a given sample size, its standard error of the mean is smaller. Comparing the rupee
expenditures of different sampling methods gives their economic efficiency. A method is
economically more efficient if it requires a smaller sample size for the same accuracy level,
or if, for the same sample size, it yields higher reliability of results per rupee spent.
Sometimes the choice would be dictated by the nature of marketing situations; for example, a
national sample of households would be difficult to select by using simple random sampling,
because a serialized universe frame would be impossible to prepare. Area or multi-stage
sampling might be chosen in such a case.
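
To make the idea of statistical efficiency concrete, the sketch below compares the standard error of the mean under simple random sampling with that under proportionate stratified sampling for the same sample size. It relies on the usual decomposition of population variance into within-stratum and between-stratum parts and ignores finite population corrections; the strata weights, means, and standard deviations are hypothetical.

```python
import math

def standard_errors(strata, n):
    """Return (SE under simple random sampling, SE under proportionate stratification).

    strata is a list of (weight, mean, std_dev) tuples; weights sum to 1.
    """
    overall_mean = sum(w * m for w, m, _ in strata)
    within = sum(w * sd**2 for w, _, sd in strata)  # variance inside strata
    between = sum(w * (m - overall_mean) ** 2 for w, m, _ in strata)
    se_srs = math.sqrt((within + between) / n)
    se_stratified = math.sqrt(within / n)  # between-strata part is removed
    return se_srs, se_stratified

# Hypothetical monthly purchases (rupees) in three income strata.
strata = [(0.5, 200, 50), (0.25, 500, 100), (0.25, 1000, 200)]
se_srs, se_strat = standard_errors(strata, n=400)
relative_efficiency = (se_srs / se_strat) ** 2
print(round(se_srs, 2), round(se_strat, 2), round(relative_efficiency, 2))
```

The gain from stratification is largest when strata means differ widely and units within each stratum are alike; whether it is also economically more efficient depends on the cost of obtaining the stratified frame.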

VIII. SAMPLE DESIGN PROBLEMS IN EXPERIMENTAL RESEARCH

In experimental marketing research, the nature of sample selection changes somewhat. The
researcher first decides the exact design of experiment he wants to carry out, like before-after,
before-after with control group, Latin square design, etc. The quality and accuracy of results
(inferences based on data collected) are significantly affected by the kind of experimental design
used.

However, even in experimental research the information is finally to be collected from selected
units of source of information (respondents, stores, etc.). How such sample members
should be selected is the sampling question. A number of experimental designs require control
groups, etc. and matching of sample members in either before-after groups or experimental
and control groups. Selection of sample members, therefore, is affected by the matching
considerations where matching is done on various characteristics. Characteristics for matching
are chosen using logic of relevance, etc. There are no statistical methods of selection used for
matching purposes. However, some kind of stratification and quota kind of ideas are used
while selecting sample members for experimental research. Secondary data about the
experimental and control groups (towns) are used for this purpose.

IX. IMPLEMENTATION OF SAMPLING PLAN

This will include (a) preparation of clear instructions to the field staff for facilitating correct
identification and selection of members in the sample, (b) pretesting the sampling plan, and (c)
actual selection of sample members and obtaining data from them.

If a sampling plan requires actions on the part of the field staff, these should be clearly spelt
out so as to avoid the use of judgement. In a national study of textile distribution, samples of
wholesalers, semi-wholesalers and retailers were to be selected and studied. The task of
preparing a master list (census) of wholesalers and semi-wholesalers was delegated to the field
supervisor. He was given clear instructions about how to prepare such a list, then how to use
such a frame to select the specified number of sample members, etc. More often one over-samples
the universe to begin with by some 20 to 30% and then lays down the rules of replacement of
sample members. It is decided how many call-backs will be made before a selected sample member
is considered to be not found and to be replaced. If some kind of quota sampling is to be carried
out, explicit instructions are given about the conditions which a selected member of the quota
will have to satisfy. Instructions might be given about stopping further selection once the quota
is met, etc.

2 See Boyd & Westfall, Marketing Research: Text and Cases, Chapters 8 and 10.
Luck, Wales, et al., Marketing Research, fifth edition, Chapters 7 and 8.

Special care needs to be taken to ensure that actual sample members are selected according to
the original sample design and no significant deviation is made in actual implementation.

Sometimes it might be a good idea to carry out a pretest of the sampling plan to obtain a feel
for the kinds of problems which might be faced in actual sample selection. Benefits of such a
pretest will be as follows:

i) Pretesting would indicate how readily the sampling plan could be administered in the
field
ii) It would reveal whether there were misunderstandings, miscommunication and errors in
instructions to the field staff
iii) It may provide further significant stratification criteria in terms of range of characteristics
of population
iv) It may reveal homogeneity, kind of distribution and/or multiplicity of the population(s)
sampled which might call for revision of the sampling plan
v) It would provide some indication of the likely stability of sample, etc.

A system of checks and controls should be devised and administered to ensure proper
implementation of actual sample selection.

X. CONCLUDING REMARKS

When a researcher wants to select a sample from the population, he wants to ensure that the
sample should represent the population accurately, should be stable, should provide precise
and detailed information, and be such that research resources are used efficiently.

The above discussion of the various steps in the sampling process was addressed to the question
of efficient and satisfactory sample selection. Throughout the discussion, the choices a
researcher has to make were elaborated upon in terms of the considerations which influence each
choice and the care he should take in making it.

A few remarks are made here in terms of some other practical advice based on experiences of
research organizations.

1. Statistically, there is no direct relationship between the size of the sample and the size of
the population. The basic formula for determining sample size does not use population size
as a variable. However, in real life, research organizations have found that it is very useful
to select a sample size such that a certain proportion of the population is in the sample.
Household studies conducted in Ahmedabad or similar large cities use a ratio of 1:1,000,
i.e. for 3 lakh families of Ahmedabad a sample of 300 is considered adequate.

2. Another rough norm for many industrial marketing research studies is to take 10 to 15%
of the population units in the sample.

3. Statistically speaking, a sample size of above 30 is considered large. However, a sample
size of 50 is regarded as "large enough" for some marketing research studies.

4. There are two kinds of errors which bring inaccuracies into inferences and action
possibilities. One is due to the way in which sample members are selected. The other
could be due to the kind of measurement method used or the deficiencies of the method
used for data collection. This is sometimes described as response error. Generally the
response error is much larger than the error due to sampling. However, one finds that
researchers worry much more about reducing sampling error than response error.

5. In some of the sampling methods an attempt is made to ensure a fair cross-section of the
relevant sub-groups. This fair cross-section selection is not ad hoc selection, but is based
on rigorous and information-based logic.

6. In spite of all the care, etc. no probability sampling is hundred per cent
probabilistic/scientific because there are various points at which small deviations take
place in actual selection. Similarly, not all non-probability sampling is ad hoc or
opportunistic. There is always a tinge (sometimes even an ounce) of probability in the
non-probability sampling method used. Marketing researchers should aim to make
non-probabilistic (purposive) sampling as random and systematic a selection as possible.

Indian Institute of Management


Ahmedabad IIMA/MAR0258S

Supplement to the
Note on Sampling in Marketing Research

This supplement has two exhibits.


Exhibit 1 deals with the question of determining sample size for a study.
Exhibit 2 presents the sampling plan used for the National Readership Survey (1970).

Exhibit 1
Formulae for Computing Sample Size: Simple Random Sampling

1. Points to Note

a) The following formulae apply to computation of sample size for estimating the value of a variable
(item 2 below) or proportion of population having an attribute (item 3 below) in the case of simple
random sampling method.
b) In case the estimation of more than one variable/attribute is required with certain confidence
interval, it may be necessary to compute sample sizes for each of the critical variables/attributes
separately. Given this information and cost of sampling respondents, one could then make a
decision about the sample size. It might imply estimation of different variables with different
confidence intervals.
c) The sample sizes determined through the formulae are the actual numbers of responses needed.
If it is felt that a certain proportion of respondents are not likely to respond, the sampling plan
should adjust the sample size upwards accordingly so that the required number of responses is
finally available.
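
As a rough illustration of points (a) to (c), the Python sketch below computes the sample size for a mean and for a proportion using the formulae of items 2 and 3 of this exhibit, lets the more demanding variable govern, and inflates the result for expected non-response. The function names, the input values (standard deviation of 30, p of 0.5, 80 per cent response rate), rounding up, and the "take the largest" rule are assumptions made for illustration; the note itself leaves the final decision to a weighing of accuracy against cost.

```python
import math

def standard_error(half_range, z):
    # Desired standard error = half of the desired range / tabulated normal value (item 2c).
    return half_range / z

def sample_size_mean(sd, se):
    # Item 2(a): n = (s / S)^2. Rounding up is this sketch's choice.
    return math.ceil((sd / se) ** 2)

def sample_size_proportion(p, se):
    # Item 3(a): n = p * q / S_p^2, with q = 1 - p.
    return math.ceil(p * (1 - p) / se ** 2)

def adjust_for_population(n, population):
    # Item 2(b): n_a = n * N / (N + n), applied only if n exceeds 5% of N.
    if n > 0.05 * population:
        return math.ceil(n * population / (population + n))
    return n

def inflate_for_nonresponse(n, response_rate):
    # Point 1(c): select enough members that the expected responses reach n.
    return math.ceil(n / response_rate)

# Hypothetical study: one variable (assumed sd = 30, mean wanted within +/-5) and
# one attribute (assumed p = 0.5, proportion wanted within +/-0.05), both with z = 1.96.
n_mean = sample_size_mean(30, standard_error(5, 1.96))
n_prop = sample_size_proportion(0.5, standard_error(0.05, 1.96))
n_needed = max(n_mean, n_prop)          # assumed rule: the most demanding variable governs
n_to_draw = inflate_for_nonresponse(n_needed, 0.8)
print(n_mean, n_prop, n_to_draw)
```

With these assumed inputs the attribute, not the variable, determines the sample size, which is the kind of comparison point (b) asks the researcher to make before weighing it against cost.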

2. Sample Size of Study for Estimation of Mean Value of One Variable

a) Sample size:

$$ n = \left( \frac{s}{S_{\bar{x}}} \right)^{2} $$

where $s$ is the standard deviation of the variable over the population and $S_{\bar{x}}$ is the
desired standard error of the sample mean.

b) If sample size is more than 5 per cent of population size, adjusted sample size would be:

$$ n_a = \frac{n \times N}{N + n} $$

where $n$ is the sample size as above and $N$ is the population size.

c) In these two formulae (in a and b above)

i) Desired standard error of the sample mean:

$$ S_{\bar{x}} = \frac{\text{Half of desired range of the mean}}{z} $$

where $z$ is the value read from tables of area under the standardized normal curve for the
desired confidence level.

For example, say for a desired range of ±5 (range of 10), a confidence level of 95 per cent is
required. The value of $z$ for a 95 per cent confidence level (from tables of area under the
normal curve) is 1.96. Therefore,

$$ S_{\bar{x}} = \frac{10/2}{1.96} = 2.55 $$

ii) Standard deviation of the variable over the population should be estimated by
- Past studies
- Taking a small sample
- One-sixth of the estimate of the range of variable (through some means/judgment) over the
population

3. Sample Size of a Study for Estimation of Mean Proportion of Population having an Attribute

a) Sample Size when Population Size is Large

$$ n = \frac{p \times q}{S_p^{2}} $$

where $p$ is the proportion of population having the attribute, $q = 1 - p$ is the proportion not
having the attribute, and $S_p$ is the desired standard error of the proportion.

b) Sample Size when Population Size is Small (say less than 1000) and known

$$ n = \frac{p \times q}{\left( \dfrac{\text{Half of allowable range}}{z} \right)^{2} + \dfrac{p \times q}{N}} $$

c) In the above formulae the terms other than p and q have similar meaning as described in 2c
above.

Prepared by Professors M.N. Vora and Abhinandan K. Jain.
Copyright © 1982 by the Indian Institute of Management, Ahmedabad.

Exhibit 2
Sampling Plan For National Readership Survey (1970) ∗

∗ Extracts from: Operations Research Group, National Readership Survey - Methodology, Baroda, 1972, pp. 2-9.

1. The objective of the survey was to measure readership of selected publications and the extent of
exposure to radio and cinema among adults aged 15 and over.

2. Sampling Procedure: The sample is designed to be representative of all the individuals aged 15
years and above in India excepting those living in Jammu & Kashmir, NEFA, and offshore
territories like Andaman and Nicobar Islands.

The multistage stratified sampling design has been adopted in the survey. The sample has been
drawn in the form of two independent inter-penetrating sub-samples.

India is first stratified into 14 primary strata comprising political states or groups of states. These are:

North Zone:  1. Delhi, Punjab, Haryana, and Himachal Pradesh
             2. Rajasthan
             3. Uttar Pradesh
East Zone:   4. Assam, Manipur, Tripura, and Nagaland
             5. Bihar
             6. Orissa
             7. West Bengal
West Zone:   8. Gujarat, Dadra Nagar Haveli, and Diu-Daman
             9. Madhya Pradesh
            10. Maharashtra and Goa
South Zone: 11. Andhra Pradesh
            12. Kerala
            13. Mysore
            14. Tamil Nadu and Pondicherry

Each of the 14 states or state groups is again stratified into two secondary strata, urban and rural
areas, as defined in the 1961 census of population (1971 census data was not available at the time of
planning of this survey which was done in early 1970).

Sampling in Urban Areas

In urban areas, in each state or state group, cities and towns were further stratified on the basis of
population size as under:

Town Stratum Population Size of Towns as per 1961 Census


I 5 lakh and over
II Between 1 and 5 lakh
III Between 50,000 and 1 lakh
IV Between 20,000 and 50,000
V Below 20,000

In each of the urban sub-strata thus formed, the sample was selected by using the three-stage
sampling procedure.

In the first stage, the towns were selected. In Stratum I all the towns were sampled. In the remaining
strata, the towns were sampled with probability proportional to the total population of the sampled
towns (ppts).
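
Selection with probability proportional to size can be sketched as below. The extract does not say which selection procedure was used; this sketch assumes the common cumulative-total method with a fixed interval and a random start. The town names and populations are invented for illustration.

```python
import random

def pps_systematic(units, n, rng=random):
    """Select n units with probability proportional to size.

    units: list of (name, size) pairs, e.g. towns with their census populations.
    Walks the cumulative size totals with a fixed interval and a random start.
    A unit larger than the interval can be selected more than once.
    """
    total = sum(size for _, size in units)
    interval = total / n
    start = rng.random() * interval          # random start in [0, interval)
    targets = iter(start + k * interval for k in range(n))

    selected = []
    cumulative = 0
    target = next(targets, None)
    for name, size in units:
        cumulative += size
        # Every target falling inside this unit's cumulative range selects the unit.
        while target is not None and target < cumulative:
            selected.append(name)
            target = next(targets, None)
    return selected

# Hypothetical stratum of five towns, two to be sampled.
towns = [("A", 40000), ("B", 25000), ("C", 20000), ("D", 10000), ("E", 5000)]
print(pps_systematic(towns, 2, random.Random(1)))
```

Under this scheme town A, with 40 per cent of the stratum population, is eight times as likely to be picked as town E, which is the intent of sampling on a ppts basis.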

In the second stage, selection of individuals within each sampled town was done with the help of
electoral rolls.

The electoral rolls are the most comprehensive listing of persons available as a sampling frame for
selection of individuals. These, however, do not include certain categories of population that were to
be represented in the survey. These were:

a) individuals aged 15 years and over but not of voting age

b) individuals who have moved from their registered address since the compilation of electoral rolls.

On account of this, the population covered by the survey was classified into two distinct groups.

Electors: All persons listed in the current electoral rolls and still residing at their registered address
when the interviewer called.

Non-Electors: All other persons aged 15 years and over, that is, persons falling in categories
(a) and (b) above.

In each town, a sample of required number of electors was drawn from the electoral roll by using
systematic sampling procedure with a random start. This constituted a sample representing all
electors.
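
Systematic sampling with a random start can be sketched as follows. The electoral roll is represented here by a plain list of serial numbers, and the roll size and sample size are hypothetical. The sketch uses a whole-number interval, a simplification under which any entries beyond `n * k` at the end of the roll are never selected.

```python
import random

def systematic_sample(frame, n, rng=random):
    """Take every k-th entry of the frame after a random start, with k = len(frame) // n."""
    k = len(frame) // n
    if k < 1:
        raise ValueError("sample size exceeds frame size")
    start = rng.randrange(k)                 # random start among the first k entries
    return [frame[start + i * k] for i in range(n)]

# Hypothetical roll of 1,000 electors, 50 to be sampled (interval k = 20).
roll = list(range(1, 1001))
electors = systematic_sample(roll, 50, random.Random(0))
print(electors[:5])
```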

Each sampled elector together with his address identified a dwelling unit in which he was then staying
or was formerly staying. All the members residing at that dwelling unit were enumerated and their
names compared with the names of voters listed under that address. Those members who were aged
15 years and over but who were not listed as voters were classified as non-electors and one out of
them was sampled at random to represent that category of population. In the towns with population of
5 lakh and over sampling of dwelling units was preceded by sampling of electoral wards. The dwelling
units were sampled within the wards, thus adding one more stage in the sampling procedure.

Sampling in Rural Areas

In rural areas, villages were stratified on the basis of population size as under:

Village Stratum Population size of villages as per 1961 census


I 5,000 and over
II Between 1,000 and 4,999
III Below 1,000

In each state (state group), villages with population of 5,000 and over were grouped to form a stratum
and the required number of villages were sampled on (ppts) basis, that is, with probability proportional
to total population of the sampled villages.

For representing the remaining rural areas in the state, first a sample of districts was selected with
probability proportional to the total rural population of the sampled districts. Within each district two
taluks/tehsils were sampled with probability proportional to the total rural population of sampled taluks.
Villages with population below 5,000 in the sampled taluk/tehsil were then stratified so as to form two
strata, namely villages with population between 1000 and 4999 and villages with population below
1000. From each of these village strata, the required number of villages were sampled on (ppts) basis.

The procedure adopted in selection of individuals was the same as that used in the case of urban
areas.

In the case of villages with population above 5,000 two independent sets of villages were selected
within each state to form interpenetrating sub-samples. To achieve this for the remaining areas, two
independent sets of districts were selected within each state.

3. Sample Size: The sample size for each town and village was arrived at by using the following
sampling proportions in different towns and village strata:

Stratum Sample size


Town Strata I, II, III 50 per lakh of population
Town Strata IV and V 40 per lakh population
Village Stratum I 10 per lakh population
Village Strata II and III 3 per lakh population

A few exceptions in the above scheme were made in some of the states in order to ensure a certain
minimum sample size either at the stratum level or at the state level.

The allocation of sample size to the towns and villages selected within a stratum was made, as far as
possible, in proportion to their population.
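
The arithmetic of this step can be sketched as below. The rates per lakh are those in the table above; the reading that a rate is applied to the population of a stratum and the result is then shared among the selected settlements in proportion to population is an interpretation, and the populations used are hypothetical.

```python
LAKH = 100_000

# Sampling proportions from the table above (respondents per lakh of population).
RATE_PER_LAKH = {
    "Town I": 50, "Town II": 50, "Town III": 50,
    "Town IV": 40, "Town V": 40,
    "Village I": 10, "Village II": 3, "Village III": 3,
}

def stratum_sample_size(stratum, population):
    # Sample size implied by the stratum's rate for a given population.
    return round(RATE_PER_LAKH[stratum] * population / LAKH)

def allocate(total_sample, populations):
    # Proportional allocation; rounding may leave the parts a unit or two off the total.
    total_pop = sum(populations.values())
    return {name: round(total_sample * pop / total_pop)
            for name, pop in populations.items()}

# Hypothetical Town Stratum II with 12 lakh population and three selected towns.
n_stratum = stratum_sample_size("Town II", 1_200_000)
shares = allocate(n_stratum, {"X": 300_000, "Y": 200_000, "Z": 100_000})
print(n_stratum, shares)
```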

In all, a sample of 55,622 adults aged 15 and over was selected, spread over 254 towns and 704
villages. The distribution of towns and villages over various strata of towns and villages is given below:

                                    No. of Settlements    Sample Size

Towns
  Stratum I     Over 5 lakh                  13               9,731
  Stratum II    1-5 lakh                     50              10,788
  Stratum III   50,000-1 lakh                43               5,592
  Stratum IV    20,000-50,000                62               6,488
  Stratum V     Below 20,000                 86               7,218
  Sub-Total                                 254              39,817
Villages
  Stratum I     5,000 and above             106               4,389
  Stratum II    1,000-4,999                 293               5,951
  Stratum III   Below 1,000                 305               5,465
  Sub-Total                                 704              15,805
Total                                       958              55,622

The distribution of 55,622 adult interviewees over electors and non-electors is as follows:

                  North     East     West    South
                   Zone     Zone     Zone     Zone     Total
Urban
  Electors         4890     4337     5448     5531     20206
  Non-Electors     3930     4551     4961     6169     19611
  Sub-Total        8820     8888    10409    11700     39817
Rural
  Electors         2383     2068     2031     2851      9333
  Non-Electors     1430     1816     1180     2046      6472
  Sub-Total        3813     3884     3211     4897     15805
TOTAL
  Electors         7273     6405     7479     8382     29539
  Non-Electors     5360     6367     6141     8215     26083
  Total           12633    12772    13620    16597     55622

Statewise details of sample size are given in Tables 1 and 2 for urban and rural areas respectively.
List of towns selected with the sample is given in Appendix II. Districtwise data on number of villages
selected with the total sample size is given in Appendix III.

Note: Appendices II and III are not included in this supplement.



Table 1: Towns Selected for Readership Survey and Achieved Sample Size by States

(Stratum by population size of town; No. = number of towns, Size = sample size)

                      5 lakh       1-5 lakh     50,000-      20,000-      Below        Total
                      and over                  1 lakh       50,000       20,000
State                 No.   Size   No.   Size   No.   Size   No.   Size   No.   Size   No.   Size
North Zone
Delhi, Punjab etc.      1   1030     3    606     4    386     4    388     6    443    18   2853
Rajasthan              --     --     4    662     2    192     4    333     6    500    16   1687
Uttar Pradesh           3    960     6   1494     4    510     6    628     8    688    27   4280
Total                   4   1990    13   2762    10   1088    14   1349    20   1631    61   8820
East Zone
Assam                  --     --     2    332     2    102     2    102     4    115    10    651
Bihar                  --     --     4    774     2    215     4    358     4    365    14   1712
Orissa                 --     --     1    340     2    205     2    227     4    233     9   1005
West Bengal             2   2244     4    986     4   1045     4    825     4    420    18   5520
Total                   2   2244    11   2432    10   1567    12   1512    16   1133    51   8888
West Zone
Gujarat                 1    642     4    547     4    400     4    471     6    519    19   2579
Madhya Pradesh         --     --     4    848     2    208     4    356     8    610    18   2022
Maharashtra             3   2685     4   1042     4    442     6    706     8    933    25   5808
Total                   4   3327    12   2437    10   1050    14   1533    22   2062    62  10409
South Zone
Andhra Pradesh          1    761     4    741     2    298     6    579     8    667    21   3046
Kerala                 --     --     3    750     3    478     4    379     2    231    12   1838
Mysore                  1    539     3    634     2    415     4    387     8    706    18   2681
Tamil Nadu              1    870     4   1032     6    696     8    749    10    788    29   4135
Total                   3   2170    14   3157    13   1887    22   2094    28   2392    80  11700
All India              13   9731    50  10788    43   5592    62   6488    86   7218   254  39817

Table 2: Villages Selected for Readership Survey and Achieved Sample Size by States

(No. = number of villages, Size = sample size)

                      Stratum I     Stratum II    Stratum III   Total
                      Above 5,000   1,000-4,999   Below 1,000
State                  No.   Size    No.   Size    No.   Size    No.   Size
North Zone
Delhi, Punjab, etc.      4     85     14    360     16    307     34    752
Rajasthan                4    127     16    239     16    319     36    685
Uttar Pradesh            8    242     47   1007     48   1127    103   2376
Total                   16    454     77   1606     80   1753    173   3813
East Zone
Assam                    4    111      8    140     16    247     28    498
Bihar                   10    296     32    592     32    540     74   1428
Orissa                   4    119     16    166     24    344     44    629
West Bengal              6    250     17    594     23    485     46   1329
Total                   24    776     73   1492     95   1616    192   3884
West Zone
Gujarat                  4    124     16    317     16    230     36    671
Madhya Pradesh           4    106     16    229     40    675     60   1010
Maharashtra              8    360     16    630     24    540     48   1530
Total                   16    590     48   1176     80   1445    144   3211
South Zone
Andhra Pradesh          10    298     32    612     16    198     58   1108
Kerala                  20   1594     15    140      2     35     37   1769
Mysore                   4    148     16    352     16    287     36    787
Tamil Nadu              16    529     32    573     16    131     64   1233
Total                   50   2569     95   1677     50    651    195   4897
All-India              106   4389    293   5951    305   5465    704  15805

6. Estimation Procedure: In the first instance, estimates for towns and villages were built up taking
into account the probability of selection of different electors and non-electors within the town or
village.

The town/village estimates were then projected, using the appropriate number of steps, to different
town/village class levels in the state. The total of these estimates gave the estimates for the state.
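
The extract does not give the estimation formulae. One common way of "taking into account the probability of selection" is to weight each respondent by the inverse of that probability, and the sketch below assumes this approach purely for illustration. The selection probabilities and responses are invented; in the survey design described above, electors and non-electors within a town would generally carry different probabilities.

```python
def weighted_total(responses):
    """Estimate a population count from (value, selection_probability) pairs.

    value is 1 if the respondent reads the publication, else 0 (hypothetical coding).
    Each respondent stands for 1 / probability members of the population.
    """
    return sum(value / prob for value, prob in responses)

def weighted_proportion(responses):
    # Estimated readers divided by the estimated population represented by the sample.
    represented = sum(1 / prob for _, prob in responses)
    return weighted_total(responses) / represented

# Hypothetical town: two electors sampled at 1 in 100, three non-electors at 1 in 200.
sample = [(1, 0.01), (0, 0.01), (1, 0.005), (0, 0.005), (0, 0.005)]
print(weighted_total(sample), weighted_proportion(sample))
```

Summing such town and village estimates within each stratum, and then across strata, would mirror the projection to class and state levels described above, though the actual steps used in the survey are not stated in this extract.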
