Research Methods - Chapter 4

ACFN 3111 - Research Methods in Accounting & Finance
Chapter V
Sampling Design
The Concept of Sampling
Content of the lecture
 Sampling design
 Census and sample survey
 The need for sampling
 Steps in sampling design
 Criteria for selecting a sampling procedure
 Types of sample design
 Probability sampling design
 Non-probability sampling design
What is Sampling?
 Some studies involve only a small number of people, so all of
them can be included.
 But when the population is large, it is usually not possible
to undertake a census.
 Researchers are then forced to work with a smaller, more
manageable number of people to take part in their research.
 The selection of some cases from a larger group of
potential cases is called sampling.
 Sampling aims at obtaining consistent and unbiased estimates of
the population parameters.
 The best sample is a representative one: a small-scale model of
the population.
 A sample is a portion of a larger whole.
 Of course, no sample is perfect; it usually has some degree
of bias or error.
 But if the sample is chosen carefully using the correct
procedure, it is possible to generalize the results to the
whole population.
Reasons for Sampling
 Reduced cost: since data are collected from only a small fraction
of the population, costs are lower.
 Greater speed: for the same reason, sample surveys can be
completed and reported faster than a census.
 Greater scope and accuracy: since a sample deals with fewer
units than a complete census, it can use highly trained personnel,
careful supervision, specialized equipment, etc., and so attain
greater accuracy.
 Feasibility: some investigations can only be carried out by
sampling, for example:
 when studying infinite populations,
 in laboratory testing of a patient’s blood,
 when conducting quality assurance tests (especially when
the test involves the destruction of the product), etc.
 Representativeness
 There are two principles in representativeness:
the need to avoid bias and the need to gain maximum precision.
 Bias can arise, for instance:
 if the sample is selected by some non-random method;
 if the sampling frame (i.e., list, index, or population record)
does not adequately cover the target population.
 Representativeness is particularly important if you want to
generalize about the population.
 For quantitative studies:
 the sample should be drawn in such a way that it is
representative of the population.
 For qualitative studies:
 representativeness of the sample is NOT a primary concern;
 study units are selected because they give the richest
possible information;
 you go for INFORMATION-RICH cases!
Steps in Sampling Design
 The sampling Process
 Representative samples are generally obtained by
following a set of well-defined procedures:
1. Defining the target population
2. Choosing the sampling frame
3. Selecting the sampling method
4. Determining the sample size
5. Implementing the sampling plan
a) Identifying the relevant population:
 Determine the relevant population from which the sample
is going to be drawn.
 Example: if the study concerns income, then defining the
population as individuals or as households can make a
difference.
b) Determining the method of sampling:
 Decide whether a probability or a non-probability sampling
procedure is to be used.
c) Securing a sampling frame:
 A list of elements from which the sample is actually drawn
is necessary.
d) Identifying parameters of interest:
 Specify which population characteristics (variables and
attributes) are of interest.
e) Determining the sample size:
 The determination of the sample size depends on several
factors.
 Other things being equal, a bigger sample gives more precise
estimates.
Determining the sample size
 One of the questions researchers tend to ask is ‘how many
people should I speak to?’
 This is not an easy question.
 The decision on the sample size hinges on how large an error
one is willing to tolerate in estimating population parameters.
 Sample size is a very important issue in quantitative research.
 The appropriate sample size depends on the type of research.
 In the final decision, statistical precision must be
balanced against time, cost, and other practical
considerations.
 Research designs with too small a sample are unethical
because they waste resources: they can provide only
anecdotal evidence.
 If the sample size is too small, it will not be possible to make
valid generalizations.
 Research studies that use too large a sample (i.e., larger than
needed) are also unethical because:
 time and financial resources are wasted;
 more human subjects than necessary undergo experimental
procedures that could be distressing or painful.
 So it is unethical to start work on a research project that
needlessly burdens research subjects and consumes resources.
 The sample size will also depend on what you want to do with
your results.
 The general rule in quantitative research is that the larger the
sample, the better.
 However, you are probably restricted by time and money.
 Therefore, you have to make sure that you construct a
sample that is manageable.
 You also have to account for non-response, and you may
need to choose a larger sample to compensate.
 There will be some non-responders (people who do not
agree to take part in your research) and some whom
you may not be able to contact.
 Select at least 10% more people than the number of
responses you hope to obtain.
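As a rough illustration of the non-response allowance, the sketch below inflates a target number of responses. The function names, the target of 300 responses, and the 75% response rate are illustrative assumptions, not part of the lecture.

```python
import math

def oversample(target_responses, extra_pct=10):
    """Number of people to select when adding a fixed percentage allowance."""
    # Multiply by whole numbers first so the division stays exact where possible.
    return math.ceil(target_responses * (100 + extra_pct) / 100)

def invitations_needed(target_responses, expected_response_rate):
    """Alternative: divide by an assumed response rate (0 < rate <= 1)."""
    return math.ceil(target_responses / expected_response_rate)

print(oversample(300))                # 10% allowance on 300 hoped-for responses
print(invitations_needed(300, 0.75))  # assumed 75% response rate
```

The second function shows why 10% is only a minimum: if a much lower response rate is expected, the allowance has to be correspondingly larger.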
In general, the sample size depends on several factors.
i) Degree of homogeneity: the size of the population variance is the
most important parameter.
 The greater the dispersion in the population, the larger the
sample must be to provide a given estimation precision.
ii) Degree of confidence required: you must determine how much
precision you need.
 Precision is expressed in terms of:
 an interval range (the margin of error), and
 the degree of confidence.
 That is, the sample size is determined by the level of accuracy
required in the study.
iii) Number of subgroups to be studied:
 When the researcher is interested in making estimates
for various subgroups of the population, the sample must be
large enough for each of these subgroups to meet the desired
quality level.
iv) Cost: all studies have some budgetary constraint, and hence
cost dictates the size of the sample.
 Precision increases with the size of the sample, so researchers
are usually challenged to balance accuracy against cost.
v) Practicality: of course, the sample size you select must make
sense.
 Therefore the sample size is usually a compromise between
what is DESIRABLE and what is FEASIBLE.
 For researchers with limited time and resources, the sample
size is more likely to be influenced by the resources available
and the ease of access to the sampled cases.
 But the limitations of a smaller sample need to be reported
in the research report or dissertation.
vi) Other considerations:
 Prior information: if the process has been studied before, we
can use that prior information to determine our sample size.
 This can be done by using prior mean and variance
estimates and by stratifying the population to reduce
variation within groups.
 Note: if you know the mean and variance, statistical methods
can be used to determine the sample size required for a
given level of accuracy.
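One common way to turn a prior variance estimate into a sample size for estimating a mean is n = (z·σ/e)², where σ is the prior standard deviation and e the tolerated margin of error. The sketch below assumes this textbook formula; the function name and the values σ = 15 and e = 2 are made up for illustration.

```python
import math

def sample_size_for_mean(sigma, margin, z=1.96):
    """n = (z * sigma / margin)^2, rounded up to a whole unit."""
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical prior study: standard deviation 15, desired margin of error 2,
# 95% confidence (z = 1.96).
print(sample_size_for_mean(sigma=15, margin=2))
# Halving the margin of error roughly quadruples the required sample.
print(sample_size_for_mean(sigma=15, margin=1))
```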
vii) Rule of thumb: based on past experience with samples
that have met the requirements of the statistical methods.
 For small populations (under 1,000), a large sampling ratio
(about 30%) is needed; a sample size of about 300 is required.
 For moderately large populations (10,000), a smaller
sampling ratio (about 10%) is needed: a sample size of
around 1,000.
 For very large populations (over 10 million),
one can achieve accuracy using tiny sampling ratios
(0.025%), or samples of about 2,500.
viii) Using Cochran’s formula: you need to decide a few
things about the sample you need.
 Margin of error (confidence interval): no sample will be
perfect, so you need to decide how much error to allow.
 The margin of error determines how much higher or
lower than the population mean you are willing to let your
sample mean fall.
 It will look something like this: “68% of voters said yes to
Proposition Z, with a margin of error of +/- 5%.”
 Confidence level: how confident do you want to be that the
actual mean falls within your confidence interval?
 The most common confidence levels are 90%, 95%, and 99%.
 Standard deviation: how much variance do you expect in
your responses?
 The safe decision is to use 0.5; this is the most commonly used
value, and it ensures that your sample will be large enough.
 Cochran (1963) developed the following formula to
determine the sample size for an infinite population when we are
to estimate a proportion in the universe:
$$n = \frac{Z^2 p q}{e^2}$$
 where n is the sample size; Z is the z-score for the chosen
confidence level (1.96 for 95%); p is the estimated proportion of
the attribute in the population (0.5 is commonly used because it
gives the largest sample size); q = 1 - p; and e is the level of
precision (a margin of error of +/- 5% at a 95% confidence
level is commonly used).
 for finite population:……………….
Example 1: What should be the size of the sample if a simple random sample
from a population of 4000 items is to be drawn to estimate the per cent
defective within 2 per cent of the true value with 95.5 per cent
probability? What would be the size of the sample if the population is
assumed to be infinite in the given case?
Solution: In the given question we have the following:
N = 4000;
e = .02 (since the estimate should be within 2% of true value);
z = 2.005 (as per table of area under normal curve for the given confidence
level of 95.5%).
As we have not been given the p value being the proportion of defectives in
the universe, let us assume it to be p = .02 (This may be on the basis of our
experience or on the basis of past data or may be the result of a pilot study).
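The worked solution is not reproduced on the slide. A minimal sketch of the calculation follows, assuming the infinite-population formula n = z²pq/e² and the standard finite-population correction n = n0·N/(N + n0 - 1); the function names are illustrative, and the correction is assumed to be the one the lecture intended.

```python
import math

def cochran_infinite(z, p, e):
    """Sample size for a proportion, infinite population: z^2 * p * q / e^2."""
    return z**2 * p * (1 - p) / e**2

def cochran_finite(z, p, e, N):
    """Finite-population correction applied to the infinite-population size."""
    n0 = cochran_infinite(z, p, e)
    return n0 * N / (N + n0 - 1)

# Values from Example 1: z = 2.005, p = 0.02, e = 0.02, N = 4000.
n_inf = cochran_infinite(z=2.005, p=0.02, e=0.02)
n_fin = cochran_finite(z=2.005, p=0.02, e=0.02, N=4000)
print(round(n_inf, 2), math.ceil(n_inf))  # infinite population
print(round(n_fin, 2), math.ceil(n_fin))  # population of 4000 items
```

Under these assumptions the required sample is about 197 items for an infinite population and about 188 items for the population of 4000.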
Example 2: Suppose a certain hotel management is interested in
determining the percentage of the hotel’s guests who stay for more than
3 days. The reservation manager wants to be 95 per cent confident that
the percentage has been estimated to be within ± 3% of the true value.
What is the most conservative sample size needed for this problem?
Solution: We have been given the following: Population is infinite;
e = .03 (since the estimate should be within 3% of the true value);
z = 1.96 (as per table of area under normal curve for the given
confidence level of 95%). As we want the most conservative sample size
we shall take the value of p = .5 and q = .5. Using all this information, we
can determine the sample size for the given problem as under:
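The calculation itself is missing from the slide. A sketch, assuming the infinite-population formula n = z²pq/e² with the values given in the example (the function name is illustrative):

```python
import math

def sample_size_proportion(z, p, e):
    """Infinite-population sample size for estimating a proportion."""
    q = 1 - p
    return z**2 * p * q / e**2

# Values from Example 2: z = 1.96, p = q = 0.5, e = 0.03.
n = sample_size_proportion(z=1.96, p=0.5, e=0.03)
print(round(n, 2))   # unrounded value
print(math.ceil(n))  # rounded up so the precision target is still met
```

The unrounded result is roughly 1,067 guests; rounding up to 1,068 is the cautious choice.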
Sample Size in Qualitative Studies
 Sample size is not a big issue in qualitative research, since the
emphasis is on obtaining new information to gain a deep
understanding of a phenomenon.
 That is, there are no fixed rules for sample size in qualitative
research.
 The sample size depends on WHAT you try to find out, and
from which informants or perspectives you try to
find it out.
 The sample size is therefore estimated, but not
determined.
Types of sampling
Probability and non-probability sampling
 We can arrange the different approaches to sampling
along a spectrum.
 At one end of the spectrum are the sampling approaches
that are based on statistical theory.
 They aim to produce a sample that is highly
representative of the whole population.
 These are called probability samples.
 At the other end of the spectrum are approaches to sampling
that are concerned with selecting cases that will enable the
researcher to explore the research questions in depth.
 These are called purposive or non-probability samples.
Probability sampling
 Probability sampling is based on the concept of random
selection, which assures that each population element is given a
known, non-zero chance of being selected.
 That is, the sample is drawn in such a way that the probability
of being chosen is known (for example, a random sample).
 A randomization process is used in order to reduce or
eliminate sampling bias, so that the sample is
representative of the population from which it is drawn.
 In other words, random selection minimizes human bias.
 Probability sampling requires a sampling frame (a listing of all
study units).
 Only probability sampling provides a statistical basis for saying
that a sample is representative of the target population.
 A sample will tend to be representative of the population from
which it is drawn if all members of the population have an
equal chance of being included in the sample.
 In summary:
 probability samples are more likely to be representative than
any other type of sample;
 sampling errors can be calculated only for probability
samples;
 probability samples rely on a random process, i.e., the selection
is made by a truly random method;
 since each element has a known, non-zero probability of
being selected, it is possible to get consistent and unbiased
estimates of the population parameters.
 Several types of probability sampling methods could be
identified:
 Simple Random Sampling Technique
 Systematic sampling Technique
 Stratified Sampling Technique
 Cluster Sampling Technique.
 Hybrid Sampling
1. Simple Random Sampling (SRS): each element of the population
has the same chance of being included in the sample as every
other element.
 This is the simplest and easiest method of probability sampling.
 It assumes that an accurate sampling frame exists.
 Selection can be done either by using a table of random
numbers or by the lottery method.
 E.g., simple random sampling for household surveys
 Population = all households in the country
 Sampling frame = the list of all households (20 million in
Ethiopia?)
 Sample size = say we have resources to cover only 20,000
households
 Sampling fraction 20,000/20,000,000 or 0.1%
 Select randomly 20,000 households from the long list of
20,000,000 households
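The draw in this example can be sketched with a pseudo-random number generator in place of a random number table. The numbering of households from 1 to N and the fixed seed are assumptions made only for illustration.

```python
import random

N = 20_000_000  # households in the sampling frame (figure from the example)
n = 20_000      # households we can afford to survey

rng = random.Random(2024)  # fixed seed only so the sketch is reproducible
# Household IDs are assumed to run from 1 to N; sample() draws without
# replacement, so every household has the same chance of selection.
selected = rng.sample(range(1, N + 1), n)

print(len(selected), n / N)  # sample size and sampling fraction
```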
 Merits of SRS
 No investigator bias or discretion
 Helps to obtain a more representative sample
 Can produce better estimates for the population
 Limitations
 It needs an up-to-date list of the population units
 Units selected might be scattered geographically, hence a
high cost of data collection
2. Systematic Sampling Technique
 In systematic sampling, individuals are chosen at regular
intervals (for example, every kth) from the sampling frame.
 The first item is selected randomly, and the remaining
units are then selected automatically according to a
predetermined pattern.
 A sampling interval (the standard distance between the
elements selected in the sample) is identified.
 Simplicity and flexibility are its major advantages.
 Steps to draw a systematic sample:
 Calculate K, the sampling interval (population size N divided
by sample size n).
 Select a number between 1 and K at random; say that
number is r.
 The rth element is then the first element of the sample.
 The rth, (r+K)th, (r+2K)th, …, [r+(n-1)K]th elements of
the population are then selected into the sample.
 E.g., a systematic sample is to be selected from the 1,200
students of a school.
 The sample size to be selected is 100.
 The sampling fraction is: sample size / study
population = 100/1200 = 1/12.
 The sampling interval is therefore 12.
 The first student in the sample is chosen randomly, for
example by blindly picking one out of twelve pieces of paper
numbered 1 to 12.
 If number 6 is picked, every twelfth student from there on is
included, i.e., students 6, 18, 30, 42, etc.
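The school example can be sketched as follows. The function name is illustrative, and the sketch assumes the population size is an exact multiple of the sample size, as it is here (1200 = 100 × 12).

```python
import random

def systematic_sample(N, n, start=None, rng=random):
    """Return 1-based positions r, r+K, ..., r+(n-1)K with K = N // n."""
    K = N // n  # sampling interval
    # Random start between 1 and K unless a start is supplied explicitly.
    r = start if start is not None else rng.randint(1, K)
    return [r + i * K for i in range(n)]

# Start fixed at 6 to reproduce the example; omit `start` for a random draw.
sample = systematic_sample(N=1200, n=100, start=6)
print(sample[:4], sample[-1], len(sample))
```

With the start fixed at 6 the first positions are 6, 18, 30, 42, and the last of the 100 selected students is number 1194.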
3. Stratified Sampling
 Useful when we have heterogeneous populations.
 One disadvantage of SRS is that small groups in which the
researcher is interested may not appear in the sample.
 Stratified sampling divides the population into
appropriate strata, and a sample is then taken from each stratum
using either simple random sampling (SRS) or systematic
sampling (SS).
 The elements in a stratum are supposed to be homogeneous
with respect to the given characteristics.
The reasons for stratifying
1. To increase a sample’s statistical efficiency (smaller
standard errors, i.e., less variation).
2. To provide adequate data for analyzing the various
subpopulations.
3. To enable different research methods and procedures to be
used in different strata.
4. The absence or poor quality of a sampling frame makes it
necessary to first select a sample of geographical units, and
then to construct a sampling frame only within those
selected units.
 Multiple-stage stratified random sampling could also be
considered.
 E.g., in the household survey we may want a
sufficient number of households from each region of
Ethiopia;
 so stratify by region!
How to Stratify
 Three major decisions must be made in order to stratify the
given population into mutually exclusive groups.
 (1) What stratification base to use: stratification is based
on the principal variable under study, such as income, age,
education, sex, location, religion, etc.
 (2) How many strata to use: there is no precise answer as to
how many strata to use.
 The more strata, the closer one comes to
maximizing inter-strata differences and minimizing
intra-strata variances.
(3) What strata sample size to draw: different approaches can be
used:
 One could adopt a proportionate sampling procedure: the
number of units selected from each stratum is proportional to
the stratum’s share of the total population.
 Or one could adopt disproportionate sampling, where the
number of items selected from each stratum is not proportional
to the stratum’s share of the population.
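Proportionate allocation can be sketched as below. The three strata, their sizes, and the total sample of 500 are hypothetical, and simple rounding is used, which in general may leave the allocations one or two units away from the intended total.

```python
import random

def proportionate_allocation(strata_sizes, n):
    """Allocate n across strata in proportion to stratum size (simple rounding)."""
    N = sum(strata_sizes.values())
    return {name: round(n * size / N) for name, size in strata_sizes.items()}

# Hypothetical strata (e.g., three regions) and a total sample of 500.
strata = {"Region A": 5000, "Region B": 3000, "Region C": 2000}
allocation = proportionate_allocation(strata, n=500)

rng = random.Random(1)  # fixed seed only for reproducibility
# Simple random sample of unit indices within each stratum.
sample = {name: rng.sample(range(strata[name]), k) for name, k in allocation.items()}
print(allocation)
```

A disproportionate design would simply replace the computed allocation with sizes chosen by the researcher, for example to guarantee enough cases in a small stratum.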
4. Cluster Sampling:
 It may be difficult or impossible to take a simple random
sample because a complete sampling frame does not exist, or
because of logistical difficulties.
 E.g., interviewing people who are scattered over a
large area may be too time-consuming.
 The selection of groups of study units (clusters) instead of the
selection of study units individually is called CLUSTER
SAMPLING.
 It is cost-effective (high economic efficiency).
 As in stratified sampling, you need to divide the
population into discrete groups.
 For instance, if the total area of interest is large, it can be
divided into a number of smaller non-overlapping areas
(clusters), and some of the clusters are then selected randomly.
 Clusters are often geographic units (e.g., districts, villages)
or organizational units (e.g., clinics).
 The primary sampling unit is not the individual units of the
population but groups within the population (clusters).
 E.g., sampling for household survey in Addis Ababa

 Probably no complete sampling frame and costly to cover
through simple random sampling Procedures
 Randomly select sub-cities (clusters)
 Randomly select kebeles from selected sub-cities (clusters)
 Then randomly select households from the selected

kebeles
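The three selection stages can be sketched as follows. The frame is entirely hypothetical (10 sub-cities, 8 kebeles each, 50 households per kebele), as are the numbers selected at each stage.

```python
import random

rng = random.Random(7)  # fixed seed only for reproducibility

# Hypothetical frame: 10 sub-cities, each with 8 kebeles of 50 households.
frame = {
    f"subcity_{s}": {
        f"kebele_{s}_{k}": [f"hh_{s}_{k}_{h}" for h in range(50)]
        for k in range(8)
    }
    for s in range(10)
}

households = []
for subcity in rng.sample(sorted(frame), 2):                 # stage 1: sub-cities
    for kebele in rng.sample(sorted(frame[subcity]), 2):     # stage 2: kebeles
        households += rng.sample(frame[subcity][kebele], 5)  # stage 3: households

print(len(households))
```

In practice only the selected kebeles would need a household list, which is the main saving over simple random sampling.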
Non-probability sampling
 Non-probability sampling: selection is non-random, i.e., sampling
units/elements do not have a known chance of being selected.
 Non-probability samples are chosen based on judgment
regarding the characteristics of the target population and the
needs of the survey.
 Sometimes a probability sample is infeasible.
 Example: if we want to conduct a lengthy experiment using
human subjects, we use whoever is willing to participate.
 E.g., a study of drug users (hearsay, criminal records).
 Three conditions for using non-probability sampling:
 First, if there is no desire to generalize to a population
parameter.
 Secondly, because of cost and time requirements:
 probability sampling could be prohibitively expensive,
since it requires more planning and repeated callbacks.
 Thirdly, probability sampling may break down in its
application:
 the total population may not be available.
(1) Convenience or accidental sampling: the method selects anyone
who is convenient.
 Units that are convenient for the investigator are selected
(e.g., volunteers).
 It can produce highly unrepresentative samples.
 Such samples are cheap, but they are biased and full of
systematic errors.
 E.g., the person-on-the-street interviews conducted by
television programs,
 or dropping by the cafeteria and asking questions of whoever
is there.
(2) Quota sampling: subgroups are identified, and a specified
number of individuals from each group are included in the
research, based on certain criteria.
 Identify categories of people (e.g., male, female), then
decide how many to get from each category.
 Example: in a school, find 10 elementary, 10
middle school, and 10 high school teachers.
 It is used by opinion pollsters, in marketing research, and in
other similar research areas.
 No randomization, so it is difficult to know the sampling error.
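The teacher example can be sketched as filling quotas in the order respondents happen to be met. The respondent stream and all names below are hypothetical; note that nothing in the procedure is random, which is why no sampling error can be attached to the result.

```python
def quota_sample(respondents, quotas):
    """Accept respondents in arrival order until every category quota is full."""
    remaining = dict(quotas)
    chosen = []
    for person, category in respondents:
        if remaining.get(category, 0) > 0:
            chosen.append(person)
            remaining[category] -= 1
        if not any(remaining.values()):  # all quotas filled
            break
    return chosen

# Hypothetical stream of teachers in the order the researcher meets them.
levels = ["elementary", "middle", "high"]
stream = [(f"teacher_{i}", levels[i % 3]) for i in range(60)]
picked = quota_sample(stream, {"elementary": 10, "middle": 10, "high": 10})
print(len(picked))
```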
(3) Purposive or judgment sampling
 Selecting a limited number of informants strategically, so that
their in-depth information will give optimal insight into an
issue, is known as purposive sampling.
 It uses the judgment of an expert in selecting cases.
 Participants are selected because of some desirable
characteristics, such as expertise in the area.
 It can be useful when used by a skilled investigator.
(4) Snowball (network or chain) sampling
 This is a method for identifying and selecting the cases in a
network.
 You start with one or two information-rich key
informants and ask them if they know persons who know
a lot about your topic of interest.
 Contact the first few and ask them for names of others, and so on.
 Useful when there is no sampling frame.
 E.g., Becker’s (1963) study of marijuana users,
 or studies of illegal migrants, sex workers, drug users, etc.
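The referral chain can be sketched as a breadth-first walk over who names whom. The referral network, the seed informant, and the function name are hypothetical.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Follow referrals breadth-first from the seed informants."""
    chosen, queue = [], deque(seeds)
    seen = set(seeds)
    while queue and len(chosen) < max_size:
        person = queue.popleft()
        chosen.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # never recruit the same person twice
                seen.add(contact)
                queue.append(contact)
    return chosen

# Hypothetical referral network: each informant and the people they name.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "E": ["F"]}
print(snowball_sample(referrals, seeds=["A"], max_size=5))
```

The sample can only reach people connected to the seeds, which is one reason snowball samples cannot be assumed representative.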
Problems in Sampling
Sampling errors occur randomly and are equally likely to be in either direction.
The magnitude of the sampling error depends upon the nature of the universe; the
more homogeneous the universe, the smaller the sampling error. Sampling error
is inversely related to the size of the sample i.e., sampling error decreases as the
sample size increases and vice-versa.
 Survey error: the discrepancy between a survey estimate and
reality (the true value) is called survey error.
 There are two types of survey error:
 sampling errors and non-sampling errors;
 the total survey error is the sum of the two.
 Sampling errors are errors that are attributable to sampling
and that therefore are not present in information gathered in
a census.
 A sampling error is not a mistake.
 It can be controlled by well-developed sampling theory.
 It arises because it is unlikely that one will end up with a
truly representative sample, even when probability sampling is
employed.
 Sampling errors can be calculated only for probability
samples.
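 Sampling error is related to confidence intervals.
 A narrower confidence interval means more precise estimates
of the population for a given level of confidence.
 The confidence interval for the true population mean is given
by:
$$\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$$
 The sampling error is given by:
$$z \frac{\sigma}{\sqrt{n}}$$
 where $\bar{x}$ is the sample mean, $\sigma$ the standard deviation,
n the sample size, and z the z-score for the chosen confidence level.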
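A small numeric sketch of the relationship between sample size and sampling error, assuming the usual formula z·σ/√n for the margin of error of a sample mean. The sample mean of 50 and standard deviation of 10 are made-up values.

```python
import math

def sampling_error(sigma, n, z=1.96):
    """Margin of error for a sample mean: z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)

def confidence_interval(mean, sigma, n, z=1.96):
    """Interval mean +/- sampling error at the confidence level implied by z."""
    e = sampling_error(sigma, n, z)
    return mean - e, mean + e

# Hypothetical values: sample mean 50, standard deviation 10, 95% confidence.
for n in (100, 400):
    print(n, round(sampling_error(10, n), 2), confidence_interval(50, 10, n))
```

Quadrupling the sample from 100 to 400 halves the sampling error, which illustrates why precision gains become expensive as samples grow.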
 Non-sampling error: such errors are present whether we are
dealing with a sample survey or a census.
 They include all errors apart from sampling error and are
mostly mistakes by one party or another.
Non-sampling error includes:
 Non-coverage error
 Sampling the wrong population
 Non-response error
 Instrument error
 Interviewer error
Non-coverage error: this refers to a defect in the sampling frame.
 Part of the target population is omitted (e.g., soldiers,
students living on campus, people in hospitals, prisoners,
households without a telephone in telephone surveys, etc.).
The wrong population is sampled
 Researchers must always be sure that the group being
sampled is drawn from the population they want to
generalize about (the intended population).
Non-response error (common in self-administered surveys)
 This error occurs when you are not able to reach, or obtain
answers from, those whom you were supposed to study.
 Some people cannot or will not be interviewed because they are
ill, are too busy, or simply do not trust the interviewer.
 When one is forced to interview substitutes, an unknown
bias is introduced.
Instrument error
 The instrument in a sample survey is the device with which
we collect data, usually a questionnaire.
 When a question is badly asked or worded, the resulting
error is called instrument error.
 Example: leading questions or carelessly worded
questions may be misinterpreted by some respondents.
Interviewer error
 Enumerators can distort the results of a survey by
inappropriate suggestions, word emphasis, tone of voice, and
question rephrasing.
 Cheating by enumerators can occur, especially with only limited
training and under little direct supervision.
 Perceived social distance between enumerator and
respondent also has a distorting effect.
 E.g., questions about sexual behavior might be
answered differently depending on the gender of the
interviewer.
Characteristics of a Good Sample Design: Summary
 The sample design must result in a truly representative sample.
 The sample design must result in a small sampling error.
 The sample design must be viable in the context of the funds
available for the research study.
 The sample design must allow systematic bias to be controlled.
 The sample should be such that the results of the sample study
can be applied, in general, to the universe with a reasonable
level of confidence.