
ACFN 3111-

Research Methods in Accounting & Finance

Chapter V
Sampling Design

1
The Concept of Sampling

2
Content of the lecture
 Sampling Design
 Census and sample survey
 The Need for sampling
 Steps in sampling Design
 Criteria for selecting a sampling procedure
 Types of sample Design
 Probability sampling Design
 Non probability sampling Design

3
What is Sampling?
 Some studies involve only a small number of people, so all
of them can be included.
 But when the population is large, it is usually not possible
to undertake a census.
 The researcher is therefore forced to work with a smaller, more
manageable number of people.
 The selection of some cases from a larger group of
potential cases is called sampling.
 Sampling aims at obtaining consistent and unbiased estimates of
the population parameters.

4
What is Sampling?
 The best sample is a representative sample, i.e. a model of
the population.
 A sample is a portion of a larger whole.
 Of course, no sample is perfect; it usually has some degree
of bias or error.
 But if the sample is chosen carefully using the correct
procedure, it is possible to generalize the results to the
whole population.

5
What is Sampling?

6
Reasons for Sampling
 Reduced cost: since data are secured from a small fraction of
the population, cost will be reduced.
 Greater speed: for the same reason, sample surveys
can be reported faster than a census.
 Greater scope and accuracy: since a sample deals with fewer
units than a complete census, it is possible to attain greater
accuracy.
 Fewer units make it feasible to use highly trained personnel,
careful supervision, specialized equipment, etc.

7
Reasons for Sampling
 Feasibility: some investigations can only be addressed by
sample surveys, for example:
 when studying infinite populations,
 laboratory testing of one’s blood,
 when conducting quality assurance tests (especially when
the test involves the destruction of the product), etc.

8
Reasons for Sampling
 Representativeness
 There are two principles in representativeness:
 The need to avoid bias and the need to gain maximum
precision.
 Bias can arise, for instance:
 if the selection of the sample is done by some non-random
method
 if the sampling frame (i.e. list, index, population record)
does not adequately cover the target population.

9
Reasons for Sampling
 Representativeness is particularly important if you want to
make generalizations about the population.
 So, for Quantitative Studies:
 Samples should be drawn in such a way that they are
representative of the population.
 For Qualitative Studies:
 Representativeness of the sample is NOT a primary concern.
 In qualitative studies we select the study units which give
the richest possible information.
 You go for INFORMATION-RICH cases!

10
Steps in Sampling Design
 The sampling Process
 Representative samples are generally obtained by
following a set of well-defined procedures:
1. Defining the target population
2. Choosing the sampling frame
3. Selecting the sampling method
4. Determining the sample size
5. Implementing the sampling plan

11
Steps in Sampling Design
a) Identifying the relevant population:
 Determine the relevant population from which the sample
is going to be drawn.
 Example: if the study concerns income, then the definition
of the population as individuals or households can make a
difference.

b) Determining the method of sampling:
 Decide whether a probability or a non-probability sampling
procedure is to be used.

12
Steps in Sampling Design
c) Securing a sampling frame:
 A list of elements from which the sample is actually drawn
is important and necessary.

d) Identifying parameters of interest:
 Decide what specific population characteristics (variables and
attributes) are of interest.

e) Determining the sample size:
 The determination of the sample size depends on several
factors.
 Other things being equal, a bigger sample gives more precise
estimates.

13
Determining the sample size
 One of the questions researchers tend to ask is ‘how many
people should I speak to?’
 This is not an easy question.
 The decision on the sample size hinges on how large an error
one is willing to tolerate in estimating population parameters.
 The issue of sample size is very important in
quantitative research.

14
Determining the sample size
 The sample size obviously depends on the type of research.
 But in the final decision, statistical precision must be
balanced against time, cost, and other practical
considerations.
 Research designs with too small a sample size are unethical
because they waste resources: they can only provide
anecdotal evidence.
 If the sample size is too small, it will not be possible to make
valid generalizations.
15
Determining the sample size
 Research studies that use too large a sample, i.e. larger than
needed, are also unethical because
 time and financial resources are wasted,
 human subjects undergo experimental procedures that
could be distressing or painful.
 So, it is unethical to start work on a research project which
burdens research subjects and consumes resources more than
is necessary.
 The sample size will also depend on what you want to do with
your results.
16
Determining the sample size
 The general rule in quantitative research is that the larger the
sample the better it is.
 However, you have to remember that you are probably
restricted by time and money.
 Therefore, you have to make sure that you construct a
sample which will be manageable.

17
Determining the sample size
 Also, you have to account for non-response, and you may
need to choose a larger sample size to overcome this
problem.
 There will be some non-responders (people who do not
agree to take part in your research) and some whom
you may not be able to contact.
 Select at least 10% more people than the number of
responses you hope to obtain.
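
The 10% allowance for non-response can be sketched as a small calculation. This is only an illustration: the function name `inflate_for_nonresponse` and the target of 300 responses are assumptions, not figures from the lecture.

```python
import math

def inflate_for_nonresponse(target_responses, extra_share=0.10):
    """Number of people to select so that the selection is at least
    `extra_share` larger than the number of responses hoped for."""
    # Work in whole percentage points to limit floating-point surprises.
    extra_percent = round(extra_share * 100)
    return math.ceil(target_responses * (100 + extra_percent) / 100)

# Hypothetical target: 300 completed responses.
print(inflate_for_nonresponse(300))
```

If past surveys suggest a higher non-response rate, a larger `extra_share` would be the natural adjustment.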

18
Determining the sample size
In general, the sample size depends on several factors.
i) Degree of homogeneity: the size of the population variance is the
most important parameter.
 The greater the dispersion in the population, the larger the
sample must be to provide a given estimation precision.

ii) Degree of confidence required: you must determine how much
precision you need.
 Precision is measured in terms of:
 an interval range;
 the degree of confidence.
 i.e. the sample size is determined by the level of accuracy
required in the study.
19
Determining the sample size
iii) Number of subgroups to be studied:
 When the researcher is interested in making estimates
concerning various subgroups of the population, the
sample must be large enough for each of these subgroups
to meet the desired quality level.

iv) Cost: all studies have some budgetary constraint, and hence
cost dictates the size of the sample.
 The level of precision increases with the size of the sample,
so researchers are usually challenged to balance accuracy
and cost.

20
Determining the sample size
v) Practicality: of course, the sample size you select must make
sense.
 The sample size is therefore usually a compromise between
what is DESIRABLE and what is FEASIBLE.
 For researchers with limited time and resources, the sample
size is more likely to be influenced by the resources available
and the ease of access to the sampled cases.
 But the limitations of a smaller sample need to be reported
in the research report or dissertation.

21
Determining the sample size
VI) Other Considerations:
 (i) Prior information: if our process has been studied before, we
can use that prior information to determine our sample size.
 This can be done by using prior mean and variance
estimates and by stratifying the population to reduce
variation within groups.
 Note: if you know the mean and variance, statistical methods
can be used to determine the size of the sample required for a
given level of accuracy.
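
One standard textbook form of such a calculation, given here as a sketch rather than as the lecture's own formula, applies when a mean is estimated by simple random sampling from a large population. The symbols are assumptions for illustration: $\sigma$ is the prior estimate of the population standard deviation, $e$ is the largest acceptable error in the estimated mean, and $z$ is the z-score for the chosen confidence level.

```latex
n = \left( \frac{z \, \sigma}{e} \right)^{2}
```

The formula makes the earlier points concrete: a more dispersed population (larger $\sigma$) or a tighter tolerance (smaller $e$) both call for a larger sample.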

22
Determining the sample size
 (VII) Rule of Thumb: based on past experience with samples
that have met the requirements of the statistical methods.
 For small populations (under 1,000), a large sampling ratio
(about 30%) is needed, i.e. a sample size of about 300.
 For moderately large populations (10,000), a smaller
sampling ratio (about 10%) is needed, i.e. a sample size of
around 1,000.
 For very large populations (over 10 million), one can
achieve accuracy using tiny sampling ratios (0.025%), i.e.
samples of about 2,500.

23
Determining the sample size
 (VIII) Using Cochran’s Formula: you need to determine a few
things about the sample you need.
 Margin of error (confidence interval): no sample will be
perfect, so you need to decide how much error to allow.
 The margin of error determines how much higher or
lower than the population mean you are willing to let your
sample mean fall.
 It will look something like this: “68% of voters said yes to
proposition Z, with a margin of error of +/- 5%.”

24
Determining the sample size
 Confidence level: how confident do you want to be that the
actual mean falls within your confidence interval?
 The most common confidence levels are 90%, 95% and 99%.
 Standard deviation: how much variance do you expect in
your responses?
 The safe decision is to use 0.5: this is the most commonly used
number and ensures that your sample will be large enough
(for a proportion, 0.5 gives the largest possible variance).

25
Determining the sample size
 Cochran (1963) developed the following formula to
determine the sample size for an infinite population when we
are to estimate a proportion in the universe:
$n = \frac{Z^2 p q}{e^2}$
 where n represents the sample size, Z is the z-score for the
chosen confidence level (1.96 for 95%), p is the estimated
proportion in the population (commonly set to 0.5, which gives
maximum variability), q = 1 - p, and e is the level of precision
(a commonly used margin of error is +/- 5% at a confidence
level of 95%).
 for finite population:……………….
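
The finite-population formula is not reproduced on the slide. A commonly used textbook form is given below, as an assumption about what was intended rather than as the slide's own content. It writes $n_0 = Z^2 p q / e^2$ for the infinite-population sample size and $N$ for the population size, then applies the finite population correction:

```latex
n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}}
  = \frac{Z^{2} \, p \, q \, N}{e^{2}(N - 1) + Z^{2} \, p \, q}
```

The two expressions are algebraically identical. As $N$ grows, the correction vanishes and $n$ approaches $n_0$.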

26
Determining the sample size
Example 1: What should be the size of the sample if a simple random sample
from a population of 4000 items is to be drawn to estimate the per cent
defective within 2 per cent of the true value with 95.5 per cent
probability? What would be the size of the sample if the population is
assumed to be infinite in the given case?
Solution: In the given question we have the following:
N = 4000;
e = .02 (since the estimate should be within 2% of true value);
z = 2.005 (as per table of area under normal curve for the given confidence
level of 95.5%).
As we have not been given the p value being the proportion of defectives in
the universe, let us assume it to be p = .02 (This may be on the basis of our
experience or on the basis of past data or may be the result of a pilot study).
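
The worked solution is not reproduced in the text, so the following is a minimal sketch of how the numbers in Example 1 could be evaluated. It assumes the standard Cochran proportion formula, with a finite population correction when N is given. The function name `cochran_sample_size` is hypothetical.

```python
import math

def cochran_sample_size(z, p, e, population=None):
    """Sample size for estimating a proportion.

    z: z-score for the confidence level, p: assumed proportion,
    e: acceptable error, population: N for a finite population
    (None means the population is treated as infinite).
    """
    q = 1.0 - p
    n0 = (z ** 2) * p * q / (e ** 2)      # infinite-population size
    if population is None:
        return n0
    # Finite population correction (assumed standard form).
    return n0 * population / (population + n0 - 1)

# Values taken from Example 1: N = 4000, e = 0.02, z = 2.005, p = 0.02.
finite_n = cochran_sample_size(z=2.005, p=0.02, e=0.02, population=4000)
infinite_n = cochran_sample_size(z=2.005, p=0.02, e=0.02)

# Round up, since a fraction of an item cannot be sampled.
print(math.ceil(finite_n), math.ceil(infinite_n))
```

Under these assumptions the finite-population case needs about 188 items and the infinite-population case about 197.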

27
Determining the sample size

28
Determining the sample size
Example 2: Suppose a certain hotel management is interested in
determining the percentage of the hotel’s guests who stay for more than
3 days. The reservation manager wants to be 95 per cent confident that
the percentage has been estimated to be within ± 3% of the true value.
What is the most conservative sample size needed for this problem?
Solution: We have been given the following: Population is infinite;
e = .03 (since the estimate should be within 3% of the true value);
z = 1.96 (as per table of area under normal curve for the given
confidence level of 95%). As we want the most conservative sample size
we shall take the value of p = .5 and q = .5. Using all this information, we
can determine the sample size for the given problem as under:
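
The computation itself is not reproduced in the text. Assuming the infinite-population proportion formula $n = z^2 p q / e^2$ with the values stated above, the arithmetic would run as follows:

```latex
n = \frac{z^{2} \, p \, q}{e^{2}}
  = \frac{(1.96)^{2} (0.5)(0.5)}{(0.03)^{2}}
  = \frac{0.9604}{0.0009}
  \approx 1067.11
```

So roughly 1,067 to 1,068 guests would be needed, depending on whether the result is rounded to the nearest whole number or rounded up.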

29
Determining the sample size
Sample Size in Qualitative Studies
 Sample size is not a big issue in qualitative research, since the
emphasis is on obtaining new information to gain a deep
understanding of a phenomenon.
 i.e., there are no fixed rules for sample size in qualitative
research.
 The sample size depends on WHAT you try to find out, and
from which informants or perspectives you try to
find that out.
 The sample size is therefore estimated, but not
determined.
30
Types of sampling

31
Probability and non-probability sampling
 We can arrange the different approaches to the sampling
process in a spectrum.
 At one end of the spectrum are the sampling approaches
that are based on statistical theory.
 They aim to produce a sample that can be highly
representative of the whole population.
 These are called probability samples.

32
Probability and non-probability sampling
 At the other end of the spectrum are approaches to sampling
that are concerned with selecting cases that will enable the
researcher to explore the research questions in depth.
 These are called purposive or non-probability samples.

33
Probability sampling
 Probability sampling is based on the concept of random
selection, which assures that each population element is given a
known non-zero chance of being selected.
 i.e., a sample drawn in such a way that the probability of being
chosen is known, for example a random sample.
 A randomization process is used in order to reduce or
eliminate sampling bias so that the sample is
representative of the population from which it is drawn.
 In other words, random selection minimizes human
bias.
34
Probability sampling
 Probability sampling requires a sampling frame (a listing of all
study units).
 Only probability sampling provides a statistical basis for saying
that a sample is representative of the target population.
 And a sample will be representative of the population from
which it is drawn if all members of the population have an
equal chance of being included in the sample.

35
Probability sampling
 In summary:
 Probability samples tend to be more representative than other
types of sample.
 Sampling errors can be calculated only for probability
samples.
 Probability samples rely on a random process, i.e. the selection
process operates in a truly random manner.
 Since each element has a known (in simple random sampling,
equal) chance of being selected, it is possible to get consistent
and unbiased estimates of the population parameters.

36
Probability sampling
 Several types of probability sampling methods could be
identified:
 Simple Random Sampling Technique
 Systematic sampling Technique
 Stratified Sampling Technique
 Cluster Sampling Technique.
 Hybrid Sampling

37
Probability sampling
1. Simple Random Sampling (SRS): each element of the population
has the same chance of being included in the sample as every
other element.
 This is the simplest and easiest method of probability sampling.
 It assumes that an accurate sampling frame exists.
 Selection could be done either by using a table of random
numbers or by the lottery method.

38
Probability sampling
 E.g., simple random sampling for household surveys
 Population = all households in the country
 Sampling frame = the list of all households (20 million in
Ethiopia?)
 Sample size = say we have resources to cover only 20,000
households
 Sampling fraction 20,000/20,000,000 or 0.1%
 Select randomly 20,000 households from the long list of
20,000,000 households
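
The household example can be sketched in a few lines. This is a minimal illustration assuming that the sampling frame is simply a list of household IDs numbered 1 to 20,000,000. The function name and the seed are hypothetical, and a real frame would be an actual household listing.

```python
import random

def simple_random_sample(frame_size, sample_size, seed=None):
    """Draw `sample_size` distinct IDs from 1..frame_size,
    each ID having the same chance of selection."""
    rng = random.Random(seed)
    # range() stands in for the household list; it is not
    # materialised in memory, so a 20-million frame is cheap.
    return rng.sample(range(1, frame_size + 1), sample_size)

households = simple_random_sample(20_000_000, 20_000, seed=42)
sampling_fraction = len(households) / 20_000_000
print(len(households), sampling_fraction)
```

`random.sample` draws without replacement, which matches the lottery method: no household can be selected twice.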

39
Probability sampling
 Merits of SRS
 No investigator bias or discretion
 Helps us to obtain a more representative sample
 Can produce better estimates for the population
 Limitations
 It needs an up-to-date list of the population units
 Units selected might be scattered geographically, hence
a high cost of data collection

40
Probability sampling
2. Systematic Sampling Technique
 In systematic sampling, individuals are chosen at regular
intervals (for example every kth) from the sampling frame.
 The first item is selected randomly, and the remaining
units are then selected automatically according to a
predetermined pattern.
 A sampling interval (the standard distance between the
elements selected in the sample) is identified.
 Simplicity and flexibility are its major advantages.

41
Probability sampling
 Steps to draw a systematic sample:
 Calculate K, the sampling interval (population size divided
by sample size).
 Select a number between 1 and K at random; call it r.
 The rth element is the first element of the sample.
 Then the rth, (r+K)th, (r+2K)th, …, [r+(n-1)K]th elements of
the population are selected into the sample.

42
Probability sampling
 E.g., a systematic sample is to be selected from 1200 students
of a school.
 The sample size to be selected is 100.
 The sampling fraction is: 100/1200= sample size/study
population = 1/12
 The sampling interval is therefore 12.
 The first student in the sample is chosen randomly, for
example by blindly picking one out of twelve pieces of paper,
numbered 1 to 12.
 If number 6 is picked - every twelfth student will be
included –i.e. 6, 18, 30, 42, etc.
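
The school example can be reproduced with a short sketch. It assumes that students are numbered 1 to 1200 and that the population size is an exact multiple of the sample size. The function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the positions r, r+K, r+2K, ..., r+(n-1)K."""
    k = population_size // sample_size          # sampling interval K
    if start is None:
        # Random start r between 1 and K, as in the lottery step.
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(sample_size)]

# Fix the start at 6 to mirror the example on the slide.
sample = systematic_sample(1200, 100, start=6)
print(sample[:4], sample[-1], len(sample))
```

With a start of 6 and an interval of 12, the first positions are 6, 18, 30, 42 and the last is 1194, giving exactly 100 students.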
43
Probability sampling
3. Stratified Sampling
 Useful when we have heterogeneous populations.
 One disadvantage of SRS is that small groups in which the
researcher is interested may not appear in the sample.
 Stratified sampling divides the population into appropriate
strata, and a sample is then taken from each stratum using
either SRS or systematic sampling (SS).
 The elements in a stratum are supposed to be homogeneous
with respect to the given characteristics.

44
45
Probability sampling
The reasons for stratifying
1. To increase a sample’s statistical efficiency (smaller
standard errors-less variation).
2. To provide adequate data for analyzing the various
subpopulations.
3. To enable different research methods and procedures to be
used in different strata.
4. The absence or poor quality of a sampling frame makes it
necessary to first select a sample of geographical units, and
then to construct a sampling frame only within those
selected units.

46
Probability sampling
 multiple stage stratified random sampling could also be
considered.
 E.g., in the household survey we may be interested to have
sufficient number of households from each region of
Ethiopia;
 So stratify by region!

How to Stratify
 Three major decisions must be made in order to stratify the
given population into some mutually exclusive groups.

47
Probability sampling
 (1) What stratification base to use: stratification would be based
on the principal variable under study such as income, age,
education, sex, location, religion, etc.
 (2) How many strata to use: there is no precise answer as to
how many strata to use.
 The more strata, the closer one comes to maximizing
inter-strata differences and minimizing intra-strata
variances.

48
Probability sampling
(3) What strata sample size to draw: different approaches could be
used:
 One could adopt a proportionate sampling procedure.
 If the number of units selected from each stratum is
proportional to the total number of units in that
stratum, we have proportionate sampling.
 Or one could adopt disproportionate sampling, where the
number of items studied in each stratum is not proportional
to the stratum's share of the population.
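
Proportionate allocation can be sketched as follows. The stratum names and sizes are made up for illustration, and leftover units are handed out by largest remainder, which is one reasonable rounding choice among several.

```python
def proportionate_allocation(strata_sizes, sample_size):
    """Split `sample_size` across strata in proportion to their sizes."""
    total = sum(strata_sizes.values())
    # Whole-number part of each stratum's proportional share.
    alloc = {name: sample_size * size // total
             for name, size in strata_sizes.items()}
    # Give any leftover units to the strata with the largest remainders.
    leftover = sample_size - sum(alloc.values())
    by_remainder = sorted(
        strata_sizes,
        key=lambda name: (sample_size * strata_sizes[name]) % total,
        reverse=True,
    )
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc

# Hypothetical population of 10,000 households in three strata.
strata = {"urban": 5000, "semi_urban": 3000, "rural": 2000}
print(proportionate_allocation(strata, 500))
```

Here each stratum contributes 5% of its units (250, 150 and 100). A disproportionate design would instead fix the per-stratum numbers by other criteria, such as oversampling a small stratum.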

49
Probability sampling
4. Cluster Sampling:
 It may be difficult or impossible to take a simple random
sample because a complete sampling frame does not exist, or
 because of logistical difficulties.
 E.g., interviewing people who are scattered over a
large area may be too time-consuming.
 The selection of groups of study units (clusters) instead of the
selection of study units individually is called CLUSTER
SAMPLING.
 It is cost effective (high economic efficiency).
 As in stratified sampling, you need to divide the
population into discrete groups.
50
Probability sampling
 For instance, if the total area of interest is a big one, it can be
divided into a number of smaller non-overlapping areas
(clusters), and some of the clusters are selected randomly.
 Clusters are often geographic units (e.g., districts, villages)
or organizational units (e.g., clinics, etc.).
 The primary sampling units are not individual units of the
population but groups within the population (clusters).

51
Probability sampling

 E.g., sampling for a household survey in Addis Ababa
 There is probably no complete sampling frame, and it is costly
to cover through simple random sampling procedures
 Randomly select sub-cities (clusters)
 Randomly select kebeles from the selected sub-cities (clusters)
 Then randomly select households from the selected
kebeles
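
The three selection stages above can be sketched as nested random draws. Everything in the frame below is made up for illustration (10 sub-cities, 8 kebeles each, 50 households per kebele). The function name and the numbers drawn at each stage are hypothetical.

```python
import random

def multistage_sample(frame, n_subcities, n_kebeles, n_households, seed=None):
    """Select sub-cities, then kebeles within them, then households."""
    rng = random.Random(seed)
    selected = []
    # Stage 1: randomly select sub-cities (clusters).
    for subcity in rng.sample(sorted(frame), n_subcities):
        kebeles = frame[subcity]
        # Stage 2: randomly select kebeles within each chosen sub-city.
        for kebele in rng.sample(sorted(kebeles), n_kebeles):
            # Stage 3: randomly select households within each kebele.
            for household in rng.sample(kebeles[kebele], n_households):
                selected.append((subcity, kebele, household))
    return selected

# Made-up frame; only the selected kebeles would need a household list.
frame = {
    f"subcity_{s}": {
        f"kebele_{s}_{k}": [f"hh_{s}_{k}_{h}" for h in range(50)]
        for k in range(8)
    }
    for s in range(10)
}
sample = multistage_sample(frame, 3, 2, 5, seed=1)
print(len(sample))
```

In practice only the kebeles actually selected need a household listing, which is the main cost saving of the design.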

52
Non-probability sampling
Non-Probability Sampling: selection is non-random, i.e. the chance
of each sampling unit/element being selected is unknown and
usually unequal.
 Non-probability samples are chosen based on judgment
regarding the characteristics of the target population and the
needs of the survey.
 Sometimes a probability sample is infeasible.
 Example: if we want to conduct a lengthy experiment using
human subjects, we use whoever is willing to participate.
 E.g. a study of drug users (hearsay, criminal records)

53
Non-probability sampling
 Three conditions under which non-probability sampling is used:
 First, if there is no desire to generalize to a population
parameter.
 Secondly, because of cost and time requirements.
 Probability sampling could be prohibitively expensive,
since it requires more planning and repeated callbacks.
 Thirdly, probability sampling may break down in its
application.
 The total population may not be available.

54
Non-probability sampling
(1) Convenience or accidental sampling: the method selects anyone
who is convenient.
 Units that are convenient for the investigator are selected
(e.g. volunteers).
 It can produce highly unrepresentative samples.
 Such samples are cheap but biased and full of systematic
errors.
 E.g., the person-on-the-street interviews conducted by
television programs.
 Dropping by the cafeteria and asking questions of whoever
is there.

55
Non-probability sampling
(2) Quota Sampling: subgroups are identified and a specified
number of individuals from each group are included in the
research, based on certain criteria.
 Identify categories of people (e.g., male, female), then
decide how many to get from each category.
 Example: in a school, find 10 elementary teachers, 10
middle school teachers and 10 high school teachers.
 It is used by opinion pollsters, in marketing research and in
other similar research areas.
 No randomization, so it is difficult to know the sampling error.


56
Non-probability sampling
(3) Purposive or Judgment sampling
 Selecting a limited number of informants strategically, so
that their in-depth information gives optimal insight into
an issue, is known as purposive sampling.
 It uses the judgment of an expert in selecting cases.
 Participants are selected because of some desirable
characteristics, like expertise in the area.
 It can be useful when used by a skilled investigator.

57
Non-probability sampling
(4) Snowball (Network) Sampling – chain sampling
 This is a method for identifying and selecting the cases in a
network.
 You start with one or two information-rich key
informants and ask them if they know persons who know
a lot about your topic of interest.
 Contact the first few and ask them for names of others, and so on.
 Useful when there is no sampling frame
 E.g., Becker’s (1963) study of marijuana users
 Illegal migrants, sex workers, drug users, etc.

58
Problems in Sampling

Sampling errors occur randomly and are equally likely to be in either direction.
The magnitude of the sampling error depends upon the nature of the universe; the
more homogeneous the universe, the smaller the sampling error. Sampling error
is inversely related to the size of the sample i.e., sampling error decreases as the
sample size increases and vice-versa.
59
Problems in Sampling
 Survey errors: the discrepancy between survey estimates and
reality (the true value) is called survey error.
 There are two types of survey error:
 sampling errors and non-sampling errors;
 total survey error is the sum of the two.

60
Problems in Sampling
 Sampling errors are errors which are attributable to sampling,
and which are therefore not present in information gathered in
a census.
 A sampling error is not a mistake.
 It can be controlled by well-developed sampling theory.
 This error arises because it is unlikely that one will end up with a
truly representative sample, even when probability sampling is
employed.
 Sampling errors can be calculated only for probability
samples.

61
Problems in Sampling
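 Sampling error is related to confidence intervals.
 A narrower confidence interval means more precise estimates
of the population for a given level of confidence.
 The confidence interval for the true population mean is given
by:
$\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$
 The sampling error is given by:
$z \frac{\sigma}{\sqrt{n}}$
where $\bar{x}$ is the sample mean, $\sigma$ the population
standard deviation and n the sample size.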
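
The relationship between sample size and sampling error can be illustrated numerically. The sketch below assumes the usual interval for a mean, sample mean ± z·σ/√n. The mean of 50 and standard deviation of 10 are made-up values, not data from the lecture.

```python
import math

def sampling_error(z, sigma, n):
    """Half-width of the confidence interval: z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)

def confidence_interval(mean, z, sigma, n):
    """Lower and upper limits of the interval around the sample mean."""
    e = sampling_error(z, sigma, n)
    return mean - e, mean + e

# Hypothetical values: sample mean 50, sigma 10, 95% confidence (z = 1.96).
for n in (100, 400):
    low, high = confidence_interval(50, 1.96, 10, n)
    print(n, round(sampling_error(1.96, 10, n), 2), round(low, 2), round(high, 2))
```

Quadrupling the sample size from 100 to 400 halves the sampling error (from 1.96 to 0.98 in this made-up case), so precision improves with the square root of n rather than with n itself.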
62
Problems in Sampling
 Non-sampling error: such errors are present whether we are
dealing with a sample survey or a census.
 These include all errors apart from sampling error and are
mostly mistakes by one party or another.

Non-sampling error includes:
 Non-coverage error
 Sampling of the wrong population
 Non-response error
 Instrument error
 Interviewer error

63
Problems in Sampling
Non-coverage error: this refers to a defect in the sampling frame.
 Omission of part of the target population (e.g., soldiers,
students living on campus, people in hospitals, prisoners,
households without a telephone in telephone surveys, etc.).

The wrong population is sampled
 Researchers must always be sure that the group being
sampled is drawn from the intended population, i.e. the
population they want to generalize about.

64
Problems in Sampling
Non-response error: common in self-administered surveys
 This error occurs when you are not able to find those whom
you were supposed to study, or when they decline to take part.
 Some people refuse to be interviewed because they are ill,
are too busy, or simply do not trust the interviewer.
 When one is forced to interview substitutes, an unknown
bias is introduced.

65
Problems in Sampling
Instrument error
 The instrument in a sample survey is the device with which
we collect data, usually a questionnaire.
 When a question is badly asked or worded, the resulting
error is called instrument error.
 Example: leading questions or carelessly worded
questions may be misinterpreted by some respondents.

66
Problems in Sampling
Interviewer Error :
 Enumerators can distort the results of a survey by
inappropriate suggestions, word emphasis, tone of voice and
question rephrasing.
 Cheating by enumerators, especially those with only limited
training and under little direct supervision.
 Perceived social distance between enumerator and
respondent also has a distorting effect.
 E.g., questions about sexual behavior might be
answered differently depending on the gender of the
interviewer.
67
Characteristics of a Good Sample Design: Summary

 Sample design must result in a truly representative sample.
 Sample design must result in a small sampling error.
 Sample design must be viable in the context of the funds
available for the research study.
 Sample design must allow systematic bias to be controlled.
 The sample should be such that the results of the sample study
can be applied, in general, to the universe with a reasonable
level of confidence.

68
The Concept of Sampling

69
