
Population and Sampling


• Population: refers to the total collection of elements
about which we want to make some generalization or
inferences.
• Examples: all households in a given district, all
employees of a firm, all companies in the leather
industry, etc.
• A census is a count of all the elements in a population.
• Sampling is the process of selecting some of the
elements in a population we want to study.
Population and Sampling…
Target Population: is a group of individuals who
meet some established criteria.
Population Element: An individual member of a
specific population.
Sampling Frame: Is the complete list of all
population elements from which the sample is
drawn.
Parameter and Statistic
 Parameter: is a characteristic of a population.
• A specified value of the population, such as mean or
variance or proportion is named as parameter.
• Population mean (μ) is a parameter
 Statistic: is a characteristic of a sample.
• A specific value computed from the sample is termed a statistic.
 Example:
• When we work out a measurement such as the mean
from a sample, it is called a statistic.
• The sample mean (x̄) is a statistic.
Point estimation
Sample data is used to estimate parameters of a
population
Statistics are calculated using sample data.
Parameters are the characteristics of population data

Sample mean (x̄) estimates Population mean (μ)
Sample SD (S) estimates Population SD (σ)
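As a small illustration of point estimation, the sketch below computes a sample mean and a sample SD. The values in `sample` are made up for illustration and do not come from any real survey.

```python
import statistics

# Hypothetical sample of five observations (illustrative values only)
sample = [12, 15, 11, 18, 14]

# Sample mean: a point estimate of the population mean (mu)
x_bar = statistics.mean(sample)

# Sample SD with the n - 1 denominator: a point estimate of the population SD (sigma)
s = statistics.stdev(sample)

print(x_bar, round(s, 3))
```

A different sample from the same population would generally give slightly different values of `x_bar` and `s`, which is why they are called estimates.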
The Need for Sampling
 Sampling is used because it has the following
advantages over a census of the whole
population:
◦ Lower cost
◦ Greater accuracy
◦ Greater speed of data collection
◦ Availability of population elements.
Characteristics of a good sample
 Representativeness
 A sample must be representative of the population.
 Probability sampling techniques tend to yield representative
samples.
 Accuracy
 Accuracy is defined as the degree to which bias is
absent from the sample.
 An accurate (unbiased) sample is one which almost
exactly represents the population.
 It is relatively free from any influence that causes
any difference between sample value and population
value.
Limitations of Sampling

 Sampling demands a thorough knowledge of sampling
methods and procedures and an exercise of great care;
 otherwise the results obtained may be incorrect or
misleading.
 When the characteristic to be measured occurs only
rarely in the population, a very large sample is required
to secure units that will give reliable information about it.
 A very large sample, in turn, has all the drawbacks of a census
survey.
Limitations…
 It may not be possible to ensure the representativeness of
the sample, even by the most perfect sampling
procedures.
 Therefore sampling results in a certain degree of sampling
errors, i.e., there will be some difference between the
sample value and the population value.
Sampling Methods
 Sampling techniques are broadly classified into two
types:
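a. Probability sampling methods: give every element in the
population a known, non-zero chance of selection (the chance is
equal for all elements only in some designs, such as simple
random sampling).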
 Each element in the population has a known chance of being
included in a sample.
b. Non-probability sampling methods: are arbitrary
and subjective.
 The members of the population do not have a known
chance of being included.
Probability sampling methods

The process of probability sampling can be divided
into four stages:
i. Identify a suitable sampling frame based on your
research question(s) or objectives.
ii. Decide on a suitable sample size.
iii. Select the most appropriate sampling technique
and select the sample.
iv. Check that the sample is representative of the
population.
Probability sampling methods
 Many statistical techniques assume that a sample was
selected on a random basis.
 There are five basic types of random sampling
techniques.
• Simple Random Sampling
• Stratified Random Sampling
• Systematic Random Sampling
• Cluster Sampling
• Multi-Stage Sampling
a. Simple random sampling
 This sampling technique gives each element an equal
and independent chance of being selected.
 An equal chance means equal probability of selection.
• For example, in a population of 300, each element
theoretically has 1/300 chance of being selected.
 An independent chance means that the draw of one
element will not affect the chances of other elements
being selected.
 Hence all elements should be included in the sampling
frame to draw a random sample.
Simple Random Sampling Procedure
 The procedure of drawing a simple random sample consists
of
 identifying all elements in the population,
 preparing a list of all elements, giving them
numbers in serial order 1, 2, 3, ... and so on,
 drawing sample numbers by using
 the lottery method,
 a table of random numbers, or
 a computer.
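The computer option can be sketched as follows. This is a minimal illustration, assuming the population elements have already been numbered 1 to N; the function name `simple_random_sample` and the `seed` parameter are hypothetical choices made for this example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw serial numbers without replacement, each with an equal chance."""
    rng = random.Random(seed)                       # seed only makes the draw repeatable
    serial_numbers = range(1, population_size + 1)  # elements numbered 1..N
    return sorted(rng.sample(serial_numbers, sample_size))

# Draw 20 serial numbers from a numbered list of 300 elements
chosen = simple_random_sample(300, 20, seed=1)
print(chosen)
```

The elements whose serial numbers appear in `chosen` form the sample.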
Conditions for Simple Random Sampling
 This technique is suitable and may yield a
representative sample under the following conditions:
 Where the population is a homogenous group with
reference to the specified characteristics,
 Where the population is relatively small,
 Where a complete list of all elements is available or
can be prepared.
b. Stratified sampling
 Used when the population is heterogeneous with
respect to the variables or characteristics under study.
 In that case, the technique of stratified sampling gives more
efficient and accurate results.
 The population is sub-divided into homogeneous
groups or strata, and from each stratum a random sample
is drawn.
 Example:
 the employees of an organization may be divided into
managers and non-managers and each of those two
groups may be sub-divided into salary-grade-wise strata.
Need for stratification
 It ensures representation of all relevant subgroups
of the population.
 It is essential when the researcher wants to study the
characteristics of population subgroups, e.g., male
and female employees of an organization.
 It is also useful when different methods of data
collection, etc. are used for different parts of the
population.
 Example: interviewing for executives and self-
administered questionnaire for workers.
Stratification process…
 involves three major decisions:
1. The stratification base or bases to be used should be
decided.
 For example, if size of the firms is a primary variable,
the firms may be stratified on the basis of the block
capital employed.
2. The number of strata:
 The larger the number of strata, the greater may be the degree
of representativeness of the sample.
 The decision may be based on the number of sub-
population groups to be studied and the cost of
stratification.
Stratification process…
3. Strata sample sizes:
 There are two alternatives.
1. First, the strata sample sizes may be proportionate
to strata’s shares in the total population.
2. Second, they may be disproportionate to strata’s
shares.
 Accordingly, stratified random sampling may be
classified into
a. Proportionate stratified sampling
b. Disproportionate stratified sampling
Proportionate stratified sampling
 involves drawing a sample from each stratum in proportion to
the latter’s share in the total population.
 Example: suppose the final year MBA students of the management
department consist of the following specialization groups:

Specialization stream    No. of students    Proportion of each stream
Finance                  40                 0.4
Marketing                20                 0.2
HRM                      30                 0.3
Accounting               10                 0.1
Total                    100                1.0
 The researcher wants to draw an overall sample of 30.
 Then the strata sample size would be:

Strata        Sample size
Finance       0.4 × 30 = 12
Marketing     0.2 × 30 = 6
HRM           0.3 × 30 = 9
Accounting    0.1 × 30 = 3
Total         30
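The allocation arithmetic can be checked with a short sketch. The function name `proportionate_allocation` is a hypothetical helper; it simply multiplies each stratum's population share by the overall sample size.

```python
def proportionate_allocation(strata_sizes, total_sample):
    """Allocate the sample to strata in proportion to their population shares."""
    population = sum(strata_sizes.values())
    # round() can make the parts miss the total for awkward shares;
    # with the figures used here every product is a whole number.
    return {
        name: round(total_sample * size / population)
        for name, size in strata_sizes.items()
    }

strata = {"Finance": 40, "Marketing": 20, "HRM": 30, "Accounting": 10}
allocation = proportionate_allocation(strata, 30)
print(allocation)
```

For the MBA example this gives 12, 6, 9 and 3 students, which add up to the overall sample of 30.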
Disproportionate Stratification
 This method does not give proportionate
representation to strata.
 It necessarily involves giving overrepresentation to some
strata and underrepresentation to others.
 There may be several disproportionate schemes.
 All strata may be given equal weight, even though their
shares in the total population vary.
 Alternatively, some strata may be given greater
weight and others lesser weight.
c. Systematic Sampling
 In this method, a sample is taken from a list prepared
on a systematic arrangement either on the basis of
alphabetic order or on house number or any other
method.
 Here, only the first sample unit is selected at random
and the remaining units are automatically selected in a
definite sequence at equal spacing from one
another.
Steps in Systematic Sampling
 First, the population is arranged in serial numbers from 1
to N, and the size of sample (n) is determined.
 The sampling interval is determined by dividing the
population size by the sample size:
i.e. K = N / n
where K = sampling interval
n = sample size
N = size of population
 Any number is selected at random from the first sampling
interval.
 The subsequent samples are selected at equal or regular
intervals.
Systematic Sampling: Example
 Suppose it is desired to select a sample of 20 students,
from a list of 300 students. For selecting this sample:
1. Divide the population total of 300 by 20.
 The quotient is 15.
2. Select a number at random between 1 and 15, using
lottery method or a table of random numbers.
 Suppose the selected number is 9.
 Then the students numbered 9, 24 (9+15), 39 (24+15),
54 (39+15), etc. are selected as the sample elements.
 Continue until the desired sample size of 20 is reached.
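The selection rule in this example can be sketched in a few lines. The function name `systematic_sample` and its `start` and `seed` parameters are hypothetical; the sketch assumes N is an exact multiple of n, as it is here.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every K-th serial number after a random start in 1..K."""
    k = population_size // sample_size       # sampling interval K = N / n
    if start is None:
        start = random.Random(seed).randint(1, k)   # random start within the first interval
    return [start + i * k for i in range(sample_size)]

# Reproduce the example: N = 300, n = 20, random start assumed to be 9
picked = systematic_sample(300, 20, start=9)
print(picked)
```

With a start of 9 the list begins 9, 24, 39, 54 and ends at 294, giving exactly 20 students.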
Applications of systematic sampling
 Systematic selection can be applied to various
populations such as
 Students in a class,
 Houses in a street,
 Telephone directory,
 Customers of a bank,
 Assembly line output in a factory,
 Members of an association and so on.
d. Cluster sampling
 Used when the population elements are scattered over a
wider area and a list of population elements is not
readily available.
 Cluster sampling is random selection of groups consisting of
population elements.
 Each such subgroup is a cluster of population elements.
 Ideally each cluster should be internally heterogeneous, like a
small copy of the population; in practice, unlike the ideal, clusters
often do not meet this need and are instead homogeneous.
 Then from each selected cluster, a sample is drawn by either
simple random selection or stratified random selection.
Cluster….
 Example:
 a researcher wants to select a random sample of 1,000
households out of 40,000 estimated households in a city
for a survey.
 A direct sample of individual households would be difficult to
select, because a list of households does not exist and would be
too costly to prepare.
 Instead, he can select a random sample of a few blocks.
 The number of blocks to be selected depends on the average
number of estimated households per block.
 Suppose the average number of households per block is 200; then
5 blocks (1,000 / 200) comprise the sample.
 Since the number of households per block varies, the actual
sample size depends on the block which happens to be selected.
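A minimal sketch of this block selection follows. The block names and the household counts per block are invented for illustration (200 blocks, half with 150 households and half with 250, so 40,000 households in total); `select_clusters` is a hypothetical helper.

```python
import random

def select_clusters(cluster_sizes, target_sample, seed=None):
    """Choose whole clusters at random; their number is based on the average cluster size."""
    average_size = sum(cluster_sizes.values()) / len(cluster_sizes)
    n_clusters = round(target_sample / average_size)
    rng = random.Random(seed)
    chosen = rng.sample(sorted(cluster_sizes), n_clusters)
    # The realised sample size depends on which clusters were drawn
    realised_size = sum(cluster_sizes[name] for name in chosen)
    return chosen, realised_size

# Hypothetical city blocks with unequal numbers of households
blocks = {f"block_{i:03d}": 150 + 100 * (i % 2) for i in range(200)}
chosen_blocks, households = select_clusters(blocks, target_sample=1000, seed=7)
print(chosen_blocks, households)
```

Five blocks are always chosen here, but `households` can fall anywhere between 750 and 1,250 depending on the blocks drawn, which illustrates why the actual sample size varies.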
e. Multi-stage sampling
 It refers to a sampling technique which is carried out in various
stages.
 It is generally used in selecting a sample from a very large
geographic area.
 The population is regarded as made up of a number of
primary stage units, secondary stage units, third stage units
and so on till we ultimately reach the desired sample units in
which we are interested.
 At each stage there is a random selection, and the size of
sample may be proportional or disproportional depending on
the size and character of the variations relevant to the purpose
of inquiry.
Non-probability Sampling Methods
 These sampling techniques do not provide a known
chance of selection to each population element.
 The known merits of this type of sampling are
simplicity, convenience and low cost.
 A non-probability sampling plan does not perform an
inferential function, i.e., the population parameters
cannot be reliably estimated from the sample values.
 It suffers from sampling bias, which can distort
results.
Non-probability…
 Non-probability sampling is generally a less desirable method
than probability sampling.
 However, the following are some reasons that require the use of
this method:
• When there is no other feasible alternative due to non-availability
of a list of the population,
• When the study does not aim at generalizing the findings
to the population, but simply at getting a feel for the range of
conditions, or the nature of the phenomenon,
• When the costs required for probability sampling may be
too large, and the benefit expected from it is not
commensurate with such costs,
• When probability sampling requires more time, but the
time constraints and the time limit for completing the study
do not permit it.
Non-probability …
 Types of non-probability sampling techniques
 Convenience Sampling
 Purposive Sampling
 Quota Sampling
 Snowball Sampling
Convenience sampling
 It is also known as unsystematic, accidental or
opportunistic sampling.
 It means selecting sample elements in a just ‘hit and
miss’ fashion.
 Under this method a sample is selected according to the
convenience of the investigator.
 It is the least reliable, but also the cheapest and easiest, design.
 This convenience may be in respect of availability of
data, accessibility of the units, etc.
 This method may be used in the following cases:
• When the universe is not clearly defined.
• When sampling frame is not clear.
Judgment sampling
 This method means deliberate selection of sample units
that conform to some pre-determined criteria.
 Judgment sampling is also called “purposive sampling” or
“deliberate sampling”.
 This involves selection of cases which we judge as the most
appropriate ones for the given study.
 It is based on the judgment of the researcher or some
expert.
 The chance that a particular case be selected for the sample
depends on the subjective judgment of the researcher.
Quota sampling
 Quota sampling is a nonrandom sampling technique which
has procedures similar to stratified sampling.
 First, the population is stratified, preferably on the basis of the
characteristics of the population under study.
 Next, the number of sample units to be selected from each
stratum is decided by the researcher in advance.
 This number is known as quota, which may be fixed according
to some specific characteristics such as income groups, gender,
occupation, political or religious affiliation etc.
 The choice of the particular units for investigation is left to the
investigators themselves.
Snowball sampling
 This is a technique of building up a list or a sample of a
special population by using an initial set of its members as
informants.
 For example: if a researcher wants to study the problems
faced by Ethiopians in another country, he may identify
an initial group of Ethiopians through some source such as the
Ethiopian Embassy.
 Then he can ask each one of them to supply names of other
Ethiopians known to them, and continue this procedure until he
gets an exhaustive list from which the sample is formed.
Selecting Sampling Techniques
The purpose of the survey.
Measurability.
Degree of precision.
Information about population.
The nature of the population.
The geographical area covered by the survey.
Fund availability.
Time
Sample Size Determination
 Applicable only if you select probability sampling methods
in your study
 Sample size is number of elements in a sample.
 Some misconceptions about the required size of a sample:
• A sample should not be less than 10% of the population.
Commonly known as the 1/10th rule, it is not relevant to
large populations.
• Another misconception is that the larger the sample size, the
greater the accuracy of the sample results.
• But a large sample size does not by itself guarantee the accuracy of
the results.
Sample Size…
 Sample size determination is a statistical
activity, which needs statistical knowledge.
 There are a number of sample size determination
methods some of which are the following:

1. Personal judgment:
 The personal judgment and subjective decision of the
researcher in some cases can be used as a basis to
determine the size of the sample.
Sample size…
2. Budgetary approach:
 Under this approach the sample size is determined by
the available fund for the proposed study.
 Suppose the cost of surveying one individual or unit
is $25 and the total available fund for the survey is
$2000; the sample size is then determined as
 Sample size (n) = total budget of survey / cost of surveying
one unit.
 Accordingly, the sample size will be 80 units (2000 / 25
= 80 units).
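The budgetary calculation amounts to a single division, sketched here for completeness. The function name `budget_sample_size` is hypothetical, and the result is rounded down on the assumption that a partly funded interview cannot be carried out.

```python
def budget_sample_size(total_budget, cost_per_unit):
    """Largest number of units that the available fund can pay for."""
    return int(total_budget // cost_per_unit)

n_budget = budget_sample_size(2000, 25)
print(n_budget)
```

With a fund of $2000 and a unit cost of $25 this returns 80.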
Sample Size…
3. Traditional inferences:
 This is based on precision rate and confidence level.
 To estimate sample size using this approach we need to
have information about the estimated variance of the
population, the magnitude of acceptable error and the
confidence level.
a. Variance of the population:
 It measures how widely the population values are spread; in the
sample size formula it enters through the standard deviation
(the square root of the variance).
 The sample size depends on the variance of the
population.
 If the population is homogeneous, a small sample size can be
enough.
Sample Size…
 If the variance is not available, a researcher is expected to
estimate it using the procedures below:
 The researcher can either carry out a pilot study for
estimating the population standard deviation or use
the rule of thumb.
 The rule of thumb says the standard deviation is about
one-sixth of the range.
 If the household’s yearly average income is expected to
range between $3,000 and $21,000, using the rule of
thumb, the standard deviation will be
1/6 × (21,000 − 3,000) = 1/6 × 18,000 = 3,000.
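The rule of thumb can be written as a one-line helper. The name `sd_from_range` is hypothetical, and the result is only a rough approximation that assumes almost all values lie within three standard deviations on either side of the mean.

```python
def sd_from_range(lowest, highest):
    """Rule-of-thumb SD: one-sixth of the expected range."""
    return (highest - lowest) / 6

sd_estimate = sd_from_range(3000, 21000)
print(sd_estimate)
```

For the income range used in the example this gives 3000.0.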
Traditional inferences…
b. Magnitude of Acceptable Error
 The magnitude of error (range of possible error)
indicates how precise the study must be.
 It is an acceptable error for a particular study.
 The researcher makes subjective judgment about
the desired magnitude of error.
 For example, to estimate the average income of
households, one may allow an error of, say, ± 30.
Traditional inferences…
c. Confidence level and significance level
• The confidence level is the expected percentage of times that
the actual value will fall within the stated precision limits.
• A confidence level of 95% means there are 95 chances in 100
(or .95 in 1) that the sample results represent the true
condition of the population within a specified precision
range against 5 chances in 100 (or .05 in 1) that it does not.
• Precision is the range within which the answer may vary and
still be acceptable;
• Confidence level indicates the likelihood that the answer will
fall within that range, and the significance level indicates the
likelihood that the answer will fall outside that range.
Sample size…
 Once the above concepts are understood, the size of
sample is quite simple to determine.
 It is determined based on the following relationship.
$$n = \left(\frac{Z S}{e}\right)^2$$
 where Z is the standard normal value corresponding to the
chosen confidence level.
For example: Z is 1.96 for 95% confidence, 1.6449 for
90% or 2.5758 for 99%.
 e represents the acceptable magnitude of error (the ± error
factor).
 S represents the sample SD or an estimate of the population
SD.
Sample size…
 Example:
 Suppose we want to study the household monthly
expenditure on food.
 We wish to have a 95% confidence level.
 The acceptable range of error is not more than 20 birr.
 And the estimated value of the SD is 200.
Z = 1.96
e = 20
S = 200
$$n = \left(\frac{Z S}{e}\right)^2 = \left(\frac{1.96 \times 200}{20}\right)^2 = 384.16$$, rounded up to 385
 If the range of error (e) is reduced to 10, the sample size will
increase to (1.96 × 200 / 10)² = 1536.64, i.e. 1,537.
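The calculation can be reproduced with Python's standard library. This is a sketch under the assumption of a two-sided confidence level; `z_value` and `sample_size` are hypothetical helper names, and the Z value is taken from the normal distribution rather than typed in as the rounded 1.96.

```python
import math
from statistics import NormalDist

def z_value(confidence):
    """Two-sided standard normal value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def sample_size(confidence, sd, error):
    """n = (Z * S / e) ** 2, rounded up to the next whole unit."""
    z = z_value(confidence)
    return math.ceil((z * sd / error) ** 2)

n_20 = sample_size(0.95, sd=200, error=20)   # acceptable error of 20 birr
n_10 = sample_size(0.95, sd=200, error=10)   # tighter error of 10 birr
print(n_20, n_10)
```

Halving the acceptable error roughly quadruples the required sample, from 385 to 1,537 households.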
Yamane’s (1967) formula
A sample size can be determined using Yamane’s
(1967) formula, which is as follows:
$$n = \frac{N}{1 + N e^2}$$
Where:
n is the sample size
N is the population size
e is the margin of error, expressed as a proportion (e.g. 0.05)
1 is a constant value
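A short sketch of this formula follows. The population size of 1,000 and the margin of error of 0.05 are hypothetical inputs chosen only to show the arithmetic, and rounding up to the next whole unit is an assumption of this example.

```python
import math

def yamane_sample_size(population_size, margin_of_error):
    """n = N / (1 + N * e**2), rounded up to a whole unit."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)

# Hypothetical inputs: N = 1000, e = 0.05
n_yamane = yamane_sample_size(1000, 0.05)
print(n_yamane)
```

Here 1000 / (1 + 1000 × 0.0025) = 285.7, so the sketch returns 286.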
Leedy and Ormrod (2001)
They developed a guideline based on the size of the work
force.
Leedy and Ormrod (2001) suggest:
 for small population (<100) to include the entire
population,
 if the population is around 500, 50% should be
sampled;
 if the population size is around 1,500, 20% should be
sampled.
 Population sizes beyond this are adequately
represented by a sample of 8% of the population size (Leedy &
Ormrod, 2001).
End
