
Population and Sample:

The population is the collection of units (people, objects or
anything else) that researchers are interested in knowing about.
The number of individuals in a population is called the population
size. A sample is a smaller collection of units selected from the
population, i.e. a finite subset of individuals in a population, and
the number of individuals in a sample is called the sample size.
A population may be finite or infinite. In a finite population the
number of items is fixed and can be counted, whereas in an
infinite population the number of items is unlimited or cannot
practically be counted, i.e. we cannot know the total number of
items. The population of a city and the number of officers in
Nepal Rastra Bank are examples of finite populations, whereas
the customers of United Trade Centre, the stars in the sky and
the listeners of a specific radio program are examples of infinite
populations.
05/24/2021 Prepared by: Pravat Uprety 1
Parameter and Statistic:

• A statistic is a characteristic of a sample and a
parameter is a characteristic of a population.

                  Population                      Sample
Definition        Collection of all items         Part or portion of the population
                  under study                     chosen for study
Characteristics   Parameters                      Statistics
Symbols           Population size = N             Sample size = n
                  Population mean = μ             Sample mean = x̄
                  Pop. standard deviation = σ     Sample standard deviation = S
                  Pop. correlation coeff. = ρ     Sample correlation coeff. = r
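
The distinction between a parameter and a statistic can be illustrated with a minimal sketch. The population below is simulated (the income figures, the seed and the sizes N = 1000 and n = 50 are assumptions made only for illustration); the point is that μ and σ are computed from every unit, while the sample mean and sample standard deviation are computed from the sample alone.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical finite population: monthly incomes of N = 1000 households.
population = [random.gauss(50_000, 12_000) for _ in range(1000)]
N = len(population)

# Parameters describe the whole population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)   # population SD divides by N

# Statistics describe a sample of size n drawn from the population.
n = 50
sample = random.sample(population, n)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)            # sample SD divides by n - 1

print(f"N={N}, mu={mu:.1f}, sigma={sigma:.1f}")
print(f"n={n}, x_bar={x_bar:.1f}, s={s:.1f}")
```

The sample values will generally be close to, but not equal to, the population values.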



Census and Sampling
Under the census or complete enumeration survey method, data are
collected for each and every unit (person, household, shop,
organization etc.) of the population.
Some merits of the census method are:
• Data are obtained from each and every unit of the population.
• The results obtained are likely to be more representative, accurate
and reliable.
• It is an appropriate method of obtaining information on rare
characteristics and small groups, such as the number of persons in
certain age groups, their distribution by sex, educational level of people etc.
• Data from a complete enumeration can be widely used as a
basis for various surveys.



• Sampling is simply the process of learning about the
population on the basis of a sample drawn from it. Thus, in the
sampling technique, instead of every unit of the population,
only a part of the population is studied and conclusions
are drawn on that basis for the entire population.
Some merits of the sampling method are:
• The sample can save money.
• The sample can save time.
• For given resources, the sample can broaden the scope of the
study.
• Because the research process is sometimes destructive, the
sample can save product.
• If accessing the population is impossible, the sample is the
only option.
Sampling Process (steps in sample design):

• Define the population
• Specify the sampling frame
• Specify sampling unit
• Selection of sampling method
• Determination of sample size
• Specify the sampling plan
• Select the sample
• Define the population:
The first step in developing any sampling process is to clearly define the set of
objects called the population. The population must be defined in terms of elements,
sampling units, extent and time. Defining a population incorrectly may render the
results of the study meaningless or even misleading. It is sometimes difficult to
define the population properly. At times, a separate research project is required to
define the population before the study for which it is to be used can begin.
• Specify the sampling frame:
It is the list of elements from which the sample is actually drawn. Ideally, it is a
complete and correct list of population members only. A sampling frame may be a
telephone directory, an employee roster, a voter list, or a list of all students attending
a college. Thus, a perfect sampling frame is one in which every element of the
population is represented.
• Specify sampling unit:
A decision has to be taken concerning the sampling unit before selecting the sample. The
sampling unit is the basic unit containing the elements of the population to be
sampled. The sampling unit selected is often dependent upon the sampling frame.
A sampling unit may be a geographical one such as a state, district, village, etc., or a
construction unit such as a house, flat etc., or it may be a social unit such as a family,
club, school, etc., or it may be an individual.
Selection of sampling method:

The researcher must decide the type of sample to use, i.e. the
technique to be used in selecting the items for the sample.
The researcher faces a basic choice: a probability or a non
probability sample.

Probability sampling (random)      Non probability sampling (non random)
• Simple random sampling           • Convenience sampling
• Stratified random sampling       • Quota sampling
• Systematic sampling              • Judgment sampling
• Cluster sampling                 • Snowball sampling
• Multistage sampling



• Determination of the sample size:
This refers to the number of items to be selected from the universe to
constitute a sample. This is a major question facing the researcher. The size
of the sample should be neither excessively large nor too small. It should
be optimum. An optimum sample is one which fulfills the requirements
of efficiency, representativeness, reliability and flexibility. While
deciding the size of the sample, the researcher must determine the desired
precision as well as an acceptable confidence level for the estimate.
• Specify the sampling plan:
The sampling plan involves the specification of how each of the
decisions made thus far is to be implemented. The operational
procedures for selection of sampling units are specified at this step.
• Select the sample:
The final step in the sampling process is the actual selection of the
sample elements. This requires a substantial amount of office work and
fieldwork, particularly if personal interviews are involved.



Methods of sampling (Types, Techniques):

• The two main types of sampling are random and nonrandom. In
random sampling every unit of the population has the same chance of
being selected into the sample. Random sampling implies that chance
enters into the process of selection. In nonrandom sampling not every
unit of the population has the same chance of being selected into the
sample. Members of nonrandom samples are not selected by chance.

• Sometimes random sampling is called probability sampling and
nonrandom sampling is called non probability sampling. Because
not every unit of the population is equally likely to be selected,
assigning a probability of occurrence in nonrandom sampling is
impossible.



Random sampling techniques (Probability sampling)
The five basic random sampling techniques are:
• Simple random sampling
• Stratified sampling
• Systematic sampling
• Cluster sampling
• Multistage sampling

• Each technique offers advantages and disadvantages.
Some techniques are simpler to use, some are less costly,
and others show greater potential for reducing sampling
error.
Simple random sampling:
• A simple random sample is one in which every individual or item from a frame has the
same chance of selection as every other individual or item. In addition, every sample of
a fixed size has the same chance of selection as every other sample of that size. Simple
random sampling is the most elementary random sampling technique. It forms the basis
for the other random sampling techniques.
• With simple random sampling, n is used to represent the sample size and N is used to
represent the frame size. Every item or person in the frame is numbered from 1 to N.
The chance that any particular member of the frame is selected on the first draw is 1/N.
There are two basic methods by which samples are selected: with replacement and
without replacement.
• Sampling with replacement means that after a person or an item is selected, it is
returned to the frame, where it has the same probability of being selected again. Simple
random sampling with replacement selects every unit on every draw with equal probability
1/N.
• Sampling without replacement means that a person or item, once selected, is not
returned to the frame and therefore cannot be selected again. Simple random sampling
without replacement selects the first unit with probability 1/N, the second unit with
probability 1/(N - 1), the third unit with probability 1/(N - 2), and the nth unit with
probability 1/(N - n + 1).
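
The two selection methods can be sketched with Python's standard `random` module. The frame size N = 20, sample size n = 5 and the seed are assumptions chosen only for illustration; `random.sample` draws without replacement and `random.choices` draws with replacement.

```python
import random

random.seed(7)  # fixed seed so the draw is repeatable

# Hypothetical frame: items numbered 1..N, as described above.
N = 20
n = 5
frame = list(range(1, N + 1))

# Without replacement: a selected item cannot be selected again.
without_replacement = random.sample(frame, n)

# With replacement: each draw picks any item with probability 1/N,
# so the same item may appear more than once.
with_replacement = random.choices(frame, k=n)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```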



• Selection of a simple random sample:
Random sample refers to that method of
sample selection in which every item has an
equal chance of being selected. Random
sample can be obtained by any of the
following methods:
• Lottery Method
• Random number method
• Random number generator (available in various
software packages)



• Stratified random sampling:
This sampling method is used when we have to select samples from a heterogeneous
population. In other words, if we want to represent different sections of the population in
our study such as male and female, or educated and uneducated, or employed and
unemployed, this method of sampling is suitable. In stratified random sampling, the
population is divided into subgroups or strata and a simple random sample is taken from
each such subgroup. The units thus picked up from the subgroups together constitute a
stratified sample. Because samples are selected from each stratum, we can be sure that each
segment of the population is represented in our study.
• There are three reasons why a researcher chooses a stratified random sample:
• To increase a sample’s statistical efficiency.
• To provide adequate data for analyzing the various subpopulations or strata.
• To enable different research methods and procedures to be used in different strata.
• With the ideal stratification, each stratum is homogeneous internally and heterogeneous
with other strata. In this instance, stratification makes a pronounced improvement in
statistical efficiency.
• Stratified random sampling can be either proportionate or disproportionate. Proportionate
stratified random sampling occurs when the percentage of the sample taken from each
stratum is proportionate to the percentage that each stratum is within the whole population.
• Whenever the proportions of the strata in the sample are different from the proportion of
the strata in the population, disproportionate stratified random sampling occurs.
The process for drawing a stratified sample is:

• Determine the variables to use for stratification.
• Determine the proportions of the stratification
variables in the population.
• Select proportionate or disproportionate stratification
based on project information needs and risks.
• Divide the sampling frame into separate frames for
each stratum.
• Follow random or systematic procedures to draw the
sample from each stratum.
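
The steps above can be sketched for the proportionate case as follows. This is a minimal illustration, not a prescribed implementation: the function name `proportionate_stratified_sample`, the employed/unemployed frame and the sizes are all hypothetical, and the allocation simply rounds each stratum's share.

```python
import random
from collections import defaultdict

def proportionate_stratified_sample(frame, stratum_of, n, seed=None):
    """Draw about n units, allocating to each stratum in proportion to its size.

    frame      -- list of units (the sampling frame)
    stratum_of -- function mapping a unit to its stratum label
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:                      # divide the frame by stratum
        strata[stratum_of(unit)].append(unit)

    sample = {}
    for label, units in strata.items():
        # Proportionate allocation; rounding means the total may differ
        # slightly from n when the shares are not whole numbers.
        n_h = round(n * len(units) / len(frame))
        sample[label] = rng.sample(units, n_h)   # simple random sample per stratum
    return sample

# Hypothetical frame: 600 employed and 400 unemployed respondents.
frame = [("employed", i) for i in range(600)] + [("unemployed", i) for i in range(400)]
result = proportionate_stratified_sample(frame, lambda u: u[0], n=50, seed=1)
print({label: len(units) for label, units in result.items()})
```

With a 60/40 split in the frame, a sample of 50 is allocated as 30 and 20, so the sample proportions match the population proportions.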



• Systematic random sampling:
This sampling method involves the random selection of the first item and then the selection
of a sample item at every kth interval. This is one of the simplest and most widely used methods
of drawing a sample. The interval k is fixed by dividing the population size by the sample size.
• To select a systematic sample of items, we first need to calculate the sampling interval,
defined as:
$k = \dfrac{\text{size of population}}{\text{size of sample required}} = \dfrac{N}{n}$
Then take a random number between 1 and k, which determines the first member of the
sample. Selecting every kth member after the random start and doing this n - 1 times
determines the remaining n - 1 members of the sample. We continue to pick every kth
member until we reach the desired sample size.

To draw a systematic sample, we follow these steps:
• List the total number of units in the population (N).
• Decide the sample size (n).
• Calculate the sampling interval (k).
• Identify the random start.
• Draw a sample by choosing every kth entry.
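
The procedure can be sketched as below. The helper name `systematic_sample` and the frame of 100 numbered units are assumptions for illustration, and the sketch uses integer division for k, so any remainder of N/n is simply ignored.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first interval, then every k-th unit."""
    rng = random.Random(seed)
    N = len(frame)
    k = N // n                      # sampling interval (assumes N >= n; remainder ignored)
    start = rng.randint(1, k)       # random start between 1 and k, inclusive
    # Positions start, start + k, ..., start + (n - 1) * k, counted from 1.
    return [frame[start - 1 + i * k] for i in range(n)]

frame = list(range(1, 101))         # hypothetical frame of N = 100 numbered units
sample = systematic_sample(frame, n=10, seed=3)
print(sample)
```

Here k = 100 / 10 = 10, so consecutive sample members are always exactly 10 positions apart.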



• Cluster Sampling:
• Cluster (or area) sampling involves dividing the population into non
overlapping areas or clusters. However, in contrast to stratified random
sampling where strata are homogeneous, cluster sampling identifies clusters
that tend to be internally heterogeneous. In theory, each cluster contains a
wide variety of elements, and the cluster is a miniature, or microcosm, of
the population. Examples of clusters are towns, companies, homes, colleges,
areas of a city, and geographical regions. Often clusters are naturally
occurring groups of the population and are already identified. Some of
these clusters are then randomly selected for inclusion in the overall sample.
Although area sampling usually refers to clusters that are areas of the
population, such as geographical regions and cities, the terms cluster sampling
and area sampling are used interchangeably in this context.
• Cluster or area sampling offers several advantages. Two of the foremost
advantages are convenience and cost. Clusters are usually convenient to
obtain, and cost of sampling from entire population is reduced because the
scope of the study is reduced to the clusters.
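
A minimal sketch of one-stage cluster sampling follows. The towns and unit labels are invented for illustration; the sketch selects whole clusters at random and then includes every unit inside each selected cluster.

```python
import random

random.seed(11)  # fixed seed so the draw is repeatable

# Hypothetical population grouped into naturally occurring clusters (towns).
clusters = {
    "Town A": ["a1", "a2", "a3"],
    "Town B": ["b1", "b2"],
    "Town C": ["c1", "c2", "c3", "c4"],
    "Town D": ["d1", "d2", "d3"],
    "Town E": ["e1", "e2"],
}

# Randomly select whole clusters, then include every unit in each chosen cluster.
chosen = random.sample(list(clusters), 2)
sample = [unit for name in chosen for unit in clusters[name]]

print("selected clusters:", chosen)
print("sample:", sample)
```

Note that the final sample size depends on which clusters are drawn, since clusters differ in size.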



• Multistage Sampling:
• Multistage sampling is a further development of the principle
of cluster sampling. Instead of enumerating all the sample
units in the selected clusters, one can obtain better and more
efficient estimators by resorting to sub sampling within the
clusters. This type of sampling which consists of first selecting
the clusters and then selecting a specified number of
elements from each selected cluster is known as sub sampling
or two stage sampling. In such sampling designs, clusters
which form the units of sampling at the first stage are called
the first stage units (fsu) or primary sampling units (psu) and
the elements within clusters are called second stage units
(ssu). This procedure can be generalized to three or more
stages and is termed multistage sampling.
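
Two-stage sampling can be sketched as follows. The function name `two_stage_sample`, the six districts and the 40 households per district are hypothetical; both stages use simple random sampling, which is one possible choice among several.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters (PSUs). Stage 2: pick elements (SSUs) within each."""
    rng = random.Random(seed)
    psus = rng.sample(sorted(clusters), n_clusters)
    return {
        # Take at most n_per_cluster elements; small clusters are taken whole.
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in psus
    }

# Hypothetical frame: 6 districts (PSUs) with 40 households (SSUs) each.
clusters = {f"District {d}": [f"D{d}-H{h}" for h in range(1, 41)] for d in range(1, 7)}
sample = two_stage_sample(clusters, n_clusters=3, n_per_cluster=5, seed=5)
for district, households in sample.items():
    print(district, households)
```

Adding a third stage would mean sampling again within each selected household, for example selecting individual members.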



Non random sampling (Non probability sampling):
Sampling techniques used to select elements from
the population by any mechanism that does not
involve a random selection process are called
nonrandom sampling techniques.
The four important non random sampling techniques
are:
• Convenience sampling
• Quota sampling
• Judgment (purposive) sampling
• Snowball sampling
• Convenience Sampling:
In convenience sampling, elements for the sample are selected for the
convenience of the researcher. The researcher typically chooses elements
that are readily available, nearby or willing to participate. The sample
tends to be less variable than the population because in many
environments the extreme elements of the population are not readily
available. The researcher will select more elements from the middle of the
population.
For example, a convenience sample of homes for door to door interviews
might include houses where people are at home, houses with no dogs, houses
near the street, first floor apartments and houses with friendly people. In
contrast, a random sample would require the researcher to gather data
only from houses and apartments that have been selected randomly, no
matter how inconvenient or unfriendly the location.
If a research firm is located in a mall, a convenience sample might be
selected by interviewing only shoppers who pass the shop and look
friendly.
• Quota Sampling:
Quota sampling appears to be similar to stratified random sampling.
Certain population subclasses, such as age group, gender or geographical
region are used as strata. However, instead of randomly sampling from
each stratum, the researcher uses a nonrandom sampling method to
gather data from each stratum until the desired quota of samples is filled.
Quotas are described by quota controls, which set the sizes of the
samples to be obtained from the subgroups. Generally, a quota is based
on the proportions of the subclasses in the population.

Quotas often are filled by using available, recent or applicable elements.
In quota sampling, an interviewer would begin by asking a few filter
questions; if the respondent represents a subclass whose quota has been
filled, the interviewer would terminate the interview.

Quota sampling can be useful if no frame is available for the population
and it is also less expensive than most random sampling techniques.
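
The quota-filling logic can be sketched as below. The function name `fill_quotas`, the stream of passers-by and the quotas of two per gender are assumptions for illustration. Respondents are taken in arrival order rather than by a random draw, which is exactly what makes the method nonrandom.

```python
def fill_quotas(respondents, quotas, subclass_of):
    """Accept respondents in arrival order until each subclass quota is filled."""
    accepted = {label: [] for label in quotas}
    for person in respondents:              # arrival order, not a random draw
        label = subclass_of(person)         # plays the role of the filter question
        if label in quotas and len(accepted[label]) < quotas[label]:
            accepted[label].append(person)
        # Otherwise the quota is already full and the interview is terminated.
        if all(len(accepted[l]) == q for l, q in quotas.items()):
            break                           # every quota is filled
    return accepted

# Hypothetical stream of passers-by, tagged by gender.
stream = [("F", 1), ("F", 2), ("M", 3), ("F", 4), ("M", 5), ("F", 6), ("M", 7)]
result = fill_quotas(stream, quotas={"F": 2, "M": 2}, subclass_of=lambda p: p[0])
print(result)
```

In this run respondent ("F", 4) is turned away because the quota for "F" is already full, and interviewing stops once ("M", 5) fills the last quota.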



Judgment (purposive) sampling:
Judgment sampling occurs when elements selected for the sample
are chosen by the judgment of the researcher. Researchers often
believe they can obtain a representative sample by using sound
judgment, which will result in saving time and money. Sometimes
ethical, professional researchers might believe they can select a more
representative sample than the random process will provide. When
sampling is done by judgment, calculating the probability that an
element is going to be selected into the sample is not possible. The
sampling error cannot be determined objectively because
the selection is nonrandom. With this technique the
researcher tends to make errors of judgment in one direction. These
systematic errors lead to what are called biases. The researcher also
is unlikely to include extreme elements.
Example: Market selection for the construction of consumer price
index
• Snowball sampling:
In this method survey subjects are selected based on
referral from other survey respondents. The
researcher identifies a person who fits the profile of
subjects wanted for the study. The researcher then
asks this person for the names and locations of others
who also fit the profile of subjects wanted for the
study. Through this referral, survey subjects can be
identified cheaply and efficiently, which is particularly
useful when subjects are difficult to locate. This is the
main advantage of snowball sampling; its main
disadvantage is that it is nonrandom.



• Choosing non probability versus probability sampling
• In probability sampling every element in the population has
a nonzero probability of selection. For this, a complete list of
population elements (a sampling frame) is required. In certain
cases the complete list cannot be formed (for example, a customer
satisfaction survey of Kathmandu Mall); in such cases a non
probability sampling technique is used instead of a
probability sampling technique.
• The choice between non probability and probability
samples should be based on considerations such as the nature
of the research, the relative magnitude of non sampling versus
sampling errors, variability in the population, as well as
statistical and operational considerations, as summarized in the following table.



FACTORS                              CONDITIONS FAVORING THE USE OF
                                     Non probability sampling    Probability sampling
Sampling frame is available or not   Frame is unavailable        Frame is available
Nature of the research               Exploratory                 Conclusive
Relative magnitude of sampling       Non sampling errors         Sampling errors
and non sampling errors              are larger                  are larger
Variability in the population        Homogeneous (low)           Heterogeneous (high)
Statistical consideration            Unfavorable                 Favorable
Operational consideration            Favorable                   Unfavorable



Sampling Vs Non Sampling Error
             Sampling error    Non sampling error
Census       Absent            Present
Sampling     Present           Present



Sampling versus non sampling error:
• Sampling errors:
• Sampling error is the difference between a sample statistic and the corresponding population parameter.
• In sample surveys, since only a small portion of the population is studied, the results
are bound to differ from the census results and thus have a certain amount of error.
The error that arises from estimating population parameters from only a few selected
units (a sample) is called sampling error. This error is inherent and unavoidable in any
and every sampling scheme. A sample with the smallest sampling error is considered
to be a good representative of the population. Increasing the size of the sample
reduces sampling error; for a sample mean, the standard error is inversely proportional
to the square root of the sample size. The error can be completely eliminated only by
increasing the sample to include every item in the population.
Sampling errors may also arise due to:
• Faulty selection of the sample
• Substitution of a convenient unit of the population
• Faulty demarcation of sampling units
• Improper choice of the statistic for estimating the population parameter
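
How sampling error shrinks as the sample grows can be checked with a small simulation. Everything here is an assumption made for illustration: the population is 10,000 simulated values, and the helper `mean_abs_sampling_error` averages the absolute difference between the sample mean and the population mean over repeated simple random samples.

```python
import random
import statistics

random.seed(2021)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 values; its mean is the parameter.
population = [random.uniform(0, 100) for _ in range(10_000)]
mu = statistics.fmean(population)

def mean_abs_sampling_error(n, repeats=200):
    """Average |x_bar - mu| over repeated simple random samples of size n."""
    errors = []
    for _ in range(repeats):
        x_bar = statistics.fmean(random.sample(population, n))
        errors.append(abs(x_bar - mu))
    return statistics.fmean(errors)

for n in (25, 100, 400, 10_000):
    print(n, round(mean_abs_sampling_error(n), 3))
```

The average error falls as n increases, and when the sample includes every unit (n = 10,000, a census) the sampling error vanishes.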



• Non sampling errors:
• Non sampling errors are not attributed to chance and are a consequence
of certain factors which are within human control. In other words, they
are due to certain causes which can be traced and may arise at any
stage of the inquiry such as planning and execution of the survey and
collection, processing and analysis of the data. Non sampling errors are
thus present both in census surveys as well as sample surveys. Thus, the
data obtained in a complete enumeration, although free from sampling
errors would still be subject to non sampling errors whereas data
obtained in a sample survey would be subject to both sampling and non
sampling errors.
• Non sampling errors can occur at every stage of the planning or
execution of census or sample survey. The preparation of an exhaustive
list of all the sources of non sampling errors is a very difficult task.
However, a careful examination of the major phases of a survey
(enumeration or sample) indicates that some of the more important non
sampling errors arise from the following factors.



Faulty Planning or Definitions:
• The planning of a survey consists in explicitly stating the
objectives of the survey and these objectives are then
translated into i) a set of definitions of the characteristics for
which data are to be collected, and ii) into a set of
specifications for collection, processing and publishing. Here
non sampling error can be due to:
• Data specification being inadequate and inconsistent with
respect to the objectives of the survey.
• Errors due to location of the units and actual measurement of
the characteristics, errors in recording the measurements,
errors due to ill designed questionnaire, etc.
• Lack of trained and qualified investigators and lack of adequate
supervisory staff.



Response errors (Giving the wrong answer):
These errors are introduced as a result of the responses furnished by the
respondents and may be due to any of the following reasons:
• Response errors may be accidental
• Prestige bias
• Self-interest (intentionally wrong answers)
• Bias due to interviewer
• Failure of respondent’s memory
Non-response bias (Not giving the answer) :
• In house-to-house survey, non-response usually results if the respondent is
not found at home even after repeated calls, or if he is unable to furnish the
information on all the questions or if he refuses to answer certain questions.
Errors in coverage:
• If the objectives of the survey are not precisely stated, this may result in i) the
inclusion in the survey of certain units which are not to be included, or ii) the
exclusion of certain units which were to be included in the survey under the
objectives.



• Compiling errors:
Various operations of data processing such as editing and
coding of the responses, punching of cards, tabulation and
summarizing the original observations made in survey are
the potential source of error.
• Publication errors:
Publication errors, i.e. the errors committed during
presentation and printing of tabulated results, are
basically due to two sources. The first refers to the
mechanics of publication: the proofing error. The other,
which is of more serious nature, lies in the failure of the
survey organization to point out the limitations of the
statistics.
A census involves only non sampling error, while a sample survey contains
both sampling and non sampling errors. In a sample survey, non sampling error
can be effectively controlled by:
• Employing qualified and trained personnel for the planning and execution of the
survey
• Using more sophisticated statistical techniques and equipment for the processing
and analysis of the data
• Providing adequate supervisory checks on the fieldwork
• Pre-testing or conducting a pilot survey
• Thorough editing and scrutiny of the results
• Effective checking of all the steps in the processing and analysis of data
• More effective follow up of non-response cases
• Imparting thorough training to the investigators for efficient conduct of the inquiry
• Providing adequate motivational and awareness programs

Moreover, the sampling error in a sample survey can be minimized by taking an
adequately large sample selected by an appropriate sampling plan.



Determining sample size:

In economic and business research,
determining the proper sample size is a
complicated procedure, subject to the
constraints of budget, time, ease of selection
and required precision. What factors should
be considered while determining the size of
the sample? We should try to select a sample
that includes enough participants to ensure a
valid research investigation.
Factors affecting sample size (before using the statistical
formula)
• Population and its distribution
• Objectives
• Number of subgroups
• Variability
• Time
• Budget
• Geography and ease of selection



Statistical factors
• Sampling error: the larger the acceptable sampling error (e),
the smaller the required sample size (inverse relationship)
• Confidence level (1 - α): the higher the
confidence level, the larger the sample size
• Variability: the higher the variability, the larger the
sample size
• Number of subgroups: the more subgroups to be analyzed,
the larger the sample size



Formula for mean
• For mean
$n = \dfrac{Z^2 \sigma^2}{e^2}$
• Where Z = value of z at α% level of significance
(table)
• σ = standard deviation (given)
• e = sampling error (given/within ±)
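
The formula can be turned into a small helper. This is a sketch under stated assumptions: the function name `sample_size_for_mean` and the input values are illustrative, the two-tailed critical value is computed with `statistics.NormalDist` instead of being read from a table, and the result is rounded up because a fraction of a unit cannot be sampled.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence=0.95):
    """n = Z^2 * sigma^2 / e^2, rounded up to the next whole unit."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    return math.ceil(z**2 * sigma**2 / e**2)

# Illustrative inputs (hypothetical): sigma = 15, e = 2, 95% confidence.
print(sample_size_for_mean(sigma=15, e=2))
```

Because the computed Z (about 1.95996 at 95%) is slightly more precise than the tabled 1.96, results can occasionally differ by one unit from a hand calculation when the unrounded value lies very close to a whole number.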



Example (Mean)
• If the quality control manager wants to estimate the mean life
of light bulbs to within ±20 hours with 95% confidence and also
assumes that the process standard deviation is 100 hours, what
sample size is needed?
Solution:
Given that
Sampling error (e) = ±20 hours
Confidence level (1 - α) = 95%
α = 5%, then Z at 5% = 1.96 (from table, two-tailed)
Standard deviation (σ) = 100 hours
Then sample size $n = \dfrac{Z^2 \sigma^2}{e^2} = \dfrac{(1.96)^2 (100)^2}{(20)^2} = 96.04$, which is rounded up to 97.



For proportion
• For proportion
$n = \dfrac{Z^2 P (1 - P)}{e^2}$
• Where Z = value of z at α% level of significance
(table)
• e = sampling error (given)
• P = proportion from a similar past survey (given)
= 0.5 (if it is not given)
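
A matching sketch for the proportion formula follows. The function name `sample_size_for_proportion` and the inputs are illustrative assumptions; P defaults to 0.5 when no past estimate is available, which gives the largest (most conservative) sample size because P(1 - P) is at its maximum there.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(e, p=0.5, confidence=0.95):
    """n = Z^2 * P * (1 - P) / e^2, rounded up to the next whole unit."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    return math.ceil(z**2 * p * (1 - p) / e**2)

# Illustrative inputs (hypothetical): e = 0.05 at 95% confidence.
print(sample_size_for_proportion(e=0.05))          # no past estimate, so P = 0.5
print(sample_size_for_proportion(e=0.05, p=0.2))   # with a past estimate of 0.2
```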



Example (Proportion)
A manager of a finance company wants to determine whether the proportion of
delinquent consumer loans has changed from its previous level of 0.07.
a) The manager would like 95% confidence that her prediction is correct to within ±0.02.
What sample size is needed?
b) If you have no previous idea about the proportion of delinquent consumer loans, would your
answer to part (a) change?
Solution:
a) Proportion from past survey (P) = 0.07
Confidence level (1 - α) = 95%, then α = 5%
Then Z at 5% = 1.96
Sampling error (e) = ±0.02
Then $n = \dfrac{Z^2 P (1 - P)}{e^2} = \dfrac{(1.96)^2 (0.07)(0.93)}{(0.02)^2} = 625.22$, which is rounded up to 626.
b) Proportion (P) = 0.5 (since a previous value is not given)
Confidence level (1 - α) = 95%, then α = 5%
Then Z at 5% = 1.96
Sampling error (e) = ±0.02
$n = \dfrac{Z^2 P (1 - P)}{e^2} = \dfrac{(1.96)^2 (0.5)(0.5)}{(0.02)^2} = 2401$, so yes, the required sample size increases.
• Sample size determination (Exercise)
1. If the quality control manager wants to estimate the mean life of light bulbs to
within ±20 hours with 95% confidence and also assumes that the process standard
deviation is 100 hours, what sample size is needed?
2. The personnel director of a large corporation wishes to study absenteeism among
clerical workers at the corporation's central office during the year.
Assuming that the personnel director also wishes to take a survey in a branch office,
answer these questions:
a) What sample size is needed if the director wishes to be 95% confident of being correct
to within ±1.5 days and the population standard deviation is assumed to be 4.5 days?
b) What sample size is needed if the director wishes to be 90% confident of being correct
to within ±0.075 of the population proportion of workers who are absent more than 10
days, if no previous estimate is available?
3. A manager of a finance company wants to determine whether the proportion of
delinquent consumer loans has changed from its previous level of 0.07.
a) The manager would like 99% confidence that her prediction is correct to within ±0.02.
What sample size is needed?
b) If you have no previous idea about the proportion of delinquent consumer loans, would your
answer to part (a) change?
