
Important Terms of Sampling

Sampling: It is the process of choosing a representative sample from a target population and collecting data
from that sample in order to come to a conclusion about the population as a whole.

Census: A census is the process of collecting information about every member of a population. A census
includes every member in a selected population. A census provides statistical information about the population
of a country, the ratio of different genders, the number of employed people, the total number of educated people, etc.

Objective of Sampling: To obtain the optimum results, i.e., the maximum information about the
characteristics of the population with the available resources at our disposal in terms of time, money and
manpower, by studying the sample values only; that is, to obtain the best possible estimates of the population
parameters.

 Population/Statistical Population: A collection of UNITS being studied. UNITS can be people, places,
objects, epochs, drugs, procedures, or many other things.
 Animate and inanimate: An animate object is anything living, e.g., an animal, a person, or a plant. An
inanimate object is anything without life, e.g., a baseball bat, a fork, or a lamp.
 Concrete and abstract: Abstract is a concept that you cannot always see or touch. Concrete is
something that can be seen or known.
 Finite and infinite: The set A = {1, 2, 3, 4, 5} is finite and countable. The set of integers is considered
infinite and countable.
 Existent and Hypothetical: If the population consists of concrete, existent individuals, then it is
called an existent population. If the individuals of the population do not exist in reality but only exist in
imagination, then the population is hypothetical.
 Sampled and Target: The target population is usually the ideal population or universe to which
research results are to be generalized, for example, the entire adult population of the U.S. The sampled
population is a subset of subjects that is representative of the entire population. The sample must
have sufficient size to warrant statistical analysis.

 Discrete and Continuous: Discrete variables are usually obtained by counting. There is a finite or
countable number of choices available with discrete data. You can't have 2.63 people in the room.
Continuous variables are usually obtained by measuring. Length, weight, and time are all examples
of continuous variables. Since continuous variables are real numbers, we usually round them. This
implies a boundary depending on the number of decimal places. For example, 64 is really anything in
63.5 <= x < 64.5. Likewise, if there are two decimal places, then 64.03 is really anything in 64.025 <= x
< 64.035. Boundaries always have one more decimal place than the data and end in a 5.
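The boundary rule for rounded values can be sketched in a few lines of Python. This is only an illustration: the function name `class_boundaries` is hypothetical, and the recorded value is passed as a string so that the number of decimal places is preserved.

```python
from decimal import Decimal


def class_boundaries(recorded: str) -> tuple[Decimal, Decimal]:
    """Return (lower, upper) such that lower <= x < upper rounds to `recorded`.

    `recorded` is a string so that trailing decimal places are not lost.
    """
    value = Decimal(recorded)
    # The exponent is 0 for "64" and -2 for "64.03".
    exponent = value.as_tuple().exponent
    # Half of one unit in the last recorded place: 0.5 for "64", 0.005 for "64.03".
    half_unit = Decimal(5).scaleb(exponent - 1)
    return value - half_unit, value + half_unit


print(class_boundaries("64"))     # boundaries with one decimal place
print(class_boundaries("64.03"))  # boundaries with three decimal places
```

Each boundary has one more decimal place than the recorded value and ends in a 5, as the rule states.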
Parameter: The summary description of a given variable in a population.

Population Size (N): The number of individuals, items, or data points in the population from which a statistical sample is taken.

Sample: A sample is a collection of units from a population.

Random Sample: A random sample is a subset of a statistical population in which each member of the population
has an equal probability of being chosen. A simple random sample is meant to be an unbiased representation
of a group.

Non-Random Sample:

Statistic: A summary description of a given variable in a sample.

Sampling Units/Units: A sample from a population can be drawn one UNIT at a time, or more than one unit at
a time (one can sample clusters of units). The fundamental unit of the sample is called the sampling unit. It
need not be a unit of the population.

Sample Size (n): The number of elements in a sample from a population.

Variable of Interest (Random Variable X): A random variable is an assignment of numbers to possible
outcomes of a RANDOM EXPERIMENT. For example, consider tossing three coins. The number of heads showing when
the coins land is a random variable: it assigns the number 0 to the outcome {T, T, T}, the number 1 to outcomes
with exactly one head such as {T, T, H}, the number 2 to outcomes with exactly two heads such as {T, H, H}, and
the number 3 to the outcome {H, H, H}.
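The three-coin example can be enumerated directly. The sketch below is illustrative only (the names `outcomes`, `number_of_heads`, and `heads_counts` are hypothetical): it lists all eight ordered outcomes and applies the random variable "number of heads" to each.

```python
from collections import Counter
from itertools import product

# All 2^3 = 8 ordered outcomes of tossing three coins.
outcomes = list(product("HT", repeat=3))


def number_of_heads(outcome: tuple[str, ...]) -> int:
    """The random variable X: maps an outcome to its number of heads."""
    return outcome.count("H")


# How many outcomes are mapped to each value of X.
heads_counts = Counter(number_of_heads(o) for o in outcomes)

for outcome in outcomes:
    print("".join(outcome), "->", number_of_heads(outcome))
```

Counting ordered outcomes shows that X takes the values 1 and 2 for three outcomes each, and the values 0 and 3 for one outcome each.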

Population Distribution (Distribution of random variable X): The probability distribution of
a RANDOM VARIABLE specifies the chance that the variable takes a value in any subset of the real
numbers. (The subsets have to satisfy some technical conditions that are not important for this
course.) The probability distribution of a RANDOM VARIABLE is completely characterized by
the CUMULATIVE PROBABILITY DISTRIBUTION FUNCTION; the terms sometimes are used synonymously.
The probability distribution of a DISCRETE RANDOM VARIABLE can be characterized by the chance that
the RANDOM VARIABLE takes each of its possible values. For example, the probability distribution of
the total number of spots S showing on the roll of two fair dice is given by the chance of each total
from 2 to 12. The probability distribution of
a CONTINUOUS RANDOM VARIABLE can be characterized by its PROBABILITY DENSITY FUNCTION.
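As an illustration of a discrete distribution, the following sketch tabulates the chance of each total S for two fair dice and evaluates the cumulative distribution function. The function names `two_dice_distribution` and `cdf` are hypothetical; exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution() -> dict[int, Fraction]:
    """Chance of each total S showing on two fair six-sided dice."""
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    # Each of the 36 ordered rolls is equally likely.
    return {total: Fraction(count, 36) for total, count in sorted(counts.items())}


def cdf(pmf: dict[int, Fraction], s: int) -> Fraction:
    """Cumulative distribution function: P(S <= s)."""
    return sum((p for total, p in pmf.items() if total <= s), Fraction(0))


pmf = two_dice_distribution()
for total, p in pmf.items():
    print(total, p)
```

Either table (the chance of each value, or the cumulative chances) determines the other, which is why the two descriptions are often used interchangeably.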

Reliability (standard error of statistic & its exact sampling distribution):

Advantages of Sampling:

 Sampling saves time and money.
 Sampling saves labor.
 A sample coverage permits a higher overall level of adequacy than a full enumeration.
 A complete census is often unnecessary, wasteful, and a burden on the public.

Sample Survey: A survey based on the responses of a sample of individuals, rather than the entire
population.

Sample design: A set of rules or procedures that specify how a sample is to be selected. This can either be
probability or non-probability.

Sampling Frame: List of sampling units from which the sample, or some stage of the sample, is selected. It is
simply a list of the study population.

Probability Sampling: A sample drawn from a population using a random mechanism so that every element
of the population has a known chance of ending up in the sample.

Non-probability Sampling: Non-probability sampling is a sampling technique where the samples are
gathered in a process that does not rely on a random mechanism, so the individuals in the population do not
have a known chance of being selected.

Sampling with and without replacement: Suppose we have a bowl of 100 unique numbers from 0 to 99. We
want to select a random sample of numbers from the bowl. After we pick a number from the bowl, we can put
the number aside or we can put it back into the bowl. If we put the number back in the bowl, it may be
selected more than once; if we put it aside, it can be selected only one time. When a population element can be
selected more than one time, we are sampling with replacement. When a population element can be
selected only one time, we are sampling without replacement.
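The bowl example can be sketched with Python's standard `random` module: `choices` draws with replacement and `sample` draws without replacement. The seed and the sample size of 10 are arbitrary choices made for this illustration.

```python
import random

rng = random.Random(42)  # arbitrary seed, only to make the run repeatable
bowl = list(range(100))  # the 100 unique numbers 0..99

# With replacement: each number goes back into the bowl, so repeats are possible.
with_replacement = rng.choices(bowl, k=10)

# Without replacement: each number is set aside, so no repeats can occur.
without_replacement = rng.sample(bowl, k=10)

# With replacement, the sample can even be larger than the population.
oversized = rng.choices(bowl, k=150)

# Without replacement, that is impossible: sample() raises ValueError.
try:
    rng.sample(bowl, k=150)
    sample_failed = False
except ValueError:
    sample_failed = True

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```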

Sampling errors: The degree of error to be expected for a given sample design, such as the difference between the
sample mean and the population mean.

Non-sampling errors: A statistical error caused by human error to which a specific statistical analysis is
exposed. These errors can include, but are not limited to, data entry errors, biased questions in a
questionnaire, biased processing/decision making, inappropriate analysis conclusions and false information
provided by respondents.

Sampling Bias: The notion that those selected are not "typical" or "representative" of the larger population
from which they have been chosen.

Random Number Table: A random number table is a list of numbers, composed of the digits 0, 1, 2, 3, 4, 5,
6, 7, 8, and 9. Numbers in the list are arranged so that each digit has no predictable relationship to the digits
that preceded it or to the digits that followed it. In short, the digits are arranged randomly. Numbers in a
random number table are random numbers.

 Probability sampling (SRS, Stratified Sampling, Systematic sampling, Cluster Sampling, Multi Stage,
Multi-Phase Sampling, Sequential sampling)
 Simple Random Sampling: A simple random sample (SRS) of size n is produced by a scheme which
ensures that each subgroup of the population of size n has an equal probability of being chosen as the
sample.

 Stratified Sampling: There may often be factors which divide up the population into
sub-populations (groups / strata) and we may expect the measurement of interest to vary among the
different sub-populations. This has to be accounted for when we select a sample from the population
in order that we obtain a sample that is representative of the population. This is achieved by
stratified sampling. A stratified sample is obtained by taking samples from each stratum or
sub-group of a population. When we sample a population with several strata, we generally require that
the proportion of each stratum in the sample should be the same as in the population. Stratified
sampling techniques are generally used when the population is heterogeneous, or dissimilar, where
certain homogeneous, or similar, sub-populations can be isolated (strata). Simple random sampling
is most appropriate when the entire population from which the sample is taken is homogeneous.
Some reasons for using stratified sampling over simple random sampling are:

a. the cost per observation in the survey may be reduced;
b. estimates of the population parameters may be wanted for each sub-population;
c. increased accuracy at given cost.

Example 
Suppose a farmer wishes to work out the average milk yield of each cow type in his herd which consists of
Ayrshire, Friesian, Galloway and Jersey cows. He could divide up his herd into the four sub-groups and take
samples from these.
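A minimal sketch of proportional stratified sampling for the herd example follows. The herd sizes (40, 30, 20, 10), the total sample size of 20, and the function names are assumptions made for illustration and are not part of the example above. Each breed receives a share of the sample equal to its share of the herd, and a simple random sample is then drawn within each breed.

```python
import random


def proportional_allocation(stratum_sizes: dict[str, int], n: int) -> dict[str, int]:
    """Split a total sample size n across strata in proportion to their sizes."""
    total = sum(stratum_sizes.values())
    exact = {name: n * size / total for name, size in stratum_sizes.items()}
    quotas = {name: int(share) for name, share in exact.items()}
    # Hand out any units lost to rounding down, largest fractional part first.
    leftover = n - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - quotas[name], reverse=True)
    for name in by_remainder[:leftover]:
        quotas[name] += 1
    return quotas


def stratified_sample(strata: dict[str, list], n: int, rng: random.Random) -> dict[str, list]:
    """Draw a simple random sample, without replacement, inside every stratum."""
    quotas = proportional_allocation({name: len(units) for name, units in strata.items()}, n)
    return {name: rng.sample(units, quotas[name]) for name, units in strata.items()}


# Hypothetical herd: cow identifiers grouped by breed.
herd_sizes = {"Ayrshire": 40, "Friesian": 30, "Galloway": 20, "Jersey": 10}
herd = {breed: [f"{breed}-{i}" for i in range(size)] for breed, size in herd_sizes.items()}

rng = random.Random(1)  # arbitrary seed
sample = stratified_sample(herd, n=20, rng=rng)
for breed, cows in sample.items():
    print(breed, len(cows))
```

If the farmer mainly wants a separate estimate for each breed, equal sample sizes per breed may serve better than proportional ones; the allocation step is the only part that would change.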

 Systematic sampling: A type of probability sampling method in which sample members from a
larger population are selected according to a random starting point and a fixed, periodic interval.
This interval, called the sampling interval, is calculated by dividing the population size by the desired
sample size. Despite the sample population being selected in advance, systematic sampling is still
thought of as being random, provided the periodic interval is determined beforehand and the
starting point is random.
 Cluster Sampling: Cluster sampling is a sampling technique where the entire population is divided
into groups, or clusters, and a random sample of these clusters is selected. All observations in the
selected clusters are included in the sample.
 Multi stage sampling: Sometimes the population is too large and scattered for it to be practical to
make a list of the entire population from which to draw an SRS. For instance, when a polling
organization samples US voters, it does not do an SRS. Since voter lists are compiled by counties, it
might first take a sample of the counties and then sample within the selected counties. This illustrates
two stages. In some instances, it might use even more stages. At each stage, it might do a
stratified random sample on sex, race, income level, or any other useful variable on which it could
get information before sampling.
 Multi-Phase sampling: A sampling method in which certain items of information are collected from all
units of a sample and certain other items of information are collected only from a subsample.
 Sequential sampling: Sequential sampling is a non-probabilistic sampling technique, initially
developed as a tool for product quality control. The sample size, n, is not fixed in advance, nor is the
timeframe of data collection. The process begins with the sampling of a single observation or a
group of observations. These are then tested to see whether or not the null hypothesis can be
rejected. If the null is not rejected, then another observation or group of observations is sampled and
the test is run again. In this way the test continues until the researcher is confident in his or her results.

Purposive Sampling: Purposive sampling is when a researcher chooses specific people within the
population to use for a particular study or research project. Unlike random studies, which deliberately
include a diverse cross section of ages, backgrounds and cultures, the idea behind purposive sampling is to
concentrate on people with particular characteristics who will better be able to assist with the relevant
research.

Quota Sampling: Quota sampling is a method of sampling widely used in opinion polling and market
research. Interviewers are each given a quota of subjects of a specified type to attempt to recruit. For example,
an interviewer might be told to go out and select 20 adult men and 20 adult women, 10 teenage girls and 10
teenage boys so that they could interview them about their television viewing. It suffers from a number of
methodological flaws, the most basic of which is that the sample is not a random sample and therefore the
sampling distributions of any statistics are unknown.

Sampling distribution of Mean:

Central Limit Theorem: The central limit theorem in its shortest form states that the sampling distribution
of the sample mean approaches a normal distribution as the sample size gets larger, regardless of the
shape of the population distribution. So the sample means will be approximately normally distributed (especially
when the sample size is above 30) whether the population is positively skewed, negatively skewed or even binary
(having only 2 outcomes).
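A small simulation can illustrate the theorem. The sketch below assumes an exponential population with mean 1 (a positively skewed distribution chosen only for illustration), a sample size of 30, and 2000 repeated samples; the function names and the seed are hypothetical.

```python
import random
import statistics


def draw_exponential(rng: random.Random) -> float:
    """One draw from a positively skewed population with mean 1."""
    return rng.expovariate(1.0)


def sample_means(n: int, replications: int, rng: random.Random) -> list[float]:
    """Mean of a sample of size n, repeated `replications` times."""
    return [statistics.fmean(draw_exponential(rng) for _ in range(n))
            for _ in range(replications)]


def skewness(values: list[float]) -> float:
    """Average cubed z-score: about 0 for a symmetric, normal-like shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


rng = random.Random(0)  # arbitrary seed
population_draws = [draw_exponential(rng) for _ in range(20000)]
means = sample_means(n=30, replications=2000, rng=rng)

print("skewness of individual values:", round(skewness(population_draws), 2))
print("skewness of sample means:     ", round(skewness(means), 2))
print("mean of sample means:         ", round(statistics.fmean(means), 3))
print("std. dev. of sample means:    ", round(statistics.pstdev(means), 3))
```

Under these assumptions the sample means should center near the population mean of 1, with a spread near 1/sqrt(30), and should be far less skewed than the individual values.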

Sampling Distribution of Sampling Proportion:

Sampling Distribution of Variance:
