Professional Documents
Culture Documents
Sampling and Error and Predictive Analysis: Manu Upadhyay Faculty VJTI
Sampling and Error and Predictive Analysis: Manu Upadhyay Faculty VJTI
Predictive Analysis
Manu Upadhyay
Faculty VJTI
Concept of Sampling
▪ Stratified Sampling
▪ Cluster Sampling
▪ Multi-stage Sampling
Types of Sampling
• Probability Sampling When each sampling unit has • Non probability Sampling When sampling units
an equal and known chance of being included in do not have an equal chance of being included
sample.
in the sample.
Probability Sampling involves selecting procedure to
ensure that all study units have an equal and / or
Types of Non Probability Sampling
known chance of being included in the study.
• Convenience Sampling
▪ Simple Random Sampling
• Quota Sampling
▪ Systematic Random Sampling
• Snow-ball Sampling
▪ Stratified Sampling
• Temporal Sampling
▪ Cluster Sampling
▪ Multi-stage Sampling
When to use probability and non-probability sampling
If a sampling frame is not available, it is not possible to sample the study units in a
such a way that the probability for the different units to be selected in such case
use non-probability sampling.
If the sampling frame is exist or can be compiled, each study unit has a known
probability of being selected in the sample. In such situation probability sampling
can be used
Sampling Frame
• An important issue influencing the choice of the most appropriate sampling
method is whether a sampling frame is available or not.
• Sampling Frame is the listing of all the units that from the study population.
Sampling and Non Sampling Errors
● “Sampling error is the error that arises in a data collection process as a result
of taking a sample from a population rather than using the whole population.
● Sampling error is one of two reasons for the difference between an estimate
of a population parameter and the true, but unknown, value of the population
parameter.
● The sampling error for a given sample is unknown but when the sampling is
random, for some estimates (for example, sample mean, sample proportion)
theoretical methods may be used to measure the extent of the variation
caused by sampling error.”
● Sampling error is mainly cause due to the reason that sample not whole
population.
● Non-sampling error is the error that arises in a data collection process as a
result of factors other than taking a sample.
● Non-sampling errors have the potential to cause bias in polls, surveys or
samples.
● There are many different types of non-sampling errors and the names used to
describe them are not consistent.
● This may be due to poor sampling method, measurement errors, and
behavioural effect.
Skewness and Kurtosis
Kurtosis, on the other hand, refers to the pointedness of a peak in the distribution curve.
The main difference between skewness and kurtosis is that the former talks of the degree of
symmetry, whereas the latter talks of the degree of peakedness, in the frequency distribution.
Data can be distributed in many ways, like spread out more on left or on the right or evenly
spread.
When the data is scattered uniformly at the central point, it called as Normal Distribution
(perfectly symmetric) both the sides are equal, and so it is not skewed. Here all the three mean,
median and mode lie at one point.
The term ‘skewness’ is used to mean the absence of
symmetry from the mean of the dataset.
The crux of the distribution is that in skewness the plot of the probability distribution is stretched to either side.
On the other hand, kurtosis identifies the way; values are grouped around the central point on the frequency
distribution.
Where, Xi is observation
X is sample mean
n is number of observations
Simple, Partial and Multiple Correlation and Rank
Correlation
The correlation is one of the most common and most useful statistics.