SAMPLING

STREAM SAMPLING
KIT-601
STREAM COMPUTING
• Stream computing is a way to analyze and process Big Data in real time, in order to gain
current insights, make appropriate decisions, or predict new trends in the immediate future.
• Stream computing is a computing paradigm that reads data from collections of software or
hardware sensors in stream form and computes over these continuous data streams.
• Stream computing uses software programs that process the data continuously as it arrives.
• Stream computing is one effective way to support Big Data, providing extremely low
latency through massively parallel processing architectures.
• It is becoming one of the fastest and most efficient ways to obtain useful knowledge from Big Data.
SAMPLING
Sampling is a method that allows us to get information about the
population based on the statistics from a subset of the population
(sample), without having to investigate every individual.
STREAM SAMPLING
Sampling is the practice of selecting a group of individuals from a population in order to
study the whole population.
Stream sampling is the process of collecting a representative sample of the elements
of a data stream. The sample is usually much smaller than the entire stream, but can
be designed to retain many important characteristics of the stream, and can be used
to estimate many important aggregates on the stream. Every sampling technique falls
under one of two broad categories:
• Probability sampling - Random selection techniques are used to select the sample.
• Non-probability sampling - Non-random selection techniques based on certain criteria are
used to select the sample.
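To make stream sampling concrete, here is a minimal sketch of reservoir sampling, one common probability-sampling technique for streams. These notes do not name a specific algorithm, so the choice of reservoir sampling and the names `reservoir_sample`, `stream`, `k` and `rng` are assumptions made for illustration. The sketch keeps a fixed-size sample in memory while reading a stream whose length is not known in advance.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from an iterable of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item is kept with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream: 100,000 sensor readings, represented here by integers.
sample = reservoir_sample(range(100_000), k=10, rng=random.Random(42))
print(sample)
```

Only `k` items are stored at any time, which is why this style of sampling suits streams that are too large to hold in memory.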
DATA SAMPLING PROCESS
1. Defining the population. The population is the entire set of data from which the sample is drawn. To
guarantee that the sample is representative of the entire population, the target population must be
precisely defined, including all essential traits and criteria.
2. Selecting a sampling technique. The next step is to choose the best sampling method based on the
research question and the characteristics of the population under study. There are several methods for
drawing samples from data such as simple random sampling, cluster sampling, stratified sampling and
systematic sampling.
3. Determining the sample size. The optimum sample size required to produce accurate and reliable results
should be decided in this phase. This decision may be influenced by certain factors, such as money, time
constraints and the requirement for greater precision. The sample size should be large enough to be
representative of the population, but not so large that it becomes impractical to work with.
4. Collecting the data. The data is collected from the selected units according to the
sampling approach that was chosen, using instruments such as interviews, surveys or
observations. Selection may be random or may follow other stated criteria,
depending on the research question. For example, in random sampling, data
points are selected at random from the population.
5. Analyzing the sample data. After collecting the data sample, it's processed
and analyzed to draw conclusions about the population. The results of the
analysis are then generalized or applied to the entire population.
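The five steps above can be sketched end to end as follows. This is an illustrative sketch only: the population is synthetic, the sample size of 200 is an arbitrary assumption, and the 1.96 multiplier is the usual normal approximation for a 95% interval.

```python
import random
import statistics

rng = random.Random(0)

# Step 1: define the population (synthetic values, purely illustrative).
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Step 2: select a technique (here, simple random sampling without replacement).
# Step 3: determine the sample size (an assumed value for this sketch).
sample_size = 200

# Step 4: collect the data by drawing the sample.
sample = rng.sample(population, sample_size)

# Step 5: analyze the sample and generalize to the population.
sample_mean = statistics.mean(sample)
std_error = statistics.stdev(sample) / sample_size ** 0.5
print(f"estimated mean: {sample_mean:.2f} +/- {1.96 * std_error:.2f}")
print(f"population mean: {statistics.mean(population):.2f}")
```

Comparing the two printed lines shows how close the estimate from 200 units comes to the value computed from all 10,000.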

TYPES OF SAMPLING TECHNIQUES
• PROBABILITY SAMPLING: In probability sampling, every element of
the population has a known, non-zero chance of being selected (an equal
chance in the simplest designs). Probability sampling gives us the best
chance to create a sample that is truly representative of the population.
• NON-PROBABILITY SAMPLING: In non-probability sampling, elements are
not selected at random, so their chances of being selected are unknown
and generally unequal. Consequently, there is a significant risk of ending up
with a non-representative sample which does not produce generalizable results.
SIMPLE RANDOM SAMPLING
This is a type of sampling technique you must have come across at
some point. Here, every individual is chosen entirely by chance and
each member of the population has an equal chance of being selected.
This can be done by drawing chits from a bowl, as in a lottery, or by
spinning a wheel. The advantages of simple random sampling are that it is
easy and cost-efficient, and that a large enough sample tends to
represent the whole population.
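The "equal chance" property can be checked empirically. The sketch below, which uses a hypothetical five-member population, repeatedly draws a simple random sample of 2 members and counts how often each member is picked. Each frequency should come out close to 2/5.

```python
import random
from collections import Counter

rng = random.Random(7)
population = ["A", "B", "C", "D", "E"]  # hypothetical members

counts = Counter()
trials = 10_000
for _ in range(trials):
    # random.Random.sample draws without replacement, like chits from a bowl.
    counts.update(rng.sample(population, k=2))

for member in population:
    # Each member's selection frequency should be close to 2/5.
    print(member, counts[member] / trials)
```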
SYSTEMATIC SAMPLING
• In this type of sampling, the first individual is selected randomly and the others are
selected using a fixed ‘sampling interval’.
• In systematic sampling, every kth unit from the population is taken. That means an
element is selected from the population at every regular interval. The starting point
is selected randomly and after that, every kth element is selected. In the accompanying
figure, k=3, so every 3rd element is selected.
• Say our population size is x and we have to select a sample of size n. Then the sampling
interval is k = x/n, so each selected individual is x/n positions after the previous one.
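A minimal sketch of systematic sampling follows, assuming the population is held in a list. The function name `systematic_sample` and the use of integer division for the interval are choices made for illustration.

```python
import random


def systematic_sample(population, n, rng=None):
    """Select n items: a random start, then every k-th item, with k = len(population) // n."""
    rng = rng or random.Random()
    k = len(population) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    start = rng.randrange(k)  # random starting point within the first interval
    # Slice with step k, then trim in case the population is not a multiple of n.
    return population[start::k][:n]


population = list(range(1, 101))  # hypothetical population of 100 numbered units
sample = systematic_sample(population, n=10, rng=random.Random(3))
print(sample)
```

With a population of 100 and a sample size of 10 the interval is 10, so consecutive selected units are always 10 positions apart.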
STRATIFIED SAMPLING
In this type of sampling, we divide the population into subgroups (called
strata) based on different traits like gender, category, etc. We then select
the sample(s) from each of these subgroups.
Example: Suppose we want to collect reviews of a book across a country. We can
divide the population into age groups such as 18-25 years, 26-35 years, 36-45
years, 46-55 years and 56-65 years. Each age group is one stratum. Then, a
particular number of members is selected from each age group to review the
book. These members form the final sample.
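The book-review example can be sketched as below. The reader records are synthetic, the helper names `stratified_sample` and `age_group` are hypothetical, and the age brackets are written as non-overlapping ranges so that every reader falls into exactly one stratum.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, per_stratum, rng=None):
    """Group records into strata, then draw a simple random sample from each stratum."""
    rng = rng or random.Random()
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


def age_group(person):
    """Map a reader to an age bracket (assumed, non-overlapping brackets)."""
    for low, high in [(18, 25), (26, 35), (36, 45), (46, 55), (56, 65)]:
        if low <= person["age"] <= high:
            return f"{low}-{high}"
    return "other"


rng = random.Random(1)
readers = [{"id": i, "age": rng.randint(18, 65)} for i in range(500)]
sample = stratified_sample(readers, age_group, per_stratum=5, rng=rng)
print(len(sample), "readers selected")
```

Taking the same number from every stratum is one design choice. Sampling each stratum in proportion to its size is another common option.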
CLUSTER SAMPLING
In a clustered sample, we use subgroups of the population, rather than individuals,
as the sampling unit. The population is divided into subgroups, known as clusters,
and whole clusters are randomly selected to be included in the study.
• In this type of sampling, the whole population is divided into some groups
or clusters. Units that share a grouping characteristic are kept in one cluster. For
example, people can be grouped according to their age or country.
• Clusters should not be confused with strata: in stratified sampling every stratum is
sampled, whereas here the researcher picks only some clusters at random (according to
the requirement and resources) and performs the research on those.
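A minimal sketch of cluster sampling, assuming the clusters are already known and stored as a dictionary. The college names and student IDs are made up for illustration.

```python
import random


def cluster_sample(clusters, num_clusters, rng=None):
    """Randomly pick whole clusters and return their names plus all of their members."""
    rng = rng or random.Random()
    chosen = rng.sample(sorted(clusters), num_clusters)
    members = [member for name in chosen for member in clusters[name]]
    return chosen, members


# Hypothetical clusters: six colleges with five student IDs each.
clusters = {
    f"college_{c}": [f"{c}{i:02d}" for i in range(1, 6)]
    for c in "ABCDEF"
}

chosen, members = cluster_sample(clusters, num_clusters=2, rng=random.Random(5))
print(chosen)
print(members)
```

The random choice is made among clusters, not among individuals: every member of a chosen cluster is included and no member of an unchosen cluster is.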
NON-PROBABILITY SAMPLING
1. AVAILABILITY SAMPLING: This is also known as convenience sampling. It occurs
when the researcher selects the samples based on availability. For example, if a
student wants to research how many college students use the canteen
for lunch, the student may survey only their own college and nearby colleges.
2. JUDGMENTAL SAMPLING: It is also called purposive sampling. Here, samples are
selected on the basis of the researcher’s own knowledge, experience and intuition.
Researchers choose this technique when they feel that other sampling
techniques are too time-consuming and they are confident about their knowledge.

3. QUOTA SAMPLING: In this type of sampling, the researcher divides the
population into groups according to some characteristic and selects a fixed
number of members (a quota) from each group.
4. SNOWBALL SAMPLING: This is also known as chain-referral sampling. Here,
referrals from existing sample members are used to recruit further members.
• For example: if a researcher is surveying people with a rare disease and knows only a few patients,
the researcher can ask these patients for the contacts of others. In this way, using snowball
sampling, researchers can get in touch with these hard-to-contact sufferers.
