
STATISTICS AND PROBABILITY

Week 6: Sampling Techniques and Measurement of Sample Size

Population - The totality of all the elements or persons in which one has an interest at a particular time. Its size is
denoted by N.

Sample - A subset of the population; a specific group drawn from the population. Its size is denoted by n.

Advantages And Disadvantages Of Sampling

How to choose your respondents?

TWO TYPES OF SAMPLING TECHNIQUES

NONPROBABILITY SAMPLING - Members are selected from the population in some nonrandom manner.

PROBABILITY SAMPLING - Each member of the population has a known, non-zero probability of being selected.

Non Probability Sampling

When to use Non probability Sampling?

• -There are particular traits needed

• -Usually, Qualitative research

• -Random is impossible

• -No generalization to the population is intended

• -Used in Initial study


Probability Sampling

When to use Probability Sampling?

• -Bias has to be reduced

• -Quantitative research

• -Diverse population

• -To create an accurate sample

• -Generalizing to the population

Non Probability Sampling

Convenience Sampling, Purposive Sampling, Quota Sampling, Snowball Sampling

Convenience Sampling

When to use?

- Often used during preliminary research efforts to get an estimate without incurring the cost or time required to select a
random sample.

- Results are needed immediately.

Process:

- The researcher determines the required sample size and then simply collects data from that number of individuals who
are easily available.

Advantage:

- Cheap, efficient and Simple to Implement

Disadvantage:

- Possibility of biased results, lack of generalizability

Purposive Sampling

When to use?

- Used when you want to access a particular subset of people (typical group)

Process:

- The sample is selected based upon judgment.

- Researcher selects a "typical group" of individuals who might represent the larger population and then collects data
from this group.

Advantage:

- Helps you make the most out of a small population of interest and arrive at valuable research outcomes

Disadvantage:

- Biased results may arise if the group of respondents is not appropriately selected.

Quota Sampling

When to use?

- Useful when the time frame to conduct a survey is limited, the research budget is very tight, or survey accuracy is not
the priority.

Process:

- The selection of the sample is made by the researcher, who decides the quotas for selecting the sample from specified
subgroups of the population

Advantage:

- Quicker and easier to carry out because it does not require a sampling frame and the strict use of random sampling
techniques

Disadvantage:

- Risky to project the research result to the whole population.

Snowball Sampling

When to use?

- Used when researchers have difficulty finding participants for their studies. This typically occurs in studies on hidden
populations, such as criminals, drug dealers or sex workers, as these individuals tend to be difficult for researchers to
access.

Process:

- This technique relies on referrals from initial subjects to generate additional subjects.

Advantage:

- It allows for studies to take place where otherwise it might be impossible to conduct because of a lack of participants.
Snowball sampling may help you discover characteristics about a population that you weren't aware existed.

Disadvantage:

- Representativeness of the sample is not guaranteed.

Probability Sampling

Simple Random Sampling, Systematic Random Sampling, Stratified Sampling, Cluster Sampling

Simple Random Sampling

When to use?

- Used when each member of the population should have an equal and known chance of being selected.

Process:

- Randomly selects participants without any consideration/biases.

Advantage:

- Lack of bias, Simplicity and Less knowledge required.

Disadvantage:

- Although simple random sampling is intended to be an unbiased approach to surveying, sample selection bias can
occur. When a sample set of the larger population is not inclusive enough, representation of the full population is
skewed and requires additional sampling techniques.
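
The selection process described above can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the sampling frame of ID numbers 1 to 100, the sample size of 10, and the function name `simple_random_sample` are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)          # seed only makes the draw repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical sampling frame: ID numbers 1 to 100
population = list(range(1, 101))
sample = simple_random_sample(population, n=10, seed=42)
print(sample)
```

Each run with a different seed gives a different set of 10 IDs, and no ID can appear twice in one sample.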

Systematic Random Sampling

When to use?

- Used in the same situations as simple random sampling, but it is an easier and faster technique.

Process:

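- After the required sample size n has been calculated, every k-th record is selected from a list of population members,
where the sampling interval is k = N / n and the starting record is chosen at random from the first k records.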

Advantage:

- Eliminates the phenomenon of clustered selection and carries a low probability of contaminating data

Disadvantage:

- Over- or under-representation of particular patterns, and a greater risk of data manipulation
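
A minimal Python sketch of the systematic process follows. It assumes the usual convention of a sampling interval k = N // n with a random start inside the first interval; the frame of 100 IDs and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start, then take every k-th record from the list."""
    N = len(population)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the population size")
    k = N // n                         # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)           # random position within the first k records
    return [population[start + i * k] for i in range(n)]

# Hypothetical sampling frame: ID numbers 1 to 100, so k = 100 // 10 = 10
population = list(range(1, 101))
sample = systematic_sample(population, n=10, seed=1)
print(sample)
```

If the list itself has a repeating pattern with the same period as k, every selected record falls at the same point of the pattern, which is the over- or under-representation risk noted above.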

Stratified Random Sampling

When to use?

- You should use stratified sampling when your population can be divided into mutually exclusive and exhaustive subgroups
that you believe will take on different mean values for the variable that you're studying.

Process:

- Step 1: Define your population and subgroups

- Step 2: Separate the population into strata

- Step 3: Decide on the sample size for each stratum

- Step 4: Randomly sample from each stratum

Advantage:

- A stratified sample can provide greater precision than a simple random sample of the same size

Disadvantage:

- A stratified random sample can only be carried out if a complete list of the population is available

How to compute samples for Stratified Sampling?

Example 1:

Compute the number of respondents needed from each stratum in a stratified sample, given:
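
The given values for Example 1 are not reproduced in these notes. As a sketch only, one common approach is proportional allocation, where each stratum receives n_h = (N_h / N) × n respondents. The strata sizes and the total sample size of 100 below are hypothetical, and proportional allocation itself is an assumption about the intended method.

```python
def proportional_allocation(strata_sizes, n):
    """Allocate a total sample of n across strata in proportion to stratum size."""
    N = sum(strata_sizes.values())     # total population size
    # Rounding can make the total differ slightly from n for other inputs.
    return {name: round(size * n / N) for name, size in strata_sizes.items()}

# Hypothetical strata (not the data from the original example)
strata_sizes = {"Grade 7": 400, "Grade 8": 300, "Grade 9": 200, "Grade 10": 100}
allocation = proportional_allocation(strata_sizes, n=100)
print(allocation)
```

Once the per-stratum sizes are fixed, respondents are drawn at random within each stratum, as in Step 4 above.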

Cluster Sampling

When to use?

- Best used to study large, spread-out populations, or when the population size is unknown.

Process:

- Step 1: Define your population.

- Step 2: Divide your population into clusters.

- Step 3: Randomly select clusters to use as your sample.

- Step 4: Collect data from the sample.

Advantage:

- It allows research to be conducted at a reduced cost.

- Cluster sampling reduces variability.

- It is a more feasible approach.

Disadvantage:

- It is easier to create biased data within cluster sampling.

- Many clusters are placed based on self-identifying information.
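
The four steps of cluster sampling can be sketched as follows. This assumes one-stage cluster sampling, in which every member of each selected cluster is surveyed; the class sections used as clusters and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each chosen one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)   # pick cluster names
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical population already divided into clusters (class sections)
clusters = {
    "Section A": ["A1", "A2", "A3"],
    "Section B": ["B1", "B2"],
    "Section C": ["C1", "C2", "C3", "C4"],
    "Section D": ["D1", "D2"],
}
chosen, sample = cluster_sample(clusters, num_clusters=2, seed=7)
print(chosen, sample)
```

Randomness enters only at the level of clusters, so the sample size depends on which clusters happen to be chosen.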

Sample Size for Known and Unknown Population

1. Define population size or number of people (N)

2. Designate your margin of error

3. Determine your confidence level


4. Predict expected variance

5. Finalize your sample size

Five steps in finding your sample size

1. Define population size or number of people (N)

Your sample size needs will differ depending on the true population size, that is, the total number of people you want
to draw conclusions about. That’s why determining the minimum number of individuals required to represent your selection
is an important first step.

2. Designate your margin of error

Margin of error is how much error you intend to permit. Sometimes loosely called a "confidence interval" (strictly, it
is half the width of one), a margin of error indicates how much you’re willing to let your sample mean differ from your
population mean. It’s often expressed alongside statistics as a plus-minus (±) figure, indicating a range which you can be
relatively certain about.

For example, say you survey a sample of your colleagues with a designated 3% margin of error and find that 65% of
the sample uses some form of voice recognition technology at home. If you were to ask your entire office, you could be
reasonably confident that the true figure is as low as 62% or as high as 68%.

***Margin of error can be calculated, but in research it is usually assigned by the researchers.

3. Determine your confidence level

Your confidence level indicates how certain you can be that the true proportion of the total population lies within a
particular range. The most common confidence levels are 90%, 95%, and 99%. Researchers most often employ a 95%
confidence level. Each confidence level corresponds to something called a "z-score": the number of standard deviations
below or above the mean of the standard normal distribution that encloses that percentage of the distribution.
Z-scores for the most common confidence levels are:

90% = 1.645 (often rounded to 1.65)

95% = 1.96

99% = 2.576

In our example from the previous step, when you put the confidence level and the margin of error together, you can say
you’re 95% certain that the true percentage of your colleagues who use voice recognition technology at home is within
± three percentage points of the sample proportion of 65%, or between 62% and 68%.
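
To show where the z-scores come from, the sketch below computes them from the standard normal distribution using Python's standard library, then rebuilds the 62% to 68% range from the example. The function name `z_for_confidence` is hypothetical. The 90% value is about 1.645, which is commonly rounded to 1.65.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Two-sided z-score: the middle `level` of the standard normal lies within ±z."""
    alpha = 1 - level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} -> z = {z_for_confidence(level):.3f}")

# Range from the voice recognition example: 65% with a 3% margin of error
p_hat, margin = 0.65, 0.03
low, high = round(p_hat - margin, 2), round(p_hat + margin, 2)
print(low, high)
```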

4. Predict expected variance

The last thing you’ll want to consider when calculating your sample size is the amount of variance you expect to see
among participant responses. Standard deviation measures how much individual data points deviate from the mean.
Don’t know how much variance to expect? Use a standard deviation of 0.5 to make sure your group is large enough.

5. Finalize your Sample Size
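
These notes do not state the formula used in this step. One widely used choice for a proportion is Cochran's formula, n0 = z² p(1 − p) / e², with the finite population correction n = n0 / (1 + (n0 − 1) / N) when N is known. The sketch below assumes that formula; the function name `sample_size` and the input values are hypothetical.

```python
import math

def sample_size(z, e, p=0.5, N=None):
    """Cochran's sample size for a proportion (assumed formula, see lead-in).

    z: z-score for the confidence level, e: margin of error,
    p: expected proportion (0.5 is the most conservative choice),
    N: population size, or None when the population is unknown.
    """
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)
    if N is not None:
        n0 = n0 / (1 + (n0 - 1) / N)   # finite population correction
    return math.ceil(n0)               # round up to a whole respondent

n_unknown = sample_size(z=1.96, e=0.05)          # unknown population
n_known = sample_size(z=1.96, e=0.05, N=1000)    # known population of 1000
print(n_unknown, n_known)
```

With a 95% confidence level and a 5% margin of error, the required sample drops from 385 for an unknown population to 278 when the population is known to be 1000.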


Week 7: Sampling Distribution

Population and Sampling Distribution

Sampling Distribution

• Population Distribution is the probability distribution of the population data. The probability distribution of the
sample mean is called its sampling distribution: the list of the various values that the sample mean x̄ can assume,
together with the probability of each value.

• Sampling Distribution of sample means is a probability distribution consisting of all possible sample means of a given
sample size n selected from a population of size N, and the probability of occurrence of each sample mean.

Samples for a sampling distribution can be drawn in two ways:

1. Sampling without replacement - a datum, once chosen from the population, is not returned to the population frame.

2. Sampling with replacement - a datum chosen from the population for the sample is returned to the population, where
it has a chance to be chosen again.

Each individual observation in a random sample has the population probability distribution p(x). A finite population is
one that contains a fixed number of observations, while an infinite population contains an infinite number of
observations.

Obtaining the number of samples:

• To obtain the number of possible samples of size n when sampling without replacement, we can apply combinations.
Recall that a combination is a selection of all or part of a set of things (or objects) without reference to the
arrangement of the things selected. The number of combinations of N objects taken n at a time is given by

$C(N, n) = \dfrac{N!}{n!\,(N - n)!}$

• To obtain the number of possible samples of size n when sampling with replacement, we can simply apply the
counting techniques in probability (samples in a different order are counted separately), given by the formula

$N^n$

• where N is the size of the population and n is the size of each sample
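
As a quick check of the two counts, the sketch below uses the standard counting rules: combinations, N! / (n!(N − n)!), for sampling without replacement, and N to the power n for ordered samples with replacement. It then verifies both by listing every sample. The four-element population is hypothetical.

```python
import math
from itertools import combinations, product

population = [1, 2, 3, 4]   # hypothetical population, N = 4
N, n = len(population), 2

without_replacement = math.comb(N, n)   # N! / (n! (N - n)!)
with_replacement = N ** n               # ordered samples, repeats allowed

# Enumerate the samples to confirm the counts
samples_without = list(combinations(population, n))
samples_with = list(product(population, repeat=n))
print(without_replacement, len(samples_without))
print(with_replacement, len(samples_with))
```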

Characteristics of Sampling Distribution

1. The mean of the sampling Distribution is equal to the population mean.

2. If the population is normally distributed, the sampling distribution of the sample mean x̄ will be normally distributed.
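
The first characteristic can be checked by building a full sampling distribution for a small case. The sketch below lists every sample of size 2 drawn without replacement from a hypothetical five-element population, tabulates the sample means with their probabilities, and compares the mean of the sample means with the population mean.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]   # hypothetical population
n = 2

# Every possible sample of size n, and the mean of each one
sample_means = [Fraction(sum(s), n) for s in combinations(population, n)]
total = len(sample_means)

# Sampling distribution: each distinct sample mean with its probability
distribution = {m: Fraction(c, total) for m, c in Counter(sample_means).items()}
for value in sorted(distribution):
    print(float(value), distribution[value])

population_mean = Fraction(sum(population), len(population))
mean_of_sample_means = sum(sample_means) / total
print(population_mean, mean_of_sample_means)
```

Exact fractions are used so that the equality of the two means is not blurred by floating-point rounding.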

Week 8: Central Limit Theorem

Definition of a Central Limit Theorem

Central Limit Theorem

• If random samples of size n are drawn from a population with mean μ and finite standard deviation σ, then as n grows
the sampling distribution of the sample mean x̄ approaches a normal distribution with mean μ and standard deviation σ/√n.

• The normal approximation in the theorem will be good if n ≥ 30, regardless of the shape of the population.

• If n < 30, the approximation is good only if the population is not too different from the normal.

• If the distribution of the population is normal, the sampling distribution will also be exactly normal, no matter how
small the sample size.
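
A small simulation can illustrate the n ≥ 30 rule of thumb. The sketch below draws many samples of size 30 from a strongly skewed population and summarizes the resulting sample means. The exponential population with mean 1 and standard deviation 1, the 5000 repetitions, and the seed are all assumptions chosen for illustration. By the standard result, the sample means should center near 1 with a spread near σ/√n = 1/√30.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)   # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from a skewed (exponential) population with mean 1."""
    return mean(rng.expovariate(1.0) for _ in range(n))

n = 30
means = [sample_mean(n) for _ in range(5000)]

print("mean of sample means:", round(mean(means), 3))
print("std. dev. of sample means:", round(stdev(means), 3))
print("sigma / sqrt(n):", round(1 / sqrt(n), 3))
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is far from normal.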

Week 9: Estimation of Parameters

Point Estimate and Interval Estimate

A point estimate is a numerical value and it identifies a location or a position in the distribution of possible values.

Example: A student guesses the exact number of exes of his Math teacher.

An interval estimate is a range of values where the true value will most likely fall (a logical range of values or set
of values with lower and upper limits).

Example: The same as the previous example, but this time the student gives a range of values within which the
teacher's number of exes would most likely fall.

Interval Estimate of a Population Mean

Confidence Coefficient and Confidence Interval

A confidence interval is a range of values, bounded above and below the statistic's mean, that likely would contain an
unknown population parameter. Confidence level refers to the percentage of probability, or certainty, that the
confidence interval would contain the true population parameter when you draw a random sample many times.

This measure of confidence in the interval estimate is referred to as the confidence coefficient. When combined with
the interval estimate, it is referred to as a confidence interval estimate. Hence, a confidence interval estimate is a
range of values where one has a certain percentage of confidence that the true value will likely fall.

The confidence level is denoted by (1 − α) 100%, where α is the Greek letter alpha.

Though any value of the confidence level can be selected to create a confidence interval, the more frequent values are
99%, 95%, and 90%. The corresponding confidence coefficients are 0.99, 0.95, and 0.90.
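
These notes do not give the formula for the interval estimate of a population mean. A minimal sketch, assuming the population standard deviation σ is known so that the z-based interval x̄ ± z·σ/√n applies, is shown below. The function name and the sample values (x̄ = 50, σ = 10, n = 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(xbar, sigma, n, level=0.95):
    """z-interval for a population mean, assuming sigma is known."""
    alpha = 1 - level                            # level = 1 - alpha
    z = NormalDist().inv_cdf(1 - alpha / 2)      # two-sided critical value
    margin = z * sigma / sqrt(n)                 # margin of error
    return xbar - margin, xbar + margin

# Hypothetical sample: mean 50, known sigma 10, sample size 100
low, high = mean_confidence_interval(xbar=50, sigma=10, n=100, level=0.95)
print(round(low, 2), round(high, 2))
```

Raising the confidence coefficient from 0.95 to 0.99 widens the interval, since a larger z is needed to capture the true mean more often.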
