
STATISTICS AND PROBABILITY
(STUDY NOTES)

What is a random sample?
A random sample is a sample that is chosen randomly. It could be more accurately called a randomly chosen sample.

Random samples are used to avoid bias and other unwanted effects. Of course, it isn't quite as simple as it seems: choosing a random sample takes more than just picking 100 people from 10,000 people.

What is random sampling?
Random sampling is the process of choosing a representative sample from a target population and collecting data from that sample in order to understand something about the population as a whole.

Simple random sampling is the basic sampling technique where we select a group of subjects (a sample) for study from a larger group (a population).

A simple random sample is meant to be an unbiased representation of a group. It is considered a fair way to select a sample from a larger population since every member of the population has an equal chance of being selected.
An example of a simple random sample would be the names of 25 employees being chosen from a company of 250 employees.

Stratified random sampling divides the population into sub-populations (groups, or strata) among which we expect the measurement of interest to vary.

The word strata is the plural form of stratum, which means a subgroup. Two or more subgroups are called strata.

We use stratified random sampling for a population that has distinct groups of elements, because in stratified random sampling we divide the population into groups based on their classification.
For example, a farmer wishes to collect milk from each type of cow in his herd, which consists of 4 breeds. He could divide his cows into four subgroups, one per breed, and collect milk from each subgroup.

A multistage random sample is constructed by taking a series of simple random samples in stages. This type of sampling is often more practical than simple random sampling for studies requiring "on location" analysis, such as door-to-door surveys.

What is a parameter?
A parameter is a number that summarizes data for an entire population.
It is any numerical quantity that characterizes a given population or some aspect of it.
It tells something about the whole population.

What is a statistic?
A statistic is a number that summarizes data from a sample.
It is a single measure of some attribute of a sample.
It is any function (attribute) of a sample.

What is the difference between a parameter and a statistic?
A parameter is a characteristic of a population.
A statistic is a characteristic of a sample.
A population parameter is a summary measure that describes a characteristic of the whole population. It is usually denoted by Greek letters.
A sample statistic is a summary measure computed from a sample and used to estimate the corresponding characteristic of the whole population.

Sampling Distribution
Sampling can be done in two ways: with or without replacement.

In sampling with replacement, any datum chosen from the population to form the sample is returned to the population so that it has a chance of being chosen again. The size of the population remains the same in every selection of a sampling unit.
In sampling without replacement, once a sampling unit is chosen, it has no further chance of being chosen. In this case, the size of the population available for sampling is reduced as each sampling unit is chosen.

Random Sample
A random sample chosen without replacement is called a simple random sample.

If the population under investigation is large enough, then sampling without replacement may be approximated by sampling with replacement.
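The difference between sampling without and with replacement can be sketched in a few lines of Python. This is only an illustration: the employee names are made-up placeholders for the company of 250 employees mentioned in these notes, and the seed is fixed only so that the run can be repeated.

```python
import random

# Hypothetical population: placeholder names for 250 employees.
employees = [f"employee_{i:03d}" for i in range(1, 251)]

rng = random.Random(42)  # fixed seed so the run is repeatable

# Without replacement: once chosen, an employee cannot be chosen again.
# This is a simple random sample of 25 out of 250.
without_replacement = rng.sample(employees, k=25)

# With replacement: each chosen employee is "returned" to the population,
# so the same employee may appear more than once.
with_replacement = rng.choices(employees, k=25)

print(len(set(without_replacement)))  # always 25 distinct employees
print(len(set(with_replacement)))     # 25 or fewer distinct employees
```

Because 25 is small compared with 250, repeats in the second sample are uncommon, which is the sense in which sampling with replacement can approximate sampling without replacement for a large population.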
Sampling distribution of sample means
A sampling distribution of sample means is a frequency distribution using the means computed from all possible random samples of a specific size taken from a population.

The means vary from sample to sample.

Consider the data set {2, 4, 9, 10, 5}.

Let us list all possible samples of size 3 from this population (taken without replacement, so there are 10 such samples) and compute the mean of each sample.

Sample          Mean
{2, 4, 9}       5.00
{2, 4, 10}      5.33
{2, 4, 5}       3.67
{2, 9, 10}      7.00
{2, 9, 5}       5.33
{2, 10, 5}      5.67
{4, 9, 10}      7.67
{4, 9, 5}       6.00
{4, 10, 5}      6.33
{9, 10, 5}      8.00

Observe that the means vary from sample to sample. Thus, any mean based on a sample drawn from a population is expected to take different values for different samples.
So we can say that the sample mean is a random variable whose value depends on the particular sample.

The sampling distribution of a sample mean is the probability distribution of that mean: it gives the probability with which each possible value of the sample mean would appear.

A sampling distribution shows every possible result a statistic can take in every possible sample from a population and how often each result may happen.

What does the variance indicate?
The variance is a numerical value used to indicate how widely individuals in a group vary.
If individual observations vary greatly from the group mean, the variance is big; and vice versa.
The number of observations in the population is important in a sampling distribution because it is the basis of all our computations.

What are the fundamental theorems of probability?
The fundamental theorems of probability are the Law of Large Numbers and the Central Limit Theorem.

Law of Large Numbers
The law of large numbers states that the sample mean converges to the distribution mean as the sample size increases. It is one of the fundamental theorems of probability.

Central Limit Theorem
The central limit theorem states that the distribution of the sum (or average) of a large number of independent, identically distributed variables with finite variance will be approximately normal, regardless of the underlying distribution.

In other words, if random samples of a given size are drawn from a population, then as the sample size becomes larger, the sampling distribution of the mean approaches the normal distribution.

When should the central limit theorem be used?
The central limit theorem is best used with large sample sizes.
As a general rule, sample sizes equal to or greater than 30 are considered sufficient for the central limit theorem to hold.

Since it applies to large sample sizes, it justifies the use of normal curve methods for a wider range of problems.

It also assures us that no matter what the shape of the population distribution (normal or not), the sampling distribution of the sample means is approximately normal whenever the sample size is large.
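The behaviour described by the law of large numbers and the central limit theorem can be checked with a small simulation. The sketch below is only illustrative: the exponential population (mean 1, standard deviation 1), the sample sizes, the number of repetitions, and the helper names `sample_means` and `draw_exponential` are all assumptions chosen for the example, not part of these notes.

```python
import random
import statistics


def sample_means(draw, sample_size, num_samples, rng):
    """Return the means of `num_samples` random samples of size `sample_size`."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


def draw_exponential(rng):
    # One observation from an exponential population with rate 1:
    # mean 1, standard deviation 1, and a strongly skewed (non-normal) shape.
    return rng.expovariate(1.0)


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (2, 30, 200):
    means = sample_means(draw_exponential, n, 2000, rng)
    # Average of the sample means and their spread for this sample size.
    print(n, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

As the sample size grows, the average of the sample means should stay close to the population mean of 1, while their spread should shrink roughly like 1 divided by the square root of the sample size. A histogram of `means` for the larger sample sizes should look close to a normal curve even though the population itself is skewed.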

Z-Test
A z-test is a statistical test used to determine whether
two population means are different when the variances
are known and the sample size is large.
A z-test is best used when the sample size is greater than 30.
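A two-sample z-test of this kind could be computed as in the following sketch. It assumes the two population variances are known and that the samples are independent; the function name `two_sample_z_test` and the numbers in the usage lines are hypothetical, chosen only to illustrate the calculation.

```python
import math
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, var1, var2, n1, n2):
    """Return (z, two-sided p-value) for H0: the two population means are equal.

    var1 and var2 are the KNOWN population variances; n1 and n2 are the
    sample sizes.
    """
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical example: sample means 52 and 50, known variances 16 and 25,
# sample sizes 40 and 50.
z, p = two_sample_z_test(52, 50, 16, 25, 40, 50)
print(round(z, 3), round(p, 4))
```

A small p-value (for example, below a chosen significance level such as 0.05) suggests that the two population means differ.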

T-Test
A t-test is commonly used with small sample sizes to test the difference between sample means when the variances of the two normal distributions are not known.

A t-test is best used when the sample size is less than 30.

A t-test is appropriate whenever you want to compare the means of two groups.
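The test statistic for comparing two group means can be sketched as follows. This example uses Welch's form of the t-test, which estimates each variance from its own sample and does not assume the two variances are equal; that choice, the function name `welch_t_statistic`, and the sample data are assumptions made for illustration only.

```python
import math
import statistics


def welch_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for Welch's two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Sample variances (n - 1 in the denominator) estimate the unknown
    # population variances.
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    term_a, term_b = var_a / n_a, var_b / n_b
    t = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / math.sqrt(
        term_a + term_b
    )
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (term_a + term_b) ** 2 / (
        term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical small samples from two groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t, df = welch_t_statistic(group_a, group_b)
print(round(t, 3), round(df, 2))
```

The resulting t value is compared with a critical value from a t table at the computed degrees of freedom. Python's standard library does not include the t distribution, so the p-value is left out of this sketch.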
