
Introduction to Statistics

Chapter 8: Estimation (statistical inference)


Goal: to estimate a population parameter, specifically the population mean, μ.

Estimation is the statistical procedure which uses sample information to estimate the value of a
population parameter such as the population mean, population standard deviation or population
proportion.

Estimation
Two key terms: confidence interval (CI) and confidence level

Confidence level: the proportion of CIs that actually contain the parameter (μ for us) if we
take a large number of samples of size n and calculate a CI for each sample.

Typical Confidence Levels: 90%, 95%, 99%

Significance Level: the probability that our CI does not contain the true value of the population
parameter (if we repeatedly take different samples and calculate a CI)

Ex:
Let's say we take a sample from a population where we do not know μ, but we know σ² = 9.
We want a 95% confidence interval for μ. Suppose we take 1000 samples of size n = 36 and
calculate a CI for each sample, giving 1000 CIs.
95% = 0.95, so we expect about 0.95(1000) = 950 CIs to contain μ.
100% − 95% = 5% = 0.05, so we expect about 0.05(1000) = 50 CIs not to contain μ.
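
The 1000-sample thought experiment can be tried out with a short simulation. This is only a sketch under stated assumptions: the "true" mean MU = 50 is a made-up value (in practice μ is unknown), the population is assumed normal, and each interval is built with the standard known-σ form, sample mean ± z·σ/√n, which these notes do not write out explicitly.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so the run is repeatable

MU = 50.0       # hypothetical "true" mean; unknown in a real problem
SIGMA = 3.0     # known, because sigma squared = 9
N = 36          # size of each sample
TRIALS = 1000   # number of samples, one CI per sample
LEVEL = 0.95    # confidence level

# Critical value z: the area between -z and z under the standard normal curve is LEVEL
z = NormalDist().inv_cdf((1 + LEVEL) / 2)
half_width = z * SIGMA / math.sqrt(N)

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = fmean(sample)
    # Does this sample's interval contain the true mean?
    if x_bar - half_width <= MU <= x_bar + half_width:
        hits += 1

print(f"{hits} of {TRIALS} intervals contain mu")
```

The count will usually land near 950 rather than exactly on it, which is why the example above says "about".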

Confidence Interval

A confidence interval for a population mean, μ, is an interval expected to capture the true
population mean a certain percentage of the time. This percentage is called the Confidence
Level. The confidence levels we will discuss are 90%, 95%, and 99%.

Assumptions
1) We have a simple random sample of size n from a population of x values.
2) The value of σ is known.
3) If the population of X is normally distributed then we can use any value of n.
4) If the distribution of X is unknown or is not normal, then we require a sample
size n≥30.

Note: If the distribution of X is highly skewed and not mound shaped, a sample of n≥50 or even
100 may be required.
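
When these assumptions hold, the interval can be computed directly. The sketch below assumes the standard known-σ form, sample mean ± z·σ/√n; the function name `z_interval` and the sample mean of 20 are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, level):
    """Confidence interval for mu when sigma is known (z-based)."""
    # Critical value: area between -z and z under the standard normal curve is `level`
    z = NormalDist().inv_cdf((1 + level) / 2)
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical sample: mean 20 from n = 36 observations, with sigma = 3 known
low, high = z_interval(x_bar=20.0, sigma=3.0, n=36, level=0.95)
print(f"95% CI for mu: ({low:.2f}, {high:.2f})")
```

Raising the confidence level to 99% widens the interval, while a larger n narrows it.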

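Margin of error: the size of the difference between the sample mean and μ. Since we do not
know μ, we can never know the exact value of the margin of error; we can only bound it by a
maximum error E at a given confidence level.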

Critical value: for a CI with confidence level c, the critical value z is the number such that the
area under the standard normal curve between -z and z equals c.
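
As a quick check of this definition, the critical values for the typical confidence levels can be computed with Python's standard library. The helper name `critical_value` is an assumption made for this sketch. Because the curve is symmetric, the area to the left of z must be (1 + c)/2.

```python
from statistics import NormalDist

def critical_value(level):
    """z such that the area between -z and z under the standard normal curve is `level`."""
    return NormalDist().inv_cdf((1 + level) / 2)

for level in (0.90, 0.95, 0.99):
    z = critical_value(level)
    # Recompute the area between -z and z to confirm it matches the confidence level
    area = NormalDist().cdf(z) - NormalDist().cdf(-z)
    print(f"c = {level:.2f}: z = {z:.3f}, area between -z and z = {area:.2f}")
```

Rounded to three decimals, these are the familiar table values 1.645, 1.960 and 2.576.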

Sample Size

The goal is to determine the minimum sample size n needed to achieve
1) A confidence level of c
2) A maximum error of E
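
One common way to meet both targets when σ is known is to solve E = z·σ/√n for n and round up. The notes do not state this formula, so treat the sketch below as an assumption; the function name `min_sample_size` and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def min_sample_size(sigma, max_error, level):
    """Smallest n for which z * sigma / sqrt(n) is at most max_error."""
    z = NormalDist().inv_cdf((1 + level) / 2)
    # Always round up: rounding down would leave the error slightly above max_error
    return math.ceil((z * sigma / max_error) ** 2)

# With sigma = 3, a 95% confidence level and a maximum error of 1
n_needed = min_sample_size(sigma=3.0, max_error=1.0, level=0.95)
print(n_needed)
```

Halving the maximum error roughly quadruples the required sample size, because n grows with the square of 1/E.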

Degrees of Freedom of a statistic are the number of free choices used in computing the statistic.
The degrees of freedom, denoted by df, for a sample of size n are one less than the sample
size: df = n − 1

Student’s t-distribution
The t-distribution, also known as Student's t-distribution, is a bell-shaped probability
distribution used in place of the standard normal distribution when σ is unknown and is
estimated from the sample. Most of its probability lies close to the mean, with less in the tails.

Distributional Requirements
One or more of the following requirements must be met to use a t-distribution:
1) X has a normal distribution and/or
2) n≥30.

Why?
The t-distribution is only valid if we can assume the distribution of the sample mean x̄ is
normal. This is true in either of the following cases:
1) If n<30, then we cannot use the CLT.
Thus for x̄ to be normally distributed, X has to be normally distributed.
2) If n≥30, then by the CLT, x̄ is approximately normally distributed.

Properties of the t Distribution


o The t distribution is bell-shaped and symmetric about t = 0
o The t distribution is more spread out (thicker tails) and flatter than the standard normal distribution
o There is a different t distribution for each sample size. A particular t distribution is
specified by giving the degrees of freedom, df = n−1.
o As the number of degrees of freedom increases, the t distribution approaches the standard
normal distribution. They are sufficiently close when the degrees of freedom are greater
than 30.
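
These properties can be checked numerically. The sketch below evaluates the standard density formula for the t distribution, which is not given in these notes, and compares it with the standard normal density. The function name `t_pdf` is an assumption made for this example.

```python
import math
from statistics import NormalDist

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # Work with logarithms so large df values do not overflow the gamma function
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))

std_normal = NormalDist()

# A lower peak at t = 0 and a higher density at t = 3 mean flatter with thicker tails
for df in (2, 10, 30, 1000):
    print(f"df = {df:4d}: peak = {t_pdf(0, df):.4f}, density at 3 = {t_pdf(3, df):.4f}")
print(f"normal   : peak = {std_normal.pdf(0):.4f}, density at 3 = {std_normal.pdf(3):.4f}")
```

As df grows, both columns move toward the normal values, which matches the last property in the list.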
