
For a random variable X with possible values x1, ..., xn
and associated probabilities p1, ..., pn,
E(X) = µ = ∑ xi pi  (sum over i = 1, ..., n)

If a discrete random variable has k possible values
x1, ..., xk with probabilities p1, ..., pk, then
Var(X) = σ² = ∑ (xi − µ)² pi,   σ = √Var(X)
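
A minimal Python sketch of these two formulas; the values and probabilities below are made-up examples, not data from the notes:

    import math

    # Hypothetical discrete distribution: values and their probabilities (must sum to 1)
    x = [0, 1, 2, 3]
    p = [0.1, 0.3, 0.4, 0.2]

    mu = sum(xi * pi for xi, pi in zip(x, p))                 # E(X) = sum of x_i * p_i
    var = sum((xi - mu) ** 2 * pi for xi, pi in zip(x, p))    # Var(X) = sum of (x_i - mu)^2 * p_i
    sigma = math.sqrt(var)                                    # standard deviation

    print(mu, var, sigma)   # approximately 1.7, 0.81, 0.9
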
Bernoulli Distribution

Arises from considering the result of a single trial of an
experiment. Can take only two possible values: 1 (success)
and 0 (failure); p: probability of success =>
P(X = 1) = p,  P(X = 0) = 1 − p,  0 ≤ p ≤ 1
E(X) = µ = p;  Var(X) = σ² = p(1 − p);  σ = √Var(X)

Binomial Distribution

Consider an experiment for which there are only two
possible results: success if some event occurs, or failure.
The probability of success is p and of failure is 1 − p.
Expected number of successes: µ = np;  σ = √(np(1 − p))
P(X = k) is BINOM.DIST(k, n, p, FALSE)
P(X ≤ k) is BINOM.DIST(k, n, p, TRUE)
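
As a sketch, scipy.stats.binom plays the same role as the two BINOM.DIST calls; the n, p and k values below are arbitrary examples:

    from scipy.stats import binom

    n, p, k = 20, 0.3, 5                     # example parameters, not from the notes

    print(binom.pmf(k, n, p))                # P(X = k), same role as BINOM.DIST(k, n, p, FALSE)
    print(binom.cdf(k, n, p))                # P(X <= k), same role as BINOM.DIST(k, n, p, TRUE)
    print(n * p, (n * p * (1 - p)) ** 0.5)   # mean np and sd sqrt(np(1 - p))
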
Normal distribution

The graph of this function is called the normal curve.
We write X ∼ N(µ, σ²).
Every X-distribution can be transformed into the
standard normal distribution: z = (x − µ)/σ has µ = 0
and σ = 1; write Z ∼ N(0, 1²).
To evaluate a probability, use: NORM.DIST(x, µ, σ, TRUE)
To evaluate an inverse, use: NORM.INV(p, µ, σ)
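
Likewise, scipy.stats.norm can stand in for NORM.DIST and NORM.INV; µ, σ and x below are arbitrary example numbers:

    from scipy.stats import norm

    mu, sigma = 100, 15                  # example parameters

    print(norm.cdf(110, mu, sigma))      # P(X <= 110), like NORM.DIST(110, mu, sigma, TRUE)
    print(norm.ppf(0.95, mu, sigma))     # 95th percentile, like NORM.INV(0.95, mu, sigma)
    print((110 - mu) / sigma)            # z-score of x = 110
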
Sampling distribution

For n independent observations of a random variable
X, the sum of the observations is the random variable
Sn = X1 + · · · + Xn.
Mean: nµ   Standard deviation: σ√n

Distributions of Sample Means

For n independent observations of a random variable
X, the mean of the observations is the random variable
X̄n = (X1 + … + Xn)/n
Mean: µ   Standard deviation: σ/√n

The Central Limit Theorem

Suppose X is a random variable which is not
necessarily normally distributed. X has mean µ and
standard deviation σ. For sufficiently large n (usually
n ≥ 30), the distribution of the sample means X̄n of size
n is approximately normal with mean µ and standard
deviation σ/√n. The larger the value of n, the better
the approximation will be. This also applies to Sn.

If X itself is normally distributed, then X̄n and Sn will
be normally distributed for any value of n.
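
A short simulation sketch of the CLT using numpy: sample means from a clearly non-normal exponential population look roughly normal once n is around 30. The population, n and the number of replications are arbitrary choices for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    n, reps = 30, 10_000                        # sample size and number of simulated samples

    samples = rng.exponential(scale=1.0, size=(reps, n))   # Exponential(1): mu = 1, sigma = 1
    xbar = samples.mean(axis=1)                 # one sample mean per simulated sample

    print(xbar.mean())                          # close to the population mean 1
    print(xbar.std(ddof=1))                     # close to sigma / sqrt(n) = 1 / sqrt(30), about 0.183
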
Sample proportions

p̂ = X/n,  X: number of successes in n trials,  p: probability of success
E(p̂) = (1/n) E(X) = p;   σ(p̂) = (1/n) σ(X) = √(p(1 − p)/n)

Let X be the Bernoulli random variable such that X = 1
if the selected element has the given characteristic, and
X = 0 otherwise. If we take n observations of X, then the
sample mean X̄n = (X1 + … + Xn)/n is actually the sample
proportion p̂. Hence we can apply the CLT and state that:
For sufficiently large n, p̂ is approximately normally
distributed with mean p and standard deviation
√(p(1 − p)/n). As a general rule, p̂ is approximately
normally distributed if np ≥ 5 and n(1 − p) ≥ 5.
Confidence intervals for Means

Consider the distribution Z = (X̄n − µ)/(σ/√n), with Z ∼ N(0, 1²).
For a 95% CI the critical value is 1.96.
The 95% CI for µ given the data set with sample mean x̄ is
x̄ − 1.96 σ/√n ≤ µ ≤ x̄ + 1.96 σ/√n

If we want a CI that contains µ with probability 1 − α,
we find a value a such that
P(−a ≤ Z ≤ a) = 1 − α,
or equivalently P(Z ≤ −a) = α/2; this value is labeled zα/2.
The interval contains µ with probability 1 − α, so the
confidence level of the interval is (1 − α) × 100%.

The margin of error is zα/2 σ/√n. The width of the CI
is 2 × zα/2 σ/√n.
The sample mean is calculated with =AVERAGE(input range).
The margin of error is calculated with =CONFIDENCE.NORM(α, σ, n).
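
The same z-based interval in Python, mirroring the margin of error that CONFIDENCE.NORM returns; the sample summary below is invented for illustration:

    from scipy.stats import norm

    xbar, sigma, n = 52.0, 6.0, 49        # example sample mean, known population sd, sample size
    alpha = 0.05

    z = norm.ppf(1 - alpha / 2)           # z_{alpha/2}, about 1.96 for a 95% CI
    margin = z * sigma / n ** 0.5         # margin of error, the role of CONFIDENCE.NORM(alpha, sigma, n)

    print(margin)                         # about 1.68
    print(xbar - margin, xbar + margin)   # the 95% CI for mu
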
σ unknown: x̄ ± tα/2 s/√n,
where s is the sample standard deviation, (1 − α) is the
confidence coefficient, and tα/2 is the t value
determined by the t distribution with n − 1 degrees of
freedom.
In most applications, a sample size n ≥ 30 is adequate
to use this expression.
tα/2 can be calculated with the T.INV function.
The margin of error is calculated with the CONFIDENCE.T function.
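
A sketch of the σ-unknown case, using scipy.stats.t.ppf in the role of T.INV; the data list is hypothetical:

    import statistics
    from scipy.stats import t

    data = [12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9]   # hypothetical sample
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)              # sample standard deviation (n - 1 in the denominator)
    alpha = 0.05

    t_crit = t.ppf(1 - alpha / 2, n - 1)    # t_{alpha/2} with n - 1 degrees of freedom
    margin = t_crit * s / n ** 0.5          # margin of error, the role of CONFIDENCE.T

    print(xbar - margin, xbar + margin)     # the 95% CI for mu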

For a given confidence level, we can decrease the
width of the confidence interval by increasing the
sample size.

When we wish to estimate the population mean to lie,
with 95% confidence, within an interval of width w, the
sample size required is
n = (2 × 1.96 σ / w)²,  always rounded up.
σ: population standard deviation
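
The sample-size formula as a short calculation, with an assumed σ and target width:

    import math

    sigma, w = 8.0, 2.0                            # example population sd and desired CI width

    n = math.ceil((2 * 1.96 * sigma / w) ** 2)     # always round up to the next whole observation
    print(n)                                       # 246
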
Confidence intervals for proportions

For a sample of size n with sample proportion p̂, the
95% CI for p is p̂ ± 1.96 √(p̂(1 − p̂)/n).

When we estimate a proportion it is desirable to make
the width of the confidence interval as small as
possible. We can make the width smaller by selecting
a larger sample size n.
For a 95% confidence interval of width w and
preliminary proportion p̂ = p*, the sample size needed
is n = (2 × 1.96 / w)² p*(1 − p*).
If we have no preliminary value for p̂, then we assume
the worst-case scenario and choose the value p*
which maximizes the value of p*(1 − p*). This value is p* = 0.5.
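
A combined sketch of the proportion interval and the planning formula; the counts x and n and the target width w are example inputs:

    import math

    x, n = 180, 400                              # example: 180 successes in 400 trials
    phat = x / n
    margin = 1.96 * math.sqrt(phat * (1 - phat) / n)
    print(phat - margin, phat + margin)          # 95% CI for p

    w = 0.05                                     # desired CI width
    p_star = 0.5                                 # worst-case planning value when no preliminary estimate exists
    n_needed = math.ceil((2 * 1.96 / w) ** 2 * p_star * (1 - p_star))
    print(n_needed)                              # 1537
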
Developing Null and Alternative Hypotheses

Null hypothesis H0: the assumption to be challenged.
Alternative hypothesis Ha: the research hypothesis.
H0: µ ≥ µ0,  Ha: µ < µ0   (lower tail test)
H0: µ ≤ µ0,  Ha: µ > µ0   (upper tail test)
H0: µ = µ0,  Ha: µ ≠ µ0   (two tail test)

Let x̄ be the sample mean of a sample of size n. Then σx̄ = σ/√n.
Test statistic: z = (x̄ − µ0)/(σ/√n)

p-value for a test about the population mean:
Lower tail: p = P(Z ≤ z) = NORM.DIST(z, 0, 1, TRUE)
Upper tail: p = P(Z ≥ z) = 1 − NORM.DIST(z, 0, 1, TRUE)
Two tail: if the test statistic z = (x̄ − µ0)/(σ/√n) < 0, then p-value = 2 P(Z ≤ z);
if z > 0, then p-value = 2 P(Z ≥ z)
Reject H0 if p-value ≤ α. If p-value > α, do not reject H0.

If the CI at the 1 − α confidence level is (a, b), then a < µ < b with that confidence, and:
Two tail: if µ0 does not belong to the interval, reject.
Lower tail: if a < µ < b < µ0, reject.
Upper tail: if µ0 < a < µ < b, reject.

σ unknown: t = (x̄ − µ0)/(s/√n);
the p-value uses the t distribution: P(T ≤ t) = T.DIST(t, n − 1, TRUE).

Test statistic about a population proportion:
z = (p̂ − p0)/√(p0(1 − p0)/n)
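
A minimal sketch of the z test and two-tail p-value above, plus the proportion test statistic; the sample summaries and hypothesised values are made-up numbers:

    from scipy.stats import norm

    xbar, mu0, sigma, n = 51.2, 50.0, 4.0, 64       # example data summary and hypothesised mean
    alpha = 0.05

    z = (xbar - mu0) / (sigma / n ** 0.5)           # test statistic
    if z < 0:
        p_value = 2 * norm.cdf(z)                   # two-tail p-value when z < 0
    else:
        p_value = 2 * (1 - norm.cdf(z))             # two-tail p-value when z > 0

    print(z, p_value)                               # about 2.4 and 0.016
    print("reject H0" if p_value <= alpha else "do not reject H0")

    # Test statistic about a population proportion (example p-hat, p0, n)
    phat, p0, n2 = 0.56, 0.5, 300
    z_prop = (phat - p0) / ((p0 * (1 - p0) / n2) ** 0.5)
    print(z_prop)                                   # about 2.08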
