
Inference for Categorical Data:

PROPORTIONS

Introduction to Confidence Intervals

GOAL: Estimate the true population parameter

↳ take samples
↳ all samples → sampling distribution approximately normal

$\mu_{\hat{p}} = p$, $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
We don't know p, but we can take a sample and estimate p with $\hat{p}$

About 95% of sample proportions $\hat{p}$ are within 2 SD of p

⇒ For about 95% of samples, p is within $2\sigma_{\hat{p}}$ of $\hat{p}$
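The 95% claim above can be checked with a small simulation. This is only a sketch: the function name `coverage_within_2sd`, the values p = 0.6 and n = 100, and the fixed seed are illustrative assumptions, not part of the notes.

```python
import random
from math import sqrt

def coverage_within_2sd(p, n, trials=2000, seed=0):
    """Fraction of simulated sample proportions that land within 2 SD of p."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    sd = sqrt(p * (1 - p) / n)         # sigma of the sampling distribution of p-hat
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its sample proportion.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        if abs(p_hat - p) <= 2 * sd:
            hits += 1
    return hits / trials

# Hypothetical values; the result should come out close to 0.95.
print(coverage_within_2sd(p=0.6, n=100))
```

Because the count of successes is discrete, the simulated fraction will sit near 0.95 rather than exactly on it.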

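$SE_{\hat{p}}$ (standard error): $SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

estimator for $\sigma_{\hat{p}}$ ($\hat{p}$ stands in for the unknown p)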
Confidence interval: $\hat{p} \pm z^* \cdot SE_{\hat{p}}$ (for proportions)

Margin of Error: $z^* \cdot SE_{\hat{p}}$

Confidence Intervals for Proportions

Estimate population parameter:

sample → sample proportion → confidence interval

Conditions / Requirements:

1) Random sample → center/mean of the sampling distribution is the true proportion

2) Roughly Normal → at least 10 successes and at least 10 failures

3) Independence condition → n ≤ 10% of the population

Critical value ($z^*$): number of standard deviations to go above and below for the sampling distribution

Confidence interval: $\hat{p} \pm z^* \cdot SE_{\hat{p}} = \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
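As a rough illustration of the interval formula and the Normal condition, here is a minimal Python sketch. The function name `one_prop_z_interval` and the counts 120 out of 200 are made-up assumptions. The Random and Independence conditions depend on how the data were collected, so the code cannot check them.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(successes, n, confidence=0.95):
    """One-sample z interval for a proportion: p-hat +/- z* * SE."""
    failures = n - successes
    # Normal condition from the notes: at least 10 successes and 10 failures.
    if successes < 10 or failures < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    p_hat = successes / n
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
    moe = z_star * se                    # margin of error
    return p_hat - moe, p_hat + moe

# Hypothetical sample: 120 successes in 200 trials.
low, high = one_prop_z_interval(successes=120, n=200)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```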
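Minimum Sample Size:

Given maximum MOE and confidence level, find the minimum n to guarantee the max MOE

aka: how large a sample do we need to ensure a given margin of error?

We don't know $\hat{p}$ before sampling:

$z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le MOE_{max}$

maximize SE ⇔ maximize $\hat{p}(1-\hat{p})$

∴ use $\hat{p} = 0.5$ (maximizes standard error, which maximizes MOE)

$n \ge \left(\frac{z^*}{2 \cdot MOE_{max}}\right)^2$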
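The sample-size bound can be turned into a small calculator. This is a sketch under the conservative choice p-hat = 0.5 described above. The name `min_sample_size` and its `p_guess` parameter are hypothetical helpers introduced for illustration.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(max_moe, confidence=0.95, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= max_moe.

    p_guess = 0.5 is the worst case: it maximizes p(1-p), so the
    resulting n works whatever p-hat turns out to be.
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z_star ** 2 * p_guess * (1 - p_guess) / max_moe ** 2
    return ceil(n)   # always round UP so the MOE bound is guaranteed

# Hypothetical request: MOE of at most 3 percentage points at 95% confidence.
print(min_sample_size(0.03))
```

With `p_guess=0.5` the expression reduces to (z* / (2 · MOE))², matching the formula in the notes.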
The Idea of Significance Tests
Start with a hypothesis (H) and want to check if it is true
↳ collect samples and compare the outcome with the probability of that outcome assuming H is true
↳ compare the probability with a threshold (α) to determine the plausibility of H

H0: Null Hypothesis, "as expected", "no difference" (typically contains "=")

Ha: Alternative Hypothesis, "new news"

Significance Test:

1) Create H0, Ha

2) Choose a significance level (α)

3) Take a sample and compute the sample statistics (ensure conditions are met)

4) Calculate the p-value

p-value: P(sample statistics at least as extreme as those observed | H0 is true)

The probability of getting sample statistics at least this extreme, given H0 is true

If we show the sampling distribution is roughly normal, we can find the p-value using the normal CDF

5) If p-value < α: Reject H0 → evidence for Ha

If p-value ≥ α: Do not reject H0

Conditions for z-test:

1) Random

2) Normal ($np \ge 10$ and $n(1-p) \ge 10$)

3) Independence (10% Rule)
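The three conditions can be written as a quick checklist. This is a sketch only: `z_test_conditions` is a hypothetical helper, the Random condition has to be supplied by the reader (code cannot verify it), and treating an unknown population size as "large enough" is an assumption.

```python
def z_test_conditions(n, p, population_size=None, random_sample=True):
    """Return which of the three z-test conditions hold.

    random_sample is taken on trust from the problem statement.
    If population_size is None, the population is assumed to be
    large enough for the 10% rule (an assumption, not a check).
    """
    return {
        "random": random_sample,
        "normal": n * p >= 10 and n * (1 - p) >= 10,
        "independent": population_size is None or n <= 0.10 * population_size,
    }

# Hypothetical study: n = 100, hypothesized p = 0.5, population of 5000.
print(z_test_conditions(100, 0.5, population_size=5000))
```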

3) Independence ( lot . Rule)


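Carrying out a test for a population proportion
Given H0: p = p0, Ha: p ≠ p0, and a sample proportion $\hat{p}$ from a sample of size n, find the p-value:

1) Find z:

$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$

2) Find the area under the normal curve: this gives the p-value

OR

skip step one and change the normal CDF to use $\mu = p_0$, $\sigma = \sigma_{\hat{p}} = \sqrt{\frac{p_0(1-p_0)}{n}}$

Ha:
p > p0: low = $\hat{p}$, up = ∞
p < p0: low = −∞, up = $\hat{p}$
p ≠ p0: depends on the relative position of $\hat{p}$ and p0 (take the tail beyond $\hat{p}$, then just times 2)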
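Both routes to the p-value (standardize first, or feed μ and σ straight into the normal CDF) can be sketched in Python. The function names and the `alternative` labels "greater", "less" and "two-sided" are illustrative assumptions. `statistics.NormalDist` plays the role of the calculator's normal CDF.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(p_hat, n, p0, alternative="two-sided"):
    """Route 1: compute z, then take the area under the standard normal."""
    sigma = sqrt(p0 * (1 - p0) / n)      # SD of p-hat assuming H0 is true
    z = (p_hat - p0) / sigma
    std = NormalDist()                   # mean 0, SD 1
    if alternative == "greater":         # Ha: p > p0, area from z up to infinity
        p_value = 1 - std.cdf(z)
    elif alternative == "less":          # Ha: p < p0, area from -infinity up to z
        p_value = std.cdf(z)
    elif alternative == "two-sided":     # Ha: p != p0, one tail times 2
        p_value = 2 * (1 - std.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p_value

def upper_tail_without_z(p_hat, n, p0):
    """Route 2: skip z and use a normal curve with mu = p0, sigma = sigma_p-hat."""
    sampling = NormalDist(mu=p0, sigma=sqrt(p0 * (1 - p0) / n))
    return 1 - sampling.cdf(p_hat)       # low = p-hat, up = infinity

# Hypothetical data: p-hat = 0.6 from n = 100, testing H0: p = 0.5.
z, p_value = one_prop_z_test(0.6, 100, 0.5, alternative="greater")
print(round(z, 3), round(p_value, 4))
```

The two routes give the same upper-tail area, which is why the notes treat them as interchangeable.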
Concluding a Test for a Population Proportion
If p-value < α, reject H0, suggest Ha

If p-value ≥ α, fail to reject H0

To Answer FRQ:

1) STATE / DEFINE:

H0: ?    p in context    $\hat{p}$ = ?

Ha: ?    α = ?    n = ?

2) CONFIRM CONDITIONS:

Random (← say "stated in problem that random"), Normal, Independent

Name procedure: CI or 1-sample z-test for p

3) CALCULATE:

General Formula:

$\mu_{\hat{p}} = p_0$, $\sigma_{\hat{p}} = \sqrt{\frac{p_0(1-p_0)}{n}}$

$z_{\hat{p}} = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}}$

p-value = P($\hat{p}$ at least as extreme as observed | H0)

= P($Z \ge z_{\hat{p}}$) or P($Z \le z_{\hat{p}}$), matching the direction of Ha (times 2 for ≠)

4) CONCLUDE:

compare p-value to α, then make the conclusion IN CONTEXT
Potential Errors When Performing Tests

What if we incorrectly reject / fail to reject H0? → we get ERRORS

                      Reality
                      H0 TRUE         H0 FALSE
Reject H0             Type I Error    Correct
Fail to reject H0     Correct         Type II Error

Type I Error: H0 is true BUT we reject it ← P(Type I Error) = α

Type II Error: H0 is not true BUT we fail to reject it
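The statement P(Type I Error) = α can be illustrated by simulating many samples for which H0 really is true and counting how often a two-sided z-test rejects anyway. This is a sketch with assumed values (p0 = 0.5, n = 200, fixed seed), and `type_1_error_rate` is a hypothetical name.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_1_error_rate(p0, n, alpha=0.05, trials=2000, seed=1):
    """Share of samples drawn with H0 TRUE in which the test rejects H0."""
    rng = random.Random(seed)
    sigma = sqrt(p0 * (1 - p0) / n)
    std = NormalDist()
    rejections = 0
    for _ in range(trials):
        # The data are generated with the true p equal to p0, so H0 holds.
        p_hat = sum(rng.random() < p0 for _ in range(n)) / n
        z = (p_hat - p0) / sigma
        p_value = 2 * (1 - std.cdf(abs(z)))   # two-sided p-value
        if p_value < alpha:
            rejections += 1                   # a Type I error
    return rejections / trials

# Should come out near alpha = 0.05 (not exact: counts are discrete).
print(type_1_error_rate(p0=0.5, n=200))
```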

Power in Significance Tests

Power: P(rejecting H0 | H0 false)

↳ 1 − P(not rejecting H0 | H0 false)

↳ P(not making Type II Error)

H0: μ = μ1, Ha: μ ≠ μ1

[Figure: two sampling distributions, one centered at μ1 (H0 true) and one centered at μ = μ2 (H0 false), with the Type II Error and power regions marked]

α ↑ → power ↑ (but P(Type I Error) ↑)
n ↑ → less variability → power ↑
true parameter far from the H0 mean → power ↑
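The three ways to raise power can be checked numerically. The notes state the hypotheses in terms of μ. The sketch below instead assumes an upper-tailed test on a proportion (H0: p = p0 vs Ha: p > p0) and uses the normal approximation throughout. The name `power_one_prop` and all numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def power_one_prop(p0, p_true, n, alpha=0.05):
    """Approximate P(reject H0 | true proportion is p_true) for Ha: p > p0."""
    # Rejection cutoff for p-hat, computed under H0.
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    cutoff = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
    # Sampling distribution of p-hat when H0 is false and p = p_true.
    actual = NormalDist(mu=p_true, sigma=sqrt(p_true * (1 - p_true) / n))
    return 1 - actual.cdf(cutoff)   # area beyond the cutoff = power

base = power_one_prop(0.5, 0.6, n=100)
print(f"baseline:        {base:.3f}")
print(f"larger alpha:    {power_one_prop(0.5, 0.6, n=100, alpha=0.10):.3f}")
print(f"larger n:        {power_one_prop(0.5, 0.6, n=200):.3f}")
print(f"farther p_true:  {power_one_prop(0.5, 0.7, n=100):.3f}")
```

Each of the last three lines should print a larger value than the baseline, mirroring the three arrows in the notes.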
Confidence Interval for the Difference of two Proportions

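similar to 1-sample z-interval

1) Conditions for Inference

↳ BOTH samples: Random, Normal, Independent

2) Confidence level → $z^*$

3) Confidence interval for $p_1 - p_2$

↳ $(\hat{p}_1 - \hat{p}_2) \pm z^* \cdot \sigma_{\hat{p}_1 - \hat{p}_2}$

$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\sigma^2_{\hat{p}_1} + \sigma^2_{\hat{p}_2}} = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
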
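A minimal sketch of the two-sample interval, assuming the inputs are success counts `x1`, `x2` and sample sizes `n1`, `n2`. These names and the example counts are made up for illustration. The Normal condition is checked for both samples. Random and Independence are left to the reader.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """z interval for p1 - p2 using the unpooled standard error."""
    # Normal condition for BOTH samples: >= 10 successes and >= 10 failures.
    for x, n in ((x1, n1), (x2, n2)):
        if x < 10 or n - x < 10:
            raise ValueError("each sample needs at least 10 successes and 10 failures")
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Variances of independent sample proportions add.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Hypothetical samples: 60/100 versus 50/100.
low, high = two_prop_z_interval(60, 100, 50, 100)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

If the interval contains 0, the data are consistent with no difference between the two proportions at that confidence level.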
Testing for the Difference of two Population Proportions

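Hypothesis Testing:

1) Construct H0 and Ha:

H0: Typically assume no difference
OR: $p_1 - p_2 = 0$

Ha: Typically assume a difference
OR: $p_1 - p_2 < 0$, $\ne 0$, or $> 0$

! Remember to multiply the p-value by 2 for both directions (≠)

Predefine α

2) Test conditions + name procedure

Random, Normal, Independent → 2-sample z-test for p

3) Find p-value:

$\sigma^2_{\hat{p}_1 - \hat{p}_2} = \sigma^2_{\hat{p}_1} + \sigma^2_{\hat{p}_2} = \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}$

Because we assume $p_1 = p_2$, we COMBINE proportions:

$\hat{p}_c = \frac{\hat{p}_1 n_1 + \hat{p}_2 n_2}{n_1 + n_2}$, $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_c(1-\hat{p}_c)}{n_1} + \frac{\hat{p}_c(1-\hat{p}_c)}{n_2}}$

$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sigma_{\hat{p}_1 - \hat{p}_2}}$ → p-value (normal CDF)

4) Interpret p-value in context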
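Step 3 can be sketched as follows, with the combined (pooled) proportion used in the standard error because H0 assumes p1 = p2. The function name, the `alternative` labels, and the example counts are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """2-sample z-test for H0: p1 - p2 = 0 using the combined proportion."""
    p1, p2 = x1 / n1, x2 / n2
    # Combined proportion: all successes over all trials,
    # the same as (p1*n1 + p2*n2) / (n1 + n2).
    p_c = (x1 + x2) / (n1 + n2)
    se = sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
    z = ((p1 - p2) - 0) / se
    std = NormalDist()
    if alternative == "greater":        # Ha: p1 - p2 > 0
        p_value = 1 - std.cdf(z)
    elif alternative == "less":         # Ha: p1 - p2 < 0
        p_value = std.cdf(z)
    elif alternative == "two-sided":    # Ha: p1 - p2 != 0, one tail times 2
        p_value = 2 * (1 - std.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p_value

# Hypothetical samples: 60/100 versus 50/100, two-sided alternative.
z, p_value = two_prop_z_test(60, 100, 50, 100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The p-value is then compared with the predefined α and interpreted in the context of the problem.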
