
statistics I

mm3: parameter estimation, part II

petar popovski
assistant professor
antennas, propagation and radio networking (APNET)
department of electronic systems
aalborg university

e-mail: petarp@es.aau.dk
lecture outline

• refresh on confidence intervals

• estimating the difference in means for two normal populations

• confidence intervals for Bernoulli variables

• other (not ML) estimator types

refresh on confidence intervals (1)

• let X_1, X_2, \ldots, X_n be a sample from a normal distribution with unknown mean \mu and known variance \sigma^2

• we calculate

    \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}

  and we want to know the interval around \bar{X} in which the real mean value \mu lies with confidence 100(1-\alpha)\%

    \left( \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right)
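a minimal MATLAB sketch of this interval (norminv needs the Statistics Toolbox); the sample values and sigma are hypothetical:

    % 100(1-alpha)% interval for mu when sigma is known
    x = [5.1 4.8 5.3 5.0 4.9 5.2];      % hypothetical sample
    sigma = 0.2;                         % known standard deviation (assumed)
    alpha = 0.05;
    n = numel(x);
    z = norminv(1 - alpha/2, 0, 1);      % z_{alpha/2}
    ci = [mean(x) - z*sigma/sqrt(n), mean(x) + z*sigma/sqrt(n)]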

refresh on confidence intervals (2)

    P(Z \le -z_{\alpha/2}) = P(Z > z_{\alpha/2}) = \frac{\alpha}{2}

    P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha

• note that P(Z \le z_{\alpha/2}) = 1 - P(Z > z_{\alpha/2}) = 1 - \frac{\alpha}{2}

• how to determine z_{\alpha/2}
– by looking up in table A1 in the appendix the value that gives probability 1 - \frac{\alpha}{2}
– in Excel or MATLAB by using the function norminv(1 - \alpha/2, 0, 1)

refresh on confidence intervals (3)

• if the variance is not known, then we use the t-distribution

    \left( \bar{X} - t_{\alpha/2,n-1}\frac{S}{\sqrt{n}}, \; \bar{X} + t_{\alpha/2,n-1}\frac{S}{\sqrt{n}} \right), \qquad S^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}

• interpretation: the probability that a t-distributed random variable with n-1 degrees of freedom is larger than the value t_{\alpha/2,n-1} equals \frac{\alpha}{2}

• to determine t_{\alpha/2,n-1}
– use table A3 in the appendix
– in MATLAB by using the command tinv(1 - \alpha/2, n-1)
– in Excel by using the command tinv(\alpha, n-1)
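a matching MATLAB sketch for the unknown-variance case (sample values again hypothetical):

    % 100(1-alpha)% t-interval for mu when sigma is unknown
    x = [5.1 4.8 5.3 5.0 4.9 5.2];       % hypothetical sample
    alpha = 0.10;
    n = numel(x);
    t = tinv(1 - alpha/2, n - 1);        % t_{alpha/2, n-1}
    S = std(x);                          % sample standard deviation (n-1 in the denominator)
    ci = [mean(x) - t*S/sqrt(n), mean(x) + t*S/sqrt(n)]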

difference in means for two normal populations - example

solution to the example 7.4a

• we have to realize that the difference of the mean values is again a normally distributed random variable

• numerical solution

    \bar{X} - \bar{Y} = -13.05, \qquad z_{\alpha/2}\sqrt{\frac{\sigma_A^2}{14} + \frac{\sigma_B^2}{14}} = 1.96 \cdot 3.345 = 6.56

  giving the interval (-19.61, -6.49)
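the arithmetic of this solution, reproduced in MATLAB with the numbers from the slide:

    % two-sample z-interval for mu_A - mu_B with known variances
    d  = -13.05;                   % observed xbar - ybar
    se = 3.345;                    % sqrt(sigmaA^2/14 + sigmaB^2/14), given above
    z  = norminv(0.975, 0, 1);     % 1.96
    ci = [d - z*se, d + z*se]      % -> (-19.61, -6.49)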

example with unknown variances (1)

example with unknown variances (2)

• we have to work with the sample variances

    S_1^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}, \qquad S_2^2 = \frac{\sum_{i=1}^{m}(Y_i - \bar{Y})^2}{m-1}

• we have assumed a common variance \sigma^2, such that

example with unknown variances (3)

• recall

• then the following must be t-distributed

    \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma^2}{n} + \frac{\sigma^2}{m}}} \Bigg/ \sqrt{\frac{(n-1)\frac{S_1^2}{\sigma^2} + (m-1)\frac{S_2^2}{\sigma^2}}{n+m-2}} \;=\; \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n} + \frac{1}{m}\right)}}

  where S_p^2 = \frac{(n-1)S_1^2 + (m-1)S_2^2}{n+m-2} is the pooled sample variance

solution to the example

• n=12, m=14

• for a 90% confidence interval, we look for

    t_{\alpha/2, n+m-2} = t_{0.05, 24} = 1.711

• numerically: \bar{X} - \bar{Y} = 7.2143 and S_p^2 = 49.1, giving the interval (2.50, 11.93)

• all of this does not work if the variances are different; in that case we have to know at least the ratio between the variances
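the same numbers, reproduced in MATLAB:

    % pooled two-sample t-interval with the values from this slide
    n = 12; m = 14;
    d   = 7.2143;                        % observed xbar - ybar
    Sp2 = 49.1;                          % pooled sample variance
    t   = tinv(0.95, n + m - 2);         % t_{0.05,24} = 1.711
    hw  = t * sqrt(Sp2 * (1/n + 1/m));   % half-width of the interval
    ci  = [d - hw, d + hw]               % -> (2.50, 11.93)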

confidence intervals for Bernoulli random variables (1)

• the real fraction of the population that supports the president is p and we have taken a random sample

• we have shown before

• here we will make approximate confidence intervals by assuming

    n p(1-p) \approx n \hat{p}(1-\hat{p}), \qquad \hat{p} = \frac{X}{n}
confidence intervals for Bernoulli random variables (2)

• this brings us to the following approximation

• back to the example

    z_{0.025} = 1.96, \qquad 1.96\sqrt{\frac{0.52 \cdot 0.48}{n}} = 0.04 \;\Rightarrow\; n = 599.29
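this calculation in MATLAB, with the example's numbers:

    % half-width 0.04 at 95% confidence, preliminary estimate phat = 0.52
    phat = 0.52;
    z = norminv(0.975, 0, 1);            % z_{0.025} = 1.96
    n = (z/0.04)^2 * phat*(1-phat)       % -> 599.29, so take n = 600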

finding the required sample size

• an inverse problem would be: we want to find how big the sample n should be, such that the confidence interval for p is not longer than a given value, say b

    2 z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = b

• however, \hat{p} can be estimated only after n is known!

• the trick is to take a preliminary sample, calculate a preliminary estimate p^* and then calculate the required number

    n = \frac{(2 z_{\alpha/2})^2 \, p^*(1-p^*)}{b^2}
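a minimal sketch of this rule as a MATLAB one-liner (b is the total interval length):

    reqn = @(pstar, b, alpha) ceil((2*norminv(1-alpha/2,0,1))^2 * pstar*(1-pstar) / b^2);
    reqn(0.52, 0.08, 0.05)    % -> 600, consistent with the example above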

illustration by an example

• if we assume that in the end the number of acceptable chips is 1040, then we obtain

REMARK
for a conservative estimate on n, we should take the maximal possible value of p^*(1-p^*), which is \frac{1}{4}, so that

    n = \frac{(z_{\alpha/2})^2}{b^2}
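the conservative version, again as a MATLAB one-liner:

    ncons = @(b, alpha) ceil((norminv(1-alpha/2,0,1) / b)^2);
    ncons(0.08, 0.05)    % -> 601, safe no matter what the true p is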

Bayes estimator – a motivating example

• consider the problem of a communication system

    [diagram: the transmitter sends s=0 or s=1, Gaussian noise z is added, and the receiver observes r = s + z]

    [figure: distribution of r if s=0 (left) and distribution of r if s=1 (right)]
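a hedged sketch of the receiver's inference, assuming unit-variance noise and equally likely symbols (the slide shows only the two conditional densities; these parameter values are assumptions):

    % posterior probability of s=1 after observing r
    r = 0.7; sigma = 1; p1 = 0.5;           % assumed observation, noise and prior
    l0 = normpdf(r, 0, sigma);              % likelihood of r under s=0
    l1 = normpdf(r, 1, sigma);              % likelihood of r under s=1
    post1 = p1*l1 / (p1*l1 + (1-p1)*l0)     % P(s=1 | r)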

Bayes estimator

• prior distribution of \theta
– reflects how much we know about it before any observation

• posterior distribution of \theta
– reflects how much we know about it after the observation

• Bayes estimator
– under the mean square error criterion, the posterior mean E[\theta \mid X_1, \ldots, X_n]

example with uniform prior distribution

• in this example the Bayes estimator is different from the ML estimator, which is equal to the sample mean

example with normal prior distribution

• it turns out that the posterior distribution is normal with

further issues related to the Bayes estimator

• relation to the ML estimator

• finding the posterior confidence intervals

  the posterior mean is 43.75, the posterior variance is 37.5

• how many samples do we need to have the same confidence interval if there is no prior distribution?
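a hedged sketch of the posterior interval, assuming the normal posterior from the previous slide and the numbers above:

    % 95% posterior interval for theta
    m = 43.75; v = 37.5;
    z = norminv(0.975, 0, 1);
    ci = [m - z*sqrt(v), m + z*sqrt(v)]    % -> (31.75, 55.75)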

evaluating a point estimator

• we have seen that there are many options to select estimators; the question is how good a certain estimator is

• a good measure is the mean square error

    r(d, \theta) = E\left[\left(d(X_1, X_2, \ldots, X_n) - \theta\right)^2\right]

• definition of the bias of an estimator (the estimator is unbiased if the bias is zero)

    b_\theta(d) = E[d(X_1, X_2, \ldots, X_n)] - \theta

• for an unbiased estimator

    r(d, \theta) = \mathrm{Var}\left(d(X_1, X_2, \ldots, X_n)\right)

evaluating a point estimator (contd.)

• minimal mean square error (mmse) estimator: combine two unbiased estimators d_1, d_2 with variances \sigma_1^2, \sigma_2^2

    d = \lambda d_1 + (1 - \lambda) d_2, \qquad \lambda = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}

• general result

    r(d, \theta) = \mathrm{Var}(d) + b_\theta^2(d)

  which implies that the error of the mmse estimator is

    \mathrm{Var}(d) = \lambda^2 \mathrm{Var}(d_1) + (1 - \lambda)^2 \mathrm{Var}(d_2)
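a minimal MATLAB sketch of this combination; the two estimates and their variances are hypothetical:

    % mmse combination of two independent unbiased estimates
    d1 = 5.2; v1 = 2.0;                       % estimate 1 and its variance
    d2 = 4.7; v2 = 0.5;                       % estimate 2 and its variance
    lambda = v2 / (v1 + v2);
    d    = lambda*d1 + (1-lambda)*d2          % combined estimate
    vard = lambda^2*v1 + (1-lambda)^2*v2      % 0.4, smaller than both v1 and v2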

