
MATH 590S SEQUENTIAL ANALYSIS

PROJECT II

Junyi Dong
March 16, 2015

1 Introduction
In this project, we study the problem of constructing a fixed-width confidence interval for the mean of a normal distribution. We assume $X_1, X_2, \ldots$ are a random sample drawn from a normal population with mean $\mu$ and variance $\sigma^2$, where both $\mu$ and $\sigma^2$ are unknown. Since no fixed-sample-size procedure solves this problem, we use the Chow and Robbins (1965) purely sequential estimation procedure. The sample size required in this procedure is
$$N = \inf\left\{ n \ge n_0 : n \ge \frac{\chi^2_{1-\alpha}[1]\, S_n^2}{d^2} \right\}. \tag{1.1}$$

Here $N$ is determined by the prescribed half-width $d$ of the confidence interval $J_N = [\bar{X}_N - d,\, \bar{X}_N + d]$, the sample variance $S_n^2$ obtained from the sample $X_1, X_2, \ldots, X_n$, and $\chi^2_{1-\alpha}[1]$, the $(1-\alpha)$th quantile of the $\chi^2$-distribution with one degree of freedom.
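As a concrete illustration, the following Python sketch implements the sampling rule (1.1) directly: it draws observations one at a time, stops as soon as $n \ge \chi^2_{1-\alpha}[1] S_n^2 / d^2$, and reports the interval $\bar{X}_N \pm d$. This is a minimal sketch and not part of the original computations; the helper name and parameter defaults are illustrative.

```python
import numpy as np
from scipy.stats import chi2

def chow_robbins_interval(rng, mu, sigma, d, alpha=0.05, n0=3):
    """Purely sequential rule (1.1): stop at the first n >= n0 with
    n >= chi2_{1-alpha}[1] * S_n^2 / d^2, then report xbar_N +/- d."""
    c = chi2.ppf(1 - alpha, df=1)              # chi-square (1-alpha) quantile, 1 df
    x = list(rng.normal(mu, sigma, size=n0))   # initial sample of size n0 (n0 >= 2)
    n = n0
    while n < c * np.var(x, ddof=1) / d**2:    # S_n^2 = unbiased sample variance
        x.append(rng.normal(mu, sigma))        # take one more observation
        n += 1
    xbar = float(np.mean(x))
    return n, (xbar - d, xbar + d)

# Example: one realization with mu = 0, sigma = 1, d = 0.5, alpha = 0.05
rng = np.random.default_rng(1)
N, J = chow_robbins_interval(rng, mu=0.0, sigma=1.0, d=0.5)
```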
In Sections 2 and 3, we formulate a general solution for the distribution of $N$ and apply it to the Chow-Robbins case. In Section 4, we calculate the expectation of $N$ and the coverage probability under different scenarios. In Sections 5 and 6, we further investigate the distribution of $N$ and its expectation. In Section 7, we verify the asymptotic properties using plots. In Section 8, we compare Stein's two-stage procedure with the Chow-Robbins estimation procedure.
Finally, we conclude that the Chow-Robbins estimation procedure has the following important properties:
1. $E[N] \le n_0 + 1 + n_{\mathrm{opt}}$.
2. As $d \to 0$, $E[N] \to \infty$.
3. As $d \to 0$, $E[N / n_{\mathrm{opt}}] \to 1$.
4. As $d \to 0$, $P(\mu \in J_N) \to 1 - \alpha$.

2 Preliminaries
Let $Z_1, Z_2, \ldots$ be a sequence of iid random variables from an exponential population with mean 1. Let $\{a_n\}$ be a sequence with $a_1 = 0$ and $a_2 < a_3 < \cdots$. We define the stopping rule as
$$M = \inf\Big\{ m \ge m_0 = 1 : \sum_{i=1}^{m} Z_i \le a_{m+1} \Big\}. \tag{2.1}$$
Let $S_m = \sum_{i=1}^{m} Z_i$. We seek a formula for $P(M = k) = P(S_1 > a_2, S_2 > a_3, \ldots, S_{k-1} > a_k, S_k \le a_{k+1})$, for $k = 1, 2, \ldots$:

$$h_m(x) = \sum_{j=0}^{m-1} \frac{(x - a_m)^j}{j!}\, h_{m-j}(a_m) \tag{2.2}$$

$$G_m(\infty) = P(S_1 > a_2, \ldots, S_{m-1} > a_m, S_m \le \infty) = P(S_1 > a_2, \ldots, S_{m-1} > a_m) = e^{-a_m} \sum_{j=0}^{m-1} h_{m-j}(a_m) \tag{2.3}$$

$$P(M = k) = P(S_1 > a_2, \ldots, S_{k-1} > a_k) - P(S_1 > a_2, \ldots, S_k > a_{k+1}) = P(S_1 > a_2, \ldots, S_{k-1} > a_k, S_k \le a_{k+1}) = G_k(\infty) - G_{k+1}(\infty) \tag{2.4}$$
If we let $m_0$ be arbitrary, then we want to consider
$$P(M = m_0 + k) = P(S_{m_0} > a_{m_0+1}, S_{m_0+1} > a_{m_0+2}, \ldots, S_{m_0+k-1} > a_{m_0+k}, S_{m_0+k} \le a_{m_0+k+1})$$
$$= P(S_1 > 0, S_2 > 0, \ldots, S_{m_0-1} > 0, S_{m_0} > a_{m_0+1}, \ldots, S_{m_0+k-1} > a_{m_0+k}, S_{m_0+k} \le a_{m_0+k+1})$$
for $k = 0, 1, 2, \ldots$. Thus we only need to modify the sequence $\{a_n\}$ by setting $a_1 = a_2 = \cdots = a_{m_0} = 0$; all the formulae above then carry over.
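Formulae (2.2)-(2.4) give $P(M = k)$ exactly; as a quick sanity check, the same distribution can be approximated by simulating the exponential sums directly. The sketch below is a minimal Monte Carlo version of the stopping rule (2.1) with an arbitrary starting index $m_0$; the threshold function and the example sequence $a_m$ are illustrative choices, not taken from the report.

```python
import numpy as np

def simulate_M(rng, a, m0=1, m_max=10_000):
    """One draw of M = inf{m >= m0 : S_m <= a(m+1)} with Z_i iid Exp(1);
    `a` maps the index m to the threshold a_m."""
    S = rng.exponential(1.0, size=m0).sum()    # S_{m0}
    m = m0
    while S > a(m + 1) and m < m_max:          # keep sampling while S_m > a_{m+1}
        S += rng.exponential(1.0)
        m += 1
    return m

# Example: a_m = 0 for m <= m0 and a_m = 0.4*m*(m-1) afterwards; estimate the pmf of M
rng = np.random.default_rng(0)
a = lambda m: 0.0 if m <= 3 else 0.4 * m * (m - 1)
draws = np.array([simulate_M(rng, a, m0=3) for _ in range(20_000)])
pmf = {k: float(np.mean(draws == k)) for k in range(3, int(draws.max()) + 1)}
```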

3 Chow-Robbins Case


Now let $X_1, X_2, \ldots$ be a sequence of iid random variables from a normal population with mean $\mu$ and variance $\sigma^2$. Our goal is to construct a fixed-width confidence interval $J_N$ for the mean $\mu$. The Chow and Robbins purely sequential estimation procedure provides a stopping time: in general, we sample until
$$N = \inf\left\{ n \ge n_0 : n \ge \frac{\chi^2_{1-\alpha}[1]\, S_n^2}{d^2} \right\}. \tag{3.1}$$
Define $U_i = \frac{1}{i(i+1)}\big( i X_{i+1} - \sum_{j=1}^{i} X_j \big)^2$. Observe that $i X_{i+1} \sim N(i\mu,\, i^2\sigma^2)$ and $\sum_{j=1}^{i} X_j \sim N(i\mu,\, i\sigma^2)$, so $i X_{i+1} - \sum_{j=1}^{i} X_j \sim N(0,\, i(i+1)\sigma^2)$. We must have $U_i \sim \sigma^2 \chi^2(1)$ and $\sum_{i=1}^{n-1} U_i = (n-1) S_n^2$.

The stopping condition $n \ge \chi^2_{1-\alpha}[1] S_n^2 / d^2$ is therefore equivalent to
$$(n-1) S_n^2 = \sum_{i=1}^{n-1} U_i \le \frac{(n-1)\, n\, d^2}{\chi^2_{1-\alpha}[1]} = a_n,$$
which is a boundary-crossing condition on the partial sums of the $U_i$ with boundary $a_n = (n-1)\, n\, d^2 / \chi^2_{1-\alpha}[1]$.

In order to make use of formulae (2.2)-(2.4) in this special case, we need to construct $Z_i$ from an exponential population with mean 1. Notice that $\frac{1}{\sigma^2}(U_i + U_{i+1}) \sim \chi^2(2)$ and thus $\frac{1}{2\sigma^2}(U_i + U_{i+1}) \sim \exp(1)$. To make sure that we do have an odd number of the $X_i$, we also impose the constraint that $N$ must be an odd number. Therefore, $N$ is redefined as
$$N = \inf\Big\{ n \ge n_0,\ n = 2m+1 : \sum_{i=1}^{m} \tfrac{1}{2\sigma^2}(U_{2i-1} + U_{2i}) \le A_{m+1} \Big\} = \inf\Big\{ n \ge n_0,\ n = 2m+1 : \sum_{i=1}^{m} V_i \le A_{m+1} \Big\},$$
where $V_i = \tfrac{1}{2\sigma^2}(U_{2i-1} + U_{2i}) \sim \exp(1)$. When $n \ge n_0$, the corresponding index satisfies $m \ge \tfrac{n_0 - 1}{2} = m_0$.


Thus the sequence $\{A_m\}$ is defined by $A_1 = A_2 = \cdots = A_{m_0} = 0$ and, for $m > m_0$,
$$A_m = \frac{m(m-1)\, d^2}{\chi^2_{1-\alpha}[1]\, \sigma^2} = \frac{m(m-1)\, \lambda^2}{\chi^2_{1-\alpha}[1]},$$
where $\lambda = d/\sigma$. The distribution of $N$ is then, for $n \ge n_0$, or equivalently $m \ge m_0$,
$$P(N = 2m+1) = P(M = m) = P(S_{m_0} > A_{m_0+1}, S_{m_0+1} > A_{m_0+2}, \ldots, S_{m-1} > A_m, S_m \le A_{m+1})$$
$$= P(S_1 > 0, \ldots, S_{m_0-1} > 0, S_{m_0} > A_{m_0+1}, \ldots, S_{m-1} > A_m, S_m \le A_{m+1}),$$
where now $S_m = \sum_{i=1}^{m} V_i$.
Finally, the expectation and coverage probability are
$$E[N] = \sum_{m=m_0}^{\infty} (2m+1)\, P(M = m), \tag{3.2}$$
$$P(\mu \in J_N) = E_{\mu,\sigma}\Big[ 2\,\Phi\Big(\frac{\sqrt{N}\, d}{\sigma}\Big) - 1 \Big] \tag{3.3}$$
$$= \sum_{m=m_0}^{\infty} \Big( 2\,\Phi\Big(\frac{d\sqrt{2m+1}}{\sigma}\Big) - 1 \Big) P(M = m), \tag{3.4}$$
where $\Phi$ denotes the standard normal distribution function.
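As a rough numerical companion to (3.2)-(3.4), the sketch below evaluates the expectation and coverage probability by Monte Carlo instead of the exact recursion: it simulates $M$ through the $\exp(1)$ representation with thresholds $A_m$, then averages $2m+1$ and $2\Phi(d\sqrt{2m+1}/\sigma) - 1$ over the draws. Function and variable names are illustrative, and the values produced are estimates of the report's exact quantities.

```python
import numpy as np
from scipy.stats import chi2, norm

def expectation_and_coverage(rng, alpha, d, sigma, n0, reps=20_000):
    """Monte Carlo versions of (3.2) and (3.4) via the exp(1) representation:
    simulate M with thresholds A_m, then average 2m+1 and 2*Phi(d*sqrt(2m+1)/sigma) - 1."""
    c = chi2.ppf(1 - alpha, df=1)
    m0 = max((n0 - 1) // 2, 1)
    A = lambda m: m * (m - 1) * d**2 / (c * sigma**2)    # thresholds A_m beyond m0
    Ns, cvg = [], []
    for _ in range(reps):
        S, m = rng.exponential(1.0, size=m0).sum(), m0   # S_{m0} = V_1 + ... + V_{m0}
        while S > A(m + 1):                              # stop once S_m <= A_{m+1}
            S += rng.exponential(1.0)
            m += 1
        n = 2 * m + 1
        Ns.append(n)
        cvg.append(2 * norm.cdf(d * np.sqrt(n) / sigma) - 1)
    return float(np.mean(Ns)), float(np.mean(cvg))

# Example: alpha = 0.05, d = 0.5, sigma = 1, n0 = 3 (first column of Table 4.1)
rng = np.random.default_rng(0)
EN, coverage = expectation_and_coverage(rng, alpha=0.05, d=0.5, sigma=1.0, n0=3)
```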

4 Expectation and Coverage Probability


In this section, we present three scenarios to show the effects of the test size $\alpha$, the standard deviation $\sigma$, and the half-width $d$ on the expectation of $N$ and the coverage probability $P(\mu \in J_N)$. We consider the following scenarios:
Scenario I. Assume $\alpha = 0.05$ and $\sigma = 1$, with two subcases: (1) $d = 0.5$ and (2) $d = 0.3$.
Scenario II. Assume $\alpha = 0.05$ and $\sigma = 2$, with two subcases: (1) $d = 0.5$ and (2) $d = 0.3$.
Scenario III. Assume $\alpha = 0.1$ and $\sigma = 1$, with two subcases: (1) $d = 0.5$ and (2) $d = 0.3$.
Table 4.1 compares the expected value and coverage probability for different values of $\alpha$, $d$, $\sigma$, and $n_0$ (a direct Monte Carlo spot-check of individual cells is sketched immediately after the table). We draw the following conclusions:
1. Fixing $\alpha$, $d$ and $\sigma$ (i.e. fixing $\lambda$), $E[N]$ increases with $n_0$ and equals $n_0$ when $n_0$ is sufficiently large. The coverage probability increases to 1. Also notice that $P(\mu \in J_N)$ may be smaller than $1 - \alpha$.
2. Fixing $\alpha$, $\sigma$ and $n_0$, as $d$ decreases (i.e. $\lambda$ decreases), $E[N]$ and $P(\mu \in J_N)$ both increase. Also, as $\lambda$ decreases, the coverage probability approaches $1 - \alpha$, which supports the asymptotic consistency of the Chow-Robbins procedure.
3. Fixing $\alpha$, $d$ and $n_0$, as $\sigma$ increases, or equivalently $\lambda$ decreases, both $E[N]$ and $P(\mu \in J_N)$ increase, and the coverage probability approaches $1 - \alpha$. Again, these results demonstrate the asymptotic consistency property.
4. Fixing $d$, $n_0$ and $\sigma$, as $\alpha$ increases, the allowed type I error increases and thus $E[N]$ and $P(\mu \in J_N)$ decrease.

Table 4.1: Expectation and Coverage Probability

            α=0.05            α=0.05            α=0.05            α=0.05             α=0.1             α=0.1
            d=0.5, σ=1        d=0.3, σ=1        d=0.5, σ=2        d=0.3, σ=2         d=0.5, σ=1        d=0.3, σ=1
 m0   n0    E[N]   P(μ∈J_N)   E[N]   P(μ∈J_N)   E[N]   P(μ∈J_N)   E[N]    P(μ∈J_N)   E[N]   P(μ∈J_N)   E[N]   P(μ∈J_N)
  1    3   12.80   0.8660    38.56   0.8930    57.43   0.9086    167.09   0.9343      9.18  0.8161    26.15   0.8235
  2    5   14.06   0.9061    40.54   0.9230    59.70   0.9337    169.82   0.9466     10.32  0.8622    27.85   0.8581
  3    7   14.71   0.9228    41.07   0.9302    60.17   0.9384    170.07   0.9476     11.07  0.8860    28.41   0.8686
  4    9   15.24   0.9338    41.31   0.9332    60.33   0.9400    170.11   0.9478     11.84  0.9044    28.72   0.8739
  5   11   15.82   0.9430    41.45   0.9349    60.41   0.9408    170.12   0.9478     12.80  0.9210    28.95   0.8775
  6   13   16.52   0.9513    41.55   0.9361    60.46   0.9412    170.12   0.9478     14.00  0.9364    29.15   0.8805
  7   15   17.40   0.9592    41.64   0.9370    60.50   0.9415    170.12   0.9478     15.48  0.9500    29.35   0.8832
  8   17   18.51   0.9666    41.72   0.9378    60.52   0.9417    170.12   0.9478     17.20  0.9616    29.57   0.8860
  9   19   19.86   0.9733    41.80   0.9386    60.55   0.9419    170.12   0.9478     19.07  0.9709    29.83   0.8890
 10   21   21.44   0.9791    41.89   0.9394    60.57   0.9420    170.12   0.9478     21.02  0.9781    30.16   0.8923
 11   23   23.20   0.9839    42.00   0.9402    60.59   0.9422    170.12   0.9478     23.00  0.9835    30.57   0.8961
 12   25   25.08   0.9877    42.13   0.9412    60.61   0.9424    170.12   0.9478     25.00  0.9876    31.10   0.9003
 13   27   27.03   0.9907    42.29   0.9422    60.64   0.9425    170.12   0.9478     27.00  0.9906    31.77   0.9051
 14   29   29.01   0.9929    42.50   0.9435    60.67   0.9427    170.12   0.9478     29.00  0.9929    32.60   0.9104
 15   31   31.00   0.9946    42.76   0.9449    60.70   0.9429    170.12   0.9478     31.00  0.9946    33.62   0.9161
 16   33   33.00   0.9959    43.09   0.9465    60.74   0.9431    170.12   0.9478     33.00  0.9959    34.82   0.9221
 17   35   35.00   0.9969    43.51   0.9483    60.80   0.9434    170.12   0.9478     35.00  0.9969    36.20   0.9282
 18   37   37.00   0.9976    44.05   0.9503    60.86   0.9437    170.12   0.9478     37.00  0.9976    37.76   0.9343
 19   39   39.00   0.9982    44.71   0.9526    60.94   0.9441    170.12   0.9478     39.00  0.9982    39.45   0.9403
 20   41   41.00   0.9986    45.51   0.9551    61.05   0.9446    170.12   0.9478     41.00  0.9986    41.25   0.9459
 21   43   43.00   0.9990    46.47   0.9578    61.19   0.9451    170.12   0.9478     43.00  0.9990    43.13   0.9511
 22   45   45.00   0.9992    47.60   0.9606    61.36   0.9457    170.12   0.9478     45.00  0.9992    45.06   0.9560
 23   47   47.00   0.9994    48.88   0.9634    61.59   0.9465    170.12   0.9478     47.00  0.9994    47.03   0.9603
 24   49   49.00   0.9995    50.31   0.9663    61.87   0.9474    170.12   0.9478     49.00  0.9995    49.01   0.9643
 25   51   51.00   0.9996    51.88   0.9691    62.23   0.9484    170.12   0.9478     51.00  0.9996    51.00   0.9678
 26   53   53.00   0.9997    53.57   0.9718    62.68   0.9496    170.12   0.9478     53.00  0.9997    53.00   0.9710
 27   55   55.00   0.9998    55.35   0.9743    63.22   0.9510    170.12   0.9478     55.00  0.9998    55.00   0.9739
 28   57   57.00   0.9998    57.21   0.9767    63.88   0.9525    170.12   0.9478     57.00  0.9998    57.00   0.9765
 29   59   59.00   0.9999    59.12   0.9789    64.66   0.9541    170.12   0.9478     59.00  0.9999    59.00   0.9788
 30   61   61.00   0.9999    61.06   0.9809    65.57   0.9559    170.12   0.9478     61.00  0.9999    61.00   0.9809
 n_opt     15.36584          42.68288          61.46334          170.7315            10.82217         30.06159
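Any individual cell of Table 4.1 can be spot-checked by direct simulation of normal data with the sampling-rule sketch from Section 1 (hypothetical helper `chow_robbins_interval`). Note that the table itself is computed under the odd-$N$ version of the rule from Section 3, so small systematic differences from these rough estimates are expected.

```python
import numpy as np

# Assumes chow_robbins_interval(rng, mu, sigma, d, alpha, n0) from the Section 1 sketch.
rng = np.random.default_rng(0)
alpha, sigma, d, n0, mu = 0.05, 1.0, 0.5, 3, 0.0
runs = [chow_robbins_interval(rng, mu, sigma, d, alpha, n0) for _ in range(10_000)]
EN_hat = np.mean([N for N, _ in runs])                       # compare with the E[N] column
cvg_hat = np.mean([lo <= mu <= hi for _, (lo, hi) in runs])  # compare with the P(mu in J_N) column
```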

More specifically, we look at the expectation of $N$ when $\alpha = 0.05$, $d = 0.3$, $\sigma = 2$ (Table 4.2). We find that $E[N]$ coincides with $n_0$ (to the reported precision) once $n_0 \ge 243$.

Table 4.2: Expectation when α = 0.05, d = 0.3, σ = 2

 n0   E[N]       n0   E[N]       n0    E[N]       n0    E[N]       n0    E[N]       n0    E[N]
  3   167.0924   43   170.1203   83    170.1208   123   170.1935   163   174.6966   203   203.2609
  5   169.824    45   170.1203   85    170.121    125   170.2144   165   175.426    205   205.195
  7   170.0675   47   170.1203   87    170.1211   127   170.2409   167   176.2357   207   207.1441
  9   170.1067   49   170.1204   89    170.1213   129   170.2743   169   177.128    209   209.1053
 11   170.1157   51   170.1204   91    170.1216   131   170.3161   171   178.1046   211   211.076
 13   170.1184   53   170.1204   93    170.1219   133   170.3683   173   179.166    213   213.0543
 15   170.1194   55   170.1204   95    170.1223   135   170.4331   175   180.312    215   215.0383
 17   170.1198   57   170.1204   97    170.1229   137   170.5128   177   181.5413   217   217.0266
 19   170.12     59   170.1204   99    170.1236   139   170.6106   179   182.8516   219   219.0183
 21   170.1201   61   170.1204   101   170.1245   141   170.7295   181   184.2399   221   221.0125
 23   170.1202   63   170.1204   103   170.1258   143   170.8733   183   185.7023   223   223.0084
 25   170.1202   65   170.1204   105   170.1274   145   171.046    185   187.2342   225   225.0055
 27   170.1203   67   170.1204   107   170.1295   147   171.2519   187   188.8306   227   227.0036
 29   170.1203   69   170.1205   109   170.1322   149   171.4959   189   190.486    229   229.0023
 31   170.1203   71   170.1205   111   170.1358   151   171.7829   191   192.1949   231   231.0015
 33   170.1203   73   170.1205   113   170.1404   153   172.118    193   193.9514   233   233.0009
 35   170.1203   75   170.1206   115   170.1464   155   172.5065   195   195.75     235   235.0006
 37   170.1203   77   170.1206   117   170.1542   157   172.9536   197   197.5853   237   237.0004
 39   170.1203   79   170.1207   119   170.1642   159   173.4645   199   199.4519   239   239.0002
 41   170.1203   81   170.1207   121   170.1771   161   174.044    201   201.3453   241   241.0001

5 Distribution of N
In this section, we present the pmf of $N$, based on formulae (2.2)-(2.4). Figure 5.2 plots the probability mass function of $N$ for six scenarios: (1) $\alpha = 0.05$, $\sigma = 1$, $d = 0.5$; (2) $\alpha = 0.05$, $\sigma = 1$, $d = 0.3$; (3) $\alpha = 0.05$, $\sigma = 2$, $d = 0.5$; (4) $\alpha = 0.05$, $\sigma = 2$, $d = 0.3$; (5) $\alpha = 0.1$, $\sigma = 1$, $d = 0.5$; and (6) $\alpha = 0.1$, $\sigma = 1$, $d = 0.3$, each with four subcases (a) $n_0 = 3$, (b) $n_0 = 5$, (c) $n_0 = 11$, and (d) $n_0 = 21$.

[Figure 5.2: distribution of N. Six panels of pmf plots, one per scenario listed above; each panel shows the four subcases n0 = 3, 5, 11, 21, with n0 + k on the horizontal axis and the pmf (labelled "density of N") on the vertical axis.]

We conclude that as $n_0$ increases, $P(N = n_0)$ increases and eventually approaches 1. As $d$ decreases, $P(N = n)$ decreases for fixed $n$. Moreover, as $\alpha$ increases, $P(N = n)$ increases.

6 Expectation of N

[Figure 6.1: Expectation of N versus n0 for cases (1) α = 0.05, d = 0.5, σ = 1; (2) α = 0.05, d = 0.3, σ = 1; (3) α = 0.05, d = 0.5, σ = 2.]

[Figure 6.2: Expectation of N versus n0 for cases (4) α = 0.05, d = 0.3, σ = 2; (5) α = 0.1, d = 0.5, σ = 1; (6) α = 0.1, d = 0.3, σ = 1.]

[Figure 6.3: Expectation of N versus n0 with all six scenarios overlaid.]

Combining the results of Sections 2 and 3, we have the following formula for the expected value of $N$:
$$E[N] = \sum_{m=m_0}^{\infty} (2m+1)\, P(M = m), \tag{6.1}$$
which is a function of $n_0$, $\lambda = d/\sigma$, and $\alpha$. From Figures 6.1-6.3, we conclude:
1. Fixing $\lambda$ and $\alpha$, as $n_0$ increases, the expected value of $N$ also increases, and when $n_0$ is sufficiently large we have $E[N] \approx n_0$.
2. Fixing $\alpha$ and $n_0$, as $\lambda$ increases, $E[N]$ decreases. Also notice that when $\lambda$ is larger, $E[N]$ gets close to $n_0$ at a smaller $n_0$.
3. Fixing $\lambda$ and $n_0$, as $\alpha$ increases, $E[N]$ decreases and $E[N]$ gets close to $n_0$ at a smaller $n_0$.


7 Asymptotic Properties
First we remark that $E[N] \le n_0 + 1 + n_{\mathrm{opt}}$, where $n_{\mathrm{opt}} = \chi^2_{1-\alpha}[1]\,\sigma^2 / d^2$ is the sample size that would be required if $\sigma$ were known.
[Figure 7.1: E[N] versus n0, plotted together with n_opt and the bound n0 + n_opt + 1.]

The Chow-Robbins estimation procedure has the following asymptotic properties:

1. As $d \to 0$, $N / n_{\mathrm{opt}} \to 1$ almost surely and $E[N / n_{\mathrm{opt}}] \to 1$.
Proof: Notice that $N \le n_0 + n_{\mathrm{opt}} + 1$ (see Section 8 for details) and
$$\frac{\chi^2_{1-\alpha}[1]\, S_N^2}{d^2} \le N \le n_0 + 1 + \frac{\chi^2_{1-\alpha}[1]\, S_{N-1}^2}{d^2}.$$
$N$ is a monotonically decreasing function of $d$: as $d \downarrow 0$, $N \to \infty$, and thus $S_N^2$ and $S_{N-1}^2$ both converge to $\sigma^2$ almost surely. Since $\chi^2_{1-\alpha}[1]\,\sigma^2 / d^2 = n_{\mathrm{opt}}$, dividing the inequality through by $n_{\mathrm{opt}}$ gives
$$\frac{S_N^2}{\sigma^2} \le \frac{N}{n_{\mathrm{opt}}} \le \frac{n_0 + 1}{n_{\mathrm{opt}}} + \frac{S_{N-1}^2}{\sigma^2},$$
so $N / n_{\mathrm{opt}} \to 1$ almost surely as $d \to 0$; combined with the bound $N \le n_0 + n_{\mathrm{opt}} + 1$, this also gives $E[N / n_{\mathrm{opt}}] \to 1$.

2. As $d \to 0$, $P(\mu \in J_N) \to 1 - \alpha$ for every $\mu$, $\sigma$, and $\alpha \in (0,1)$.
Proof: Notice that
$$\frac{N d^2}{\sigma^2} = \frac{N}{n_{\mathrm{opt}}}\, \chi^2_{1-\alpha}[1] \to \chi^2_{1-\alpha}[1] = \text{constant}.$$
By Anscombe's CLT,
$$\frac{\sum_{i=1}^{N} X_i - N\mu}{\sigma\sqrt{N}} = \frac{\sqrt{N}\,(\bar{X}_N - \mu)}{\sigma} \xrightarrow{d} N(0,1) \quad \text{as } d \to 0.$$
Therefore
$$P(\mu \in J_N) = P\big(|\bar{X}_N - \mu| \le d\big) = P\Big(\frac{\sqrt{N}\,|\bar{X}_N - \mu|}{\sigma} \le \frac{\sqrt{N}\, d}{\sigma}\Big) \to 2\,\Phi\Big(\sqrt{\chi^2_{1-\alpha}[1]}\Big) - 1 = 1 - \alpha.$$

The simulation supports these asymptotic properties. Figures 7.2 and 7.3 show that as $\lambda \to 0$, $E[N]/n_{\mathrm{opt}} \to 1$ and $P(\mu \in J_N) \to 1 - \alpha$. We also point out that the Chow-Robbins procedure does not deliver the nominal coverage probability for all $d$: only when $d$ gets close to 0 do we attain the required coverage.

[Figure 7.2: E[N]/n_opt plotted against λ; as λ → 0 the ratio approaches the reference level 1.]

[Figure 7.3: coverage probability plotted against λ, with the reference level 1 − α; as λ → 0 the coverage approaches 1 − α.]
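The two limits above can also be checked empirically by driving $d$ toward 0. Below is a rough sketch, reusing the hypothetical `chow_robbins_interval` helper from Section 1 with illustrative parameter values; the printed ratios are Monte Carlo estimates only.

```python
import numpy as np
from scipy.stats import chi2

# Assumes chow_robbins_interval(rng, mu, sigma, d, alpha, n0) from the Section 1 sketch.
rng = np.random.default_rng(0)
alpha, mu, sigma, n0 = 0.05, 0.0, 1.0, 3
c = chi2.ppf(1 - alpha, df=1)
for d in (1.0, 0.5, 0.25, 0.1):                  # drive d toward 0
    n_opt = c * sigma**2 / d**2
    Ns, hits = [], []
    for _ in range(2_000):
        N, (lo, hi) = chow_robbins_interval(rng, mu, sigma, d, alpha, n0)
        Ns.append(N)
        hits.append(lo <= mu <= hi)
    # E[N]/n_opt should drift toward 1 and the coverage toward 1 - alpha
    print(d, np.mean(Ns) / n_opt, np.mean(hits))
```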


8 Comparison with Stein's Two-Stage Sampling Procedure


From Project I, we concluded that Stein's two-stage sampling procedure has the following properties:
1. The distribution of $N$, $E[N]$, and $P(\mu \in J_N)$ depend on $\sigma$, $d$, $\alpha$, and $n_0$.
2. $P_{\mu,\sigma}(\mu \in J_N) \ge 1 - \alpha$: the confidence interval for $\mu$ has the required coverage probability and length.
3. $E_{\mu,\sigma}[N] \ge n_{\mathrm{opt}}$ and $\lim_{d \to 0} E[N / n_{\mathrm{opt}}] > 1$: Stein's two-stage sampling procedure requires more data than the optimal amount and is asymptotically inefficient.
In this project, we find that the Chow-Robbins estimation procedure has the following properties:
1. The distribution of $N$, $E[N]$, and $P(\mu \in J_N)$ depend on $\sigma$, $d$, $\alpha$, and $n_0$.
2. $P_{\mu,\sigma}(\mu \in J_N) \to 1 - \alpha$ as $d \to 0$, but $P_{\mu,\sigma}(\mu \in J_N) \ge 1 - \alpha$ does not hold for all $d$.
3. $E_{\mu,\sigma}[N] \le n_0 + n_{\mathrm{opt}} + 1$ and $\lim_{d \to 0} E[N / n_{\mathrm{opt}}] = 1$: the Chow-Robbins procedure is asymptotically efficient.
We finish this section with the following comparison:

Property                    Stein's Two-Stage Procedure   Chow-Robbins Estimation Procedure   Remark
Exact consistency           ✓                             —                                   P(μ ∈ J_N) ≥ 1 − α
Asymptotic consistency      ✓                             ✓                                   lim_{d→0} P(μ ∈ J_N) ≥ 1 − α
First-order efficiency      —                             ✓                                   lim_{d→0} E[N/n_opt] = 1
Second-order efficiency     —                             ✓                                   lim_{d→0} [E(N) − n_opt] = O(1)
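For completeness, here is a hedged sketch of the two-stage rule used in the comparison, written in its textbook form with a first stage of size $n_0$ and the Student $t$ quantile; names and defaults are illustrative and it is not claimed to match the exact variant from Project I. It can be run side by side with the Chow-Robbins helper from Section 1.

```python
import numpy as np
from scipy.stats import t

def stein_two_stage_interval(rng, mu, sigma, d, alpha=0.05, n0=10):
    """Textbook two-stage rule: a first stage of size n0 fixes S^2, then the sample
    is topped up to N = max(n0, ceil(t_{alpha/2, n0-1}^2 * S^2 / d^2))."""
    x1 = rng.normal(mu, sigma, size=n0)
    s2 = np.var(x1, ddof=1)
    tq = t.ppf(1 - alpha / 2, df=n0 - 1)
    N = max(n0, int(np.ceil(tq**2 * s2 / d**2)))
    x = np.concatenate([x1, rng.normal(mu, sigma, size=N - n0)])
    xbar = float(x.mean())
    return N, (xbar - d, xbar + d)

# Example: average N for d = 0.3, sigma = 1, alpha = 0.05 (compare with n_opt = 42.7 from Table 4.1)
rng = np.random.default_rng(0)
Ns = [stein_two_stage_interval(rng, 0.0, 1.0, d=0.3)[0] for _ in range(5_000)]
```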

9 Conclusion
In our study, we use numerical computation and simulation to explore properties of the Chow and Robbins purely sequential estimation procedure. We demonstrate its asymptotic consistency, first-order asymptotic efficiency, and second-order asymptotic efficiency. We also conclude that the Chow-Robbins estimation procedure does not have exact consistency.

10 References
1. W.D. Ray, "Sequential Confidence Intervals for the Means of a Normal Population with
Unknown Variance".


2. Norman Starr, "The Performance of a Sequential Procedure for the Fixed-Width Interval
Estimation of the Mean".
3. Herbert Robbins, "Sequential Estimation of the Mean of a Normal Population".
