
Estimation of parametric functions in Downton’s

bivariate exponential distribution

George Iliopoulos
Department of Mathematics
University of the Aegean
83200 Karlovasi, Samos, Greece
e-mail: geh@aegean.gr

Abstract

This paper considers estimation of the ratio of means and the regression function in
Downton’s (1970) bivariate exponential distribution. Unbiased estimators are given and,
by presenting improved estimators, they are shown to be inadmissible in terms of mean
squared error. The results are derived by conditioning on an unobserved random sample
from a geometric distribution which provides conditional independence for the statistics
involved.

AMS 2000 subject classifications: 62F10, 62C99.

Key words and phrases: Downton’s bivariate exponential distribution, unbiased esti-
mation, ratio of means, regression function, mean squared error, inadmissibility.

1 Introduction

One of the most important bivariate distributions in reliability theory is the bivariate
exponential. There are various bivariate exponential distributions in the literature. A
recent review can be found in the book of Kotz, Balakrishnan and Johnson (2000). In this
paper we are interested in Downton’s bivariate exponential distribution with probability

density function (pdf)
$$ f(x, y; \lambda_1, \lambda_2, \rho) = \frac{\lambda_1 \lambda_2}{1-\rho}\, \exp\!\left(-\frac{\lambda_1 x + \lambda_2 y}{1-\rho}\right) I_0\!\left(\frac{2(\rho \lambda_1 \lambda_2 x y)^{1/2}}{1-\rho}\right), \qquad (1.1) $$
where x, y, λ1, λ2 > 0, 0 ≤ ρ < 1, and $I_0(z) = \sum_{k=0}^{\infty} (z/2)^{2k}/(k!)^2$ is the modified Bessel
function of the first kind of order zero. The above density was initially derived in a different
form by Moran (1967). The form (1.1) was derived by Downton (1970) in a reliability context
and is a special case of Kibble's (1941) bivariate gamma distribution.
Let (X, Y ) be an observation from (1.1). The marginal distributions of X, Y are
exponential with means (scale parameters) 1/λ1 , 1/λ2 respectively. Since I0 (0) = 1, it is
clear that X and Y are independent if and only if ρ = 0. Downton (1970) showed that ρ
is the correlation coefficient of the two variates. By expanding in a series, the pdf can be
written in the form

$$ f(x, y; \lambda_1, \lambda_2, \rho) = \sum_{k=0}^{\infty} \pi(k; \rho)\, g_{k+1}\!\left(x; \frac{1-\rho}{\lambda_1}\right) g_{k+1}\!\left(y; \frac{1-\rho}{\lambda_2}\right), $$

where gα(·; β) denotes the pdf of a Gamma(α, β) random variable and π(k; ρ) = (1 − ρ)ρ^k, k = 0, 1, 2, . . ., is the geometric probability mass function. Let K be a random
variable having the above geometric distribution. Then, conditionally on K = k, X, Y are
independent gamma variates with shape parameter k + 1 and scale parameters (1 − ρ)/λ1 ,
(1−ρ)/λ2 respectively. The most common algorithm for generating observations from (1.1)
(see Downton, 1970, and Al-Saadi, Scrimshaw and Young, 1979), as well as the extension
of the distribution to more than two dimensions (see Al-Saadi and Young, 1982), is
based on this well-known property.
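As a concrete illustration of this property, the following is a minimal sampling sketch; the code and the function name rdownton are my own, not taken from the paper or the cited references.

```python
import numpy as np

def rdownton(n, lam1, lam2, rho, seed=None):
    """Draw n pairs (X, Y) from density (1.1), using the geometric-mixture
    representation: K ~ Geometric, then X, Y independent Gamma(K + 1)."""
    rng = np.random.default_rng(seed)
    # K_i on {0, 1, 2, ...} with P(K = k) = (1 - rho) * rho**k. NumPy's
    # geometric has support {1, 2, ...} with success probability 1 - rho,
    # so subtract 1 to shift the support.
    k = rng.geometric(1.0 - rho, size=n) - 1
    # Given K_i = k_i, X_i and Y_i are independent Gamma(k_i + 1) variates
    # with scale parameters (1 - rho)/lam1 and (1 - rho)/lam2.
    x = rng.gamma(shape=k + 1, scale=(1.0 - rho) / lam1)
    y = rng.gamma(shape=k + 1, scale=(1.0 - rho) / lam2)
    return x, y

# Quick check: the sample correlation should be close to rho.
x, y = rdownton(100_000, lam1=2.0, lam2=0.5, rho=0.4, seed=0)
print(np.corrcoef(x, y)[0, 1])   # approximately 0.4
```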
Statistical inference for the parameters of (1.1) has focused mainly on the correlation
coefficient ρ. Nagao and Kadoya (1971), Al-Saadi and Young (1980), and Balakrishnan
and Ng (2001) considered the estimation of ρ, and Al-Saadi, Scrimshaw and
Young (1979) the problem of testing the hypothesis ρ = 0. However, another interesting
problem is the estimation of λ = λ2 /λ1 , which represents the ratio of the means of the
two components. For example, an estimated value greater than one indicates that on the
average the first component is more reliable than the second one. Note that λ is also
the ratio of the scale parameters of X and Y . Estimation of λ in general scale families
including among others normal, exponential and inverse Gaussian has been considered
by many authors in the past. For a decision theoretic approach, see Gelfand and Dey
(1988), Madi and Tsui (1990), Kubokawa (1994), Madi (1995), Ghosh and Kundu (1996),

Kubokawa and Srivastava (1996) (who assume independence of the two components) and
Iliopoulos (2001) (who considers the problem of estimation of the ratio of variances in the
bivariate normal distribution).
Next, we outline the rest of the paper. In Section 2, an unbiased estimator, λ̂U , of λ is
derived based on a random sample from (1.1). Then, a class of inadmissible estimators with
respect to the mean squared error is constructed and it is shown that this class contains
λ̂U . Furthermore, some alternative biased estimators dominating λ̂U are presented. In
Section 3, unbiased estimators of the regression of X on Y , as well as of the conditional
variance of X given Y = y, are given. They are also shown to be inadmissible; improved
estimators are presented as well. Finally, an Appendix contains useful expressions for
expectations of geometric and negative binomial distributions as well as of the statistics
involved in the derivation of the results.

2 Estimation of the ratio of means

Let (X1, Y1), . . . , (Xn, Yn), n ≥ 2, be a random sample from (1.1) and K = (K1, . . . , Kn)
be the associated (unobserved) random sample from the geometric distribution π(· ; ρ)
such that, given Ki = ki, Xi is independent of Yi, i = 1, . . . , n. Since each Ki is related
only to (Xi, Yi), it is clear that, conditionally on K = k = (k1, . . . , kn), all X's and Y's
are independent. Set K = ΣKi, k = Σki, and note that K follows a negative binomial
distribution.
By considering the joint distribution of the data, it is easily seen that the sufficient
statistic is (X1Y1, . . . , XnYn, ΣXi, ΣYi). Setting S1 = ΣXi, S2 = ΣYi, U =
(U1, . . . , Un) = (X1S1⁻¹, . . . , XnS1⁻¹), and V = (V1, . . . , Vn) = (Y1S2⁻¹, . . . , YnS2⁻¹), we
obtain the one-to-one transformation (U1V1, . . . , UnVn, S1, S2), which is also sufficient.
Conditionally on K = k, S1 and S2 are independent and Si ∼ Gamma(n + k, (1 − ρ)/λi),
i = 1, 2. Moreover, from a well-known characterization of the gamma distribution, (S1, S2)
is independent of (U, V), and U, V are iid from an (n − 1)-variate Dirichlet distribution
with parameters k1 + 1, . . . , kn−1 + 1, kn + 1.
Consider the estimation problem of λ = λ2 /λ1 . Nagao and Kadoya (1971) showed that
the maximum likelihood estimators (mles) of λ1 and λ2 are 1/X̄ and 1/Ȳ respectively,
thus the mle of λ is λ̂mle = S1 /S2 . Using Lemma 4.1(vii) in the Appendix, we obtain the
expectation of this estimator:
$$ E[S_1/S_2] = E[E(S_1/S_2 \mid K)] = \lambda\, E\!\left[\frac{n+K}{n+K-1}\right] = \lambda\, \frac{n-\rho}{n-1}. \qquad (2.1) $$
Hence, λ̂mle is biased. For deriving an unbiased estimator of λ it is necessary to employ
an estimator of the correlation coefficient ρ. There are two classes of estimators of ρ in
the literature: (i) estimators based on the statistic T = ΣXiYi/(S1S2) = ΣUiVi (such
as the moment estimator) and (ii) estimators based on the sample correlation coefficient
R, see Al-Saadi and Young (1980) and Balakrishnan and Ng (2001). However, R is not
a function of the sufficient statistic, whereas T is. Therefore, T has been chosen for our
purposes. Note also that the problem of estimation of λ remains invariant under the group
of transformations (Xi , Yi ) −→ (c1 Xi , c2 Yi ), i = 1, . . . , n, and equivariant estimators of λ
are of the form ψ(U1 V1 , . . . , Un Vn )S1 /S2 . A particular choice for ψ can be of the form
ψ(T ), giving more justification to T .

The conditional expectation of T given K = k is
$$ E[T \mid K = k] = \sum_{i=1}^{n} E[U_i V_i \mid K = k] = \sum_{i=1}^{n} \{E(U_i \mid K = k)\}^2 = \sum_{i=1}^{n} \frac{(k_i+1)^2}{(n+k)^2}. $$
Since T is a function of U and V alone, it follows that, conditionally on K, it is also
independent of S1, S2. Therefore,
$$ \begin{aligned} E[T S_1/S_2] &= E[E(T S_1/S_2 \mid K)] = E[E(T \mid K)\, E(S_1/S_2 \mid K)] \\ &= \lambda\, E\!\left[\sum_{i=1}^{n} \frac{(K_i+1)^2}{(n+K)(n+K-1)}\right] \\ &= \lambda\, E\!\left[n (n+K)^{-1} (n+K-1)^{-1} E\!\left((K_1+1)^2 \mid K\right)\right] \\ &= \lambda\, E\!\left[\frac{n+1+2K}{(n+1)(n+K-1)}\right] = \lambda \left(\frac{1}{n-1} + \frac{n-3}{n^2-1}\, \rho\right) \qquad (2.2) \end{aligned} $$
(see Lemma 4.1). From (2.1), (2.2) it can be seen that each of E[S1 /S2 ], E[T S1 /S2 ] equals λ
times a first degree polynomial in ρ. The derivation of an unbiased estimator of λ which is a
function of S1 , S2 and T is equivalent to finding c0 , c1 such that E[c0 S1 /S2 +c1 T S1 /S2 ] = λ.
Solving the linear equations
$$ \frac{n}{n-1}\, c_0 + \frac{1}{n-1}\, c_1 = 1, \qquad -\frac{1}{n-1}\, c_0 + \frac{n-3}{n^2-1}\, c_1 = 0, $$
we obtain c0 = (n − 3)/(n − 1), c1 = (n + 1)/(n − 1). Thus, we have proved the following
proposition.

Proposition 2.1. The estimator
$$ \hat\lambda_U = \frac{n-3+(n+1)T}{n-1}\, \frac{S_1}{S_2} \qquad (2.3) $$
is unbiased for λ = λ2/λ1.
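As a small worked illustration, here is a sketch (my own code; the name lambda_hat_U is invented) computing the statistics S1, S2, T and the estimator (2.3) from a sample.

```python
import numpy as np

def lambda_hat_U(x, y):
    """Unbiased estimate of lambda = lambda_2/lambda_1 via (2.3); needs n >= 2."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    s1, s2 = x.sum(), y.sum()
    t = np.sum(x * y) / (s1 * s2)              # T = sum(X_i Y_i) / (S1 S2)
    return (n - 3 + (n + 1) * t) / (n - 1) * s1 / s2
```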

For n ≥ 3, the variance of λ̂U is
$$ \begin{aligned} \mathrm{Var}(\hat\lambda_U) &= E\left[\left(\frac{n-3+(n+1)T}{n-1}\, \frac{S_1}{S_2}\right)^{2}\right] - \lambda^2 \\ &= \left(\frac{n-3}{n-1}\right)^{2} E\!\left[S_1^2/S_2^2\right] + \frac{2(n-3)(n+1)}{(n-1)^2}\, E\!\left[T S_1^2/S_2^2\right] + \left(\frac{n+1}{n-1}\right)^{2} E\!\left[T^2 S_1^2/S_2^2\right] - \lambda^2, \end{aligned} $$
and substituting the expectations from Lemma 4.2 we get
$$ \mathrm{Var}(\hat\lambda_U) = \left(\frac{2n^2-5n+5}{(n-2)(n-1)^2} - \frac{2(n^3-3n+10)}{(n^2-4)(n-1)^2}\, \rho + \frac{n^3+6n^2-5n+38}{(n+3)(n^2-4)(n-1)^2}\, \rho^2\right) \lambda^2. $$

Consider the class of estimators of λ,

C = {λ̂a1 ,a2 = (a1 + a2 T )S1 /S2 , a1 , a2 ∈ R} .

The unbiased estimator λ̂U as well as the mle λ̂mle are members of C for a1 = a1U =
(n − 3)/(n − 1), a2 = a2U = (n + 1)/(n − 1) and a1 = 1, a2 = 0, respectively. We would
like to characterize inadmissible estimators within C in terms of mean squared error (mse).
By invariance, the (scaled) mse λ⁻² E_{λ1,λ2,ρ}(λ̂a1,a2 − λ)² does not depend on λ1, λ2. Thus,
without loss of generality, we assume for the rest of the section that λ1 = λ2 = 1 and
denote the mse of λ̂a1,a2 by mse(a, ρ), where a = (a1, a2)′.

Fix ρ ∈ [0, 1]. Then, for n ≥ 3, mse(a, ρ) is strictly convex in a and there exists a
minimizing point a0(ρ) = (a10(ρ), a20(ρ))′ with
$$ a_{10}(\rho) = (n-2)\, q_1(\rho)/q_2(\rho), \qquad a_{20}(\rho) = 3(n-2)(n+1)(n+2)(n+3)\, \rho(1-\rho)^2/q_2(\rho), $$
where
$$ \begin{aligned} q_1(\rho) ={}& (n+1)(n+2)(n+3) + 4(n-6)(n+1)(n+3)\rho \\ &- (n-5)(3n^2+29n+30)\rho^2 + 2(n^3-11n-46)\rho^3, \\ q_2(\rho) ={}& (n+1)^2(n+2)(n+3) + 4(n-6)(n+1)^2(n+3)\rho \\ &- (3n^4+32n^3-77n^2-382n-312)\rho^2 \\ &+ 2(n^4+4n^3+25n^2-126n-256)\rho^3 - 3(n^3+13n-94)\rho^4. \end{aligned} $$

As expected, a0(0) = ((n − 2)/(n + 1), 0)′, i.e., in the case of two independent
exponential samples the best estimator within C coincides with the best equivariant
estimator of λ. On the other hand, a0(1) = (1, 0)′, that is, the best estimator in this case
is the mle. Notice here that the mse of the mle tends to zero as ρ → 1. To see this
without evaluating it, observe that since the support of Xi, Yi is (0, ∞), ρ = 1 implies
that λ1Xi = λ2Yi with probability one.

Setting
$$ B(\rho) = \begin{pmatrix} E_{\lambda_1=\lambda_2=1,\rho}(S_1^2/S_2^2) & E_{\lambda_1=\lambda_2=1,\rho}(T S_1^2/S_2^2) \\ E_{\lambda_1=\lambda_2=1,\rho}(T S_1^2/S_2^2) & E_{\lambda_1=\lambda_2=1,\rho}(T^2 S_1^2/S_2^2) \end{pmatrix}, $$
the mse of λ̂a1,a2 can be expressed as
$$ \mathrm{mse}(a, \rho) = [a - a_0(\rho)]'\, B(\rho)\, [a - a_0(\rho)] + \mathrm{mse}(a_0(\rho), \rho). $$

Let
$$ E(a, \rho) = \left\{ c \in \mathbb{R}^2 : [c - a_0(\rho)]'\, B(\rho)\, [c - a_0(\rho)] < \mathrm{mse}(a, \rho) - \mathrm{mse}(a_0(\rho), \rho) \right\} $$
be the interior of the ellipse consisting of the points c = (c1, c2)′ such that λ̂c1,c2 has
the same mse as λ̂a1,a2 for the particular ρ. Then, λ̂a1,a2 is admissible within C if and only
if
$$ \bigcap_{\rho \in [0,1)} E(a, \rho) = \emptyset. $$

This condition is clearly satisfied by λ̂a10(ρ),a20(ρ) for every ρ ∈ [0, 1), implying that these
estimators are admissible within C. By the continuity of the mse, this also holds for the mle.
However, determining the above intersection does not in general seem to admit an
analytical solution. Instead, we can identify a subclass of C containing inadmissible
estimators λ̂a1,a2 by fixing a1 or a2 one at a time.

Fix first a1. Then the mse of λ̂a1,a2 is quadratic in a2 and uniquely minimized at
a2 = a2*(a1, ρ) given by
$$ a_2^*(a_1, \rho) = (n+2)(n+3)\, \frac{(n-2)[n+1+(n-3)\rho] - [(n+1)^2 + (n^2-5n-12)\rho - 3(n-5)\rho^2]\, a_1}{(n+2)(n+3)^2 + 2(n+3)(n^2+n-26)\rho + (n^3-8n^2-27n+178)\rho^2}. \qquad (2.4) $$
Since the denominator in (2.4) is positive for every ρ ∈ [0, 1] and n ≥ 3, a2*(a1, ρ) is
bounded. Let $\underline{a}_2^*(a_1) = \inf_{\rho\in[0,1)} a_2^*(a_1,\rho)$ and $\overline{a}_2^*(a_1) = \sup_{\rho\in[0,1)} a_2^*(a_1,\rho)$. Then we have
the following.

Proposition 2.2. (i) If $a_2 \notin A_2^*(a_1) = [\underline{a}_2^*(a_1), \overline{a}_2^*(a_1)]$ then λ̂a1,a2 is inadmissible, being
dominated by $\hat\lambda_{a_1, \underline{a}_2^*(a_1)}$ if $a_2 < \underline{a}_2^*(a_1)$ or by $\hat\lambda_{a_1, \overline{a}_2^*(a_1)}$ if $a_2 > \overline{a}_2^*(a_1)$.
(ii) In particular, if a1 ≤ a11 = (n − 7)/(n − 1) then
$$ \underline{a}_2^*(a_1) = a_2^*(a_1, 1) = \frac{(n+2)(n+3)}{2(n+5)}\, (1 - a_1), \qquad (2.5) $$
$$ \overline{a}_2^*(a_1) = a_2^*(a_1, 0) = \frac{(n+1)^2}{n+3} \left(\frac{n-2}{n+1} - a_1\right), \qquad (2.6) $$
whereas, if a1 ≥ a12 = (n³ + 2n² − 41n − 34)/[(n − 1)(n² + 9n + 10)],
$$ \underline{a}_2^*(a_1) = a_2^*(a_1, 0), \qquad \overline{a}_2^*(a_1) = a_2^*(a_1, 1), $$
where a2*(a1, 0) and a2*(a1, 1) are as in (2.6) and (2.5), respectively.

Proof. Part (i) is a consequence of the convexity of the mse in a2. Part (ii) arises from
the monotonicity of a2*(a1, ρ) with respect to ρ. Specifically, for a1 ≤ a11, a2*(a1, ρ) is
strictly decreasing in ρ, whereas for a1 ≥ a12 it is strictly increasing. This can be seen by
examining the sign of the derivative of a2*(a1, ρ) with respect to ρ, which is proportional to
the quadratic
$$ \begin{aligned} &(n+3)\left[(n-1)(n^2+9n+10)a_1 - (n^3+2n^2-41n-34)\right] \\ &\quad + 2\left[(n-1)(n^3-35n-46)a_1 - (n+1)(n^3-8n^2-27n+178)\right]\rho \\ &\quad + \left[(n-1)(n^3-4n^2-19n+102)a_1 - (n-3)(n^3-8n^2-27n+178)\right]\rho^2. \end{aligned} $$
The rest of the proof is elementary (although messy) and is therefore omitted.

Remark 2.1. When $a_2 < \underline{a}_2^*(a_1)$, by the convexity of the mean squared error, λ̂a1,a2 is
dominated not only by $\hat\lambda_{a_1, \underline{a}_2^*(a_1)}$ but by any estimator λ̂a1,a2′ with $a_2' \in (a_2, \underline{a}_2^*(a_1)]$ (a
similar argument applies when $a_2 > \overline{a}_2^*(a_1)$). Nevertheless, $\hat\lambda_{a_1, \underline{a}_2^*(a_1)}$ is the best among
these estimators and is therefore the only one mentioned in Proposition 2.2.

In a similar way, by fixing a2 and letting a1 vary, one can obtain an analogous result.
In this case the mse is quadratic in a1 and uniquely minimized at a1*(a2, ρ) given by
$$ a_1^*(a_2, \rho) = \frac{(n+1)(n-2)(n-\rho) - [(n+1)^2 + (n^2-5n-12)\rho - 3(n-5)\rho^2]\, a_2}{(n+1)[n(n+1) - 4(n+1)\rho + 6\rho^2]}. $$
The denominator is always positive, thus a1*(a2, ρ) is bounded for ρ ∈ [0, 1]. Setting
$\underline{a}_1^*(a_2) = \inf_{\rho\in[0,1)} a_1^*(a_2,\rho)$ and $\overline{a}_1^*(a_2) = \sup_{\rho\in[0,1)} a_1^*(a_2,\rho)$, we derive the following.

Proposition 2.3. (i) If $a_1 \notin A_1^*(a_2) = [\underline{a}_1^*(a_2), \overline{a}_1^*(a_2)]$ then λ̂a1,a2 is inadmissible, being
dominated by $\hat\lambda_{\underline{a}_1^*(a_2), a_2}$ if $a_1 < \underline{a}_1^*(a_2)$ or by $\hat\lambda_{\overline{a}_1^*(a_2), a_2}$ if $a_1 > \overline{a}_1^*(a_2)$.
(ii) In particular, if a2 ≤ a21 = 3n(n + 1)/[(n − 1)(n + 2)] then
$$ \underline{a}_1^*(a_2) = a_1^*(a_2, 0) = \frac{n-2}{n+1} - \frac{a_2}{n}, \qquad (2.7) $$
$$ \overline{a}_1^*(a_2) = a_1^*(a_2, 1) = 1 - \frac{2a_2}{n+1}, \qquad (2.8) $$
whereas, if a2 ≥ a22 = 3(n + 1)/(n − 1),
$$ \underline{a}_1^*(a_2) = a_1^*(a_2, 1), \qquad \overline{a}_1^*(a_2) = a_1^*(a_2, 0), $$
where a1*(a2, 0) and a1*(a2, 1) are as in (2.7) and (2.8), respectively.

Propositions 2.2 and 2.3 provide necessary conditions for the admissibility of λ̂a1 ,a2
within C as stated in Corollary 2.1 below.

Corollary 2.1. Two necessary conditions for the admissibility of λ̂a1 ,a2 within C are
a1 ∈ A∗1 (a2 ) and a2 ∈ A∗2 (a1 ).

Typically, unbiased estimators of scale parameters (as λ is for the distribution of S1/S2)
are inadmissible in terms of mean squared error. In our case, the inadmissibility of the
unbiased estimator λ̂U follows from Proposition 2.2, since a1U > a12 and $a_{2U} > \overline{a}_2^*(a_{1U}) =
(n+2)(n+3)/[(n-1)(n+5)]$.

Corollary 2.2. The unbiased estimator λ̂U is inadmissible in terms of mean squared error,
being dominated by
$$ \hat\lambda_U^* = \hat\lambda_{a_{1U}, \overline{a}_2^*(a_{1U})} = \left[ \frac{n-3}{n-1} + \frac{(n+2)(n+3)}{(n-1)(n+5)}\, T \right] \frac{S_1}{S_2}. \qquad (2.9) $$

Consider now the broader class of estimators

D = {λ̂φ = φ(T )S1 /S2 } ,

where φ(·) is any function such that λ̂φ has finite mse. Using Stein’s (1964) technique,
originally presented for improving the best equivariant estimator of a normal variance
when the mean is unknown, one concludes that λ̂∗U in (2.9) as well as a large subset of C
are inadmissible estimators. To be specific, consider the conditional mean squared error
of λ̂φ given T = t, K = k,
$$ E\left\{ \left[\phi(T)\, S_1/S_2 - \lambda\right]^2 \mid T = t, K = k \right\}, $$
which is quadratic in φ(t) and uniquely minimized at
$$ \phi_k(t) = \frac{\lambda\, E[S_1/S_2 \mid T=t, K=k]}{E[S_1^2/S_2^2 \mid T=t, K=k]} = \frac{n+k-2}{n+k+1} = \phi_k^*, $$

say. Note that φk(t) does not depend on t, since conditionally on K = k, the statistics
S1, S2 and T are mutually independent. Moreover, φ*k is strictly increasing in k, with
φ*0 = (n − 2)/(n + 1) and lim_{k→∞} φ*k = 1. As a consequence, each estimator of the form
λ̂φ with P[φ(T) ∉ [(n − 2)/(n + 1), 1]] > 0 is inadmissible, being dominated by the estimator
φ*(T)S1/S2, where φ*(T) = max{(n − 2)/(n + 1), min[φ(T), 1]}. Application of the above
argument to the class C leads to the following proposition.

Proposition 2.4. (i) If a1 ∉ [(n − 2)/(n + 1), 1] or a2 ∉ [(n − 2)/(n + 1) − a1, 1 − a1],
then the estimator λ̂a1,a2 is inadmissible, being dominated by
$$ \max\{(n-2)/(n+1),\ \min[a_1 + a_2 T,\ 1]\}\, S_1/S_2. $$
(ii) In particular, λ̂∗U in (2.9) is dominated by λ̂∗∗U = min{λ̂∗U, λ̂mle}.
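For concreteness, here is a sketch (my own code; the names are invented) collecting the estimators of λ discussed in this section: the mle, the unbiased estimator (2.3), the improvement (2.9), and the truncated version of Proposition 2.4(ii).

```python
import numpy as np

def lambda_estimators(x, y):
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    s1, s2 = x.sum(), y.sum()
    t = np.sum(x * y) / (s1 * s2)
    mle = s1 / s2                                              # S1/S2
    lam_U = (n - 3 + (n + 1) * t) / (n - 1) * mle              # (2.3)
    lam_U_star = ((n - 3) / (n - 1)
                  + (n + 2) * (n + 3) / ((n - 1) * (n + 5)) * t) * mle  # (2.9)
    lam_U_2star = min(lam_U_star, mle)                         # Proposition 2.4(ii)
    return {"mle": mle, "U": lam_U, "U_star": lam_U_star, "U_2star": lam_U_2star}
```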

The mse of λ̂∗∗U cannot be derived in closed form, therefore an analytical comparison
with λ̂mle is impossible. However, it is easy to compare the latter with λ̂∗U. Table 1 shows,
for selected sample sizes, the corresponding values of the correlation coefficient for which
both estimators have equal mean squared errors. When ρ is less than the reported value,
λ̂∗U is superior to λ̂mle, and vice versa. Since λ̂∗∗U dominates λ̂∗U, it follows that for ρ less
than the reported value λ̂∗∗U dominates λ̂mle as well. (In fact, a Monte Carlo study showed
that λ̂∗∗U and λ̂mle have equal mean squared errors when ρ is approximately 0.05 higher
than the values given in Table 1.) It can be concluded that λ̂∗∗U should be preferred, unless
almost perfect linear correlation is suspected.

n    3        5        10       20       50       100
ρ    0.8463   0.8317   0.8433   0.8663   0.9004   0.9239

Table 1. Values of ρ for which λ̂mle and λ̂∗U have equal mean squared errors.
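A rough Monte Carlo sketch along the lines of this comparison (my own code; the replication count and seed are arbitrary choices, not those of the paper) is given below. For n = 10 and ρ = 0.8433 (the crossing value in Table 1), the estimated mses of the mle and of λ̂∗U should roughly coincide, while λ̂∗∗U should do slightly better.

```python
import numpy as np

def simulate_mses(n=10, rho=0.8433, lam1=1.0, lam2=1.0, reps=200_000, seed=0):
    rng = np.random.default_rng(seed)
    lam = lam2 / lam1
    # Sample via the geometric-mixture representation of Section 1.
    k = rng.geometric(1.0 - rho, size=(reps, n)) - 1
    x = rng.gamma(k + 1, (1.0 - rho) / lam1)
    y = rng.gamma(k + 1, (1.0 - rho) / lam2)
    s1, s2 = x.sum(axis=1), y.sum(axis=1)
    t = (x * y).sum(axis=1) / (s1 * s2)
    mle = s1 / s2
    star = ((n - 3) / (n - 1)
            + (n + 2) * (n + 3) / ((n - 1) * (n + 5)) * t) * mle
    two_star = np.minimum(star, mle)
    return {name: np.mean((est - lam) ** 2)
            for name, est in [("mle", mle), ("star", star), ("two_star", two_star)]}

print(simulate_mses())
```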

3 Estimation of the regression and the conditional variance

Consider now estimation of the regression of X on Y based on a random sample from


(1.1). Downton (1970) showed that the conditional expectation of X given Y = y is linear
in y, specifically,
$$ \eta(y) = E[X \mid Y = y] = \frac{1-\rho}{\lambda_1} + \rho\, \frac{\lambda_2}{\lambda_1}\, y. $$
Obviously, for deriving an unbiased estimator of η(y) it suffices to derive unbiased estima-
tors of η1 = (1 − ρ)/λ1 and η2 = ρλ2 /λ1 .

Proposition 3.1. (i) The estimator
$$ \hat\eta_{1U} = \frac{2 - (n+1)T}{n-1}\, S_1 $$
is unbiased for η1 = (1 − ρ)/λ1.

(ii) The estimator
$$ \hat\eta_{2U} = \frac{n+1}{n-1}\, (nT - 1)\, \frac{S_1}{S_2} $$
is unbiased for η2 = ρλ2/λ1.

Proof. (i) The problem is similar to the derivation of λ̂U in (2.3). We have to
find c0, c1 such that E[(c0 + c1T)S1] = (1 − ρ)/λ1. Using Lemma 4.2(i), (ii), it suffices
to solve the equations nc0 + c1 = 1 and ((n − 1)/(n + 1)) c1 = −1 for c0 and c1. The
solution is c0 = 2/(n − 1) and c1 = −(n + 1)/(n − 1); hence η̂1U is an unbiased estimator
of η1 = (1 − ρ)/λ1.
(ii) Similarly, we need to find c0, c1 such that E[(c0 + c1T)S1/S2] = ρλ2/λ1. Using (2.1)
and (2.2), we get the equations
$$ \frac{n}{n-1}\, c_0 + \frac{1}{n-1}\, c_1 = 0, \qquad -\frac{1}{n-1}\, c_0 + \frac{n-3}{n^2-1}\, c_1 = 1, $$
whose solution is c0 = −(n + 1)/(n − 1) and c1 = n(n + 1)/(n − 1), yielding η̂2U as an
unbiased estimator of η2 = ρλ2/λ1.

Corollary 3.1. The estimator
$$ \hat\eta_U(y) = \frac{2 - (n+1)T}{n-1}\, S_1 + \frac{n+1}{n-1}\, (nT - 1)\, \frac{S_1}{S_2}\, y $$
is unbiased for η(y).
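A minimal sketch of this regression estimator (my own code and notation) is:

```python
import numpy as np

def eta_hat_U(x, y, y0):
    """Unbiased estimate of the regression eta(y0) = E[X | Y = y0]."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    s1, s2 = x.sum(), y.sum()
    t = np.sum(x * y) / (s1 * s2)
    eta1 = (2 - (n + 1) * t) / (n - 1) * s1            # unbiased for (1 - rho)/lambda_1
    eta2 = (n + 1) / (n - 1) * (n * t - 1) * s1 / s2   # unbiased for rho*lambda_2/lambda_1
    return eta1 + eta2 * y0
```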

The estimator η̂U(y) is inadmissible for every y, since it assumes negative values with
positive probability. A rather crude improved estimator is its positive part, η̂U⁺(y) =
max{0, η̂U(y)}, which has smaller risk under any convex loss function. However, the same
holds for η̂1U and η̂2U, and it seems rational to improve on them first and use their
improvements to estimate the regression.

An estimator dominating η̂1U in terms of mean squared error can be derived using
Stein’s (1964) technique. Consider the conditional mean squared error of estimators of

the form φ(T)S1 given T = t, K = k,
$$ E\left\{ \left[\phi(T)\, S_1 - (1-\rho)\lambda_1^{-1}\right]^2 \mid T = t, K = k \right\}, $$
which is quadratic in φ(t) and uniquely minimized at
$$ \phi_k(t) = \frac{(1-\rho)\lambda_1^{-1}\, E[S_1 \mid T=t, K=k]}{E[S_1^2 \mid T=t, K=k]} = \frac{1}{n+k+1} = \phi_k^*, $$
say. Now, φ*k is positive, attaining its maximum when k = 0, i.e., 0 < φ*k ≤ φ*0 =
(n + 1)⁻¹. As a consequence, each estimator of the form φ(T)S1 with P[φ(T) ∉ [0, (n +
1)⁻¹]] > 0 is inadmissible, being dominated by the estimator φ*(T)S1, where φ*(T) =
max{0, min[φ(T), (n + 1)⁻¹]}. Since P[(2 − (n + 1)T)/(n − 1) ∉ [0, (n + 1)⁻¹]] > 0, η̂1U is
dominated by the estimator
$$ \hat\eta_1^* = \begin{cases} S_1/(n+1), & T < (n+3)/(n+1)^2, \\ \hat\eta_{1U}, & (n+3)/(n+1)^2 \le T \le 2/(n+1), \\ 0, & T > 2/(n+1). \end{cases} \qquad (3.1) $$

In a similar fashion we can improve on η̂2U. Note that it contains the quantity nT − 1,
which is the estimator of ρ obtained by Nagao and Kadoya (1971) using the method of
moments. Using the condition 0 ≤ ρ < 1, Al-Saadi and Young (1980) modified this
estimator to
$$ \tilde\rho = \begin{cases} 0, & T < 1/n, \\ nT - 1, & 1/n \le T \le 2/n, \\ 1, & T > 2/n. \end{cases} $$
Replacing nT − 1 in η̂2U by max{nT − 1, 0} leads to its positive part, η̂2U⁺ =
max{0, η̂2U}, which is an improved estimator of ρλ2/λ1. Replacing nT − 1 by ρ̃ also seems
reasonable, leading to the estimator
$$ \tilde\eta_2 = \begin{cases} 0, & T < 1/n, \\ \hat\eta_{2U}, & 1/n \le T \le 2/n, \\ \dfrac{n+1}{n-1}\, \dfrac{S_1}{S_2}, & T > 2/n. \end{cases} \qquad (3.2) $$
However, using Stein’s (1964) technique we can find an estimator dominating all these
estimators. Consider the class of estimators of ρλ2 /λ1 having the form ψ(T )S1 /S2 . The
conditional mean squared error given T = t, K = k of such an estimator is uniquely
minimized with respect to ψ(t) at

$$ \psi_k(t) = \frac{\rho \lambda_2 \lambda_1^{-1}\, E[S_1/S_2 \mid T=t, K=k]}{E[S_1^2/S_2^2 \mid T=t, K=k]} = \rho\, \frac{n+k-2}{n+k+1} = \psi_k^*(\rho), $$


say. Since 0 ≤ ψ*k(ρ) ≤ ρ < 1, any estimator of the form ψ(T)S1/S2 satisfying P[ψ(T) ∉
[0, 1]] > 0 is inadmissible. Indeed, it is dominated by ψ*(T)S1/S2, where ψ*(T) =
max{0, min[ψ(T), 1]}. Thus, η̂2U and η̂2U⁺ are dominated by
$$ \hat\eta_2^* = \begin{cases} 0, & T < 1/n, \\ \hat\eta_{2U}, & 1/n \le T \le 2/(n+1), \\ S_1/S_2, & T > 2/(n+1). \end{cases} \qquad (3.3) $$
From (3.2) and (3.3), it is obvious that η̂2* also dominates η̃2.
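The truncated estimators (3.1) and (3.3) translate directly into code; the following sketch (mine, with invented names) implements the case boundaries exactly as displayed above.

```python
import numpy as np

def eta1_star(x, y):
    """Improved estimator (3.1) of (1 - rho)/lambda_1."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    s1, s2 = x.sum(), y.sum()
    t = np.sum(x * y) / (s1 * s2)
    if t < (n + 3) / (n + 1) ** 2:
        return s1 / (n + 1)
    if t > 2 / (n + 1):
        return 0.0
    return (2 - (n + 1) * t) / (n - 1) * s1            # eta_hat_1U

def eta2_star(x, y):
    """Improved estimator (3.3) of rho*lambda_2/lambda_1."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    s1, s2 = x.sum(), y.sum()
    t = np.sum(x * y) / (s1 * s2)
    if t < 1 / n:
        return 0.0
    if t > 2 / (n + 1):
        return s1 / s2
    return (n + 1) / (n - 1) * (n * t - 1) * s1 / s2   # eta_hat_2U
```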

Remark. The estimators η̂1∗, η̂2∗ in (3.1), (3.3), respectively, have the property of "pre-
testing" for ρ. For example, when T is "small" (smaller than (n + 3)/(n + 1)²), indicating
ρ = 0, η̂1∗ equals the best equivariant estimator of 1/λ1 with respect to squared error
loss, S1/(n + 1). On the other hand, when T is "large" (greater than 2/(n + 1)), indicating
that ρ is very close to one, η̂1∗ equals zero. Analogous comments hold for η̂2∗.

The percentage improvements in terms of mean squared error of the estimators η̂1∗, η̂2∗
over η̂1U, η̂2U, respectively, have been evaluated by Monte Carlo sampling from (1.1), for
sample sizes n = 10, 20, 50 and ρ = 0, 0.1, . . . , 0.9, with 10000 replications for each
pair (n, ρ). The results are shown in Table 2. It can be seen that the improvements are
remarkable even for n = 50. Generally, they are larger for extreme values of ρ. This can
be explained by the nature of the improved estimators, as indicated in the above remark.

ρ             0.0    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
n = 10  η̂1∗  39.4   41.8   43.1   44.3   46.2   49.4   53.9   58.9   64.1   67.7
        η̂2∗  46.9   45.4   43.3   42.1   42.3   44.3   47.9   53.0   59.4   65.2
n = 20  η̂1∗  32.1   31.5   27.9   25.8   27.5   32.7   41.0   50.5   58.7   64.9
        η̂2∗  42.8   36.4   28.9   24.8   25.4   29.4   36.7   45.6   54.5   62.5
n = 50  η̂1∗  28.1   22.1   11.6    6.5    7.5   13.2   22.5   34.2   47.4   58.7
        η̂2∗  43.6   27.9   12.5    6.3    6.9   11.8   20.0   30.7   43.8   56.4

Table 2. Simulated percentage risk improvement in mean squared error of η̂1∗ in (3.1) and
η̂2∗ in (3.3) over η̂1U and η̂2U, respectively.

The conditional variance of X given Y = y is also linear in y. Specifically,
$$ \theta(y) = \mathrm{Var}(X \mid Y = y) = \left(\frac{1-\rho}{\lambda_1}\right)^{2} + 2\rho(1-\rho)\, \frac{\lambda_2}{\lambda_1^2}\, y. $$
Let θ1 = (1 − ρ)²/λ1², θ2 = 2ρ(1 − ρ)λ2/λ1². Then we have the following proposition.

Proposition 3.2. (i) The estimator θ̂1U = h1(T)S1², where
$$ h_1(T) = \frac{4(n+5) - 4(n+1)(n+5)T + (n+1)(n+2)(n+3)T^2}{(n-1)(n^2+5n+2)}, $$
is unbiased for θ1 = (1 − ρ)²/λ1².

(ii) The estimator θ̂2U = h2(T)S1²/S2, where
$$ h_2(T) = \frac{-4(n^2+7n+8) + 2(n+1)(3n^2+19n+18)T - 2(n+1)^2(n+2)(n+3)T^2}{(n-1)(n^2+5n+2)}, $$
is unbiased for θ2 = 2ρ(1 − ρ)λ2/λ1².

Proof. As in the proof of Proposition 3.1, the problem reduces to finding c0, c1, c2 and
d0, d1, d2 such that E[(c0 + c1T + c2T²)S1²] = θ1 and E[(d0 + d1T + d2T²)S1²/S2] = θ2 for
parts (i) and (ii), respectively. Using Lemma 4.2 and equating the coefficients of the
appropriate second degree polynomials in ρ, we obtain θ̂1U, θ̂2U as unbiased estimators of θ1, θ2.

Corollary 3.2. The estimator
$$ \hat\theta_U(y) = h_1(T)\, S_1^2 + y\, h_2(T)\, S_1^2/S_2 $$
is unbiased for θ(y).

The estimators θ̂1U, θ̂2U, and hence θ̂U(y), assume negative values with positive
probability. As in the estimation problem of η(y), we can improve on them by truncating
h1, h2 to suitable intervals. Omitting the details, an estimator of θ1 = (1 − ρ)²/λ1² of the
form φ(T)S1² satisfying P[φ(T) ∉ [0, 1/((n+2)(n+3))]] > 0 is dominated by φ*(T)S1², where
φ*(T) = max{0, min[φ(T), 1/((n+2)(n+3))]}, whereas an estimator of θ2 = 2ρ(1 − ρ)λ2/λ1²
of the form ψ(T)S1²/S2 with P[ψ(T) ∉ [0, 2(n−2)/((n+2)(n+3))]] > 0 is dominated by
ψ*(T)S1²/S2, where ψ*(T) = max{0, min[ψ(T), 2(n−2)/((n+2)(n+3))]}, provided n ≥ 6.
The functions h1, h2 satisfy the above conditions for n ≥ 3, thus θ̂1U, θ̂2U are dominated
by suitable estimators.
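A sketch of this truncation (my own code, following the clipping bounds just stated):

```python
import numpy as np

def theta_estimators(x, y):
    """Truncated versions of theta_hat_1U and theta_hat_2U (illustrative)."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    s1, s2 = x.sum(), y.sum()
    t = np.sum(x * y) / (s1 * s2)
    d = (n - 1) * (n ** 2 + 5 * n + 2)
    h1 = (4 * (n + 5) - 4 * (n + 1) * (n + 5) * t
          + (n + 1) * (n + 2) * (n + 3) * t ** 2) / d
    h2 = (-4 * (n ** 2 + 7 * n + 8) + 2 * (n + 1) * (3 * n ** 2 + 19 * n + 18) * t
          - 2 * (n + 1) ** 2 * (n + 2) * (n + 3) * t ** 2) / d
    # truncation bounds from the dominance argument above
    h1c = np.clip(h1, 0.0, 1.0 / ((n + 2) * (n + 3)))
    h2c = np.clip(h2, 0.0, 2.0 * (n - 2) / ((n + 2) * (n + 3)))
    return h1c * s1 ** 2, h2c * s1 ** 2 / s2
```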

4 Appendix

Lemma 4.1. Let K1, K2, . . . , Kn be a random sample from a geometric distribution with
probability mass function
$$ \pi_1(k_1; \rho) = P(K_1 = k_1; \rho) = (1 - \rho)\rho^{k_1}, \quad k_1 = 0, 1, 2, \ldots, \qquad (4.1) $$

and K = ΣKi. Then,

(i) $P(K_1 = k_1 \mid K = k) = \binom{n-2+k-k_1}{k-k_1} \binom{n+k-1}{k}^{-1}$, 0 ≤ k1 ≤ k,

(ii) $P(K_1 = k_1, K_2 = k_2 \mid K = k) = \binom{n-3+k-k_1-k_2}{k-k_1-k_2} \binom{n+k-1}{k}^{-1}$, 0 ≤ k1, k2, k1 + k2 ≤ k,

(iii) $E[(K_1+1)^2 \mid K = k] = 1 + \frac{3n+1}{n(n+1)}\, k + \frac{2}{n(n+1)}\, k^2$,

(iv) $E[(K_1+1)^2 (K_1+2)^2 \mid K = k] = 4\left(1 + \frac{8n^2+13n+3}{n(n+1)(n+3)}\, k + \frac{19n^2+41n+18}{n(n+1)(n+2)(n+3)}\, k^2 + \frac{18}{n(n+2)(n+3)}\, k^3 + \frac{6}{n(n+1)(n+2)(n+3)}\, k^4\right)$,

(v) $E[(K_1+1)^2 (K_2+1)^2 \mid K = k] = 1 + \frac{(2n+3)(3n+1)}{n(n+1)(n+3)}\, k + \frac{13n^2+29n+14}{n(n+1)(n+2)(n+3)}\, k^2 + \frac{12}{n(n+2)(n+3)}\, k^3 + \frac{4}{n(n+1)(n+2)(n+3)}\, k^4$,

(vi) $EK = \frac{n\rho}{1-\rho}$, $\quad EK^2 = \frac{n\rho\,(1+n\rho)}{(1-\rho)^2}$,

(vii) $E\left[\frac{n+K}{n+K-1}\right] = \frac{n-\rho}{n-1}$.

Proof. Parts (i), (ii) are applications of Bayes' theorem, whereas parts (iii)–(vi) are
straightforward. We will prove only part (vii).

Since K = ΣKi follows a negative binomial distribution with probability mass function
$$ \pi_n(k; \rho) = \binom{n+k-1}{k}\, \rho^k (1-\rho)^n, \quad k = 0, 1, 2, \ldots, $$
one has
$$ \begin{aligned} E\left[\frac{n+K}{n+K-1}\right] &= \sum_{k=0}^{\infty} \frac{n+k}{n+k-1} \binom{n+k-1}{k} \rho^k (1-\rho)^n \\ &= \sum_{k=0}^{\infty} (n+k)\, \frac{(n+k-2)!}{k!\,(n-1)!}\, \rho^k (1-\rho)^n \\ &= \frac{n(1-\rho)}{n-1} \sum_{k=0}^{\infty} \binom{n-1+k-1}{k} \rho^k (1-\rho)^{n-1} + \rho \sum_{k=0}^{\infty} \binom{n+k-1}{k} \rho^k (1-\rho)^n \\ &= \frac{n(1-\rho)}{n-1} + \rho = \frac{n-\rho}{n-1}. \end{aligned} $$
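Part (vii) is also easy to check numerically; the following quick simulation (my own code) compares the Monte Carlo average with the closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 5, 0.6
# K = sum of n geometric({0,1,2,...}) variables, i.e. negative binomial.
k = (rng.geometric(1.0 - rho, size=(1_000_000, n)) - 1).sum(axis=1)
print(np.mean((n + k) / (n + k - 1)))   # Monte Carlo estimate
print((n - rho) / (n - 1))              # exact value (n - rho)/(n - 1) = 1.1
```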

Lemma 4.2. Let (X1, Y1), (X2, Y2), . . . , (Xn, Yn) be a random sample from (1.1), and
S1 = ΣXi, S2 = ΣYi, T = ΣXiYi/(S1S2). Then,

(i) $E[S_1] = n\, \lambda_1^{-1}$,

(ii) $E[T S_1] = \left(1 + \frac{n-1}{n+1}\,\rho\right) \lambda_1^{-1}$,

(iii) $E[S_1^2] = n(n+1)\, \lambda_1^{-2}$,

(iv) $E[T S_1^2] = \left(n+1 + \frac{(n-1)(n+2)}{n+1}\,\rho - \frac{n-1}{n+1}\,\rho^2\right) \lambda_1^{-2}$,

(v) $E[T^2 S_1^2] = \left(\frac{n+3}{n+1} + \frac{2(n-1)(n+6)}{(n+1)(n+2)}\,\rho + \frac{(n-1)(n^2+n-18)}{(n+1)(n+2)(n+3)}\,\rho^2\right) \lambda_1^{-2}$,

(vi) $E[S_1^2/S_2] = \left(\frac{n(n+1)}{n-1} - \frac{2(n+1)}{n-1}\,\rho + \frac{2}{n-1}\,\rho^2\right) \lambda_2 \lambda_1^{-2}$,

(vii) $E[T S_1^2/S_2] = \left(\frac{n+1}{n-1} + \frac{n^2-2n-7}{n^2-1}\,\rho - \frac{2(n-3)}{n^2-1}\,\rho^2\right) \lambda_2 \lambda_1^{-2}$,

(viii) $E[T^2 S_1^2/S_2] = \left(\frac{n+3}{n^2-1} + \frac{2(n^2+3n-16)}{(n^2-1)(n+2)}\,\rho + \frac{n^3-4n^2-27n+78}{(n^2-1)(n+2)(n+3)}\,\rho^2\right) \lambda_2 \lambda_1^{-2}$,

(ix) $E[S_1^2/S_2^2] = \left(\frac{n(n+1)}{(n-1)(n-2)} - \frac{4(n+1)}{(n-1)(n-2)}\,\rho + \frac{6}{(n-1)(n-2)}\,\rho^2\right) \lambda_2^2 \lambda_1^{-2}$,

(x) $E[T S_1^2/S_2^2] = \left(\frac{n+1}{(n-1)(n-2)} + \frac{n^2-5n-12}{(n^2-1)(n-2)}\,\rho - \frac{3(n-5)}{(n^2-1)(n-2)}\,\rho^2\right) \lambda_2^2 \lambda_1^{-2}$,

(xi) $E[T^2 S_1^2/S_2^2] = \left(\frac{n+3}{(n^2-1)(n-2)} + \frac{2(n^2+n-26)}{(n^2-1)(n^2-4)}\,\rho + \frac{n^3-8n^2-27n+178}{(n^2-1)(n^2-4)(n+3)}\,\rho^2\right) \lambda_2^2 \lambda_1^{-2}$.

Proof. The marginal distribution of S1 is Gamma(n, 1/λ1), thus (i), (iii) are immediate.
Of the rest, we will prove only (v), since the proofs of the other parts are similar.

Let K = (K1, . . . , Kn) be a random sample from the geometric distribution (4.1) and
set Ui = XiS1⁻¹, Vi = YiS2⁻¹, i = 1, . . . , n. Then
$$ \begin{aligned} E[T^2 \mid K = k] &= E\left[\left(\sum_{i=1}^n U_i V_i\right)^{2} \,\Big|\, K = k\right] = E\left[\sum_{i=1}^n U_i^2 V_i^2 + \sum_{i=1}^n \sum_{j \ne i} U_i V_i U_j V_j \,\Big|\, K = k\right] \\ &= (n+k)^{-2}(n+k+1)^{-2} \left(\sum_{i=1}^n (k_i+1)^2(k_i+2)^2 + \sum_{i=1}^n \sum_{j \ne i} (k_i+1)^2(k_j+1)^2\right), \end{aligned} $$
$$ E[S_1^2 \mid K = k] = (n+k)(n+k+1)(1-\rho)^2 \lambda_1^{-2}, $$
yielding
$$ \begin{aligned} E[T^2 S_1^2] &= E[E(T^2 S_1^2 \mid K)] = E[E(T^2 \mid K)\, E(S_1^2 \mid K)] \\ &= E\left[\frac{n(K_1+1)^2(K_1+2)^2 + n(n-1)(K_1+1)^2(K_2+1)^2}{(n+K)(n+K+1)}\right] \left(\frac{1-\rho}{\lambda_1}\right)^{2} \\ &= E\left[\frac{n+3}{n+1} + \frac{4(n+5)}{(n+1)(n+3)}\, K + \frac{4(n+5)}{(n+1)(n+2)(n+3)}\, K^2\right] \left(\frac{1-\rho}{\lambda_1}\right)^{2}. \end{aligned} $$
Here the last equality follows from Lemma 4.1(iv), (v). Substituting the moments of K
from Lemma 4.1(vi) into the last expression, we obtain the desired result.

Acknowledgment
The author wishes to thank the referees for their suggestions which improved the results
and the presentation of the paper.

References
Al-Saadi, S. D., Scrimshaw, D. G. and Young, D. H. (1979). Tests for independence of
exponential variables. J. Statist. Comput. Simul., 9, 217–233.

Al-Saadi, S. D. and Young, D. H. (1980). Estimators for the correlation coefficient in a


bivariate exponential distribution. J. Statist. Comput. Simul., 11, 13–20.

Al-Saadi, S. D. and Young, D. H. (1982). A test for independence in a multivariate


exponential distribution with equal correlation coefficient. J. Statist. Comput. Simul.,
14, 219–227.

Balakrishnan, N. and Ng, H. K. T. (2001). Improved estimation of the correlation coeffi-


cient in a bivariate exponential distribution. J. Statist. Comput. Simul., 68, 173–184.

Downton, F. (1970). Bivariate exponential distributions in reliability theory. J. Roy.


Statist. Soc. B, 32, 408–417.

Gelfand, A. E. and Dey, D. K. (1988). On the estimation of a variance ratio. J. Statist.


Plann. Inference, 19, 121–131.

Ghosh, M. and Kundu, S. (1996). Decision theoretic estimation of the variance ratio.
Statist. Decisions, 14, 161–175.

Iliopoulos, G. (2001). Decision theoretic estimation of the ratio of variances in a bivariate


normal distribution. Ann. Inst. Statist. Math., 53, 436–446.

Kibble, W. F. (1941). A two-variate gamma type distribution. Sankhyā, 5, 137–150.

Kotz, S., Balakrishnan, N. and Johnson, N. L. (2000). Continuous Multivariate Distribu-
tions, Vol. 1, 2nd edition. New York: Wiley.

Kubokawa, T. (1994). Double shrinkage estimation of ratio of scale parameters. Ann.


Inst. Statist. Math., 46, 95–116.

Kubokawa, T. and Srivastava, M. S. (1996). Double shrinkage estimators of ratio of


variances. Multidimensional Statistical Analysis and Theory of Random Matrices (eds.
A.K. Gupta and V.L. Girko), 139–154, VSP, Netherlands.

Madi, T. M. (1995). On the invariant estimation of a normal variance ratio. J. Statist.


Plann. Inference, 44, 349–357.

Madi, T. M. and Tsui, K. W. (1990). Estimation of the ratio of the scale parameters
of two exponential distributions with unknown location parameters. Ann. Inst. Statist.
Math., 42, 77–87.

Moran, P. A. P. (1967). Testing for correlation between non-negative variates. Biometrika,


54, 385–394.

Nagao, M. and Kadoya, M. (1971). Two-variate exponential distribution and its numerical
table for engineering application. Bulletin of the Disaster Prevention Institute, Kyoto
University, 20, 183–215.

Stein, C. (1964). Inadmissibility of the usual estimator for the variance of a normal
distribution with unknown mean. Ann. Inst. Statist. Math., 16, 155–160.
