To cite this article: Surbhi Suman & G. N. Singh (2019) An ameliorated stratified two-stage
randomized response model for estimating the rare sensitive parameter under Poisson
distribution, Statistics, 53:2, 395-416, DOI: 10.1080/02331888.2019.1569665
1. Introduction
In real-life surveys, obtaining a response, let alone a reliable one, is difficult when the
questions may elicit discomforting, deplorable or incriminating answers because of the
sensitive nature of the characteristic under study, such as illegal cannabis use, sexual
behaviour or mental disorder. The randomized response technique is effective in reducing
the non-response rate and the response bias that arise from refusals and untruthful
answers. This pioneering work was initiated by Warner [1], who used a randomized device
bearing two questions, one on the sensitive character A and the other on its complement,
with sample units selected by simple random sampling with replacement. To reduce the risk
of privacy disclosure, Horvitz et al. [2] replaced the second question of Warner [1] with a
question on a non-sensitive attribute B unrelated to the sensitive attribute A. Greenberg
et al. [3] extended this unrelated-question model to the case where the proportion of the
non-sensitive attribute B is unknown. Further, randomized response techniques have been
modified for diverse circumstances by Moors [4], Fox and Tracy [5], Mangat and Singh [6],
Mangat [7], Ryu et al. [8], Mangat et al. [9] and Singh et al. [10,11], among others; in all
these cases the sample is drawn from the population by simple random sampling with
replacement. Hong et al. [12] addressed a stratified randomized response model under
proportional allocation. To improve this model, Kim and Warde [13] and Kim and Elam [14]
proposed stratified randomized response models using optimal allocation, and their work
was further extended by Kim and Elam [15], Adebola et al. [16] and others.
When the number of persons possessing a rare stigmatized characteristic in the population
is very small, a large sample is needed to estimate this number. Land [17] addressed this
problem by using the Poisson distribution under the randomized response model of
Greenberg et al. [3]. Building on Land [17], Lee et al. [18] and Lee et al. [20] developed
this approach for stratified sampling and stratified double sampling, according to whether
the stratum sizes are known or unknown, respectively.
Motivated by the above-mentioned works, we suggest an ameliorated two-stage unrelated
randomized response method for estimating the mean number of persons in a population
with a rare characteristic under a stratified sampling scheme using the Poisson
distribution. Depending on the availability of the stratum sizes, the work is extended to
a double stratified sampling scheme. The properties of the resultant estimators are
discussed for both cases, when the proportion of the unrelated rare non-sensitive attribute
is known and when it is unknown. Proportional and optimum allocation are considered in
detail. Empirical studies are carried out to support the theory.
2.1. When the proportion of an unrelated rare non-sensitive attribute (πb ) is known
Consider a finite population of size $N$ that is divided into $L$ strata of sizes $N_i$,
$i = 1, 2, \ldots, L$. A sample of size $n_i$ is selected from the ith stratum by simple
random sampling with replacement (SRSWR) such that the total sample size is
$n = \sum_{i=1}^{L} n_i$. In this section, it is assumed that the stratum size $N_i$ and the
proportion $\pi_{ib}$ of the unrelated rare non-sensitive attribute B in the ith stratum are
known. Each person selected in the sample from the ith stratum is requested to answer ‘yes’
or ‘no’ using the randomized response devices $(R_{1i}, R_{2i})$, which consist of decks of
cards (Table 1):
If statement (3) appears, the respondent repeats the draw without replacing the card. If
statement (3) appears again in the second draw, the interviewee reports ‘no’.
Following the instructions of the above randomized response model, the probability of
getting the answer ‘yes’ in the ith stratum is
$$\zeta_{i0} = U_i \pi_{ia} + (1 - U_i)\,(P_{1i}\pi_{ia} + P_{2i}\pi_{ib})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right), \tag{1}$$
where, for the ith stratum, $k_i$ is the total number of cards in device $R_{2i}$, and
$\pi_{ia}$ and $\pi_{ib}$ are the population proportions of the attributes A and B,
respectively. Since the attributes A and B are assumed to be very rare in the population,
for a large sample from the ith stratum, i.e., as $n_i \to \infty$ and $\zeta_{i0} \to 0$,
we have $n_i \zeta_{i0} = \lambda_{i0} > 0$. Therefore, Equation (1) can be rewritten as
$$\lambda_{i0} = U_i \lambda_{ia} + (1 - U_i)\,(P_{1i}\lambda_{ia} + P_{2i}\lambda_{ib})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right), \tag{2}$$
where $n_i \pi_{ia} = \lambda_{ia} > 0$ and $n_i \pi_{ib} = \lambda_{ib} > 0$ as
$\pi_{ia} \to 0$ and $\pi_{ib} \to 0$, respectively.
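The relationship between Equations (1) and (2) can be checked numerically. The following is a minimal Python sketch, not the authors' code; the function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
def redraw_factor(p_blank: float, k: int) -> float:
    """Factor 1 + P3 * k / (k - 1) multiplying the second-stage term in Equation (1)."""
    if k < 2:
        raise ValueError("the deck needs at least two cards")
    return 1.0 + p_blank * k / (k - 1)


def yes_probability(u, p1, p2, p3, k, pi_a, pi_b):
    """Probability zeta_i0 of a 'yes' answer in one stratum, Equation (1)."""
    return u * pi_a + (1.0 - u) * (p1 * pi_a + p2 * pi_b) * redraw_factor(p3, k)


def poisson_mean(n, u, p1, p2, p3, k, pi_a, pi_b):
    """Poisson parameter lambda_i0 = n_i * zeta_i0, Equation (2)."""
    return n * yes_probability(u, p1, p2, p3, k, pi_a, pi_b)


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not taken from the paper's tables).
    lam0 = poisson_mean(n=6000, u=0.5, p1=0.6, p2=0.2, p3=0.2, k=100,
                        pi_a=1e-4, pi_b=1e-4)
    print(f"lambda_i0 = {lam0:.4f}")
```

Setting `u = 1` reduces the device to a direct question, so `yes_probability` returns `pi_a`, which is a quick sanity check on the formula.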
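Let $x_{i1}, x_{i2}, \ldots, x_{in_i}$ be a random sample of $n_i$ observations from the ith
stratum that follows the Poisson distribution with parameter $\lambda_{i0}$.
The likelihood function of the random sample of $n_i$ observations is obtained as
$$L(x_{ij}, \lambda_{i0}) = \prod_{j=1}^{n_i} \frac{e^{-\lambda_{i0}}\,\lambda_{i0}^{x_{ij}}}{x_{ij}!}. \tag{3}$$
Taking the logarithm of both sides of Equation (3) gives
$$\log L = -n_i \lambda_{i0} + \log(\lambda_{i0}) \sum_{j=1}^{n_i} x_{ij} - \sum_{j=1}^{n_i} \log(x_{ij}!). \tag{4}$$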
Maximizing Equation (4) with respect to the parameter $\lambda_{ia}$ and simplifying, the
estimator $\hat{\lambda}_{ia}$ of the mean number of persons bearing the rare sensitive
characteristic in the ith stratum is given as
$$\hat{\lambda}_{ia} = \frac{1}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\left[\frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij} - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}\right]. \tag{5}$$
Therefore, the estimator $\hat{\lambda}_a$ of the mean number of persons in the population
with the rare sensitive characteristic ($\lambda_a$) is proposed under stratified sampling as
$$\hat{\lambda}_a = \sum_{i=1}^{L} \frac{W_i}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\left[\frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij} - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}\right], \tag{6}$$
where $W_i = N_i/N$.
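Equations (5) and (6) translate directly into a few lines of code. The sketch below is illustrative only: the function names and the tuple layout `(U, P1, P2, P3, k)` for a stratum's device are assumptions made here, not notation from the paper.

```python
def device_coefficients(u, p1, p2, p3, k):
    """Return (a, b), the multipliers of lambda_ia and lambda_ib in Equation (2)."""
    redraw = 1.0 + p3 * k / (k - 1)
    a = u + (1.0 - u) * p1 * redraw
    b = (1.0 - u) * p2 * redraw
    return a, b


def stratum_estimate(x, u, p1, p2, p3, k, lam_b):
    """Equation (5): estimate lambda_ia from the counts x of one stratum."""
    a, b = device_coefficients(u, p1, p2, p3, k)
    return (sum(x) / len(x) - b * lam_b) / a


def stratified_estimate(samples, weights, devices, lam_b):
    """Equation (6): weighted sum of the stratum estimates.

    samples : list of per-stratum observation lists
    weights : stratum weights W_i = N_i / N
    devices : list of (U, P1, P2, P3, k) tuples, one per stratum (assumed layout)
    lam_b   : known lambda_ib for each stratum
    """
    return sum(
        w * stratum_estimate(x, *dev, lb)
        for x, w, dev, lb in zip(samples, weights, devices, lam_b)
    )
```

Because the estimator is linear in the sample mean, feeding it a sample whose mean equals $\lambda_{i0}$ from Equation (2) returns $\lambda_{ia}$ exactly, which mirrors the unbiasedness argument.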
Theorem 2.1: The proposed estimator $\hat{\lambda}_a$ is an unbiased estimator of the
parameter $\lambda_a$.
Proof: Since the random variable $x_{ij}$ follows the Poisson distribution with parameter
$\lambda_{i0}$,
$$E(\hat{\lambda}_a) = \sum_{i=1}^{L} \frac{W_i}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\left[\frac{1}{n_i}\sum_{j=1}^{n_i} \lambda_{i0} - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}\right].$$
Substituting the value of $\lambda_{i0}$ from Equation (2) in the above expression and
simplifying, we have $E(\hat{\lambda}_a) = \lambda_a$.
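$$V(\hat{\lambda}_{ia}) = \frac{\frac{1}{n_i^2}\sum_{j=1}^{n_i} V(x_{ij})}{\left[U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right]^2} = \frac{\sum_{j=1}^{n_i} \lambda_{i0}}{n_i^2\left[U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right]^2}.$$
Substituting the value of $\lambda_{i0}$ from Equation (2) in the above equation and after
some algebraic simplification, we have
$$V(\hat{\lambda}_{ia}) = \frac{\lambda_{ia}}{n_i\left[U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right]} + \frac{(1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{n_i\left[U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right]^2}. \tag{9}$$
Substituting the value of $V(\hat{\lambda}_{ia})$ from Equation (9) in Equation (8), we obtain
the expression for the variance of the estimator $\hat{\lambda}_a$ as given in Equation (7).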
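Theorem 2.3: The unbiased estimate of the variance of the proposed estimator
$\hat{\lambda}_a$ is given by
$$\hat{V}(\hat{\lambda}_a) = \sum_{i=1}^{L} W_i^2\,\frac{\sum_{j=1}^{n_i} x_{ij}}{n_i^2\left[U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right]^2}. \tag{10}$$
Proof: Taking expectations on both sides of Equation (10) and using
$E(x_{ij}) = \lambda_{i0}$, since $x_{ij} \sim P(\lambda_{i0})$, it is easily shown that
$\hat{V}(\hat{\lambda}_a)$ is an unbiased estimator of $V(\hat{\lambda}_a)$.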
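The variance estimate in Equation (10) needs only the observed counts, the stratum weights and the device parameters. A minimal sketch follows, assuming the same hypothetical `(U, P1, P2, P3, k)` tuple layout for each stratum's device; it is meant as an illustration rather than a reference implementation.

```python
def variance_estimate(samples, weights, devices):
    """Equation (10): sum_i W_i^2 * sum_j x_ij / (n_i^2 * a_i^2).

    a_i = U_i + (1 - U_i) * P1i * (1 + P3i * k_i / (k_i - 1)).
    P2i is unpacked but not needed for the variance estimate.
    """
    total = 0.0
    for x, w, (u, p1, _p2, p3, k) in zip(samples, weights, devices):
        a = u + (1.0 - u) * p1 * (1.0 + p3 * k / (k - 1))
        n = len(x)
        total += w ** 2 * sum(x) / (n ** 2 * a ** 2)
    return total


if __name__ == "__main__":
    # Toy data (assumed): one stratum, direct questioning (U = 1), four counts.
    print(variance_estimate([[1, 0, 2, 1]], [1.0], [(1.0, 0.6, 0.2, 0.2, 100)]))
```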
2.1.2. Allocation of sample size and variance under different systems of allocation
The variance of the estimator given in Equation (7) is a function of the sample sizes
$n_i$, i.e., the precision of the proposed estimator under stratified sampling depends on
the sample size $n_i$ selected from the ith stratum ($i = 1, 2, \ldots, L$). The allocation
method for selecting the sample from the different strata depends on the prior information
available about the stratum variances.
(I) Proportional allocation: When the stratum sizes $N_i$ are known but the stratum
variances are unknown, proportional allocation is used to draw the sample from the strata.
In proportional allocation, $n_i \propto N_i$ and $n_i = n N_i / N$, and the variance of the
proposed estimator under proportional allocation is derived as
$$V(\hat{\lambda}_a)_{p.} = \frac{1}{n}\sum_{i=1}^{L} W_i\left[\frac{\lambda_{ia}}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)} + \frac{(1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{\left[U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right]^2}\right]. \tag{11}$$
(II) Optimum allocation: This method determines the sample sizes by minimizing the variance
for a given cost or by minimizing the cost for a given variance. The cost function under
stratified sampling [21] is defined as
$$C = c_0 + \sum_{i=1}^{L} n_i c_i, \tag{12}$$
where $c_0$ denotes the overhead cost and $c_i$ is the survey cost per unit in the ith
stratum. Under optimum allocation, the sample size $n_i$ from the ith stratum is given by
$$n_i = n\,\frac{W_i\sqrt{\eta_i}/\sqrt{c_i}}{\sum_{i=1}^{L} W_i\sqrt{\eta_i}/\sqrt{c_i}}, \tag{13}$$
where
$$\eta_i = \frac{\lambda_{ia}}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)} + \frac{(1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{\left[U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right]^2}.$$
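The two allocation rules can be compared side by side. The following sketch computes the proportional sample sizes $n_i = n W_i$ and the optimum sample sizes of Equation (13); the function names are hypothetical, the returned sizes are real-valued, and rounding to integers is left to the user.

```python
import math


def eta(lam_a, lam_b, a, b):
    """Per-unit variance term eta_i, with a and b the coefficients of
    lambda_ia and lambda_ib in Equation (2)."""
    return lam_a / a + b * lam_b / a ** 2


def proportional_allocation(n, weights):
    """n_i = n * W_i."""
    return [n * w for w in weights]


def optimum_allocation(n, weights, etas, costs):
    """Equation (13): n_i proportional to W_i * sqrt(eta_i) / sqrt(c_i)."""
    shares = [w * math.sqrt(e) / math.sqrt(c)
              for w, e, c in zip(weights, etas, costs)]
    total = sum(shares)
    return [n * s / total for s in shares]


if __name__ == "__main__":
    # Stratum weights and total sample size as in the empirical section;
    # the eta values and unit costs are assumed for illustration.
    print(proportional_allocation(10000, [0.6, 0.4]))
    print(optimum_allocation(10000, [0.6, 0.4], [1.2, 2.0], [1.0, 4.0]))
```

When all strata share the same $\eta_i$ and the same unit cost, the optimum allocation coincides with the proportional one.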
Table 2. Layout of the proposed stratified rrm for giving answer first time.
First stage randomized response device R11i
Outcomes Statements Probability of selection
(1) Do you possess rare stigmatized characteristic A ? U1i
(2) Go to randomized device R12i 1 − U1i
Second stage randomized response device R12i
Outcomes Statements Probability of selection
(1) Do you possess the rare stigmatized characteristic A ? P1i
(2) Do you possess the unrelated non-sensitive attribute B? P2i
(3) Blank card P3i
Table 3. Layout of the proposed stratified rrm for giving answer second time.
First stage randomized response device R21i
Outcomes Statements Probability of selection
(1) Do you possess the rare stigmatized characteristic A? U2i
(2) Go to randomized device R22i 1 − U2i
Second stage randomized response device R22i
Outcomes Statements Probability of selection
(1) Do you possess the rare stigmatized characteristic A? Q1i
(2) Do you possess the unrelated rare non-sensitive attribute B? Q2i
(3) Blank card Q3i
response devices $(R_{11i}, R_{12i})$ and, later, the same question is answered by the same
interviewee using a second set of randomized response devices $(R_{21i}, R_{22i})$
(Tables 2 and 3).
If statement (3) is selected by the respondent, the draw is repeated without replacing the
card. If statement (3) appears again in the second draw, the respondent is instructed to
report ‘no’.
The probabilities of getting the answer ‘yes’ from the respondent using the above
randomized response devices are
$$\zeta_{i1} = U_{1i}\pi_{ia} + (1 - U_{1i})\,(P_{1i}\pi_{ia} + P_{2i}\pi_{ib})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)$$
and
$$\zeta_{i2} = U_{2i}\pi_{ia} + (1 - U_{2i})\,(Q_{1i}\pi_{ia} + Q_{2i}\pi_{ib})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right).$$
For $n_i \to \infty$, as $\zeta_{i1} \to 0$ and $\zeta_{i2} \to 0$, we have
$n_i\zeta_{i1} = \lambda^*_{ia} > 0$ and $n_i\zeta_{i2} = \lambda^*_{ib} > 0$. Let
$x_{i11}, x_{i12}, \ldots, x_{i1n_i}$ and $x_{i21}, x_{i22}, \ldots, x_{i2n_i}$ be random
samples of $n_i$ observations from the ith stratum that follow the Poisson distribution
with parameters $\lambda^*_{ia} > 0$ and $\lambda^*_{ib} > 0$, respectively. Proceeding as in
the previous section, we get
$$\frac{1}{n_i}\sum_{j=1}^{n_i} x_{i1j} = U_{1i}\hat{\lambda}_{ia} + (1 - U_{1i})\left(P_{1i}\hat{\lambda}_{ia} + P_{2i}\hat{\lambda}_{ib}\right)\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right) \tag{15}$$
and
$$\frac{1}{n_i}\sum_{j=1}^{n_i} x_{i2j} = U_{2i}\hat{\lambda}_{ia} + (1 - U_{2i})\left(Q_{1i}\hat{\lambda}_{ia} + Q_{2i}\hat{\lambda}_{ib}\right)\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right). \tag{16}$$
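Solving Equations (15) and (16), the estimators of the mean number of persons in the
population possessing the rare sensitive attribute A and the non-sensitive attribute B in
the ith stratum are suggested as follows:
$$\hat{\lambda}_{iau} = \frac{1}{n_i\Delta_{i1}}\left[Q_{2i}(1 - U_{2i})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\sum_{j=1}^{n_i} x_{i1j} - (1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\sum_{j=1}^{n_i} x_{i2j}\right], \tag{17}$$
where
$$\Delta_{i1} = U_{1i}(1 - U_{2i})Q_{2i}\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right) - U_{2i}(1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right) + (1 - U_{1i})(1 - U_{2i})(P_{1i}Q_{2i} - P_{2i}Q_{1i})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right) \neq 0$$
and
$$\hat{\lambda}_{ibu} = \frac{1}{n_i\Delta_{i2}}\left[\left\{U_{2i} + (1 - U_{2i})Q_{1i}\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\right\}\sum_{j=1}^{n_i} x_{i1j} - \left\{U_{1i} + (1 - U_{1i})P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}\sum_{j=1}^{n_i} x_{i2j}\right], \tag{18}$$
where
$$\Delta_{i2} = U_{2i}(1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right) - U_{1i}(1 - U_{2i})Q_{2i}\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right) + (1 - U_{1i})(1 - U_{2i})(P_{2i}Q_{1i} - P_{1i}Q_{2i})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right) \neq 0.$$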
From Equations (17) and (18), we propose the estimators $\hat{\lambda}_{au}$ and
$\hat{\lambda}_{bu}$ of the mean number of persons in the population possessing the rare
sensitive attribute A and the non-sensitive attribute B, respectively, as follows:
$$\hat{\lambda}_{au} = \sum_{i=1}^{L} W_i\hat{\lambda}_{iau} \tag{19}$$
and
$$\hat{\lambda}_{bu} = \sum_{i=1}^{L} W_i\hat{\lambda}_{ibu}, \tag{20}$$
where $\hat{\lambda}_{iau}$ and $\hat{\lambda}_{ibu}$ are derived in Equations (17) and (18), respectively.
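When the proportion of attribute B is unknown, each stratum yields two sample means and two unknowns, so the stratum estimates are the solution of a 2 × 2 linear system (Equations (15) and (16)). The sketch below solves that system directly; the function names and the `(U, P, P', P'', k)` tuple layout are assumptions made for illustration, with the second tuple holding the $Q$ probabilities of the second device set.

```python
def coefficients(u, p1, p2, p3, k):
    """Multipliers of lambda_ia and lambda_ib for one two-stage device set."""
    redraw = 1.0 + p3 * k / (k - 1)
    return u + (1.0 - u) * p1 * redraw, (1.0 - u) * p2 * redraw


def estimate_unknown_b(x1, x2, first, second):
    """Solve Equations (15)-(16) for (lambda_ia, lambda_ib) in one stratum.

    x1, x2 : counts from the first and second response of the same n_i people
    first  : (U1, P1, P2, P3, k) for devices (R11, R12)
    second : (U2, Q1, Q2, Q3, k) for devices (R21, R22)
    """
    a1, a2 = coefficients(*first)
    b1, b2 = coefficients(*second)
    delta1 = a1 * b2 - a2 * b1  # determinant of the system; must be non-zero
    if abs(delta1) < 1e-12:
        raise ValueError("device sets are not distinguishable (zero determinant)")
    m1 = sum(x1) / len(x1)
    m2 = sum(x2) / len(x2)
    lam_a = (b2 * m1 - a2 * m2) / delta1
    lam_b = (b1 * m1 - a1 * m2) / (-delta1)
    return lam_a, lam_b
```

The non-zero determinant condition is why the two device sets must differ: if both use identical probabilities, the two equations coincide and $\lambda_{ia}$ and $\lambda_{ib}$ cannot be separated.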
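Theorem 2.4: The proposed estimators $\hat{\lambda}_{au}$ and $\hat{\lambda}_{bu}$ are
unbiased for the parameters $\lambda_a$ and $\lambda_b$, respectively.
Proof:
$$E(\hat{\lambda}_{au}) = \sum_{i=1}^{L} W_i E(\hat{\lambda}_{iau}). \tag{21}$$
We consider
$$E(\hat{\lambda}_{iau}) = \frac{1}{n_i\Delta_{i1}}\left[Q_{2i}(1 - U_{2i})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\sum_{j=1}^{n_i} E(x_{i1j}) - (1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\sum_{j=1}^{n_i} E(x_{i2j})\right].$$
Since
$$x_{i1j} \sim P(\lambda^*_{ia}) \Rightarrow E(x_{i1j}) = \lambda^*_{ia} = U_{1i}\lambda_{ia} + (1 - U_{1i})\left(P_{1i}\lambda_{ia} + P_{2i}\lambda_{ib}\right)\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)$$
and
$$x_{i2j} \sim P(\lambda^*_{ib}) \Rightarrow E(x_{i2j}) = \lambda^*_{ib} = U_{2i}\lambda_{ia} + (1 - U_{2i})\left(Q_{1i}\lambda_{ia} + Q_{2i}\lambda_{ib}\right)\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right),$$
we get $E(\hat{\lambda}_{iau}) = \lambda_{ia}$.
Putting the value of $E(\hat{\lambda}_{iau})$ in Equation (21), we get $E(\hat{\lambda}_{au}) = \lambda_a$.
In a similar manner, we may show that $E(\hat{\lambda}_{bu}) = \lambda_b$.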
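Theorem 2.5: The variances of the estimators $\hat{\lambda}_{au}$ and $\hat{\lambda}_{bu}$ are given as
$$V(\hat{\lambda}_{au}) = \sum_{i=1}^{L} \frac{W_i^2}{\Delta_{i1}^2\, n_i}\,\eta_{i1} \tag{22}$$
and
$$V(\hat{\lambda}_{bu}) = \sum_{i=1}^{L} \frac{W_i^2}{\Delta_{i2}^2\, n_i}\,\eta_{i2}, \tag{23}$$
where
$$\eta_{i1} = \left(a_{1i}b_{2i}^2 + a_{2i}^2 b_{1i} - 2a_{1i}a_{2i}b_{1i}b_{2i}\right)\lambda_{ia} + \left(a_{2i}^2 b_{2i} + b_{2i}^2 a_{2i} - 2a_{2i}^2 b_{2i}^2\right)\lambda_{ib},$$
$$\eta_{i2} = \left(a_{1i}^2 b_{1i} + b_{1i}^2 a_{1i} - 2a_{1i}^2 b_{1i}^2\right)\lambda_{ia} + \left(a_{2i}b_{1i}^2 + a_{1i}^2 b_{2i} - 2a_{1i}a_{2i}b_{1i}b_{2i}\right)\lambda_{ib},$$
$$a_{1i} = U_{1i} + (1 - U_{1i})P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right), \qquad a_{2i} = P_{2i}(1 - U_{1i})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right),$$
$$b_{1i} = U_{2i} + (1 - U_{2i})Q_{1i}\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right) \quad\text{and}\quad b_{2i} = Q_{2i}(1 - U_{2i})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right).$$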
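Theorem 2.6: The unbiased estimates of the variances $V(\hat{\lambda}_{au})$ and $V(\hat{\lambda}_{bu})$ are given by
$$\hat{V}(\hat{\lambda}_{au}) = \sum_{i=1}^{L} \frac{W_i^2}{\Delta_{i1}^2\, n_i}\,\hat{\eta}_{i1} \tag{26}$$
and
$$\hat{V}(\hat{\lambda}_{bu}) = \sum_{i=1}^{L} \frac{W_i^2}{\Delta_{i2}^2\, n_i}\,\hat{\eta}_{i2}, \tag{27}$$
where $\hat{\eta}_{i1}$ and $\hat{\eta}_{i2}$ are the sample estimates of $\eta_{i1}$ and $\eta_{i2}$, respectively.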
2.2.2. Allocation of sample size and variance under different systems of allocation
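Theorem 2.7: The variances of the estimators $\hat{\lambda}_{au}$ and $\hat{\lambda}_{bu}$
under proportional allocation are given as
$$V(\hat{\lambda}_{au})_{p.} = \sum_{i=1}^{L} \frac{W_i}{\Delta_{i1}^2\, n}\,\eta_{i1} \tag{28}$$
and
$$V(\hat{\lambda}_{bu})_{p.} = \sum_{i=1}^{L} \frac{W_i}{\Delta_{i2}^2\, n}\,\eta_{i2}. \tag{29}$$
Theorem 2.8: Under optimum allocation, the sample size from the ith stratum is
$$n_i = n\,\frac{W_i\sqrt{\eta_{i1}}\big/\left(\Delta_{i1}\sqrt{c_i}\right)}{\sum_{i=1}^{L} W_i\sqrt{\eta_{i1}}\big/\left(\Delta_{i1}\sqrt{c_i}\right)}.$$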
is selected by simple random sampling with replacement, and the respondents selected in
this sample are asked a direct question, ‘Do you belong to the ith stratum?’. The first
sample is thus categorized into $L$ strata of sizes $n'_i$ ($i = 1, 2, \ldots, L$). The
stratum weights $W_i$ and $w_i$ are defined as follows:
$$W_i = \frac{N_i}{N} \quad\text{and}\quad w_i = \frac{n'_i}{n'} \quad (i = 1, 2, \ldots, L).$$
It is obvious that $w_i$ is an unbiased estimator of $W_i$.
In the second phase, $n_i$ respondents are randomly selected by SRSWR from the first-phase
sample of $n'_i$ units in the ith stratum.
3.1. When the proportion of an unrelated rare non-sensitive attribute (πb ) is known
When the proportion of the rare unrelated non-sensitive attribute is known, the randomized
device described in Section 2.1 is used to obtain the responses from the respondents.
Following the procedure in Section 2.1, the estimator of the parameter $\lambda_a$ is obtained as
$$\hat{\lambda}_{ad} = \sum_{i=1}^{L} w_i\,\frac{1}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\left[\frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij} - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}\right]. \tag{31}$$
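Proof: We consider
$$\begin{aligned}
E(\hat{\lambda}_{ad}) &= E\left[\sum_{i=1}^{L} w_i\hat{\lambda}_{ia}\right] = E_1\left[\sum_{i=1}^{L} w_i E_2\left(\hat{\lambda}_{ia} \mid w_i\right)\right] \\
&= E_1\left[\sum_{i=1}^{L} w_i\,\frac{\frac{1}{n_i}\sum_{j=1}^{n_i} E_2(x_{ij}) - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\,\Bigg|\, w_i\right] \\
&= E_1\left[\sum_{i=1}^{L} w_i\,\frac{\frac{1}{n_i}\sum_{j=1}^{n_i} \lambda_{i0} - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\,\Bigg|\, w_i\right] \\
&= \sum_{i=1}^{L} W_i\lambda_{ia} = \lambda_a.
\end{aligned}$$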
where $\upsilon_i = n_i/n'_i$ is a fixed constant for the ith stratum and $\eta_i$ is defined in Equation (13).
Putting the expressions for first and second terms from Equations (34) and (35) in
Equation (33), we obtain the expression for variance of estimator λ̂ad as given in
Equation (32).
3.1.2. Allocation of sample size and variance under different systems of allocation
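In double stratified sampling under proportional allocation, the second-phase sample size
$n_i$ is determined from the first-phase sample sizes $n'$ and $n'_i$ as
$$n_i = n\,\frac{n'_i}{n'},$$
and the variance of the estimator $\hat{\lambda}_{ad}$ under proportional allocation is
$$V(\hat{\lambda}_{ad})_{p.} = \frac{1}{n'}\sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2 + \frac{1}{n}\sum_{i=1}^{L} W_i\eta_i. \tag{36}$$
The cost function is
$$C = c'n' + \sum_{i=1}^{L} c_i n_i. \tag{37}$$
Here, $n_i$ is a random variable; therefore, the expectation of the cost function is taken
in order to optimize the values of $\upsilon_i$ over the different strata and $n'$.
Putting the optimum value of $\upsilon_i$ in Equation (38), the optimum value of the sample
size $n'$ is obtained as
$$n' = \frac{C^*}{c' + \sum_{i=1}^{L} W_i c_i\upsilon_i}. \tag{40}$$
Lemma 3.4: Under optimum allocation, the variance of the estimator $\hat{\lambda}_{ad}$ is given by
$$V(\hat{\lambda}_{ad})_{opt.} = \frac{1}{C^*}\left[\sqrt{c'\sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2} + \sum_{i=1}^{L} W_i\sqrt{c_i\eta_i}\right]^2. \tag{41}$$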
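$$\hat{\lambda}_{aud} = \sum_{i=1}^{L} w_i\hat{\lambda}_{iau} \tag{42}$$
and
$$\hat{\lambda}_{bud} = \sum_{i=1}^{L} w_i\hat{\lambda}_{ibu}, \tag{43}$$
where $\hat{\lambda}_{iau}$ and $\hat{\lambda}_{ibu}$ are derived in Equations (17) and (18), respectively.
Proof:
$$E(\hat{\lambda}_{aud}) = E_1\left[E_2\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{iau}\right)\right] = E_1\left[\sum_{i=1}^{L} w_i\lambda_{iau}\right] = \sum_{i=1}^{L} W_i\lambda_{iau} = \lambda_a.$$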
where $\upsilon_i = n_i/n'_i$ is a fixed constant for the ith stratum and $\eta_{i1}$ is defined in Equation (22).
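Putting the expressions for the first and second terms from Equations (46) and (47) in
Equation (45), we obtain the expression for the variance of the estimator
$\hat{\lambda}_{aud}$ as given in Equation (44).
Lemma 3.7: Under proportional allocation, the sample size $n_i$ in the ith stratum is given as
$$n_i = n\,\frac{n'_i}{n'},$$
and the variance of the estimator $\hat{\lambda}_{aud}$ under proportional allocation is
$$V(\hat{\lambda}_{aud})_{p.} = \frac{1}{n'}\sum_{i=1}^{L} W_i(\lambda_{iau} - \lambda_a)^2 + \frac{1}{n}\sum_{i=1}^{L} W_i\frac{\eta_{i1}}{\Delta_{i1}^2}. \tag{48}$$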
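Using the cost function given in Equation (37) and applying the Cauchy–Schwarz inequality,
the optimum values of $\upsilon_i$ over the different strata and of the sample size $n'$
are given as
$$\upsilon_i = \sqrt{\frac{c'}{c_i} \times \frac{\eta_{i1}}{\Delta_{i1}^2} \times \frac{1}{\sum_{i=1}^{L} W_i(\lambda_{iau} - \lambda_a)^2}}$$
and
$$n' = \frac{C^*}{c' + \sum_{i=1}^{L} W_i c_i\upsilon_i}. \tag{49}$$
Lemma 3.8: The variance of the estimator $\hat{\lambda}_{aud}$ under optimum allocation is obtained as
$$V(\hat{\lambda}_{aud})_{(o)} = \frac{1}{C^*}\left[\sqrt{c'\sum_{i=1}^{L} W_i(\lambda_{iau} - \lambda_a)^2} + \sum_{i=1}^{L} W_i\sqrt{c_i\frac{\eta_{i1}}{\Delta_{i1}^2}}\right]^2. \tag{50}$$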
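The optimum two-phase design can be computed once the stratum-level quantities are available. The sketch below evaluates the optimum sampling fractions $\upsilon_i$, the first-phase sample size $n'$ of Equation (49) and the optimum variance of Equation (50). The function and argument names are hypothetical; `unit_variances` stands for $\eta_{i1}/\Delta_{i1}^2$, and the square-root form of $\upsilon_i$ is the one that makes the Cauchy–Schwarz bound tight. The numerical inputs are assumed for illustration only.

```python
import math


def optimum_two_phase(weights, stratum_means, unit_variances,
                      c_first, unit_costs, budget):
    """Return (upsilon, n_first, v_opt) for a two-phase stratified design.

    weights        : stratum weights W_i
    stratum_means  : stratum parameters (lambda_iau in the text)
    unit_variances : eta_i1 / Delta_i1^2 for each stratum
    c_first        : first-phase cost per unit, c'
    unit_costs     : second-phase cost per unit, c_i
    budget         : expected total cost C*
    """
    overall = sum(w * m for w, m in zip(weights, stratum_means))
    between = sum(w * (m - overall) ** 2 for w, m in zip(weights, stratum_means))
    if between <= 0:
        raise ValueError("stratum means must differ for the optimum to exist")
    upsilon = [math.sqrt(c_first * s2 / (c * between))
               for s2, c in zip(unit_variances, unit_costs)]
    n_first = budget / (c_first + sum(w * c * u for w, c, u
                                      in zip(weights, unit_costs, upsilon)))
    within = sum(w * math.sqrt(c * s2)
                 for w, c, s2 in zip(weights, unit_costs, unit_variances))
    v_opt = (math.sqrt(c_first * between) + within) ** 2 / budget
    return upsilon, n_first, v_opt


if __name__ == "__main__":
    ups, n1, v = optimum_two_phase([0.7, 0.3], [1.5, 0.5], [2.0, 3.0],
                                   0.1, [4.0, 9.0], 10000.0)
    print(ups, n1, v)
```

Since $\upsilon_i = n_i/n'_i$ is a sampling fraction, a computed value above 1 signals that the chosen costs make the unconstrained optimum infeasible; the sketch does not handle that case.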
4. Empirical comparison
In this section, we carry out empirical studies to assess the effectiveness of the proposed
stratified randomized response model and the resultant estimators $\hat{\lambda}_a$ and
$\hat{\lambda}_{au}$ relative to the stratified randomized response models proposed in [18]
and [20] and their resultant estimators. For the empirical studies, we consider a population
of size $N = 100{,}000$ divided into two strata with stratum weights $W_1 = 0.6$ and
$W_2 = 0.4$. A sample of size 10,000 is drawn by simple random sampling with replacement,
with $n_1 = 6000$ and $n_2 = 4000$. It is assumed that the number of cards in the proposed
deck is the same in both strata, i.e., $k_1 = k_2 = 100$ (say). The per cent relative
efficiencies of the resultant estimators of the proposed stratified model are examined
under two different cases:
Case (I): Stratified sampling when the proportion of the rare non-sensitive attribute B is
known: The per cent relative efficiencies of the estimator $\hat{\lambda}_a$ with respect to
$(\hat{\lambda}_1)_l$ and $(\hat{\lambda}_1)_{l1}$ are defined as
$$E_{11} = \frac{V[(\hat{\lambda}_1)_l]}{V[\hat{\lambda}_a]} \times 100 \quad\text{and}\quad E_{12} = \frac{V[(\hat{\lambda}_1)_{l1}]}{V[\hat{\lambda}_a]} \times 100,$$
where $(\hat{\lambda}_1)_l$ and $(\hat{\lambda}_1)_{l1}$ are the estimators based on the
randomized response models described in [18] and [20], respectively, when the proportion of
the unrelated rare non-sensitive attribute is known. It is assumed that
$P_{11} = P_{12} = P_1$, $P_{21} = P_{22} = P_2$, $P_{31} = P_{32} = P_3$ and
$U_1 = U_2 = U$. The per cent relative efficiencies $E_{11}$ and $E_{12}$ depend on the
choices of the parameters $\lambda_{1a}$, $\lambda_{2a}$, $U$, $P_1$, $P_2$ and $P_3$. The
empirical results are presented in Tables 4 and 5 for different choices of these parameters.
Case (II): Stratified sampling when the proportion of the unrelated rare non-sensitive
attribute B is unknown
Table 4. Per cent relative efficiencies of the estimator λ̂a with respect to (λ̂1 )l .
P1 0.4 0.4 0.5 0.5 0.6 0.6 0.7 0.7 0.8
P2 0.4 0.1 0.4 0.2 0.2 0.1 0.2 0.1 0.1
P3 0.2 0.5 0.1 0.3 0.2 0.3 0.1 0.2 0.1
U λ1a λ2a
0.3 0.5 0.5 268.32 558.34 216.47 380.37 184.56 271.12 162.9 198.31 147.12
1 254.43 485.64 205.4 336.46 175.32 244.88 155.01 183.66 140.33
1.5 243.5 436.39 197.04 307.04 168.64 227.45 149.56 174 135.89
1 0.5 241.12 426.47 195.26 301.15 167.25 223.97 148.46 172.07 135.01
1 232.74 393.43 189.09 281.6 162.5 212.48 144.75 165.74 132.11
1.5 225.79 368.21 184.1 266.75 158.76 203.79 141.89 160.97 129.93
1.5 0.5 224.24 362.81 183 263.58 157.95 201.94 141.28 159.95 129.46
1 218.63 344 179.06 252.56 155.07 195.51 139.13 156.43 127.86
1.5 213.83 328.7 175.75 243.63 152.68 190.31 137.38 153.59 126.57
0.5 0.5 0.5 408.19 674.31 296.64 438.29 231 302.26 188.1 216.06 157.85
1 374.31 576.78 273.41 382.28 214.29 269.72 175.92 197.98 149.14
1.5 349.2 512.53 256.73 345.63 202.69 248.54 167.76 186.27 143.53
1 0.5 343.9 499.76 253.27 338.36 200.32 244.35 166.12 183.96 142.43
1 325.66 457.64 241.5 314.46 192.38 230.61 160.7 176.39 138.81
1.5 311.08 425.9 232.25 296.51 186.23 220.31 156.58 170.72 136.11
1.5 0.5 307.88 419.16 230.24 292.7 184.91 218.13 155.7 169.53 135.54
1 296.5 395.77 223.15 279.51 180.28 210.58 152.64 165.38 133.57
1.5 286.98 376.9 217.27 268.88 176.49 204.51 150.17 162.06 131.99
0.7 0.5 0.5 595.67 798.7 397.64 500.1 286.15 335.38 216.36 234.9 169.2
1 525.7 672.47 354.8 430.22 258.68 295.66 198.67 212.91 158.32
1.5 477.13 591.46 325.68 385.53 240.42 270.32 187.18 198.93 151.42
1 0.5 467.22 575.54 319.8 376.76 236.77 265.36 184.91 196.2 150.07
1 433.95 523.54 300.21 348.14 224.7 249.17 177.45 187.28 145.67
1.5 408.25 484.81 285.24 326.86 215.58 237.15 171.87 180.67 142.42
1.5 0.5 402.71 476.62 282.03 322.36 213.63 234.62 170.69 179.27 141.73
1 383.32 448.38 270.84 306.87 206.88 225.88 166.6 174.47 139.37
1.5 367.44 425.75 261.74 294.46 201.42 218.88 163.32 170.63 137.48
0.9 0.5 0.5 845.44 931.02 524.64 565.75 351.64 370.53 248.04 254.87 181.23
1 713.79 772.24 451.31 480.18 309.02 322.68 223.37 228.48 167.87
1.5 628.91 672.8 404.41 426.64 281.97 292.77 207.85 211.99 159.54
1 0.5 612.2 653.49 395.21 416.24 276.69 286.96 204.82 208.79 157.92
1 557.51 590.86 365.18 382.54 259.48 268.15 195.01 198.42 152.69
1.5 516.71 544.72 342.85 357.72 246.73 254.29 187.76 190.79 148.83
1.5 0.5 508.08 535.02 338.13 352.51 244.04 251.38 186.23 189.19 148.02
1 478.27 501.69 321.88 334.59 234.79 241.38 180.99 183.68 145.25
1.5 454.36 475.14 308.86 320.31 227.39 233.42 176.81 179.3 143.03
Note: λ1b = λ2b = λb = 1.0.
In this case, we have also considered the stratum weights $W_1 = 0.7$ and $W_2 = 0.3$ and
assumed that $P_{11} = P_{12} = P_1$, $P_{21} = P_{22} = P_2$, $P_{31} = P_{32} = P_3$,
$Q_{11} = Q_{12} = Q_1$, $Q_{21} = Q_{22} = Q_2$, $Q_{31} = Q_{32} = Q_3$,
$U_{11} = U_{12} = U_1$ and $U_{21} = U_{22} = U_2$.
The per cent relative efficiencies of the estimator $\hat{\lambda}_{au}$ with respect to
$(\hat{\lambda}_{1u})_l$ and $(\hat{\lambda}_{1u})_{l1}$ are given by
$$E_{21} = \frac{V[(\hat{\lambda}_{1u})_l]}{V[\hat{\lambda}_{au}]} \times 100 \quad\text{and}\quad E_{22} = \frac{V[(\hat{\lambda}_{1u})_{l1}]}{V[\hat{\lambda}_{au}]} \times 100,$$
where $(\hat{\lambda}_{1u})_l$ and $(\hat{\lambda}_{1u})_{l1}$ are the estimators based on
the randomized response models described in [18] and [20], respectively, when the
proportion of the unrelated rare non-sensitive attribute is unknown.
The per cent relative efficiencies $E_{21}$ and $E_{22}$ depend on the choices of the
parameters $\lambda_{1a}$, $\lambda_{2a}$, $P_1$, $P_2$, $P_3$, $Q_1$, $Q_2$, $Q_3$, $U_1$
and $U_2$. The empirical results are presented in Tables 6 and 7 for different choices of
these parameters.
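The per cent relative efficiency used throughout the tables is a simple variance ratio. A minimal sketch, with a hypothetical function name and made-up variance values rather than figures from the paper:

```python
def percent_relative_efficiency(var_reference: float, var_proposed: float) -> float:
    """Return 100 * V(reference estimator) / V(proposed estimator).

    Values above 100 mean the proposed estimator has the smaller variance.
    """
    if var_proposed <= 0:
        raise ValueError("the variance of the proposed estimator must be positive")
    return 100.0 * var_reference / var_proposed


if __name__ == "__main__":
    # Assumed variances, for illustration only.
    print(percent_relative_efficiency(var_reference=0.0005, var_proposed=0.0004))
```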
Table 5. Per cent relative efficiencies of the estimator λ̂a with respect to (λ̂1 )l1 .
P1 0.4 0.4 0.5 0.5 0.6 0.6 0.7 0.7 0.8
P2 0.4 0.1 0.4 0.2 0.2 0.1 0.2 0.1 0.1
P3 0.2 0.5 0.1 0.3 0.2 0.3 0.1 0.2 0.1
U λ1a λ2a
0.3 0.5 0.5 113.26 235.68 115.28 202.57 117.18 172.14 119.04 144.92 120.94
1 112.26 214.28 113.91 186.59 115.37 161.14 116.7 138.27 117.94
1.5 111.47 199.78 112.87 175.88 114.05 153.83 115.08 133.88 115.98
1 0.5 111.3 196.85 112.65 173.74 113.78 152.37 114.75 133 115.59
1 110.69 187.12 111.89 166.62 112.85 147.56 113.65 130.13 114.31
1.5 110.19 179.7 111.27 161.22 112.12 143.91 112.8 127.96 113.35
1.5 0.5 110.08 178.11 111.13 160.07 111.96 143.14 112.61 127.5 113.15
1 109.68 172.57 110.64 156.06 111.39 140.44 111.98 125.9 112.44
1.5 109.33 168.07 110.23 152.81 110.92 138.26 111.45 124.61 111.87
0.5 0.5 0.5 108.3 178.9 109.87 162.33 111.38 145.73 112.85 129.63 114.32
1 107.49 165.64 108.8 152.12 110.01 138.46 111.13 125.07 112.2
1.5 106.9 156.9 108.03 145.44 109.06 133.73 109.98 122.12 110.83
1 0.5 106.77 155.16 107.88 144.12 108.86 132.79 109.75 121.53 110.56
1 106.34 149.43 107.34 139.76 108.21 129.72 108.99 119.62 109.68
1.5 105.99 145.12 106.91 136.49 107.71 127.42 108.41 118.2 109.02
1.5 0.5 105.92 144.2 106.82 135.8 107.6 126.93 108.28 117.9 108.88
1 105.65 141.02 106.49 133.39 107.22 125.24 107.85 116.85 108.4
1.5 105.42 138.45 106.22 131.45 106.91 123.88 107.5 116.01 108.01
0.7 0.5 0.5 104.54 140.16 105.49 132.67 106.42 124.73 107.34 116.54 108.26
1 103.98 133.01 104.78 127.06 105.54 120.63 106.28 113.9 106.99
1.5 103.6 128.42 104.3 123.47 104.96 118.02 105.59 112.22 106.18
1 0.5 103.52 127.52 104.2 122.76 104.84 117.5 105.45 111.89 106.02
1 103.26 124.58 103.88 120.46 104.46 115.83 105 110.82 105.51
1.5 103.06 122.38 103.63 118.75 104.17 114.59 104.67 110.02 105.13
1.5 0.5 103.01 121.92 103.58 118.39 104.11 114.33 104.6 109.86 105.05
1 102.86 120.32 103.4 117.15 103.89 113.43 104.35 109.28 104.78
1.5 102.74 119.04 103.25 116.15 103.72 112.71 104.15 108.82 104.56
0.9 0.5 0.5 101.42 111.69 101.73 109.7 102.04 107.52 102.35 105.16 102.65
1 101.21 109.5 101.47 107.96 101.72 106.22 101.98 104.31 102.23
1.5 101.07 108.12 101.3 106.87 101.52 105.41 101.75 103.77 101.97
1 0.5 101.04 107.86 101.27 106.65 101.49 105.25 101.7 103.67 101.91
1 100.95 106.99 101.16 105.97 101.36 104.74 101.56 103.33 101.75
1.5 100.89 106.36 101.08 105.46 101.26 104.37 101.45 103.09 101.63
1.5 0.5 100.87 106.22 101.06 105.36 101.24 104.29 101.42 103.03 101.6
1 100.82 105.76 101 104.99 101.18 104.02 101.35 102.86 101.51
1.5 100.79 105.39 100.96 104.7 101.12 103.8 101.28 102.71 101.44
Note: λ1b = λ2b = λb = 1.0.
The optimum value of k is obtained by minimizing the variance of the proposed estimator,
but the resulting expression is mathematically too complex to be useful. We have therefore
studied the behaviour of the per cent relative efficiencies of the estimators under the
proposed randomized response model for different choices of k, for the fixed values
λb = λ1b = λ2b = 1.0 and λ1a = 1.5, λ2a = 1.5; the results are shown graphically in
Graphs 1–4. For convenience, we have chosen ki = k ∀ i.
5. Interpretations of results
From Tables 4–7, it is clear that the values of the per cent relative efficiencies E11 , E12 , E21 and
E22 are more than 100 for all the chosen parametric cases. These results strongly support that
the proposed estimators λ̂a and λ̂au are more efficient than the estimators {(λ̂1 )l , (λ̂1 )l1 } and
Table 6. Per cent relative efficiencies of the estimator λ̂au with respect to (λ̂1u )l .
P1 0.6 0.6 0.7 0.7 0.8 0.8
P2 0.2 0.2 0.15 0.15 0.1 0.1
Q1 0.1 0.4 0.1 0.4 0.1 0.4
Q2 0.45 0.3 0.45 0.3 0.45 0.3
U1 U2 λ1a λ2a
0.7 0.5 0.5 0.5 211.81 629.96 162.3 277.79 126.37 157.49
1 245.93 707.04 192.34 319.72 153.61 187.94
1.5 280.05 784.13 222.39 361.65 180.85 218.39
1 0.5 196.89 547.53 154.85 250.5 124.98 150.48
1 220.17 600.06 175.09 278.73 143.12 170.75
1.5 243.45 652.6 195.32 306.96 161.26 191.03
1.5 0.5 189.17 504.92 151.07 236.66 124.28 146.98
1 206.83 544.77 166.32 257.94 137.88 162.18
1.5 224.5 584.61 181.58 279.21 151.48 177.37
0.3 0.5 0.5 209.32 627.93 161.32 277.3 126.06 157.38
1 243.03 704.76 191.18 319.16 153.23 187.8
1.5 276.75 781.6 221.04 361.02 180.4 218.23
1 0.5 194.97 546.07 154.08 250.15 124.73 150.39
1 218.02 598.47 174.22 278.34 142.83 170.65
1.5 241.08 650.86 194.35 306.52 160.94 190.91
1.5 0.5 187.52 503.73 150.4 236.37 124.07 146.9
1 205.03 543.48 165.59 257.61 137.64 162.09
1.5 222.55 583.23 180.78 278.86 151.21 177.28
0.5 0.5 0.5 0.5 171.16 515.36 142.38 245.13 117.57 146.86
1 198.73 578.42 168.73 282.14 142.91 175.25
1.5 226.3 641.48 195.09 319.14 168.26 203.64
1 0.5 164.34 461.9 138.87 225.78 117.83 142.13
1 183.77 506.21 157.02 251.22 134.93 161.28
1.5 203.2 550.53 175.17 276.66 152.03 180.43
1.5 0.5 160.63 432.93 137.03 215.64 117.96 139.73
1 175.63 467.1 150.87 235.02 130.86 154.17
1.5 190.63 501.26 164.71 254.41 143.77 168.62
0.3 0.5 0.5 165.4 510.48 139.91 243.9 116.75 146.55
1 192.04 572.95 165.8 280.71 141.92 174.88
1.5 218.69 635.41 191.7 317.53 167.08 203.21
1 0.5 159.6 458.18 136.86 224.83 117.15 141.89
1 178.47 502.14 154.74 250.16 134.16 161
1.5 197.34 546.1 172.63 275.5 151.16 180.12
1.5 0.5 156.42 429.78 135.25 214.83 117.36 139.51
1 171.03 463.69 148.91 234.14 130.2 153.94
1.5 185.64 497.61 162.57 253.45 143.04 168.36
Note: P3 = 1 − (P1 + P2 ) and Q3 = 1 − (Q1 + Q2 ).
{(λ̂1u )l , (λ̂1u )l1 }, respectively, both when the proportion of the unrelated rare
attribute is known and when it is unknown.
From Tables 4 and 5, it is clear that, for fixed values of the other parameters, (a) the values
of E11 increase and the values of E12 decrease as U increases, where U is the probability
of selecting the first question in the first-stage randomized device, (b) the values of E11 and E12
decrease as the values of λ1a increase, and (c) the values of E11 and E12 decrease as
the values of λ2a increase.
From Tables 6 and 7, it is observed that for the fixed values of other parameters, (a) the
values of E21 and E22 are decreasing as the values of U1 decrease which implies that the
proposed estimation procedure performs better when the probability of selection of first
question in the first randomized response device R11 is small, (b) the values of E21 and E22
are decreasing as the values of U2 decrease, (c) the values of E21 and E22 are decreasing as
Table 7. Per cent relative efficiencies of the estimator λ̂au with respect to (λ̂1u )l1 .
P1 0.6 0.6 0.7 0.7 0.8 0.8
P2 0.2 0.2 0.15 0.15 0.1 0.1
Q1 0.4 0.5 0.4 0.5 0.4 0.5
Q2 0.3 0.1 0.3 0.1 0.3 0.1
U1 U2 λ1a λ2a
0.7 0.5 0.5 0.5 146.42 179.98 118.64 130.53 102.15 105.73
1 161 196.31 132.36 144.88 115.54 119.31
1.5 175.59 212.64 146.09 159.22 128.93 132.88
1 0.5 138.06 164.33 116.24 125.4 103.66 106.38
1 146.98 174.31 124.5 134.03 111.59 114.42
1.5 155.89 184.3 132.75 142.65 119.52 122.46
1.5 0.5 134.38 157.45 115.22 123.2 104.3 106.66
1 140.8 164.64 121.12 129.36 109.93 112.37
1.5 147.22 171.82 127.01 135.53 115.57 118.08
0.3 0.5 0.5 117.4 130.03 106.35 111.98 97.91 99.93
1 130.46 143.77 119.43 125.36 111.07 113.2
1.5 143.52 157.5 132.5 138.74 124.23 126.47
1 0.5 115.33 125.23 106.78 111.12 100.44 101.98
1 123.32 133.62 114.64 119.16 108.24 109.84
1.5 131.31 142.02 122.51 127.2 116.04 117.7
1.5 0.5 114.42 123.11 106.96 110.74 101.51 102.84
1 120.18 129.16 112.59 116.49 107.05 108.42
1.5 125.93 135.2 118.21 122.24 112.59 114.01
0.5 0.5 0.5 0.5 430.07 1212.5 203.02 299.83 128.34 145.56
1 457.99 1280.9 220.17 322 142.5 160.61
1.5 485.91 1349.3 237.32 344.17 156.66 175.67
1 0.5 370.96 1009.2 183.37 260.06 123.82 137.11
1 388.75 1052.7 193.98 273.77 132.36 146.19
1.5 406.54 1096.3 204.59 287.48 140.89 155.26
1.5 0.5 343.31 914.21 174.57 242.26 121.87 133.47
1 356.37 946.17 182.25 252.18 127.98 139.96
1.5 369.42 978.12 189.93 262.1 134.09 146.46
0.3 0.5 0.5 182.12 264.33 137.46 163.97 111.47 119.06
1 197.16 283.67 151.18 179.09 124.74 132.74
1.5 212.2 303 164.91 194.21 138.01 146.41
1 0.5 168.57 235.63 131.42 152.42 110.79 116.65
1 178.17 247.95 139.92 161.78 118.8 124.9
1.5 187.77 260.28 148.42 171.14 126.8 133.14
1.5 0.5 162.21 222.2 128.71 147.25 110.5 115.62
1 169.26 231.24 134.86 154.02 116.23 121.52
1.5 176.32 240.29 141.02 160.79 121.96 127.42
Note: P3 = 1 − (P1 + P2 ) and Q3 = 1 − (Q1 + Q2 ).
the values of λ1a increase, (d) the values of E21 and E22 are increasing as the values of λ2a
increase.
According to the above numerical results, the patterns indicate that the proposed
stratified estimation procedure is more efficient than the estimation procedures of Lee
et al. [18] and Lee et al. [20] when the proportion of persons possessing the sensitive
attribute in the sample selected from the strata is small.
From Graphs 1–4, it is observed that the per cent relative efficiencies decrease as
the values of k increase and that, beyond a particular value of k, the per cent relative
efficiencies almost stabilize. On the basis of the empirical studies of the proposed
estimator for a particular randomized response device, we may suggest a range for k: the
best choice of k may lie in the interval (3, 100). The maximum per cent relative
efficiencies are observed at the lowest choices of k.
Acknowledgements
Authors are thankful to the Indian Institute of Technology (Indian School of Mines), Dhanbad for
providing financial and necessary infrastructural support to carry out the present research work.
Authors are also thankful to the reviewers for their valuable suggestions which improved the quality
of the paper.
Disclosure statement
No potential conflict of interest was reported by the authors.
References
[1] Warner SL. Randomized response: a survey technique for eliminating evasive answer bias.
J Amer Statist Assoc. 1965;60:63–69.
[2] Horvitz DG, Shah BV, Simmons WR. The unrelated question randomized response model.
Proceedings of the American statistical association, social statistics section; Washington, DC:
American Statistical Association; 1967. pp. 65–72.
[3] Greenberg BG, Abul-Ela AA, Simmons WR, et al. The unrelated question randomized response
model: theoretical framework. J Amer Statist Assoc. 1969;64:520–539.
[4] Moors JA. Optimization of the unrelated question randomized response model. J Amer Statist
Assoc. 1971;66:627–629.
[5] Fox JA, Tracy PE. Randomized response: a method of sensitive surveys. Newbury Park (CA):
SAGE Publications; 1986.
[6] Mangat NS, Singh R. An alternative randomized procedure. Biometrika. 1990;77:439–442.
[7] Mangat NS. Two-stage randomized response sampling procedure using unrelated question.
J Ind Soc Agril Statist. 1992;44(1):82–87.
[8] Ryu JB, Hong KH, Lee GS. Randomized response model. Seoul: Freedom Academy; 1993.
[9] Mangat NS, Singh S, Singh R. On the use of a modified randomization device in randomized
response inquiries. Metron. 1993;51:211–216.
[10] Singh S, Singh R, Mangat NS, Tracy DS. An alternative device for randomized responses.
Statistica. 1994;54(2):233–243.
[11] Singh S, Horn S, Singh R, Mangat NS. On the use of modified randomization device for
estimating the prevalence of a sensitive attribute. Statist Trans. 2003;6(4):515–522.
[12] Hong K, Yum J, Lee H. A stratified randomized response technique. Korean J Appl Stat.
1994;7:141–147.
[13] Kim J-M, Warde WD. A stratified Warner’s randomized response model. J Statist Plann
Inference. 2004;120(1–2):155–165.
[14] Kim J-M, Elam ME. A two-stage stratified Warner’s randomized response model using optimal
allocation. Metrika. 2005;61:1–7.
[15] Kim J-M, Elam ME. A stratified unrelated question randomized response model. Statist Paper.
2007;48:215–233.
[16] Adebola FB, Johnson OO, Adegoke NA. A modified stratified randomized response techniques.
Math Theory Model. 2014;4(13):30–42.
[17] Land M. Estimation of a rare sensitive attribute using Poisson distribution. Statistics.
2012;46(3):351–360.
[18] Lee GS, Uhm D, Kim JM. Estimation of a rare sensitive attribute in a stratified sample using
Poisson distribution. Statistics. 2013;47(3):575–589.
[19] Singh GN, Suman S. A modified two-stage randomized response model for estimat-
ing the proportion of stigmatized attribute. Journal of Applied Statistics. 2018. doi:
10.1080/02664763.2018.1529150.
[20] Lee GS, Hong KH, Son CK. A stratified two-stage unrelated randomized response model for
estimating a rare sensitive attribute based on the Poisson distribution. J Stat Theory Pract.
2016;10:239–262.
[21] Cochran WG. Sampling techniques. 3rd ed. New York (NY): John Wiley and Sons; 1977.