
Statistics

A Journal of Theoretical and Applied Statistics

ISSN: 0233-1888 (Print) 1029-4910 (Online) Journal homepage: https://www.tandfonline.com/loi/gsta20

An ameliorated stratified two-stage randomized response model for estimating the rare sensitive parameter under Poisson distribution

Surbhi Suman & G. N. Singh

To cite this article: Surbhi Suman & G. N. Singh (2019) An ameliorated stratified two-stage
randomized response model for estimating the rare sensitive parameter under Poisson
distribution, Statistics, 53:2, 395-416, DOI: 10.1080/02331888.2019.1569665

To link to this article: https://doi.org/10.1080/02331888.2019.1569665

Published online: 06 Feb 2019.

Surbhi Suman and G. N. Singh
Department of Applied Mathematics, Indian Institute of Technology (Indian School of Mines), Dhanbad, India

ABSTRACT
This manuscript presents a procedure for estimating the mean number of individuals possessing a rare sensitive characteristic when the population units are heterogeneous, with and without prior information on a supplementary (unrelated rare non-sensitive) characteristic. The rare stigmatized parameter is estimated using an ameliorated two-stage randomized response model under stratified sampling and stratified double sampling schemes. The properties of the suggested estimators are discussed under random, proportional and optimal allocations of the sample to the strata. Empirical studies show that the proposed estimators perform better than some contemporary competing estimators in similar situations.

ARTICLE HISTORY
Received 22 November 2017
Accepted 5 January 2019

KEYWORDS
Poisson distribution; rare sensitive attribute; unrelated attribute; stratified sampling; stratified double sampling; randomized response model

AMS SUBJECT CLASSIFICATION
62D99

1. Introduction
In real-life surveys, obtaining a response, let alone a reliable one, is an arduous task when potentially discomforting, deplorable or incriminating answers are sought because of the sensitive nature of the characteristic under study, such as use of illegal cannabis, sexual behaviour or mental disorder. The randomized response technique is effective in reducing the non-response rate and the response bias that arise from refusals and untruthful answers. This pioneering work was initiated by Warner [1], who used a randomized device bearing two questions, one on the sensitive characteristic A and the other on its complement, with sample units selected by simple random sampling with replacement. To alleviate the risk of privacy disclosure, Horvitz et al. [2] replaced the second question of Warner [1] by a question on a non-sensitive attribute B unrelated to the sensitive attribute A. Greenberg et al. [3] extended this unrelated-question model to the case where the proportion of the non-sensitive attribute B is unknown. The randomized response technique has since been modified for diverse circumstances by Moors [4], Fox and Tracy [5], Mangat and Singh [6], Mangat [7], Ryu et al. [8], Mangat et al. [9] and Singh et al. [10,11], among others; in all of these works the sample is drawn from the population by simple random sampling with replacement. Hong et al. [12] addressed a stratified randomized response model under proportional allocation. To improve this model, Kim and Warde [13] and Kim and Elam [14] proposed stratified randomized response models utilizing optimal allocation, and their work was further extended by Kim and Elam [15], Adebola et al. [16] and others.

CONTACT Surbhi Suman surbhi.iitism@yahoo.com Department of Applied Mathematics, Indian Institute of Technology (Indian School of Mines), Dhanbad 826004, India

© 2019 Informa UK Limited, trading as Taylor & Francis Group

When the number of persons possessing a rare stigmatized characteristic in the population is very small, a large sample is required to estimate it. Land [17] addressed this problem by using the Poisson distribution under the randomized response model of Greenberg et al. [3]. Building on the work in [17], Lee et al. [18] and [20] extended this approach to stratified sampling and stratified double sampling, according to whether the stratum sizes are known or unknown, respectively.
Motivated by the above-mentioned works, we suggest an ameliorated two-stage unrelated randomized response method for estimating the mean number of persons in the population with a rare sensitive characteristic under a stratified sampling scheme using the Poisson distribution. For the case where the stratum sizes are not available, the work is extended to a double stratified sampling scheme. The properties of the resultant estimators are discussed both when the proportion of the unrelated rare non-sensitive attribute is known and when it is unknown. Proportional and optimum allocation methods are considered in detail. Empirical studies are carried out to support the theory.

2. Proposed randomized response model under stratified sampling


Following the work of Singh and Suman [19] and Lee et al. [20], we suggest a new stratified unrelated randomized response model using the Poisson distribution for estimating the mean number of persons in the population having a rare stigmatized characteristic. The model applies when the population divides naturally into strata, so that the strata information can be used to obtain an estimator more efficient than one based on simple random sampling. Depending on the availability of prior knowledge about an unrelated rare non-sensitive attribute, the procedure is considered for two cases: the proportion of the unrelated rare non-sensitive attribute is known, and it is unknown.

2.1. When the proportion of an unrelated rare non-sensitive attribute (πb ) is known
Let a finite population of size $N$ be divided into $L$ strata of sizes $N_i$, $i = 1, 2, \ldots, L$. A sample of size $n_i$ is selected from the $i$th stratum by simple random sampling with replacement (SRSWR) such that the total sample size is $n = \sum_{i=1}^{L} n_i$. In this section, we assume that the stratum size $N_i$ and the proportion $\pi_{ib}$ of the unrelated rare non-sensitive attribute B in the $i$th stratum are known. Each person selected in the sample from the $i$th stratum is requested to answer 'yes' or 'no' using the randomized response devices $(R_{1i}, R_{2i})$, which consist of decks of cards (Table 1).

Table 1. Layout of the proposed stratified randomized response model (RRM).

First-stage randomized response device $R_{1i}$

| Outcome | Statement | Probability of selection |
|---|---|---|
| (1) | Do you possess the rare stigmatized characteristic A? | $U_i$ |
| (2) | Go to randomized device $R_{2i}$ | $1 - U_i$ |

Second-stage randomized response device $R_{2i}$

| Outcome | Statement | Probability of selection |
|---|---|---|
| (1) | Do you possess the rare stigmatized characteristic A? | $P_{1i}$ |
| (2) | Do you possess the unrelated rare non-sensitive attribute B? | $P_{2i}$ |
| (3) | Blank card | $P_{3i}$ |

If statement (3) appears, the respondent repeats the draw without replacing the card. If statement (3) appears again in the second draw, the interviewee reports 'no'.
Following the instructions of the above randomized response model, the probability of getting the answer 'yes' in the $i$th stratum is given as

\[
\zeta_{i0} = U_i \pi_{ia} + (1 - U_i)\,(P_{1i}\pi_{ia} + P_{2i}\pi_{ib})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right), \tag{1}
\]

where, for the $i$th stratum, $k_i$ is the total number of cards in device $R_{2i}$, and $\pi_{ia}$ and $\pi_{ib}$ are the population proportions of the attributes A and B, respectively. Since the attributes A and B are assumed to be very rare in the population, for a large sample $n_i$ from the $i$th stratum, i.e. as $n_i \to \infty$ and $\zeta_{i0} \to 0$, we have $n_i \zeta_{i0} = \lambda_{i0} > 0$. Therefore, Equation (1) is rewritten as

\[
\lambda_{i0} = U_i \lambda_{ia} + (1 - U_i)\,(P_{1i}\lambda_{ia} + P_{2i}\lambda_{ib})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right), \tag{2}
\]

where $n_i \pi_{ia} = \lambda_{ia} > 0$ and $n_i \pi_{ib} = \lambda_{ib} > 0$ as $\pi_{ia} \to 0$ and $\pi_{ib} \to 0$, respectively.
Let $x_{i1}, x_{i2}, \ldots, x_{in_i}$ be a random sample of $n_i$ observations from the $i$th stratum that follow the Poisson distribution with parameter $\lambda_{i0}$.
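The response mechanism behind Equation (1) can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names `yes_probability` and `simulate_response` are hypothetical, and the card counts `n1`, `n2`, `n3` are assumed to stand for $P_{1i}k_i$, $P_{2i}k_i$ and $P_{3i}k_i$. It evaluates Equation (1) and simulates a single respondent using a physical deck, so that the factor $1 + P_{3i}k_i/(k_i - 1)$ arises from the redraw without replacement.

```python
import random


def yes_probability(U, P1, P2, P3, k, pi_a, pi_b):
    """Probability of a 'yes' answer in one stratum, following Equation (1)."""
    redraw = 1.0 + P3 * k / (k - 1)   # factor from one redraw after a blank card
    return U * pi_a + (1.0 - U) * (P1 * pi_a + P2 * pi_b) * redraw


def simulate_response(U, n1, n2, n3, pi_a, pi_b, rng):
    """Simulate one respondent; n1, n2, n3 are card counts for statements (1)-(3)."""
    has_a = rng.random() < pi_a       # respondent possesses attribute A
    has_b = rng.random() < pi_b       # respondent possesses attribute B
    if rng.random() < U:              # first-stage device: direct question on A
        return has_a
    deck = [1] * n1 + [2] * n2 + [3] * n3
    card = deck.pop(rng.randrange(len(deck)))
    if card == 3:                     # blank card: draw again without replacement
        card = deck.pop(rng.randrange(len(deck)))
    if card == 1:
        return has_a
    if card == 2:
        return has_b
    return False                      # blank card twice: report 'no'
```

Averaging `simulate_response` over many simulated respondents should approach `yes_probability` evaluated with `P1 = n1 / k`, `P2 = n2 / k`, `P3 = n3 / k` and `k = n1 + n2 + n3`.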
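The likelihood function of the random sample of $n_i$ observations is obtained as

\[
L(x_{ij}, \lambda_{i0}) = \prod_{j=1}^{n_i} \frac{e^{-\lambda_{i0}}\,\lambda_{i0}^{x_{ij}}}{x_{ij}!}. \tag{3}
\]

The natural log-likelihood function is given by

\[
\begin{aligned}
\log L(x_{ij}, \lambda_{i0}) ={}& -n_i\left[U_i\lambda_{ia} + (1 - U_i)\,(P_{1i}\lambda_{ia} + P_{2i}\lambda_{ib})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right] \\
&+ \sum_{j=1}^{n_i} x_{ij}\,\log\left[U_i\lambda_{ia} + (1 - U_i)\,(P_{1i}\lambda_{ia} + P_{2i}\lambda_{ib})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right] \\
&- \sum_{j=1}^{n_i} \log(x_{ij}!).
\end{aligned}
\tag{4}
\]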

Maximizing Equation (4) with respect to the parameter $\lambda_{ia}$ and simplifying, the estimator $\hat{\lambda}_{ia}$ for the mean number of persons bearing the rare sensitive characteristic in the $i$th stratum is given as

\[
\hat{\lambda}_{ia} = \frac{1}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\left[\frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij} - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}\right]. \tag{5}
\]

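Therefore, the estimator $\hat{\lambda}_a$ for the mean number of persons in the population with the rare sensitive characteristic ($\lambda_a$) is proposed under the stratified population as

\[
\hat{\lambda}_a = \sum_{i=1}^{L} \frac{W_i}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\left[\frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij} - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}\right], \tag{6}
\]

where $W_i = N_i/N$.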
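Equations (5) and (6) translate directly into a few lines of code. The sketch below is illustrative only: the helper names `device_constants` and `lambda_a_hat`, the dictionary layout of a stratum and the sample counts are all assumptions made for this example and do not come from the paper.

```python
def device_constants(U, P1, P2, P3, k):
    """Return the coefficients of lambda_ia and lambda_ib in Equation (2)."""
    redraw = 1.0 + P3 * k / (k - 1)     # factor from one redraw after a blank card
    a = U + (1.0 - U) * P1 * redraw     # multiplies lambda_ia
    b = (1.0 - U) * P2 * redraw         # multiplies lambda_ib
    return a, b


def lambda_a_hat(strata):
    """Stratified estimator of Equation (6).

    Each stratum is a dict with keys W, U, P1, P2, P3, k, lam_b and x
    (hypothetical layout; x holds the observed counts x_ij).
    """
    total = 0.0
    for s in strata:
        a, b = device_constants(s["U"], s["P1"], s["P2"], s["P3"], s["k"])
        mean_x = sum(s["x"]) / len(s["x"])
        lam_ia_hat = (mean_x - b * s["lam_b"]) / a   # Equation (5)
        total += s["W"] * lam_ia_hat                  # weight by W_i = N_i / N
    return total


# Made-up counts for two strata, for illustration only
strata = [
    {"W": 0.6, "U": 0.3, "P1": 0.4, "P2": 0.4, "P3": 0.2, "k": 100,
     "lam_b": 1.0, "x": [1, 0, 2, 1, 0, 1]},
    {"W": 0.4, "U": 0.3, "P1": 0.4, "P2": 0.4, "P3": 0.2, "k": 100,
     "lam_b": 1.0, "x": [0, 1, 1, 0, 2, 0]},
]
estimate = lambda_a_hat(strata)
```

Because the estimator is linear in the stratum sample means, feeding it means equal to $\lambda_{i0}$ from Equation (2) returns $\sum_i W_i\lambda_{ia}$ exactly, which is the unbiasedness argument of Theorem 2.1 in numerical form.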

2.1.1. Properties of estimator λ̂a


The properties of the proposed estimator $\hat{\lambda}_a$ are discussed in the following theorems:

Theorem 2.1: The proposed estimator $\hat{\lambda}_a$ is an unbiased estimator of the parameter $\lambda_a$.

Proof: Since the random variable $x_{ij}$ follows the Poisson distribution with parameter $\lambda_{i0}$,

\[
E(\hat{\lambda}_a) = \sum_{i=1}^{L} \frac{W_i}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\left[\frac{1}{n_i}\sum_{j=1}^{n_i} \lambda_{i0} - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}\right].
\]

Substituting the value of $\lambda_{i0}$ from Equation (2) in the above expression and simplifying, we have

\[
E(\hat{\lambda}_a) = \sum_{i=1}^{L} W_i \lambda_{ia} = \lambda_a.
\]

Therefore, the suggested estimator $\hat{\lambda}_a$ is unbiased for the parameter $\lambda_a$.

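Theorem 2.2: The variance of the proposed estimator $\hat{\lambda}_a$ is given by

\[
V(\hat{\lambda}_a) = \sum_{i=1}^{L} W_i^2\left[\frac{\lambda_{ia}}{n_i\left\{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}} + \frac{(1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{n_i\left\{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}^2}\right]. \tag{7}
\]

Proof: The variance of the proposed estimator $\hat{\lambda}_a$ is derived as

\[
V(\hat{\lambda}_a) = V\left(\sum_{i=1}^{L} W_i \hat{\lambda}_{ia}\right) = \sum_{i=1}^{L} W_i^2\, V(\hat{\lambda}_{ia}). \tag{8}
\]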

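The variance of the estimator $\hat{\lambda}_{ia}$ is given by

\[
V(\hat{\lambda}_{ia}) = \frac{\frac{1}{n_i^2}\sum_{j=1}^{n_i} V(x_{ij})}{\left\{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}^2} = \frac{\sum_{j=1}^{n_i} \lambda_{i0}}{n_i^2\left\{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}^2}.
\]

Substituting the value of $\lambda_{i0}$ from Equation (2) in the above equation and after some algebraic simplifications, we have

\[
V(\hat{\lambda}_{ia}) = \frac{\lambda_{ia}}{n_i\left\{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}} + \frac{(1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{n_i\left\{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}^2}. \tag{9}
\]

Substituting the value of $V(\hat{\lambda}_{ia})$ from Equation (9) in Equation (8), we have the expression for the variance of the estimator $\hat{\lambda}_a$ as given in Equation (7).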

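Theorem 2.3: The unbiased estimate of the variance of the proposed estimator $\hat{\lambda}_a$ is given by

\[
\hat{V}(\hat{\lambda}_a) = \sum_{i=1}^{L} W_i^2\, \frac{\sum_{j=1}^{n_i} x_{ij}}{n_i^2\left\{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}^2}. \tag{10}
\]

Proof: Taking the expectation of both sides of Equation (10) and using $E(x_{ij}) = \lambda_{i0}$, since $x_{ij} \sim P(\lambda_{i0})$, it follows that $\hat{V}(\hat{\lambda}_a)$ is an unbiased estimator of $V(\hat{\lambda}_a)$.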
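To make Equations (7) and (10) concrete, the sketch below evaluates the variance for the design later used in the empirical section ($W_1 = 0.6$, $W_2 = 0.4$, $n_1 = 6000$, $n_2 = 4000$, $k_i = 100$) with one illustrative choice of device parameters. The function names are hypothetical, and the code assumes identical device settings in every stratum purely to keep the example short.

```python
def device_constants(U, P1, P2, P3, k):
    """Coefficients of lambda_ia and lambda_ib in Equation (2)."""
    redraw = 1.0 + P3 * k / (k - 1)
    return U + (1.0 - U) * P1 * redraw, (1.0 - U) * P2 * redraw


def variance_lambda_a(W, n, lam_a, lam_b, U, P1, P2, P3, k):
    """Equation (7), assuming the same device settings in all strata."""
    a, b = device_constants(U, P1, P2, P3, k)
    return sum(w ** 2 * (la / (ni * a) + b * lb / (ni * a ** 2))
               for w, ni, la, lb in zip(W, n, lam_a, lam_b))


def variance_estimate(W, samples, U, P1, P2, P3, k):
    """Equation (10): estimate of the variance from the observed counts x_ij."""
    a, _ = device_constants(U, P1, P2, P3, k)
    return sum(w ** 2 * sum(x) / (len(x) ** 2 * a ** 2)
               for w, x in zip(W, samples))


# Design of the empirical study with U = 0.3, (P1, P2, P3) = (0.4, 0.4, 0.2),
# lambda_1a = lambda_2a = 0.5 and lambda_1b = lambda_2b = 1.0
v = variance_lambda_a([0.6, 0.4], [6000, 4000], [0.5, 0.5], [1.0, 1.0],
                      0.3, 0.4, 0.4, 0.2, 100)
```

Replacing every $x_{ij}$ in `variance_estimate` by its expectation $\lambda_{i0}$ reproduces `variance_lambda_a`, which mirrors the proof of Theorem 2.3.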

2.1.2. Allocation of sample size and variance under different systems of allocation
The variance of the estimator given in Equation (7) is a function of the sample sizes $n_i$; that is, the precision of the proposed estimator under stratified sampling depends on the sample size $n_i$ selected from the $i$th stratum ($i = 1, 2, \ldots, L$). The choice of allocation method for selecting the sample from the different strata depends on the availability of prior information on the stratum variances.
(I) Proportional allocation: When the stratum sizes $N_i$ are known but the stratum variances are unknown, proportional allocation is used to draw the sample from the strata. In proportional allocation $n_i \propto N_i$, so that $n_i = n N_i / N$, and the variance of the proposed estimator under proportional allocation is derived as

\[
V(\hat{\lambda}_a)_{p.} = \frac{1}{n}\sum_{i=1}^{L} W_i\left[\frac{\lambda_{ia}}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)} + \frac{(1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{\left\{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}^2}\right]. \tag{11}
\]
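(II) Optimum allocation: This method determines the sample sizes by minimizing the variance for a given cost, or by minimizing the cost for a given variance. The cost function under stratified sampling [21] is defined as

\[
C = c_0 + \sum_{i=1}^{L} n_i c_i, \tag{12}
\]

where $c_0$ denotes the overhead cost and $c_i$ is the survey cost per unit in the $i$th stratum.
Under optimum allocation, the sample size $n_i$ from the $i$th stratum is given by

\[
n_i = n\, \frac{W_i\sqrt{\eta_i}/\sqrt{c_i}}{\sum_{i=1}^{L} W_i\sqrt{\eta_i}/\sqrt{c_i}}, \tag{13}
\]

where

\[
\eta_i = \frac{\lambda_{ia}}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)} + \frac{(1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{\left\{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}^2},
\]

and the variance of the estimator $\hat{\lambda}_a$ is given as

\[
V(\hat{\lambda}_a)_{opt.} = \frac{1}{n}\left[\sum_{i=1}^{L} W_i\sqrt{\eta_i c_i}\right]\left[\sum_{i=1}^{L} \frac{W_i\sqrt{\eta_i}}{\sqrt{c_i}}\right]. \tag{14}
\]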
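The two allocation rules can be compared with a short sketch. The function names below are hypothetical, the allocations are returned as real numbers without rounding to integers, and the values of $\eta_i$ and $c_i$ used in any call are inputs the user must supply; none of this is prescribed by the paper.

```python
import math


def proportional_allocation(n, W):
    """n_i = n * W_i, with W_i = N_i / N."""
    return [n * w for w in W]


def optimum_allocation(n, W, eta, c):
    """Equation (13): n_i proportional to W_i * sqrt(eta_i) / sqrt(c_i)."""
    weights = [w * math.sqrt(e) / math.sqrt(ci) for w, e, ci in zip(W, eta, c)]
    total = sum(weights)
    return [n * wt / total for wt in weights]


def stratified_variance(W, eta, n_alloc):
    """Sum of W_i^2 * eta_i / n_i, the form of Equation (7)."""
    return sum(w ** 2 * e / ni for w, e, ni in zip(W, eta, n_alloc))


def optimum_variance(n, W, eta, c):
    """Closed form of Equation (14)."""
    first = sum(w * math.sqrt(e * ci) for w, e, ci in zip(W, eta, c))
    second = sum(w * math.sqrt(e) / math.sqrt(ci) for w, e, ci in zip(W, eta, c))
    return first * second / n
```

With equal unit costs, substituting the optimum allocation into `stratified_variance` never gives a larger value than the proportional allocation, since $\left(\sum_i W_i\sqrt{\eta_i}\right)^2 \le \sum_i W_i\eta_i$ when the weights sum to one.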

2.2. When the proportion of an unrelated rare non-sensitive attribute (πb) is unknown
In this section, we suggest estimators for the mean number of persons in the population bearing the rare sensitive characteristic using the Poisson distribution when the true proportion of the unrelated rare non-sensitive attribute B is unknown but the strata sizes are known in advance. For this, the $n_i$ individuals selected by SRSWR from the $i$th stratum are requested to answer the same question twice. Two sets of randomized devices, $(R_{11i}, R_{12i})$ and $(R_{21i}, R_{22i})$, are used by the respondents in the $i$th stratum. The interviewee first answers the question using the first set of randomized response devices $(R_{11i}, R_{12i})$ and later answers the same question using the second set of randomized response devices $(R_{21i}, R_{22i})$ (Tables 2 and 3).

Table 2. Layout of the proposed stratified RRM for giving the answer the first time.

First-stage randomized response device $R_{11i}$

| Outcome | Statement | Probability of selection |
|---|---|---|
| (1) | Do you possess the rare stigmatized characteristic A? | $U_{1i}$ |
| (2) | Go to randomized device $R_{12i}$ | $1 - U_{1i}$ |

Second-stage randomized response device $R_{12i}$

| Outcome | Statement | Probability of selection |
|---|---|---|
| (1) | Do you possess the rare stigmatized characteristic A? | $P_{1i}$ |
| (2) | Do you possess the unrelated rare non-sensitive attribute B? | $P_{2i}$ |
| (3) | Blank card | $P_{3i}$ |

Table 3. Layout of the proposed stratified RRM for giving the answer the second time.

First-stage randomized response device $R_{21i}$

| Outcome | Statement | Probability of selection |
|---|---|---|
| (1) | Do you possess the rare stigmatized characteristic A? | $U_{2i}$ |
| (2) | Go to randomized device $R_{22i}$ | $1 - U_{2i}$ |

Second-stage randomized response device $R_{22i}$

| Outcome | Statement | Probability of selection |
|---|---|---|
| (1) | Do you possess the rare stigmatized characteristic A? | $Q_{1i}$ |
| (2) | Do you possess the unrelated rare non-sensitive attribute B? | $Q_{2i}$ |
| (3) | Blank card | $Q_{3i}$ |

If statement (3) is selected by the respondent, the draw is repeated without replacing the card. If statement (3) appears again in the second draw, the respondent is asked to report 'no'.
The probabilities of getting the answer 'yes' from the respondent using the above randomized response devices are

\[
\zeta_{i1} = U_{1i}\pi_{ia} + (1 - U_{1i})\,(P_{1i}\pi_{ia} + P_{2i}\pi_{ib})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)
\]

and

\[
\zeta_{i2} = U_{2i}\pi_{ia} + (1 - U_{2i})\,(Q_{1i}\pi_{ia} + Q_{2i}\pi_{ib})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right).
\]

For $n_i \to \infty$, as $\zeta_{i1} \to 0$ and $\zeta_{i2} \to 0$, we have $n_i\zeta_{i1} = \lambda^*_{ia} > 0$ and $n_i\zeta_{i2} = \lambda^*_{ib} > 0$. Let $x_{i11}, x_{i12}, \ldots, x_{i1n_i}$ and $x_{i21}, x_{i22}, \ldots, x_{i2n_i}$ be random samples of $n_i$ observations from the $i$th stratum that follow the Poisson distribution with parameters $\lambda^*_{ia} > 0$ and $\lambda^*_{ib} > 0$, respectively. Proceeding in a similar fashion as in the previous section, we get

\[
\frac{1}{n_i}\sum_{j=1}^{n_i} x_{i1j} = U_{1i}\hat{\lambda}_{ia} + (1 - U_{1i})\left(P_{1i}\hat{\lambda}_{ia} + P_{2i}\hat{\lambda}_{ib}\right)\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right) \tag{15}
\]

and

\[
\frac{1}{n_i}\sum_{j=1}^{n_i} x_{i2j} = U_{2i}\hat{\lambda}_{ia} + (1 - U_{2i})\left(Q_{1i}\hat{\lambda}_{ia} + Q_{2i}\hat{\lambda}_{ib}\right)\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right). \tag{16}
\]

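Solving Equations (15) and (16), the estimators for the mean number of persons in the population possessing the rare sensitive attribute A and the non-sensitive attribute B in the $i$th stratum are suggested as follows:

\[
\hat{\lambda}_{iau} = \frac{1}{n_i\Delta_{i1}}\left[Q_{2i}(1 - U_{2i})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\sum_{j=1}^{n_i} x_{i1j} - (1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\sum_{j=1}^{n_i} x_{i2j}\right], \tag{17}
\]

where

\[
\begin{aligned}
\Delta_{i1} ={}& U_{1i}(1 - U_{2i})Q_{2i}\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right) - U_{2i}(1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right) \\
&+ (1 - U_{1i})(1 - U_{2i})(P_{1i}Q_{2i} - P_{2i}Q_{1i})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right) \neq 0
\end{aligned}
\]

and

\[
\hat{\lambda}_{ibu} = \frac{1}{n_i\Delta_{i2}}\left[\left\{U_{2i} + (1 - U_{2i})Q_{1i}\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\right\}\sum_{j=1}^{n_i} x_{i1j} - \left\{U_{1i} + (1 - U_{1i})P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}\sum_{j=1}^{n_i} x_{i2j}\right], \tag{18}
\]

where

\[
\begin{aligned}
\Delta_{i2} ={}& U_{2i}(1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right) - U_{1i}(1 - U_{2i})Q_{2i}\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right) \\
&+ (1 - U_{1i})(1 - U_{2i})(P_{2i}Q_{1i} - P_{1i}Q_{2i})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right) \neq 0.
\end{aligned}
\]

The coefficient of $\sum_{j} x_{i2j}$ in Equation (18) is the coefficient of $\hat{\lambda}_{ia}$ in Equation (15), and hence involves $P_{1i}$.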

From Equations (17) and (18), we propose the estimators $\hat{\lambda}_{au}$ and $\hat{\lambda}_{bu}$ for the mean number of persons in the population possessing the rare sensitive attribute A and the non-sensitive attribute B, respectively, as follows:

\[
\hat{\lambda}_{au} = \sum_{i=1}^{L} W_i\hat{\lambda}_{iau} \tag{19}
\]

and

\[
\hat{\lambda}_{bu} = \sum_{i=1}^{L} W_i\hat{\lambda}_{ibu}, \tag{20}
\]

where $\hat{\lambda}_{iau}$ and $\hat{\lambda}_{ibu}$ are derived in Equations (17) and (18), respectively.
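Equations (15) and (16) form a 2 x 2 linear system in $\hat{\lambda}_{ia}$ and $\hat{\lambda}_{ib}$, and Equations (17) and (18) are its solution. A minimal sketch for a single stratum follows; the names `coefficients` and `estimate_unknown_b` and the tuple layout `(U, S1, S2, S3, k)` for a device are assumptions made for this example.

```python
def coefficients(U, S1, S2, S3, k):
    """Coefficients of lambda_ia and lambda_ib for one two-stage device.

    S1, S2, S3 are the second-stage selection probabilities
    (P's for the first answer, Q's for the second answer).
    """
    redraw = 1.0 + S3 * k / (k - 1)
    return U + (1.0 - U) * S1 * redraw, (1.0 - U) * S2 * redraw


def estimate_unknown_b(mean1, mean2, dev1, dev2):
    """Solve Equations (15)-(16) for one stratum.

    mean1 and mean2 are the sample means of the first and second responses;
    dev1 and dev2 are (U, S1, S2, S3, k) tuples for the two device sets.
    """
    a1, a2 = coefficients(*dev1)
    b1, b2 = coefficients(*dev2)
    delta1 = a1 * b2 - a2 * b1            # Delta_i1; Delta_i2 = -Delta_i1
    if abs(delta1) < 1e-12:
        raise ValueError("devices are not distinguishable: Delta_i1 is zero")
    lam_a = (b2 * mean1 - a2 * mean2) / delta1       # Equation (17)
    lam_b = (b1 * mean1 - a1 * mean2) / (-delta1)    # Equation (18)
    return lam_a, lam_b
```

The stratified estimators of Equations (19) and (20) are then the $W_i$-weighted sums of these per-stratum values. The guard on `delta1` reflects the requirement $\Delta_{i1} \neq 0$: the two device sets must differ enough for the system to be solvable.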

2.2.1. Properties of the estimators λ̂au and λ̂bu


The properties of the estimators λ̂au and λ̂bu are given in the following theorems.

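Theorem 2.4: The proposed estimators $\hat{\lambda}_{au}$ and $\hat{\lambda}_{bu}$ are unbiased for the parameters $\lambda_a$ and $\lambda_b$, respectively.

Proof:

\[
E(\hat{\lambda}_{au}) = \sum_{i=1}^{L} W_i\,E(\hat{\lambda}_{iau}). \tag{21}
\]

We consider

\[
E(\hat{\lambda}_{iau}) = \frac{1}{n_i\Delta_{i1}}\left[Q_{2i}(1 - U_{2i})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\sum_{j=1}^{n_i} E(x_{i1j}) - (1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\sum_{j=1}^{n_i} E(x_{i2j})\right].
\]

Since

\[
x_{i1j} \sim P(\lambda^*_{ia}) \;\Rightarrow\; E(x_{i1j}) = \lambda^*_{ia} = U_{1i}\lambda_{ia} + (1 - U_{1i})\,(P_{1i}\lambda_{ia} + P_{2i}\lambda_{ib})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)
\]

and

\[
x_{i2j} \sim P(\lambda^*_{ib}) \;\Rightarrow\; E(x_{i2j}) = \lambda^*_{ib} = U_{2i}\lambda_{ia} + (1 - U_{2i})\,(Q_{1i}\lambda_{ia} + Q_{2i}\lambda_{ib})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right),
\]

we get $E(\hat{\lambda}_{iau}) = \lambda_{ia}$.
Putting the value of $E(\hat{\lambda}_{iau})$ in Equation (21), we get $E(\hat{\lambda}_{au}) = \lambda_a$.
In a similar manner, we may show that $E(\hat{\lambda}_{bu}) = \lambda_b$.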

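Theorem 2.5: The variances of the estimators $\hat{\lambda}_{au}$ and $\hat{\lambda}_{bu}$ are given as

\[
V(\hat{\lambda}_{au}) = \sum_{i=1}^{L} \frac{W_i^2}{\Delta_{i1}^2\, n_i}\,\eta_{i1} \tag{22}
\]

and

\[
V(\hat{\lambda}_{bu}) = \sum_{i=1}^{L} \frac{W_i^2}{\Delta_{i2}^2\, n_i}\,\eta_{i2}, \tag{23}
\]

where

\[
\eta_{i1} = \left(a_{1i}b_{2i}^2 + a_{2i}^2 b_{1i} - 2a_{1i}a_{2i}b_{1i}b_{2i}\right)\lambda_{ia} + \left(a_{2i}^2 b_{2i} + b_{2i}^2 a_{2i} - 2a_{2i}^2 b_{2i}^2\right)\lambda_{ib},
\]

\[
\eta_{i2} = \left(a_{1i}^2 b_{1i} + b_{1i}^2 a_{1i} - 2a_{1i}^2 b_{1i}^2\right)\lambda_{ia} + \left(a_{2i}b_{1i}^2 + a_{1i}^2 b_{2i} - 2a_{1i}a_{2i}b_{1i}b_{2i}\right)\lambda_{ib},
\]

\[
a_{1i} = U_{1i} + (1 - U_{1i})P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right), \qquad a_{2i} = P_{2i}(1 - U_{1i})\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right),
\]

\[
b_{1i} = U_{2i} + (1 - U_{2i})Q_{1i}\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right) \quad\text{and}\quad b_{2i} = Q_{2i}(1 - U_{2i})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right).
\]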

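Proof: The variance of the proposed estimator $\hat{\lambda}_{au}$ is derived as

\[
V(\hat{\lambda}_{au}) = \sum_{i=1}^{L} W_i^2\,V(\hat{\lambda}_{iau}). \tag{24}
\]

The variance of the estimator $\hat{\lambda}_{iau}$ is obtained as

\[
\begin{aligned}
V(\hat{\lambda}_{iau}) = \frac{1}{n_i\Delta_{i1}^2}\Bigg[&\left\{Q_{2i}(1 - U_{2i})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\right\}^2\lambda^*_a + \left\{(1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}^2\lambda^*_b \\
&- 2P_{2i}Q_{2i}(1 - U_{2i})(1 - U_{1i})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda^*_{ab}\Bigg],
\end{aligned}
\tag{25}
\]

where

\[
\lambda^*_a = V(x_{i1j}) = \left\{U_{1i} + (1 - U_{1i})P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}\lambda_{ia} + (1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib},
\]

\[
\lambda^*_b = V(x_{i2j}) = \left\{U_{2i} + (1 - U_{2i})Q_{1i}\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\right\}\lambda_{ia} + Q_{2i}(1 - U_{2i})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}
\]

and

\[
\begin{aligned}
\lambda^*_{ab} = \mathrm{Cov}(x_{i1j}, x_{i2j}) ={}& \left\{U_{1i} + (1 - U_{1i})P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}\left\{U_{2i} + (1 - U_{2i})Q_{1i}\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\right\}\lambda_{ia} \\
&+ \left\{(1 - U_{1i})P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\right\}\left\{Q_{2i}(1 - U_{2i})\left(1 + Q_{3i}\frac{k_i}{k_i - 1}\right)\right\}\lambda_{ib}.
\end{aligned}
\]

Here $\lambda^*_a$ and $\lambda^*_b$ coincide with the Poisson parameters $\lambda^*_{ia}$ and $\lambda^*_{ib}$ of the first and second responses in the $i$th stratum.
Putting the values of $\lambda^*_a$, $\lambda^*_b$ and $\lambda^*_{ab}$ in Equation (25) and after some algebraic simplifications, we get the expression for the variance of the estimator $\hat{\lambda}_{au}$ as given in Equation (22).
In a similar fashion, we may derive the expression for the variance of the estimator $\hat{\lambda}_{bu}$ as given in Equation (23).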
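The "algebraic simplifications" that lead from Equation (25) to $\eta_{i1}$ in Equation (22) can be verified numerically. The sketch below uses hypothetical function names and takes the constants $a_{1i}, a_{2i}, b_{1i}, b_{2i}$ of Theorem 2.5 as plain numbers; it computes $\eta_{i1}$ both from its closed form and from the three moments $\lambda^*_a$, $\lambda^*_b$ and $\lambda^*_{ab}$.

```python
def eta_1(a1, a2, b1, b2, lam_a, lam_b):
    """Closed form of eta_i1 from Theorem 2.5."""
    return ((a1 * b2 ** 2 + a2 ** 2 * b1 - 2 * a1 * a2 * b1 * b2) * lam_a
            + (a2 ** 2 * b2 + b2 ** 2 * a2 - 2 * a2 ** 2 * b2 ** 2) * lam_b)


def eta_1_from_moments(a1, a2, b1, b2, lam_a, lam_b):
    """Bracketed term of Equation (25), built from the moments of the responses."""
    var1 = a1 * lam_a + a2 * lam_b              # lambda*_a  = V(x_i1j)
    var2 = b1 * lam_a + b2 * lam_b              # lambda*_b  = V(x_i2j)
    cov12 = a1 * b1 * lam_a + a2 * b2 * lam_b   # lambda*_ab = Cov(x_i1j, x_i2j)
    return b2 ** 2 * var1 + a2 ** 2 * var2 - 2 * a2 * b2 * cov12


def variance_lambda_au(W, n, constants, lam_a, lam_b):
    """Equation (22); constants is a list of (a1, a2, b1, b2) tuples per stratum."""
    total = 0.0
    for w, ni, (a1, a2, b1, b2), la, lb in zip(W, n, constants, lam_a, lam_b):
        delta1 = a1 * b2 - a2 * b1              # Delta_i1 in terms of the constants
        total += w ** 2 * eta_1(a1, a2, b1, b2, la, lb) / (delta1 ** 2 * ni)
    return total
```

Both routes to $\eta_{i1}$ agree for any admissible inputs, which is the content of the simplification step in the proof.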

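Theorem 2.6: The unbiased estimates of the variances $V(\hat{\lambda}_{au})$ and $V(\hat{\lambda}_{bu})$ are given by

\[
\hat{V}(\hat{\lambda}_{au}) = \sum_{i=1}^{L} \frac{W_i^2}{\Delta_{i1}^2\, n_i}\,\hat{\eta}_{i1} \tag{26}
\]

and

\[
\hat{V}(\hat{\lambda}_{bu}) = \sum_{i=1}^{L} \frac{W_i^2}{\Delta_{i2}^2\, n_i}\,\hat{\eta}_{i2}, \tag{27}
\]

where $\hat{\eta}_{i1}$ and $\hat{\eta}_{i2}$ are the sample estimates of $\eta_{i1}$ and $\eta_{i2}$, respectively.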

2.2.2. Allocation of sample size and variance under different systems of allocation
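Theorem 2.7: The variances of the estimators $\hat{\lambda}_{au}$ and $\hat{\lambda}_{bu}$ under proportional allocation are given as

\[
V(\hat{\lambda}_{au})_{p.} = \sum_{i=1}^{L} \frac{W_i}{\Delta_{i1}^2\, n}\,\eta_{i1} \tag{28}
\]

and

\[
V(\hat{\lambda}_{bu})_{p.} = \sum_{i=1}^{L} \frac{W_i}{\Delta_{i2}^2\, n}\,\eta_{i2}. \tag{29}
\]

Theorem 2.8: Under optimum allocation, the sample size from the $i$th stratum is

\[
n_i = n\,\frac{\dfrac{W_i\sqrt{\eta_{i1}}/\Delta_{i1}}{\sqrt{c_i}}}{\displaystyle\sum_{i=1}^{L}\dfrac{W_i\sqrt{\eta_{i1}}/\Delta_{i1}}{\sqrt{c_i}}}
\]

and the variance of the estimator $\hat{\lambda}_{au}$ is given as

\[
V(\hat{\lambda}_{au})_{opt.} = \frac{1}{n}\left[\sum_{i=1}^{L} W_i\,\frac{\sqrt{\eta_{i1}}}{\Delta_{i1}}\sqrt{c_i}\right]\left[\sum_{i=1}^{L}\frac{W_i\sqrt{\eta_{i1}}/\Delta_{i1}}{\sqrt{c_i}}\right]. \tag{30}
\]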

3. Proposed randomized response model under double stratified sampling
When the population is composed of strata owing to the heterogeneous nature of its units and information on the strata sizes is not available, the stratified double sampling scheme is used to obtain reliable estimates of the true population parameters. In this section, the two-stage unrelated randomized response model and the resultant estimators are proposed to estimate the mean number of persons in the population possessing the rare sensitive attribute using the Poisson distribution under the double stratified sampling scheme.
In this procedure, the first sample is used to classify the units into the different strata, and the second sample is used to estimate the parameter $\lambda_a$ using the two-stage unrelated randomized response model (as discussed in the previous section). The first sample of size $n'$ is selected by simple random sampling with replacement, and the respondents selected in this sample are asked the direct question 'Do you belong to the $i$th stratum?'; thus the first sample is classified into $L$ strata of sizes $n'_i$ ($i = 1, 2, \ldots, L$). The stratum weights $W_i$ and $w_i$ are defined as follows:

\[
W_i = \frac{N_i}{N} \quad\text{and}\quad w_i = \frac{n'_i}{n'} \qquad (i = 1, 2, \ldots, L).
\]

It is obvious that $w_i$ is an unbiased estimator of $W_i$.
In the second phase, $n_i$ respondents are selected by SRSWR from the $n'_i$ first-phase units in the $i$th stratum.

3.1. When the proportion of an unrelated rare non-sensitive attribute (πb ) is known
When the proportion of a rare unrelated non-sensitive attribute is known, the randomized device described in Section 2.1 is utilized to obtain the responses from the respondents. Following the procedure in Section 2.1, the estimator for the parameter $\lambda_a$ is obtained as

\[
\hat{\lambda}_{ad} = \sum_{i=1}^{L} w_i\,\frac{1}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\left[\frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij} - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}\right]. \tag{31}
\]

3.1.1. Properties of the estimator λ̂ad


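Theorem 3.1: The estimator $\hat{\lambda}_{ad}$ is an unbiased estimator of the parameter $\lambda_a$.

Proof: We consider

\[
\begin{aligned}
E(\hat{\lambda}_{ad}) &= E\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{ia}\right) = E_1\left[\sum_{i=1}^{L} w_i\,E_2\!\left(\hat{\lambda}_{ia} \mid w_i\right)\right] \\
&= E_1\left[\sum_{i=1}^{L} w_i\,\frac{\frac{1}{n_i}\sum_{j=1}^{n_i} E_2(x_{ij} \mid w_i) - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\right] \\
&= E_1\left[\sum_{i=1}^{L} w_i\,\frac{\frac{1}{n_i}\sum_{j=1}^{n_i} \lambda_{i0} - (1 - U_i)P_{2i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\lambda_{ib}}{U_i + (1 - U_i)P_{1i}\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}\right] \\
&= \sum_{i=1}^{L} W_i\lambda_{ia} = \lambda_a.
\end{aligned}
\]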

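Theorem 3.2: The variance of the estimator $\hat{\lambda}_{ad}$ is given by

\[
V(\hat{\lambda}_{ad}) = \frac{1}{n'}\left[\sum_{i=1}^{L} W_i\eta_i + \sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2\right] + \sum_{i=1}^{L}\frac{W_i}{n'}\left(\frac{1}{\upsilon_i} - 1\right)\eta_i, \tag{32}
\]

where $\upsilon_i = n_i/n'_i$ is a fixed constant for the $i$th stratum and $\eta_i$ is defined below Equation (13).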

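Proof: The variance of the estimator $\hat{\lambda}_{ad}$ is obtained as

\[
V(\hat{\lambda}_{ad}) = E_1\left[V_2\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{ia}\right)\right] + V_1\left[E_2\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{ia}\right)\right]. \tag{33}
\]

Now, consider the first term of Equation (33), which is simplified as

\[
E_1\left[V_2\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{ia}\right)\right] = \frac{1}{n'}\sum_{i=1}^{L} W_i\eta_i\left(\frac{1}{\upsilon_i} - 1\right) + \frac{1}{n'}\sum_{i=1}^{L} W_i\eta_i. \tag{34}
\]

The second term of Equation (33) is simplified as

\[
V_1\left[E_2\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{ia}\right)\right] = \frac{1}{n'}\sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2. \tag{35}
\]

Putting the expressions for the first and second terms from Equations (34) and (35) in Equation (33), we obtain the expression for the variance of the estimator $\hat{\lambda}_{ad}$ as given in Equation (32).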
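Equation (32) splits the variance into a within-strata part, a between-strata part and a subsampling part. A minimal sketch of that decomposition is given below; the function name and argument layout are hypothetical, and the $\eta_i$ values are assumed to have been computed beforehand from their definition below Equation (13).

```python
def variance_double_sampling(n_first, W, eta, lam_a_strata, upsilon):
    """Equation (32) for the double stratified sampling estimator.

    n_first      : first-phase sample size n'
    W            : stratum weights W_i
    eta          : eta_i values (defined below Equation (13))
    lam_a_strata : stratum parameters lambda_ia
    upsilon      : second-phase sampling fractions n_i / n'_i
    """
    lam_a = sum(w * la for w, la in zip(W, lam_a_strata))
    within = sum(w * e for w, e in zip(W, eta))
    between = sum(w * (la - lam_a) ** 2 for w, la in zip(W, lam_a_strata))
    subsample = sum(w * (1.0 / u - 1.0) * e for w, u, e in zip(W, upsilon, eta))
    return (within + between + subsample) / n_first
```

Setting every `upsilon` entry to the common value $n/n'$ collapses the first and third components into $\frac{1}{n}\sum_i W_i\eta_i$, so the result has the two-term form: a between-strata term divided by $n'$ plus a within-strata term divided by $n$.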

3.1.2. Allocation of sample size and variance under different systems of allocation
In double stratified sampling under proportional allocation, the second-phase sample size $n_i$ is determined using the first-phase sample sizes $n'$ and $n'_i$.

Lemma 3.3: Under proportional allocation, the sample size $n_i$ is given as

\[
n_i = n\,\frac{n'_i}{n'}
\]

and the variance of $\hat{\lambda}_{ad}$ is given by

\[
V(\hat{\lambda}_{ad})_{p.} = \frac{1}{n'}\sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2 + \frac{1}{n}\sum_{i=1}^{L} W_i\eta_i. \tag{36}
\]

Under optimum allocation, the cost function is given as

\[
C = c'n' + \sum_{i=1}^{L} c_i n_i, \tag{37}
\]

where $c_i$ is the cost per unit in the $i$th stratum and $c'$ is the cost per unit of the preliminary (first-phase) sample.

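Here, $n_i$ is a random variable; therefore, the expectation of the cost function is taken in order to optimize the values of $\upsilon_i$ over the different strata and $n'$. Since $E(n_i) = \upsilon_i E(n'_i) = n' W_i\upsilon_i$,

\[
E(C) = C^* = c'n' + \sum_{i=1}^{L} c_i E(n_i) = c'n' + n'\sum_{i=1}^{L} c_i W_i\upsilon_i. \tag{38}
\]

Using the Cauchy–Schwarz inequality, the optimum value of $\upsilon_i$ is given by

\[
\upsilon_i = \sqrt{\frac{c'}{c_i}\times\frac{\eta_i}{\sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2}}. \tag{39}
\]

Putting the optimum value of $\upsilon_i$ in Equation (38), the optimum value of the sample size $n'$ is obtained as

\[
n' = \frac{C^*}{c' + \sum_{i=1}^{L} W_i c_i\upsilon_i}. \tag{40}
\]

Lemma 3.4: Under optimum allocation, the variance of the estimator $\hat{\lambda}_{ad}$ is given by

\[
V(\hat{\lambda}_{ad})_{opt.} = \frac{1}{C^*}\left[\sqrt{c'\sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2} + \sum_{i=1}^{L} W_i\sqrt{c_i\eta_i}\right]^2. \tag{41}
\]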
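The optimum design of Equations (39) to (41) can be computed in a few lines. This is a sketch under stated assumptions: the function name `optimum_double_sampling` is hypothetical, the strata are assumed to differ so that $\sum_i W_i(\lambda_{ia} - \lambda_a)^2 > 0$, and the returned fractions $\upsilon_i$ are not clipped to the range $(0, 1]$.

```python
import math


def optimum_double_sampling(C_star, c_first, c, W, eta, lam_a_strata):
    """Optimum upsilon_i, first-phase size n' and variance (Equations (39)-(41)).

    C_star  : expected total budget C*
    c_first : cost per first-phase unit c'
    c       : cost per second-phase unit c_i in each stratum
    """
    lam_a = sum(w * la for w, la in zip(W, lam_a_strata))
    between = sum(w * (la - lam_a) ** 2 for w, la in zip(W, lam_a_strata))
    if between <= 0.0:
        raise ValueError("strata means must differ for Equation (39)")
    # Equation (39); values above 1 are returned as they are, without clipping
    upsilon = [math.sqrt(c_first * e / (ci * between)) for e, ci in zip(eta, c)]
    # Equation (40)
    n_first = C_star / (c_first + sum(w * ci * u for w, ci, u in zip(W, c, upsilon)))
    # Equation (41)
    v_opt = (math.sqrt(c_first * between)
             + sum(w * math.sqrt(ci * e) for w, ci, e in zip(W, c, eta))) ** 2 / C_star
    return upsilon, n_first, v_opt
```

Substituting the returned `upsilon` and `n_first` into the general variance formula of Theorem 3.2 reproduces `v_opt`, which is a useful consistency check on Lemma 3.4.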

3.2. When the proportion of an unrelated rare non-sensitive attribute (πb) is unknown
In this section, we consider the problem of estimating the mean number of persons bearing the sensitive attribute when neither the proportion of the rare unrelated non-sensitive attribute nor the sizes of the strata are available at the time of the survey. Proceeding in a similar way as in Section 2.2, the unbiased estimators $\hat{\lambda}_{aud}$ and $\hat{\lambda}_{bud}$ of the parameters $\lambda_a$ and $\lambda_b$ are obtained as

\[
\hat{\lambda}_{aud} = \sum_{i=1}^{L} w_i\hat{\lambda}_{iau} \tag{42}
\]

and

\[
\hat{\lambda}_{bud} = \sum_{i=1}^{L} w_i\hat{\lambda}_{ibu}, \tag{43}
\]

where $\hat{\lambda}_{iau}$ and $\hat{\lambda}_{ibu}$ are derived in Equations (17) and (18), respectively.

3.2.1. Properties of the estimators λ̂aud and λ̂bud


Theorem 3.5: The estimators λ̂aud and λ̂bud are unbiased for the parameters λa and λb ,
respectively.

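Proof: Since $E_2(\hat{\lambda}_{iau}) = \lambda_{ia}$ by Theorem 2.4,

\[
E(\hat{\lambda}_{aud}) = E_1\left[E_2\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{iau}\right)\right] = E_1\left[\sum_{i=1}^{L} w_i\lambda_{ia}\right] = \sum_{i=1}^{L} W_i\lambda_{ia} = \lambda_a.
\]

In a similar way, $E(\hat{\lambda}_{bud}) = \lambda_b$.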

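Theorem 3.6: The variance of the estimator $\hat{\lambda}_{aud}$ is given as

\[
V(\hat{\lambda}_{aud}) = \frac{1}{n'}\left[\sum_{i=1}^{L} \frac{W_i}{\Delta_{i1}^2}\,\eta_{i1} + \sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2\right] + \sum_{i=1}^{L}\frac{W_i}{n'}\left(\frac{1}{\upsilon_i} - 1\right)\frac{\eta_{i1}}{\Delta_{i1}^2}, \tag{44}
\]

where $\upsilon_i = n_i/n'_i$ is a fixed constant for the $i$th stratum and $\eta_{i1}$ is defined in Theorem 2.5 (Equation (22)).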

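Proof: The variance of the estimator $\hat{\lambda}_{aud}$ is obtained as

\[
V(\hat{\lambda}_{aud}) = E_1\left[V_2\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{iau}\right)\right] + V_1\left[E_2\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{iau}\right)\right]. \tag{45}
\]

Now, consider the first term of Equation (45), which is simplified as

\[
E_1\left[V_2\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{iau}\right)\right] = \frac{1}{n'}\sum_{i=1}^{L} W_i\,\frac{\eta_{i1}}{\Delta_{i1}^2}\left(\frac{1}{\upsilon_i} - 1\right) + \frac{1}{n'}\sum_{i=1}^{L} W_i\,\frac{\eta_{i1}}{\Delta_{i1}^2}. \tag{46}
\]

The second term of Equation (45) is simplified as

\[
V_1\left[E_2\left(\sum_{i=1}^{L} w_i\hat{\lambda}_{iau}\right)\right] = \frac{1}{n'}\sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2. \tag{47}
\]

Putting the expressions for the first and second terms from Equations (46) and (47) in Equation (45), we obtain the expression for the variance of the estimator $\hat{\lambda}_{aud}$ as given in Equation (44).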

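Lemma 3.7: Under proportional allocation, the sample size $n_i$ in the $i$th stratum is given as

\[
n_i = n\,\frac{n'_i}{n'}
\]

and the variance of $\hat{\lambda}_{aud}$ is given by

\[
V(\hat{\lambda}_{aud})_{p.} = \frac{1}{n'}\sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2 + \frac{1}{n}\sum_{i=1}^{L} W_i\,\frac{\eta_{i1}}{\Delta_{i1}^2}. \tag{48}
\]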

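Using the cost function given in Equation (37) and applying the Cauchy–Schwarz inequality, the optimum values of $\upsilon_i$ over the different strata and of the sample size $n'$ are given as

\[
\upsilon_i = \sqrt{\frac{c'}{c_i}\times\frac{\eta_{i1}}{\Delta_{i1}^2}\times\frac{1}{\sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2}}
\]

and

\[
n' = \frac{C^*}{c' + \sum_{i=1}^{L} W_i c_i\upsilon_i}. \tag{49}
\]

Lemma 3.8: The variance of the estimator $\hat{\lambda}_{aud}$ under optimum allocation is obtained as

\[
V(\hat{\lambda}_{aud})_{opt.} = \frac{1}{C^*}\left[\sqrt{c'\sum_{i=1}^{L} W_i(\lambda_{ia} - \lambda_a)^2} + \sum_{i=1}^{L} W_i\sqrt{c_i\,\frac{\eta_{i1}}{\Delta_{i1}^2}}\right]^2. \tag{50}
\]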

4. Empirical comparison
In this section, we carry out empirical studies to assess the effectiveness of the proposed stratified randomized response model and the resultant estimators $\hat{\lambda}_a$ and $\hat{\lambda}_{au}$ relative to the stratified randomized response models proposed in [18] and [20] and their resultant estimators. For the empirical studies, we consider a population of size $N = 100{,}000$ divided into two strata with stratum weights $W_1 = 0.6$ and $W_2 = 0.4$. A sample of size 10,000 is drawn by simple random sampling with replacement, with $n_1 = 6000$ and $n_2 = 4000$. It is assumed that the number of cards in the proposed deck is the same in both strata, i.e. $k_1 = k_2 = 100$ (say). The per cent relative efficiencies of the resultant estimators of the proposed stratified model are examined under two different cases:
Case (I): Stratified sampling when the proportion of the rare non-sensitive attribute B is known: The per cent relative efficiencies of the estimator $\hat{\lambda}_a$ with respect to $(\hat{\lambda}_1)_l$ and $(\hat{\lambda}_1)_{l1}$ are defined as

\[
E_{11} = \frac{V[(\hat{\lambda}_1)_l]}{V[\hat{\lambda}_a]}\times 100 \quad\text{and}\quad E_{12} = \frac{V[(\hat{\lambda}_1)_{l1}]}{V[\hat{\lambda}_a]}\times 100,
\]

where $(\hat{\lambda}_1)_l$ and $(\hat{\lambda}_1)_{l1}$ are the estimators based on the randomized response models described in [18] and [20], respectively, when the proportion of the unrelated rare non-sensitive attribute is known. It is assumed that $P_{11} = P_{12} = P_1$, $P_{21} = P_{22} = P_2$, $P_{31} = P_{32} = P_3$ and $U_1 = U_2 = U$. The per cent relative efficiencies $E_{11}$ and $E_{12}$ depend on the choices of the parameters $\lambda_{1a}$, $\lambda_{2a}$, $U$, $P_1$, $P_2$ and $P_3$. The empirical results are presented in Tables 4 and 5 for different choices of these parameters.
Case (II): Stratified sampling when the proportion of the unrelated rare non-sensitive attribute B is unknown

Table 4. Per cent relative efficiencies ($E_{11}$) of the estimator $\hat{\lambda}_a$ with respect to $(\hat{\lambda}_1)_l$.
P1 0.4 0.4 0.5 0.5 0.6 0.6 0.7 0.7 0.8
P2 0.4 0.1 0.4 0.2 0.2 0.1 0.2 0.1 0.1
P3 0.2 0.5 0.1 0.3 0.2 0.3 0.1 0.2 0.1
U λ1a λ2a
0.3 0.5 0.5 268.32 558.34 216.47 380.37 184.56 271.12 162.9 198.31 147.12
1 254.43 485.64 205.4 336.46 175.32 244.88 155.01 183.66 140.33
1.5 243.5 436.39 197.04 307.04 168.64 227.45 149.56 174 135.89
1 0.5 241.12 426.47 195.26 301.15 167.25 223.97 148.46 172.07 135.01
1 232.74 393.43 189.09 281.6 162.5 212.48 144.75 165.74 132.11
1.5 225.79 368.21 184.1 266.75 158.76 203.79 141.89 160.97 129.93
1.5 0.5 224.24 362.81 183 263.58 157.95 201.94 141.28 159.95 129.46
1 218.63 344 179.06 252.56 155.07 195.51 139.13 156.43 127.86
1.5 213.83 328.7 175.75 243.63 152.68 190.31 137.38 153.59 126.57
0.5 0.5 0.5 408.19 674.31 296.64 438.29 231 302.26 188.1 216.06 157.85
1 374.31 576.78 273.41 382.28 214.29 269.72 175.92 197.98 149.14
1.5 349.2 512.53 256.73 345.63 202.69 248.54 167.76 186.27 143.53
1 0.5 343.9 499.76 253.27 338.36 200.32 244.35 166.12 183.96 142.43
1 325.66 457.64 241.5 314.46 192.38 230.61 160.7 176.39 138.81
1.5 311.08 425.9 232.25 296.51 186.23 220.31 156.58 170.72 136.11
1.5 0.5 307.88 419.16 230.24 292.7 184.91 218.13 155.7 169.53 135.54
1 296.5 395.77 223.15 279.51 180.28 210.58 152.64 165.38 133.57
1.5 286.98 376.9 217.27 268.88 176.49 204.51 150.17 162.06 131.99
0.7 0.5 0.5 595.67 798.7 397.64 500.1 286.15 335.38 216.36 234.9 169.2
1 525.7 672.47 354.8 430.22 258.68 295.66 198.67 212.91 158.32
1.5 477.13 591.46 325.68 385.53 240.42 270.32 187.18 198.93 151.42
1 0.5 467.22 575.54 319.8 376.76 236.77 265.36 184.91 196.2 150.07
1 433.95 523.54 300.21 348.14 224.7 249.17 177.45 187.28 145.67
1.5 408.25 484.81 285.24 326.86 215.58 237.15 171.87 180.67 142.42
1.5 0.5 402.71 476.62 282.03 322.36 213.63 234.62 170.69 179.27 141.73
1 383.32 448.38 270.84 306.87 206.88 225.88 166.6 174.47 139.37
1.5 367.44 425.75 261.74 294.46 201.42 218.88 163.32 170.63 137.48
0.9 0.5 0.5 845.44 931.02 524.64 565.75 351.64 370.53 248.04 254.87 181.23
1 713.79 772.24 451.31 480.18 309.02 322.68 223.37 228.48 167.87
1.5 628.91 672.8 404.41 426.64 281.97 292.77 207.85 211.99 159.54
1 0.5 612.2 653.49 395.21 416.24 276.69 286.96 204.82 208.79 157.92
1 557.51 590.86 365.18 382.54 259.48 268.15 195.01 198.42 152.69
1.5 516.71 544.72 342.85 357.72 246.73 254.29 187.76 190.79 148.83
1.5 0.5 508.08 535.02 338.13 352.51 244.04 251.38 186.23 189.19 148.02
1 478.27 501.69 321.88 334.59 234.79 241.38 180.99 183.68 145.25
1.5 454.36 475.14 308.86 320.31 227.39 233.42 176.81 179.3 143.03
Note: λ1b = λ2b = λb = 1.0.

In this case, we have also taken the stratum weights as W1 = 0.7 and W2 = 0.3 and
assumed that P11 = P12 = P1 , P21 = P22 = P2 , P31 = P32 = P3 , Q11 = Q12 = Q1 , Q21 =
Q22 = Q2 , Q31 = Q32 = Q3 , U11 = U12 = U1 and U21 = U22 = U2 .
The per cent relative efficiencies of the estimator λ̂au with respect to (λ̂1u )l and (λ̂1u )l1
are given by
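$$E_{21} = \frac{V[(\hat{\lambda}_{1u})_{l}]}{V[\hat{\lambda}_{au}]} \times 100 \quad \text{and} \quad E_{22} = \frac{V[(\hat{\lambda}_{1u})_{l1}]}{V[\hat{\lambda}_{au}]} \times 100,$$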
where (λ̂1u )l and (λ̂1u )l1 are the estimators based on the randomized response models
described in [18] and [20], respectively, when the proportion of the unrelated rare
non-sensitive attribute is unknown.
The per cent relative efficiencies E21 and E22 depend on the choices of the parameters
λ1a , λ2a , P1 , P2 , P3 , Q1 , Q2 , Q3 , U1 and U2 . The empirical results are presented in
Tables 6 and 7 for different choices of the above-mentioned parameters.
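
As a minimal illustrative sketch, the per cent relative efficiency defined above is the ratio of the competitor's variance to the proposed estimator's variance, scaled by 100. The Python function below is not part of the original study: the function name and the two variance values are hypothetical and are chosen only to show the arithmetic.

```python
def percent_relative_efficiency(var_competitor: float, var_proposed: float) -> float:
    """Return 100 * V[competitor] / V[proposed].

    A value above 100 means the proposed estimator has the smaller variance.
    """
    if var_competitor < 0 or var_proposed <= 0:
        raise ValueError("variances must be non-negative, and V[proposed] must be positive")
    return 100.0 * var_competitor / var_proposed


# Hypothetical variances, used purely for illustration (not taken from the paper).
e21 = percent_relative_efficiency(var_competitor=0.012, var_proposed=0.008)
print(round(e21, 2))
```

With these assumed inputs the competitor's variance is one and a half times that of the proposed estimator, so the ratio lies above 100, which is the sense in which the tabulated values are read.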

Table 5. Per cent relative efficiencies of the estimator λ̂a with respect to (λ̂1 )l1 .
P1 0.4 0.5 0.6 0.7 0.8
P2 0.4 0.1 0.4 0.2 0.2 0.1 0.2 0.1 0.1
P3 0.2 0.5 0.1 0.3 0.2 0.3 0.1 0.2 0.1
U λ1a λ2a
0.3 0.5 0.5 113.26 235.68 115.28 202.57 117.18 172.14 119.04 144.92 120.94
1 112.26 214.28 113.91 186.59 115.37 161.14 116.7 138.27 117.94
1.5 111.47 199.78 112.87 175.88 114.05 153.83 115.08 133.88 115.98
1 0.5 111.3 196.85 112.65 173.74 113.78 152.37 114.75 133 115.59
1 110.69 187.12 111.89 166.62 112.85 147.56 113.65 130.13 114.31
1.5 110.19 179.7 111.27 161.22 112.12 143.91 112.8 127.96 113.35
1.5 0.5 110.08 178.11 111.13 160.07 111.96 143.14 112.61 127.5 113.15
1 109.68 172.57 110.64 156.06 111.39 140.44 111.98 125.9 112.44
1.5 109.33 168.07 110.23 152.81 110.92 138.26 111.45 124.61 111.87
0.5 0.5 0.5 108.3 178.9 109.87 162.33 111.38 145.73 112.85 129.63 114.32
1 107.49 165.64 108.8 152.12 110.01 138.46 111.13 125.07 112.2
1.5 106.9 156.9 108.03 145.44 109.06 133.73 109.98 122.12 110.83
1 0.5 106.77 155.16 107.88 144.12 108.86 132.79 109.75 121.53 110.56
1 106.34 149.43 107.34 139.76 108.21 129.72 108.99 119.62 109.68
1.5 105.99 145.12 106.91 136.49 107.71 127.42 108.41 118.2 109.02
1.5 0.5 105.92 144.2 106.82 135.8 107.6 126.93 108.28 117.9 108.88
1 105.65 141.02 106.49 133.39 107.22 125.24 107.85 116.85 108.4
1.5 105.42 138.45 106.22 131.45 106.91 123.88 107.5 116.01 108.01
0.7 0.5 0.5 104.54 140.16 105.49 132.67 106.42 124.73 107.34 116.54 108.26
1 103.98 133.01 104.78 127.06 105.54 120.63 106.28 113.9 106.99
1.5 103.6 128.42 104.3 123.47 104.96 118.02 105.59 112.22 106.18
1 0.5 103.52 127.52 104.2 122.76 104.84 117.5 105.45 111.89 106.02
1 103.26 124.58 103.88 120.46 104.46 115.83 105 110.82 105.51
1.5 103.06 122.38 103.63 118.75 104.17 114.59 104.67 110.02 105.13
1.5 0.5 103.01 121.92 103.58 118.39 104.11 114.33 104.6 109.86 105.05
1 102.86 120.32 103.4 117.15 103.89 113.43 104.35 109.28 104.78
1.5 102.74 119.04 103.25 116.15 103.72 112.71 104.15 108.82 104.56
0.9 0.5 0.5 101.42 111.69 101.73 109.7 102.04 107.52 102.35 105.16 102.65
1 101.21 109.5 101.47 107.96 101.72 106.22 101.98 104.31 102.23
1.5 101.07 108.12 101.3 106.87 101.52 105.41 101.75 103.77 101.97
1 0.5 101.04 107.86 101.27 106.65 101.49 105.25 101.7 103.67 101.91
1 100.95 106.99 101.16 105.97 101.36 104.74 101.56 103.33 101.75
1.5 100.89 106.36 101.08 105.46 101.26 104.37 101.45 103.09 101.63
1.5 0.5 100.87 106.22 101.06 105.36 101.24 104.29 101.42 103.03 101.6
1 100.82 105.76 101 104.99 101.18 104.02 101.35 102.86 101.51
1.5 100.79 105.39 100.96 104.7 101.12 103.8 101.28 102.71 101.44
Note: λ1b = λ2b = λb = 1.0.

The optimum value of k is obtained by minimizing the variance of the proposed
estimator, but the resulting expression is mathematically too complex to use directly.
We have therefore studied the behaviour of the per cent relative efficiencies of the
estimators under the proposed randomized response model for different choices of k,
with fixed values λb = λ1b = λ2b = 1.0 and λ1a = λ2a = 1.5; the results are shown
graphically in Graphs 1–4. For convenience, we have chosen all
ki = k ∀ i.

5. Interpretations of results
From Tables 4–7, it is clear that the values of the per cent relative efficiencies E11 , E12 , E21 and
E22 exceed 100 in almost all the chosen parametric cases; two entries of Table 7 (97.91 and
99.93) fall slightly below 100. These results strongly indicate that
the proposed estimators λ̂a and λ̂au are more efficient than the estimators {(λ̂1 )l , (λ̂1 )l1 } and

Table 6. Per cent relative efficiencies of the estimator λ̂au with respect to (λ̂1u )l .
P1 0.6 0.6 0.7 0.7 0.8 0.8
P2 0.2 0.2 0.15 0.15 0.1 0.1
Q1 0.1 0.4 0.1 0.4 0.1 0.4
Q2 0.45 0.3 0.45 0.3 0.45 0.3
U1 U2 λ1a λ2a
0.7 0.5 0.5 0.5 211.81 629.96 162.3 277.79 126.37 157.49
1 245.93 707.04 192.34 319.72 153.61 187.94
1.5 280.05 784.13 222.39 361.65 180.85 218.39
1 0.5 196.89 547.53 154.85 250.5 124.98 150.48
1 220.17 600.06 175.09 278.73 143.12 170.75
1.5 243.45 652.6 195.32 306.96 161.26 191.03
1.5 0.5 189.17 504.92 151.07 236.66 124.28 146.98
1 206.83 544.77 166.32 257.94 137.88 162.18
1.5 224.5 584.61 181.58 279.21 151.48 177.37
0.3 0.5 0.5 209.32 627.93 161.32 277.3 126.06 157.38
1 243.03 704.76 191.18 319.16 153.23 187.8
1.5 276.75 781.6 221.04 361.02 180.4 218.23
1 0.5 194.97 546.07 154.08 250.15 124.73 150.39
1 218.02 598.47 174.22 278.34 142.83 170.65
1.5 241.08 650.86 194.35 306.52 160.94 190.91
1.5 0.5 187.52 503.73 150.4 236.37 124.07 146.9
1 205.03 543.48 165.59 257.61 137.64 162.09
1.5 222.55 583.23 180.78 278.86 151.21 177.28
0.5 0.5 0.5 0.5 171.16 515.36 142.38 245.13 117.57 146.86
1 198.73 578.42 168.73 282.14 142.91 175.25
1.5 226.3 641.48 195.09 319.14 168.26 203.64
1 0.5 164.34 461.9 138.87 225.78 117.83 142.13
1 183.77 506.21 157.02 251.22 134.93 161.28
1.5 203.2 550.53 175.17 276.66 152.03 180.43
1.5 0.5 160.63 432.93 137.03 215.64 117.96 139.73
1 175.63 467.1 150.87 235.02 130.86 154.17
1.5 190.63 501.26 164.71 254.41 143.77 168.62
0.3 0.5 0.5 165.4 510.48 139.91 243.9 116.75 146.55
1 192.04 572.95 165.8 280.71 141.92 174.88
1.5 218.69 635.41 191.7 317.53 167.08 203.21
1 0.5 159.6 458.18 136.86 224.83 117.15 141.89
1 178.47 502.14 154.74 250.16 134.16 161
1.5 197.34 546.1 172.63 275.5 151.16 180.12
1.5 0.5 156.42 429.78 135.25 214.83 117.36 139.51
1 171.03 463.69 148.91 234.14 130.2 153.94
1.5 185.64 497.61 162.57 253.45 143.04 168.36
Note: P3 = 1 − (P1 + P2 ) and Q3 = 1 − (Q1 + Q2 ).

{(λ̂1u )l , (λ̂1u )l1 }, respectively, both when the proportion of the unrelated rare
attribute is known and when it is unknown.
From Tables 4 and 5, it is clear that, for fixed values of the other parameters, (a) the values
of E11 increase and the values of E12 decrease as U increases, where U is the probability
of selecting the first question in the first-stage randomization device, (b) the values of E11 and E12
decrease as the values of λ1a increase, (c) the values of E11 and E12 decrease as
the values of λ2a increase.
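
The trend in U described in (a) can be checked directly against a few tabulated entries. The short Python sketch below is an illustration only, not part of the original analysis. It uses the first data column with λ1a = λ2a = 0.5; the E12 values are read from Table 5 and the E11 values from the table immediately preceding it, which is assumed here to be Table 4. The helper names are hypothetical.

```python
def strictly_increasing(values):
    """True if each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def strictly_decreasing(values):
    """True if each value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


# Keys are U; values are per cent relative efficiencies transcribed from the
# first data column at lambda_1a = lambda_2a = 0.5.
e11_by_u = {0.5: 408.19, 0.7: 595.67, 0.9: 845.44}                # assumed Table 4
e12_by_u = {0.3: 113.26, 0.5: 108.30, 0.7: 104.54, 0.9: 101.42}   # Table 5

e11_rises_with_u = strictly_increasing([e11_by_u[u] for u in sorted(e11_by_u)])
e12_falls_with_u = strictly_decreasing([e12_by_u[u] for u in sorted(e12_by_u)])
print(e11_rises_with_u, e12_falls_with_u)  # prints: True True
```

The check covers a single column and a single (λ1a, λ2a) pair, so it illustrates the stated pattern rather than verifying it for every entry.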
From Tables 6 and 7, it is observed that for the fixed values of other parameters, (a) the
values of E21 and E22 are decreasing as the values of U1 decrease which implies that the
proposed estimation procedure performs better when the probability of selection of first
question in the first randomized response device R11 is small, (b) the values of E21 and E22
are decreasing as the values of U2 decrease, (c) the values of E21 and E22 are decreasing as

Table 7. Per cent relative efficiencies of the estimator λ̂au with respect to (λ̂1u )l1 .
P1 0.6 0.6 0.7 0.7 0.8 0.8
P2 0.2 0.2 0.15 0.15 0.1 0.1
Q1 0.4 0.5 0.4 0.5 0.4 0.5
Q2 0.3 0.1 0.3 0.1 0.3 0.1
U1 U2 λ1a λ2a
0.7 0.5 0.5 0.5 146.42 179.98 118.64 130.53 102.15 105.73
1 161 196.31 132.36 144.88 115.54 119.31
1.5 175.59 212.64 146.09 159.22 128.93 132.88
1 0.5 138.06 164.33 116.24 125.4 103.66 106.38
1 146.98 174.31 124.5 134.03 111.59 114.42
1.5 155.89 184.3 132.75 142.65 119.52 122.46
1.5 0.5 134.38 157.45 115.22 123.2 104.3 106.66
1 140.8 164.64 121.12 129.36 109.93 112.37
1.5 147.22 171.82 127.01 135.53 115.57 118.08
0.3 0.5 0.5 117.4 130.03 106.35 111.98 97.91 99.93
1 130.46 143.77 119.43 125.36 111.07 113.2
1.5 143.52 157.5 132.5 138.74 124.23 126.47
1 0.5 115.33 125.23 106.78 111.12 100.44 101.98
1 123.32 133.62 114.64 119.16 108.24 109.84
1.5 131.31 142.02 122.51 127.2 116.04 117.7
1.5 0.5 114.42 123.11 106.96 110.74 101.51 102.84
1 120.18 129.16 112.59 116.49 107.05 108.42
1.5 125.93 135.2 118.21 122.24 112.59 114.01
0.5 0.5 0.5 0.5 430.07 1212.5 203.02 299.83 128.34 145.56
1 457.99 1280.9 220.17 322 142.5 160.61
1.5 485.91 1349.3 237.32 344.17 156.66 175.67
1 0.5 370.96 1009.2 183.37 260.06 123.82 137.11
1 388.75 1052.7 193.98 273.77 132.36 146.19
1.5 406.54 1096.3 204.59 287.48 140.89 155.26
1.5 0.5 343.31 914.21 174.57 242.26 121.87 133.47
1 356.37 946.17 182.25 252.18 127.98 139.96
1.5 369.42 978.12 189.93 262.1 134.09 146.46
0.3 0.5 0.5 182.12 264.33 137.46 163.97 111.47 119.06
1 197.16 283.67 151.18 179.09 124.74 132.74
1.5 212.2 303 164.91 194.21 138.01 146.41
1 0.5 168.57 235.63 131.42 152.42 110.79 116.65
1 178.17 247.95 139.92 161.78 118.8 124.9
1.5 187.77 260.28 148.42 171.14 126.8 133.14
1.5 0.5 162.21 222.2 128.71 147.25 110.5 115.62
1 169.26 231.24 134.86 154.02 116.23 121.52
1.5 176.32 240.29 141.02 160.79 121.96 127.42
Note: P3 = 1 − (P1 + P2 ) and Q3 = 1 − (Q1 + Q2 ).

the values of λ1a increase, (d) the values of E21 and E22 are increasing as the values of λ2a
increase.
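
The opposite directions of the λ1a and λ2a effects in (c) and (d) can be seen in a single block of Table 6. The Python sketch below is illustrative only and uses the nine E21 entries for U1 = 0.7, U2 = 0.5 in the first data column (P1 = 0.6, P2 = 0.2, Q1 = 0.1, Q2 = 0.45); the variable names are hypothetical.

```python
lambda_levels = [0.5, 1.0, 1.5]

# Rows: lambda_1a = 0.5, 1.0, 1.5; columns: lambda_2a = 0.5, 1.0, 1.5.
# Entries are E21 values transcribed from Table 6 (U1 = 0.7, U2 = 0.5, first column).
e21_grid = [
    [211.81, 245.93, 280.05],
    [196.89, 220.17, 243.45],
    [189.17, 206.83, 224.50],
]

# Moving right along a row (larger lambda_2a) should raise E21.
rises_with_lambda_2a = all(
    row[j] < row[j + 1] for row in e21_grid for j in range(len(row) - 1)
)

# Moving down a column (larger lambda_1a) should lower E21.
falls_with_lambda_1a = all(
    e21_grid[i][j] > e21_grid[i + 1][j]
    for i in range(len(e21_grid) - 1)
    for j in range(len(lambda_levels))
)

print(rises_with_lambda_2a, falls_with_lambda_1a)  # prints: True True
```

Only one block of one table is examined here, so the sketch shows how the pattern can be read off the tables rather than establishing it in general.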
According to the above numerical results, the proposed stratified estimation procedure
is more efficient than the estimation procedures of Lee et al. [18] and Lee et al. [20]
when the proportion of persons possessing the sensitive attribute in the sample
selected from the strata is small.
From Graphs 1–4, it is observed that the per cent relative efficiencies decrease as
the value of k increases and that, beyond a particular value of k, they almost
stabilize. On the basis of the empirical study of the proposed estimator
for a particular randomized response device, we may suggest a range for k:
the best choice of k may lie in the interval (3, 100). The
maximum per cent relative efficiencies are observed at the lowest choices
of k.

6. Conclusions and recommendations


In this work, the Poisson probability model has been used to estimate a rare sensitive
attribute under stratified sampling and stratified double sampling schemes. Following the
interpretations of Tables 4–7, it may be concluded that the proposed estimators under the
stratified sampling schemes perform better than the contemporary competing estimators in both
cases, that is, when the proportion of the unrelated rare attribute is known and when it is unknown. The dominance
of the proposed stratified randomized response model over the models described in [18]
and [20] supports its use in social surveys where the characteristic under
study is sensitive in nature, such as alcoholism, drug addiction and sexual harassment,
among others, and where the population is easily divided into strata because of its heterogeneous
nature.

Acknowledgements
The authors are thankful to the Indian Institute of Technology (Indian School of Mines), Dhanbad for
providing financial and necessary infrastructural support to carry out the present research work.
The authors are also thankful to the reviewers for their valuable suggestions, which improved the quality
of the paper.

Disclosure statement
No potential conflict of interest was reported by the authors.

References
[1] Warner SL. Randomized response: a survey technique for eliminating evasive answer bias.
J Amer Statist Assoc. 1965;60:63–69.
[2] Horvitz DG, Shah BV, Simmons WR. The unrelated question randomized response model.
Proceedings of the American Statistical Association, Social Statistics Section; Washington, DC:
American Statistical Association; 1967. pp. 65–72.
[3] Greenberg BG, Abul-Ela AA, Simmons WR, et al. The unrelated question randomized response
model: theoretical framework. J Amer Statist Assoc. 1969;64:520–539.
[4] Moors JA. Optimization of the unrelated question randomized response model. J Amer Statist
Assoc. 1971;66:627–629.
[5] Fox JA, Tracy PE. Randomized response: a method of sensitive surveys. Newbury Park (CA):
SAGE Publications; 1986.
[6] Mangat NS, Singh R. An alternative randomized procedure. Biometrika. 1990;77:439–442.
[7] Mangat NS. Two-stage randomized response sampling procedure using unrelated question.
J Ind Soc Agril Statist. 1992;44(1):82–87.
[8] Ryu JB, Hong KH, Lee GS. Randomized response model. Seoul: Freedom Academy; 1993.
[9] Mangat NS, Singh S, Singh R. On the use of a modified randomization device in randomized
response inquiries. Metron. 1993;51:211–216.
[10] Singh S, Singh R, Mangat NS, Tracy DS. An alternative device for randomized responses.
Statistica. 1994;54(2):233–243.
[11] Singh S, Horn S, Singh R, Mangat NS. On the use of modified randomization device for
estimating the prevalence of a sensitive attribute. Statist Trans. 2003;6(4):515–522.
[12] Hong K, Yum J, Lee H. A stratified randomized response technique. Korean J Appl Stat.
1994;7:141–147.
[13] Kim J-M, Warde WD. A stratified Warner’s randomized response model. J Statist Plann
Inference. 2004;120(1–2):155–165.
[14] Kim J-M, Elam ME. A two-stage stratified Warner’s randomized response model using optimal
allocation. Metrika. 2005;61:1–7.
[15] Kim J-M, Elam ME. A stratified unrelated question randomized response model. Statist Paper.
2007;48:215–233.
[16] Adebola FB, Johnson OO, Adegoke NA. A modified stratified randomized response techniques.
Math Theory Model. 2014;4(13):30–42.
[17] Land M. Estimation of a rare sensitive attribute using Poisson distribution. Statistics.
2012;46(3):351–360.
[18] Lee GS, Uhm D, Kim JM. Estimation of a rare sensitive attribute in a stratified sample using
Poisson distribution. Statistics. 2013;47(3):575–589.
[19] Singh GN, Suman S. A modified two-stage randomized response model for estimat-
ing the proportion of stigmatized attribute. Journal of Applied Statistics. 2018. doi:
10.1080/02664763.2018.1529150.
[20] Lee GS, Hong KH, Son CK. A stratified two-stage unrelated randomized response model for
estimating a rare sensitive attribute based on the Poisson distribution. J Stat Theory Pract.
2016;10:239–262.
[21] Cochran WG. Sampling techniques. 3rd ed. New York (NY): John Wiley and Sons; 1977.
