
Mathematical Population Studies

An International Journal of Mathematical Demography

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/gmps20

Estimation of a rare sensitive attribute in two-stage sampling using a randomized response model under Poisson distribution

Garib N. Singh, Surbhi Suman & Chandraketu Singh

To cite this article: Garib N. Singh, Surbhi Suman & Chandraketu Singh (2020) Estimation
of a rare sensitive attribute in two-stage sampling using a randomized response
model under Poisson distribution, Mathematical Population Studies, 27:2, 81-114, DOI:
10.1080/08898480.2018.1553404

To link to this article: https://doi.org/10.1080/08898480.2018.1553404

Published online: 02 Jan 2019.

MATHEMATICAL POPULATION STUDIES
2020, VOL. 27, NO. 2, 81–114
https://doi.org/10.1080/08898480.2018.1553404

Estimation of a rare sensitive attribute in two-stage sampling using a randomized response model under Poisson distribution
Garib N. Singh, Surbhi Suman, and Chandraketu Singh
Department of Applied Mathematics, Indian Institute of Technology (Indian School of Mines),
Dhanbad, India

ABSTRACT
Unbiased estimation procedures of the mean total number of persons with a rare sensitive attribute apply for a clustered population under two-stage and stratified two-stage sampling schemes. Randomized response model is used to obtain the estimators, when the parameter of an unrelated rare non-sensitive attribute is either known or unknown. The variances of the resultant estimators are derived and their unbiased estimates are expressed. Numerical comparisons show that dispersions in the estimates are lower than other contemporary estimators.

KEYWORDS
Poisson distribution; probability proportional to size sampling; randomized response model; rare sensitive attribute; rare unrelated attribute; two-stage sampling

AMS SUBJECT CLASSIFICATION
62D05

CONTACT Surbhi Suman, surbhi.iitism@yahoo.com, Department of Applied Mathematics, Indian Institute of Technology (Indian School of Mines), Dhanbad 826004, India
© 2019 Taylor & Francis Group, LLC

1. Introduction
Surveys may deal with sensitive issues, on which respondents tend to answer untruthfully. Warner (1965) advised against asking direct questions on sensitive issues. He recommended selecting a sample of $n$ individuals by simple random sampling with replacement from the population in order to estimate the proportion $\pi_a$ of persons having the sensitive characteristic A. Each selected individual uses a randomization device that presents the statement about having A with probability $\pi$ and the complementary statement with probability $1 - \pi$. The respondent answers “yes” if the outcome of the randomization device matches his or her actual status. The maximum likelihood estimator of the proportion is
\[
\hat{\pi}_a = \frac{\pi - 1}{2\pi - 1} + \frac{n_1}{(2\pi - 1)n}, \tag{1}
\]
where $n_1$ is the total number of “yes” answers and $n$ the sample size.
Horvitz et al. (1967) introduced an unrelated non-sensitive attribute B into the randomized response model, which yields the maximum likelihood estimator of the proportion
\[
\hat{\pi}_a = \frac{1}{\pi}\left[\frac{n_1}{n} - (1 - \pi)\pi_b\right], \tag{2}
\]
where $\pi_b$ is the known proportion of the unrelated non-sensitive attribute B. Greenberg et al. (1969) treated the case where the proportion of the unrelated non-sensitive attribute is unknown. Moors (1971) advised on the choice of the parameters $\pi_b$ and $\pi$ and found that his optimized method was superior to the model of Warner (1965). To offer respondents more confidentiality, Mangat and Singh (1990) introduced a two-stage related randomized response model, which also improved the precision of the resultant estimator over the estimators of Warner (1965) and Greenberg et al. (1969). Mangat (1992) suggested two-stage unrelated randomized response models in order to protect the privacy of the respondents. He found that the resultant estimator performed better than the estimator of Mangat and Singh (1990). Singh et al. (1994), with a similar motivation, devised randomized response models that include blank cards in the randomization devices.
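As a minimal sketch, the estimators of Eq. (1) and Eq. (2) can be written as two small Python functions. The function and argument names are illustrative and do not come from the article; `p` stands for $\pi$, assumed here to be the probability that the randomization device presents the sensitive statement.

```python
def warner_estimate(n_yes, n, p):
    """Eq. (1): Warner-type estimate of pi_a from n_yes "yes" answers out of n."""
    if p == 0.5:
        raise ValueError("p = 0.5 makes the estimator undefined (2p - 1 = 0)")
    return (p - 1) / (2 * p - 1) + n_yes / ((2 * p - 1) * n)


def unrelated_question_estimate(n_yes, n, p, pi_b):
    """Eq. (2): estimate of pi_a when the proportion pi_b of attribute B is known."""
    return (n_yes / n - (1 - p) * pi_b) / p
```

For instance, with $\pi = 0.7$ and a true $\pi_a = 0.2$, the expected share of “yes” answers under Warner's design is $0.7 \cdot 0.2 + 0.3 \cdot 0.8 = 0.38$, and feeding 380 “yes” answers out of 1000 into `warner_estimate` returns 0.2 up to rounding.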
When the total number of persons having a sensitive attribute is small, Land et al. (2012), Singh and Tarray (2014), Tarray and Singh (2015), Singh and Tarray (2017), and Tarray (2017) estimate the mean total number of persons who possess a rare sensitive attribute in the population. A large sample size is necessary to obtain a sufficiently precise estimate.
We present alternative methods to estimate the mean total number of persons having a rare sensitive attribute under a Poisson distribution. We use two-stage and stratified two-stage sampling schemes, selecting the first-stage units with probability proportional to size and the second-stage units by simple random sampling with replacement. We express the variances of the estimators and their estimates when the parameter of the unrelated non-sensitive attribute is either known or unknown. We also compare the performance of the proposed estimation procedures with that of the estimators of Lee et al. (2014).

2. Sampling design
A finite population $U = (U_1, U_2, \ldots, U_N)$ of $N$ clusters, which are the first-stage units, consists of $(M_1, M_2, \ldots, M_N)$ second-stage units. At the first stage, we select a sample of $n$ clusters with probabilities $p_i$, $i = 1, 2, \ldots, n$. At the second stage, we select $m_i$, $i = 1, 2, \ldots, n$, second-stage units from the $i$th selected first-stage unit using simple random sampling with replacement. We denote
$\pi_a$: the true proportion of persons having a rare sensitive attribute A,
$\pi_b$: the true proportion of persons having an unrelated rare non-sensitive attribute B,
$\pi_{ia}$: the true proportion of the rare sensitive attribute in the $i$th cluster,
$\pi_{ib}$: the true proportion of the unrelated rare non-sensitive attribute in the $i$th cluster.

We have
\[
M_0 = \sum_{i=1}^{N} M_i \quad \text{and} \quad m = \sum_{i=1}^{n} m_i. \tag{3}
\]

3. Estimation procedure of a rare sensitive attribute under two-stage sampling using a randomized response model
We present the estimation procedure for the mean total number of persons
having a rare sensitive attribute under a two-stage sampling scheme. We
examine the case when the unrelated rare non-sensitive attribute is known
and the case when it is unknown. We collect the responses from the
elementary units in the second-stage samples using the randomization device
of Singh et al. (1994).

3.1. When the unrelated rare attribute is known


When the proportion πb of persons having the unrelated rare attribute is
known, respondents are invited to use the randomization device and answer
without revealing their having the attribute or not. The randomization device
consists of a deck of ki cards provided to the respondents selected from the
ith cluster. The cards bear one of the statements:

(i) “Do you have the rare sensitive attribute A?”, with probability $T_{1i}$;
(ii) “Do you have the unrelated rare non-sensitive attribute B?”, with probability $T_{2i}$;
(iii) “Draw another card”, with probability $T_{3i} = 1 - T_{1i} - T_{2i}$.

If statement (iii) appears, the respondent is invited to repeat the process without replacing the card. If statement (iii) reappears on the redrawn card, the respondent reports “no”.
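The probability of a “yes” answer from a respondent of the $i$th cluster is
\[
\zeta_{i0} = \left(T_{1i}\pi_{ia} + T_{2i}\pi_{ib}\right)\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right). \tag{4}
\]
Because the attributes A and B under study are rare in the population, for $m_i \to \infty$ we have $m_i\zeta_{i0} = \theta_{i0} > 0$ as $\zeta_{i0} \to 0$, $m_i\pi_{ia} = \theta_{ia} > 0$ as $\pi_{ia} \to 0$, and $m_i\pi_{ib} = \theta_{ib} > 0$ as $\pi_{ib} \to 0$, where
\[
\theta_{i0} = \left(T_{1i}\theta_{ia} + T_{2i}\theta_{ib}\right)\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right). \tag{5}
\]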
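The factor $1 + T_{3i}k_i/(k_i - 1)$ in Eq. (4) can be checked by following the two draws explicitly. The sketch below assumes that the deck holds exactly $k_iT_{1i}$, $k_iT_{2i}$, and $k_iT_{3i}$ cards of the three types; the function names are illustrative and not from the article.

```python
from fractions import Fraction


def yes_probability(t1, t2, k, pi_a, pi_b):
    """Closed form of Eq. (4) for a deck of k cards."""
    t3 = 1 - t1 - t2
    return (t1 * pi_a + t2 * pi_b) * (1 + t3 * Fraction(k, k - 1))


def yes_probability_by_draws(t1, t2, k, pi_a, pi_b):
    """Same probability, obtained by following the first and the second draw."""
    n1, n2 = t1 * k, t2 * k      # cards bearing statements (i) and (ii)
    n3 = k - n1 - n2             # cards bearing statement (iii)
    first = (n1 * pi_a + n2 * pi_b) / k
    # once a type (iii) card has been set aside, k - 1 cards remain in the deck
    second = (n3 / k) * (n1 * pi_a + n2 * pi_b) / (k - 1)
    return first + second
```

With exact fractions, for example $T_{1i} = 1/2$, $T_{2i} = 3/10$, $k_i = 10$, $\pi_{ia} = 1/100$, and $\pi_{ib} = 1/50$, both functions return 121/9000.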
Consider a random sample $y_{i1}, y_{i2}, \ldots, y_{im_i}$ of size $m_i$ from a Poisson distribution with parameter $\theta_{i0}$ from the $i$th cluster of the population.
The likelihood function of the random sample of $m_i$ observations is
\[
L = \prod_{j=1}^{m_i} \frac{e^{-\theta_{i0}}\,\theta_{i0}^{\,y_{ij}}}{y_{ij}!}. \tag{6}
\]
Taking the logarithm of $L$ in Eq. (6), substituting the value of $\theta_{i0}$ from Eq. (5), and maximizing with respect to the parameter $\theta_{ia}$, the maximum-likelihood estimator $\hat{\theta}_{ia}$ of the mean total number of persons having the sensitive attribute in the $i$th cluster is
\[
\hat{\theta}_{ia} = \frac{1}{T_{1i}}\left(\frac{\sum_{j=1}^{m_i} y_{ij}}{m_i\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)} - T_{2i}\theta_{ib}\right). \tag{7}
\]
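A minimal sketch of the cluster-level estimator of Eq. (7) follows. The function name and argument order are illustrative assumptions; `y` is the list of responses $y_{ij}$ collected in one cluster and `theta_ib` is the known parameter of the unrelated attribute.

```python
def theta_ia_hat(y, t1, t2, k, theta_ib):
    """Eq. (7): estimate of theta_ia from the responses y of one cluster."""
    t3 = 1 - t1 - t2
    scale = 1 + t3 * k / (k - 1)          # factor 1 + T3i * ki / (ki - 1)
    return (sum(y) / (len(y) * scale) - t2 * theta_ib) / t1
```

If the sample mean of `y` happens to equal $\theta_{i0}$ of Eq. (5), the function returns $\theta_{ia}$ exactly, which mirrors the unbiasedness argument: the estimator is linear in the responses.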

The final estimator of the mean total number of persons having a rare sensitive attribute in the population under two-stage sampling design is then
\[
\hat{\theta}_{a_{ppt}} = \frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\hat{\theta}_{ia}}{p_i}, \tag{8}
\]
where $p_i$ is the initial probability of selecting the $i$th cluster, which is a first-stage unit.
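Eq. (8) combines the cluster-level estimates into a population-level estimate. A short sketch, with illustrative names that are not taken from the article, could read:

```python
def theta_a_ppt(theta_hat, M, p, M0):
    """Eq. (8): two-stage estimate from the n sampled clusters.

    theta_hat, M and p hold the cluster estimates, cluster sizes and
    selection probabilities of the sampled clusters; M0 is the total
    number of second-stage units in the population.
    """
    n = len(theta_hat)
    return sum(Mi * th / pi for Mi, th, pi in zip(M, theta_hat, p)) / (n * M0)
```

When $p_i = M_i/M_0$, each term $M_i\hat{\theta}_{ia}/p_i$ equals $M_0\hat{\theta}_{ia}$, so the estimate reduces to the plain average of the cluster estimates.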

3.1.1. Properties of the estimator $\hat{\theta}_{a_{ppt}}$

Theorem 3.1. The estimator $\hat{\theta}_{a_{ppt}}$ of the mean total number of persons having the rare sensitive attribute is unbiased.

Proof. We consider
\[
E_1E_2\left(\hat{\theta}_{a_{ppt}}\right) = E_1E_2\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\hat{\theta}_{ia}}{p_i}\right) = E_1\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_iE_2(\hat{\theta}_{ia})}{p_i}\right), \tag{9}
\]
where $E_1$ denotes the expectation over the first-stage selection and $E_2$ the expectation over the second-stage selection.
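Because $y_{ij}$ follows a Poisson distribution with parameter $\theta_{i0}$, $E(y_{ij}) = \theta_{i0}$. Consider
\[
\begin{aligned}
E_2\left(\hat{\theta}_{ia}\right)
&= E_2\left(\frac{1}{T_{1i}}\left(\frac{\sum_{j=1}^{m_i} y_{ij}}{m_i\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)} - T_{2i}\theta_{ib}\right)\right)\\
&= \frac{1}{T_{1i}}\left(\frac{\sum_{j=1}^{m_i} E_2(y_{ij})}{m_i\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)} - T_{2i}\theta_{ib}\right)\\
&= \frac{1}{T_{1i}}\left(\frac{\sum_{j=1}^{m_i} \theta_{i0}}{m_i\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)} - T_{2i}\theta_{ib}\right)\\
&= \frac{1}{T_{1i}}\left(\frac{\theta_{i0}}{1 + T_{3i}\frac{k_i}{k_i - 1}} - T_{2i}\theta_{ib}\right).
\end{aligned}
\tag{10}
\]
Substituting the value of $\theta_{i0}$ from Eq. (5) into Eq. (10), $E_2(\hat{\theta}_{ia}) = \theta_{ia}$.
Using the result $E_2(\hat{\theta}_{ia}) = \theta_{ia}$ in Eq. (9),
\[
E_1E_2\left(\hat{\theta}_{a_{ppt}}\right) = E_1\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\theta_{ia}}{p_i}\right) = \frac{1}{M_0}\sum_{i=1}^{N}p_i\frac{M_i\theta_{ia}}{p_i} = \theta_a. \tag{11}
\]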
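□

Theorem 3.2. The variance of the estimator $\hat{\theta}_{a_{ppt}}$ is
\[
V\left(\hat{\theta}_{a_{ppt}}\right) = \frac{1}{nM_0^2}\left(\sum_{i=1}^{N}p_i\left(\frac{M_i\theta_{ia}}{p_i} - M_0\theta_a\right)^2 + \sum_{i=1}^{N}\frac{M_i^2\Psi_i}{p_im_i}\right), \tag{12}
\]
where
\[
\Psi_i = \frac{T_{1i}\theta_{ia} + T_{2i}\theta_{ib}}{T_{1i}^2\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)}. \tag{13}
\]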
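To make the two components of Eq. (12) concrete, the sketch below evaluates the variance for a fully known population. It uses $\Psi_i$ with $T_{2i}\theta_{ib}$ in the numerator, as in Eq. (18); all names are illustrative assumptions rather than notation from the article.

```python
def psi(t1, t2, k, theta_ia, theta_ib):
    """Psi_i, with T2i * theta_ib in the numerator as in Eq. (18)."""
    t3 = 1 - t1 - t2
    return (t1 * theta_ia + t2 * theta_ib) / (t1 ** 2 * (1 + t3 * k / (k - 1)))


def variance_ppt(M, p, m, theta_ia, psi_i, n):
    """Eq. (12): between-cluster plus within-cluster part, over all N clusters."""
    M0 = sum(M)
    theta_a = sum(Mi * th for Mi, th in zip(M, theta_ia)) / M0
    between = sum(pi * (Mi * th / pi - M0 * theta_a) ** 2
                  for Mi, th, pi in zip(M, theta_ia, p))
    within = sum(Mi ** 2 * ps / (pi * mi)
                 for Mi, ps, pi, mi in zip(M, psi_i, p, m))
    return (between + within) / (n * M0 ** 2)
```

If all clusters share the same $\theta_{ia}$ and $p_i = M_i/M_0$, the between-cluster part vanishes and only the randomized-response part remains.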

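Proof. $V_1$ is the variance over the first-stage sample and $V_2$ the variance over the second-stage sample. The variance of the estimator $\hat{\theta}_{a_{ppt}}$ is decomposed as
\[
V\left(\hat{\theta}_{a_{ppt}}\right) = V_1E_2\left(\hat{\theta}_{a_{ppt}}\right) + E_1V_2\left(\hat{\theta}_{a_{ppt}}\right). \tag{14}
\]
The first term of Eq. (14) is
\[
\begin{aligned}
V_1E_2\left(\hat{\theta}_{a_{ppt}}\right)
&= V_1E_2\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\hat{\theta}_{ia}}{p_i}\right)
 = V_1\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\theta_{ia}}{p_i}\right)\\
&= \frac{1}{nM_0^2}\sum_{i=1}^{N}p_i\left(\frac{M_i\theta_{ia}}{p_i} - M_0\theta_a\right)^2.
\end{aligned}
\tag{15}
\]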

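The second term of Eq. (14) is
\[
\begin{aligned}
E_1V_2\left(\hat{\theta}_{a_{ppt}}\right)
&= E_1V_2\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\hat{\theta}_{ia}}{p_i}\right)
 = E_1\left(\frac{1}{(nM_0)^2}\sum_{i=1}^{n}\frac{M_i^2V_2(\hat{\theta}_{ia})}{p_i^2}\right)\\
&= E_1\left(\frac{1}{(nM_0)^2}\sum_{i=1}^{n}\frac{M_i^2}{p_i^2}V_2\left(\frac{1}{T_{1i}}\left(\frac{\sum_{j=1}^{m_i} y_{ij}}{m_i\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)} - T_{2i}\theta_{ib}\right)\right)\right)\\
&= E_1\left(\frac{1}{(nM_0)^2}\sum_{i=1}^{n}\frac{M_i^2}{p_i^2}\frac{1}{T_{1i}^2}\frac{\sum_{j=1}^{m_i} V_2(y_{ij})}{m_i^2\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)^2}\right)\\
&= E_1\left(\frac{1}{(nM_0)^2}\sum_{i=1}^{n}\frac{M_i^2}{p_i^2}\frac{\sum_{j=1}^{m_i} \theta_{i0}}{T_{1i}^2m_i^2\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)^2}\right)\\
&= E_1\left(\frac{1}{(nM_0)^2}\sum_{i=1}^{n}\frac{M_i^2}{p_i^2}\frac{1}{T_{1i}^2\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)^2}\frac{\theta_{i0}}{m_i}\right)\\
&= E_1\left(\frac{1}{(nM_0)^2}\sum_{i=1}^{n}\frac{M_i^2}{p_i^2}\frac{\left(T_{1i}\theta_{ia} + T_{2i}\theta_{ib}\right)\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)}{T_{1i}^2\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)^2m_i}\right).
\end{aligned}
\tag{16}
\]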
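Simplifying, we have
\[
E_1V_2\left(\hat{\theta}_{a_{ppt}}\right) = \frac{1}{nM_0^2}\sum_{i=1}^{N}\frac{M_i^2\Psi_i}{p_im_i}, \tag{17}
\]
where
\[
\Psi_i = \frac{T_{1i}\theta_{ia} + T_{2i}\theta_{ib}}{T_{1i}^2\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)}. \tag{18}
\]
Substituting the values from Eq. (15) and (17) into Eq. (14), we get the expression of the variance of the estimator $\hat{\theta}_{a_{ppt}}$ in Eq. (12). □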
The estimator of the variance of $\hat{\theta}_{a_{ppt}}$ is
\[
\hat{V}\left(\hat{\theta}_{a_{ppt}}\right) = \frac{1}{n(n - 1)}\sum_{i=1}^{n}\left(\hat{\theta}_{ia} - \frac{\hat{\theta}_{a_{ppt}}}{M_0}\right)^2. \tag{19}
\]

3.1.2. Estimation when the first-stage sample is selected using simple random sampling with replacement
In this case, the selection probability for all the selected clusters in the first stage is $p_i = \frac{1}{N}$ $(i = 1, 2, \ldots, n)$. The estimator of $\theta_a$, when first-stage sample units are selected with equal probability and with replacement in two-stage sampling, is
\[
\hat{\theta}_{a_{wr}} = \frac{N}{nM_0}\sum_{i=1}^{n}M_i\hat{\theta}_{ia}. \tag{20}
\]
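Eq. (20) is the special case $p_i = 1/N$ of the general two-stage estimator. The sketch below, with illustrative function names, writes both forms so that the reduction can be checked numerically.

```python
def theta_a_general(theta_hat, M, p, M0):
    """General two-stage estimate with first-stage selection probabilities p_i."""
    n = len(theta_hat)
    return sum(Mi * th / pi for Mi, th, pi in zip(M, theta_hat, p)) / (n * M0)


def theta_a_wr(theta_hat, M, N, M0):
    """Eq. (20): equal-probability selection with replacement, p_i = 1/N."""
    n = len(theta_hat)
    return N / (n * M0) * sum(Mi * th for Mi, th in zip(M, theta_hat))
```

Calling `theta_a_general` with every `p` entry set to `1 / N` gives the same value as `theta_a_wr`.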

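The variance of the estimator $\hat{\theta}_{a_{wr}}$ is
\[
V\left(\hat{\theta}_{a_{wr}}\right) = \frac{N}{nM_0^2}\left(\frac{N}{N - 1}\sum_{i=1}^{N}\left(M_i\theta_{ia} - \bar{M}\theta_a\right)^2 + \sum_{i=1}^{N}\frac{M_i^2}{m_i}\Psi_i\right) \tag{21}
\]
and its estimate
\[
\hat{V}\left(\hat{\theta}_{a_{wr}}\right) = \frac{1}{n(n - 1)M_0^2}\sum_{i=1}^{n}\left(NM_i\hat{\theta}_{ia} - \hat{\theta}_{a_{wr}}\right)^2, \tag{22}
\]
where $\bar{M} = \frac{M_0}{N}$.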

3.1.3. Estimation when the first-stage sample is selected with probability proportional to size sampling
We present estimation procedures when first-stage sample units are selected using probability proportional to size sampling with replacement and without replacement.

3.1.3.1. Probability proportional to size sampling with replacement. The probability of selecting the $i$th cluster in the sample is $p_i = \frac{M_i}{M_0}$ and the corresponding unbiased estimator of $\theta_a$ is
\[
\hat{\theta}_{a_{ppswr}} = \frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_{ia}. \tag{23}
\]

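The variance of $\hat{\theta}_{a_{ppswr}}$ is
\[
V\left(\hat{\theta}_{a_{ppswr}}\right) = \frac{1}{nM_0}\left(\sum_{i=1}^{N}M_i\left(\theta_{ia} - \theta_a\right)^2 + \sum_{i=1}^{N}\frac{M_i}{m_i}\Psi_i\right) \tag{24}
\]
and its estimate
\[
\hat{V}\left(\hat{\theta}_{a_{ppswr}}\right) = \frac{1}{n(n - 1)}\sum_{i=1}^{n}\left(\hat{\theta}_{ia} - \frac{\hat{\theta}_{a_{ppswr}}}{M_0}\right)^2. \tag{25}
\]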

3.1.3.2. Probability proportional to size without replacement. $\Phi_i$ is the probability that the $i$th unit belongs to the first-stage sample and $\Phi_{ij}$ is the probability that both the $i$th and the $j$th units belong to this sample using probability proportional to size sampling without replacement. Hence, the unbiased estimator of $\theta_a$ is
\[
\hat{\theta}_{a_{ppswor}} = \frac{1}{M_0}\sum_{i=1}^{n}\frac{M_i\hat{\theta}_{ia}}{\Phi_i}. \tag{26}
\]
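Eq. (26) weights each sampled cluster by the inverse of its inclusion probability. A minimal sketch, in which the function name and the argument `phi` (the list of $\Phi_i$ for the sampled clusters) are illustrative assumptions:

```python
def theta_a_ppswor(theta_hat, M, phi, M0):
    """Eq. (26): estimate under PPS sampling without replacement.

    phi holds the inclusion probabilities of the sampled clusters; how
    they are obtained depends on the chosen without-replacement scheme.
    """
    return sum(Mi * th / ph for Mi, th, ph in zip(M, theta_hat, phi)) / M0
```

Unlike Eq. (8), there is no division by $n$ here, because $\Phi_i$ already refers to inclusion in the whole sample rather than to a single draw.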

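The variance of the estimator $\hat{\theta}_{a_{ppswor}}$ is
\[
V\left(\hat{\theta}_{a_{ppswor}}\right) = \frac{1}{M_0^2}\left(\sum_{i=1}^{N}\sum_{j>i}^{N}\left(\Phi_i\Phi_j - \Phi_{ij}\right)\left(\frac{M_i\theta_{ia}}{\Phi_i} - \frac{M_j\theta_{ja}}{\Phi_j}\right)^2 + \sum_{i=1}^{N}\frac{M_i^2}{m_i}\Psi_i\right) \tag{27}
\]
and its estimate
\[
\hat{V}\left(\hat{\theta}_{a_{ppswor}}\right) = \frac{1}{M_0^2}\left(\sum_{i=1}^{n}\sum_{j>i}^{n}\frac{\Phi_i\Phi_j - \Phi_{ij}}{\Phi_{ij}}\left(\frac{M_i\hat{\theta}_{ia}}{\Phi_i} - \frac{M_j\hat{\theta}_{ja}}{\Phi_j}\right)^2 + \sum_{i=1}^{n}\frac{M_i^2}{m_i - 1}\hat{\Psi}_i\right), \tag{28}
\]
where
\[
\hat{\Psi}_i = \frac{T_{1i}\hat{\theta}_{ia} + T_{2i}\hat{\theta}_{ib}}{T_{1i}^2\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)}. \tag{29}
\]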
3.2. When the unrelated rare non-sensitive attribute is unknown


Because two parameters are unknown, responses are collected twice from each individual using two randomization devices in each cluster. These randomization devices consist of decks of $k_i$ cards similar to those described in Section 3.1. Initially, the respondents selected from the $i$th cluster are requested to answer “yes” or “no” using the first randomization device, based on the statements

(i) “Do you have the rare sensitive attribute A?”, with probability $T_{1i}$;
(ii) “Do you have the unrelated rare non-sensitive attribute B?”, with probability $T_{2i}$;
(iii) “Draw another card”, with probability $T_{3i} = 1 - T_{1i} - T_{2i}$.

The rest of the procedure is the same as in Section 3.1.


The respondents selected from the $i$th cluster are then invited to answer the same questions using a second randomization device, which presents the aforementioned statements with probabilities $P_{1i}$, $P_{2i}$, and $P_{3i}$ instead of $T_{1i}$, $T_{2i}$, and $T_{3i}$. The rest of the procedure is the same as in Section 3.1.
Based on the responses collected using the two randomization devices, the probabilities that respondents in the $i$th cluster answer “yes” are
\[
\zeta_{i1} = \left(T_{1i}\pi_{ia} + T_{2i}\pi_{ib}\right)\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right) \tag{30}
\]
and
\[
\zeta_{i2} = \left(P_{1i}\pi_{ia} + P_{2i}\pi_{ib}\right)\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right). \tag{31}
\]
The attributes A and B are rare in the population; therefore, in the $i$th cluster, for $m_i \to \infty$, $\zeta_{i1} \to 0$, and $\zeta_{i2} \to 0$, we have $m_i\zeta_{i1} = \theta_{i1} > 0$ and $m_i\zeta_{i2} = \theta_{i2} > 0$. Subsequently, Eq. (30) and (31) become
\[
\theta_{i1} = \left(T_{1i}\theta_{ia} + T_{2i}\theta_{ib}\right)\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right) \tag{32}
\]
and
\[
\theta_{i2} = \left(P_{1i}\theta_{ia} + P_{2i}\theta_{ib}\right)\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right). \tag{33}
\]

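Proceeding as for Eq. (7) and using Eq. (32) and (33), we get
\[
\frac{1}{m_i}\sum_{j=1}^{m_i}y_{i1j} = \left(1 + T_{3i}\frac{k_i}{k_i - 1}\right)\left(T_{1i}\hat{\theta}_{ia} + T_{2i}\hat{\theta}_{ib}\right) \tag{34}
\]
and
\[
\frac{1}{m_i}\sum_{j=1}^{m_i}y_{i2j} = \left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\left(P_{1i}\hat{\theta}_{ia} + P_{2i}\hat{\theta}_{ib}\right), \tag{35}
\]
where $y_{i1j}$ and $y_{i2j}$ are the first and the second answers of the $j$th $(j = 1, 2, \ldots, m_i)$ respondent in the $i$th cluster. Solving Eq. (34) and (35), the estimators of $\theta_{ia}$ and $\theta_{ib}$ are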
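\[
\hat{\theta}_{iau} = \frac{1}{m_i\left(T_{1i}P_{2i} - P_{1i}T_{2i}\right)}\left(\frac{P_{2i}\sum_{j=1}^{m_i}y_{i1j}}{1 + T_{3i}\frac{k_i}{k_i - 1}} - \frac{T_{2i}\sum_{j=1}^{m_i}y_{i2j}}{1 + P_{3i}\frac{k_i}{k_i - 1}}\right), \tag{36}
\]
where $T_{1i}P_{2i} \neq P_{1i}T_{2i}$, and
\[
\hat{\theta}_{ibu} = \frac{1}{m_i\left(P_{1i}T_{2i} - T_{1i}P_{2i}\right)}\left(\frac{P_{1i}\sum_{j=1}^{m_i}y_{i1j}}{1 + T_{3i}\frac{k_i}{k_i - 1}} - \frac{T_{1i}\sum_{j=1}^{m_i}y_{i2j}}{1 + P_{3i}\frac{k_i}{k_i - 1}}\right), \tag{37}
\]
where $P_{1i}T_{2i} \neq T_{1i}P_{2i}$.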
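Eq. (36) and (37) are the solution of the two linear equations (34) and (35) in the two unknowns. A minimal sketch follows; the function name and the tuple arguments `T = (T1i, T2i)` and `P = (P1i, P2i)` are illustrative assumptions, not notation from the article.

```python
def theta_hat_two_devices(y1, y2, T, P, k):
    """Eq. (36)-(37): estimates of (theta_ia, theta_ib) from two devices.

    y1 and y2 are the first and second responses of the same m_i
    respondents of one cluster.
    """
    (t1, t2), (p1, p2) = T, P
    det = t1 * p2 - p1 * t2
    if det == 0:
        raise ValueError("the two devices must satisfy T1i*P2i != P1i*T2i")
    m = len(y1)
    a = sum(y1) / (1 + (1 - t1 - t2) * k / (k - 1))   # first-device sum, rescaled
    b = sum(y2) / (1 + (1 - p1 - p2) * k / (k - 1))   # second-device sum, rescaled
    theta_a = (p2 * a - t2 * b) / (m * det)
    theta_b = (p1 * a - t1 * b) / (m * (-det))
    return theta_a, theta_b
```

When the two devices use nearly proportional probabilities, the determinant $T_{1i}P_{2i} - P_{1i}T_{2i}$ is close to zero and the estimates become unstable, which is why the condition attached to Eq. (36) matters in practice.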


The final estimator of the mean total number of persons having a rare sensitive attribute in the population is then
\[
\hat{\theta}_{a_{pptu}} = \frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\hat{\theta}_{iau}}{p_i}. \tag{38}
\]

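3.2.1. Properties of the estimator $\hat{\theta}_{a_{pptu}}$

Theorem 3.3. The estimator $\hat{\theta}_{a_{pptu}}$ of the mean total number of persons having the rare sensitive attribute is unbiased.
Proof. Consider
\[
E\left(\hat{\theta}_{a_{pptu}}\right) = E_1E_2\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\hat{\theta}_{iau}}{p_i}\right) = E_1\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i}{p_i}E_2(\hat{\theta}_{iau})\right). \tag{39}
\]
We have
\[
\begin{aligned}
E_2\left(\hat{\theta}_{iau}\right)
&= E_2\left(\frac{1}{m_i\left(T_{1i}P_{2i} - P_{1i}T_{2i}\right)}\left(\frac{P_{2i}\sum_{j=1}^{m_i}y_{i1j}}{1 + T_{3i}\frac{k_i}{k_i - 1}} - \frac{T_{2i}\sum_{j=1}^{m_i}y_{i2j}}{1 + P_{3i}\frac{k_i}{k_i - 1}}\right)\right)\\
&= \frac{1}{m_i\left(T_{1i}P_{2i} - P_{1i}T_{2i}\right)}\left(\frac{P_{2i}\sum_{j=1}^{m_i}E_2(y_{i1j})}{1 + T_{3i}\frac{k_i}{k_i - 1}} - \frac{T_{2i}\sum_{j=1}^{m_i}E_2(y_{i2j})}{1 + P_{3i}\frac{k_i}{k_i - 1}}\right)\\
&= \frac{1}{m_i\left(T_{1i}P_{2i} - P_{1i}T_{2i}\right)}\left(\frac{P_{2i}\sum_{j=1}^{m_i}\theta_{i1}}{1 + T_{3i}\frac{k_i}{k_i - 1}} - \frac{T_{2i}\sum_{j=1}^{m_i}\theta_{i2}}{1 + P_{3i}\frac{k_i}{k_i - 1}}\right),
\end{aligned}
\tag{40}
\]
because $y_{i1j}$ and $y_{i2j}$ follow Poisson distributions with parameters $\theta_{i1}$ and $\theta_{i2}$.
Substituting the value of $\theta_{i1}$ from Eq. (32) and the value of $\theta_{i2}$ from Eq. (33) and simplifying, we obtain $E_2(\hat{\theta}_{iau}) = \theta_{ia}$, so that
\[
E_1E_2\left(\hat{\theta}_{a_{pptu}}\right) = E_1\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\theta_{ia}}{p_i}\right) = \frac{1}{M_0}\sum_{i=1}^{N}p_i\frac{M_i\theta_{ia}}{p_i} = \theta_a. \tag{41}
\]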

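□

Theorem 3.4. The variance of the estimator $\hat{\theta}_{a_{pptu}}$ is
\[
V\left(\hat{\theta}_{a_{pptu}}\right) = \frac{1}{nM_0^2}\left(\sum_{i=1}^{N}p_i\left(\frac{M_i\theta_{ia}}{p_i} - M_0\theta_a\right)^2 + \sum_{i=1}^{N}\frac{M_i^2\Delta_i}{p_im_i}\right), \tag{42}
\]
where
\[
\Delta_i = \left(T_{1i}C_iP_{2i} + P_{1i}T_{2i}D_i\right)\theta_{ia} + \left(T_{2i}P_{2i}C_i + P_{2i}T_{2i}D_i\right)\theta_{ib} - 2P_{2i}T_{2i}\left(T_{1i}P_{1i}\theta_{ia} + T_{2i}P_{2i}\theta_{ib}\right), \tag{43}
\]
\[
C_i = \frac{P_{2i}}{1 + T_{3i}\frac{k_i}{k_i - 1}} \quad \text{and} \quad D_i = \frac{T_{2i}}{P_{1i}^2\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)}. \tag{44}
\]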

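Proof. The variance of the estimator $\hat{\theta}_{a_{pptu}}$ is decomposed as
\[
V\left(\hat{\theta}_{a_{pptu}}\right) = V_1E_2\left(\hat{\theta}_{a_{pptu}}\right) + E_1V_2\left(\hat{\theta}_{a_{pptu}}\right). \tag{45}
\]
The first term of Eq. (45) is
\[
\begin{aligned}
V_1E_2\left(\hat{\theta}_{a_{pptu}}\right)
&= V_1E_2\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\hat{\theta}_{iau}}{p_i}\right)
 = V_1\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\theta_{ia}}{p_i}\right)\\
&= \frac{1}{nM_0^2}\sum_{i=1}^{N}p_i\left(\frac{M_i\theta_{ia}}{p_i} - M_0\theta_a\right)^2.
\end{aligned}
\tag{46}
\]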

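The second term of Eq. (45) is
\[
\begin{aligned}
E_1V_2\left(\hat{\theta}_{a_{pptu}}\right)
&= E_1V_2\left(\frac{1}{nM_0}\sum_{i=1}^{n}\frac{M_i\hat{\theta}_{iau}}{p_i}\right)
 = E_1\left(\frac{1}{(nM_0)^2}\sum_{i=1}^{n}\frac{M_i^2V_2(\hat{\theta}_{iau})}{p_i^2}\right)\\
&= E_1\left(\frac{1}{(nM_0)^2}\sum_{i=1}^{n}\frac{M_i^2}{p_i^2}V_2\left(\frac{1}{m_i\left(T_{1i}P_{2i} - P_{1i}T_{2i}\right)}\left(\frac{P_{2i}\sum_{j=1}^{m_i}y_{i1j}}{1 + T_{3i}\frac{k_i}{k_i - 1}} - \frac{T_{2i}\sum_{j=1}^{m_i}y_{i2j}}{1 + P_{3i}\frac{k_i}{k_i - 1}}\right)\right)\right).
\end{aligned}
\tag{47}
\]
Writing $\delta_i = m_i\left(T_{1i}P_{2i} - P_{1i}T_{2i}\right)$, we have
\[
\begin{aligned}
E_1V_2\left(\hat{\theta}_{a_{pptu}}\right)
&= E_1\left(\frac{1}{(nM_0)^2}\sum_{i=1}^{n}\frac{M_i^2}{p_i^2}\frac{1}{\delta_i^2}\left(C_i^2\sum_{j=1}^{m_i}V_2(y_{i1j}) + D_i^2\sum_{j=1}^{m_i}V_2(y_{i2j}) - 2C_iD_i\sum_{j=1}^{m_i}\mathrm{Cov}(y_{i1j}, y_{i2j})\right)\right)\\
&= E_1\left(\frac{1}{(nM_0)^2}\sum_{i=1}^{n}\frac{M_i^2}{p_i^2}\frac{1}{\delta_i^2}\left(C_i^2\sum_{j=1}^{m_i}\theta_{i1} + D_i^2\sum_{j=1}^{m_i}\theta_{i2} - 2C_iD_i\sum_{j=1}^{m_i}\theta_{i12}\right)\right)\\
&= \frac{1}{nM_0^2}\sum_{i=1}^{N}\frac{M_i^2}{p_i}\frac{1}{\delta_i^2}\left(C_i^2\sum_{j=1}^{m_i}\theta_{i1} + D_i^2\sum_{j=1}^{m_i}\theta_{i2} - 2C_iD_i\sum_{j=1}^{m_i}\theta_{i12}\right),
\end{aligned}
\tag{48}
\]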
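where
\[
\theta_{i1} = V(y_{i1j}) = \left(T_{1i}\theta_{ia} + T_{2i}\theta_{ib}\right)\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right), \tag{49}
\]
\[
\theta_{i2} = V(y_{i2j}) = \left(P_{1i}\theta_{ia} + P_{2i}\theta_{ib}\right)\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right), \tag{50}
\]
and
\[
\begin{aligned}
\theta_{i12} &= \mathrm{Cov}(y_{i1j}, y_{i2j}) = E(y_{i1j}y_{i2j}) - E(y_{i1j})E(y_{i2j})\\
&= \left(P_{1i}T_{1i}\theta_{ia} + P_{2i}T_{2i}\theta_{ib}\right)\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)\left(1 + T_{3i}\frac{k_i}{k_i - 1}\right).
\end{aligned}
\tag{51}
\]
Substituting the values from Eq. (46) and (48) into Eq. (45), we get the expression of the variance of the estimator $\hat{\theta}_{a_{pptu}}$ as given in Eq. (42). □
Finally, the estimate of the variance of $\hat{\theta}_{a_{pptu}}$ is
\[
\hat{V}\left(\hat{\theta}_{a_{pptu}}\right) = \frac{1}{n(n - 1)M_0^2}\sum_{i=1}^{n}\left(\hat{\theta}_{a_{pptu}} - \frac{M_i}{p_i}\hat{\theta}_{iau}\right)^2. \tag{52}
\]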

3.2.2. Estimation when the first-stage sample is selected using simple random sampling with replacement
In this case, the probability of selecting the clusters in the first stage is $p_i = \frac{1}{N}$ $(i = 1, 2, \ldots, n)$. The estimator of $\theta_a$, when first-stage sample units are selected with equal probability and with replacement in two-stage sampling, is
\[
\hat{\theta}_{a_{wru}} = \frac{N}{nM_0}\sum_{i=1}^{n}M_i\hat{\theta}_{iau}. \tag{53}
\]

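The variance of the estimator $\hat{\theta}_{a_{wru}}$ is
\[
V\left(\hat{\theta}_{a_{wru}}\right) = \frac{N}{nM_0^2}\left(\frac{N}{N - 1}\sum_{i=1}^{N}\left(M_i\theta_{ia} - \bar{M}\theta_a\right)^2 + \sum_{i=1}^{N}M_i^2\frac{\Delta_i}{m_i\left(T_{1i}P_{2i} - P_{1i}T_{2i}\right)^2}\right) \tag{54}
\]
and its estimate
\[
\hat{V}\left(\hat{\theta}_{a_{wru}}\right) = \frac{1}{n(n - 1)M_0^2}\sum_{i=1}^{n}\left(NM_i\hat{\theta}_{iau} - \hat{\theta}_{a_{wru}}\right)^2, \quad \text{where } \bar{M} = \frac{M_0}{N}. \tag{55}
\]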

3.2.3. Estimation when the first-stage sample is selected with probability proportional to size sampling
3.2.3.1. Probability proportional to size sampling with replacement. When first-stage sample units are selected using probability proportional to size sampling with replacement, the probability of selecting the $i$th cluster in the sample is $p_i = \frac{M_i}{M_0}$. The corresponding unbiased estimator of $\theta_a$ is
\[
\hat{\theta}_{a_{ppswru}} = \frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_{iau}. \tag{56}
\]

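The variance of $\hat{\theta}_{a_{ppswru}}$ is
\[
V\left(\hat{\theta}_{a_{ppswru}}\right) = \frac{N}{nM_0^2}\left(\sum_{i=1}^{N}\left(M_i\left(\theta_{ia} - \theta_a\right)\right)^2 + \sum_{i=1}^{N}M_i\frac{\Delta_i}{m_i\left(T_{1i}P_{2i} - P_{1i}T_{2i}\right)^2}\right) \tag{57}
\]
and its estimate
\[
\hat{V}\left(\hat{\theta}_{a_{ppswru}}\right) = \frac{1}{n(n - 1)}\sum_{i=1}^{n}\left(\hat{\theta}_{iau} - \frac{\hat{\theta}_{a_{ppswru}}}{M_0}\right)^2. \tag{58}
\]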

3.2.3.2. Probability proportional to size sampling without replacement. $\Phi_i$ is the probability that the $i$th unit belongs to the first-stage sample and $\Phi_{ij}$ is the probability that both the $i$th and the $j$th units belong to this sample, using probability proportional to size sampling without replacement. The unbiased estimator of $\theta_a$ is
\[
\hat{\theta}_{a_{ppsworu}} = \frac{1}{M_0}\sum_{i=1}^{n}\frac{M_i\hat{\theta}_{iau}}{\Phi_i}. \tag{59}
\]

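The variance of $\hat{\theta}_{a_{ppsworu}}$ is
\[
V\left(\hat{\theta}_{a_{ppsworu}}\right) = \frac{1}{M_0^2}\left(\sum_{i=1}^{N}\sum_{j>i}^{N}\left(\Phi_i\Phi_j - \Phi_{ij}\right)\left(\frac{M_i\theta_{ia}}{\Phi_i} - \frac{M_j\theta_{ja}}{\Phi_j}\right)^2 + \sum_{i=1}^{N}\frac{M_i^2}{m_i\left(T_{1i}P_{2i} - P_{1i}T_{2i}\right)^2}\Delta_i\right), \tag{60}
\]
and its estimate
\[
\hat{V}\left(\hat{\theta}_{a_{ppsworu}}\right) = \frac{1}{M_0^2}\left(\sum_{i=1}^{n}\sum_{j>i}^{n}\frac{\Phi_i\Phi_j - \Phi_{ij}}{\Phi_{ij}}\left(\frac{M_i\hat{\theta}_{iau}}{\Phi_i} - \frac{M_j\hat{\theta}_{jau}}{\Phi_j}\right)^2 + \sum_{i=1}^{n}\frac{M_i^2}{(m_i - 1)\delta_i^2}\hat{\Delta}_i\right), \tag{61}
\]
where
\[
\hat{\Delta}_i = P_{2i}^2\frac{T_{1i}\hat{\theta}_{iau} + T_{2i}\hat{\theta}_{ibu}}{1 + T_{3i}\frac{k_i}{k_i - 1}} + T_{2i}^2\frac{P_{1i}\hat{\theta}_{iau} + P_{2i}\hat{\theta}_{ibu}}{P_{1i}^2\left(1 + P_{3i}\frac{k_i}{k_i - 1}\right)} - 2P_{2i}T_{2i}\left(T_{1i}P_{1i}\hat{\theta}_{iau} + T_{2i}P_{2i}\hat{\theta}_{ibu}\right), \tag{62}
\]
and
\[
\delta_i = T_{1i}P_{2i} - P_{1i}T_{2i}. \tag{63}
\]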

4. Estimation of a rare sensitive attribute under stratified two-stage sampling using a randomized response model
We consider a stratified population with $L$ strata such that the $h$th stratum has $N_h$, $h = 1, 2, \ldots, L$, clusters, which are the first-stage units. $M_{hi}$ is the size of the $i$th cluster in the $h$th stratum, $i = 1, 2, \ldots, N_h$. At the first stage, we select $n_h$ clusters, which are the first-stage units, from the $h$th stratum with probability $p_{hi}$. At the second stage, we select $m_{hi}$, $i = 1, 2, \ldots, n_h$, second-stage units from the $i$th cluster taken from the $h$th stratum, using simple random sampling with replacement.

4.1. When the unrelated rare non-sensitive attribute is known


Consider the two-stage sampling scheme for a stratified population where the proportion $\pi_{hib}$ of people having the unrelated rare non-sensitive attribute B in the $i$th cluster of the $h$th stratum is known. Using the randomization device described in Section 3.1, the probability that a respondent answers “yes” in the $i$th cluster of the $h$th stratum is
\[
\zeta_{hi0} = \left(T_{1hi}\pi_{hia} + T_{2hi}\pi_{hib}\right)\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right). \tag{64}
\]
The symbols $T_{1hi}$, $T_{2hi}$, and $T_{3hi}$ are the probabilities of presenting the statements (i), (ii), and (iii) in the randomization device used in the $i$th cluster of the $h$th stratum $(T_{1hi} + T_{2hi} + T_{3hi} = 1)$. $\pi_{hia}$ is the proportion of individuals having the rare sensitive attribute A in the $i$th cluster of the $h$th stratum. Because the attributes A and B are rare, $m_{hi}\zeta_{hi0} = \theta_{hi0} > 0$, $m_{hi}\pi_{hia} = \theta_{hia} > 0$, and $m_{hi}\pi_{hib} = \theta_{hib} > 0$ for $m_{hi} \to \infty$ as $\zeta_{hi0} \to 0$, $\pi_{hia} \to 0$, and $\pi_{hib} \to 0$. $(y_{hi1}, y_{hi2}, \ldots, y_{him_{hi}})$ is the random sample of size $m_{hi}$ drawn from a Poisson distribution of mean $\theta_{hi0}$ from the $i$th cluster of the $h$th stratum.
The estimator $\hat{\theta}_{hia}$ of the mean total number of persons bearing the rare sensitive attribute in the $i$th cluster of the $h$th stratum is defined as
\[
\hat{\theta}_{hia} = \frac{1}{T_{1hi}}\left(\frac{\sum_{j=1}^{m_{hi}}y_{hij}}{m_{hi}\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)} - T_{2hi}\theta_{hib}\right). \tag{65}
\]

An estimator of the mean total number $\theta_{ha}$ of individuals having the rare sensitive attribute in the $h$th stratum is
\[
\hat{\theta}_{a_{hppt}} = \frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\hat{\theta}_{hia}}{p_{hi}}, \tag{66}
\]
where $M_{h0} = \sum_{i=1}^{N_h}M_{hi}$.
Under the stratified two-stage sampling design, the final estimator $\hat{\theta}_{a_{sppt}}$ of the mean total number of persons having a rare sensitive attribute in the population is
\[
\hat{\theta}_{a_{sppt}} = \sum_{h=1}^{L}W_h\frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\hat{\theta}_{hia}}{p_{hi}}, \tag{67}
\]
where $p_{hi}$ is the initial probability of drawing the $i$th cluster, which is a first-stage unit, in the $h$th stratum, $W_h = \frac{N_h}{N}$, and $N = \sum_{h=1}^{L}N_h$.
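The stratified estimator of Eq. (67) is a weighted sum of the stratum-level estimates of Eq. (66). A minimal sketch follows; the dictionary keys (`N_h`, `M_h0`, `M`, `p`, `theta_hat`) and function names are illustrative assumptions chosen for this example.

```python
def stratum_estimate(theta_hat, M, p, M_h0):
    """Eq. (66): two-stage estimate within one stratum."""
    n_h = len(theta_hat)
    return sum(Mi * th / pi for Mi, th, pi in zip(M, theta_hat, p)) / (n_h * M_h0)


def stratified_estimate(strata):
    """Eq. (67): sum over strata of W_h times the stratum estimate, W_h = N_h / N."""
    N = sum(s["N_h"] for s in strata)
    return sum(
        s["N_h"] / N * stratum_estimate(s["theta_hat"], s["M"], s["p"], s["M_h0"])
        for s in strata
    )
```

Each dictionary describes one stratum: the number of clusters it contains, its total size, and the sizes, selection probabilities, and cluster estimates of the sampled clusters.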

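4.1.1. Properties of the estimator $\hat{\theta}_{a_{sppt}}$

Theorem 4.1. The estimator $\hat{\theta}_{a_{sppt}}$ of the mean total number of persons having the rare sensitive attribute is unbiased.
Proof. Because $y_{hij} \overset{iid}{\sim} P(\theta_{hi0})$, $E(y_{hij}) = \theta_{hi0}$, where
\[
\theta_{hi0} = \left(T_{1hi}\theta_{hia} + T_{2hi}\theta_{hib}\right)\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right). \tag{68}
\]
Also, $\hat{\theta}_{hia}$ is an unbiased estimator of $\theta_{hia}$. Hence,
\[
\begin{aligned}
E\left(\hat{\theta}_{a_{sppt}}\right)
&= E_1E_2\left(\sum_{h=1}^{L}W_h\frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\hat{\theta}_{hia}}{p_{hi}}\right)\\
&= E_1\left(\sum_{h=1}^{L}W_h\frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\theta_{hia}}{p_{hi}}\right)\\
&= \sum_{h=1}^{L}W_h\frac{1}{M_{h0}}\sum_{i=1}^{N_h}p_{hi}\frac{M_{hi}\theta_{hia}}{p_{hi}} = \sum_{h=1}^{L}W_h\theta_{ha} = \theta_a.
\end{aligned}
\tag{69}
\]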

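□

Theorem 4.2. The variance of the unbiased estimator $\hat{\theta}_{a_{sppt}}$ is
\[
V\left(\hat{\theta}_{a_{sppt}}\right) = \sum_{h=1}^{L}W_h^2\frac{1}{n_hM_{h0}^2}\left(\sum_{i=1}^{N_h}p_{hi}\left(\frac{M_{hi}\theta_{hia}}{p_{hi}} - M_{h0}\theta_{ha}\right)^2 + \sum_{i=1}^{N_h}\frac{M_{hi}^2\Psi_{hi}}{p_{hi}m_{hi}}\right), \tag{70}
\]
where
\[
\Psi_{hi} = \frac{T_{1hi}\theta_{hia} + T_{2hi}\theta_{hib}}{T_{1hi}^2\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)}. \tag{71}
\]
Proof. The variance of the estimator $\hat{\theta}_{a_{sppt}}$ is decomposed as
\[
V\left(\hat{\theta}_{a_{sppt}}\right) = V_1E_2\left(\hat{\theta}_{a_{sppt}}\right) + E_1V_2\left(\hat{\theta}_{a_{sppt}}\right). \tag{72}
\]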

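The first term of Eq. (72) is simplified to
\[
\begin{aligned}
V_1E_2\left(\hat{\theta}_{a_{sppt}}\right)
&= V_1E_2\left(\sum_{h=1}^{L}W_h\frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\hat{\theta}_{hia}}{p_{hi}}\right)\\
&= V_1\left(\sum_{h=1}^{L}W_h\frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\theta_{hia}}{p_{hi}}\right)\\
&= \sum_{h=1}^{L}W_h^2\frac{1}{n_hM_{h0}^2}\sum_{i=1}^{N_h}p_{hi}\left(\frac{M_{hi}\theta_{hia}}{p_{hi}} - M_{h0}\theta_{ha}\right)^2.
\end{aligned}
\tag{73}
\]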
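Because $y_{hij} \overset{iid}{\sim} P(\theta_{hi0})$, $V(y_{hij}) = \theta_{hi0}$, and the second term of Eq. (72) is simplified to
\[
\begin{aligned}
E_1V_2\left(\hat{\theta}_{a_{sppt}}\right)
&= E_1V_2\left(\sum_{h=1}^{L}W_h\frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\hat{\theta}_{hia}}{p_{hi}}\right)\\
&= E_1\left(\sum_{h=1}^{L}W_h^2\frac{1}{(n_hM_{h0})^2}\sum_{i=1}^{n_h}\frac{M_{hi}^2V_2(\hat{\theta}_{hia})}{p_{hi}^2}\right)\\
&= E_1\left(\sum_{h=1}^{L}W_h^2\frac{1}{(n_hM_{h0})^2}\sum_{i=1}^{n_h}\frac{M_{hi}^2}{p_{hi}^2T_{1hi}^2}\frac{\sum_{j=1}^{m_{hi}}V_2(y_{hij})}{m_{hi}^2\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)^2}\right)\\
&= E_1\left(\sum_{h=1}^{L}W_h^2\frac{1}{(n_hM_{h0})^2}\sum_{i=1}^{n_h}\frac{M_{hi}^2}{p_{hi}^2T_{1hi}^2}\frac{\theta_{hi0}}{m_{hi}\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)^2}\right)\\
&= E_1\left(\sum_{h=1}^{L}W_h^2\frac{1}{(n_hM_{h0})^2}\sum_{i=1}^{n_h}\frac{M_{hi}^2}{p_{hi}^2T_{1hi}^2}\frac{\left(T_{1hi}\theta_{hia} + T_{2hi}\theta_{hib}\right)\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)}{m_{hi}\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)^2}\right).
\end{aligned}
\tag{74}
\]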

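After simplification,
\[
E_1V_2\left(\hat{\theta}_{a_{sppt}}\right) = \sum_{h=1}^{L}W_h^2\frac{1}{n_hM_{h0}^2}\sum_{i=1}^{N_h}\frac{M_{hi}^2\Psi_{hi}}{p_{hi}m_{hi}}, \tag{75}
\]
where
\[
\Psi_{hi} = \frac{T_{1hi}\theta_{hia} + T_{2hi}\theta_{hib}}{T_{1hi}^2\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)}. \tag{76}
\]
Adding Eq. (73) and (75), we get the variance of the estimator $\hat{\theta}_{a_{sppt}}$ as given in Eq. (70). □
The unbiased estimator of the variance of $\hat{\theta}_{a_{sppt}}$ is
\[
\hat{V}\left(\hat{\theta}_{a_{sppt}}\right) = \sum_{h=1}^{L}W_h^2\frac{1}{n_h(n_h - 1)M_{h0}^2}\sum_{i=1}^{n_h}\left(\frac{M_{hi}\hat{\theta}_{hia}}{p_{hi}} - \hat{\theta}_{ha}\right)^2. \tag{77}
\]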

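4.1.2. Estimation when the first-stage sample is selected using simple random sampling with replacement
When first-stage sample units are selected with equal probabilities and with replacement in two-stage sampling, the selection probability for all the selected clusters at the first stage from the $h$th stratum is $p_{hi} = \frac{1}{N_h}$ and the estimator of $\theta_a$ is
\[
\hat{\theta}_{a_{swr}} = \sum_{h=1}^{L}W_h\frac{N_h}{n_hM_{h0}}\sum_{i=1}^{n_h}M_{hi}\hat{\theta}_{hia}. \tag{78}
\]
The variance of the estimator $\hat{\theta}_{a_{swr}}$ is
\[
V\left(\hat{\theta}_{a_{swr}}\right) = \sum_{h=1}^{L}W_h^2\frac{N_h}{n_hM_{h0}^2}\left(\frac{N_h}{N_h - 1}\sum_{i=1}^{N_h}\left(M_{hi}\theta_{hia} - \bar{M}_h\theta_{ha}\right)^2 + \sum_{i=1}^{N_h}\frac{M_{hi}^2}{m_{hi}}\Psi_{hi}\right), \tag{79}
\]
and its estimate
\[
\hat{V}\left(\hat{\theta}_{a_{swr}}\right) = \sum_{h=1}^{L}W_h^2\frac{1}{n_h(n_h - 1)M_{h0}^2}\sum_{i=1}^{n_h}\left(N_hM_{hi}\hat{\theta}_{hia} - \hat{\theta}_{ha}\right)^2,
\]
where $\bar{M}_h = \frac{M_{h0}}{N_h}$.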

4.1.3. Estimation when the first-stage sample is selected using probability proportional to size sampling
We consider estimation procedures when first-stage sample units from each stratum are selected using probability proportional to size sampling, with replacement and without replacement.

4.1.3.1. Probability proportional to size sampling with replacement. The probability of selecting the $i$th cluster from the $h$th stratum in the sample is $p_{hi} = \frac{M_{hi}}{M_{h0}}$; hence the corresponding unbiased estimator of $\theta_a$ is
\[
\hat{\theta}_{a_{sppswr}} = \sum_{h=1}^{L}W_h\frac{1}{n_h}\sum_{i=1}^{n_h}\hat{\theta}_{hia}. \tag{80}
\]

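The variance of the estimator $\hat{\theta}_{a_{sppswr}}$ is
\[
V\left(\hat{\theta}_{a_{sppswr}}\right) = \sum_{h=1}^{L}W_h^2\frac{1}{n_hM_{h0}}\left(\sum_{i=1}^{N_h}M_{hi}\left(\theta_{hia} - \theta_{ha}\right)^2 + \sum_{i=1}^{N_h}\frac{M_{hi}}{m_{hi}}\Psi_{hi}\right), \tag{81}
\]
and its estimate
\[
\hat{V}\left(\hat{\theta}_{a_{sppswr}}\right) = \sum_{h=1}^{L}W_h^2\frac{1}{n_h(n_h - 1)M_{h0}^2}\sum_{i=1}^{n_h}\left(\hat{\theta}_{hia} - \frac{\hat{\theta}_{ha}}{M_0}\right)^2. \tag{82}
\]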

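4.1.3.2. Probability proportional to size sampling without replacement. $\Phi_{hi}$ is the probability that the $i$th unit belongs to the first-stage sample and $\Phi_{hij}$ the probability that both the $i$th and the $j$th units belong to this sample, using probability proportional to size sampling without replacement from the $h$th stratum. The unbiased estimator of $\theta_a$ is
\[
\hat{\theta}_{a_{sppswor}} = \sum_{h=1}^{L}W_h\frac{1}{M_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\hat{\theta}_{hia}}{\Phi_{hi}}. \tag{83}
\]
The variance of the estimator $\hat{\theta}_{a_{sppswor}}$ is
\[
V\left(\hat{\theta}_{a_{sppswor}}\right) = \sum_{h=1}^{L}W_h^2\frac{1}{M_{h0}^2}\left(\sum_{i=1}^{N_h}\sum_{j>i}^{N_h}\left(\Phi_{hi}\Phi_{hj} - \Phi_{hij}\right)\left(\frac{M_{hi}\theta_{hia}}{\Phi_{hi}} - \frac{M_{hj}\theta_{hja}}{\Phi_{hj}}\right)^2 + \sum_{i=1}^{N_h}\frac{M_{hi}^2}{m_{hi}}\Psi_{hi}\right), \tag{84}
\]
and its estimate
\[
\hat{V}\left(\hat{\theta}_{a_{sppswor}}\right) = \sum_{h=1}^{L}W_h^2\frac{1}{M_{h0}^2}\left(\sum_{i=1}^{n_h}\sum_{j>i}^{n_h}\frac{\Phi_{hi}\Phi_{hj} - \Phi_{hij}}{\Phi_{hij}}\left(\frac{M_{hi}\hat{\theta}_{hia}}{\Phi_{hi}} - \frac{M_{hj}\hat{\theta}_{hja}}{\Phi_{hj}}\right)^2 + \sum_{i=1}^{n_h}\frac{M_{hi}^2}{m_{hi} - 1}\hat{\Psi}_{hi}\right), \tag{85}
\]
where
\[
\hat{\Psi}_{hi} = \frac{T_{1hi}\hat{\theta}_{hia} + T_{2hi}\hat{\theta}_{hib}}{T_{1hi}^2\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)}. \tag{86}
\]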

4.2. When the unrelated rare non-sensitive attribute is unknown


We consider the estimation procedure for the mean total number of persons having a rare sensitive attribute under stratified two-stage sampling design when $\pi_{hib}$ is unknown. The randomization device in the $i$th cluster of the $h$th stratum is the same as in Section 3.2. The probabilities that respondents in the $i$th cluster of the $h$th stratum answer “yes” are
\[
\zeta_{hi1} = \left(T_{1hi}\pi_{hia} + T_{2hi}\pi_{hib}\right)\left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right), \tag{87}
\]
and
\[
\zeta_{hi2} = \left(P_{1hi}\pi_{hia} + P_{2hi}\pi_{hib}\right)\left(1 + P_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right), \tag{88}
\]
where $(T_{1hi}, T_{2hi}, T_{3hi})$ are the probabilities of presenting the statements (i), (ii), and (iii) in the first randomization device, when this device is used in the $i$th cluster of the $h$th stratum, and $(P_{1hi}, P_{2hi}, P_{3hi})$ are the probabilities of presenting the statements (i), (ii), and (iii) in the second randomization device when it is used in the $i$th cluster of the $h$th stratum $(T_{1hi} + T_{2hi} + T_{3hi} = 1$, $P_{1hi} + P_{2hi} + P_{3hi} = 1)$.
Because the attributes A and B are rare, we write $m_{hi}\zeta_{hi1} = \theta_{hi1} > 0$ and $m_{hi}\zeta_{hi2} = \theta_{hi2} > 0$ as $m_{hi} \to \infty$, $\zeta_{hi1} \to 0$, and $\zeta_{hi2} \to 0$.
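As in Section 3.2, and simplifying Eq. (87) and (88), we get
\[
\frac{1}{m_{hi}}\sum_{j=1}^{m_{hi}}y_{hi1j} = \left(1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)\left(T_{1hi}\hat{\theta}_{hia} + T_{2hi}\hat{\theta}_{hib}\right), \tag{89}
\]
and
\[
\frac{1}{m_{hi}}\sum_{j=1}^{m_{hi}}y_{hi2j} = \left(1 + P_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)\left(P_{1hi}\hat{\theta}_{hia} + P_{2hi}\hat{\theta}_{hib}\right), \tag{90}
\]
where $y_{hi1j}$ and $y_{hi2j}$ are the first and the second answers of the $j$th $(j = 1, 2, \ldots, m_{hi})$ respondent in the $i$th cluster of the $h$th stratum.
Solving Eq. (89) and (90), the estimators of $\theta_{hia}$ and $\theta_{hib}$ are
\[
\hat{\theta}_{hiau} = \frac{1}{m_{hi}\left(T_{1hi}P_{2hi} - P_{1hi}T_{2hi}\right)}\left(\frac{P_{2hi}\sum_{j=1}^{m_{hi}}y_{hi1j}}{1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}} - \frac{T_{2hi}\sum_{j=1}^{m_{hi}}y_{hi2j}}{1 + P_{3hi}\frac{k_{hi}}{k_{hi} - 1}}\right), \tag{91}
\]
where $T_{1hi}P_{2hi} \neq P_{1hi}T_{2hi}$, and
\[
\hat{\theta}_{hibu} = \frac{1}{m_{hi}\left(P_{1hi}T_{2hi} - T_{1hi}P_{2hi}\right)}\left(\frac{P_{1hi}\sum_{j=1}^{m_{hi}}y_{hi1j}}{1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}} - \frac{T_{1hi}\sum_{j=1}^{m_{hi}}y_{hi2j}}{1 + P_{3hi}\frac{k_{hi}}{k_{hi} - 1}}\right), \tag{92}
\]
where $P_{1hi}T_{2hi} \neq T_{1hi}P_{2hi}$.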


The estimator of the mean total number $\theta_{ha}$ of persons having a rare sensitive attribute in the $h$th stratum is
\[
\hat{\theta}_{a_{hpptu}} = \frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\hat{\theta}_{hiau}}{p_{hi}}. \tag{93}
\]
The final estimator $\hat{\theta}_{a_{spptu}}$ of the mean total number $\theta_a$ of persons having a rare sensitive attribute in the population under stratified two-stage sampling design is
\[
\hat{\theta}_{a_{spptu}} = \sum_{h=1}^{L}W_h\frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\hat{\theta}_{hiau}}{p_{hi}}. \tag{94}
\]
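4.2.1. Properties of the estimator $\hat{\theta}_{a_{spptu}}$

Theorem 4.3. The estimator $\hat{\theta}_{a_{spptu}}$ of the mean total number of persons having the rare sensitive attribute is unbiased.

Proof. We consider
\[
E\left(\hat{\theta}_{a_{spptu}}\right) = E_1E_2\left(\hat{\theta}_{a_{spptu}}\right) = E_1E_2\left(\sum_{h=1}^{L}W_h\frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\hat{\theta}_{hiau}}{p_{hi}}\right). \tag{95}
\]
Consider
\[
\begin{aligned}
E_2\left(\hat{\theta}_{hiau}\right)
&= \frac{1}{m_{hi}\left(T_{1hi}P_{2hi} - P_{1hi}T_{2hi}\right)}\left(\frac{P_{2hi}\sum_{j=1}^{m_{hi}}E_2(y_{hi1j})}{1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}} - \frac{T_{2hi}\sum_{j=1}^{m_{hi}}E_2(y_{hi2j})}{1 + P_{3hi}\frac{k_{hi}}{k_{hi} - 1}}\right)\\
&= \frac{1}{m_{hi}\left(T_{1hi}P_{2hi} - P_{1hi}T_{2hi}\right)}\left(\frac{P_{2hi}\sum_{j=1}^{m_{hi}}\theta_{hi1}}{1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}} - \frac{T_{2hi}\sum_{j=1}^{m_{hi}}\theta_{hi2}}{1 + P_{3hi}\frac{k_{hi}}{k_{hi} - 1}}\right)\\
&= \theta_{hia},
\end{aligned}
\tag{96}
\]
because $y_{hi1j} \overset{iid}{\sim} P(\theta_{hi1})$ and $y_{hi2j} \overset{iid}{\sim} P(\theta_{hi2})$.
Eq. (95) is simplified to
\[
\begin{aligned}
E_1E_2\left(\hat{\theta}_{a_{spptu}}\right)
&= E_1\left(\sum_{h=1}^{L}W_h\frac{1}{n_hM_{h0}}\sum_{i=1}^{n_h}\frac{M_{hi}\theta_{hia}}{p_{hi}}\right)\\
&= \sum_{h=1}^{L}W_h\frac{1}{M_{h0}}\sum_{i=1}^{N_h}p_{hi}\frac{M_{hi}\theta_{hia}}{p_{hi}} = \sum_{h=1}^{L}W_h\frac{1}{M_{h0}}\sum_{i=1}^{N_h}M_{hi}\theta_{hia}\\
&= \sum_{h=1}^{L}W_h\theta_{ha} = \theta_a.
\end{aligned}
\tag{97}
\]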

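Theorem 4.4. The variance of the estimator $\hat{\theta}_{a_{spptu}}$ is
\[
V\left(\hat{\theta}_{a_{spptu}}\right) = \sum_{h=1}^{L}W_h^2\frac{1}{n_hM_{h0}^2}\left(\sum_{i=1}^{N_h}p_{hi}\left(\frac{M_{hi}\theta_{hia}}{p_{hi}} - M_{h0}\theta_{ha}\right)^2 + \sum_{i=1}^{N_h}\frac{M_{hi}^2\Delta_{hi}}{p_{hi}m_{hi}}\right), \tag{98}
\]
where
\[
\begin{aligned}
\Delta_{hi} &= \left(T_{1hi}C_{hi}P_{2hi} + P_{1hi}T_{2hi}D_{hi}\right)\theta_{hia} + \left(T_{2hi}P_{2hi}C_{hi} + P_{2hi}T_{2hi}D_{hi}\right)\theta_{hib}\\
&\quad - 2P_{2hi}T_{2hi}\left(T_{1hi}P_{1hi}\theta_{hia} + T_{2hi}P_{2hi}\theta_{hib}\right), \qquad C_{hi} = \frac{P_{2hi}}{1 + T_{3hi}\frac{k_{hi}}{k_{hi} - 1}},
\end{aligned}
\tag{99}
\]
and
\[
D_{hi} = \frac{T_{2hi}}{P_{1hi}^2\left(1 + P_{3hi}\frac{k_{hi}}{k_{hi} - 1}\right)}. \tag{100}
\]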

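Proof. The variance of the estimator $\hat{\theta}_{a\,spptu}$ is
\[
V\big(\hat{\theta}_{a\,spptu}\big) = V_1 E_2\big(\hat{\theta}_{a\,spptu}\big) + E_1 V_2\big(\hat{\theta}_{a\,spptu}\big). \tag{101}
\]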

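The first term of Eq. (101) is simplified to
\[
\begin{aligned}
V_1 E_2\big(\hat{\theta}_{a\,spptu}\big)
&= V_1 E_2\left( \sum_{h=1}^{L} W_h \frac{1}{n_h M_{h0}} \sum_{i=1}^{n_h} \frac{M_{hi}\,\hat{\theta}_{hiau}}{p_{hi}} \right) \\
&= V_1\left( \sum_{h=1}^{L} W_h \frac{1}{n_h M_{h0}} \sum_{i=1}^{n_h} \frac{M_{hi}\,\theta_{hia}}{p_{hi}} \right) \\
&= \sum_{h=1}^{L} W_h^2 \frac{1}{n_h M_{h0}^2} \sum_{i=1}^{N_h} p_{hi} \left( \frac{M_{hi}\,\theta_{hia}}{p_{hi}} - M_{h0}\,\theta_{ha} \right)^2.
\end{aligned} \tag{102}
\]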

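The second term of Eq. (101) is simplified to
\[
\begin{aligned}
E_1 V_2\big(\hat{\theta}_{a\,spptu}\big)
&= E_1 V_2\left( \sum_{h=1}^{L} W_h \frac{1}{n_h M_{h0}} \sum_{i=1}^{n_h} \frac{M_{hi}\,\hat{\theta}_{hiau}}{p_{hi}} \right) \\
&= E_1\left( \sum_{h=1}^{L} W_h^2 \frac{1}{(n_h M_{h0})^2} \sum_{i=1}^{n_h} \frac{M_{hi}^2\, V_2\big(\hat{\theta}_{hiau}\big)}{p_{hi}^2} \right),
\end{aligned} \tag{103}
\]
with
\[
V_2\big(\hat{\theta}_{hiau}\big) = V_2\left( \frac{1}{m_{hi}\,(T_{1hi}P_{2hi} - P_{1hi}T_{2hi})}
\left( \frac{P_{2hi}\sum_{j=1}^{m_{hi}} y_{hi1j}}{1 + T_{3hi}\frac{k_{hi}}{k_{hi}-1}}
     - \frac{T_{2hi}\sum_{j=1}^{m_{hi}} y_{hi2j}}{1 + P_{3hi}\frac{k_{hi}}{k_{hi}-1}} \right) \right). \tag{104}
\]

Writing $m_{hi}\,(T_{1hi}P_{2hi} - P_{1hi}T_{2hi}) = \delta_{hi}$, we obtain
\[
\begin{aligned}
E_1 V_2\big(\hat{\theta}_{a\,spptu}\big)
&= E_1\left( \sum_{h=1}^{L} W_h^2 \frac{1}{n_h^2 M_{h0}^2} \sum_{i=1}^{n_h} \frac{M_{hi}^2}{p_{hi}^2} \frac{1}{\delta_{hi}^2}
\left( C_{hi}^2 \sum_{j=1}^{m_{hi}} V_2(y_{hi1j}) + D_{hi}^2 \sum_{j=1}^{m_{hi}} V_2(y_{hi2j})
     - 2 C_{hi} D_{hi} \sum_{j=1}^{m_{hi}} \mathrm{Cov}(y_{hi1j}, y_{hi2j}) \right) \right) \\
&= E_1\left( \sum_{h=1}^{L} W_h^2 \frac{1}{n_h^2 M_{h0}^2} \sum_{i=1}^{n_h} \frac{M_{hi}^2}{p_{hi}^2} \frac{1}{\delta_{hi}^2}
\left( C_{hi}^2 \sum_{j=1}^{m_{hi}} \theta_{hi1} + D_{hi}^2 \sum_{j=1}^{m_{hi}} \theta_{hi2}
     - 2 C_{hi} D_{hi} \sum_{j=1}^{m_{hi}} \theta_{hi12} \right) \right),
\end{aligned} \tag{105}
\]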
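where
\[
\theta_{hi1} = V(y_{hi1j}) = (T_{1hi}\,\theta_{hia} + T_{2hi}\,\theta_{hib}) \left( 1 + T_{3hi}\frac{k_{hi}}{k_{hi}-1} \right), \tag{106}
\]
\[
\theta_{hi2} = V(y_{hi2j}) = (P_{1hi}\,\theta_{hia} + P_{2hi}\,\theta_{hib}) \left( 1 + P_{3hi}\frac{k_{hi}}{k_{hi}-1} \right), \tag{107}
\]
and
\[
\begin{aligned}
\theta_{hi12} &= \mathrm{Cov}(y_{hi1j}, y_{hi2j}) = E(y_{hi1j}\, y_{hi2j}) - E(y_{hi1j})\,E(y_{hi2j}) \\
&= (P_{1hi}T_{1hi}\,\theta_{hia} + P_{2hi}T_{2hi}\,\theta_{hib})
   \left( 1 + P_{3hi}\frac{k_{hi}}{k_{hi}-1} \right) \left( 1 + T_{3hi}\frac{k_{hi}}{k_{hi}-1} \right).
\end{aligned} \tag{108}
\]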
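For numerical work, the moments in Eqs. (106) to (108) can be evaluated directly. The following Python sketch does so; the function name `response_moments` and its return layout are assumptions introduced for this illustration.

```python
def response_moments(theta_a, theta_b, T, P, k):
    """Evaluate theta_hi1, theta_hi2 and theta_hi12 of Eqs. (106)-(108).

    theta_a, theta_b : cluster-level parameters of the sensitive and unrelated attributes
    T, P             : tuples (T1, T2, T3) and (P1, P2, P3)
    k                : number of cards in the deck
    (Function name and argument order are hypothetical.)
    """
    T1, T2, T3 = T
    P1, P2, P3 = P
    ratio = k / (k - 1)
    scale_T = 1 + T3 * ratio
    scale_P = 1 + P3 * ratio
    theta_1 = (T1 * theta_a + T2 * theta_b) * scale_T                        # Eq. (106)
    theta_2 = (P1 * theta_a + P2 * theta_b) * scale_P                        # Eq. (107)
    theta_12 = (P1 * T1 * theta_a + P2 * T2 * theta_b) * scale_P * scale_T   # Eq. (108)
    return theta_1, theta_2, theta_12
```

Because the responses are modelled as Poisson variables, `theta_1` and `theta_2` serve both as the means used in the unbiasedness proof and as the variances used in Eq. (105).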
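Substituting the values from Eqs. (102) and (105) in Eq. (101), we get the expression of the variance of the estimator $\hat{\theta}_{a\,spptu}$ as given in Eq. (98). □
The estimate of the variance of the estimator $\hat{\theta}_{a\,spptu}$ is
\[
\hat{V}\big(\hat{\theta}_{a\,spptu}\big) = \sum_{h=1}^{L} W_h^2 \frac{1}{n_h (n_h - 1) M_{h0}^2}
\sum_{i=1}^{n_h} \left( \frac{\hat{\theta}_{hiau}\, M_{hi}}{p_{hi}} - \hat{\theta}_{hiau} \right)^2. \tag{109}
\]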

4.2.2. Estimation when the first-stage sample is selected using simple random sampling with replacement
When first-stage sample units are selected with equal probability and with replacement in two-stage stratified sampling, the probability of selecting the ith cluster from the hth stratum is $p_{hi} = \frac{1}{N_h}$. The corresponding unbiased estimator of $\theta_a$ is
\[
\hat{\theta}_{a\,swru} = \sum_{h=1}^{L} W_h \frac{N_h}{n_h M_{h0}} \sum_{i=1}^{n_h} M_{hi}\,\hat{\theta}_{hiau}. \tag{110}
\]

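The variance of the estimator $\hat{\theta}_{a\,swru}$ is
\[
V\big(\hat{\theta}_{a\,swru}\big) = \sum_{h=1}^{L} W_h^2 \frac{N_h}{n_h M_{h0}^2}
\left( \frac{N_h}{N_h - 1} \sum_{i=1}^{N_h} \big( M_{hi}\,\theta_{hia} - \bar{M}_h\,\theta_{ha} \big)^2
     + \sum_{i=1}^{N_h} M_{hi}^2 \frac{\Delta_{hi}}{m_{hi}\,(T_{1hi}P_{2hi} - P_{1hi}T_{2hi})^2} \right), \tag{111}
\]
where $\bar{M}_h = \frac{M_{h0}}{N_h}$, and its estimate is
\[
\hat{V}\big(\hat{\theta}_{a\,swru}\big) = \sum_{h=1}^{L} W_h^2 \frac{1}{n_h (n_h - 1) M_{h0}^2}
\sum_{i=1}^{n_h} \big( N_h M_{hi}\,\hat{\theta}_{hiau} - \hat{\theta}_{hau} \big)^2. \tag{112}
\]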

4.2.3. Estimation when the first-stage sample is selected using probability proportional to size sampling
4.2.3.1. Probability proportional to size sampling with replacement. The probability of selecting the ith cluster from the hth stratum in the sample is $p_{hi} = \frac{M_{hi}}{M_{h0}}$. The corresponding unbiased estimator of $\theta_a$ is
\[
\hat{\theta}_{a\,sppswru} = \sum_{h=1}^{L} W_h \frac{1}{n_h} \sum_{i=1}^{n_h} \hat{\theta}_{hiau}. \tag{113}
\]

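The variance of $\hat{\theta}_{a\,sppswru}$ is
\[
V\big(\hat{\theta}_{a\,sppswru}\big) = \sum_{h=1}^{L} W_h^2 \frac{N_h}{n_h M_{h0}^2}
\left( \sum_{i=1}^{N_h} \big( M_{hi} (\theta_{hia} - \theta_{ha}) \big)^2
     + \sum_{i=1}^{N_h} M_{hi} \frac{\Delta_{hi}}{m_{hi}\,(T_{1hi}P_{2hi} - P_{1hi}T_{2hi})^2} \right), \tag{114}
\]
and its estimate is
\[
\hat{V}\big(\hat{\theta}_{a\,sppswru}\big) = \sum_{h=1}^{L} W_h^2 \frac{1}{n_h (n_h - 1)}
\sum_{i=1}^{n_h} \left( \hat{\theta}_{hiau} - \frac{\hat{\theta}_{a\,hppswru}}{M_{h0}} \right)^2. \tag{115}
\]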

4.2.3.2. Probability proportional to size sampling without replacement. $\Phi_{hi}$ is the probability that the ith unit belongs to the first-stage sample and $\Phi_{hij}$ is the probability that both the ith and the jth units belong to the first-stage sample, using probability proportional to size sampling without replacement from the hth stratum. The unbiased estimator of $\theta_a$ is
\[
\hat{\theta}_{a\,sppsworu} = \sum_{h=1}^{L} W_h \frac{1}{M_{h0}} \sum_{i=1}^{n_h} \frac{M_{hi}\,\hat{\theta}_{hiau}}{\Phi_{hi}}. \tag{116}
\]

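The variance of $\hat{\theta}_{a\,sppsworu}$ is
\[
V\big(\hat{\theta}_{a\,sppsworu}\big) = \sum_{h=1}^{L} W_h^2
\left( \frac{1}{M_{h0}^2} \sum_{i=1}^{N_h} \sum_{j>i}^{N_h} \big( \Phi_{hi}\Phi_{hj} - \Phi_{hij} \big)
       \left( \frac{M_{hi}\,\theta_{hia}}{\Phi_{hi}} - \frac{M_{hj}\,\theta_{hja}}{\Phi_{hj}} \right)^2
     + \frac{1}{M_{h0}^2} \sum_{i=1}^{N_h} \frac{M_{hi}^2}{m_{hi}\,(T_{1hi}P_{2hi} - P_{1hi}T_{2hi})^2}\,\Delta_{hi} \right), \tag{117}
\]
and its estimate is
\[
\hat{V}\big(\hat{\theta}_{a\,sppsworu}\big) = \sum_{h=1}^{L} W_h^2
\left( \frac{1}{M_{h0}^2} \sum_{i=1}^{n_h} \sum_{j>i}^{n_h} \frac{\Phi_{hi}\Phi_{hj} - \Phi_{hij}}{\Phi_{hij}}
       \left( \frac{M_{hi}\,\hat{\theta}_{hiau}}{\Phi_{hi}} - \frac{M_{hj}\,\hat{\theta}_{hjau}}{\Phi_{hj}} \right)^2
     + \frac{1}{M_{h0}^2} \sum_{i=1}^{N_h} \frac{M_{hi}^2}{(m_{hi} - 1)\,\delta_{hi}^2}\,\hat{\Delta}_{hi} \right), \tag{118}
\]
where
\[
\hat{\Delta}_{hi} = P_{2hi}^2\, \frac{T_{1hi}\,\hat{\theta}_{hia} + T_{2hi}\,\hat{\theta}_{hib}}{1 + T_{3hi}\frac{k_{hi}}{k_{hi}-1}}
 + T_{2hi}^2\, \frac{P_{1hi}\,\hat{\theta}_{hia} + P_{2hi}\,\hat{\theta}_{hib}}{P_{1hi}^{2}\left(1 + P_{3hi}\frac{k_{hi}}{k_{hi}-1}\right)}
 - 2 P_{2hi} T_{2hi} \big( T_{1hi} P_{1hi}\,\hat{\theta}_{hia} + T_{2hi} P_{2hi}\,\hat{\theta}_{hib} \big), \tag{119}
\]
and
\[
\delta_{hi} = T_{1hi}P_{2hi} - P_{1hi}T_{2hi}. \tag{120}
\]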

5. Numerical comparison
We compare the performance of our estimation procedures to those of Lee
et al. (2014).

5.1. Simulated quantities


We examine the estimators of Lee et al. (2014) under two-stage sampling and stratified two-stage sampling schemes, where first-stage samples are drawn from the clustered population using probability proportional to size sampling with replacement.
(a) When the proportion of persons having the unrelated rare attribute is known under two-stage sampling design:
\[
\hat{\lambda}_{1\,ppswr} = \frac{1}{n M_0} \sum_{i=1}^{n} \frac{M_i}{p_i} \frac{1}{T_{1i}}
\left( \frac{1}{m_i} \sum_{j=1}^{m_i} y_{ij} - (1 - T_{1i})\,\lambda_{i2} \right), \tag{121}
\]

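and its variance is
\[
V\big(\hat{\lambda}_{1\,ppswr}\big) = \frac{1}{n M_0}
\left( \sum_{i=1}^{N} M_i (\lambda_{i1} - \lambda_1)^2 + \sum_{i=1}^{N} \frac{M_i}{m_i}\,\Upsilon_i \right), \tag{122}
\]
where
\[
\Upsilon_i = \frac{\lambda_{i1}}{T_{1i}} + \frac{(1 - T_{1i}\lambda_{i1})}{T_{1i}^2}. \tag{123}
\]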

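(b) When the proportion of persons having the unrelated rare attribute is unknown under two-stage sampling design:
\[
\hat{\lambda}_{1\,ppswru} = \frac{1}{n M_0} \sum_{i=1}^{n} \frac{M_i}{p_i} \frac{1}{m_i (T_{1i} - P_{1i})}
\left( (1 - T_{1i}) \sum_{j=1}^{m_i} y_{i1j} - (1 - P_{1i}) \sum_{j=1}^{m_i} y_{i2j} \right), \tag{124}
\]
and its variance is
\[
V\big(\hat{\lambda}_{1\,ppswru}\big) = \frac{N}{n M_0^2}
\left( \sum_{i=1}^{N} \big( M_i (\lambda_{i1} - \lambda_1) \big)^2 + \sum_{i=1}^{N} M_i \frac{\Lambda_i}{m_i (T_{1i} - P_{1i})^2} \right), \tag{125}
\]
where
\[
\begin{aligned}
\Lambda_i ={}& \big[ T_{1i}(1 - P_{1i})^2 + P_{1i}(1 - T_{1i})^2 - 2 T_{1i} P_{1i} (1 - P_{1i})(1 - T_{1i}) \big] \\
            & + \big[ (1 - P_{1i})(1 - T_{1i})(2 - T_{1i} - P_{1i}) - 2 (1 - P_{1i})^2 (1 - T_{1i})^2 \big].
\end{aligned} \tag{126}
\]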

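(c) When the proportion of persons having the unrelated rare attribute is known under stratified two-stage sampling design:
\[
\hat{\lambda}_{1\,sppswr} = \sum_{h=1}^{L} W_h \frac{1}{n_h M_{h0}} \sum_{i=1}^{n_h} \frac{M_{hi}}{p_{hi}} \frac{1}{T_{1hi}}
\left( \frac{1}{m_{hi}} \sum_{j=1}^{m_{hi}} y_{hij} - (1 - T_{1hi})\,\lambda_{hi2} \right), \tag{127}
\]
and its variance is
\[
V\big(\hat{\lambda}_{1\,sppswr}\big) = \sum_{h=1}^{L} W_h^2 \frac{1}{n_h M_{h0}}
\left( \sum_{i=1}^{N_h} M_{hi} (\lambda_{hi1} - \lambda_{h1})^2 + \sum_{i=1}^{N_h} \frac{M_{hi}}{m_{hi}}\,\Upsilon_{hi} \right), \tag{128}
\]
where
\[
\Upsilon_{hi} = \frac{\lambda_{hi1}}{T_{1hi}} + \frac{(1 - T_{1hi}\lambda_{hi1})}{T_{1hi}^2}. \tag{129}
\]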

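(d) When the proportion of persons having the unrelated rare attribute is unknown under stratified two-stage sampling design:
\[
\hat{\lambda}_{1\,sppswru} = \sum_{h=1}^{L} W_h \frac{1}{n_h M_{h0}} \sum_{i=1}^{n_h} \frac{M_{hi}}{p_{hi}} \frac{1}{m_{hi} (T_{1hi} - P_{1hi})}
\left( (1 - T_{1hi}) \sum_{j=1}^{m_{hi}} y_{hi1j} - (1 - P_{1hi}) \sum_{j=1}^{m_{hi}} y_{hi2j} \right), \tag{130}
\]
and its variance is
\[
V\big(\hat{\lambda}_{1\,sppswru}\big) = \sum_{h=1}^{L} W_h^2 \frac{1}{n_h M_{h0}^2}
\left( \sum_{i=1}^{N_h} \big( M_{hi} (\lambda_{hi1} - \lambda_{h1}) \big)^2 + \sum_{i=1}^{N_h} M_{hi} \frac{\Lambda_{hi}}{m_{hi} (T_{1hi} - P_{1hi})^2} \right), \tag{131}
\]
where
\[
\begin{aligned}
\Lambda_{hi} ={}& \big[ T_{1hi}(1 - P_{1hi})^2 + P_{1hi}(1 - T_{1hi})^2 - 2 T_{1hi} P_{1hi} (1 - P_{1hi})(1 - T_{1hi}) \big] \\
               & + \big[ (1 - P_{1hi})(1 - T_{1hi})(2 - T_{1hi} - P_{1hi}) - 2 (1 - P_{1hi})^2 (1 - T_{1hi})^2 \big].
\end{aligned} \tag{132}
\]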
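The relative efficiencies in percent of the estimators $\hat{\theta}_{a\,ppswr}$, $\hat{\theta}_{a\,ppswru}$, $\hat{\theta}_{a\,sppswr}$, and $\hat{\theta}_{a\,sppswru}$ with respect to the estimators $\hat{\lambda}_{1\,ppswr}$, $\hat{\lambda}_{1\,ppswru}$, $\hat{\lambda}_{1\,sppswr}$, and $\hat{\lambda}_{1\,sppswru}$ are
\[
E_1 = \frac{V(\hat{\lambda}_{1\,ppswr})}{V(\hat{\theta}_{a\,ppswr})} \times 100, \quad
E_2 = \frac{V(\hat{\lambda}_{1\,ppswru})}{V(\hat{\theta}_{a\,ppswru})} \times 100, \quad
E_3 = \frac{V(\hat{\lambda}_{1\,sppswr})}{V(\hat{\theta}_{a\,sppswr})} \times 100, \tag{133}
\]
and
\[
E_4 = \frac{V(\hat{\lambda}_{1\,sppswru})}{V(\hat{\theta}_{a\,sppswru})} \times 100. \tag{134}
\]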
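The relative efficiencies of Eqs. (133) and (134) all share one form: the variance of the competing estimator divided by the variance of the proposed estimator, times 100. A minimal Python sketch (the function name is an assumption for this illustration):

```python
def relative_efficiency(var_competitor, var_proposed):
    """Relative efficiency in percent, as in Eqs. (133)-(134).

    var_competitor : variance of the estimator used as the benchmark
    var_proposed   : variance of the proposed estimator
    A value above 100 means the proposed estimator has the smaller variance.
    """
    if var_proposed <= 0:
        raise ValueError("the variance of the proposed estimator must be positive")
    return 100.0 * var_competitor / var_proposed
```

A value of exactly 100 corresponds to equal variances, which is the limiting case reported in the last column of Table 1.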
To perform the numerical comparison under two-stage sampling design, we consider a population of five clusters ($N = 5$) with sizes $M_1 = 1000$, $M_2 = 2000$, $M_3 = 1000$, $M_4 = 3000$, and $M_5 = 5000$. Two clusters ($n = 2$) are drawn using the probability proportional to size with replacement sampling scheme. We assume that

(a) $p_i = \frac{M_i}{M_0}$, where $M_0 = 12000$,
(b) $\theta_{1b} = \theta_{2b} = \theta_{3b} = \theta_{4b} = \theta_{5b} = 1$ for the rare unrelated attribute,
(c) the values of $P_{1i}$, $P_{2i}$, $T_{1i}$, and $T_{2i}$ are equal in all clusters, $P_{1i} = P_1$, $P_{2i} = P_2$, $T_{1i} = T_1$, $T_{2i} = T_2$, $i = 1, 2, \ldots, 5$, with $P_{3i} = 1 - P_{1i} - P_{2i}$ and $T_{3i} = 1 - T_{1i} - T_{2i}$,
(d) the total number of cards in the proposed deck is $k_i = k = 100$ for each cluster.
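The two-stage setup above can be reproduced with a short Python sketch. The cluster sizes come from the text; the helper `draw_ppswr`, the seed, and the use of the standard library `random.choices` for the with-replacement draw are assumptions made for this illustration.

```python
import random

cluster_sizes = [1000, 2000, 1000, 3000, 5000]   # M_1, ..., M_5 from the text
M0 = sum(cluster_sizes)                          # total population size
selection_probs = [M_i / M0 for M_i in cluster_sizes]   # p_i = M_i / M_0


def draw_ppswr(sizes, n, seed=0):
    """Draw n cluster indices with probability proportional to size, with replacement.

    The seed is fixed only to make the illustration reproducible.
    """
    rng = random.Random(seed)
    return rng.choices(range(len(sizes)), weights=sizes, k=n)


first_stage_sample = draw_ppswr(cluster_sizes, n=2)   # indices of the two sampled clusters
```

Since the draw is with replacement, the same cluster index may appear twice in `first_stage_sample`.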

In the stratified two-stage sampling scheme, the population is divided into two strata ($L = 2$). The first stratum consists of two clusters ($N_1 = 2$) with sizes $M_{1i} = (2000, 3000)$ for $i = 1, 2$, and the second stratum consists of three clusters ($N_2 = 3$) with sizes $M_{2i} = (3000, 4000, 5000)$ for $i = 1, 2, 3$. From each stratum, we draw one cluster ($n_1 = n_2 = 1$). The samples from each cluster represent 10% of the cluster size. The parameters of the unrelated rare attribute are equal to 1 in every cluster of both strata ($\theta_{hib} = 1$). We have taken $M_0 = 17000$, and $P_{1hi}$, $P_{2hi}$, $T_{1hi}$, and $T_{2hi}$ are equal for all clusters and strata ($P_{111} = P_{121} = P_1$, $P_{211} = P_{221} = P_2$, $T_{111} = T_{121} = T_1$, and $T_{211} = T_{221} = T_2$). The relative efficiencies in percent $E_1$, $E_2$, $E_3$, and $E_4$ are shown in Tables 1–6.

5.2. Results
Tables 1 and 2 present the relative efficiencies in percent of the estimator $\hat{\theta}_{a\,ppswr}$ under two-stage sampling design for the known unrelated rare attribute

Table 1. Relative efficiencies in percent of the estimator $\hat{\theta}_{a\,ppswr}$ with respect to the estimator $\hat{\lambda}_{1\,ppswr}$.
Columns 1 to 5: mean total number of persons possessing the rare sensitive attribute A in each cluster. Columns 6 to 10: relative efficiencies in % for the probabilities of statement selection $T_1 = T_2$ given in the header row.
θ1a θ2a θ3a θ4a θ5a 0.1 0.2 0.3 0.4 0.5
1 1 1 1 1 904.0 401.5 234.0 150.3 100.0
1 1 1 1 2 197.5 122.1 107.7 102.6 100.0
1 1 1 2 1 224.5 128.7 110.1 103.4 100.0
1 1 1 2 2 159.2 113.2 104.6 101.6 100.0
1 1 2 1 1 272.2 140.9 114.5 104.9 100.0
1 1 2 1 2 168.1 115.3 105.4 101.8 100.0
1 1 2 2 1 180.1 118.2 106.4 102.2 100.0
1 1 2 2 2 147.1 110.6 103.7 101.3 100.0
1 2 1 1 1 272.2 140.9 114.5 104.9 100.0
1 2 1 1 2 168.1 115.3 105.4 101.8 100.0
1 2 1 2 1 180.1 118.2 106.4 102.2 100.0
1 2 1 2 2 147.1 110.6 103.7 101.3 100.0
1 2 2 1 1 197.3 122.3 107.9 102.7 100.0
1 2 2 1 2 152.5 111.8 104.2 101.4 100.0
1 2 2 2 1 159.4 113.4 104.8 101.6 100.0
1 2 2 2 2 139.3 108.9 103.2 101.1 100.0
2 1 1 1 1 379.3 171.3 126.0 108.9 100.0
2 1 1 1 2 180.1 118.2 106.4 102.2 100.0
2 1 1 2 1 197.3 122.3 107.9 102.7 100.0
2 1 1 2 2 152.5 111.8 104.2 101.4 100.0
2 1 2 1 1 223.8 128.9 110.3 103.5 100.0
2 1 2 1 2 159.4 113.4 104.8 101.6 100.0
2 1 2 2 1 168.2 115.5 105.5 101.9 100.0
2 1 2 2 2 142.9 109.7 103.5 101.2 100.0
2 2 1 1 1 223.8 128.9 110.3 103.5 100.0
2 2 1 1 2 159.4 113.4 104.8 101.6 100.0
2 2 1 2 1 168.2 115.5 105.5 101.9 100.0
2 2 1 2 2 142.9 109.7 103.5 101.2 100.0
2 2 2 1 1 180.1 118.4 106.5 102.2 100.0
2 2 2 1 2 147.3 110.8 103.8 101.3 100.0
2 2 2 2 1 152.7 112.0 104.3 101.5 100.0
2 2 2 2 2 136.4 108.3 103.0 101.0 100.0
*$T_3 = 1 - T_1 - T_2$.

Table 2. Relative efficiencies in percent of the estimator $\hat{\theta}_{a\,ppswr}$ with respect to the estimator $\hat{\lambda}_{1\,ppswr}$.
Columns 1 to 5: mean total number of persons possessing the rare sensitive attribute A in each cluster. Columns 6 to 12: relative efficiencies in % for the probabilities of statement selection ($T_1$, $T_2$) given in the two header rows.
T1: 0.1 0.2 0.3 0.4 0.5 0.1 0.1
T2: 0.2 0.3 0.4 0.5 0.4 0.7 0.8
θ1a θ2a θ3a θ4a θ5a
1 1 1 1 1 569.0 301.0 186.2 122.3 122.3 150.3 122.3
1 1 1 1 2 184.3 119.1 106.1 101.4 100.9 122.8 111.3
1 1 1 2 1 205.7 124.7 108.0 101.8 101.2 126.4 112.9
1 1 1 2 2 152.6 111.6 103.7 100.8 100.6 116.3 108.3
1 1 2 1 1 241.7 134.8 111.4 102.6 101.8 131.2 114.9
1 1 2 1 2 160.1 113.4 104.3 101.0 100.7 118.0 109.1
1 1 2 2 1 170.1 115.8 105.1 101.2 100.8 120.1 110.1
1 1 2 2 2 142.2 109.3 103.0 100.7 100.5 113.7 107.0
1 2 1 1 1 241.7 134.8 111.4 102.6 101.8 131.2 114.9
1 2 1 1 2 160.1 113.4 104.3 101.0 100.7 118.0 109.1
1 2 1 2 1 170.1 115.8 105.1 101.2 100.8 120.1 110.1
1 2 1 2 2 142.2 109.3 103.0 100.7 100.5 113.7 107.0
1 2 2 1 1 184.2 119.4 106.3 101.4 101.0 122.8 111.3
1 2 2 1 2 146.9 110.4 103.4 100.8 100.5 114.9 107.6
1 2 2 2 1 152.7 111.8 103.8 100.9 100.6 116.3 108.3
1 2 2 2 2 135.4 107.8 102.5 100.6 100.4 111.9 106.2
2 1 1 1 1 314.9 158.8 120.0 104.6 103.3 138.3 117.8
2 1 1 1 2 170.1 115.8 105.1 101.2 100.8 120.1 110.1
2 1 1 2 1 184.2 119.4 106.3 101.4 101.0 122.8 111.3
2 1 1 2 2 146.9 110.4 103.4 100.8 100.5 114.9 107.6
2 1 2 1 1 205.2 124.9 108.1 101.9 101.3 126.3 112.8
2 1 2 1 2 152.7 111.8 103.8 100.9 100.6 116.3 108.3
2 1 2 2 1 160.2 113.6 104.4 101.0 100.7 118.0 109.1
2 1 2 2 2 138.6 108.6 102.8 100.6 100.4 112.7 106.6
2 2 1 1 1 205.2 124.9 108.1 101.9 101.3 126.3 112.8
2 2 1 1 2 152.7 111.8 103.8 100.9 100.6 116.3 108.3
2 2 1 2 1 160.2 113.6 104.4 101.0 100.7 118.0 109.1
2 2 1 2 2 138.6 108.6 102.8 100.6 100.4 112.7 106.6
2 2 2 1 1 170.1 116.1 105.2 101.2 100.8 120.1 110.1
2 2 2 1 2 142.4 109.5 103.1 100.7 100.5 113.7 107.1
2 2 2 2 1 147.0 110.6 103.4 100.8 100.5 114.9 107.6
2 2 2 2 2 132.9 107.3 102.4 100.6 100.4 111.2 105.8
*$T_3 = 1 - T_1 - T_2$.

B, when the probabilities of selecting the statements on the rare sensitive attribute A and on the rare unrelated attribute B are either equal ($T_1 = T_2$) or unequal ($T_1 \neq T_2$). The relative efficiencies in percent of the estimator $\hat{\theta}_{a\,ppswr}$ are greater than 100 for all considered parameters, which implies that the estimator $\hat{\theta}_{a\,ppswr}$ performs better than the estimator $\hat{\lambda}_{1\,ppswr}$. The relative efficiencies decrease when $T_1$ and $T_2$ increase simultaneously. $T_1 = T_2$ must not exceed 0.5. The estimator $\hat{\theta}_{a\,ppswr}$ converges to the estimator $\hat{\lambda}_{1\,ppswr}$ when $T_1 = T_2 = 0.5$ (because $T_3 = 0$), and the relative efficiency then equals 100. Therefore the range of $T_1 = T_2$ may be the open interval $(0, 0.5)$. The gain in relative efficiency is largest for the smaller values of both $T_1$ and $T_2$.

Tables 3 and 4 present the relative efficiencies in percent of the estimator $\hat{\theta}_{a\,ppswru}$ for the unknown rare unrelated attribute B, when (i) the probabilities of selecting the statements on the rare sensitive attribute A and on the rare unrelated attribute B are identical ($P_1 = P_2 = P$, $T_1 = T_2 = T$) for both randomized devices and (ii) these probabilities are different ($P_1 \neq P_2$, $T_1 \neq T_2$) for both randomized devices. Tables 3 and 4 also show that the relative efficiencies in percent of the estimator $\hat{\theta}_{a\,ppswru}$ exceed 100 for all chosen parametric values. The proposed estimator $\hat{\theta}_{a\,ppswru}$ performs well whenever $T > P$. The values of P

Table 3. Relative efficiencies in percent of the estimator $\hat{\theta}_{a\,ppswru}$ with respect to the estimator $\hat{\lambda}_{1\,ppswru}$.
Columns 1 to 5: mean total number of persons possessing the rare sensitive attribute A in each cluster. Columns 6 to 12: relative efficiencies in % for the probabilities of statement selection given in the two header rows.
T1 = T2: 0.1 0.1 0.2 0.25 0.1 0.2 0.25
P1 = P2: 0.2 0.3 0.3 0.3 0.4 0.4 0.4
θ1a θ2a θ3a θ4a θ5a
1 1 1 1 1 288.5 395.6 155.7 111.2 483.2 180.6 124.6
1 1 1 1 2 129.3 111.7 115.6 107.6 106.6 106.6 104.4
1 1 1 2 1 137.1 115.4 119.0 108.3 108.8 108.6 105.5
1 1 1 2 2 119.5 107.5 110.9 106.4 104.2 104.3 103.0
1 1 2 1 1 150.7 122.5 124.4 109.1 113.0 112.3 107.5
1 1 2 1 2 122.3 108.7 112.3 106.8 104.9 105.0 103.4
1 1 2 2 1 126.2 110.3 114.2 107.3 105.9 105.9 104.0
1 1 2 2 2 116.7 106.4 109.5 105.9 103.6 103.7 102.6
1 2 1 1 1 150.7 122.5 124.4 109.1 113.0 112.3 107.5
1 2 1 1 2 122.3 108.7 112.3 106.8 104.9 105.0 103.4
1 2 1 2 1 126.2 110.3 114.2 107.3 105.9 105.9 104.0
1 2 1 2 2 116.7 106.4 109.5 105.9 103.6 103.7 102.6
1 2 2 1 1 131.6 112.8 116.7 107.9 107.3 107.3 104.8
1 2 2 1 2 118.6 107.1 110.5 106.3 104.0 104.2 102.9
1 2 2 2 1 120.9 108.1 111.7 106.6 104.6 104.7 103.2
1 2 2 2 2 114.9 105.6 108.6 105.6 103.2 103.3 102.4
2 1 1 1 1 180.0 141.8 134.0 110.1 125.1 121.4 111.5
2 1 1 1 2 126.2 110.3 114.2 107.3 105.9 105.9 104.0
2 1 1 2 1 131.6 112.8 116.7 107.9 107.3 107.3 104.8
2 1 1 2 2 118.6 107.1 110.5 106.3 104.0 104.2 102.9
2 1 2 1 1 139.9 116.8 120.2 108.5 109.7 109.4 106.0
2 1 2 1 2 120.9 108.1 111.7 106.6 104.6 104.7 103.2
2 1 2 2 1 124.0 109.4 113.2 107.1 105.4 105.5 103.7
2 1 2 2 2 116.3 106.2 109.3 105.9 103.5 103.7 102.6
2 2 1 1 1 139.9 116.8 120.2 108.5 109.7 109.4 106.0
2 2 1 1 2 120.9 108.1 111.7 106.6 104.6 104.7 103.2
2 2 1 2 1 124.0 109.4 113.2 107.1 105.4 105.5 103.7
2 2 1 2 2 116.3 106.2 109.3 105.9 103.5 103.7 102.6
2 2 2 1 1 128.1 111.2 115.1 107.5 106.4 106.5 104.3
2 2 2 1 2 117.9 106.9 110.1 106.2 103.9 104.0 102.8
2 2 2 2 1 119.9 107.7 111.2 106.5 104.4 104.5 103.1
2 2 2 2 2 114.7 105.6 108.5 105.6 103.2 103.3 102.3
*$P_3 = 1 - P_1 - P_2$ and $T_3 = 1 - T_1 - T_2$.

Table 4. Relative efficiencies in percent of the estimator $\hat{\theta}_{a\,ppswru}$ with respect to the estimator $\hat{\lambda}_{1\,ppswru}$.
Columns 1 to 5: mean total number of persons possessing the rare sensitive attribute A in each cluster. Columns 6 to 12: relative efficiencies in % for the probabilities of statement selection given in the four header rows.
T1: 0.1 0.2 0.3 0.1 0.2 0.3 0.5
T2: 0.2 0.3 0.2 0.2 0.3 0.2 0.1
P1: 0.3 0.3 0.5 0.4 0.4 0.4 0.3
P2: 0.4 0.4 0.4 0.5 0.5 0.5 0.2
θ1a θ2a θ3a θ4a θ5a
1 1 1 1 1 233.5 119.2 126.9 278.8 131.7 137.8 114.3
1 1 1 1 2 108.8 106.7 102.6 105.3 103.6 110.5 101.1
1 1 1 2 1 111.5 108.0 103.4 107.0 104.6 112.7 101.5
1 1 1 2 2 105.7 104.9 101.7 103.4 102.4 107.2 100.5
1 1 2 1 1 116.6 110.0 104.8 110.4 106.4 116.3 102.0
1 1 2 1 2 106.7 105.5 102.0 104.0 102.8 108.1 100.6
1 1 2 2 1 107.9 106.3 102.4 104.8 103.3 109.3 100.7
1 1 2 2 2 105.0 104.4 101.5 103.0 102.1 106.1 100.3
1 2 1 1 1 116.6 110.0 104.8 110.4 106.4 116.3 102.0
1 2 1 1 2 106.7 105.5 102.0 104.0 102.8 108.1 100.6
1 2 1 2 1 107.9 106.3 102.4 104.8 103.3 109.3 100.7
1 2 1 2 2 105.0 104.4 101.5 103.0 102.1 106.1 100.3
1 2 2 1 1 109.8 107.3 102.9 105.9 104.0 110.9 100.9
1 2 2 1 2 105.5 104.9 101.7 103.3 102.4 106.7 100.4
1 2 2 2 1 106.3 105.4 101.9 103.8 102.7 107.5 100.4
1 2 2 2 2 104.4 104.1 101.3 102.7 101.9 105.4 100.2
2 1 1 1 1 129.6 113.4 108.2 119.6 110.8 122.5 103.4
2 1 1 1 2 107.9 106.3 102.4 104.8 103.3 109.3 100.7
2 1 1 2 1 109.8 107.3 102.9 105.9 104.0 110.9 100.9
2 1 1 2 2 105.5 104.9 101.7 103.3 102.4 106.7 100.4
2 1 2 1 1 112.7 108.7 103.7 107.8 105.1 113.2 101.1
2 1 2 1 2 106.3 105.4 101.9 103.8 102.7 107.5 100.4
2 1 2 2 1 107.3 106.0 102.2 104.4 103.1 108.4 100.5
2 1 2 2 2 104.9 104.4 101.5 102.9 102.1 105.9 100.2
2 2 1 1 1 112.7 108.7 103.7 107.8 105.1 113.2 101.1
2 2 1 1 2 106.3 105.4 101.9 103.8 102.7 107.5 100.4
2 2 1 2 1 107.3 106.0 102.2 104.4 103.1 108.4 100.5
2 2 1 2 2 104.9 104.4 101.5 102.9 102.1 105.9 100.2
2 2 2 1 1 108.7 106.8 102.6 105.3 103.6 109.7 100.5
2 2 2 1 2 105.4 104.8 101.6 103.2 102.3 106.4 100.2
2 2 2 2 1 106.0 105.2 101.8 103.6 102.6 107.0 100.2
2 2 2 2 2 104.4 104.1 101.3 102.7 101.9 105.3 100.1
*$P_3 = 1 - P_1 - P_2$ and $T_3 = 1 - T_1 - T_2$.

and T must lie within the open interval $(0, 0.5)$, and the estimator $\hat{\theta}_{a\,ppswru}$ converges to the estimator $\hat{\lambda}_{1\,ppswru}$ when $T = P = 0.5$ (because $T_3 = P_3 = 0$).
Tables 5 and 6 present the relative efficiencies in percent of the estimators $\hat{\theta}_{a\,sppswr}$ and $\hat{\theta}_{a\,sppswru}$ under stratified two-stage sampling design for known and unknown values of the proportion of the unrelated rare attribute B. The relative efficiencies in percent of the estimators $\hat{\theta}_{a\,sppswr}$ and $\hat{\theta}_{a\,sppswru}$ are at least 100 for all choices of the parameters (several entries of Table 6 equal 100.0 to one decimal place). This shows that the estimators $\hat{\theta}_{a\,sppswr}$ and $\hat{\theta}_{a\,sppswru}$ perform at least as well as the estimators of Lee et al. (2014). Table 5 also indicates that

Table 5. Relative efficiencies in percent of the estimator $\hat{\theta}_{a\,sppswr}$ with respect to the estimator $\hat{\lambda}_{1\,sppswr}$.
Columns 1 to 5: mean total number of persons possessing the rare sensitive attribute A in each cluster. Columns 6 to 11: relative efficiencies in % for the probabilities of statement selection given in the two header rows.
T1: 0.1 0.2 0.3 0.4 0.5 0.7
T2: 0.2 0.7 0.4 0.5 0.4 0.1
θ11a θ12a θ21a θ22a θ23a
1 1 1 1 1 569.0 122.3 186.2 122.3 122.3 150.3
1 1 1 1 2 176.6 104.2 105.6 101.3 100.8 101.1
1 1 1 2 1 191.8 105.0 106.8 101.5 101.0 101.3
1 1 1 2 2 146.4 102.6 103.3 100.8 100.5 100.7
1 1 2 1 1 214.8 106.2 108.7 102.0 101.3 101.7
1 1 2 1 2 151.4 102.9 103.6 100.8 100.6 100.7
1 1 2 2 1 157.7 103.2 104.1 100.9 100.6 100.8
1 1 2 2 2 135.9 102.0 102.5 100.6 100.4 100.5
1 2 1 1 1 539.6 121.1 172.1 118.2 116.5 130.8
1 2 1 1 2 175.8 104.2 105.5 101.3 100.8 101.1
1 2 1 2 1 190.6 104.9 106.7 101.5 101.0 101.3
1 2 1 2 2 146.1 102.6 103.3 100.8 100.5 100.7
1 2 2 1 1 213.0 106.1 108.6 101.9 101.3 101.7
1 2 2 1 2 151.0 102.9 103.6 100.8 100.5 100.7
1 2 2 2 1 157.2 103.2 104.1 100.9 100.6 100.8
1 2 2 2 2 135.8 102.0 102.5 100.6 100.4 100.5
2 1 1 1 1 139.2 102.2 102.7 100.6 100.4 100.5
2 1 1 1 2 127.9 101.6 101.9 100.4 100.3 100.4
2 1 1 2 1 129.5 101.7 102.0 100.5 100.3 100.4
2 1 1 2 2 122.7 101.3 101.6 100.4 100.3 100.3
2 1 2 1 1 131.4 101.8 102.2 100.5 100.3 100.4
2 1 2 1 2 123.8 101.4 101.7 100.4 100.3 100.3
2 1 2 2 1 125.0 101.4 101.7 100.4 100.3 100.4
2 1 2 2 2 120.0 101.2 101.4 100.3 100.2 100.3
2 2 1 1 1 139.0 102.2 102.7 100.6 100.4 100.5
2 2 1 1 2 127.7 101.6 101.9 100.4 100.3 100.4
2 2 1 2 1 129.4 101.7 102.0 100.5 100.3 100.4
2 2 1 2 2 122.7 101.3 101.6 100.4 100.3 100.3
2 2 2 1 1 131.3 101.8 102.2 100.5 100.3 100.4
2 2 2 1 2 123.7 101.4 101.7 100.4 100.3 100.3
2 2 2 2 1 124.9 101.4 101.7 100.4 100.3 100.4
2 2 2 2 2 119.9 101.2 101.4 100.3 100.2 100.3
*$T_3 = 1 - T_1 - T_2$ and $\theta_{11b} = \theta_{12b} = \theta_{21b} = \theta_{22b} = \theta_{23b} = 1$.

relative efficiencies are higher for the smaller values of T1 and T2 , which may be
useful in constructing the randomized devices.

6. Conclusion
The estimators based on the randomized response model of Singh et al. (1994) under two-stage and stratified two-stage sampling schemes achieve higher relative efficiencies than the estimators of Lee et al. (2014). The methods of Sections 3 and 4 are therefore more effective in obtaining truthful responses from respondents having a rare sensitive attribute, whether the population is homogeneous or heterogeneous.

Table 6. Relative efficiencies in percent of the estimator $\hat{\theta}_{a\,sppswru}$ with respect to the estimator $\hat{\lambda}_{1\,sppswru}$.
Columns 1 to 5: mean total number of persons possessing the rare sensitive attribute A in each cluster. Columns 6 to 12: relative efficiencies in % for the probabilities of statement selection given in the four header rows.
T1: 0.1 0.3 0.4 0.2 0.3 0.4 0.5
T2: 0.2 0.2 0.3 0.4 0.1 0.1 0.3
P1: 0.3 0.5 0.2 0.3 0.4 0.7 0.1
P2: 0.4 0.4 0.6 0.5 0.5 0.2 0.4
θ1a θ2a θ3a θ4a θ5a
1 1 1 1 1 284.5 137.8 104.1 100.0 185.0 119.9 102.0
1 1 1 1 2 109.1 109.8 100.3 100.2 117.0 100.7 100.1
1 1 1 2 1 111.0 111.4 100.4 100.1 120.0 100.9 100.1
1 1 1 2 2 105.7 106.6 100.1 100.3 110.9 100.5 100.1
1 1 2 1 1 113.7 113.5 100.6 100.0 124.4 101.0 100.1
1 1 2 1 2 106.2 107.1 100.2 100.2 111.9 100.5 100.1
1 1 2 2 1 106.8 107.7 100.2 100.2 113.1 100.5 100.1
1 1 2 2 2 104.6 105.4 100.1 100.3 108.7 100.4 100.1
1 2 1 1 1 243.4 136.3 103.7 100.0 180.6 113.8 101.5
1 2 1 1 2 109.0 109.7 100.3 100.2 116.8 100.7 100.1
1 2 1 2 1 110.8 111.2 100.4 100.1 119.8 100.8 100.1
1 2 1 2 2 105.7 106.5 100.1 100.3 110.8 100.5 100.1
1 2 2 1 1 113.5 113.3 100.6 100.0 124.0 101.0 100.1
1 2 2 1 2 106.1 107.0 100.2 100.2 111.8 100.5 100.1
1 2 2 2 1 106.8 107.7 100.2 100.2 113.0 100.5 100.1
1 2 2 2 2 104.5 105.3 100.1 100.3 108.7 100.4 100.0
2 1 1 1 1 104.4 105.4 100.2 100.1 109.0 100.3 100.0
2 1 1 1 2 103.4 104.2 100.1 100.2 106.8 100.3 100.0
2 1 1 2 1 103.6 104.4 100.1 100.2 107.1 100.3 100.0
2 1 1 2 2 103.0 103.6 100.0 100.3 105.8 100.3 100.0
2 1 2 1 1 103.7 104.5 100.1 100.2 107.4 100.3 100.0
2 1 2 1 2 103.0 103.7 100.0 100.2 105.9 100.3 100.0
2 1 2 2 1 103.1 103.8 100.0 100.2 106.2 100.3 100.0
2 1 2 2 2 102.7 103.3 100.0 100.3 105.2 100.2 100.0
2 2 1 1 1 104.4 105.3 100.2 100.1 108.9 100.3 100.0
2 2 1 1 2 103.4 104.2 100.1 100.2 106.7 100.3 100.0
2 2 1 2 1 103.5 104.3 100.1 100.2 107.1 100.3 100.0
2 2 1 2 2 103.0 103.6 100.0 100.3 105.7 100.3 100.0
2 2 2 1 1 103.6 104.5 100.1 100.2 107.4 100.3 100.0
2 2 2 1 2 103.0 103.7 100.0 100.2 105.9 100.3 100.0
2 2 2 2 1 103.1 103.8 100.0 100.2 106.1 100.3 100.0
2 2 2 2 2 102.7 103.3 100.0 100.3 105.2 100.2 100.0
*$P_3 = 1 - P_1 - P_2$ and $T_3 = 1 - T_1 - T_2$.
*$\theta_{11b} = \theta_{12b} = \theta_{21b} = \theta_{22b} = \theta_{23b} = 1$.

Acknowledgments
Authors are thankful to the Indian Institute of Technology (Indian School of Mines),
Dhanbad, for supporting the present work. Authors are also thankful to the reviewers.

References
Greenberg, B. G., Abul-Ela, A. A., Simmons, W. R., and Horvitz, D. G. (1969). The unrelated
question randomized response model: theoretical framework. Journal of the American
Statistical Association, 64(326): 520–539. doi:10.1080/01621459.1969.10500991.

Horvitz, D. G., Shah, B. V., and Simmons, W. R. (1967). The unrelated question randomized
response model. In: Proceedings of the American statistical association, social statistics
section. Washington, DC: American Statistical Association, 65–72.
Land, M., Singh, S., and Sedory, S. A. (2012). Estimation of a rare sensitive attribute using
Poisson distribution. Statistics, 46(3): 351–360. doi:10.1080/02331888.2010.524300.
Lee, G. S., Uhm, D., and Kim, J. M. (2014). Estimation of a rare sensitive attribute in
probability proportional to size measures using Poisson distribution. Statistics, 48(3):
685–709. doi:10.1080/02331888.2012.760091.
Mangat, N. S. (1992). Two stage randomized response sampling procedure using unrelated
question. Journal of the Indian Society of Agricultural Statistics, 44(1): 82–87.
Mangat, N. S. and Singh, R. (1990). An alternative randomized response procedure.
Biometrika, 77(2): 439–442. doi:10.1093/biomet/77.2.439.
Moors, J. A. (1971). Optimization of the unrelated question randomized response model.
Journal of the American Statistical Association, 66(335): 627–629. doi:10.1080/01621459.1971.10482320.
Singh, H. P. and Tarray, T. A. (2014). A dexterous randomized response model for estimating
a rare sensitive attribute using Poisson distribution. Statistics and Probability Letters, 90(July): 42–45. doi:10.1016/j.spl.2014.03.019.
Singh, H. P. and Tarray, T. A. (2017). An optional randomized response model for estimating
a rare sensitive attribute using Poisson distribution. Communications in Statistics-Theory
and Methods, 46(6): 2638–2654. doi:10.1080/03610926.2015.1040506.
Singh, S., Singh, R., Mangat, N. S., et al. (1994). An alternative device for randomized
responses. Statistica, 54(2): 233–243.
Tarray, T. (2017). Scrutinize on Stratified Randomized Response Technique (Illustrated
Edition). Munich: Grin Verlag.
Tarray, T. A. and Singh, H. P. (2015). A randomized response model for estimating a rare
sensitive attribute in stratified sampling using Poisson distribution. Model Assisted
Statistics and Applications, 10(4): 345–360. doi:10.3233/MAS-150338.
Warner, S. L. (1965). Randomized response: A survey technique for eliminating evasive
answer bias. Journal of the American Statistical Association, 60(309): 63–69.
