
Journal of Survey Statistics and Methodology (2016) 0, 1–19

THE PSEUDO-EBLUP ESTIMATOR FOR A WEIGHTED
AVERAGE WITH AN APPLICATION TO THE
CANADIAN SURVEY OF EMPLOYMENT, PAYROLLS
AND HOURS

SUSANA RUBIN-BLEUER
LEON JANG

Downloaded from http://jssam.oxfordjournals.org/ at Cornell University Library on September 21, 2016


SERGE GODBOUT*

Submitted 5 March 2015; Revised 24 April 2016; accepted 27 April 2016

The pseudo–Empirical Best Linear Unbiased Predictor (pseudo-EBLUP)
was previously developed for simple means under the basic nested error
regression model with constant error variances. In this paper, we extend
the estimator to the pseudo-EBLUP of weighted means under the one-
fold nested error regression model with heteroscedastic errors. The ex-
tended pseudo-EBLUP estimator takes into account the survey weights
and the economic weights that make up the weighted mean. We obtain a
second-order approximation to the mean squared error (MSE) of the ex-
tended pseudo-EBLUP estimator and an unbiased estimator of the MSE
also up to the second order. We illustrate the methodology using a syn-
thetic population based on a sample from the Canadian Survey of
Employment, Payrolls and Hours (SEPH): We compare the extended
pseudo-EBLUP with other model-based and direct cross-sectional
domain estimators in terms of design-based MSE, and compare the

SUSANA RUBIN-BLEUER is Adjunct Research Professor at Carleton University and Senior Survey
Methodologist at Statistics Canada. SERGE GODBOUT and LEON JANG are Senior Survey
Methodologists at Statistics Canada.
The authors would like to thank J. N. K. Rao for his advice and encouragement, and
Victor Estevao from Statistics Canada for his support in our use of the Statistics Canada Small
Area System.
This work was partially supported by Statistics Canada.
*Address correspondence to Susana Rubin-Bleuer, Statistics Canada, 16 RHC Building, 100
Tunney’s Pasture Driveway, Ottawa, Ontario, Canada K1A 0T6; E-mail:
susana.rubin-bleuer@canada.ca.
doi: 10.1093/jssam/smw013
© The Author 2016. Published by Oxford University Press on behalf of the American Association for Public Opinion Research.
All rights reserved. For Permissions, please email: journals.permissions@oup.com

model-based MSE estimates with the Monte Carlo design-based MSE.
We find that, for SEPH, the extended pseudo-EBLUP is among the best
performers.
KEYWORDS: Extended pseudo-EBLUP; Heteroscedastic nested error
model; Weighted mean.



1. Introduction
There is growing demand for reliable disaggregated estimates for business and
economic data in general. On the other hand, increasing costs of data collection
and the need to reduce response burden imply small sample sizes and direct
survey estimators with large variability. The purpose of this study was to inves-
tigate the feasibility of producing small area estimates for the Canadian Survey
of Employment, Payrolls and Hours (SEPH).
The target parameter was the average weekly earnings (AWE) in a “small”
domain $i$, expressed as the “weighted” average of $y_{ik} = \mathrm{AWE}_{ik}$, which is the
average weekly earnings for establishment $k$ in domain $i$, weighted by the
number of employees $C_{ik}$:

$$\bar{Y}_{iC} = \sum_{k=1}^{N_i} C_{ik}\, y_{ik} \Big/ \sum_{k=1}^{N_i} C_{ik}, \qquad C_{ik} > 0. \tag{1.1}$$
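As a small numerical illustration of (1.1), the sketch below computes the employment-weighted mean for three hypothetical establishments (the function name and all figures are illustrative and are not SEPH data). Because average weekly earnings are summarized weekly earnings divided by the number of employees, the weighted mean equals total summarized weekly earnings divided by total employment, which generally differs from the simple mean of the establishment averages.

```python
def weighted_domain_mean(y, c):
    """Employment-weighted mean of y for one domain, as in (1.1)."""
    if len(y) != len(c) or not y:
        raise ValueError("y and c must be non-empty and of equal length")
    if any(ck <= 0 for ck in c):
        raise ValueError("all employment counts C_ik must be positive")
    return sum(ck * yk for ck, yk in zip(c, y)) / sum(c)


# Hypothetical establishments: employees and summarized weekly earnings.
c = [10, 40, 50]
swe = [5000.0, 30000.0, 45000.0]
awe = [s / ck for s, ck in zip(swe, c)]   # y_ik = AWE_ik = SWE_ik / C_ik

weighted = weighted_domain_mean(awe, c)   # same as sum(swe) / sum(c)
simple = sum(awe) / len(awe)              # unweighted mean, for contrast
```

Large establishments dominate the weighted mean, which is why the estimators discussed later carry the weights $C_{ik}$ throughout.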

We considered the direct Generalized Regression Survey Estimator (GREG)
and several model-based estimators under the basic unit level model, among
them the Empirical Best Linear Unbiased Predictor (EBLUP) (Battese, Harter,
and Fuller 1988; Stukel and Rao 1997) and the pseudo-EBLUP estimators
(Prasad and Rao 1999; You and Rao 2002).
The pseudo-EBLUP estimators had been developed under a model with
constant error variances by Rao and his co-authors (1999 and 2002). In an
earlier proceedings paper, Rubin-Bleuer et al. (2007a, 2007b) worked out an
extension of both pseudo-EBLUP estimators for the estimation of weighted
domain means $\bar{Y}_{iC}$ under a model with heteroscedastic errors. In this article,
we present an expanded version of the original paper that includes EBLUP
estimators of weighted means and a derivation of a model-based MSE estimator
of a pseudo-EBLUP estimator of weighted means.
We used a simulated population based on SEPH administrative and sample
data that had been created previously with redesign objectives. This provided a
rich source for the analysis of properties of small area estimators, not only for
our survey but for other business surveys that experience similar challenges. It
enabled us to calculate design-based quality measures via repeated sampling:
We evaluated the small area estimators in terms of Monte Carlo design abso-
lute relative bias (ARB) and mean squared error (MSE). We observed the ef-
fect of the survey’s highly skewed distributions and outliers on the estimators.

We identified domains and/or units where the model failed, and investigated
whether model-based MSE estimates could be used in place of the design-based
MSE for the SEPH data.
In section 2, we describe the Canadian SEPH data, the regression model
used for direct estimation, and the simulation process to create the synthetic
“SEPH” population. In section 3, we define the EBLUP and pseudo-EBLUP
estimators of a weighted mean under a nested error regression model with het-
eroscedastic errors. We give an approximation of the MSE of the pseudo-



EBLUP up to the second order, which is valid for domains that do not cut
across strata. We also propose an estimator of the MSE that is unbiased up to
the second order. In section 4, we define the design-based performance mea-
sures used in the evaluation. Model fitting and analysis of the results are re-
ported in section 5. In section 6, we summarize the results and draw
conclusions. We use the Statistics Canada Small Area System (Estevao et al.
2015) for most of the calculations.

2. The data and the Generalized Regression Estimator


2.1 The Data
SEPH is a monthly survey designed to produce estimates of levels and month-
to-month trends of payrolls, employment, paid hours, and earnings. The target
population consists of all employees in Canada except for those in a few select
industries. The domains of interest are defined by geography and by levels of
the North American Industry Classification System (NAICS). Construction of
Buildings, for example, is at the NAICS 3 level, whereas Residential Construction and
Commercial Construction are at the NAICS 4 level within Construction of Buildings.
SEPH is a large business survey with a sample size of approximately 15,000
establishments. It takes advantage of the availability of administrative data: the
Payroll Deduction Accounts (PD7) file obtained from the Canada Revenue
Agency contains the number of employees, $C_{ik}$, and gross monthly payroll, $P_{ik}$,
for the approximately 1,000,000 employers in Canada. The PD7 file is com-
bined with data from the monthly survey to produce estimates. In 2005, the
survey was undergoing redesign. A stratified simple random sample without
replacement (SRSWOR) design and the sample generalized regression
(GREG) estimator were proposed: strata were defined by NAICS 3 level by
province by size, where size referred to the number of employees in the estab-
lishment. The coefficient of variation (CV) of the GREG was controlled not at
the strata level but at the higher level of NAICS 3 by province. As in most sur-
veys, there were domains of interest with small sample size for which the
GREG estimator had a large CV. In particular, it was desirable to publish esti-
mates at NAICS 4 level by province.
The survey provided summarized weekly earnings ($\mathrm{SWE}_{ik}$), and also $C_{ik}$ and $P_{ik}$, for
establishment $k$ in domain $i$, $i = 1, \ldots, m$. We assumed $C_{ik}$ and $P_{ik}$ were
equal in both the PD7 and survey files. Two other variables were derived:
average weekly earnings ($\mathrm{AWE}_{ik} = \mathrm{SWE}_{ik}/C_{ik}$) and average monthly earnings
($\mathrm{AME}_{ik} = P_{ik}/C_{ik}$).

2.2 The GREG Model and the Synthetic SEPH Population


Let $x_{ik} = (1, \mathrm{AME}_{ik})$, and let $y_{ik}$ and $C_{ik}$ be as in (1.1). Let $n_i$ be the sample size in
domain $i$. The Generalized Regression (GREG) estimates were based on the
model (Beaucage et al. 2005)

$$y_{ik} = x_{ik}'\beta + \varepsilon_{ik}, \quad \varepsilon_{ik} \overset{\text{ind}}{\sim} (0, \sigma_e^2/C_{ik}), \quad k = 1, \ldots, n_i;\ i = 1, \ldots, m. \tag{2.1}$$

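A heuristic explanation of the model error structure is as follows. Both summarized
weekly earnings $\mathrm{SWE}_{ik}$ and gross monthly payroll $P_{ik}$ increase with the
number of employees $C_{ik}$. A reasonable linear model for SWE was
$\mathrm{SWE}_{ik} = \lambda P_{ik} + \eta_{ik}$, with
$\mathrm{var}(\mathrm{SWE}_{ik}) = \mathrm{var}(\eta_{ik}) \propto P_{ik}$. Dividing by $C_{ik}$, we obtain
$\mathrm{var}(\mathrm{SWE}_{ik}/C_{ik}) \propto P_{ik}/C_{ik}^2 \propto 1/C_{ik}$ because $\mathrm{AME}_{ik} = P_{ik}/C_{ik}$ remains stable as
$C_{ik}$ increases. Under (2.1), with sampling weights $\tilde{w}_{ik}$ and predicted values
$\hat{y}_{ik} = x_{ik}'\hat{\beta}_G$, the GREG survey estimator $\hat{\bar{Y}}^{G}_{iC}$ is given by

$$\hat{\bar{Y}}^{G}_{iC} = \sum_{k=1}^{N_i} C_{ik}\,\hat{y}_{ik} \Big/ \sum_{k=1}^{N_i} C_{ik} \;+\; \sum_{k=1}^{n_i} \tilde{w}_{ik} C_{ik}\,(y_{ik} - \hat{y}_{ik}) \Big/ \sum_{k=1}^{N_i} C_{ik},$$

$$\hat{\beta}_G = \Big(\sum_{i=1}^{m}\sum_{k=1}^{n_i} \tilde{w}_{ik} C_{ik}\, x_{ik} x_{ik}'\Big)^{-1} \sum_{i=1}^{m}\sum_{k=1}^{n_i} \tilde{w}_{ik} C_{ik}\, x_{ik} y_{ik}.$$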
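The GREG estimator above can be sketched in a few lines of NumPy. This is an illustrative sketch under assumptions made here, not the production implementation: the function names `greg_beta` and `greg_weighted_mean` and the array layout are hypothetical, with sampled-unit arrays holding the auxiliary vectors, responses, employee counts, and sampling weights, and `X_pop`, `C_pop` holding the auxiliary vectors and employee counts for all units of one domain. The toy numbers are invented.

```python
import numpy as np


def greg_beta(X, y, C, w):
    """Pooled coefficient (sum w C x x')^{-1} (sum w C x y) over sampled units."""
    wc = w * C
    XtW = (X * wc[:, None]).T          # p x n matrix whose columns are w_k C_k x_k
    return np.linalg.solve(XtW @ X, XtW @ y)


def greg_weighted_mean(X_s, y_s, C_s, w_s, X_pop, C_pop, beta):
    """GREG estimate of the employment-weighted mean for one domain."""
    total_C = C_pop.sum()
    # Model predictions, weighted by employment, over every unit in the domain.
    synthetic = (C_pop * (X_pop @ beta)).sum() / total_C
    # Survey-weighted sample residuals correct the prediction.
    correction = (w_s * C_s * (y_s - X_s @ beta)).sum() / total_C
    return synthetic + correction


# Toy domain (hypothetical numbers): y is exactly linear in x, so the residual
# correction vanishes and the estimate reproduces the true weighted mean.
X_pop = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 5.0], [1.0, 7.0]])
C_pop = np.array([10.0, 20.0, 5.0, 15.0])
y_pop = X_pop @ np.array([100.0, 50.0])
s = np.array([0, 2, 3])                       # indices of sampled units
w_s = np.array([2.0, 1.0, 1.5])               # sampling weights
beta = greg_beta(X_pop[s], y_pop[s], C_pop[s], w_s)
estimate = greg_weighted_mean(X_pop[s], y_pop[s], C_pop[s], w_s,
                              X_pop, C_pop, beta)
```

In practice the coefficient is pooled over all domains, while the residual correction uses only the sample in the domain of interest, which is why the estimator becomes unstable when that sample is small.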

The SEPH population was built as follows. First, the monthly administrative
source was used to list all the establishments in the population with their re-
spective number of employees and average monthly earnings. Table 2.1 reports
population and sample sizes for four key industries or model groups (MGs) in
December.
For respondent sampled units, $y_{ik}$ was available and linked to the administrative
source. For nonrespondents in the sample, $y_{ik}$ was imputed from historical
monthly data because it was assumed stable over short periods of time. For
nonsampled units, $y_{ik}$ was imputed by the nearest-neighbor method using $x_{ik}$
and $C_{ik}$ while preserving the cross-sectional correlations between $y_{ik}$ and $x_{ik}$
within industry groups. This produced a census with $x_{ik}$ and $C_{ik}$ values obtained
from administrative sources and “imputed” $y_{ik}$. Twelve monthly populations
from January to December were created independently of each other.
Note that correlations between $y_{ik}$ and $x_{ik}$ differ across months because of the
inclusion of bonus and other types of payments in some of the weeks.
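The description above does not specify the distance or the donor rules, so the following is only a minimal sketch of nearest-neighbor donor imputation under assumptions introduced here: donors are sampled units in the same industry group, distance is Euclidean on average monthly earnings and number of employees after scaling each by its range among the donors, and the donor's value of the response is copied. The function name, dictionary keys, and numbers are hypothetical.

```python
import math


def nearest_neighbour_impute(donors, recipients):
    """Return one imputed y per recipient, copied from its nearest donor.

    donors:     dicts with keys 'group', 'ame', 'c', 'y' (sampled units)
    recipients: dicts with keys 'group', 'ame', 'c' (nonsampled units)
    """
    imputed = []
    for r in recipients:
        pool = [d for d in donors if d["group"] == r["group"]]
        if not pool:
            raise ValueError("no donor available in group %r" % r["group"])
        # Assumed scaling: divide each matching variable by its range among
        # the donors so that both contribute comparably to the distance.
        ame_scale = (max(d["ame"] for d in pool) - min(d["ame"] for d in pool)) or 1.0
        c_scale = (max(d["c"] for d in pool) - min(d["c"] for d in pool)) or 1.0

        def dist(d):
            return math.hypot((d["ame"] - r["ame"]) / ame_scale,
                              (d["c"] - r["c"]) / c_scale)

        imputed.append(min(pool, key=dist)["y"])
    return imputed


donors = [
    {"group": "A", "ame": 3000.0, "c": 10, "y": 700.0},
    {"group": "A", "ame": 5000.0, "c": 200, "y": 1200.0},
    {"group": "B", "ame": 3100.0, "c": 12, "y": 999.0},
]
recipients = [{"group": "A", "ame": 3100.0, "c": 12}]
imputed_y = nearest_neighbour_impute(donors, recipients)
```

Restricting donors to the recipient's own group is one simple way to keep the relationship between the response and the auxiliary variables specific to each industry group; the group "B" donor is ignored here even though it matches the recipient exactly.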
To check that the cross-sectional relationships were preserved, we calculated
the unweighted correlations in the original sample and the correlations in the
nonsampled simulated population. The unweighted correlations were approximately
unbiased because the preredesign sample was noninformative; that is,
the sampling weights were not correlated with $y_{ik}$ when conditioned on $x_{ik}$.
Correlations for industries 1 and 4 are listed in Table 2.2.

Table 2.1. Population and Sample Sizes by MG

                   Industry 1   Industry 2   Industry 3   Industry 4
Population size        25,233       24,260          996       17,965
Sample size               335          988           70          241

We observed that both sets of correlations follow the same trend over the
twelve months. Correlations were preserved for February and December, but
in October they were higher in the original sample than in the synthetic popula-
tion; nevertheless, the small area model fit the synthetic population in October
well, and the estimators performed alike in every month.
In the following, we assume that the synthetic population represents well
the SEPH population of 2005 for the purpose of determining the best small
area estimator for SEPH and whether its design root relative mean squared
error is of reasonable quality for publication.
Table 2.3 lists monthly correlations between $y_{ik}$ and $x_{ik}$ in the synthetic
“SEPH population” for the four industries studied here and the number of
small domains in each model group. Correlations were lower for industries
2 and 3 and highest for Industry 4.

3. The EBLUP and pseudo-EBLUP estimators


Our aim was to establish a working model adequate for most industries in the
survey. We started from the GREG model in (2.1) used for direct estimation
and added a random area effect. We assumed a finite population with domains
of size $N_i$ following the nested error linear regression model with heteroscedastic
errors:

$$y_{ik} = x_{ik}'\beta + v_i + \varepsilon_{ik}, \quad v_i \overset{\text{iid}}{\sim} (0, \sigma_v^2), \quad \varepsilon_{ik} \overset{\text{ind}}{\sim} (0, \sigma_e^2/C_{ik}), \quad k = 1, \ldots, N_i;\ i = 1, \ldots, m, \tag{3.1}$$

where $\beta$ is a vector of fixed regression coefficients, and $v_i$ and $\varepsilon_{ik}$ are independent
of each other.
Under unit-level model (3.1), we considered four cross-sectional estimators
to compare with GREG. These estimators use the model to “borrow strength”
from other domains, thus increasing the effective sample size and reducing the
variability. They are developed to reduce the design-based mean squared error.
We did not evaluate time series models because longitudinal relationships
were not taken into account in the simulation of the population.

Table 2.2. Correlation $\rho(y_{ik}, x_{ik})$ in the Sample and in the Nonsampled
Population

Industry     Month       $\rho$ in sample   $\rho$ in nonsampled population
Industry 1   February    0.468              0.467
             October     0.805              0.622
             December    0.593              0.594
Industry 4   February    0.875              0.795
             October     0.903              0.708
             December    0.816              0.799

Table 2.3. Correlation $\rho(y_{ik}, x_{ik})$ per Industry

             Correlation by Industry         Number of domains by Industry
Month        1       2       3       4       1    2    3    4
February     0.467   0.148   0.407   0.795   26   38   32   52
October      0.622   0.285   0.378   0.708   26   38   32   52
December     0.593   0.297   0.335   0.799   26   38   32   52
AVG(12m)     0.500   0.252   0.404   0.709   26   38   32   52

3.1 The Empirical Best Linear Unbiased Predictor

The Empirical Best Linear Unbiased Predictor (EBLUP) does not include sampling
weights and therefore is the most efficient estimator under model (3.1). If the
model for the population is not good for the sample (because of selection bias),
the EBLUP estimator is not design-consistent (as $n_i \to \infty$). Under model
(3.1), the EBLUP of the weighted mean $\bar{Y}_{iC}$ is obtained by replacing
$\bar{X}_i$, $\bar{y}_{ia}$, $\bar{x}_{ia}$, $\gamma_i$, and $\hat{\beta}$ with
$\bar{X}_{iC} = \sum_{k=1}^{N_i} C_{ik} x_{ik} / \sum_{k=1}^{N_i} C_{ik}$, $\bar{y}_{iC}$, $\bar{x}_{iC}$, $\hat{\gamma}^{E}_{iC}$, and $\hat{\beta}_E$ in
(7.2.16) of Rao (2003), with variance estimators $\hat{\sigma}_v^2$ and $\hat{\sigma}_e^2$:

$$\hat{\bar{Y}}^{E}_{iC} = \bar{X}_{iC}'\hat{\beta}_E + \hat{\gamma}^{E}_{iC}\,(\bar{y}_{iC} - \bar{x}_{iC}'\hat{\beta}_E), \tag{3.2}$$

with

$$\hat{\gamma}^{E}_{iC} = \frac{\hat{\sigma}_v^2}{\hat{\sigma}_v^2 + \hat{\sigma}_e^2 / \sum_{k=1}^{n_i} C_{ik}}, \quad \bar{y}_{iC} = \frac{\sum_{k=1}^{n_i} C_{ik} y_{ik}}{\sum_{k=1}^{n_i} C_{ik}}, \quad \bar{x}_{iC} = \frac{\sum_{k=1}^{n_i} C_{ik} x_{ik}}{\sum_{k=1}^{n_i} C_{ik}},$$

$$\hat{\beta}_E = \Big(\sum_{i=1}^{m} X_i' \hat{V}_i^{-1} X_i\Big)^{-1} \sum_{i=1}^{m} X_i' \hat{V}_i^{-1} y_i,$$

$$X_i' \hat{V}_i^{-1} X_i = \hat{\sigma}_e^{-2}\Big(\sum_{k=1}^{n_i} C_{ik} x_{ik} x_{ik}' - \hat{\gamma}^{E}_{iC} \Big\{\sum_{k=1}^{n_i} C_{ik}\Big\} \bar{x}_{iC}\bar{x}_{iC}'\Big),$$

$$X_i' \hat{V}_i^{-1} y_i = \hat{\sigma}_e^{-2}\Big(\sum_{k=1}^{n_i} C_{ik} x_{ik} y_{ik} - \hat{\gamma}^{E}_{iC} \Big\{\sum_{k=1}^{n_i} C_{ik}\Big\} \bar{x}_{iC}\bar{y}_{iC}\Big),$$

and $V_i = \sigma_e^2\,\mathrm{diag}(1/C_{i1}, \ldots, 1/C_{in_i}) + \sigma_v^2\, 1_{n_i} 1_{n_i}'$,
if both the sampling rate $n_i/N_i$ and the employment rate $c_i = \sum_{k=1}^{n_i} C_{ik} / \sum_{k=1}^{N_i} C_{ik}$
are “negligible,” and if $n_i/N_i$ or $c_i$ are not negligible,

$$\hat{\bar{Y}}^{E}_{iC} = c_i\,\bar{y}_{iC} + (1 - c_i)\{\bar{x}^{*\prime}_{iC}\hat{\beta}_E + \hat{\gamma}^{E}_{iC}\,(\bar{y}_{iC} - \bar{x}_{iC}'\hat{\beta}_E)\}, \tag{3.3}$$

with $\bar{x}^{*}_{iC} = \sum_{k \notin s_i} C_{ik} x_{ik} / \sum_{k \notin s_i} C_{ik}$; $s_i$ = sample in domain $i$.
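As a rough illustration of (3.2), the sketch below computes the EBLUP regression coefficient and the EBLUP of the weighted mean for negligible sampling rates, treating the variance components as given. The function name, the dictionary layout of `domains`, and the inputs `sigma2_v` and `sigma2_e` are assumptions for illustration; the code uses the closed forms of the per-domain cross-products stated above rather than inverting each covariance matrix.

```python
import numpy as np


def eblup_weighted_mean(domains, sigma2_v, sigma2_e):
    """EBLUP of the weighted mean for each domain, negligible sampling rates.

    Each domain is a dict with sampled arrays X (n_i x p), y, C and the
    known population mean vector Xbar_pop (length p).
    Returns (beta, list of domain estimates).
    """
    p = domains[0]["X"].shape[1]
    A = np.zeros((p, p))
    b = np.zeros(p)
    stats = []
    for d in domains:
        X, y, C = d["X"], d["y"], d["C"]
        C_sum = C.sum()
        gamma = sigma2_v / (sigma2_v + sigma2_e / C_sum)
        xbar = C @ X / C_sum                  # employment-weighted sample mean of x
        ybar = C @ y / C_sum                  # employment-weighted sample mean of y
        XtC = (X * C[:, None]).T
        # Closed forms of X_i' V_i^{-1} X_i and X_i' V_i^{-1} y_i.
        A += (XtC @ X - gamma * C_sum * np.outer(xbar, xbar)) / sigma2_e
        b += (XtC @ y - gamma * C_sum * xbar * ybar) / sigma2_e
        stats.append((gamma, xbar, ybar))
    beta = np.linalg.solve(A, b)
    estimates = [d["Xbar_pop"] @ beta + g * (yb - xb @ beta)
                 for d, (g, xb, yb) in zip(domains, stats)]
    return beta, estimates
```

The shrinkage factor approaches one as total sampled employment in the domain grows, so the estimate then leans on the domain's own weighted sample mean; with little sampled employment it falls back toward the regression-synthetic part.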



3.2 The YR Estimator Under a Model With Heteroscedastic Variances
The two pseudo-EBLUPs (You and Rao 2002; Prasad and Rao 1999) estimate
simple small area means $\bar{Y}_i = \sum_{k=1}^{N_i} y_{ik}/N_i$ under a model with constant error
variances for domains equal to the strata. Rubin-Bleuer et al. (2007a) extended
these to the estimation of weighted means $\bar{Y}_{iC}$ under a model with heteroscedastic
variances.
The YR estimator (extended from the You and Rao (2002) pseudo-EBLUP) includes
the sampling weights to construct a design-consistent estimator, with
the regression coefficient estimated at the unit level. Moreover, the estimators
are automatically benchmarked, in that the small domain means sum to the
higher-level direct survey regression estimator (see the supplementary materials
online or Rubin-Bleuer et al. 2007a). We denote the original sampling
weight for unit $k$ in domain $i$ by $\tilde{w}_{ik}$ and the standardized weight by
$w_{ik} = \tilde{w}_{ik} C_{ik} / \sum_{j=1}^{n_i} \tilde{w}_{ij} C_{ij}$. The YR estimator of $\bar{Y}_{iC}$ under model (3.1) with
known auxiliary population means $\bar{X}_{iC}$ is given by:

$$\hat{\bar{Y}}^{YR}_{iC} = \bar{X}_{iC}'\hat{\beta}_{YR} + \hat{\gamma}^{YR}_{iCw}\,(\bar{y}_{iCw} - \bar{x}_{iCw}'\hat{\beta}_{YR}), \tag{3.4}$$

with

$$\hat{\gamma}^{YR}_{iCw} = \frac{\hat{\sigma}_v^2}{\hat{\sigma}_v^2 + \hat{\sigma}_e^2\,\delta^2_{iCw}}, \qquad \delta^2_{iCw} = \sum_{k=1}^{n_i} \frac{w_{ik}^2}{C_{ik}} = \frac{\sum_{k=1}^{n_i} \tilde{w}_{ik}^2 C_{ik}}{\big(\sum_{k=1}^{n_i} \tilde{w}_{ik} C_{ik}\big)^2},$$

$$\hat{\beta}_{YR} = \Big(\sum_{i=1}^{m}\sum_{k=1}^{n_i} \tilde{w}_{ik} C_{ik}\, x_{ik}\,(x_{ik} - \hat{\gamma}^{YR}_{iCw}\bar{x}_{iCw})'\Big)^{-1} \sum_{i=1}^{m}\sum_{k=1}^{n_i} \tilde{w}_{ik} C_{ik}\,(x_{ik} - \hat{\gamma}^{YR}_{iCw}\bar{x}_{iCw})\, y_{ik},$$

$\bar{x}_{iCw} = \sum_{k=1}^{n_i} w_{ik} x_{ik}$, and $\bar{y}_{iCw} = \sum_{k=1}^{n_i} w_{ik} y_{ik}$, if both $n_i/N_i$
and $c_i$ are “negligible,” and

$$\hat{\bar{Y}}^{YR}_{iC} = c_i\,\bar{y}_{iC} + (1 - c_i)\{\bar{x}^{*\prime}_{iC}\hat{\beta}_{YR} + \hat{\gamma}^{YR}_{iCw}\,(\bar{y}_{iCw} - \bar{x}_{iCw}'\hat{\beta}_{YR})\}, \tag{3.5}$$

with $\bar{x}^{*}_{iC}$ as in (3.3), if $n_i/N_i$ or $c_i$ are not negligible (Rubin-Bleuer et al.
2007a).
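A minimal NumPy sketch of the YR estimator (3.4) for negligible sampling rates follows, with the variance components taken as given. The function name and the dictionary layout of `domains` are assumptions made for illustration; `w` holds the original sampling weights and `Xbar_pop` the known population mean vector of the auxiliary variables.

```python
import numpy as np


def yr_weighted_mean(domains, sigma2_v, sigma2_e):
    """YR pseudo-EBLUP of the weighted mean, negligible sampling rates.

    Each domain is a dict with sampled arrays X (n_i x p), y, C, w and the
    known population mean vector Xbar_pop (length p).
    Returns (beta, list of domain estimates).
    """
    p = domains[0]["X"].shape[1]
    A = np.zeros((p, p))
    b = np.zeros(p)
    stats = []
    for d in domains:
        X, y, C, w = d["X"], d["y"], d["C"], d["w"]
        wc = w * C
        ws = wc / wc.sum()                       # standardized weights w_ik
        delta2 = np.sum(ws ** 2 / C)             # delta^2_iCw
        gamma = sigma2_v / (sigma2_v + sigma2_e * delta2)
        xbar_w, ybar_w = ws @ X, ws @ y          # survey-weighted sample means
        Z = wc[:, None] * (X - gamma * xbar_w)   # rows are z_ik'
        A += X.T @ Z                             # sum_k x_ik z_ik'
        b += Z.T @ y                             # sum_k z_ik y_ik
        stats.append((gamma, xbar_w, ybar_w))
    beta = np.linalg.solve(A, b)
    estimates = [d["Xbar_pop"] @ beta + g * (yb - xb @ beta)
                 for d, (g, xb, yb) in zip(domains, stats)]
    return beta, estimates
```

When all sampling weights are equal, the standardized weights reduce to each unit's share of sampled employment, the squared-weight term becomes the reciprocal of total sampled employment in the domain, and the coefficient and estimates coincide with the EBLUP of section 3.1.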

Remark 3.1.
For this study, we use the well-known method of fitting constants to estimate
the variance components. Under model (3.1), $\hat{\sigma}_e^2$ is model-unbiased and
$\hat{\sigma}_v^2 = \max(\tilde{\sigma}_v^2, 0)$ is model-consistent (see Rao 2003, p. 138, for details). Note
that maximum likelihood estimators of the variance components are also
model-consistent under the heteroscedastic model (Jiang and Nguyen 2012).
Both methods of variance estimation assume that there is no selection bias, and
they do not use sampling weights. We will see later that even if this assumption
is not valid, YR, calculated with unweighted variance estimates, seems to be
robust against stratification effects.
Remark 3.2.
Note that we defined YR as a design-consistent estimator of
$\bar{Y}_{iC}$ as $n_i \to \infty$, and such that $\bar{x}_{iCw}$ and $\bar{y}_{iCw}$ are design-consistent estimators
of $\bar{X}_{iC}$ and $\bar{Y}_{iC}$, respectively. In addition, YR coincides with the EBLUP
estimator $\hat{\bar{Y}}^{E}_{iC}$ under simple random sampling.

3.3 The PR Estimator Under a Model With Heteroscedastic Variances

The PR estimator (extended from the Prasad and Rao (1999) pseudo-EBLUP) uses
the sampling weights to develop a design-consistent estimator as $n_i \to \infty$. The
regression coefficient is estimated at the area level. Thus, it is less efficient
than the EBLUP and the YR, but it is more robust against misspecification of
the linear model. The PR estimator for the weighted domain mean under model
(3.1) is given by:

$$\hat{\bar{Y}}^{PR}_{iC} = \bar{X}_{iC}'\hat{\beta}_{PR} + \hat{\gamma}_{iCw}\,(\bar{y}_{iCw} - \bar{x}_{iCw}'\hat{\beta}_{PR}), \tag{3.6}$$

if sampling rates are negligible, and, if sampling rates are not negligible, by:

$$\hat{\bar{Y}}^{PR}_{iC} = c_i\,\bar{y}_{iC} + (1 - c_i)\{\bar{x}^{*\prime}_{iC}\hat{\beta}_{PR} + \hat{\gamma}_{iCw}\,(\bar{y}_{iCw} - \bar{x}_{iCw}'\hat{\beta}_{PR})\}, \tag{3.7}$$

with $\hat{\gamma}_{iCw}$ as in (3.4) and
$\hat{\beta}_{PR} = \big(\sum_{i=1}^{m} \hat{\gamma}_{iCw}\,\bar{x}_{iCw}\bar{x}_{iCw}'\big)^{-1} \sum_{i=1}^{m} \hat{\gamma}_{iCw}\,\bar{x}_{iCw}\bar{y}_{iCw}$ (see
also Rubin-Bleuer et al. 2007a).
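For comparison with the unit-level fit used by YR, the following sketch implements the PR estimator (3.6) for negligible sampling rates, where the regression coefficient is fitted to the survey-weighted domain means. As before, the function name, the dictionary layout, and the toy numbers are illustrative assumptions, and the variance components are taken as given.

```python
import numpy as np


def pr_weighted_mean(domains, sigma2_v, sigma2_e):
    """PR pseudo-EBLUP of the weighted mean, negligible sampling rates.

    Each domain is a dict with sampled arrays X (n_i x p), y, C, w and the
    known population mean vector Xbar_pop (length p).
    Returns (beta, list of domain estimates).
    """
    p = domains[0]["X"].shape[1]
    A = np.zeros((p, p))
    b = np.zeros(p)
    stats = []
    for d in domains:
        X, y, C, w = d["X"], d["y"], d["C"], d["w"]
        wc = w * C
        ws = wc / wc.sum()                          # standardized weights w_ik
        gamma = sigma2_v / (sigma2_v + sigma2_e * np.sum(ws ** 2 / C))
        xbar_w, ybar_w = ws @ X, ws @ y             # survey-weighted domain means
        A += gamma * np.outer(xbar_w, xbar_w)       # area-level normal equations
        b += gamma * xbar_w * ybar_w
        stats.append((gamma, xbar_w, ybar_w))
    # Requires at least p domains whose weighted means are not collinear.
    beta = np.linalg.solve(A, b)
    estimates = [d["Xbar_pop"] @ beta + g * (yb - xb @ beta)
                 for d, (g, xb, yb) in zip(domains, stats)]
    return beta, estimates


# Toy check (hypothetical numbers): y exactly linear in x across three domains.
beta0 = np.array([100.0, 50.0])
toy = []
for xs, cs, ws_ in [([2.0, 4.0], [10.0, 30.0], [2.0, 1.0]),
                    ([5.0, 9.0, 6.0], [5.0, 5.0, 20.0], [1.0, 1.0, 3.0]),
                    ([1.0, 3.0], [8.0, 2.0], [1.5, 1.5])]:
    X = np.column_stack([np.ones(len(xs)), xs])
    toy.append({"X": X, "y": X @ beta0, "C": np.array(cs),
                "w": np.array(ws_), "Xbar_pop": np.array([1.0, 3.5])})
beta_pr, est_pr = pr_weighted_mean(toy, 4.0, 9.0)
```

Because only the domain-level weighted means enter the fit, unit-level departures from the model within a domain are averaged before they influence the coefficient.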

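3.4 The MSE of the YR Pseudo-EBLUP for SEPH

Torabi and Rao (2010) obtained a second-order approximation (i.e., up to a
term of the order of $o(1/m)$) to the mean squared prediction error (MSE) of the
pseudo-EBLUP estimator of a simple mean under model (3.1) with constant
model error variance (i.e., $C_{ik} \equiv 1$).
Here we provide an approximation to the MSE of the YR estimator of $\bar{Y}_{iC}$
under model (3.1) with heteroscedastic model variances and for the case where
$\tilde{w}_{ik} = \tilde{w}_i$, $k = 1, \ldots, n_i$; $i = 1, \ldots, m$:

$$\mathrm{MSE}(\hat{\bar{Y}}^{YR}_{iC}) = E(\hat{\bar{Y}}^{YR}_{iC} - \bar{Y}_{iC})^2 = g_{1iCw} + g_{2iCw} + g_{3iCw} + o(1/m), \tag{3.8}$$

where $g_{1iCw} = (1 - \gamma_{iCw})\,\sigma_v^2$,

$$g_{3iCw} = \frac{\delta^4_{iCw}}{(\sigma_v^2 + \sigma_e^2\,\delta^2_{iCw})^3}\,\big(\sigma_e^4\,\mathrm{var}(\hat{\sigma}_v^2) - 2\sigma_v^2\sigma_e^2\,\mathrm{cov}(\hat{\sigma}_v^2, \hat{\sigma}_e^2) + \sigma_v^4\,\mathrm{var}(\hat{\sigma}_e^2)\big)$$

(see supplementary materials online), and

$$g_{2iCw} = (\bar{X}_{iC} - \gamma_{iCw}\bar{x}_{iCw})'\,\mathrm{Var}\{\hat{\beta}_{YR}\}\,(\bar{X}_{iC} - \gamma_{iCw}\bar{x}_{iCw}),$$

$$\mathrm{Var}\{\hat{\beta}_{YR}\} = \Big(\sum_{i=1}^{m}\sum_{k=1}^{n_i} x_{ik} z_{ik}'\Big)^{-1} \mathrm{Var}\Big(\sum_{i=1}^{m}\sum_{k=1}^{n_i} z_{ik} y_{ik}\Big) \Big(\sum_{i=1}^{m}\sum_{k=1}^{n_i} z_{ik} x_{ik}'\Big)^{-1},$$

with $z_{ik} = \tilde{w}_{ik} C_{ik}\,(x_{ik} - \gamma_{iCw}\bar{x}_{iCw})$ and

$$\mathrm{Var}\Big(\sum_{i=1}^{m}\sum_{k=1}^{n_i} z_{ik} y_{ik}\Big) = \sigma_v^2 \sum_{i=1}^{m} \Big(\sum_{k=1}^{n_i} z_{ik}\Big)\Big(\sum_{k=1}^{n_i} z_{ik}\Big)' + \sigma_e^2 \sum_{i=1}^{m}\sum_{k=1}^{n_i} z_{ik} z_{ik}' / C_{ik}.$$

A second-order unbiased estimator of $\mathrm{MSE}(\hat{\bar{Y}}^{YR}_{iC})$ is given by:

$$\mathrm{mse}(\hat{\bar{Y}}^{YR}_{iC}) = g_{1iCw}(\hat{\sigma}_v^2, \hat{\sigma}_e^2) + g_{2iCw}(\hat{\sigma}_v^2, \hat{\sigma}_e^2) + 2 g_{3iCw}(\hat{\sigma}_v^2, \hat{\sigma}_e^2) \tag{3.9}$$

in the case where the sampling rates $n_i/N_i$ and $c_i$ are “negligible,” and otherwise by

$$\mathrm{mse}(\hat{\bar{Y}}^{YR}_{iC}) = (1 - c_i)^2 \big\{ g_{1iCw}(\hat{\sigma}_v^2, \hat{\sigma}_e^2) + g^{*}_{2iCw}(\hat{\sigma}_v^2, \hat{\sigma}_e^2) + 2 g_{3iCw}(\hat{\sigma}_v^2, \hat{\sigma}_e^2) \big\} + \hat{\sigma}_e^2 \left( \frac{\sum_{k=n_i+1}^{N_i} C_{ik}}{\big(\sum_{k=1}^{N_i} C_{ik}\big)^2} \right). \tag{3.10}$$

The term $g^{*}_{2iCw}$ is obtained from $g_{2iCw}$ by changing $\bar{X}_{iC}$ to $\bar{x}^{*}_{iC}$. See the
supplementary materials online for the theoretical proof and a simulation that
shows $\mathrm{mse}(\hat{\bar{Y}}^{YR}_{iC})$ is second-order unbiased.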

4. Steps for the evaluation


We included the model-based estimators Synthetic (SYN), EBLUP, YR, and
PR and the direct survey estimators GREG and domain-specific GREG estima-
tor (DSG) calibrated by the estimated number of employees. The definition of
SYN and DSG for the weighted mean is in the supplementary materials online.
DSG and GREG behaved similarly in all industries, so we present results on
SYN, PR, YR, EBLUP, and GREG.
One thousand stratified simple random samples were drawn under the pro-
posed redesign, and for each sample the six estimators were calculated and
evaluated in terms of the distributions over domains of the Monte Carlo
design–based absolute relative bias and design-based root relative MSE
(RRMSE):

$$\mathrm{ARB}(\hat{\bar{Y}}_{iC}) = \frac{\Big|\frac{1}{1000}\sum_{s=1}^{1000} \hat{\bar{Y}}^{(s)}_{iC} - \bar{Y}_{iC}\Big|}{\bar{Y}_{iC}} \quad \text{and} \quad \mathrm{RRMSE}(\hat{\bar{Y}}_{iC}) = \frac{\sqrt{\frac{1}{1000}\sum_{s=1}^{1000} \big(\hat{\bar{Y}}^{(s)}_{iC} - \bar{Y}_{iC}\big)^2}}{\bar{Y}_{iC}}. \tag{4.1}$$
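The two Monte Carlo measures in (4.1) are straightforward to compute for one domain once the replicate estimates are available. The following sketch assumes a plain list of replicate estimates and the known population value; the function name and the numbers are illustrative.

```python
import math


def arb_and_rrmse(estimates, true_value):
    """Monte Carlo absolute relative bias and root relative MSE for one domain."""
    n_rep = len(estimates)
    bias = sum(e - true_value for e in estimates) / n_rep
    mse = sum((e - true_value) ** 2 for e in estimates) / n_rep
    return abs(bias) / true_value, math.sqrt(mse) / true_value


# Four hypothetical replicate estimates of a domain mean whose true value is 100.
arb, rrmse = arb_and_rrmse([90.0, 110.0, 100.0, 120.0], 100.0)
```

An estimator can have a small ARB and still a large RRMSE when its replicate estimates scatter widely around the true value, which is the pattern reported for GREG in section 5.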

[Figure 5.1: for each model group (MG 1 to MG 4), one panel shows the distribution over domains of the absolute relative bias and one panel shows the distribution of the relative root MSE, for the estimators SYN, GREG, PR, YR, and EBLUP.]
Figure 5.1. Absolute Relative Bias and Relative Root MSE, December

These measures cannot be calculated during production. We investigated
the use of model-based MSE estimates as proxies for the design MSE in
SEPH, and compared the RRMSE with the design expectation of the estimated
model-based RRMSE:

$$\text{Model RRMSE} = \frac{1}{1000}\sum_{s=1}^{1000} \frac{\sqrt{\mathrm{mse}\big(\hat{\bar{Y}}^{YR(s)}_{iC}\big)}}{\hat{\bar{Y}}^{YR(s)}_{iC}}, \quad i = 1, \ldots, m. \tag{4.2}$$
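The design expectation in (4.2) is an average, over the replicate samples, of the estimated model-based relative root MSE. A minimal sketch for one domain follows, assuming the per-sample MSE estimates and point estimates are held in two parallel lists; the function name and numbers are illustrative.

```python
import math


def model_rrmse(mse_estimates, point_estimates):
    """Average over samples of sqrt(mse estimate) / point estimate, one domain."""
    if len(mse_estimates) != len(point_estimates) or not mse_estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [math.sqrt(m) / p for m, p in zip(mse_estimates, point_estimates)]
    return sum(ratios) / len(ratios)


# Two hypothetical samples: (mse, estimate) = (4, 100) and (9, 150).
value = model_rrmse([4.0, 9.0], [100.0, 150.0])
```

Unlike the design-based RRMSE in (4.1), every ingredient here is available from a single sample, so this quantity can be computed during production.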

5. Model fitting and analysis


We settled on the heteroscedastic model (3.1) with $\varepsilon_{ik} \overset{\text{ind}}{\sim} (0, \sigma_e^2/C_{ik})$ (see the
appendix for details). Figure 5.1 presents the distributions of design absolute
relative bias and design root relative mean squared error in December, for all
industries and estimators. SYN is the most biased, whereas GREG is the least
biased across industries. The estimated signal-to-noise ratios $\hat{\sigma}_v^2/\hat{\sigma}_e^2$ were 1.4%
(Industry 1), 1% (Industry 2), 0.1% (Industry 3), and 0.4% (Industry 4). There
seemed to be a weak association between the bias of the synthetic estimator
and signal-to-noise ratio.

Figure 5.2. Industry 4, December

The sampling bias of PR and YR and EBLUP estimators was relatively low.
Figure 5.2 highlights their differences. Some domains are entirely contained
within strata and hence self-weighted, but other domains cut across strata and
are affected by selection bias. EBLUP displays less bias in the former domains,
while PR and YR exhibit less bias in the latter domains: they appear to be ro-
bust against the effects of size stratification.
In terms of RRMSE, Figure 5.1 shows that the direct survey estimator
GREG performs the worst. This was expected as the sample sizes were very
small, generating large sampling errors in GREG. The reduction in RRMSE of
the small area estimators is considerable even when the $y$-$x$ correlation is
low and the area effects are weak. Differences were dramatic when we com-
pared the RRMSE of PR versus the CV of GREG in Figure 5.3. The RRMSE
of PR (dotted line) stayed well below 10%, while the CV of GREG (circled
line) varied between 10% and 90%. The CV of GREG was below 10% on do-
mains with large samples.
Figure 5.4 highlights the differences in RRMSE among PR, YR, and
EBLUP. In domains that did not cut across the size strata, EBLUP performed
better than YR, but PR performed better than YR and EBLUP in most
domains. The superiority of PR is due to robustness against various sources of
model misspecification canceling themselves in the aggregation when estimating
the regression coefficients (see, for example, Figure A.5 in the appendix).

Figure 5.3. Industry 4, December

Figure 5.4. Industry 4, December

[Figure 5.5: two panels show the distributions over domains of the absolute relative bias and of the relative root MSE for MG 4, for the estimators SYN, GREG, PR, YR, and EBLUP.]
Figure 5.5. Absolute Relative Bias and Relative Root MSE, Industry 4, October

Figure 5.6. Industry 4, December.

5.1 Model-Based RRMSE Versus Design-Based RRMSE


Data from Industry 4 in October showed a strong $y$-$x$ linear relationship.
Examination of the residuals and the RRMSE distribution indicated one of the
best fits among industries and months. We use these data to assess the
model-based RRMSEs. Figure 5.5 depicts the ARB and RRMSE distributions of the
estimators over the domains. PR, YR, and EBLUP performed very well. The
extreme value corresponds to a domain where the linear model fails: it is a
domain with ratio $\bar{Y}_{iC}/\bar{X}_{iC}$ much lower than in other domains. It is then very
important to try, during production, to identify domains that do not follow the
model and to tag them for further study.
We used (3.9) if sampling rates were negligible and (3.10) otherwise. Both
formulas yield second-order unbiased estimators for domains with equal
weights. Design-based expectations of model-based RRMSEs were computed
under two scenarios: the “neg” scenario, where design- and model-based
MSEs were calculated assuming negligible sampling rates for every domain in
each of the 1000 samples, and the “mix” scenario, where design- and
model-based MSEs were calculated assuming negligible sampling rates for rates
under 10%, and non-negligible sampling rates otherwise, in each of the 1000
samples. We omitted the 12 domains contained in the territories.

Table 5.1. Distribution of % Average Coverage and Respective Interval Length

Max          Q3            Median        Q1            Min
100% (319)   98.1% (344)   97.9% (255)   90.8% (277)   23.9% (240)

We display in Figure 5.6 the model-based RRMSEs under the “neg” (starred
line) and “mix” (dotted line) scenarios. Both model-based RRMSEs are very
similar even though the mix scenario sometimes yields larger RRMSEs be-
cause of the extra variability in the small area estimates. There are a few do-
mains with large population sizes where the two model-based RRMSEs
coincide. The line with squares corresponds to design-based RRMSEs.
Design-based RRMSE in domain 19 is 34%. This extreme value was also
observed in Figure 5.5.
Industry 4 contains four subindustries, and the domains in Figure 5.6 are
ordered by subindustries. Model-based estimators of RRMSE overestimate
design-based RRMSEs in domains 1 to 30 (corresponding to the first three sub-
industries), except in the domain with the extreme value. On the other hand,
they mostly underestimate design-based RRMSEs in the fourth subindustry.
Model-based RRMSE estimates were sensitive to the number of domains in
the model ($m = 52$), to the criteria for negligible sampling rates, to domains
with stratification effects, to the variability of $n_i$ not taken into account, and to
the dependence of the model errors $\varepsilon_{ik}$ within strata. However, we suspect that
the main impact on the model-based RRMSEs was generated by the
non-normality of the model errors.
There is no theory available for design coverage in small area estimation,
but we calculated average coverage probabilities over repeated sampling of the
traditional prediction intervals $\hat{\bar{Y}}^{YR}_{iC} \pm 1.96\sqrt{\mathrm{mse}(\hat{\bar{Y}}^{YR}_{iC})}$ (nominal coverage of
95%). Table 5.1 lists the distribution of average coverage and respective average
length. Coverage was chiefly related to the extent of under- or overestimation
of the model MSE with respect to the design MSE. Prediction intervals in
25% of the domains had coverage under 90%, while domain 19 had an average
coverage of 23.9%.
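The coverage calculation described above can be sketched for a single domain as follows. The function name, the list-based inputs, and the numbers are illustrative assumptions; each replicate contributes one interval, namely the point estimate plus or minus 1.96 times the square root of its MSE estimate.

```python
import math


def coverage_and_length(point_estimates, mse_estimates, true_value, z=1.96):
    """Share of replicate intervals covering the true value, and mean length."""
    hits = 0
    total_length = 0.0
    for est, mse in zip(point_estimates, mse_estimates):
        half_width = z * math.sqrt(mse)
        if est - half_width <= true_value <= est + half_width:
            hits += 1
        total_length += 2.0 * half_width
    n_rep = len(point_estimates)
    return hits / n_rep, total_length / n_rep


# Three hypothetical replicates for a domain whose true mean is 105.
coverage, avg_length = coverage_and_length([100.0, 110.0, 130.0],
                                           [25.0, 25.0, 25.0], 105.0)
```

When the MSE estimate understates the design MSE, the intervals are too short and coverage falls below the nominal level, which is the mechanism noted above for the domains with poor coverage.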

6. Summary
We studied the performance of several small area estimators in terms of design
bias and MSE. The standard pseudo-EBLUP estimators were adapted for this

study to the estimation of a weighted mean under a unit level linear mixed
model with heteroscedastic error variances. Rubin-Bleuer et al. (2007a)
developed these extensions. In this paper, we developed the approximation to their
model-based MSE and MSE estimators for negligible and non-negligible sam-
pling rates.
Our recommendation for the Survey of Employment, Payrolls and Hours is
based on the premise that the synthetic population represents the SEPH
population reasonably well with respect to the models used.



We found that, for most domains at the subindustry-by-province level, GREG
is nearly unbiased but not acceptable for publication given its large variability.
The pseudo-EBLUP estimators (PR and YR) display the lowest bias among all
the model-based estimators. The differences between YR and PR are small,
but, for these data, PR consistently yields the lowest root relative MSE across
industries and months (Figures A.5 and A.6 in the Appendix suggest a better
fit at the area level). YR might be more efficient by construction than PR, but
for SEPH data PR yields estimators with smaller overall error. This is
tent with the empirical knowledge that EBLUP-type estimators are sensitive to
outliers in the model relationship: PR appears to be more robust against these
“model” outliers than YR, whereas YR is more robust than EBLUP against se-
lection bias and “model” outliers. If benchmarking is required, YR might be a
better choice than PR because it is automatically benchmarked, and bench-
marking the PR estimators might increase their variability.
Another estimator that is comparable to PR and YR in the sense that it is
design-consistent and outlier-robust is the model-assisted small area estimator
recently developed by Fabrizi et al. (2014). It might be a good competitor, and
it would be interesting to see how this estimator addresses estimation in do-
mains that yield extreme bias.
With respect to measures of error, survey statisticians would like to see area-
specific, design-based quality measures. These can be obtained in feasibility
studies through simulation but not during production. From feasibility studies
and other means of validation, we are confident that we can produce point
estimators with acceptable design-based mean squared errors. On the other hand,
estimates and approximations to the model-based MSE do not track well the
corresponding design-based MSEs for SEPH. Moreover, good-quality point
estimators may be associated with prediction intervals subject to
severe undercoverage. However, computer power and coding nowadays enable
us to repeat this type of study as frequently as we need to make sure that any
change in the dynamics of the response variable is taken into account.
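
As an illustration of how design-based quality measures can be obtained by simulation in a feasibility study, the sketch below draws repeated samples from a fixed synthetic population and summarizes the relative bias and relative root MSE of an estimator. It assumes simple random sampling without replacement and uses the sample mean as the estimator, which is a simplification and not the SEPH design or the pseudo-EBLUP; the function name `design_based_summary` and the lognormal population are hypothetical.

```python
import numpy as np

def design_based_summary(y_pop, n, n_rep=2000, seed=0):
    """Monte Carlo design-based relative bias and relative root MSE of the
    sample mean under simple random sampling without replacement (assumed)."""
    rng = np.random.default_rng(seed)
    true_mean = y_pop.mean()
    estimates = np.empty(n_rep)
    for r in range(n_rep):
        # The population is held fixed; only the sample changes between replicates.
        sample = rng.choice(y_pop, size=n, replace=False)
        estimates[r] = sample.mean()
    bias = estimates.mean() - true_mean
    mse = np.mean((estimates - true_mean) ** 2)
    return {"rel_bias": bias / true_mean, "rrmse": np.sqrt(mse) / true_mean}

# Hypothetical skewed population for one small domain.
pop_rng = np.random.default_rng(42)
y_pop = pop_rng.lognormal(mean=6.5, sigma=0.6, size=500)
summary = design_based_summary(y_pop, n=20)
print(summary)
```

In a study of the kind described here, the sample mean would be replaced by each competing estimator and the summary computed per domain, so that the estimators are compared on the same replicated samples.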

Appendix
We examined visually diagnostic plots of the standardized transformed resid-
uals (STDR) (Estevao et al. 2015) in order to back up the heteroscedastic
model variance structure. We fit the homoscedastic model with $\mathrm{Var}(y_{ik}) \propto \sigma^2$,
and the heteroscedastic model with $\mathrm{Var}(y_{ik}) \propto \sigma^2 / C_{ik}$, to each of the finite
populations. Under constant error variance, the variability of the STDRs decreases as $C_{ik}$
increases; see, for example, Figure A.1 (homoscedastic model) and Figure A.2 (heteroscedastic model).
Under the heteroscedastic model, the respective plots of the PR, YR, and
EBLUP residuals show that the linear model is a reasonable fit: Figure A.3

Figure A.1. STDR vs Number of Employees, homoscedastic model, Industry 4,
December, whole population

Figure A.2. STDR vs Number of Employees, heteroscedastic model, Industry 4,
December, whole population

Figure A.3. STDR vs unit number, heteroscedastic model, Industry 4, December,
whole population

Figure A.4. QQ-plot of the STDR, heteroscedastic model, Industry 4, December,
whole population

Figure A.5. Establishment level $y_{ik}$ vs $x_{ik}$, bubble size is the number of employees
in the establishment, Industry 4, December

Figure A.6. Area level $\bar{Y}_{iC}$ vs $\bar{X}_{iC}$, bubble size is the number of employees in the
establishment, Industry 4, December

shows the standardized transformed residuals from fitting the December
population of Industry 4 under the heteroscedastic model. The corresponding
QQ plot of the STDRs (Figure A.4) shows that the model errors deviate from
normality. Similar results were obtained when we fit the model to some of the
individual samples from each of the four populations.
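
The residual check described in this appendix can be imitated on simulated data. The sketch below is a simplified illustration only: it uses a fixed-effects regression without area random effects, and the scaled residuals are not the STDR of Estevao et al. (2015). Data are generated with error variance $\sigma^2 / C$, the model is fitted once assuming constant variance and once with weights $C$, and the residual spread is compared between small and large values of $C$. The names `fit_residuals`, `C`, and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000
C = rng.integers(1, 200, size=n).astype(float)   # hypothetical size measure
x = rng.uniform(500.0, 1500.0, size=n)           # hypothetical auxiliary variable
sigma = 300.0
# Errors are generated with variance sigma^2 / C (heteroscedastic).
y = 50.0 + 0.9 * x + rng.normal(0.0, sigma / np.sqrt(C))

def fit_residuals(y, x, w):
    """Weighted least squares of y on (1, x); returns residuals scaled by sqrt(w)."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return sw * (y - X @ beta)

res_homo = fit_residuals(y, x, np.ones(n))   # fit assuming constant variance
res_het = fit_residuals(y, x, C)             # fit assuming variance sigma^2 / C

small, large = C < np.median(C), C >= np.median(C)
spread_ratio_homo = res_homo[small].std() / res_homo[large].std()
spread_ratio_het = res_het[small].std() / res_het[large].std()
print(spread_ratio_homo, spread_ratio_het)
```

A spread ratio well above 1 under the constant-variance fit, together with a ratio near 1 under the weighted fit, corresponds to the contrast described between the homoscedastic and heteroscedastic residual plots.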

References
Battese, G. E., R. M. Harter, and W. A. Fuller (1988), “An error-components model for prediction
of county crop areas using survey and satellite data,” Journal of the American Statistical
Association, 83, 28–36.
Beaucage, Y., S. Godbout, and Y. Morin (2005), “Survey of Employment, Payrolls and Hours:
New Modelling Perspectives,” Internal document, Statistics Canada.
Estevao, V., M. A. Hidiroglou, Y. You, and S. Rubin-Bleuer (2015), “Methodology Software
Library Small–Area Estimation Unit Level Model with EBLUP and Pseudo EBLUP Estimation
Methodology Specifications,” International Cooperation and Corporate Statistical Methods
Division, Internal document, Statistics Canada.
Fabrizi, E., N. Salvati, M. Pratesi, and N. Tzavidis (2014), “Outlier robust model-assisted small
area estimation,” Biometrical Journal, 56, 157–175.
Jiang, J., and T. Nguyen (2012), “Small area estimation via heteroscedastic nested-error regression,”
The Canadian Journal of Statistics, 40, 588–603.
Prasad, N. G. N., and J. N. K. Rao (1999), “On robust small area estimation using a simple random
effects model,” Survey Methodology, 25, 67–72.
Rao, J. N. K. (2003), Small Area Estimation, New Jersey: John Wiley and Sons, Inc.
Rubin-Bleuer, S., S. Godbout, and Y. Morin (2007a), “Evaluation of small domain estimators for
the Survey of Employment, Payroll and Hours,” Proceedings of the Survey Methods Section of
the Statistical Society of Canada, June 10 - 14, 2007. Available at http://www.ssc.ca/sites/ssc/
files/survey/documents/SSC2007_S_RubinBleuer.pdf.
———— (2007b), “Evaluation of small domain estimators for the Survey of Employment, Payroll
and Hours,” Long abstract and presentation, Small Area Estimation, (SAE 2007) in Pisa, Italy,
International Statistical Institute Satellite Conference. Available at http://citeseerx.ist.psu.edu/
viewdoc/download?doi=10.1.1.503.2985&rep=rep1&type=pdf.
Stukel, D., and J. N. K. Rao (1997), “Estimation of Regression Models with Nested Error Structure
and Unequal Error Variances under Two and Three Stage Cluster Sampling,” Statistics and
Probability Letters, 35, 401–407.
Torabi, M., and J. N. K. Rao (2010), “Mean squared error estimators of small area means using sur-
vey weights,” The Canadian Journal of Statistics, 38, 598–608.
You, Y., and J. N. K. Rao (2002), “A pseudo-empirical best linear unbiased prediction approach to
small area estimation using survey weights,” The Canadian Journal of Statistics, 30, 431–439.
