
European Journal of Operational Research 249 (2016) 1113–1123


Stochastics and Statistics

Zero-inefficiency stochastic frontier models with varying mixing proportion: A semiparametric approach
Kien C. Tran a,1,∗, Mike G. Tsionas b
a Department of Economics, University of Lethbridge, 4401 University Drive W, Lethbridge, Alberta T1K 3M4, Canada
b Department of Economics, Lancaster University Management School, LA1 4YX, UK

Article history: Received 18 April 2015; Accepted 9 October 2015; Available online 17 October 2015

Keywords: Zero-inefficiency; Varying proportion; Semiparametric approach; Backfitting local maximum likelihood; Sieve likelihood ratio statistics

Abstract

In this paper, we propose a semiparametric version of the zero-inefficiency stochastic frontier model of Kumbhakar, Parmeter, and Tsionas (2013) by allowing the proportion of firms that are fully efficient to depend on a set of covariates via an unknown smooth function. We propose an iterative backfitting local maximum likelihood estimation procedure that achieves the optimal convergence rates of both the frontier parameters and the nonparametric function of the probability of being efficient. We derive the asymptotic bias and variance of the proposed estimator and establish its asymptotic normality. In addition, we discuss how to test for a parametric specification of the proportion of firms that are fully efficient, as well as how to test for the presence of fully inefficient firms, based on sieve likelihood ratio statistics. The finite-sample behavior of the proposed estimation procedure and tests is examined using Monte Carlo simulations. An empirical application is further presented to demonstrate the usefulness of the proposed methodology.

© 2015 Elsevier B.V. and Association of European Operational Research Societies (EURO) within the International Federation of Operational Research Societies (IFORS). All rights reserved.

1. Introduction

One drawback, or restrictive assumption, of estimating productivity and efficiency through stochastic frontier analysis, originally proposed by Aigner, Lovell, and Schmidt (1977) and Meeusen and van den Broeck (1977) (see also Ondrich & Ruggiero, 2001), was recently pointed out by Kumbhakar, Parmeter, and Tsionas (2013, KPT hereafter) and Rho and Schmidt (2015, RS hereafter). The assumption that, a priori, all firms are inefficient and their inefficiency is modeled through a continuous density was shown to have considerable implications. When some firms are, in fact, fully efficient, a fact that we cannot preclude on prior grounds, applying stochastic frontier analysis with the familiar distributions (half-normal, exponential, etc.) results in biased estimates of inefficiency.

To overcome this drawback, KPT (and, independently, RS) proposed a new model, which they call the "zero-inefficiency stochastic frontier" (ZISF) model, that allows the inefficiency term to have mass at zero with a certain probability, π, and a continuous distribution with probability 1 − π. In essence, their model takes a special form of the latent class model considered by, among others, Ivaldi, Monier-Dilhan, and Simioni (1995), Caudill (2003), Orea and Kumbhakar (2004) and Greene (2005). The interesting feature of the proposed model is that only the non-existence or existence of inefficiency differs, but not the frontier itself. KPT and RS also extend the ZISF model to allow π to depend on a set of covariates via a logit or a probit function. Estimation of the model parameters can be carried out using either standard maximum likelihood or the E-M algorithm (see RS).

In this paper we use a nonparametric formulation for the probability as a function of covariates, π(·), which does not impose restrictive assumptions on what determines full efficiency. The issue is important, as misspecification of the parametric form of the probability has implications for estimating technical efficiency and, more specifically, for which firms are fully efficient. Although functional forms for production or cost functions are more or less established in applied studies, this is not so for the functional form of the probability of firms being fully efficient, π(·). This is quite important since the functional form of E(y|X) depends on the functional form of π(·) and the covariates.

To accommodate the unknown probability-of-being-efficient function in the estimation, we develop an iterative backfitting local maximum likelihood procedure which is fairly simple to compute in practice. We also derive the necessary asymptotic theory of the proposed estimator. Specifically, we derive the asymptotic bias and variance of the proposed estimator and establish its asymptotic normality. In addition, we discuss how to test for parametric

∗ Corresponding author. Tel: +1 403 329 2511.
E-mail addresses: kien.tran@uleth.ca, tsionas@aueb.gr (K.C. Tran).
1 An earlier version of this paper was presented at the North American Productivity Workshop (NAPW) in Ottawa, June 4–7, 2014. The authors would like to thank the Editor, I. Bomze, and two anonymous referees for constructive comments and suggestions that substantially improved an earlier version of this paper.

http://dx.doi.org/10.1016/j.ejor.2015.10.019

specification of the probability function of firms that are fully efficient, as well as how to test for the presence of fully efficient firms, based on bootstrap sieve likelihood ratio statistics (Fan, Zhang, & Zhang, 2001).

We use both Monte Carlo experiments and real-world data from U.S. banks to illustrate the applicability of the new model, and compare the results with standard stochastic frontier models as well as the model proposed by KPT, where the probability is a parametric function of covariates. Our Monte Carlo results indicate that the proposed estimation methods, as well as the bootstrap sieve likelihood ratio statistics, perform well in samples of the size typically used in applied econometric studies.

The rest of the article is organized as follows. Section 2 introduces the semiparametric zero-inefficiency stochastic frontier model. Section 3 derives the backfitting local maximum likelihood estimator and discusses the construction of inefficiency scores. Section 4 establishes the asymptotic properties of the proposed estimator. Hypothesis testing for the parametric specification of the probability of firms being fully efficient, as well as testing for the presence of fully inefficient firms, is discussed in Section 5. Monte Carlo simulations are presented in Section 6, while Section 7 provides an empirical application to the U.S. banking industry. Section 8 provides concluding remarks. Proofs of the theorems are gathered in Appendix A.

2. The model

We consider the following semiparametric version of the zero-inefficiency stochastic frontier (SP-ZISF) model of KPT:

yi = xi′β + vi          with probability π(zi),
yi = xi′β + vi + s·ui   with probability 1 − π(zi),   (1)

where yi is a scalar representing the output of firm i, xi is a d × 1 vector of inputs, vi is random noise, ui is a one-sided random variable representing technical inefficiency, s = +1 for a cost frontier and s = −1 for a production frontier, π(·) is an unknown smooth function representing the proportion of firms that are fully efficient, and zi is a q × 1 vector of covariates which influence whether a firm is inefficient or not; the zi may or may not be a subset of xi. Note that in (1) the technology is the same for both regimes, and the composed error is vi − ui(1 − 1{ui = 0}), where 1{·} is an indicator function and P(1{ui = 0}) = π(zi). For illustration purposes, we focus mainly on the production frontier. A cost frontier can be handled in the same way by replacing the negative sign on ui with a positive sign. In addition, to simplify our discussion, we consider univariate z. Extension to multivariate z is straightforward, but at the expense of increased notational complexity and the "curse of dimensionality" problem.

2.1. Identification issues

Under the standard stochastic frontier framework, no identification issue arises, since the parameter σu², the variance of ui, is identified through the moment restrictions on the composed errors εi = vi − ui. However, in the context of model (1), we have an additional parameter π(·) which can be identified only if there are non-zero observations in each class. In addition, as KPT and RS point out, when σu² → 0, π(·) is not identified, since the two classes become indistinguishable. Conversely, when π(·) → 1 for a given z, σu² is not identified. In fact, when a data set contains little inefficiency, one might expect σu² and π(·) to be imprecisely estimated, since it is difficult to identify whether the little inefficiency is due to π(·) being close to 1 or σu² being close to zero. However, this identification issue is more relevant to the problem of testing whether all firms are efficient (or inefficient). We will return to this hypothesis testing problem, as well as other hypothesis testing problems, in a later section. For the present discussion, we will assume that σu² > 0 and 0 < π(·) < 1, so that all the parameters in model (1) are identified.

To complete the specification of the model, let f(z) and f(y|x, z) denote the marginal density of z and the conditional density of y given x and z, respectively. In addition, we assume throughout the paper that f(y|x, z) is known and belongs to a class of parametric densities with parameter θ ∈ Θ ⊂ ℝ^k, where k is a positive integer, and that the function π(z): ℝ^q → [0, 1] is a smooth function which is twice continuously differentiable.

3. Estimation

3.1. Backfitting local maximum likelihood procedure

To make a specific assumption regarding the conditional distribution f(y|x, z), we follow standard practice and assume that vi|x, z ~ i.i.d. N(0, σv²) and ui|x, z ~ i.i.d. |N(0, σu²)|, albeit other distributions such as the exponential, truncated normal or gamma can also be considered for ui. The conditional probability density function of εi = vi − ui is given by

f(ε|x, z) = (π(z)/σv) φ(ε/σv) + (1 − π(z)) (2/σ) φ(ε/σ) Φ(−ελ/σ),   (2)

where σ² = σu² + σv², λ = σu/σv, and φ(·) and Φ(·) are the probability density (pdf) and cumulative distribution (CDF) functions of a standard normal variable, respectively. To avoid non-negativity restrictions, we make use of the following transformations: λ = exp(λ̄) = λ̃ and σ² = exp(σ̄²) = σ̃². Let θ = (β′, λ̃, σ̃²)′; then it follows that the conditional pdf of y given x and z is

f(y|x, z) = (π(z)/σ̃v) φ((y − x′β)/σ̃v) + (1 − π(z)) (2/σ̃) φ((y − x′β)/σ̃) Φ(−(y − x′β)λ̃/σ̃),   (3)

and the conditional log-likelihood is then given by

L∗(π(zi), θ) = Σ_{i=1}^{n} log f(yi; π(zi), θ | x, z).   (4)

From (4), it is clear that if π(z) were known and belonged to a class of parametric functions with a finite-dimensional parameter vector, then the standard maximum likelihood (ML) estimator could be obtained by maximizing (4), as discussed in KPT. However, π(z) is generally unknown in practice, rendering the standard MLE infeasible. To make the MLE operational, we approximate the unknown function π(z) locally by a linear function, albeit in practice one might wish to consider higher orders of local polynomials for π(z). For a given value of z0, and z in the neighborhood of z0, a Taylor series expansion of π(zi) at z0 gives

π(zi) ≈ π(z0) + π′(z0)(zi − z0) = a(z0) + b(z0)(zi − z0),

where π′(z0) is the derivative of π(zi) evaluated at z0. Then the conditional local log-likelihood function associated with (4) can be written as

L1(a(z0), b(z0), θ) = Σ_{i=1}^{n} {log f(yi; a(z0), b(z0), θ | x, z)} Kh(zi − z0),   (5)

where Kh(ξ) = h⁻¹K(ξ/h), K(·) is a kernel function, and h is the appropriate bandwidth. Thus the conditional local log-likelihood depends on z. However, since the parameter θ does not depend on z, we suggest the following backfitting procedure, motivated by Huang and Yao (2012) for estimating semiparametric mixture regression models. Specifically, for a given value of z0, we first estimate π(·) locally by maximizing (5) with respect to a, b and θ. Let ã, b̃ and θ̃

be the solution to the maximization problem of (5), that is,²

(ã, b̃, θ̃) = arg max_{(a, b, θ)} L1(a, b, θ).

² In practice, the estimation is performed at a set of given z0 values. A simple approach is to set z0 = z1, ..., z0 = zn, which yields a set of π̃(z0) values.

Then π̃(z0) = ã(z0) and θ̃(z0) = θ̃. Now note that the global parameter vector θ does not depend on z, and since θ is estimated locally, it does not possess the usual parametric √n-consistency. To preserve the √n-consistency and to improve efficiency, given the estimate π̃(z0), the parameter vector θ can be estimated globally by maximizing the following (global) log-likelihood function, where we replace π(z0) with its estimate π̃(z0) in (4):

L2(θ) = Σ_{i=1}^{n} log f(yi; π̃(zi), θ | x, z).   (6)

Let θ̂ be the solution of maximizing (6). In the next section, we will show that, under certain regularity conditions, θ̂ retains its √n-consistency property. Given the estimate θ̂, and to improve efficiency, the function π(z0) can be obtained by maximizing the local likelihood function

L3(a(z0), b(z0), θ̂) = Σ_{i=1}^{n} {log f(yi; a(z0), b(z0), θ̂ | x, z0)} Kh(zi − z0).   (7)

Let π̂(·) = â(·) be the solution of maximizing (7). Finally, θ̂ and π̂(z) can be further improved by iterating until convergence. We will denote the final π̂(z) and θ̂ as the iterative backfitting local MLE.

We summarize the above backfitting local ML estimation procedure with the following computational algorithm:

Step 1: For each zi, i = 1, ..., n, in the sample, maximize the conditional local log-likelihood (5) to obtain the estimate π̃(zi). Note that if the sample size n is large, (5) could be performed on a random subsample Ns, with Ns much smaller than n, to reduce the computational burden. Also, to ensure that the estimates of π(·) fall within the interval [0, 1], we reparameterize the local linear parameters using a logistic function.
Step 2: From Step 1, conditional on π̃(zi), maximize the conditional global log-likelihood function (6) to obtain θ̂.
Step 3: Conditional on θ̂ from Step 2, maximize the conditional local log-likelihood function (7) to obtain π̂(zi).
Step 4: Using π̂(zi), repeat Step 2 and then Step 3 until the estimate of θ̂ converges.

Remark. First, in practice one could stop at Step 3 to reduce the computational burden. However, iteration between Step 2 and Step 3 until convergence is highly recommended. Based on our limited experience, convergence is typically fast, as it requires only a few iterations. Second, Step 1 requires specification of the kernel function Kh(·) as well as the bandwidth h. For the kernel function, an Epanechnikov or Gaussian function is a popular choice. As for bandwidth selection, data-driven methods such as cross-validation (CV) can be used (see, for example, Li & Racine, 2007). In our context, we use a likelihood version of CV, which is given by

CV(h) = (1/n) Σ_{i=1}^{n} log f(yi; π̂⁽ⁱ⁾(zi), θ̂⁽ⁱ⁾ | x, z),   (8)

where π̂⁽ⁱ⁾(zi) and θ̂⁽ⁱ⁾ are the leave-one-out versions of the backfitting local MLE described above. Third, it is important to note that, in semiparametric modeling, undersmoothing conditions (see Theorem 1 below) are typically required in order to obtain √n-consistency for the global parameters. The optimal bandwidth ĥ selected by CV will be of order n^(−1/5), which does not satisfy the required undersmoothing conditions. However, a reasonable adjusted bandwidth suggested by Li and Liang (2008) that satisfies the undersmoothing condition can be used, and it is given by h̃ = ĥ × n^(−2/15) = O(n^(−1/3)). We will apply this adjusted bandwidth in our simulations and empirical application below.

Finally, the iterative backfitting local MLE described in this section uses direct maximization of the log-likelihood functions (5)–(7). An alternative approach is to use an EM algorithm. The main advantage of the EM algorithm is that it is numerically stable and possesses the ascent property, in the sense that when the sample size is large enough, each iteration raises the likelihood value (Greene, 2012; Huang & Yao, 2012). However, the main drawback of the EM algorithm is that it requires extensive computation, especially for the model considered in this paper, and convergence can be very slow. Huang and Yao (2012) give detailed implementations of EM algorithms for a normal mixture model which is very similar to our model. Interested readers are referred to their paper for more details.

3.2. Estimation of firm-specific inefficiency

Following the discussion of KPT, we can similarly consider several approaches to estimating firm-specific inefficiency. The first approach is based on the popular estimator of Jondrow, Lovell, Materov, and Schmidt (1982), where, under our setting, the conditional density of u given ε is

f(u|ε) = 0                 with probability π(z),
f(u|ε) = N₊(μ∗, σ∗²)       with probability 1 − π(z),

where N₊(·) denotes the truncated normal distribution, μ∗ = −εσu²/σ² and σ∗² = σu²σv²/σ². Thus, the conditional mean of u given ε = y − x′β is

E(u|ε, z) = (1 − π(z)) [σλ/(1 + λ²)] {φ(−λε/σ)/Φ(−λε/σ) − λε/σ}.   (9)

A point estimate of an individual inefficiency score can be obtained by replacing the unknown parameters in (9) by their estimates and ε by ε̂ = y − x′β̂.

The second approach is to use the modal estimator, M(u|ε, z), defined by

df(u|ε, z)/du = 0,   (10)

and (10) is known to have a zero at the value u = μ∗ whenever ε < 0, and at zero otherwise. Hence, multiplying μ∗ by (1 − π(z)) yields the modal estimator.

The final approach is to construct the posterior estimate of inefficiency ũi. To do this, let p∗i(·) denote the posterior estimate of the probability of being fully efficient, where

p∗i(z) = (π̂(z)/σ̂v) φ(ε̂i/σ̂v) / [(π̂(z)/σ̂v) φ(ε̂i/σ̂v) + (1 − π̂(z))(2/σ̂) φ(ε̂i/σ̂) Φ(−ε̂i λ̂/σ̂)].   (11)

Then the posterior estimate of inefficiency can be defined as ũi = (1 − p∗i(z)) ûi, where ûi is the estimate of inefficiency based on (9) or (10). KPT provide an intuitive explanation of why the estimator given in (11) would be particularly helpful for researchers and regulators in merger cases, to determine the probability that a specific firm or group of firms in the industry is fully efficient.

4. Asymptotic properties

In this section, we derive the sampling properties of the proposed backfitting local MLE π̂(z) and θ̂ = (β̂′, σ̂², λ̂)′. In particular, we will


show that the backfitting estimator θ̂ is √n-consistent and follows an asymptotic normal distribution. In addition, we also provide the asymptotic bias and variance of the estimator π̂(z), and show that, asymptotically, it has a smaller variance compared to π̃(z). To this end, let us define the following additional notation.

Let γ(z) = (π(z), θ′)′ and ℓ(γ(z), x, y) = log f(y|x, γ(z)). Define q_γ(γ(z), x, y) = ∂ℓ(γ(z), x, y)/∂γ and q_γγ(γ(z), x, y) = ∂²ℓ(γ(z), x, y)/∂γ∂γ′; the terms q_θ, q_π, q_θθ, q_θπ and q_ππ are defined similarly. In addition, let ψ(w|z) = E[q_π(γ(z), x, y)|z = w], and

I_γγ(z) = −E[q_γγ(γ(z), x, y)|z],
I_θθ(z) = −E[q_θθ(γ(z), x, y)|z],
I_ππ(z) = −E[q_ππ(γ(z), x, y)|z],
I_πθ(z) = −E[q_πθ(γ(z), x, y)|z].

Finally, let μ_j = ∫ u^j K(u) du and κ_j = ∫ u^j K²(u) du. We make the following assumptions:

Assumption 1. The sample {(xi, yi, zi), i = 1, ..., n} is independently and identically distributed from the joint density f(x, y, z), which has a continuous first derivative and is positive on its support. The support of z, denoted by Z, is a compact subset of ℝ, and f(z) > 0 for all z ∈ Z.

Assumption 2. The unknown function π(z) is twice continuously differentiable in its argument. Furthermore, π(z) > 0 holds for all z ∈ Z.

Assumption 3. The matrices I_γγ(z) and I_θθ are positive definite.

Assumption 4. The kernel density function K(·) is symmetric, continuous and has bounded support.

Assumption 5. For some ζ < 1 − r⁻¹, n^(2ζ−1)h → ∞ and E(z^(2r)) < ∞.

All the above assumptions are relatively mild and have been used in the mixture models and local likelihood estimation literature. Given the above assumptions, we are now ready to state our main results in the following theorems.

Theorem 1. Under Assumptions 1–5 and, in addition, nh⁴ → 0 and nh² log(1/h) → ∞, we have

√n(θ̂ − θ) →_D N(0, A⁻¹ Σ A⁻¹),

where A = E{I_θθ(z)} and Σ = Var{∂ℓ(π(z), θ, x, y)/∂θ − I_θπ(z) d(x, y, z)}, with d(x, y, z) the first element of I_γγ⁻¹(z) q_γ(γ(z), x, y).

Theorem 2. Under Assumptions 1–5 and, in addition, as n → ∞, h → 0 and nh → ∞, we have

√(nh) {π̂(z) − π(z) − B(z) + o_p(h²)} →_D N{0, κ₀ f⁻¹(z) I_ππ⁻¹},

where B(z) = (1/2) μ₂ h² I_ππ⁻¹(z) ψ″(z|z).

The proofs of Theorems 1 and 2 are given in Appendix A. Note that the result of Theorem 2 shows that, as for common semiparametric models, the estimation of θ has no effect on the first-order asymptotics, since the rate of convergence of π̂(z) is slower than √n. Consequently, it is fairly straightforward to see that π̂(z) is more efficient than the initial estimate π̃(z).

5. Hypothesis testing

Given the structure of model (1), it is of great interest to ask whether the probability of a firm being efficient takes a specific parametric form, such as those suggested in KPT or RS. This question leads to the following hypothesis testing problem:

H0: π(zi) = h(zi, δ),   (12)

where h(zi, δ) is a specific parametric function and δ is a vector of unknown parameters. For example, as in KPT and RS, one can assume h(zi, δ) = exp(zi′δ)/[1 + exp(zi′δ)] or h(zi, δ) = Φ(zi′δ), where Φ(·) is the cumulative distribution function of a standard normal random variate. Under the null hypothesis, model (1) reduces to the parametric zero-inefficiency stochastic frontier model considered by KPT and RS. However, under the alternative hypothesis, model (1) is a semiparametric model, and hence the number of parameters under the alternative is undefined. One useful approach to testing the above null hypothesis is to use the sieve likelihood ratio (hereafter SLR) statistic suggested by Fan et al. (2001), which is given by

T = 2{L∗(H1) − L∗(H0)},   (13)

where L∗(H0) and L∗(H1) denote the log-likelihood function computed under the null and the alternative hypothesis, respectively. Fan et al. (2001) show that SLR statistics are asymptotically distribution-free of nuisance parameters and follow χ²_{b_n} distributions (for a sequence b_n → ∞) under the null hypothesis (i.e., the Wilks phenomenon) for a number of useful hypotheses in a variety of models, such as nonparametric regression, varying coefficient and generalized varying coefficient models. However, since model (1) belongs to the class of semiparametric mixture models, the asymptotic null distribution of the SLR statistic may or may not follow a χ²_{b_n} distribution. Thus, one approach is to derive the asymptotic distribution of T. Alternatively, we can use the conditional bootstrap procedure suggested by Cai, Fan, and Li (2000) to approximate the asymptotic null distribution. The conditional bootstrap can be conducted as follows. Let {β̄, σ̄², λ̄, δ̄} be the MLE under the null hypothesis. Given xi, generate a bootstrap sample y∗i from the distribution of y specified in (1), with {π(·), β, σ², λ} replaced by their MLE estimates {β̄, σ̄², λ̄, δ̄}. For each bootstrap sample, calculate the test statistic T∗ in (13), and use the distribution of T∗ as an approximation to the distribution of T.

It is important to note that the conditional bootstrap described above is valid only if the asymptotic null distribution is independent of the nuisance parameter π(·) (i.e., the Wilks phenomenon). We investigate the Wilks phenomenon via Monte Carlo simulation below. Our simulation results indicate that, indeed, the Wilks type of phenomenon continues to hold for the model considered in this paper.

Another interesting question that arises is whether all firms are inefficient. This question leads to the following testing hypothesis: H0: π(z) = 0 for all z. Under the null hypothesis H0: π(z) = 0, model (1) reduces to a standard stochastic frontier, and this is simply a special case of the testing problem of constancy of π(·), which takes on the specific value of 0. Thus, in principle, a simple modification of the sieve likelihood ratio statistic in (13) can be used to test the null. However, since the value 0 lies on the boundary of the parameter space of π, the asymptotic null distribution of the test statistic is no longer a χ² distribution. Thus, one approach is to derive the asymptotic distribution of the test statistic under this null hypothesis along the lines of Andrews (2001), which is very complicated given the semiparametric nature of the alternative. In addition, it is beyond the scope of this paper, and we will leave it for future research.

Alternatively, since the null hypothesis H0: π(z) = 0 is a special case of H0: π(z) = π, the conditional bootstrap described earlier can be used to approximate the asymptotic distribution of the test statistic, provided that the Wilks type of phenomenon continues to hold. Our Monte Carlo results below show that the test based on the conditional bootstrap has approximately correct size.

Finally, note that we did not pursue the hypothesis testing problem of H0: π(z) = 1 for all z (i.e., all firms are efficient), simply because under this null hypothesis there is a technical problem related to the fact that σu² is not identified, which invalidates the conditional bootstrap procedure, albeit the SLR test would remain valid. In this case, there is a need to derive the asymptotic distribution of the test statistic, and this is beyond the scope of this paper. Further investigation of this case would be interesting and useful for future research.
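The conditional bootstrap logic above (refit under the null, simulate from the fitted null given the covariates, and compare T to the bootstrap distribution of T∗) can be sketched generically. The snippet below is a minimal illustration of that mechanics on a deliberately simplified stand-in model, not the paper's ZISF likelihood: `fit_ols`, `slr_statistic` and the toy designs `X0`/`X1` are hypothetical placeholders for the parametric null MLE {β̄, σ̄², λ̄, δ̄} and the backfitting local MLE under the alternative.

```python
import numpy as np

rng = np.random.default_rng(0)

def loglik(y, X, beta, sigma2):
    # Gaussian log-likelihood of y given the fitted mean X @ beta.
    resid = y - X @ beta
    n = len(y)
    return -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * resid @ resid / sigma2

def fit_ols(y, X):
    # MLE of a Gaussian linear model: OLS coefficients plus error variance.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = np.mean((y - X @ beta) ** 2)
    return beta, sigma2

def slr_statistic(y, X0, X1):
    # T = 2 {L*(H1) - L*(H0)}, as in (13), for nested fitted models.
    b0, s0 = fit_ols(y, X0)
    b1, s1 = fit_ols(y, X1)
    return 2 * (loglik(y, X1, b1, s1) - loglik(y, X0, b0, s0))

def conditional_bootstrap_pvalue(y, X0, X1, n_boot=200):
    # Conditional bootstrap: refit the null, simulate y* from the fitted
    # null given the covariates, and use the bootstrap distribution of T*
    # to approximate the null distribution of T.
    T = slr_statistic(y, X0, X1)
    b0, s0 = fit_ols(y, X0)
    mu = X0 @ b0
    T_star = np.empty(n_boot)
    for b in range(n_boot):
        y_star = mu + rng.normal(0.0, np.sqrt(s0), size=len(y))
        T_star[b] = slr_statistic(y_star, X0, X1)
    return T, np.mean(T_star >= T)

# Toy illustration: null = linear mean, alternative adds a quadratic term;
# the data are generated under the null, so the test should not reject.
n = 400
z = rng.uniform(-1, 1, n)
y = 1.0 + 2.0 * z + rng.normal(0, 0.5, n)
X0 = np.column_stack([np.ones(n), z])          # H0 design
X1 = np.column_stack([np.ones(n), z, z ** 2])  # H1 design
T, pval = conditional_bootstrap_pvalue(y, X0, X1)
```

In the paper's setting, `fit_ols` would be replaced under H0 by the parametric ZISF MLE and under H1 by the iterative backfitting local MLE of Section 3.1; the resampling and p-value computation are unchanged.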

Table 1
MSE of (β̂, σ̂², λ̂) and MASE of π̂(·).

               n = 2500   n = 5000
λ = 1.0
  MSE   β       0.0028     0.0014
        σ²      0.0057     0.0022
        λ       0.1011     0.0051
  MASE  π(·)    0.1357     0.0072
λ = 2.5
  MSE   β       0.0021     0.0011
        σ²      0.0051     0.0023
        λ       0.0097     0.0044
  MASE  π(·)    0.1019     0.0059
λ = 5.0
  MSE   β       0.0018     0.0010
        σ²      0.0049     0.0025
        λ       0.0085     0.0044
  MASE  π(·)    0.0095     0.0044

Table 2
Empirical sizes of the bootstrap SLR statistic (λ = 2.5).

H0: π(z) = e^z/(1 + e^z)
              n = 2500                          n = 5000
              1 percent  5 percent  10 percent  1 percent  5 percent  10 percent
Emp. sizes    0.0117     0.0493     0.1105      0.0101     0.0502     0.099

H0: π(z) = 0
              1 percent  5 percent  10 percent  1 percent  5 percent  10 percent
Emp. sizes    0.0112     0.0489     0.1112      0.0104     0.0503     0.108

Table 3
Standard deviations, standard errors and coverage probabilities.

Parameter   STD     SE(STD)        95 percent coverage
n = 2500, h = 0.06
  β1        0.027   0.029(0.006)   94.6 percent
  σ²        0.015   0.014(0.004)   95.2 percent
  λ         0.022   0.021(0.002)   93.2 percent
n = 2500, h = 0.12
  β1        0.028   0.027(0.006)   94.7 percent
  σ²        0.017   0.018(0.0004)  95.2 percent
  λ         0.023   0.025(0.002)   93.5 percent
n = 2500, h = 0.24
  β1        0.019   0.020(0.005)   94.8 percent
  σ²        0.009   0.008(0.004)   95.4 percent
  λ         0.013   0.013(0.002)   93.5 percent
n = 5000, h = 0.05
  β1        0.018   0.017(0.003)   95.0 percent
  σ²        0.009   0.010(0.002)   95.3 percent
  λ         0.012   0.011(0.001)   95.0 percent
n = 5000, h = 0.10
  β1        0.019   0.018(0.003)   94.7 percent
  σ²        0.010   0.010(0.002)   95.1 percent
  λ         0.011   0.011(0.001)   94.2 percent
n = 5000, h = 0.20
  β1        0.020   0.020(0.003)   93.1 percent
  σ²        0.011   0.010(0.003)   94.0 percent
  λ         0.012   0.013(0.001)   92.2 percent

Note: STD = standard deviations of the estimated parameters; SE = estimated standard errors using the bootstrap procedure.

6. Monte Carlo simulations

In this section, we conduct some simulations to study the finite-sample performance of the proposed estimator and test statistics. To this end, we consider the following data generating process (DGP):

yi = 1 + xi + vi        with probability π(zi),
yi = 1 + xi + vi − ui   with probability 1 − π(zi),

where π(zi) = 0.05 + 0.6 sin(π zi). We generate zi from a uniform distribution on [0, 1], and xi is generated from N(0, 1). The random error term vi is generated as N(0, 0.5) and the one-sided error ui is generated as |N(0, 0.5λ)|. For all of our simulations, we set λ = {1, 2.5, 5} and let the sample size vary over n = 2500 or n = 5000. For each experimental design, 1000 replications are performed.

We use the Gaussian kernel function, and the bandwidth is chosen according to h̃ = ĥ × n^(−2/15), where ĥ is the optimal bandwidth based on the CV approach previously discussed in Section 3.1.

We measure the performance of the estimate of the probability of firms being fully efficient, π(z), by computing the mean average squared error (MASE):

MASE = (1/1000) Σ_{r=1}^{1000} (1/n) Σ_{j=1}^{n} [π̂_r(z_j) − π(z_j)]².

The performance of the estimates of the production parameters is measured by the mean squared error (MSE):

MSE = (1/1000) Σ_{r=1}^{1000} (θ̂_r − θ)²,

where θ̂ = β̂, σ̂² or λ̂. The simulations were performed on the mainframe in FORTRAN 77 using the G77 compiler of GNU.

The simulation results for the estimated MSE of the production parameter estimates and the estimated MASE of π̂(zi), for various values of λ, are presented in Table 1. From Table 1, we first observe that, as the sample size increases, both the estimated MSE of the production parameter estimates θ̂ and the MASE of π̂(zi) decrease. Second, we also observe that, as the sample size doubles, the estimated MSE of the production parameter estimates falls to about half of its original value; this is consistent with the fact that the backfitting local ML estimator of θ is √n-consistent, as predicted by Theorem 1.

Table 2 reports the empirical sizes of the bootstrap SLR statistic. From Table 2, we see that there is little size distortion, indicating that the conditional bootstrap provides a good approximation of the asymptotic distribution of the SLR statistic.

Table 3 summarizes the performance of the bootstrap approach for standard errors of the parameter estimates for two different



Table 4
The pointwise coverage probabilities for π(z).

z 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

n = 2500, h = 0.06
π(θ̂) 94.6 percent 94.7 percent 94.8 percent 90.9 percent 95.4 percent 95.2 percent 94.9 percent 98.9 percent 97.7 percent
π(θ) 94.1 percent 94.3 percent 94.5 percent 90.7 percent 95.0 percent 95.0 percent 94.8 percent 98.0 percent 97.1 percent
n = 2500, h = 0.12
π(θ̂) 94.6 percent 94.8 percent 94.9 percent 94.9 percent 95.6 percent 95.6 percent 94.9 percent 98.9 percent 97.8 percent
π(θ) 94.1 percent 94.3 percent 94.5 percent 94.7 percent 95.0 percent 95.0 percent 94.8 percent 98.0 percent 97.1 percent
n = 2500, h = 0.24
π(θ̂) 91.6 percent 94.8 percent 94.9 percent 90.1 percent 95.7 percent 95.7 percent 94.8 percent 98.8 percent 90.7 percent
π(θ) 92.1 percent 94.3 percent 94.5 percent 90.5 percent 95.0 percent 95.0 percent 94.8 percent 96.0 percent 92.1 percent
n = 5000, h = 0.05
π(θ̂) 95.5 percent 95.7 percent 95.5 percent 92.4 percent 95.4 percent 95.1 percent 95.1 percent 98.6 percent 96.1 percent
π(θ) 95.2 percent 95.3 percent 95.2 percent 92.1 percent 95.0 percent 95.0 percent 95.0 percent 98.1 percent 96.7 percent
n = 5000, h = 0.10
π(θ̂) 95.5 percent 95.7 percent 95.5 percent 95.4 percent 95.4 percent 95.1 percent 95.1 percent 98.0 percent 96.1 percent
π(θ) 95.7 percent 95.8 percent 95.2 percent 95.1 percent 95.0 percent 95.0 percent 95.0 percent 97.8 percent 96.5 percent
n = 5000, h = 0.20
π(θ̂) 91.5 percent 95.7 percent 95.5 percent 89.9 percent 95.4 percent 95.1 percent 95.1 percent 98.5 percent 90.1 percent
π(θ) 92.7 percent 95.8 percent 95.2 percent 91.1 percent 95.0 percent 95.0 percent 95.0 percent 98.2 percent 92.0 percent

Note: π(θ̂): results when θ̂ is estimated; π(θ): results when θ is assumed known.
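The coverage probabilities reported in Table 4 are obtained by checking, replication by replication, whether a normal-approximation interval covers the true value. A minimal sketch, with hypothetical point estimates and standard errors standing in for the bootstrap output:

```python
import numpy as np

def coverage_prob(estimates, std_errs, truth, z_crit=1.96):
    """Fraction of replications whose interval estimate +/- z_crit*SE covers truth."""
    estimates = np.asarray(estimates, dtype=float)
    std_errs = np.asarray(std_errs, dtype=float)
    lo = estimates - z_crit * std_errs
    hi = estimates + z_crit * std_errs
    return float(np.mean((lo <= truth) & (truth <= hi)))

# hypothetical sampling distribution at one grid point z: if the standard
# errors are well calibrated, coverage should be close to the nominal 95 percent
rng = np.random.default_rng(1)
truth, se, reps = 0.6, 0.04, 5000
est = rng.normal(truth, se, size=reps)
cov = coverage_prob(est, np.full(reps, se), truth)
```

In Table 4 the same calculation is repeated at each grid point z, once with θ̂ estimated and once with θ held at its true value.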

samples, and three different bandwidths, corresponding to under-smoothing (h̃ = ĥ × n^{−2/15}), the appropriate amount of smoothing (ĥ) and over-smoothing (2ĥ). In the table, the standard deviation of the 1000 estimates, denoted STD, can be viewed as the true standard error, while the average bootstrap standard errors are denoted SE, with their standard deviations given in parentheses. The SE are calculated as the average of the 1000 estimated standard errors. The coverage probabilities for all the parameters are given in the last column and are obtained from the estimated standard errors. The results in Table 3 show that the suggested bootstrap procedure approximates the true standard deviations quite well and that the coverage probabilities are close to the nominal levels in almost all cases.

Note that the bootstrap procedure also allows us to compute the pointwise coverage probabilities for the probability function. Table 4 provides the 95 percent coverage probabilities of π(z) for a set of evenly spaced grid points on the support of z. In the table, the row labeled π(θ̂) gives the results using the proposed approach, while π(θ) gives the results assuming θ were known. For most cases, the coverage probabilities are close to the nominal level, especially when under-smoothing or the appropriate amount of smoothing is used. For the case of over-smoothing, the results are somewhat less satisfactory. Moreover, the coverage levels are slightly low at the point 0.4 and slightly high at the points 0.8 and 0.9.

Next, we investigate whether a Wilks-type phenomenon holds for the proposed model. Under the null hypothesis in (12), we assume the probability of an efficient firm takes a specific parametric form. The DGP is the same as above, except that we now generate π(z) as a logistic function or a standard normal CDF with parameter δ. For each function, we fix the value λ = 2.5, set three different values δ = {−1, 0, 1}, and use nonparametric kernel density estimation to compute the unconditional (asymptotic) null distribution of the SLR statistics with n = 2500 and h = 0.06 via 500 replications.³ The resulting densities are plotted as solid lines in Fig. 1a–c. As can be seen from these plots, the resulting densities are very close, indicating that the asymptotic distribution of the SLR statistics is not sensitive to the choice of the function π(z). This suggests that Wilks-type results continue to hold for our model.

Finally, to validate the conditional bootstrap approach, for each assumed function we select three typical samples, generated from the three different values of δ, and compute the conditional null distribution based on 500 bootstrap samples. The resulting densities are depicted as dotted curves in the same Fig. 1a–c. From these figures, we can observe that the proposed conditional bootstrap approach performs quite well in approximating the asymptotic null distribution.

7. Empirical application

There exists a vast literature on measuring productivity and efficiency in the banking sectors of various countries (see, for example, Galán, Veiga, & Wiper, 2015; Sathye, 2003; Tzeremes, 2015, and the articles in Volume 98, Issue 2 (1997) of the European Journal of Operational Research, just to name a few). However, these applications typically do not allow for the presence of fully efficient banks, and hence their results could potentially be misleading if there are in fact efficient banks in the sample.

In this section, we provide an application to the U.S. banking sector to illustrate the usefulness and merit of our proposed model and approach. The data we use are taken from Koetter, Kolari, and Spierdijk (2012) and consist of a large number of individual U.S. commercial banks from the Reports of Condition and Income of the Federal Reserve System.⁴ The data contain annual year-end observations on all U.S. insured banks between 1976 and 2007. After controlling for outliers and missing observations, the final sample used in the estimation consists of 342,868 observations.

Following convention in the competition and efficiency literature, the regressors used in our model are the logs of three input prices: the price of fixed assets (w1), the cost of labor (w2) and purchased fund costs (w3); the levels of two outputs: loans (y1) and federal funds sold and securities purchased (y2); a time trend (t); and the log of total assets to control for size effects (z). In addition, in line with the intermediation approach, it is assumed that banks transform various savings of consumers and firms into loans and investments, and seek to minimize costs. Thus, the dependent variable is total operating costs, implying that a cost frontier approach is employed.

Note that our proposed model and approach are designed for cross-section data, and since we are using panel data, we need to make some assumptions regarding the temporal behavior of technical inefficiency and random noise. Following KPT, first we include a time variable in the SF function to allow for technical change or a shift in the

³ We also conducted simulations using the other bandwidths h = 0.12 and h = 0.24. The results are very similar, and hence we do not report them here; they are available from the authors upon request.
⁴ See Koetter et al. (2012) for the issues involved, as well as details of the construction of the data set. The data are also available from Restrepo-Tobon and Kumbhakar (2014).

Fig. 1. (a) Estimated densities of the null distributions of SLR statistics, δ = −1, λ = 2.5, h = 0.06. (b) Estimated densities of the null distributions of SLR statistics, δ = 0, λ =
2.5, h = 0.06. (c) Estimated densities of the null distributions of SLR statistics, δ = 1, λ = 2.5, h = 0.06.

Fig. 2. Estimated probability function π(z).

frontier; and second, for simplicity we assume that both u and v are independently and identically distributed.

We employ the translog specification for the cost frontier, which can be written as

\ln C_{it} = \beta_0 + \sum_{j=1}^{2}\beta_j \ln y_{j,it} + \sum_{j=1}^{2}\gamma_j \ln w_{j,it} + \frac{1}{2}\sum_{l=1}^{2}\sum_{k=1}^{2}\beta_{lk}\ln y_{l,it}\ln y_{k,it} + \frac{1}{2}\sum_{l=1}^{2}\sum_{k=1}^{2}\gamma_{lk}\ln w_{l,it}\ln w_{k,it} + \sum_{l=1}^{2}\sum_{k=1}^{2}\alpha_{lk}\ln y_{l,it}\ln w_{k,it} + \alpha_t t + \alpha_{tt}t^2 + \sum_{l=1}^{2}\delta_l\, t\ln y_{l,it} + \sum_{l=1}^{2}\theta_l\, t\ln w_{l,it} + v_{it} + u_{it},

where we assume u_{it} ∼ i.i.d. |N(0, σ²_u)| and v_{it} ∼ i.i.d. N(0, σ²_v). We impose the usual symmetry conditions β_{lk} = β_{kl} and α_{lk} = α_{kl}. In addition, we normalize the cost and the input prices by one input price (here we use w3) to ensure that the linear homogeneity restriction of the cost function with respect to input prices holds.

For comparison purposes, we also estimate a standard half-normal stochastic frontier model (HN-SFM) and the zero-inefficiency stochastic frontier (ZISF) model of KPT, which assumes that the underlying probability of firms being fully efficient follows a logit specification in z_i.

Note that, since the estimated parameters of the translog frontier do not have any direct economic interpretation, for brevity we do not report all the parameter estimates here; they are available from the authors upon request. Instead, we summarize the results for the estimated returns to scale (RTS) and technical change (TC), and report the results associated with the estimated probability function and the estimated technical inefficiencies.

For a cost function, RTS measures the proportional increase in costs due to an increase in all outputs; that is, RTS can be defined as the reciprocal of \sum_{j}(\partial\ln C/\partial\ln y_j). Thus, if RTS is larger than one, a proportional increase in all outputs leads to a less than proportional increase in cost, implying that the scale of operation is below the optimum, and hence there are benefits from expansion (i.e., economies of scale). The opposite holds when RTS is less than one. TC is defined as the rate of change in cost over time, ceteris paribus, i.e., ∂ln C/∂t. Therefore, a negative value of TC suggests a reduction in cost over time, implying technical progress, while a positive value of TC indicates technical regress, ceteris paribus.

As previously mentioned in Section 1, it is important to recognize that the prominent feature of the ZISF model is that the frontier itself does not vary across the two classes of firms; only the existence or non-existence of inefficiency differs. Thus, we would expect the estimated RTS and TC of the three models not to differ significantly. Indeed, our results indicate that the estimated RTS and TC are very similar across all models. The estimated RTS values range from 0.85 to 1.3, with a mean of 1.11 and a standard deviation of 0.22, while the estimated TC values range from −0.091 to 0.019, with a mean of −0.016 and a standard deviation of 0.0068. These results indicate that most of the banks experienced economies of scale as well as technical progress.

We now turn our attention to the results for the estimated probability function and the estimated technical efficiencies. To present results for the probability function, we normalize the log of total assets as z∗_it = (z_it − z_min)/(z_max − z_min), where z_it is the log of total assets. The function is then evaluated at 100 points between 0 and 1 and presented (along with two standard error bands) in Fig. 2. From Fig. 2, we observe that banks attaining full efficiency are most likely concentrated near the 20 percent asset quantile, where the probability peaks at about 80 percent, while the 95 percent confidence intervals indicate that the probability can be as high as one. Larger banks seem to be inefficient with large probability, although there is some evidence of marginal behavior near the 79–85 percent quantiles. Zero inefficiency seems to prevail for most banks roughly below the median (normalized) assets. These results are consistent with the hypothesis that larger banks are inefficient due, for example, to the "quiet life hypothesis" (Koetter & Vins, 2008; Koetter et al., 2012), albeit other reasons could also be responsible for inefficient larger banks and nearly efficient smaller banks.

The estimated technical inefficiency distributions are displayed in Fig. 3. It can be seen from Fig. 3 that the SFM based on the

Fig. 3. Technical inefficiency distributions.

half-normal specification yields inefficiency results that display virtually no mass at zero, indicating that no banks are fully efficient. The inefficiency scores lie in the range of 0.5 percent to 14 percent, with a mean of 6.5 percent and a standard deviation of 0.021. In contrast, results from the ZISF model indicate that there is some mass at zero, with a long right tail in the inefficiency distribution. This suggests that, albeit there are some fully efficient banks, inefficiency can be as high as 20 percent for some other banks. The inefficiency scores lie in a wider range than in the case of the half-normal SFM, ranging from 0 percent to 20 percent, with a mean of 4.5 percent and a standard deviation of 0.032. The semiparametric specification places even more mass at zero, and its inefficiency distribution is much tighter than those of both the half-normal SFM and the ZISF model. The inefficiency scores lie between 0 percent and 10 percent, with a mean of 2.3 percent and a standard deviation of 0.012. Thus, from the results in Fig. 3, we can see that the parametric models (HN-SFM and ZISF) deliver very different inefficiency distributions compared to the semiparametric specification.

To determine which specification is more appropriate for the data considered, we use the SLR test discussed in Section 5 to test the hypotheses (i) H0: π(z) = e^{α′z}/(1 + e^{α′z}) (the ZISF model) and (ii) H0: π(z) = 0 (the HN-SFM), based on the conditional bootstrap critical values. Our SLR tests produce conditional bootstrap p-values of 0.0236 for hypothesis (i) and 0.0073 for hypothesis (ii), suggesting that both null hypotheses are rejected at the 5 percent significance level in favor of the nonparametric specification of the probability function. Consequently, for this particular data set, our results provide evidence that a flexible specification of the probability function is critical and, in particular, material in terms of inefficiency estimation.

8. Concluding remarks

In this paper, we propose a semiparametric approach for estimating the zero-inefficiency stochastic frontier model (e.g., KPT, 2013; RS, 2015) by allowing the proportion of firms that are fully efficient to depend on a set of covariates via an unknown smooth function. In particular, we propose an iterative backfitting local maximum likelihood estimation procedure that achieves the optimal convergence rates of both the frontier parameters and the nonparametric function of the probability of firms being efficient. We derive the asymptotic bias and variance of the proposed estimator and establish its asymptotic normality. In addition, we discuss how to test for a parametric specification of the proportion of firms that are fully efficient, as well as how to test for the presence of fully inefficient firms, based on conditional bootstrap sieve likelihood ratio statistics. The finite-sample behavior of the proposed estimation procedure and tests is examined using Monte Carlo simulations. We apply the proposed method to data on a large number of individual U.S. commercial banks to examine the effects of total assets on the probability of banks being efficient, as well as overall technical inefficiency measurement. Our analysis indicates that a flexible specification of the probability function of banks being efficient is critical in efficiency estimation.

Note that the estimation approach proposed in this paper can also be easily modified and extended to other models that allow the distribution of u_i to depend on covariates z_i either parametrically (e.g., Caudill & Ford, 1993; Reifschneider & Stevenson, 1991; Caudill, Ford, & Gropper, 1995) or nonparametrically. For example, if it is assumed that σ²(z_i) = exp(z′_i α), then by simply redefining the finite-dimensional parameter vector θ, the estimation algorithm proceeds as discussed. On the other hand, if we assume σ²(z_i) to be an unknown smooth function, then by approximating this function locally at a point z_0 and modifying the local log-likelihood function in (4), the estimation algorithm remains unaffected.

Finally, it would be interesting to extend the current model to a fully nonparametric setting that includes both continuous and categorical variables in the frontier as well as in the probability function.

Appendix A. Proofs of the theorems

We first introduce some additional notation. Let π̃∗ = √(nh)(π̃ − π), β̃∗ = √(nh)(β̃ − β), λ̃∗ = √(nh)(λ̃ − λ) and σ̃²∗ = √(nh)(σ̃² − σ²), where π, β, λ and σ² are the true values. Also, let θ̃∗ = (β̃∗, λ̃∗, σ̃²∗)′ and γ̃∗ = (π̃∗(·), θ̃∗′)′.

Proof of Theorem 1. The proof of this theorem follows similarly to that of Huang and Yao (2012). We show the key steps of the proof.

To derive the asymptotic properties of θ̂, we first examine the asymptotic behavior of γ̃ = (π̃, θ̃′)′, which is the local MLE of (4). Let θ̂∗ = √n(θ̂ − θ),

\ell(\tilde\pi(z_i), \theta, x_i, y_i) = \log f(y_i \mid \tilde\pi(z_i), \theta),
\ell(\tilde\pi(z_i), \hat\theta + n^{-1/2}\tilde\theta, x_i, y_i) = \log f(y_i \mid \tilde\pi(z_i), \hat\theta + n^{-1/2}\tilde\theta).

Then θ̂∗ is the maximizer of

L_n(\theta^*) = \sum_{i=1}^{n}\bigl\{\ell(\tilde\pi(z_i), \hat\theta + n^{-1/2}\tilde\theta, x_i, y_i) - \ell(\tilde\pi(z_i), \theta, x_i, y_i)\bigr\}.   (A.1)

By a Taylor series expansion, and after some calculation, this yields

L_n(\theta^*) = A_n'\theta^* + \tfrac{1}{2}\theta^{*\prime} B_n \theta^* + o_p(1),   (A.2)

where

A_n = n^{-1/2}\sum_{i=1}^{n}\frac{\partial \ell(\tilde\pi(z_i),\theta,x_i,y_i)}{\partial\theta}, \qquad
B_n = n^{-1}\sum_{i=1}^{n}\frac{\partial^2 \ell(\tilde\pi(z_i),\theta,x_i,y_i)}{\partial\theta\,\partial\theta'}.

Next we evaluate the terms A_n and B_n. First, expanding A_n around π(z_i), we obtain

A_n = n^{-1/2}\sum_{i=1}^{n}\frac{\partial\ell(\pi(z_i),\theta,x_i,y_i)}{\partial\theta} + n^{-1/2}\sum_{i=1}^{n}\frac{\partial^2\ell(\pi(z_i),\theta,x_i,y_i)}{\partial\theta\,\partial\pi}\,[\tilde\pi(z_i)-\pi(z_i)] + O_p\bigl(n^{1/2}\|\tilde\pi(\cdot)-\pi(\cdot)\|_\infty^2\bigr)
    = n^{-1/2}\sum_{i=1}^{n}\frac{\partial\ell(\pi(z_i),\theta,x_i,y_i)}{\partial\theta} + D_{1n} + O_p\bigl(n^{1/2}\|\tilde\pi(\cdot)-\pi(\cdot)\|_\infty^2\bigr),

where the definition of D_{1n} should be apparent. Now, applying Lemma A.1 of Fan and Huang (2005), we have

\tilde\gamma(z_i) - \gamma(z_i) = n^{-1} f^{-1}(z_i)\, I_{\gamma\gamma}^{-1}(z_i)\sum_{j=1}^{n}\frac{\partial\ell(\gamma(z_j),x_j,y_j)}{\partial\gamma}\, K_h(z_j - z_i) + O_p(\delta_n),

where \delta_n = n^{-1/2}h^{3/2} + (nh)^{-1}\log(1/h). Under the condition nh^2/\log(1/h)\to\infty, we have O_p(n^{1/2}\delta_n) = o_p(1). Furthermore, since \pi(z_i)-\pi(z_j) = O((z_i-z_j)^2) and K(\cdot) is symmetric about 0, we have

D_{1n} = n^{-3/2}\sum_{j=1}^{n}\sum_{i=1}^{n}\frac{\partial^2\ell(\pi(z_i),\theta,x_i,y_i)}{\partial\theta\,\partial\pi}\, f^{-1}(z_i)\, d(x_j,y_j,z_j)\, K_h(z_i-z_j) + O_p(n^{1/2}h^2) = D_{2n} + O_p(n^{1/2}h^2),

where d(x_j,y_j,z_j) is the first element of I_{\gamma\gamma}^{-1}(z_j)\, q_\gamma(\gamma(z_j),x_j,y_j) and the definition of D_{2n} should be apparent. Let D_{3n} = -n^{-1/2}\sum_{j=1}^{n} I_{\theta\pi}(z_j)\, d(x_j,y_j,z_j); then it can be shown that D_{2n} - D_{3n} \to_p 0. Hence, under the condition nh^4 \to 0, we have

A_n = n^{-1/2}\sum_{i=1}^{n}\Bigl\{\frac{\partial\ell(\pi(z_i),\theta,x_i,y_i)}{\partial\theta} - I_{\theta\pi}(z_i)\, d(x_i,y_i,z_i)\Bigr\} + o_p(1).   (A.3)

For B_n, it can be shown that

B_n = -E[I_{\theta\theta}(x)] + o_p(1) = B + o_p(1).   (A.4)

Thus, (A.2) in conjunction with (A.4) and an application of the quadratic approximation lemma (see, for example, Fan & Gijbels, 1996, p. 210) leads to

\hat\theta^* = B^{-1}A_n + o_p(1)   (A.5)

if A_n is a sequence of stochastically bounded vectors. Consequently, the asymptotic normality of \hat\theta^* follows from that of A_n. Note that, since A_n is a sum of i.i.d. random vectors, it suffices to compute the mean and covariance matrix of A_n and invoke the Central Limit Theorem. To this end, from (A.3), we have

E(A_n) = n^{1/2} E\Bigl\{\frac{\partial\ell(\pi(z),\theta,x,y)}{\partial\theta} - I_{\theta\pi}(z)\, d(x,y,z)\Bigr\}.   (A.6)

The expectation of each element of the first term on the right-hand side can be shown to equal 0, and a further calculation shows that E\{I_{\theta\pi}(z)\, d(x,y,z)\} = 0. Thus E(A_n) = 0. The variance of A_n is \mathrm{Var}(A_n) = \mathrm{Var}\{\partial\ell(\pi(z),\theta,x,y)/\partial\theta - I_{\theta\pi}(z)\, d(x,y,z)\} = \Sigma. By the Central Limit Theorem, we obtain the desired result. □

Proof of Theorem 2. Recall that, given the estimate θ̂, π̂(z) maximizes (5). Let \eta(z_0, z) = a(z_0) + a'(z_0)(z - z_0) and \pi^* = (nh)^{1/2}\{\pi - a(z_0),\, h(\pi' - a'(z_0))\}'; then \hat\pi^* maximizes

L_n^*(\pi^*) = \sum_{i=1}^{n}\bigl\{\ell(\eta(z_0,z_i) + (nh)^{-1/2}\pi^{*\prime}w_i, \hat\theta, x_i, y_i) - \ell(\eta(z_0,z_i), \hat\theta, x_i, y_i)\bigr\}\, K_h(z_i - z_0),

where w_i = (1, (z_i - z_0)/h)'. Using a Taylor expansion of \ell(\cdot), and after some calculation, we have

L_n^*(\pi^*) = \hat W_n'\pi^* + \tfrac{1}{2}\pi^{*\prime}\hat\Delta_n\pi^* + o_p(1),   (A.7)

where

\hat W_n = (nh)^{-1/2}\sum_{i=1}^{n}\frac{\partial\ell(\eta(z_i,z_0),\hat\theta,x_i,y_i)}{\partial\eta}\, w_i\, K_h(z_i - z_0), \qquad
\hat\Delta_n = (nh)^{-1}\sum_{i=1}^{n}\frac{\partial^2\ell(\eta(z_i,z_0),\hat\theta,x_i,y_i)}{\partial\eta^2}\, w_i w_i'\, K_h(z_i - z_0).

By the SLLN, Assumption 3, and Lemma A.1 of Fan and Huang (2005), it can be shown that

E(\hat\Delta_n) \to -f(z_0)\begin{pmatrix}1 & 0\\ 0 & \mu_2\end{pmatrix}\otimes I_\pi = -\Delta

and \mathrm{Var}\{(\hat\Delta_n)_{ij}\} = O((nh)^{-1}), implying that \hat\Delta_n = -\Delta + o_p(1). Thus (A.7) can be written as

L_n^*(\pi^*) = \hat W_n'\pi^* - \tfrac{1}{2}\pi^{*\prime}\Delta\,\pi^* + o_p(1).   (A.8)

Using the quadratic approximation lemma yields

\hat\pi^* = \Delta^{-1}\hat W_n + o_p(1)   (A.9)

if \hat W_n is a sequence of stochastically bounded random vectors. An expansion of \hat W_n leads to

\hat W_n = (nh)^{-1/2}\sum_{i=1}^{n}\frac{\partial\ell(\eta(z_i,z_0),\theta,x_i,y_i)}{\partial\eta}\, w_i\, K_h(z_i - z_0) + G_n + o_p(1) = W_n + G_n + o_p(1),

where

G_n = (nh)^{-1/2}\sum_{i=1}^{n}\frac{\partial^2\ell(\eta(z_i,z_0),\theta,x_i,y_i)}{\partial\eta\,\partial\theta'}\,(\hat\theta - \theta)\, w_i\, K_h(z_i - z_0).

Since \sqrt n(\hat\theta - \theta) = O_p(1), it can be shown that G_n = -h^{1/2}\, I_{\eta\theta}(z_0)\, f(z_0)\, O_p(1) = o_p(1), where I_{\eta\theta}(z) = -E[q_{\eta\theta}(\eta(z),\theta,x,y)\mid z]. Thus, (A.9) becomes

\hat\pi^* = \Delta^{-1}W_n + o_p(1).   (A.10)

The asymptotic normality of \hat\pi^* follows from that of W_n, so it suffices to calculate the mean and variance of W_n. Since K(\cdot) is symmetric and bounded, we have

E(W_n) = n(nh)^{-1/2} E\Bigl\{\frac{\partial\ell(\eta(z,z_0),\theta,x,y)}{\partial\eta}\, w\, K_h(z - z_0)\Bigr\} = \frac{(nh)^{1/2}h^2 f(z_0)}{2}\binom{\mu_2}{0}\otimes I_\pi(z_0)\,\pi''(z_0)\{1 + o_p(1)\}

and

\mathrm{Var}(W_n) = h^{-1} E\Bigl\{\Bigl[\frac{\partial\ell(\eta(z,z_0),\theta,x,y)}{\partial\eta}\Bigr]^2 w w'\, K_h^2(z - z_0)\Bigr\} = f(z_0)\begin{pmatrix}\kappa_0 & \kappa_1\\ \kappa_1 & \kappa_2\end{pmatrix}\otimes I_\pi(z_0)\{1 + o_p(1)\}.

Let e_1 = (1, 0)' denote a (2 × 1) unit vector; then

\{e_1'\,\mathrm{Var}(W_n)\, e_1\}^{-1/2}\{e_1' W_n - e_1' E(W_n)\} \to_d N(0, 1)

by standard arguments. □

References

Aigner, D. J., Lovell, C. A. K., & Schmidt, P. (1977). Formulation and estimation of stochastic frontier production models. Journal of Econometrics, 6(1), 21–27.
Andrews, D. W. K. (2001). Testing when a parameter is on the boundary of the maintained hypothesis. Econometrica, 69(3), 683–734.
Cai, Z., Fan, J., & Li, R. (2000). Efficient estimation and inference for varying-coefficient models. Journal of the American Statistical Association, 95, 888–902.
Caudill, S. B. (2003). Estimating a mixture of stochastic frontier regression models via the EM algorithm: A multiple cost function application. Empirical Economics, 28, 581–598.
Caudill, S. B., & Ford, J. M. (1993). Biases in frontier estimation due to heteroskedasticity. Economics Letters, 41, 17–20.
Caudill, S. B., Ford, J. M., & Gropper, D. M. (1995). Frontier estimation and firm-specific inefficiency measures in the presence of heteroscedasticity. Journal of Business & Economic Statistics, 13, 105–111.
Fan, J., & Gijbels, I. (1996). Local polynomial modelling and its applications. London: Chapman and Hall.
Fan, J., & Huang, T. (2005). Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli, 11, 1031–1057.
Fan, J., Zhang, C., & Zhang, J. (2001). Generalized likelihood ratio statistics and Wilks phenomenon. Annals of Statistics, 29, 153–193.
Galán, J. E., Veiga, H., & Wiper, M. P. (2015). Dynamic effects in efficiency: Evidence from the Colombian banking sector. European Journal of Operational Research, 240, 562–571.
Greene, W. H. (2005). Reconsidering heterogeneity in panel data estimators of the stochastic frontier model. Journal of Econometrics, 126(2), 269–303.
Greene, W. (2012). Econometric analysis (5th ed.). New York: Prentice Hall.
Huang, M., & Yao, W. (2012). Mixture of regression models with varying mixing proportions: A semiparametric approach. Journal of the American Statistical Association, 107, 711–724.
Ivaldi, M., Monier-Dilhan, S., & Simioni, M. (1995). Stochastic production frontiers and panel data: A latent variable framework. European Journal of Operational Research, 80(3), 534–547.
Jondrow, J., Lovell, C. A. K., Materov, I. S., & Schmidt, P. (1982). On the estimation of technical inefficiency in the stochastic frontier production function model. Journal of Econometrics, 19(2/3), 233–238.
Koetter, M., & Vins, O. (2008). The quiet life hypothesis in banking: Evidence from German savings banks. Working paper.
Koetter, M., Kolari, J., & Spierdijk, L. (2012). Enjoying the quiet life under deregulation? Evidence from adjusted Lerner indices for U.S. banks. Review of Economics and Statistics, 94(2), 462–480.
Kumbhakar, S., Parmeter, C. F., & Tsionas, E. G. (2013). A zero inefficiency stochastic frontier model. Journal of Econometrics, 172, 66–76.
Li, R., & Liang, H. (2008). Variable selection in semiparametric regression modeling. Annals of Statistics, 36, 261–286.
Li, Q., & Racine, J. (2007). Nonparametric econometrics. Princeton, NJ: Princeton University Press.
Meeusen, W., & van den Broeck, J. (1977). Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review, 18(2), 435–444.
Ondrich, J., & Ruggiero, J. (2001). Efficiency measurement in the stochastic frontier model. European Journal of Operational Research, 129, 432–442.
Orea, L., & Kumbhakar, S. C. (2004). Efficiency measurement using a latent class stochastic frontier model. Empirical Economics, 29(1), 69–83.
Reifschneider, D., & Stevenson, R. (1991). Systematic departures from the frontier: A framework for the analysis of firm inefficiency. International Economic Review, 32, 715–723.
Restrepo-Tobon, D., & Kumbhakar, S. C. (2014). Enjoying the quiet life under deregulation? Not quite. Journal of Applied Econometrics, 29(2), 333–343.
Rho, S., & Schmidt, P. (2015). Are all firms inefficient? Journal of Productivity Analysis, 43, 327–349.
Sathye, M. (2003). Efficiency of banks in a developing economy: The case of India. European Journal of Operational Research, 148, 662–671.
Tzeremes, N. (2015). Efficiency dynamics in Indian banking: A conditional directional distance approach. European Journal of Operational Research, 240, 807–818.
