Graduate School of Economics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
E-mail: tsukamoto.takahiro@j.mbox.nagoya-u.ac.jp
January 27, 2018, Revised: July 30, 2018, Revised: November 14, 2018
Abstract
By integrating Battese and Coelli’s (1995) model and the spatial autoregressive model (SAR), a spatial
autoregressive stochastic frontier model for panel data is developed. The main feature of this frontier model is a
spatial lag term of explained variables and the joint structure of a production possibility frontier with a model of
technical inefficiency. The model addresses both spatial dependence and heteroskedastic technical inefficiency.
This study applies maximum likelihood methods considering the endogenous spatial lag term. The proposed
model nests several existing models. Further, an empirical analysis using data on the Japanese manufacturing
industry is conducted and the existing models are tested against the proposed model, which is found to be
statistically supported. The findings suggest that estimates in the existing spatial and non-spatial models may
exhibit bias because of lack of determinants of technical inefficiency, as well as a spatial lag. This bias also affects
the technical efficiency score and its ranking.
Keywords Stochastic frontier analysis (SFA), Determinants of technical inefficiency, Spatial autoregressive
dependence, Japanese manufacturing industry
1
For more information on spatial econometrics, see Anselin (1988) and LeSage and Pace (2009).
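$$y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + \rho \sum_{j=1}^{N} w_{ij}^{t} y_{jt} + v_{it} - u_{it}, \quad i = 1, 2, \dots, N, \quad t = 1, 2, \dots, T_i, \tag{1}$$
$$v_{it} \sim \text{i.i.d. } N(0, \sigma_v^2), \tag{2}$$
$$u_{it} \sim \text{i.i.d. } N^{+}(\mu_{it}, \sigma_u^2), \tag{3}$$
$$\mu_{it} = \mathbf{z}_{it}'\boldsymbol{\delta}. \tag{4}$$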
In the above, 𝑦𝑖𝑡 is a scalar output of producer 𝑖 in period 𝑡 and 𝒙𝑖𝑡 is a vector of inputs used by producer 𝑖 in
period 𝑡. 𝒛𝑖𝑡 is a vector of determinants that may generate technical inefficiency, including both producer-specific
attributes and environmental factors. The second term on the right-hand-side of Equation (1) represents the SAR
term that captures spatial dependency. 𝑤𝑖𝑗𝑡 is the 𝑖𝑗 element of the spatial weight matrix in period 𝑡. The whole
spatial weight matrix 𝐖 is a block diagonal matrix of {𝐖1 , 𝐖 2 , … , 𝑾𝑇 }, where 𝐖𝑡 = {𝑤𝑖𝑗𝑡 }, and 𝜌 is an unknown
2
Affuso’s (2010) pioneering study added the spatial lag of the explained variable to the explanatory variables of a
stochastic frontier model; however, its log-likelihood function does not address the endogeneity of the
spatial lag term.
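$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}_r'} = (\mathbf{I}_{NT} - \rho\mathbf{W})^{-1}\beta_r = (\mathbf{I}_{NT} + \rho\mathbf{W} + \rho^2\mathbf{W}^2 + \rho^3\mathbf{W}^3 + \cdots)\beta_r, \tag{5}$$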
where 𝒚 = {𝑦𝑖𝑡 } is a vector of outputs and 𝛽𝑟 is the 𝑟 th parameter of 𝜷 . The marginal effect varies across
observations. Every diagonal element of the matrix refers to the marginal effect of its own explanatory variable,
which is called a direct effect. Every non-diagonal element of the matrix refers to the marginal effect of the
explanatory variable that is not its own, which is called an indirect effect. LeSage and Pace (2009) proposed using
the average of the diagonal elements of the matrix as summary statistics of the direct effect. However, since the
average is missing a large amount of information, indices representing dispersion of the direct effect (e.g.,
maximum and minimum values) should also be reported.
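The direct and indirect effects described above can be sketched numerically. The following is a minimal illustration, not the author's code: the three-region weight matrix and the values of 𝜌 and 𝛽𝑟 are hypothetical, a single period is used, and the average indirect effect is taken as the average total effect (mean row sum) minus the average direct effect, which is one common summary and an assumption here.

```python
import numpy as np

def marginal_effects(W, rho, beta_r):
    """Matrix of marginal effects (I - rho*W)^{-1} * beta_r, as in Equation (5)."""
    n = W.shape[0]
    return np.linalg.inv(np.eye(n) - rho * W) * beta_r

def summarize_effects(effects):
    """Summary statistics of direct (diagonal) and indirect (off-diagonal) effects."""
    direct = np.diag(effects)                 # effect of a region's own input
    total_mean = effects.sum(axis=1).mean()   # average row sum = average total effect
    return {
        "direct_mean": float(direct.mean()),
        "direct_min": float(direct.min()),
        "direct_max": float(direct.max()),
        "indirect_mean": float(total_mean - direct.mean()),
    }

# Hypothetical row-normalized weight matrix: three regions on a line.
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])

effects = marginal_effects(W, rho=0.2, beta_r=0.5)   # illustrative parameter values
summary = summarize_effects(effects)
print(summary)
```

Reporting the minimum and maximum alongside the mean makes the dispersion of the direct effect visible, which is the point made in the paragraph above.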
The proposed model nests many existing spatial and non-spatial stochastic frontier models. If 𝜌 = 0,
our model becomes equivalent to the model suggested by Battese and Coelli (1995). If 𝒛𝑖𝑡 includes only a constant
term, it becomes a spatial stochastic frontier model assuming a homoskedastic truncated normal distribution as
the distribution that represents technical inefficiency. If 𝜹 = 𝟎 , our model becomes equivalent to the SAR
stochastic frontier model assuming a homoskedastic half-normal distribution as a distribution that represents
technical inefficiency, as proposed by Glass et al. (2016). If 𝜌 = 0 and 𝒛𝑖𝑡 has only a constant term, the model
will be a non-spatial stochastic frontier model assuming a truncated normal distribution as the distribution that
represents technical inefficiency, as proposed by Stevenson (1980). If 𝜌 = 0 and 𝜹 = 𝟎 , our model becomes
equivalent to a non-spatial stochastic frontier model assuming a homoskedastic half-normal distribution as the
distribution that represents technical inefficiency, as proposed by Aigner et al. (1977).
The spatial lag term $\sum_{j=1}^{N} w_{ij}^{t} y_{jt}$ is endogenous; thus, estimating the proposed model using the ML
methods for non-spatial stochastic frontier models causes a bias, unless 𝜌 = 0. Taking the endogeneity of the
spatial lag term into consideration, we present the ML methods to estimate the spatial autoregressive stochastic
frontier model.
Following Battese and Coelli (1995), we re-parameterize as follows:
3
To estimate the spatial econometric model, (𝑰𝑁𝑇 − 𝜌𝑾) needs to be a non-singular matrix (𝑰𝑁𝑇 is the identity
matrix whose dimension equals the number of observations). When 𝐖 is a symmetric matrix, the range of 𝜌 is
(1⁄𝜔𝑚𝑖𝑛 , 1⁄𝜔𝑚𝑎𝑥 ). Here, 𝜔𝑚𝑖𝑛 and 𝜔𝑚𝑎𝑥 are the smallest and largest eigenvalues of 𝐖, respectively. If 𝐖 is an
asymmetric matrix, the eigenvalues may be complex numbers. LeSage and Pace (2009) show that, if 𝐖 is
row-normalized (each row’s sum is set to 1), the range of 𝜌 should be (1⁄𝑟𝑚𝑖𝑛 , 1), where 𝑟𝑚𝑖𝑛 is the most negative
purely real eigenvalue of 𝐖. Thus, regardless of whether 𝐖 is symmetric or asymmetric, if 𝐖 is row-normalized,
the range of 𝜌 is (1⁄𝑟𝑚𝑖𝑛 , 1).
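The admissible range of 𝜌 for a row-normalized weight matrix can be checked numerically. The sketch below is illustrative only: the function names and the small weight matrix are hypothetical, and eigenvalues are treated as "purely real" when their imaginary part is below a small tolerance.

```python
import numpy as np

def row_normalize(W):
    """Scale each row of W to sum to 1; rows that sum to zero are left as zeros."""
    W = np.asarray(W, dtype=float)
    sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, sums, out=np.zeros_like(W), where=sums != 0)

def rho_interval(W, tol=1e-8):
    """Return (1/r_min, 1), with r_min the most negative purely real eigenvalue of W."""
    eig = np.linalg.eigvals(W)
    real_eigs = eig.real[np.abs(eig.imag) < tol]
    r_min = real_eigs.min()
    if r_min >= 0:
        raise ValueError("W has no negative real eigenvalue; lower bound undefined")
    return 1.0 / r_min, 1.0

def is_admissible(rho, W):
    """True if rho lies strictly inside the interval returned by rho_interval."""
    lower, upper = rho_interval(W)
    return lower < rho < upper

# Hypothetical contiguity pattern for three regions on a line (asymmetric after normalization).
W = row_normalize([[0, 2, 0],
                   [1, 0, 1],
                   [0, 3, 0]])
lower, upper = rho_interval(W)
print(lower, upper, is_admissible(0.2115, W))
```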
Here, ln|𝑰𝑁𝑇 − 𝜌𝑾| comes from the Jacobian matrix that accompanies the variable transformation from 𝜀𝑖𝑡 ≔
𝑣𝑖𝑡 − 𝑢𝑖𝑡 to 𝑦𝑖𝑡 , considering the endogeneity of the spatial lag term. Derivation of the likelihood is shown in the
Appendix. The first-order conditions of the ML estimators are as follows:
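$$\frac{\partial LL}{\partial \boldsymbol{\beta}} = \sum_{i=1}^{N}\sum_{t=1}^{T_i}\left[\frac{\mathbf{z}_{it}'\boldsymbol{\delta} + y_{it} - \mathbf{x}_{it}'\boldsymbol{\beta} - \rho\sum_{j=1}^{N} w_{ij}^{t} y_{jt}}{\sigma^2} + \frac{\phi(d_{it}^{*})}{\Phi(d_{it}^{*})}\frac{\gamma}{\sigma_{*}}\right]\mathbf{x}_{it} = \mathbf{0}, \tag{13}$$
$$\frac{\partial LL}{\partial \boldsymbol{\delta}} = \sum_{i=1}^{N}\sum_{t=1}^{T_i}\left[-\frac{\mathbf{z}_{it}'\boldsymbol{\delta} + y_{it} - \mathbf{x}_{it}'\boldsymbol{\beta} - \rho\sum_{j=1}^{N} w_{ij}^{t} y_{jt}}{\sigma^2} - \frac{\phi(d_{it})}{\Phi(d_{it})}\frac{1}{\sigma\sqrt{\gamma}} + \frac{\phi(d_{it}^{*})}{\Phi(d_{it}^{*})}\frac{(1-\gamma)}{\sigma_{*}}\right]\mathbf{z}_{it} = \mathbf{0}, \tag{14}$$
$$\frac{\partial LL}{\partial \sigma^2} = -\frac{1}{2\sigma^2}\left\{\left(\sum_{i=1}^{N} T_i\right) - \sum_{i=1}^{N}\sum_{t=1}^{T_i}\frac{\left(\mathbf{z}_{it}'\boldsymbol{\delta} + y_{it} - \mathbf{x}_{it}'\boldsymbol{\beta} - \rho\sum_{j=1}^{N} w_{ij}^{t} y_{jt}\right)^2}{\sigma^2} - \sum_{i=1}^{N}\sum_{t=1}^{T_i}\left[\frac{\phi(d_{it})}{\Phi(d_{it})}d_{it} - \frac{\phi(d_{it}^{*})}{\Phi(d_{it}^{*})}d_{it}^{*}\right]\right\} = 0, \tag{15}$$
$$\frac{\partial LL}{\partial \gamma} = \sum_{i=1}^{N}\sum_{t=1}^{T_i}\left[\frac{\phi(d_{it})}{\Phi(d_{it})}\left(\frac{1}{2\gamma}d_{it}\right) - \frac{\phi(d_{it}^{*})}{\Phi(d_{it}^{*})}\left\{\frac{\mathbf{z}_{it}'\boldsymbol{\delta} + y_{it} - \mathbf{x}_{it}'\boldsymbol{\beta} - \rho\sum_{j=1}^{N} w_{ij}^{t} y_{jt}}{\sigma_{*}} + \frac{(1-2\gamma)d_{it}^{*}}{2(1-\gamma)\gamma}\right\}\right] = 0, \tag{16}$$
$$\frac{\partial LL}{\partial \rho} = -\mathrm{tr}\left((\mathbf{I}_{NT} - \rho\mathbf{W})^{-1}\mathbf{W}\right) + \sum_{i=1}^{N}\sum_{t=1}^{T_i}\sum_{j=1}^{N} w_{ij}^{t} y_{jt}\left[\frac{\mathbf{z}_{it}'\boldsymbol{\delta} + y_{it} - \mathbf{x}_{it}'\boldsymbol{\beta} - \rho\sum_{j=1}^{N} w_{ij}^{t} y_{jt}}{\sigma^2} + \frac{\phi(d_{it}^{*})}{\Phi(d_{it}^{*})}\frac{\gamma}{\sigma_{*}}\right] = 0, \tag{17}$$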
where 𝜙(⋅) and 𝛷(⋅) respectively indicate the probability density and cumulative distribution functions of the
standard normal distribution. As Equations (8)–(17) cannot be solved analytically, we maximize the log-likelihood
function numerically so that these first-order conditions are satisfied.
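As an illustration of what is maximized numerically, the log-likelihood in Equation (A8) of the Appendix can be written as a short function. This is a minimal sketch under stated assumptions rather than the author's implementation: the function name, argument order, and the small random data set are hypothetical, observations are stacked so that 𝐖 is the full block-diagonal weight matrix, and the normal CDF is evaluated with the standard library (for extreme arguments a numerically safer log-CDF such as `scipy.special.log_ndtr` would be preferable).

```python
import numpy as np
from statistics import NormalDist

_norm_cdf = np.vectorize(NormalDist().cdf)  # elementwise standard normal CDF

def log_likelihood(beta, delta, gamma, rho, sigma2, y, X, Z, W):
    """Log-likelihood of the SAR stochastic frontier model, following Equation (A8)."""
    n_obs = y.shape[0]
    sigma = np.sqrt(sigma2)
    eps = y - X @ beta - rho * (W @ y)      # composed error v - u
    mu = Z @ delta                          # mean of the pre-truncation inefficiency
    d = mu / (sigma * np.sqrt(gamma))
    d_star = (mu * (1 - gamma) - eps * gamma) / (sigma * np.sqrt((1 - gamma) * gamma))
    sign, logdet = np.linalg.slogdet(np.eye(n_obs) - rho * W)  # Jacobian term
    if sign <= 0:
        return -np.inf                      # rho outside the admissible range
    return (logdet
            - 0.5 * n_obs * (np.log(sigma2) + np.log(2 * np.pi))
            - 0.5 * np.sum(((mu + eps) / sigma) ** 2)
            - np.sum(np.log(_norm_cdf(d)) - np.log(_norm_cdf(d_star))))

# Hypothetical data: four regions on a ring, one period.
rng = np.random.default_rng(0)
W = np.array([[0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0]])
X = np.column_stack([np.ones(4), rng.normal(size=4)])
Z = np.ones((4, 1))
y = rng.normal(size=4)
ll_example = log_likelihood(np.array([0.1, 0.5]), np.array([0.2]), 0.5, 0.2, 1.0, y, X, Z, W)
print(ll_example)
```

In practice the negative of this function would be passed to a general-purpose optimizer (for example `scipy.optimize.minimize`), with 𝛾 restricted to (0, 1), 𝜎² to positive values, and 𝜌 to its admissible range.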
As Battese and Coelli (1988) proposed, technical efficiency score 𝑇𝐸𝑖𝑡 is measured by the expectation
of exp(−𝑢𝑖𝑡 ) conditional on 𝜀𝑖𝑡 = 𝑣𝑖𝑡 − 𝑢𝑖𝑡 .
Estimates of 𝑇𝐸𝑖𝑡 are obtained by evaluating Equation (18) with the ML estimates.
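The conditional expectation can be evaluated in closed form. The sketch below assumes the standard Battese and Coelli result for a truncated normal inefficiency term, E[exp(−u) | ε] = exp(−μ* + σ*²/2) Φ(μ*/σ* − σ*) / Φ(μ*/σ*), with μ* = (1 − γ)μ − γε and σ*² = γ(1 − γ)σ²; the function name and the numerical inputs are hypothetical, and in the spatial model ε is the residual after removing the spatial lag term.

```python
from math import exp, sqrt
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF

def technical_efficiency(eps, mu, gamma, sigma2):
    """E[exp(-u) | eps] when u ~ N+(mu, gamma*sigma2) and v ~ N(0, (1-gamma)*sigma2)."""
    sigma_star = sqrt(gamma * (1.0 - gamma) * sigma2)
    mu_star = (1.0 - gamma) * mu - gamma * eps
    ratio = _PHI(mu_star / sigma_star - sigma_star) / _PHI(mu_star / sigma_star)
    return exp(-mu_star + 0.5 * sigma_star ** 2) * ratio

# Illustrative values only: composed residual, z'delta, gamma, sigma^2.
te = technical_efficiency(eps=-0.3, mu=0.2, gamma=0.6, sigma2=0.5)
print(te)
```

A more negative residual yields a lower score, since it makes a large inefficiency term more likely.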
In the above, 𝑦𝑖𝑡 , 𝐿𝑖𝑡 , and 𝐾𝑖𝑡 are the output, labor input, and capital input, respectively. 𝛽𝑙 and 𝛽𝑘 are the
unknown parameters of labor and capital to be estimated, respectively. As in Glass et al. (2016) and Ramajo and
Hewings (2018), we assume Hicks-neutral technical change and add a linear time trend variable 𝑡 and its square to
Equation (19) (𝑡 equals 0 in the benchmark year 2002 and increases by 1 for each year).
The output is the value added (million yen) in manufacturing establishments with 30 or more employees
from the Census of Manufacture by the Ministry of Economy, Trade and Industry. Labor input is the number of
workers multiplied by working hours per capita. The number of employees in manufacturing establishments with
30 or more employees is taken from the Industrial Statistical Survey. The working hours are monthly average
total hours worked per capita by regular employees of manufacturing establishments with 30 or more employees;
they are taken from the Monthly Labor Survey (Regional Survey) by the Ministry of Health, Labor and Welfare.
Capital input is the “value of tangible fixed assets other than land” (million yen) in manufacturing establishments
with 30 or more employees from the Census of Manufacture.4
It is natural to think that spatial interdependence, such as externalities, diminishes with geographical
distance.5 In fact, in many urban economic studies, the property of knowledge spillover decaying with distance
4
The index variables of monetary value are not nominal, but real. We apply a chain-linked deflator for the
manufacturing industry to the value added and for private enterprise equipment to the value of tangible fixed
assets. These deflators are from the “National Accounts for 2015” (System of National Accounts 2008,
benchmark year = 2011)
5
As defined in the model section, our proposed model allows for a time-variant spatial weight matrix.
Stakhovych and Bijmolt (2009) have divided the specification of weights matrices into: (1) treating weights
matrices as completely exogenous constructs, (2) letting the data determine them, and (3) estimating them. In
fact, some studies define weight matrices using the strength of economic relations, as in (2). However, LeSage and
Pace (2011) argued that weight matrices should be exogenous, and they recommend using geographical
distance. Therefore, in this application, we use a time-invariant weight matrix based on geographical distance,
which ensures exogeneity.
6
Theoretically, the proposed model can reduce omitted-variable bias by introducing appropriate determinants
of technical inefficiency. Since we adopted variables that are statistically significant, we believe that our
specification of the determinants of technical inefficiency allows us successfully to remove omitted-variable
bias. However, our specification search is limited by availability of data. Thus, the possibility of
misspecification cannot be completely ruled out.
The proposed spatial autoregressive stochastic frontier model for panel data that incorporates a model
of technical inefficiency (hereinafter, SSFTE) nests many existing spatial and non-spatial stochastic frontier
models. Therefore, in addition to the proposed model, we estimate several models with constraints on parameters.
First, the model with 𝜌 = 𝛾 = 0 and 𝜹 = 𝟎 is a linear regression model. Second, the model with 𝛾 = 0 and 𝜹 =
𝟎 is a SAR regression model. Third, the model with 𝜌 = 0 and 𝜹 = 𝟎 is a non-spatial stochastic frontier model
with a half-normal distribution proposed by Aigner et al. (1977) (hereinafter, ALS). Fourth, the model with 𝜹 = 𝟎
is a spatial stochastic frontier model with a SAR structure and a half-normal distribution proposed by Glass et al.
(2016) (hereinafter, GKS). Fifth, the model with 𝜌 = 0 is a non-spatial stochastic frontier model that incorporates
the model of technical inefficiency proposed by Battese and Coelli (1995) (hereinafter, BC95).
4. Estimation results
Table 3 shows the estimation results. In the models with a spatial lag, the coefficient 𝜌 of the spatial lag is
statistically significant at the 1% significance level, with a positive sign. This indicates that production activities
of the Japanese manufacturing industry are spatially dependent and have mutually positive externality effects.
Thus, the Japanese government’s industrial cluster policy is supported. The magnitude of the coefficient varies
depending on the models. The coefficient in SSFTE is smaller than that in the other models (𝜌 = 0.3129 in SAR
and 𝜌 = 0.3329 in GKS, whereas 𝜌 = 0.2115 in SSFTE). In models that do not consider determinants of technical
inefficiency, 𝜌 is considered to be overestimated because 𝜌 absorbs some of the heteroskedasticity of technical
inefficiency. This indicates that consideration of the determinants of technical inefficiency is also important in
spatial models.
Table 4 shows the labor elasticity of production, capital elasticity of production, degree of returns to
scale, and average annual rate of the Hicks-neutral technical change, which are calculated from the estimation
results. In the model with the spatial lag term, these values vary over observations, so their maximum, minimum,
and average values are displayed. The average values are equivalent to the summary statistics of the direct effect
in LeSage and Pace (2009). The degree of returns to scale is the sum of labor elasticity and capital elasticity of
production. The degree of returns to scale that is greater (less) than 1 indicates increasing (decreasing) returns to
scale technology.
Labor coefficient and labor elasticity in the models with spatial lag are lower than those in models
without spatial lag. This suggests that the labor elasticity value in the model without spatial lag is overestimated,
as labor input correlates with spatial spillover effects, including externalities. The estimated degrees of returns
to scale indicate economies of scale not only in the linear regression and SAR models but also in the stochastic
frontier models without a model of technical inefficiency. For example, the estimates by ALS and GKS are 1.08 and
1.15, respectively, indicating
increasing returns to scale. However, BC95 and SSFTE show almost constant returns to scale technology, as their
estimates of the degree of returns to scale are 1.005 and 1.01, respectively. This suggests that the coefficients of
input quantities in such models that ignore the determinants of technical inefficiency are overestimated because
of the correlation between the determinants of technical inefficiency and input amount, especially capital input.
As Table 3 shows, in all models, the coefficient of the time trend in the PPF is statistically significant
at the 5% significance level and its sign is positive. The sign of the coefficient of the squared time trend is negative,
but insignificant for all the models except BC95. The results indicate that the PPF shifts upward through technical
change during the analysis period. The annual rate of the Hicks-neutral technical change is positively constant in
all models except BC95. Looking at the average annual rate of the Hicks-neutral technical change in Table 4, the
rate in models that consider spatial dependence (SAR, GKS and SSFTE) is lower than that in the models that do
not take it into consideration.
In both BC95 and SSFTE, the coefficient on population density is significantly negative at the 1%
significance level and the sign of the coefficient of the square term of population density is positive. In BC95, the
latter is statistically significant at the 10% significance level, while it is not significant in the case of SSFTE.
Overall, this implies that an increase in population density raises technical efficiency within the range of the dataset.
       OLS                   SAR                   ALS                   GKS                   BC95                  SSFTE
       Coef.       z-stat    Coef.       z-stat    Coef.       z-stat    Coef.       z-stat    Coef.       z-stat    Coef.       z-stat
𝛼      -3.6458***  (-18.95)  -7.0903***  (-16.89)  -3.1177***  (-13.66)  -6.5592***  (-16.15)  -1.2218***  (-5.24)   -3.9661***  (-7.95)
𝛽𝑙     0.5987***   (18.31)   0.5100***   (15.89)   0.5483***   (16.10)   0.4180***   (13.44)   0.5396***   (19.05)   0.4654***   (14.88)
𝛽𝑘     0.5501***   (17.95)   0.5943***   (20.67)   0.5854***   (19.17)   0.6613***   (25.63)   0.4651***   (15.91)   0.5382***   (16.25)
𝛽𝑡     0.0327***   (3.75)    0.0175**    (2.17)    0.0322***   (4.23)    0.0151**    (1.99)    0.0369***   (4.87)    0.0239***   (3.28)
𝛽𝑡²    -0.0010     (-1.39)   -0.0001     (-0.22)   -0.0010     (-1.64)   0.0000      (0.00)    -0.0014**   (-2.28)   -0.0006     (-1.04)
Note: Model of TE: model of technical inefficiency; OLS: Linear Regression; SAR: spatial autoregressive model;
ALS: non-spatial stochastic frontier model with a half-normal distribution proposed by Aigner et al. (1977); GKS:
spatial stochastic frontier model with a SAR structure and a half-normal distribution proposed by Glass et al.
(2016); BC95: non-spatial stochastic frontier model that incorporates a model of technical inefficiency proposed
by Battese and Coelli (1995); SSFTE: the proposed model; Hicks-neutral technical change rate is the mean annual
technical progress rate calculated based on the estimated time trends.
In BC95 and SSFTE, the coefficients of working hours and working hours squared are statistically
significant at the 1% significance level. The per capita working time that maximizes technical efficiency is 167.1
hours for BC95 and 167.2 hours for SSFTE. This result is thus robust to the model specification. Coefficients of
the large-scale business establishment ratio are significantly negative at the 1% significance level in BC95 and
SSFTE. This suggests that economies of scale are present at establishment level.
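The efficiency-maximizing working time follows from the quadratic form of the inefficiency model. As a sketch, suppose 𝜇𝑖𝑡 contains the terms δ_h h_it + δ_hh h_it², where h_it denotes working hours; the symbols δ_h and δ_hh are illustrative names for the two coefficients, not the paper's notation, and all other determinants are held fixed. Then:

```latex
\frac{\partial \mu_{it}}{\partial h_{it}} = \delta_h + 2\delta_{hh} h_{it} = 0
\quad\Longrightarrow\quad
h^{*} = -\frac{\delta_h}{2\delta_{hh}}, \qquad \delta_{hh} > 0 .
```

With δ_hh > 0 the stationary point minimizes the mean of the inefficiency term, which is the sense in which a single working time maximizes technical efficiency.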
Table 5 shows the results of testing the several nested models against the proposed model using the likelihood
ratio (LR) test.7 The null hypothesis of no spatial lag (i.e., 𝐻0 : 𝜌 = 0) is rejected at the 1% significance level; thus,
the model with the spatial lag is supported empirically. The null hypothesis of no determinants of technical
inefficiency (i.e., 𝐻0 : 𝜹 = 𝟎) and the null hypothesis of no spatial lag and no determinants of technical inefficiency
(i.e., 𝐻0 : 𝜌 = 0 and 𝜹 = 𝟎) are both rejected at the 1% significance level; thus, modeling the determinants of
technical inefficiency is statistically supported. The null hypothesis of no technical inefficiency (i.e., 𝐻0 : 𝛾 = 0 and
𝜹 = 𝟎) is rejected at the 1% significance level. This supports the composed error structure peculiar to the stochastic
frontier model. The null hypothesis of no spatial lag, no determinants of technical inefficiency, and no technical
inefficiency (i.e., 𝐻0 : 𝜌 = 𝛾 = 0 and 𝜹 = 𝟎) is also rejected at the 1% significance level. As the LR tests reject all
the existing nested models, SSFTE is preferable.
7
The LR test statistic is defined as 𝐿𝑅𝜆 = −2{𝐿𝐿[𝐻0 ] − 𝐿𝐿[𝐻1 ]}, where 𝐿𝐿[𝐻1 ] and 𝐿𝐿[𝐻0 ] are the maximized
values of the log-likelihood function under 𝐻1 and 𝐻0 , respectively. This test statistic asymptotically follows the
chi-square distribution with degrees of freedom equal to the number of constraints.
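The LR test can be sketched as follows, using the non-negative form of the statistic, 2(𝐿𝐿[𝐻1] − 𝐿𝐿[𝐻0]). This is an illustration only: the log-likelihood values below are hypothetical and are not taken from Table 5, and the chi-square tail probability is computed with a simple series for the incomplete gamma function, so very small p-values are not reported precisely.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability of the chi-square distribution with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term, total, n = 1.0, 1.0, 1
    while abs(term) > 1e-16 * abs(total) and n < 100000:
        term *= z / (a + n)
        total += term
        n += 1
    log_p = a * math.log(z) - z - math.lgamma(a + 1.0) + math.log(total)
    return max(0.0, 1.0 - math.exp(log_p))

def lr_test(ll_h0, ll_h1, n_constraints):
    """Return the LR statistic and its asymptotic chi-square p-value."""
    stat = -2.0 * (ll_h0 - ll_h1)
    return stat, chi2_sf(stat, n_constraints)

# Hypothetical maximized log-likelihoods: restricted model vs. unrestricted model.
stat, p_value = lr_test(ll_h0=-120.0, ll_h1=-110.0, n_constraints=2)
print(stat, p_value)
```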
Next, we compare the technical efficiency score (TE score) in the several models. Although there are
several definitions of the TE score, in order to compare the effects of the estimation models, we unify them by
defining them as in Equation (18). Figure 1 shows histograms of the TE scores for the 611 observations. The averages
of the TE scores in SSFTE, BC95, GKS, and ALS are 0.8047, 0.7530, 0.8494, and 0.8348, respectively. In the
models that consider the determinants of technical inefficiency, the distribution of the TE scores is more dispersed.
By considering the spatial dependence, that is, removing the constraint 𝜌 = 0, we find that the TE scores tend to
move closer to 1.
Table 6 Spearman’s rank correlation coefficient (SRCC) and maximum rank difference ratio (MRDR)
Mean SRCC
        SSFTE   BC95    GKS     ALS
SSFTE   1.000   0.980   0.814   0.798
BC95            1.000   0.789   0.720
GKS                     1.000   0.936
ALS                             1.000
Mean MRDR
        SSFTE   BC95    GKS     ALS
SSFTE   0.000   0.177   0.512   0.604
BC95            0.000   0.506   0.622
GKS                     0.000   0.383
ALS                             0.000
Note: ALS: non-spatial stochastic frontier model with half-normal distribution proposed by Aigner et al. (1977);
GKS: spatial stochastic frontier model with a SAR structure and a half-normal distribution proposed by Glass et
al. (2016); BC95: non-spatial stochastic frontier model that incorporates a model of technical inefficiency proposed
by Battese and Coelli (1995); SSFTE: the proposed model.
Figure 2 compares the TE score in each model. For example, in the upper left diagram, the horizontal
axis represents the TE scores in SSFTE and the vertical axis represents the TE scores in BC95. If the relative rank
is the same, the points will be on one line. In addition, Table 6 shows the mean of the Spearman’s rank correlation
coefficient (SRCC) matrix and the mean of the maximum rank difference ratio (MRDR) during the sample period.
Let the TE score ranking of the 𝑖th producer in period 𝑡 in model 𝐾 be $R_{it}^{K}$; then, the MRDR of models A and B
is defined as follows:
$$MRDR_{ABt} := \frac{\max_{i}\left|R_{it}^{A} - R_{it}^{B}\right|}{N}. \tag{21}$$
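Both rank statistics can be computed for one period as sketched below. The TE scores are hypothetical, rank 1 is assigned to the highest score, and ties receive the average rank (a convention assumed here); Table 6 reports the means of these period-level statistics over the sample period.

```python
def ranks(scores):
    """Rank 1 = highest score; tied scores share the average of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result

def spearman(scores_a, scores_b):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    ra, rb = ranks(scores_a), ranks(scores_b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5

def mrdr(scores_a, scores_b):
    """Maximum rank difference ratio of Equation (21): max_i |R_A - R_B| / N."""
    ra, rb = ranks(scores_a), ranks(scores_b)
    return max(abs(x - y) for x, y in zip(ra, rb)) / len(ra)

# Hypothetical TE scores of five producers under two models in one period.
te_model_a = [0.90, 0.80, 0.70, 0.60, 0.50]
te_model_b = [0.85, 0.88, 0.65, 0.40, 0.55]
print(spearman(te_model_a, te_model_b), mrdr(te_model_a, te_model_b))
```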
As expected, there are positive correlations between the TE score ranking in all models. However, this
ranking changes significantly between models that use variables explaining the determinants of technical
inefficiency (SSFTE and BC95) and models that do not use those variables (GKS and ALS). As the variables
describing the determinants of technical inefficiency are statistically significant, the estimation considering
determinants of technical inefficiency is important. The presence or absence of the spatial lag does not lead to
dramatic change in rank order. The mean SRCC of GKS and ALS is 0.936 and that of SSFTE and BC95 is 0.980.
Since SSFTE and BC95 specify the determinants of technical inefficiency, the TE score ranking is similar, but the
mean MRDR of these models is 0.177, which means that there is a difference of up to 17.7% in rank order on
average. This indicates that the TE score ranking varies depending on the presence of the spatial lag. Considering
both these and the statistical test results, it is clear that the introduction of the spatial lag is significant.
Figure 3 shows the regional mean of the TE scores. Because the overall mean and the shape of the distribution of
the scores differ across models, we map them using six quantile classes.
Considering the discussion so far, TE scores are different depending on the presence or absence of spatial lag, as
well as determinants of technical inefficiency.
To clarify the influence of the spatial lag term, Figure 4 shows the difference in the rank of prefectural TE
scores averaged over 2002–2014 (rank in SSFTE minus that in BC95). We consider the area around Aichi
Prefecture, where the automobile industry is concentrated and the value added is the largest. The TE score rankings
of the prefectures around Aichi Prefecture, such as Gifu Prefecture and Mie Prefecture, are lower in SSFTE than in
BC95. This tendency also applies to the surroundings of Kanagawa Prefecture, where the value added is the third
largest. In these areas, the TE scores are considered to decrease because positive spatial spillover effects shift the
PPF upward. On the other hand, areas far away from these high-value-added prefectures, such as the area around
Fukuoka Prefecture, are less affected by the positive spatial spillover effects, and their PPF is low. Thus, in these
areas, the ranking of the TE score is higher.8
5. Conclusions
We developed a spatial stochastic frontier model with the SAR term and the feature of Battese and Coelli’s (1995)
model, which simultaneously estimates the determinants of technical inefficiency. Then, we conducted empirical
analysis using data on the Japanese manufacturing industry. Statistical tests support the proposed model. We found
that production activities of the Japanese manufacturing industry are spatially dependent and produce mutually
positive externality effects. This implies that the Japanese government’s industrial cluster policy is justified. Our
findings suggest that the estimates, such as labor elasticity, capital elasticity, and spatial dependence, in the existing
spatial and non-spatial models are biased because of a lack of technical inefficiency determinants and the spatial
lag. This bias also affected the TE score and its ranking.
In particular, it is a significant conclusion that the scale parameter of spatial dependence, 𝜌, in models
without determinants of technical inefficiency is overestimated because 𝜌 absorbs some of the heteroskedasticity
of technical inefficiency. This finding is important because overestimation of 𝜌 indicates overestimates of spatial
spillover effects such as externalities and can lead to erroneous policy judgment. Thus, in this respect, our model
is superior to existing models, because it can measure spatial spillover while controlling for the heteroskedasticity
of technical inefficiency.
Using the proposed model, we can statistically test whether there is spatial dependence as well as
whether the determinants of technical inefficiency are necessary. If the test supports spatial independence, the
existing non-spatial stochastic frontier models such as BC95 can be used. If the test supports the idea that the
determinants of technical inefficiency are not required, an existing spatial stochastic frontier model such as GKS
can be used. There is no compelling reason to choose, a priori, a model that ignores either spatial dependence or
the determinants of technical inefficiency.
Our model has some extensibility. We can easily introduce a spatial lag of explanatory variables into
our model.9 Since these added variables are all exogenous, we can estimate this model directly using our estimation
method. In addition, we can potentially introduce a spatial error model (SEM) structure into the error term of our
model, which can address spatial dependence in the error term. Furthermore, by adding another weight matrix, we
can extend our model to higher-order spatial econometric models (Lacombe, 2004; Elhorst et al., 2012).
Recently, there have been many studies on how to deal with endogenous explanatory variables in
stochastic frontier models (Kutlu, 2010; Amsler et al., 2016, 2017; Karakaplan and Kutlu, 2017; Tsukamoto,
2018). However, it is difficult to apply such methods to spatial dependence models, including our model. This is
because there is an enormous number of variables and parameters in the reduced-form equations under the
spatially dependent situation. This remains a future research task.
In this study, we proposed a useful spatial stochastic frontier model, but several challenges and
possible extensions remain from the viewpoint of application. First, we confirmed the existence of spatial
dependence by using prefectural data because it is important from a policy standpoint to show spatial dependence
across prefectures. Our proposed model also allows various other analyses on spatial spillovers. For example, if
an analyst is interested in interdependence relationships among firms, that analyst may obtain new findings by
conducting firm-level analysis using the proposed model. Second, we used geographical distance for the spatial
weight matrix; however, by creating a spatial weight matrix based on economic distance calculated from
input-output tables, it is possible to conduct an analysis that considers technological proximity (Dietzenbacher et
al., 2005; Yamada and Kawakami, 2016). As described above, there is room for further empirical studies. The proposed
model is expected to be applied to empirical analysis in many fields, including regional science and productivity
analysis.
8
These tendencies are robust throughout the period.
9
In the field of spatial econometrics, a model that adds both spatial lag of explained variables and spatial lag of
explanatory variables is called a spatial Durbin model. Therefore, this extended model can be called a spatial
Durbin stochastic frontier model that incorporates a model of technical inefficiency.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-
profit sectors.
References
Adetutu, M., Glass, A.J., Kenjegalieva, K., Sickles, R.C., 2015. The effects of efficiency and TFP growth on
pollution in Europe: a multistage spatial analysis. Journal of Productivity Analysis 43(3), 307–326.
https://doi.org/10.1007/s11123-014-0426-7
Affuso, E., 2010. Spatial autoregressive stochastic frontier analysis: an application to an impact evaluation study.
Auburn University Working Papers. https://dx.doi.org/10.2139/ssrn.1740382
Aigner, D., Lovell, C.K., Schmidt, P., 1977. Formulation and estimation of stochastic frontier production
function models. Journal of Econometrics 6(1), 21–37. https://doi.org/10.1016/0304-4076(77)90052-5
Amsler, C., Prokhorov, A., Schmidt, P., 2016. Endogeneity in Stochastic Frontier Models. Journal of
Econometrics 190(2), 280–288. https://doi.org/10.1016/j.jeconom.2015.06.013
Amsler, C., Prokhorov, A., Schmidt, P., 2017. Endogenous environmental variables in Stochastic Frontier
Models. Journal of Econometrics 199(2), 131–140. https://doi.org/10.1016/j.jeconom.2017.05.005
Anselin, L., 1988. Spatial Econometrics: Methods and Models. Dordrecht: Kluwer.
Arbia, G., 2014. A primer for spatial econometrics: with applications in R. Berlin: Springer.
Battese, G.E., Coelli, T.J., 1988. Prediction of firm-level technical efficiencies with a generalized frontier
production function and panel data. Journal of Econometrics 38(3), 387–399.
https://doi.org/10.1016/0304-4076(88)90053-X
Battese, G.E., Coelli, T.J., 1992. Frontier production functions, technical efficiency and panel data: with
application to paddy farmers in India. Journal of Productivity Analysis 3(1–2), 153–169.
https://doi.org/10.1007/BF00158774
Battese, G.E., Coelli, T.J., 1995. A model for technical inefficiency effects in a stochastic frontier production
function for panel data. Empirical Economics 20(2), 325–332. https://doi.org/10.1007/BF01205442
Caudill, S.B., Ford, J.M., 1993. Biases in frontier estimation due to heteroscedasticity. Economics Letters 41(1),
17–20. https://doi.org/10.1016/0165-1765(93)90104-K
Caudill, S.B., Ford, J.M., Gropper, D.M., 1995. Frontier estimation and firm-specific inefficiency measures in
the presence of heteroscedasticity. Journal of Business & Economic Statistics 13(1), 105–111.
https://doi.org/10.2307/1392525
Dietzenbacher, E., Romero Luna, I., Bosma, N.S., 2005. Using average propagation lengths to identify
production chains in the Andalusian economy. Estudios de Economía Aplicada 23(2), 405–422.
Druska, V., Horrace, W.C., 2004. Generalized moments estimation for spatial panel data: Indonesian rice
farming. American Journal of Agricultural Economics 86(1), 185–198.
http://www.jstor.org/stable/3697883
Elhorst, J.P., 2010. Applied spatial econometrics: raising the bar. Spatial Economic Analysis 5(1), 9–28.
https://doi.org/10.1080/17421770903541772
Elhorst, J.P., 2014. Spatial Econometrics from Cross-Sectional Data to Spatial Panels. Heidelberg: Springer.
Elhorst, J.P., Lacombe, D.J., Piras, G., 2012. On model specification and parameter space definitions in higher
order spatial econometric models. Regional Science and Urban Economics 42(1–2), 211–220.
https://doi.org/10.1016/j.regsciurbeco.2011.09.003
Fingleton, B., López-Bazo, E., 2006. Empirical growth models with spatial effects. Papers in Regional Science
85(2), 177–198. https://doi.org/10.1111/j.1435-5957.2006.00074.x
Fries, S., Taci, A., 2005. Cost efficiency of banks in transition: evidence from 289 banks in 15 post-communist
countries. Journal of Banking & Finance 29(1), 55–81. https://doi.org/10.1016/j.jbankfin.2004.06.016
Fusco, E., Vidoli, F., 2013. Spatial stochastic frontier models: controlling spatial global and local heterogeneity.
International Review of Applied Economics 27(5), 679–694.
https://doi.org/10.1080/02692171.2013.804493
Glass, A.J., Kenjegalieva, K., Sickles, R.C., 2016. A spatial autoregressive stochastic frontier model for panel
data with asymmetric efficiency spillovers. Journal of Econometrics 190(2), 289–300.
https://doi.org/10.1016/j.jeconom.2015.06.011
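$$f_v(v_{it}) = \frac{1}{\sqrt{2\pi\sigma_v^2}}\exp\left(-\frac{v_{it}^2}{2\sigma_v^2}\right), \tag{A1}$$
$$f_u(u_{it}) = \frac{1}{\sqrt{2\pi\sigma_u^2}\cdot\Phi\left(\frac{\mu_{it}}{\sigma_u}\right)}\exp\left(-\frac{(u_{it}-\mu_{it})^2}{2\sigma_u^2}\right), \quad u_{it} \geq 0. \tag{A2}$$
$$f_{\boldsymbol{\varepsilon}}(\boldsymbol{\varepsilon}) = \prod_{i=1}^{N}\prod_{t=1}^{T_i}\left[\frac{1}{\sigma}\frac{1}{\sqrt{2\pi}}\cdot\exp\left\{-\frac{1}{2}\left(\frac{\mu_{it}+\varepsilon_{it}}{\sigma}\right)^2\right\}\cdot\Phi\left(\frac{\mu_{it}(1-\gamma)-\varepsilon_{it}\gamma}{\sigma\sqrt{(1-\gamma)\gamma}}\right)\cdot\left(\Phi\left(\frac{\mu_{it}}{\sigma\sqrt{\gamma}}\right)\right)^{-1}\right]. \tag{A5}$$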
Since 𝑑𝜺/𝑑𝒚 = (𝑰𝑁𝑇 − 𝜌𝑾), the joint probability density function of 𝒚 = {𝑦𝑖𝑡 } is
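$$\begin{aligned}
LL(\boldsymbol{\beta}, \boldsymbol{\delta}, \gamma, \rho, \sigma^2; \mathbf{y}) ={}& \ln|\mathbf{I}_{NT} - \rho\mathbf{W}| - \frac{1}{2}\left(\sum_{i=1}^{N} T_i\right)\left[\ln\sigma^2 + \ln 2\pi\right] \\
& - \frac{1}{2}\sum_{i=1}^{N}\sum_{t=1}^{T_i}\left(\frac{\mathbf{z}_{it}'\boldsymbol{\delta} + y_{it} - \mathbf{x}_{it}'\boldsymbol{\beta} - \rho\sum_{j=1}^{N} w_{ij}^{t} y_{jt}}{\sigma}\right)^2 \\
& - \sum_{i=1}^{N}\sum_{t=1}^{T_i}\left[\ln\Phi\left(\frac{\mathbf{z}_{it}'\boldsymbol{\delta}}{\sigma\sqrt{\gamma}}\right) - \ln\Phi\left(\frac{\mathbf{z}_{it}'\boldsymbol{\delta}(1-\gamma) - \left(y_{it} - \mathbf{x}_{it}'\boldsymbol{\beta} - \rho\sum_{j=1}^{N} w_{ij}^{t} y_{jt}\right)\gamma}{\sigma\sqrt{(1-\gamma)\gamma}}\right)\right].
\end{aligned} \tag{A8}$$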
Fig. 4 Difference in rank of prefectural TE scores averaged over 2002–2014 (rank in SSFTE minus rank in BC95). Prefectures labeled on the map: Chiba, Tokyo, Fukuoka, Kanagawa, Aichi, and Mie.