$$y = \beta'x + v - u,$$
where y is the observed outcome (goal attainment), $\beta'x + v$ is the optimal, frontier goal (e.g.,
maximal production output or minimum cost) pursued by the individual, $\beta'x$ is the deterministic part
of the frontier and $v \sim N[0,\sigma_v^2]$ is the stochastic part. The two parts together constitute the
"stochastic frontier." The amount by which the observed individual fails to reach the optimum (the
frontier) is u, where
$$u = |U| \text{ and } U \sim N[0,\sigma_u^2]$$
(change to v + u for a stochastic cost frontier or any setting in which the optimum is a minimum). In
this context, u is the "inefficiency." This is the normal-half normal model, which is the basic
form of the stochastic frontier model.
Many varieties of the stochastic frontier model have appeared in the literature. A major
survey that presents an extensive catalog of these formulations is Kumbhakar and Lovell (2000).
(See, as well, Bauer (1990), Greene (2008) and several other surveys, many of which are cited in
Kumbhakar and Lovell and in Greene.) The estimator in LIMDEP computes parameter estimates for
most single equation cross section and panel data variants of the stochastic frontier model.
A large number of variants of the stochastic frontier model, based on different assumptions
about the distribution of the "inefficiency" term, u, have been proposed in the received literature.
Most of these are available in LIMDEP, as suggested in the list below. The bulk of the received
technology centers on cross section style modeling. However, recent advances include many
extensions that take advantage of the features of panel data. A large array of panel data estimators
is also supported by LIMDEP.
E62: Stochastic Frontier Models and Efficiency Analysis E-2
In this area of study, unlike most others, estimation of the model parameters is usually not the
primary objective. Estimation and analysis of the inefficiency of individuals in the sample and of the
aggregated sample are usually of greater interest. This part of the development will present tools for
estimation of inefficiency.
Typically, the production or cost model is based on a Cobb-Douglas, translog, or other form
of logarithmic model, so that the essential form is
$$\log y = \beta'x + v - u$$
where the components of x are generally logs of inputs for a production model or logs of output and
input prices for a cost model, or their squares and/or cross products. In this form, then, at least for
relatively small variation, u represents the proportion by which y falls short of the goal, and has a
natural interpretation as proportional or percentage inefficiency. The numerous examples below will
demonstrate. Users are also referred to the various survey sources listed earlier.
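The remark about "relatively small variation" can be made concrete with a few lines of arithmetic. The sketch below is written in Python purely for illustration (it is not LIMDEP syntax), and the values of u are hypothetical. Because the model is in logs, observed y equals the frontier value times exp(-u), so the exact proportional shortfall is 1 - exp(-u), which is close to u only when u is small.

```python
import math

def proportional_shortfall(u):
    """Exact share by which y falls short of the frontier when log y = frontier - u."""
    return 1.0 - math.exp(-u)

# Hypothetical values of u, chosen only to show where the approximation holds.
for u in (0.01, 0.05, 0.10, 0.50):
    exact = proportional_shortfall(u)
    print(f"u = {u:.2f}  exact shortfall = {exact:.4f}  gap from u = {u - exact:.4f}")
```

For u near 0.05 the gap is in the third decimal place; for u = 0.5 the percentage interpretation is visibly too large.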
The results one obtains are, of course, critically dependent on the model assumed. Thus,
specification and estimation of model parameters, while perhaps of secondary interest, are
nonetheless a major first step in the model building process. In nearly all received formulations, the
random component, v, is assumed to be normally distributed with zero mean. In some models, v may
be heteroscedastic. But, in either form, the large majority of the different frontier models that have
been proposed result from variations on the distribution of the inefficiency term, u. The range of
specifications examined in this chapter includes the following:
NOTE: One must be the first variable in the Rhs list in all model specifications.
The default specification is Aigner, Lovell and Schmidt's canonical normal-half normal model. The
default form is a production frontier model,
$$y = \beta'x + v - u, \quad u = |U|.$$
That is, the right hand side of the equation specifies the maximum goal attainable. To specify a cost
frontier model or other model in which the frontier represents a minimum, so that
$$y = \beta'x + v + u, \quad u = |U|,$$
use
; Cost
This specification is used in all forms of the stochastic frontier model. As noted below, one
additional specification you may find useful is
(The meanings of the parameters are developed below.) ALS also developed the normal-exponential
model, in which u has an exponential distribution rather than a half normal distribution. To request
the exponential model, use
in the FRONTIER command. For this model, the parameters are $(\beta,\theta,\sigma_v)$. Further details appear
below. There are also several model forms, and numerous modifications such as heteroscedasticity
that are developed below.
This is the full list of general specifications that are applicable to this model estimator.
; Covariance Matrix displays estimated asymptotic covariance matrix (normally not shown),
same as ; Printvc.
; Choice uses choice based sampling (sandwich with weighting) estimated matrix.
; Cluster = spec requests computation of the cluster form of corrected covariance estimator.
$$e_i = y_i - \hat\beta'x_i$$
This residual is usually not of interest in itself. It is, however, the crucial ingredient in the efficiency
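estimator discussed in Section E62.8. The estimator of $u_i$ that we will use is computed by the
Jondrow et al. (JLMS) formula for $E[u|v-u]$, or $E[u|v+u]$ if based on a cost frontier,
$$\hat E[u|\epsilon] = \frac{\sigma\lambda}{1+\lambda^2}\left[\frac{\phi(w)}{1-\Phi(w)} - w\right], \quad \epsilon = v \pm u, \quad w = \epsilon\lambda/\sigma,$$
$$\sigma = \sqrt{\sigma_v^2+\sigma_u^2}, \quad \lambda = \frac{\sigma_u}{\sigma_v},$$
where the sign of $\epsilon$ in w is reversed for a cost frontier.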
In the JLMS formula, ei is the estimator of εi. The formulas and computations are discussed in
Section E62.8.
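As a rough sketch of how the JLMS computation works for the normal-half normal model, the following Python function evaluates the conditional mean of u for a single residual. It is an illustration only, not LIMDEP code: the function names, the `cost` flag and the parameter values in the loop are all hypothetical, and the sign convention for a cost frontier (the residual enters with its sign reversed) is the assumption stated in the comments.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_upper_tail(z):
    """1 - Phi(z), computed with erfc so moderate positive z stays accurate."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def jlms_half_normal(eps, sigma_u, sigma_v, cost=False):
    """Conditional mean of u given the composed error eps (half normal u).

    Assumption: eps = v - u for production; for a cost frontier (eps = v + u)
    the residual enters with its sign reversed. Extreme residuals, for which
    both the density and the tail area underflow, are not handled here.
    """
    sigma = math.sqrt(sigma_u ** 2 + sigma_v ** 2)
    lam = sigma_u / sigma_v
    sign = -1.0 if cost else 1.0
    w = sign * eps * lam / sigma
    return (sigma * lam / (1.0 + lam ** 2)) * (normal_pdf(w) / normal_upper_tail(w) - w)

# Hypothetical parameter values and residuals for a production frontier.
for e in (-0.30, 0.0, 0.30):
    print(e, round(jlms_half_normal(e, 0.14, 0.10), 4))
```

The estimate is always positive and grows as the production residual becomes more negative, which is the behavior one expects from an inefficiency estimator.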
The frontier model is, save for its involved disturbance term, a linear regression model. The
conditional mean in the model is, for the production frontier,
$$E[y_i|x_i] = \beta'x_i - E[u_i|x_i].$$
In most cases, $E[u_i|x_i]$ is not a function of $x_i$, so the derivatives of $E[y_i|x_i]$ with respect to $x_i$ are just $\beta$.
In other cases that we will consider, the conditional mean of $u_i$ does depend on $x_i$ or other variables, so
the partial effects in the model might be more involved than this. Once again, however, these will
usually not be of direct interest in the study. But, in all cases, $\hat E[u|\epsilon]$ will be an involved function of
$x_i$ and any other variables that appear anywhere else in the model. We will examine the partial
effects on the efficiency estimators in Section E62.8.
Use ; Par to add the ancillary parameters to these. The ancillary parameters that are estimated for
the various models are as follows, including the scalars saved by the estimation program:
(The data were analyzed in Greene (2004a,b). Some of the variables, such as popden and gdpc, were
augmented from other sources in these studies.) Although the data are a five year panel – a few
countries were observed for fewer than five years – there is almost no cross year variation in any
variable. (The proportion of total variation that is within groups is less than 1% for the four time
varying variables.) We have created a cross section from these data as follows: First, we discarded the
data on internal political units. We then averaged comp, dale, hexp and educ across the five years. We
retained a sample of 191 cross sectional (country) units. The following command set creates the data set.
SAMPLE ; 1-840 $
REJECT ; small > 0 $
SETPANEL ; Group = country ; Pds = ti $
RENAME ; hc3 = educ $
CREATE ; lpubthe = log(pubthe) $
CREATE ; dalebar = Group Mean(dale, Pds = ti) $
CREATE ; compbar = Group Mean(comp, Pds = ti) $
CREATE ; educbar = Group Mean(educ, Pds = ti) $
CREATE ; hexpbar = Group Mean(hexp, Pds = ti) $
CREATE ; logdbar = Log(dalebar) ; logcbar = Log(compbar) $
CREATE ; logebar = Log(educbar) ; loghbar = Log(hexpbar) $
CREATE ; loghbar2 = loghbar^2 $
REJECT ; year # 1997 $
CALC ; Ran(12345) $
SAMPLE ; 1-500 $
CREATE ; u = Abs(Rnn(0,2))
; v = Rnn(0,1)
; x = Rnn(0,1)
; y = x + v + u $
REGRESS ; Lhs = y ; Rhs = one,x
; Res = e $
FRONTIER ; Lhs = y ; Rhs = one,x $
KERNEL ; Rhs = e $
The CREATE command generates y exactly according to the model, except that u is added rather
than subtracted. Thus, we should expect this model to perform poorly. The estimation results
from the FRONTIER command are shown below. Note the string of warnings. Estimation is
allowed to proceed, but the results are not a "frontier" as such. The final estimate of $\lambda$ is essentially
zero, with a huge standard error, and the reported estimate of $\sigma_u^2$ in the box above the results is
0.0000. The other estimates are, in fact, the same as OLS. The kernel density estimator for the OLS
residuals is clearly skewed in the positive, that is, the wrong direction. Once again, we emphasize,
this is a failure of the data to conform to the model.
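The wrong-skewness experiment can be mimicked outside LIMDEP. The sketch below is an illustrative Python analogue, not a replication: it uses Python's own random number generator, so the draws differ from those produced by Ran(12345) and Rnn, it assumes the second argument of Rnn is a standard deviation, and it uses a larger sample (5000, a choice made here) so the sign of the skewness is stable. All function names are hypothetical.

```python
import random

def ols_residuals(x, y):
    """Residuals from a simple regression of y on a constant and x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    slope = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sum((a - xbar) ** 2 for a in x)
    intercept = ybar - slope * xbar
    return [b - intercept - slope * a for a, b in zip(x, y)]

def skewness(e):
    """Normalized third moment m3 / m2^(3/2)."""
    n = len(e)
    mean = sum(e) / n
    m2 = sum((v - mean) ** 2 for v in e) / n
    m3 = sum((v - mean) ** 3 for v in e) / n
    return m3 / m2 ** 1.5

rng = random.Random(12345)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
v = [rng.gauss(0, 1) for _ in range(n)]
u = [abs(rng.gauss(0, 2)) for _ in range(n)]

# u added (as in the experiment) versus u subtracted (a proper production frontier).
skew_added = skewness(ols_residuals(x, [a + b + c for a, b, c in zip(x, v, u)]))
skew_subtracted = skewness(ols_residuals(x, [a + b - c for a, b, c in zip(x, v, u)]))
print(skew_added, skew_subtracted)
```

The first value is positive, the "wrong" sign for a production frontier, while the second is negative.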
Error 315: Stoch. Frontier: OLS residuals have wrong skew. OLS is MLE.
WARNING! OLS residuals have the wrong skewness for SFM
Other forms of the model models may also behave poorly.
In this case, one MLE for the half normal model is OLS
for beta and sigma and zero for the inefficiency term.
Warning 141: Iterations:current or start estimate of sigma nonpositive
Warning 141: Iterations:current or start estimate of sigma nonpositive
Warning 141: Iterations:current or start estimate of sigma nonpositive
Warning 141: Iterations:current or start estimate of sigma nonpositive
Warning 141: Iterations:current or start estimate of sigma nonpositive
Line search at iteration 30 does not improve fn. Exiting optimization.
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable Y
Log likelihood function -921.33848
Estimation based on N = 500, K = 4
Inf.Cr.AIC = 1850.7 AIC/N = 3.701
Variances: Sigma-squared(v)= 2.33375
Sigma-squared(u)= .00000
Sigma(v) = 1.52766
Sigma(u) = .00000
Sigma = Sqr[(s^2(u)+s^2(v)]= 1.52766
Gamma = sigma(u)^2/sigma^2 = .00000
Stochastic Production Frontier, e = v-u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 -921.33851
Chi-sq=2*[LogL(SF)-LogL(LS)] = .000
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
Y| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 1.61107 165.2912 .01 .9922 -322.35365 325.57580
X| 1.00746*** .07057 14.28 .0000 .86914 1.14578
|Variance parameters for compound error
Lambda| .10897D-05 135.6070 .00 1.0000 -.26578D+03 .26578D+03
Sigma| 1.52766*** .00242 630.99 .0000 1.52292 1.53241
--------+--------------------------------------------------------------------
Unfortunately, the Waldman result is a sufficient condition, not a necessary one. That is, it
has been shown that when the OLS residuals have the "right" skewness, then the MLE for the frontier
model is unique, and you will have no trouble in estimation. When they have the "wrong" skewness,
it is only shown that the OLS results are a local stationary point of the log likelihood, not that they
are the global maximizers. There may be another point that is yet better than OLS. Our airline data
used below provide an example. Consider the following results, where we present both the
stochastic frontier estimates and OLS. (The model, itself, is developed later, so we show only the
useful results here.) As above, we receive the initial warning about the skewness of the OLS
residuals. Then, estimation proceeds and an apparently routine solution emerges that is different
from, and better than (has a higher log likelihood) OLS.
Error 315: Stoch. Frontier: OLS residuals have wrong skew. OLS is MLE.
WARNING! OLS residuals have the wrong skewness for SFM
Other forms of the model models may also behave poorly.
In this case, one MLE for the half normal model is OLS
for beta and sigma and zero for the inefficiency term.
Normal exit: 11 iterations. Status=0, F= -105.0617
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 105.06169
Variances: Sigma-squared(v)= .02411
Sigma-squared(u)= .00457
Sigma(v) = .15527
Sigma(u) = .06757
Stochastic Production Frontier, e = v-u
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -1.05847*** .02333 -45.37 .0000 -1.10419 -1.01274
LF| .38355*** .07045 5.44 .0000 .24547 .52163
LE| .21961*** .07300 3.01 .0026 .07653 .36270
LM| .71667*** .07654 9.36 .0000 .56666 .86668
LL| -.41139*** .06382 -6.45 .0000 -.53647 -.28630
LP| .18973*** .02960 6.41 .0000 .13171 .24775
|Variance parameters for compound error
Lambda| .43515** .20117 2.16 .0305 .04086 .82944
Sigma| .16933*** .00057 295.74 .0000 .16821 .17045
--------+--------------------------------------------------------------------
Ordinary least squares regression ............
Diagnostic Log likelihood = 105.05876
Standard error of e = .16244
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error t |t|>T* Interval
--------+--------------------------------------------------------------------
Constant| -1.11237*** .01015 -109.57 .0000 -1.13227 -1.09247
LF| .38283*** .07116 5.38 .0000 .24335 .52231
LE| .21922*** .07389 2.97 .0033 .07441 .36404
LM| .71924*** .07732 9.30 .0000 .56769 .87078
LL| -.41015*** .06455 -6.35 .0000 -.53665 -.28364
LP| .18802*** .02980 6.31 .0000 .12961 .24643
--------+--------------------------------------------------------------------
There is no simple bulletproof strategy for handling this situation. You can try different
starting values with ; Start = values for $\beta$, $\lambda$, $\sigma$ that differ from OLS, but it is hard to know where
these will come from. Moreover, it is likely that you will end up at OLS anyway. As Waldman
points out, this is a potentially ill behaved log likelihood function. We offer the preceding as a
caution for the practitioner. For the particular data set used here, we can identify a specific culprit.
The "failure" of the model emerges in the presence of the variable lm, and does not occur when lm is
omitted from the equation. We have no theory, however, for why this should be the case. Simply
deleting variables from the model until one emerges that does not have the skewness problem does
not seem like an effective strategy.
We do note, the failure might signal a misspecified model. For example, for our airlines
example, the specification above omits the capital variable. When lk = log(k) is added to the model, we
obtain the following quite routine results (albeit with the wrong signs on capital and labor inputs).
Normal exit: 13 iterations. Status=0, F= -108.4392
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 108.43918
Estimation based on N = 256, K = 9
Inf.Cr.AIC = -198.9 AIC/N = -.777
Variances: Sigma-squared(v)= .01902
Sigma-squared(u)= .01692
Sigma(v) = .13791
Sigma(u) = .13007
Sigma = Sqr[(s^2(u)+s^2(v)]= .18957
Gamma = sigma(u)^2/sigma^2 = .47074
Var[u]/{Var[u]+Var[v]} = .24425
Stochastic Production Frontier, e = v-u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 108.07431
Chi-sq=2*[LogL(SF)-LogL(LS)] = .730
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
| Deterministic Component of Stochastic Frontier Model
Constant| -2.98823*** .72136 -4.14 .0000 -4.40206 -1.57439
LF| .37257*** .07038 5.29 .0000 .23463 .51052
LE| 2.09473*** .68790 3.05 .0023 .74647 3.44299
LM| .69910*** .07580 9.22 .0000 .55054 .84766
LL| -.42909*** .06315 -6.79 .0000 -.55287 -.30530
LP| .44533*** .09498 4.69 .0000 .25917 .63149
LK| -2.09806*** .76556 -2.74 .0061 -3.59853 -.59759
| Variance parameters for compound error
Lambda| .94309*** .16870 5.59 .0000 .61244 1.27373
Sigma| .18957*** .00064 297.81 .0000 .18832 .19082
--------+--------------------------------------------------------------------
We emphasize, the Waldman result, and this particular theoretical outcome, is specific to the
normal-half normal model. However, when it occurs, problems of a similar sort will often, but not
always, show up in other models. Thus, in spite of a warning, your fitted exponential, or panel data
model, may be quite satisfactory.
$$y = \beta'x + v - u, \quad u = |U|$$
in which x contains a constant term and both v and U are homoscedastic and have zero means, i.e., in
the original half normal or exponential models, the OLS estimator of all elements of $\beta$ except the
constant term is consistent. It is convenient to rewrite the model as
$$y = \beta_0 + \beta_1'x_1 + v - u.$$
Under the assumptions, we can write the model as
saves the residuals from the deterministic frontier. These are the estimates of ui. Note in Figure E62.2,
for a cost frontier, all values of ui are positive. If you fit a production frontier, then all points will lie
below the regression and all residuals will be negative. The estimated inefficiency that is saved will be
-ei. Thus, in both cases, the values saved by ; Eff = variable are the positive estimates of the size of
the deviation of the observation from the frontier. The estimator saved by ; Eff = variable name is the
inefficiency estimate, in this model, a direct estimate of ui. The estimator of technical or cost efficiency
is
$$\text{Efficiency} = \exp(-\hat u_i)$$
The following shows computation of a COLS estimator for the airlines. The FRONTIER
command requests both the inefficiency estimates, ui, and the cost efficiency estimates, eui_cost.
The kernel density estimate for the cost efficiency is shown in Figure E62.3. The results for the
estimator begin with the standard output for least squares regression. The second panel includes
some preliminary results for the stochastic frontier model, including the chi squared test for zero
skewness, $\chi^2 = (n/6)(m_3/s^3)^2$ (which is not rejected here; the statistic, 1.943, is below the critical
value of 3.84). The standard normal statistic is the signed (based on $m_3$) square root of $\chi^2$. The third
panel presents descriptive statistics for $u_i$ and $\exp(-u_i)$.
CREATE ; lc = Log(cost/pp)
; lpkp = Log(pk/pp)
; lplp = Log(pl/pp)
; lpmp= Log(pm/pp)
; lpep = Log(pe/pp)
; lpfp = Log(pf/pp) $
CREATE ; lk = Log(k) $
CREATE ; ly = Log(output) ; ly2 = .5*ly*ly $
FRONTIER ; Lhs = lc ; Rhs = one,ly,ly2,lpkp,lplp,lpmp,lpep,lpfp
; Cost ; Model = COLS
; Costeff = Eui_cost ; Eff = ui $
KERNEL ; Rhs = eui_cost
; Title = Estimated Cost Efficiency Based on COLS Estimator $
-----------------------------------------------------------------------------
Corrected OLS Deterministic Frontier Cost Function
LHS=LC Mean = 2.84024
Standard deviation = 1.09256
No. of observations = 256 Degrees of freedom
Regression Sum of Squares = 300.028 7
Residual Sum of Squares = 4.36487 248
Total Sum of Squares = 304.393 255
Standard error of e = .13267
Fit R-squared = .98566 R-bar squared = .98526
Model test F[ 7, 248] = 2435.25310 Prob F > F* = .00000
Diagnostic Log likelihood = 157.91523 Akaike I.C. = -4.00909
Restricted (b=0) = -385.41031 Bayes I.C. = -3.89830
Chi squared [ 7] = 1086.65108 Prob C2 > C2* = .00000
--------------------------------------------------
Skewness test for inefficiency based on residuals
Normalized skewness = m3/s^3 = .21340
Chi squared test (1 degree of freedom) 1.94294 Critical value= 3.84000
Standard normal test statistic 1.39389 Test value = +/- 1.96000
Estimated Efficiency Values Based on e(i)+Min e(i)
--------+-----------------------------------------
| Mean Std.Dev. Minimum Maximum
CostInef| .357 .133 .000 .773
Cost Eff| .706 .091 .462 1.000
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic COLS Frontier Function
Constant| 19.4363 27.45697 .71 .4790 -34.3783 73.2510
LY| .94303*** .01809 52.12 .0000 .90757 .97849
LY2| .08248*** .01236 6.67 .0000 .05825 .10671
LPKP| 1.42385 2.14849 .66 .5075 -2.78711 5.63480
LPLP| .01915 .10169 .19 .8506 -.18016 .21847
LPMP| .04504 1.41721 .03 .9746 -2.73264 2.82272
LPEP| -.57070 .67904 -.84 .4007 -1.90159 .76019
LPFP| -.04811** .01986 -2.42 .0154 -.08704 -.00919
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
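The skewness test in the output above can be reproduced by hand from the reported normalized skewness and the sample size. The Python lines below are an illustrative check, not LIMDEP code; the function name is hypothetical, and the inputs (n = 256, m3/s^3 = .21340) are read from the output.

```python
import math

def skewness_test(n, normalized_skewness):
    """Chi squared statistic (n/6)(m3/s^3)^2 and its signed square root."""
    chi2 = (n / 6.0) * normalized_skewness ** 2
    z = math.copysign(math.sqrt(chi2), normalized_skewness)
    return chi2, z

chi2, z = skewness_test(256, 0.21340)
print(round(chi2, 3), round(z, 3))
print("reject zero skewness at 5%:", chi2 > 3.84)
```

Up to rounding of the reported skewness, the two statistics agree with the values 1.94294 and 1.39389 in the output, and neither exceeds its critical value.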
since v is symmetric. The left hand sides can be consistently estimated using the OLS residuals:
$$m_2 = (1/n)\sum_i e_i^2 \quad \text{and} \quad m_3 = (1/n)\sum_i e_i^3.$$
Both of the functions on the right hand side are known for the half normal and exponential models.
In particular, for the half normal model, the moment equations are
$$m_2 = \sigma_v^2 + [1 - 2/\pi]\sigma_u^2,$$
$$m_3 = (2/\pi)^{1/2}[1 - 4/\pi]\sigma_u^3.$$
The solutions are
$$\hat\sigma_u = \left[\frac{m_3\sqrt{\pi/2}}{1 - 4/\pi}\right]^{1/3} \quad \text{and} \quad \hat\sigma_v = \sqrt{m_2 - (1 - 2/\pi)\hat\sigma_u^2}.$$
Note that there is no solution for $\sigma_u$ if $m_3$ is not negative, which is the problem discussed in Section
E62.5. Assuming that this problem does not arise, the corrected constant term is
$$\hat\beta_0 = a + \text{Est.}E[u] = a + \hat\sigma_u\sqrt{2/\pi}.$$
This is the "modified least squares" (MOLS) estimator that is discussed in a number of sources, such
as Greene (2005). These are the values used for starting values for the MLE, as well. Looking
ahead, note that there is no natural method of moments estimator for the mean parameter in the
truncated normal model discussed in Section E63.3. For this model, we use
$$\hat\mu/\sigma_u = 0.$$
For the normal-exponential model, the moment equations that correspond to the preceding are
$$m_2 = \sigma_v^2 + 1/\theta^2,$$
$$m_3 = -2/\theta^3.$$
Therefore,
$$\hat\theta = \left[\frac{-2}{m_3}\right]^{1/3}, \quad \hat\sigma_v = \sqrt{m_2 - 1/\hat\theta^2} \quad \text{and} \quad \hat\beta_0 = a + 1/\hat\theta.$$
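The method of moments solutions are simple enough to compute directly. The following Python sketch (illustrative only; the function names are hypothetical and this is not how LIMDEP is invoked) solves the two sets of moment equations. The usage lines rebuild m2 and m3 for the airline cost function from the COLS output shown earlier, under two assumptions made here: m2 is the residual sum of squares divided by n, and the reported normalized skewness is m3 divided by m2 to the power 3/2. The sign of m3 is reversed because a cost function is being fit.

```python
import math

def mols_half_normal(m2, m3):
    """Solve the half normal moment equations for (sigma_u, sigma_v, E[u])."""
    if m3 >= 0:
        raise ValueError("no solution: third moment has the wrong sign")
    c = math.sqrt(2.0 / math.pi) * (1.0 - 4.0 / math.pi)  # negative constant
    sigma_u = (m3 / c) ** (1.0 / 3.0)
    var_v = m2 - (1.0 - 2.0 / math.pi) * sigma_u ** 2
    if var_v <= 0:
        raise ValueError("implied variance of v is not positive")
    return sigma_u, math.sqrt(var_v), sigma_u * math.sqrt(2.0 / math.pi)

def mols_exponential(m2, m3):
    """Solve the exponential moment equations for (theta, sigma_v, E[u])."""
    if m3 >= 0:
        raise ValueError("no solution: third moment has the wrong sign")
    theta = (-2.0 / m3) ** (1.0 / 3.0)
    var_v = m2 - 1.0 / theta ** 2
    if var_v <= 0:
        raise ValueError("implied variance of v is not positive")
    return theta, math.sqrt(var_v), 1.0 / theta

# Moments rebuilt from the COLS output (assumptions described above).
m2 = 4.36487 / 256
m3 = -0.21340 * m2 ** 1.5
su, sv, mean_u = mols_half_normal(m2, m3)
print(round(su, 4), round(sv, 4), round(mean_u, 4))
```

Under these assumptions the half normal solutions should land close to the SU and SV values that CALC reports for this example in the results that follow.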
The header information in the results table will display the decomposition of the variance of
the composed error in two parts. In the case of the half normal model,
$$\text{Var}[u] = [(\pi-2)/\pi]\sigma_u^2,$$
not $\sigma_u^2$. Therefore, the estimated parameters might be a bit misleading as to the relative influence of
u on the total variation in the structural disturbance.
We note, these estimators are sometimes quite far from the maximum likelihood estimators,
particularly when the sample is small. But, they are generally quite satisfactory as starting values for
the MLE. The following demonstrates these results for the airline data, where we use MOLS and
MLE to fit a normal-half normal cost frontier. (Note, the signs of the OLS residuals are reversed
because we are fitting a cost function.) In the results below, we have imposed the assumption of
linear homogeneity in prices in the cost function by normalizing the six input prices, pk, pl, pe, pp,
pm, pf, by the property price, pp. The model contains log(pj/pp). To complete the constraint, we
have also normalized total cost by pp before taking logs.
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LC| Coefficient Error t |t|>T* Interval
--------+--------------------------------------------------------------------
Constant| 19.7932 27.45697 .72 .4717 -34.0214 73.6079
LY| .94303*** .01809 52.12 .0000 .90757 .97849
LY2| .08248*** .01236 6.67 .0000 .05825 .10671
LPKP| 1.42385 2.14849 .66 .5081 -2.78711 5.63480
LPLP| .01915 .10169 .19 .8508 -.18016 .21847
LPMP| .04504 1.41721 .03 .9747 -2.73264 2.82272
LPEP| -.57070 .67904 -.84 .4015 -1.90159 .76019
LPFP| -.04811** .01986 -2.42 .0161 -.08704 -.00919
--------+--------------------------------------------------------------------
[CALC] SU = .1296481
[CALC] SV = .1046056
[CALC] A = 19.8966785
[CALC] LAMBDA = 1.2393989
[CALC] SGMA = .1665862
Calculator: Computed 5 scalar results
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LCN
Log likelihood function 159.20743
Estimation based on N = 256, K = 10
Inf.Cr.AIC = -298.4 AIC/N = -1.166
Variances: Sigma-squared(v)= .01021
Sigma-squared(u)= .01890
Sigma(v) = .10103
Sigma(u) = .13746
Sigma = Sqr[(s^2(u)+s^2(v)]= .17059
Gamma = sigma(u)^2/sigma^2 = .64927
Var[u]/{Var[u]+Var[v]} = .40216
Stochastic Cost Frontier Model, e = v+u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 157.91523
Chi-sq=2*[LogL(SF)-LogL(LS)] = 2.584
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LCN| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 19.8020 25.91115 .76 .4447 -30.9829 70.5869
LY| .95577*** .01781 53.68 .0000 .92088 .99067
LY2| .09086*** .01198 7.58 .0000 .06738 .11435
LPKP| 1.43400 2.02750 .71 .4794 -2.53982 5.40783
LPLP| .01242 .09676 .13 .8979 -.17722 .20205
LPMP| .05744 1.33747 .04 .9657 -2.56396 2.67883
LPEP| -.56860 .64356 -.88 .3770 -1.82995 .69275
LPFP| -.06002*** .01993 -3.01 .0026 -.09907 -.02096
|Variance parameters for compound error
Lambda| 1.36059*** .20306 6.70 .0000 .96261 1.75857
Sigma| .17059*** .00058 294.50 .0000 .16946 .17173
--------+--------------------------------------------------------------------
The default form is the normal-half normal model. In this form, model estimates consist of $\beta$,
$\sigma = \sqrt{\sigma_v^2 + \sigma_u^2}$ and $\lambda = \sigma_u/\sigma_v$, and the usual set of diagnostic statistics for models fit by maximum
likelihood. The other basic form in the ALS model is the exponential model,
$$f(u) = \theta\exp(-\theta u), \quad u > 0,$$
which has mean inefficiency $E[u] = 1/\theta$ and standard deviation $\sigma_u = 1/\theta$. The parameters estimated in
the exponential specification are $(\beta,\theta,\sigma_v)$. The estimate of $\sigma_u$ is reported in the results as well.
The following illustrate the estimator, with a normal-half normal cost frontier and a normal-
exponential production frontier. The coefficient estimates for the exponential cost frontier are shown
as well.
The stochastic frontier results include the standard output for MLEs. The derived estimates of $\sigma_u$, $\sigma_v$,
$\sigma_u^2$, $\sigma_v^2$ and $\sigma$ are shown as well. The value of $\gamma = \sigma_u^2/\sigma^2$ is given for comparability with other parts
of the literature. This ratio, which lies in (0,1), is sometimes reported as a variance decomposition of
$\epsilon$. However, the variance of u = |U| is $(1 - 2/\pi)\sigma_u^2$, so the appropriate decomposition is
$(1 - 2/\pi)\sigma_u^2/[\sigma_v^2 + (1 - 2/\pi)\sigma_u^2]$. This is the value shown next under $\gamma$ in the results.
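The two ratios can be recomputed from the variance estimates printed in the header of the normal-half normal cost frontier results. The Python sketch below is illustrative only (the function name is hypothetical); the inputs are the reported Sigma-squared(u) = .01890 and Sigma-squared(v) = .01021.

```python
import math

def variance_shares(sigma_u_sq, sigma_v_sq):
    """Return (gamma, share), where gamma uses sigma_u^2 and share uses Var[u]."""
    gamma = sigma_u_sq / (sigma_u_sq + sigma_v_sq)
    var_u = (1.0 - 2.0 / math.pi) * sigma_u_sq  # variance of the half normal u
    share = var_u / (var_u + sigma_v_sq)
    return gamma, share

gamma, share = variance_shares(0.01890, 0.01021)
print(round(gamma, 4), round(share, 4))
```

Up to rounding of the inputs, these reproduce the reported Gamma of .64927 and the reported Var[u]/{Var[u]+Var[v]} of .40216, which shows how much smaller the share of u is once its actual variance is used.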
A likelihood ratio test against the hypothesis of no inefficiency follows the variance
estimates. The degrees of freedom for the test are accumulated in the table. The first is for $\sigma_u$ in the
base case. The second is for the heteroscedasticity terms in Var[u] when they are introduced in the
model. Heteroscedasticity is developed in Chapter E63. The third term is for the truncation
parameters in the normal-truncated normal model, also developed in the next chapter. The "degrees
of freedom for the inefficiency model" are the sum of these three terms. The likelihood ratio statistic
is presented next. This is a nonstandard test because the null value of $\sigma_u$ is on the boundary of the
parameter space. Appropriate tables for the mixed chi squared test used here are given in Kodde and
Palm (1986). (A copy of the relevant parts of the table is kept internally by the program. See, also,
Coelli, Rao and Battese (1998) for further details.)
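For the base case with one degree of freedom, the mixed chi squared distribution is an equal mixture of a point mass at zero and a chi squared with one degree of freedom, which is why the 95% critical value is 2.706 rather than 3.84. The Python sketch below (an illustration with a hypothetical function name, not LIMDEP code, and valid only for this one degree of freedom case) computes the corresponding p value for the statistic in the normal-half normal cost frontier results.

```python
import math

def mixed_chi2_pvalue(lr):
    """p value under the mixture 0.5*chi2(0) + 0.5*chi2(1)."""
    if lr <= 0:
        return 1.0
    # P[chi2(1) > lr] equals erfc(sqrt(lr / 2)).
    return 0.5 * math.erfc(math.sqrt(lr / 2.0))

# Log likelihoods taken from the reported results.
lr = 2.0 * (159.20743 - 157.91523)
print(round(lr, 3), round(mixed_chi2_pvalue(lr), 4))
print(round(mixed_chi2_pvalue(2.706), 4))  # close to 0.05 by construction
```

The statistic of about 2.584 falls just short of the 2.706 critical value, so its p value is slightly above 0.05.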
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LCN
Log likelihood function 159.20743
Estimation based on N = 256, K = 10
Inf.Cr.AIC = -298.4 AIC/N = -1.166
Variances: Sigma-squared(v)= .01021
Sigma-squared(u)= .01890
Sigma(v) = .10103
Sigma(u) = .13746
Sigma = Sqr[(s^2(u)+s^2(v)]= .17059
Gamma = sigma(u)^2/sigma^2 = .64927
Var[u]/{Var[u]+Var[v]} = .40216
Stochastic Cost Frontier Model, e = v+u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 157.91523
Chi-sq=2*[LogL(SF)-LogL(LS)] = 2.584
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LCN| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 19.8020 25.91115 .76 .4447 -30.9829 70.5869
LY| .95577*** .01781 53.68 .0000 .92088 .99067
LY2| .09086*** .01198 7.58 .0000 .06738 .11435
LPKP| 1.43400 2.02750 .71 .4794 -2.53982 5.40783
LPLP| .01242 .09676 .13 .8979 -.17722 .20205
LPMP| .05744 1.33747 .04 .9657 -2.56396 2.67883
LPEP| -.56860 .64356 -.88 .3770 -1.82995 .69275
LPFP| -.06002*** .01993 -3.01 .0026 -.09907 -.02096
|Variance parameters for compound error
Lambda| 1.36059*** .20306 6.70 .0000 .96261 1.75857
Sigma| .17059*** .00058 294.50 .0000 .16946 .17173
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
Results for the normal-exponential model appear below. It is not possible to use a LR test to
choose between these two models. The test has zero degrees of freedom – neither model is obtained
by a restriction on the other. One possibility might be a Vuong (1989) statistic, which would be
computed as
$$V = \frac{\sqrt{n}\,\bar m}{s_m}, \quad m_i = \log(f_i|\text{normal}) - \log(f_i|\text{exponential}).$$
Results of the test are shown below the model results. The statistic is well inside the inconclusive
region.
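A minimal sketch of the Vuong computation, in Python for illustration only: the function takes the observation-level log densities from the two fitted models, which would have to be saved from each estimation. The function name and the toy inputs are hypothetical, and the standard deviation uses divisor n, which is an assumption made here (a divisor of n - 1 changes the result only slightly in large samples). Values between -1.96 and +1.96 are conventionally treated as inconclusive.

```python
import math

def vuong_statistic(loglik_a, loglik_b):
    """sqrt(n) * mean(m) / sd(m), with m_i = log f_i(model a) - log f_i(model b)."""
    m = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(m)
    mbar = sum(m) / n
    s_m = math.sqrt(sum((v - mbar) ** 2 for v in m) / n)  # divisor n (assumption)
    return math.sqrt(n) * mbar / s_m

# Toy log densities, invented only to exercise the function.
print(round(vuong_statistic([1.0, 2.0, 3.0], [0.0, 0.0, 0.0]), 4))
```

A large positive value favors the first model and a large negative value favors the second.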
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LCN
Log likelihood function 159.89917
Estimation based on N = 256, K = 10
Inf.Cr.AIC = -299.8 AIC/N = -1.171
Exponential frontier model
Variances: Sigma-squared(v)= .01147
Sigma-squared(u)= .00568
Sigma(v) = .10709
Sigma(u) = .07539
Stochastic Cost Frontier Model, e = v+u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 157.91523
Chi-sq=2*[LogL(SF)-LogL(LS)] = 3.968
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LCN| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 22.6569 25.48354 .89 .3740 -27.2899 72.6038
LY| .96069*** .01892 50.77 .0000 .92360 .99777
LY2| .09281*** .01249 7.43 .0000 .06832 .11729
LPKP| 1.65439 1.99409 .83 .4067 -2.25395 5.56272
LPLP| -.00962 .09785 -.10 .9217 -.20140 .18216
LPMP| -.06595 1.31569 -.05 .9600 -2.64465 2.51275
LPEP| -.62841 .63243 -.99 .3204 -1.86795 .61114
LPFP| -.06397*** .02033 -3.15 .0017 -.10381 -.02412
|Variance parameters for compound error
Theta| 13.2651*** 2.90719 4.56 .0000 7.5671 18.9630
Sigmav| .10709*** .00980 10.93 .0000 .08788 .12629
--------+--------------------------------------------------------------------
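Several entries in the exponential results can be cross-checked from the estimate of Theta alone. The Python lines below are an illustrative check (the variable names are hypothetical); they use the reported Theta = 13.2651, the log likelihood 159.89917, and K = 10 and N = 256 from the header, and they assume the usual definition of AIC as -2 log L + 2K.

```python
theta = 13.2651                 # reported estimate of Theta
sigma_u = 1.0 / theta           # standard deviation (and mean) of exponential u
sigma_u_sq = sigma_u ** 2

log_l, k, n = 159.89917, 10, 256
aic = -2.0 * log_l + 2.0 * k    # assumed definition of the information criterion

print(round(sigma_u, 5), round(sigma_u_sq, 5))
print(round(aic, 1), round(aic / n, 3))
```

These agree with the reported Sigma(u) = .07539, Sigma-squared(u) = .00568, AIC = -299.8 and AIC/N = -1.171.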
E62.7.1 Log Likelihoods for the Half Normal and Exponential Models
As will be evident below, different formulations of the log likelihood are most convenient
for estimation of the different forms of the frontier models. (And, different authors sometimes
parameterize the models differently.) The base case is the normal-half normal model. In this form,
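\(v_i \sim N[0,\sigma_v^2]\) and \(u_i = |U_i|\) where \(U_i \sim N[0,\sigma_u^2]\). It follows that
\(f(u_i) = (2/\sigma_u)\,\phi(u_i/\sigma_u)\), \(u_i \ge 0\). The density of \(\varepsilon_i = v_i - u_i\) has been shown to be
\[ f(\varepsilon_i) = (2/\sigma)\,\phi(\varepsilon_i/\sigma)\,\Phi(-\lambda\varepsilon_i/\sigma). \]
The most common form of the individual term in the log likelihood function (and the one used in
LIMDEP) is
\[ \log L_i = \tfrac{1}{2}\log(2/\pi) - \log\sigma - \tfrac{1}{2}(\varepsilon_i/\sigma)^2 + \log\Phi[-S\lambda\varepsilon_i/\sigma] \]
where \(\varepsilon_i = y_i - \beta'x_i\),
\(\lambda = \sigma_u/\sigma_v\),
\(\sigma^2 = \sigma_u^2 + \sigma_v^2\), \(\sigma_v^2 = \sigma^2/(1+\lambda^2)\), \(\sigma_u^2 = \sigma^2\lambda^2/(1+\lambda^2)\),
S = +1 for production frontier, -1 for cost frontier.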
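The log likelihood term and the variance identities can be checked numerically with a short Python sketch. The function names below are hypothetical and the code is only an illustration of the formulas, not LIMDEP's implementation. The normal CDF is computed with `math.erfc` so that the logarithm stays finite far in the tail.

```python
import math


def norm_cdf(x):
    # Standard normal CDF; erfc keeps precision in the lower tail.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def variance_components(sigma, lam):
    """Recover (sigma_v^2, sigma_u^2) from sigma and lambda = sigma_u / sigma_v."""
    sigma_v2 = sigma ** 2 / (1.0 + lam ** 2)
    sigma_u2 = sigma ** 2 * lam ** 2 / (1.0 + lam ** 2)
    return sigma_v2, sigma_u2


def loglik_half_normal(eps, sigma, lam, S=1):
    """Log density of one residual; S = +1 (production) or -1 (cost)."""
    return (0.5 * math.log(2.0 / math.pi) - math.log(sigma)
            - 0.5 * (eps / sigma) ** 2
            + math.log(norm_cdf(-S * lam * eps / sigma)))
```

For example, `variance_components(0.18529, 4.76248)` reproduces, to rounding, the values .00145 and .03288 reported with Sigma = .18529 and Lambda = 4.76248 in the LOGDBAR results later in this chapter. With lambda = 0 the function collapses to the ordinary normal log density.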
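Olsen's transformation is used for maximizing the log likelihood. We reparameterize the function in
terms of \(\theta = 1/\sigma\) and \(\gamma = (1/\sigma)\beta\). Then, writing \(\varepsilon_i = \theta y_i - \gamma'x_i\) for the standardized
residual, \(a_i = -S\lambda\varepsilon_i\), \(\delta_i = \phi(a_i)/\Phi(a_i)\) and \(\Delta_i = -\delta_i(a_i + \delta_i)\),
\[ \frac{\partial \log L_i}{\partial(\gamma,\theta,\lambda)} =
\varepsilon_i \begin{pmatrix} x_i \\ -y_i \\ 0 \end{pmatrix}
+ \delta_i S \begin{pmatrix} \lambda x_i \\ -\lambda y_i \\ -\varepsilon_i \end{pmatrix}
+ \begin{pmatrix} 0 \\ 1/\theta \\ 0 \end{pmatrix} \]
\[ \frac{\partial^2 \log L_i}{\partial(\gamma,\theta,\lambda)\,\partial(\gamma,\theta,\lambda)'} =
-\begin{pmatrix} x_i x_i' & -y_i x_i & 0 \\ -y_i x_i' & y_i^2 & 0 \\ 0' & 0 & 0 \end{pmatrix}
+ \Delta_i \begin{pmatrix} \lambda^2 x_i x_i' & -\lambda^2 y_i x_i & -\lambda\varepsilon_i x_i \\ -\lambda^2 y_i x_i' & \lambda^2 y_i^2 & \lambda\varepsilon_i y_i \\ -\lambda\varepsilon_i x_i' & \lambda\varepsilon_i y_i & \varepsilon_i^2 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 & \delta_i S x_i \\ 0' & -1/\theta^2 & -\delta_i S y_i \\ \delta_i S x_i' & -\delta_i S y_i & 0 \end{pmatrix} \]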
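For the normal-exponential model with parameters \((\beta, \theta, \sigma_v)\), let \(\varepsilon_i = y_i - \beta'x_i\),
\(w_i = S\varepsilon_i/\sigma_v + \theta\sigma_v\), \(\delta_i = \phi(w_i)/\Phi(-w_i)\), \(\Delta_i = -\delta_i(\delta_i - w_i)\) and \(a_i = S\varepsilon_i/\sigma_v^2 - \theta\). Then,
\[ \frac{\partial \log L_i}{\partial(\beta,\theta,\sigma_v)} =
\delta_i \begin{pmatrix} S x_i/\sigma_v \\ -\sigma_v \\ a_i \end{pmatrix}
+ \begin{pmatrix} -\theta S x_i \\ 1/\theta + \theta\sigma_v^2 + S\varepsilon_i \\ \theta^2\sigma_v \end{pmatrix} \]
\[ \frac{\partial^2 \log L_i}{\partial(\beta,\theta,\sigma_v)\,\partial(\beta,\theta,\sigma_v)'} =
\Delta_i \begin{pmatrix} x_i x_i'/\sigma_v^2 & -S x_i & a_i S x_i/\sigma_v \\ -S x_i' & \sigma_v^2 & -a_i\sigma_v \\ a_i S x_i'/\sigma_v & -a_i\sigma_v & a_i^2 \end{pmatrix}
+ \begin{pmatrix} 0 & -S x_i & -\delta_i S x_i/\sigma_v^2 \\ -S x_i' & -1/\theta^2 + \sigma_v^2 & 2\theta\sigma_v - \delta_i \\ -\delta_i S x_i'/\sigma_v^2 & 2\theta\sigma_v - \delta_i & \theta^2 - 2\delta_i S\varepsilon_i/\sigma_v^3 \end{pmatrix}. \]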
The parameterization in terms of \(\theta\) is more convenient but does not produce different results.
(This is an indirect estimator of \(u_i\). Unfortunately, it is not possible to estimate \(u_i\) directly from any
observed sample information. The various surveys noted earlier discuss the computation of and
properties of this estimator.) The counterpart for the normal-exponential model is
\[ \hat{E}[u \mid \varepsilon] = \sigma_v\left[\frac{\phi(w)}{1 - \Phi(w)} - w\right], \qquad w = (S\varepsilon/\sigma_v + \theta\sigma_v). \]
These are computed and saved as new variables in your data set with
; Eff = variable
The ; List specification will also request a listing of this variable. This form is used for all
distributions and all variations of the stochastic frontier model.
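As an illustration of the normal-exponential formula, the following Python sketch evaluates the conditional mean estimator and the implied efficiency for a single residual. The function names are hypothetical and the code is not LIMDEP's routine. It uses w = S ε/σv + θσv and the upper tail probability 1 - Φ(w), computed with `math.erfc` for numerical stability.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_sf(x):
    # Upper tail 1 - Phi(x), computed without cancellation.
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def jlms_exponential(eps, sigma_v, theta, S=1):
    """E[u | eps] for the normal-exponential model; S = +1 production, -1 cost."""
    w = S * eps / sigma_v + theta * sigma_v
    return sigma_v * (norm_pdf(w) / norm_sf(w) - w)


def efficiency(u_hat):
    """Efficiency measure exp(-E[u | eps])."""
    return math.exp(-u_hat)
```

With the cost frontier estimates shown earlier (Sigmav = .10709, Theta = 13.2651, S = -1), a larger positive residual produces a larger estimated inefficiency, as expected for a cost function.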
By adding ; Eff = u to the frontier command, then
KERNEL ; Rhs = u $
we obtain the results below. (We also added the title to the command with ; Title = …) Note an
important element of the estimation. The 'Standard Deviation' reported below is 0.054895, whereas
the estimate of \(\sigma_u\) is 0.13746. The difference arises because the 0.054895 is an estimate of the
standard deviation of \(E[u \mid \varepsilon]\), not the standard deviation of u.
+---------------------------------------+
| Kernel Density Estimator for U |
| Observations = 256 |
| Points plotted = 256 |
| Bandwidth = .016298 |
| Statistics for abscissa values---- |
| Mean = .109394 |
| Standard Deviation = .054895 |
| Minimum = .030722 |
| Maximum = .350422 |
| ---------------------------------- |
| Kernel Function = Logistic |
| Cross val. M.S.E. = .000000 |
| Results matrix = KERNEL |
+---------------------------------------+
\[ \log y = \beta'x + v - u. \]
\[ \text{EFF} = \frac{y}{\text{Optimal } y} = \text{Exp}(-u) \]
if you estimate a cost frontier instead. You may compute both inefficiencies and efficiency measures
in the same command. Figure E62.5 was obtained by adding
; Costeff = ecu
to the FRONTIER command, then requesting the kernel density estimator as before (with the title
changed accordingly).
Then, if the elements were the true parameters, the region \([LB_i, UB_i]\) would encompass 100(1-\(\alpha\))% of
the distribution of \(u_i \mid \varepsilon_i\). For constructing 'confidence intervals' for technical efficiency, \(TE_i \mid \varepsilon_i\), it is
necessary only to compute \(TEUB_i = \exp(-LB_i)\) and \(TELB_i = \exp(-UB_i)\).
We note two caveats about the estimator. First, the received papers based on classical
methods have labeled this a confidence interval for \(u_i\). However, it is a range that encompasses
100(1-\(\alpha\))% of the probability in the conditional distribution of \(u_i \mid \varepsilon_i\).
The interval is 'centered' at the estimator of the conditional mean, \(E[u_i \mid \varepsilon_i]\), not the estimator of \(u_i\)
itself, as a conventional 'confidence interval' would be. The estimator is actually characterizing the
conditional distribution of \(u_i \mid \varepsilon_i\), not constructing any kind of interval that brackets a particular \(u_i\);
that is not possible. Second, these limits are conditioned on known values of the parameters, so they
ignore any variation in the parameter estimates used to construct them. Thus, we regard this as a
minimal width interval.
You can request computation of these lower and upper bounds by adding
where 100(1-\(\alpha\)) is one of 90, 95, or 99 and lower, upper are names for two variables that will be
created. You may use this feature with ; Eff = variable or ; Techeff = variable (or ; Costeff =
variable for a cost frontier). If you have both ; Eff and ; Techeff in the command, the confidence
intervals are computed for ; Techeff. (You can obtain the interval for ; Eff in this case by computing
the negatives of the logs with CREATE.)
We obtained these bounds for our cost function with
The centipede plot is also a useful device in this context. The following redraws Figure E62.6 using
a different view for the lower and upper bounds
In this case, it might be interesting to examine how increased load factor, route complexity, or stage
length impact efficiency.
Expressions for the technical inefficiency values appear at the beginning of Section E62.8.
In those expressions, we will use
\[ \text{Efficiency} = \exp\{-\hat{E}[u \mid \varepsilon]\}. \]
The two expressions for the normal and exponential models are functions of a \(w(\varepsilon)\) that is specific to
the model. Each may be written as
\[ \text{Efficiency} = \exp\{-\sigma_m A[w_m(\varepsilon)]\} \]
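where m = half normal or exponential, \(\sigma_m = \sigma\lambda/(1+\lambda^2)\) for the half normal and \(\sigma_v\) for the
exponential, and \(w_m\) is defined earlier. We now suppose that
\[ \varepsilon = y - \beta'x - \delta'z \]
where x is the theoretical inputs to the goal and z are the environmental variables. We require the
derivatives with respect to z. For convenience, let W = -w and exploit the symmetry of the normal
density. Then, \(A[w_m(\varepsilon)] = [\phi(W)/\Phi(W) + W]\). The derivative is
\[ \frac{\partial\,\text{Efficiency}}{\partial z} = \text{Efficiency} \times (-\sigma_m)\,\frac{dA}{dW}\,\frac{dW}{dw_m}\,\frac{\partial w_m}{\partial \varepsilon}\,\frac{\partial \varepsilon}{\partial z}. \]
The two terms that we need to complete the derivation are \(\partial w_m/\partial\varepsilon = S\lambda/\sigma\) for the half normal model
and \(S/\sigma_v\) for the exponential model and
\[ \frac{dA}{dW} = 1 - \frac{\phi(W)}{\Phi(W)}\left[W + \frac{\phi(W)}{\Phi(W)}\right] = D(W). \]
Collecting terms,
\[ \frac{\partial\,\text{Efficiency}}{\partial z} = \text{Efficiency} \times D(W) \times \left\{\frac{\lambda^2}{1+\lambda^2} \text{ or } 1\right\} \times S(-\delta). \]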
We can sign this result, though the magnitude will be empirical. The first three terms are all between
zero and one, as is their product. S is either +1 for a production frontier or -1 for a cost frontier.
Thus, in total, the derivative is a fraction of the corresponding coefficient, which takes the same sign
for a cost frontier and the opposite sign for a production frontier.
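The sign and magnitude argument can be checked numerically for the half normal case. The Python sketch below is an illustration only: the function names and the argument `coef_z` (the coefficient on a single z variable) are hypothetical, and the efficiency function follows the form exp{-σm[φ(W)/Φ(W) + W]} with W = -Sλε/σ.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def mills(W):
    # Ratio phi(W) / Phi(W).
    return norm_pdf(W) / norm_cdf(W)


def efficiency_half_normal(eps, sigma, lam, S=1):
    """exp(-E[u | eps]) for the normal-half normal model."""
    sigma_m = sigma * lam / (1.0 + lam ** 2)
    W = -S * lam * eps / sigma
    return math.exp(-sigma_m * (mills(W) + W))


def d_efficiency_dz(eps, sigma, lam, coef_z, S=1):
    """Analytic derivative: Efficiency * D(W) * lam^2/(1+lam^2) * S * (-coef_z)."""
    W = -S * lam * eps / sigma
    r = mills(W)
    D = 1.0 - r * (W + r)  # lies between zero and one
    share = lam ** 2 / (1.0 + lam ** 2)
    return efficiency_half_normal(eps, sigma, lam, S) * D * share * S * (-coef_z)
```

A central finite difference of `efficiency_half_normal` with respect to z, with ε = y - β'x - coef_z × z, should agree with `d_efficiency_dz` to within numerical error. For a cost frontier (S = -1) the derivative carries the sign of the coefficient.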
Partial derivatives and simulations are computed with PARTIALS and SIMULATE. The
general approach would be
The command might also contain ; Eff = variable, ; Techeff = variable or ; Costeff = variable.
Then, you may follow it with
The function analyzed in these two commands is the technical or cost efficiency,
\[ \text{Efficiency} = \exp\{-\hat{E}[u \mid \varepsilon]\}. \]
The following demonstrates using the cost frontier, with variables z = (load factor, log stage length,
points served). Data on z are missing for one of the firms.
---------------------------------------------------------------------
Model Simulation Analysis for JLMS efficiency estimator in SF model
---------------------------------------------------------------------
Simulations are computed by average over sample observations
---------------------------------------------------------------------
User Function Function Standard
(Delta method) Value Error |t| 95% Confidence Interval
---------------------------------------------------------------------
Avrg. Function .93354 .00635 147.07 .92110 .94598
LOADFCTR= .40 .95844 .00346 277.19 .95166 .96522
LOADFCTR= .43 .95502 .00344 277.54 .94827 .96176
LOADFCTR= .45 .95123 .00357 266.70 .94424 .95822
LOADFCTR= .48 .94706 .00392 241.56 .93937 .95474
LOADFCTR= .50 .94247 .00456 206.48 .93353 .95142
LOADFCTR= .53 .93746 .00552 169.87 .92664 .94828
(some rows omitted)
LOADFCTR= .83 .84622 .03145 26.91 .78458 .90786
LOADFCTR= .85 .83696 .03384 24.73 .77063 .90329
LOADFCTR= .88 .82763 .03616 22.89 .75676 .89850
LOADFCTR= .90 .81827 .03839 21.32 .74303 .89352
LOADFCTR= .93 .80892 .04053 19.96 .72947 .88836
LOADFCTR= .95 .79958 .04259 18.78 .71611 .88305
LOADFCTR= .98 .79029 .04455 17.74 .70296 .87761
---------------------------------------------------------------------
Partial Effects Analysis for JLMS efficiency estimator in SF model
---------------------------------------------------------------------
Effects on function with respect to LOADFCTR
Results are computed by average over sample observations
Partial effects for continuous LOADFCTR computed by differentiation
Effect is computed as derivative = df(.)/dx
---------------------------------------------------------------------
df/dLOADFCTR Partial Standard
(Delta method) Effect Error |t| 95% Confidence Interval
---------------------------------------------------------------------
APE. Function -.22444 .06690 3.35 -.35557 -.09331
LOADFCTR= .40 -.13020 .02575 5.06 -.18067 -.07973
LOADFCTR= .43 -.14405 .03134 4.60 -.20547 -.08263
LOADFCTR= .45 -.15900 .03766 4.22 -.23281 -.08519
LOADFCTR= .48 -.17497 .04464 3.92 -.26246 -.08748
(Some rows omitted)
LOADFCTR= .85 -.37205 .09615 3.87 -.56051 -.18359
LOADFCTR= .88 -.37392 .09265 4.04 -.55551 -.19234
LOADFCTR= .90 -.37452 .08896 4.21 -.54887 -.20017
LOADFCTR= .93 -.37403 .08524 4.39 -.54109 -.20697
LOADFCTR= .95 -.37265 .08160 4.57 -.53259 -.21271
LOADFCTR= .98 -.37054 .07813 4.74 -.52368 -.21739
---------------------------------------------------------------------
Partial Effects for JLMS efficiency estimator in SF model
Partial Effects Averaged Over Observations
* ==> Partial Effect for a Binary Variable
---------------------------------------------------------------------
Partial Standard
(Delta method) Effect Error |t| 95% Confidence Interval
---------------------------------------------------------------------
LOADFCTR -.25723 .07389 3.48 -.40205 -.11240
LOGSTAGE -.04620 .01292 3.58 -.07153 -.02088
POINTS .00035 .00012 2.95 .00012 .00058
---------------------------------------------------------------------
It was noted that partial effects with respect to x are not likely to be particularly interesting.
Nonetheless, they could be computed.
NOTE: Partial effects of variables in the stochastic frontier efficiency models may be computed
with respect to any variable in any model, regardless of where those variables appear in the model.
That includes x in the original frontier model, z in the means of the truncated regression formats, and
z in the variances of the heteroscedasticity models.
To continue the earlier example, the partial effect of LogQ could be computed in the cost function
using
NAMELIST ; x = one,lq,lq^2,lpmpp,lpfpp,lpepp,lplpp,lpkpp $
NAMELIST ; z = loadfctr,logstage,points $
FRONTIER ; Cost ; Lhs = lcp ; Rhs = x,z $
PARTIALS ; Effects : lq ; summary $
Note that the specification will correctly account for the fact that the square of LogQ appears in the
cost function when it computes the partial effects.
The Rnk function sorts the data for you and creates the ranking variable. The observation with the
highest value gets the rank of one. The lowest gets a rank of n. Note, tied observations do not get the
same rank. Tied observations are ranked in the order in which they appear in the data. For example, in
a sample of 100, if 10 observations are tied for third place, they will receive ranks 3 through 12.
Two CALC functions provide descriptive measures for ranks. For two sets of ranks, the
Spearman rank correlation coefficient is computed as
\[ r_S = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}, \]
where \(d_i\) is the difference between the two ranks of observation i.
The rank correlation is a correlation coefficient, so it has a natural range of measurement. (See the
application below.) For more than two sets of ranks, a useful statistic is Kendall's coefficient of
concordance,
\[ W = 12\sum_{i=1}^{n}(S_i - \bar{S})^2 / [nK^2(n^2 - 1)] \]
where \(S_i = \sum_k \text{rank}_{k,i}\).
The concordance coefficient is not a correlation coefficient, so its magnitude is ambiguous. It can be
used for a large sample test of discordance. Under the null hypothesis that the sets of ranks are
independent, the statistic has a large sample chi squared distribution. In particular,
\[ K(n-1)W \rightarrow \chi^2[n-1]. \]
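The ranking rule and the two rank statistics can be sketched in Python as follows. The function names are hypothetical and this is not the code behind Rnk or the CALC functions. The Spearman formula used is the standard one for untied ranks, 1 - 6Σd²/[n(n² - 1)], which applies here because tied observations are given distinct ranks in order of appearance.

```python
def rank_descending(values):
    """Rank 1 goes to the largest value; ties keep their order of appearance."""
    # sorted() is stable, so equal values stay in their original order.
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman(ranks_a, ranks_b):
    """Spearman rank correlation for two sets of untied ranks."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def kendall_w(rank_sets):
    """Kendall's coefficient of concordance for K sets of ranks of n items."""
    K = len(rank_sets)
    n = len(rank_sets[0])
    S = [sum(ranks[i] for ranks in rank_sets) for i in range(n)]
    S_bar = sum(S) / n
    return 12.0 * sum((s - S_bar) ** 2 for s in S) / (n * K * K * (n * n - 1))
```

Identical rankings give a rank correlation of 1 and a concordance of 1. Two exactly reversed rankings give a rank correlation of -1 and a concordance of 0.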
To illustrate these computations, we have analyzed the WHO data described in Section
E62.4.2. We have fit identical stochastic frontier models for the two attainment variables, lcomp, the
log of the composite measure, and ldale, the log of disability adjusted life expectancy. We then
computed the ranks for the 191 countries and plotted the ranks for the two measures as well as the
raw efficiency measures. The simple correlation for the efficiency measures and the rank correlation
for the ranks are displayed. The commands are as follows:
NAMELIST ; x = one,logebar,loghbar,loghbar2 $
NAMELIST ; z = gini,lpopden,lgdpc,geff,voice,oecd,lpubthe,tropics $
FRONTIER ; Lhs = logdbar ; Rhs = x,z
; Eff = udale ; Techeff = edale $
FRONTIER ; Lhs = logcbar ; Rhs = x,z
; Eff = ucomp ; Techeff = ecomp $
CREATE ; dalerank = 192 - Rnk(edale) $
CREATE ; comprank = 192 - Rnk(ecomp) $
PLOT ; Lhs = dalerank ; Rhs = comprank
; Endpoints = 0,200 ; Limits = 0,200
; Title = Ranks of Efficiencies: DALE vs. COMP $
PLOT ; Lhs = edale ; Rhs = ecomp ; Endpoints = .8,1 ; Grid
; Title = Efficiencies: DALE vs. COMP $
CALC ; List ; Rkc(dalerank,comprank) $
CALC ; List ; Cor(edale,ecomp) $
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LOGDBAR
Log likelihood function 155.83849
Estimation based on N = 191, K = 14
Inf.Cr.AIC = -283.7 AIC/N = -1.485
Variances: Sigma-squared(v)= .00145
Sigma-squared(u)= .03288
Sigma(v) = .03808
Sigma(u) = .18134
Sigma = Sqr[(s^2(u)+s^2(v)]= .18529
Gamma = sigma(u)^2/sigma^2 = .95777
Var[u]/{Var[u]+Var[v]} = .89180
Stochastic Production Frontier, e = v-u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 141.59006
Chi-sq=2*[LogL(SF)-LogL(LS)] = 28.497
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LOGDBAR| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 2.60812*** .18255 14.29 .0000 2.25034 2.96590
LOGEBAR| .11227*** .01869 6.01 .0000 .07564 .14891
LOGHBAR| .30118*** .05072 5.94 .0000 .20177 .40059
LOGHBAR2| -.02710*** .00455 -5.96 .0000 -.03601 -.01818
GINI| -.30417*** .10600 -2.87 .0041 -.51192 -.09642
LPOPDEN| .00213 .00402 .53 .5955 -.00574 .01001
LGDPC| .07541*** .02424 3.11 .0019 .02789 .12293
GEFF| -.00673 .01551 -.43 .6642 -.03714 .02367
VOICE| .02093* .01113 1.88 .0601 -.00089 .04275
OECD| .01608 .03055 .53 .5987 -.04381 .07596
LPUBTHE| .00974 .01497 .65 .5150 -.01959 .03908
TROPICS| -.03703** .01714 -2.16 .0307 -.07063 -.00344
|Variance parameters for compound error
Lambda| 4.76248*** 1.22054 3.90 .0001 2.37026 7.15470
Sigma| .18529*** .00086 214.30 .0000 .18360 .18698
--------+--------------------------------------------------------------------
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LOGCBAR
Log likelihood function 248.18065
Estimation based on N = 191, K = 14
Inf.Cr.AIC = -468.4 AIC/N = -2.452
Variances: Sigma-squared(v)= .00142
Sigma-squared(u)= .00888
Sigma(v) = .03768
Sigma(u) = .09421
Sigma = Sqr[(s^2(u)+s^2(v)]= .10147
Gamma = sigma(u)^2/sigma^2 = .86207
Var[u]/{Var[u]+Var[v]} = .69429
Stochastic Production Frontier, e = v-u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 241.57767
Chi-sq=2*[LogL(SF)-LogL(LS)] = 13.206
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LOGCBAR| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 3.21081*** .10704 30.00 .0000 3.00101 3.42060
LOGEBAR| .06590*** .01319 4.99 .0000 .04004 .09177
LOGHBAR| .18617*** .03763 4.95 .0000 .11240 .25993
LOGHBAR2| -.01509*** .00328 -4.61 .0000 -.02151 -.00867
GINI| -.25334*** .07579 -3.34 .0008 -.40189 -.10478
LPOPDEN| .00523* .00281 1.86 .0628 -.00028 .01073
LGDPC| .05747*** .01681 3.42 .0006 .02453 .09040
GEFF| .00290 .01068 .27 .7858 -.01803 .02384
VOICE| .02082** .00872 2.39 .0170 .00373 .03791
OECD| .01699 .01946 .87 .3827 -.02115 .05513
LPUBTHE| .01798** .00903 1.99 .0466 .00027 .03568
TROPICS| -.02365** .01191 -1.99 .0471 -.04700 -.00031
|Variance parameters for compound error
Lambda| 2.50000*** .41784 5.98 .0000 1.68104 3.31896
Sigma| .10147*** .00045 224.53 .0000 .10058 .10235
--------+--------------------------------------------------------------------
The estimator is based on the locally linear regression in Section E9.5. The underlying logic is the
result that in the stochastic frontier model, apart from the constant term, OLS consistently estimates
the slope parameters of the model and estimates the constant term with a known bias. For the
constant, \(\alpha\), the bias is E[u], the unconditional mean, which in the stochastic frontier model is
\[ E[u] = \sigma_u\sqrt{2/\pi}. \]
Continuing this approach, then, the least squares residuals estimate \(\varepsilon_i + E[u]\). In addition, the least
squares residual variance, e'e/n, consistently estimates \(\text{Var}[\varepsilon_i] = \sigma_\varepsilon^2 = \sigma_v^2 + [(1 - 2/\pi)\sigma_u^2]\). The
implication is that the only parameter remaining to estimate is \(\sigma_u^2\). In Section E62.6.2, we used the
third moment of the OLS residuals and the method of moments to estimate \(\sigma_u\), then used this
estimate to estimate \(\alpha\), the constant term in the frontier function.
The approach proposed here uses this same method with three differences.
1. The residuals used to compute the variance estimator are based on a locally linear,
nonparametric estimator of the deterministic function.
2. The remaining parameter to be estimated in this case is \(\lambda\) rather than \(\sigma_u\). We will base the
estimation on the result \(\sigma_u^2 = \sigma^2\lambda^2/(1 + \lambda^2)\).
3. The approach will be based on a maximum likelihood estimator rather than the method of
moments.
Estimation uses the following steps: We begin with estimation of the conventional normal-half
normal frontier model with a linear frontier function in order to obtain an initial estimator of \(\lambda\) and of
\(\sigma_\varepsilon^2\). The LOWESS estimator developed in Section E9.5 is then employed to estimate g(x,z) for each
point in the sample. The residuals from the estimated functions are used with the estimate of \(\sigma_\varepsilon^2\) for
estimation of \(\lambda\). With \(\sigma_\varepsilon^2\) and \(\lambda\) in hand, we can compute the constant term, a set of residuals, and the
JLMS estimators of technical or cost efficiency. Technical details appear in Section E62.9.2.
E62.9.1 Application
We have reestimated the airlines cost frontier with the semiparametric estimator. The
frontier functions differ noticeably, primarily in the parameter estimates that are statistically
insignificant. The kernel estimators suggest, however, that the differences in the estimates of
inefficiency are quite modest. The descriptive statistics suggest the same pattern. The final plot
shows more graphically how the nonparametric function has changed the estimates. The fact that
most of the estimates from the nonparametric estimator lie below the 45 degree line is consistent
with the appearance that generally, they are smaller than the parametric values. The last set of
results are the ordinary (Pearson) correlation and Kendall's tau.
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LC
Log likelihood function 215.15699
Estimation based on N = 256, K = 13
Variances: Sigma-squared(v)= .00820
Sigma-squared(u)= .00753
Sigma(v) = .09054
Sigma(u) = .08676
Sigma = Sqr[(s^2(u)+s^2(v)]= .12539
Gamma = sigma(u)^2/sigma^2 = .47870
Var[u]/{Var[u]+Var[v]} = .25020
Stochastic Cost Frontier Model, e = v+u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 214.75424
Chi-sq=2*[LogL(SF)-LogL(LS)] = .806
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
-----------------------------------------------------------------------------
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 9.19939 21.64273 .43 .6708 -33.21957 51.61835
LY| .97398*** .01751 55.63 .0000 .93966 1.00829
LY2| .05123*** .01029 4.98 .0000 .03106 .07140
LPKP| .49455 1.69257 .29 .7701 -2.82283 3.81193
LPLP| .13721* .08121 1.69 .0911 -.02195 .29637
LPMP| .45863 1.11624 .41 .6812 -1.72915 2.64642
LPEP| -.10302 .53634 -.19 .8477 -1.15422 .94818
LPFP| -.02090 .01794 -1.16 .2441 -.05607 .01427
LOADFCTR| -.99466*** .17446 -5.70 .0000 -1.33660 -.65273
LOGSTAGE| -.17940*** .02531 -7.09 .0000 -.22902 -.12979
POINTS| .00164*** .00031 5.20 .0000 .00102 .00225
|Variance parameters for compound error
Lambda| .95827*** .16869 5.68 .0000 .62763 1.28890
Sigma| .12539*** .00039 321.29 .0000 .12463 .12616
--------+--------------------------------------------------------------------
+-----------------------------------------------+
| Locally linear weighted regression estimation |
| Sample size 256 |
| Model size 11 |
| Band width .500000 |
| LOESS Sum of Squared Residuals 1.69637 |
| OLS Sum of Squared Residuals 2.79975 |
| Derivatives Matrix LOCLBETA |
+-----------------------------------------------+
Reestimating lambda using residuals based on LOWESS regression
Normal exit: 3 iterations. Status=0, F= -337.3385
-----------------------------------------------------------------------------
Partially Nonparametric Stochastic Frontier Fit by LOWESS
Dependent variable LC
Estmation based on N = 256, K = 11
Variances: Sigma-squared(u)= .00438 Sigma(u) = .06616
Sigma-squared(v)= .00504 Sigma(v) = .07096
Sigma = Sqr[(s^2(u)+s^2(v)]= .09702 Lambda = .93233
Stochastic Cost Frontier Model, e = v+u
-----------------------------------------------------------------------------
Statistical results are for the sample means of the LOWESS estimated betas.
They are not moments of an asymptotic distribution.
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
Constant| 34.8551 23.42958 1.49 .1368 -11.0661 80.7762
LY| .98897*** .05040 19.62 .0000 .89018 1.08775
LY2| .04598*** .01677 2.74 .0061 .01310 .07885
LPKP| 2.48149 1.78813 1.39 .1652 -1.02319 5.98616
LPLP| .09976 .10851 .92 .3579 -.11292 .31244
LPMP| -.85374 1.34656 -.63 .5261 -3.49295 1.78547
LPEP| -.71103 .43514 -1.63 .1023 -1.56389 .14183
LPFP| -.02183 .03324 -.66 .5114 -.08698 .04332
LOADFCTR| -.78691 .65061 -1.21 .2265 -2.06208 .48826
LOGSTAGE| -.20490* .11308 -1.81 .0700 -.42653 .01672
POINTS| .00225 .00205 1.10 .2710 -.00176 .00627
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
Descriptive Statistics
--------+---------------------------------------------------------------------
Variable| Mean Std.Dev. Minimum Maximum Cases Missing
--------+---------------------------------------------------------------------
EUP| .933537 .025027 .812486 .975689 256 0
EUNP| .948487 .019528 .844732 .983878 256 0
--------+---------------------------------------------------------------------
The value of \(\sigma_\varepsilon^2 = \sigma_v^2 + [(1 - 2/\pi)\sigma_u^2]\) is estimated using the squared LOWESS residuals; it is the
sample variance = \(q^2\). The LOWESS residuals, themselves, are estimates of \(\varepsilon_i + E[u_i]\). With \(q^2\) and
the residuals in hand, the log likelihood is a function only of \(\lambda\). During the iteration, we compute
\[ a = \lambda/(1+\lambda^2)^{1/2}, \]
\[ s^2 = q^2/(1 - (2/\pi)a^2), \text{ then } s = (s^2)^{1/2}, \]
\[ m = a s\sqrt{2/\pi}, \]
\[ e_i = \text{residual}_i - m. \]
These residuals and s are used to compute \(\log L_i\) and the derivative with respect to \(\lambda\). This estimation
step provides the estimator of \(\lambda\) that we need to compute the efficiencies. After estimation of \(\lambda\),
computation of the JLMS estimates of inefficiency is done the same as in the parametric form of the
model, using the LOWESS residuals.
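The quantities computed at each iteration can be sketched in Python as follows. The function names are hypothetical and the code only illustrates the formulas for a, s, m and the recentered residuals; it is not the LIMDEP implementation. It takes the residual variance q² and a trial value of λ as given.

```python
import math


def scale_step(q2, lam):
    """Return (a, s, m) for a trial lambda, given the residual variance q2."""
    a = lam / math.sqrt(1.0 + lam ** 2)           # a = sigma_u / sigma
    s = math.sqrt(q2 / (1.0 - (2.0 / math.pi) * a * a))  # s estimates sigma
    m = a * s * math.sqrt(2.0 / math.pi)          # m estimates E[u]
    return a, s, m


def recenter(residuals, m):
    """e_i = residual_i - m, as stated in the text."""
    return [r - m for r in residuals]
```

Since a·s estimates σu and s·(1 - a²)^(1/2) estimates σv, the values Lambda = .93233 and Sigma = .09702 from the semiparametric results above imply Sigma(u) of about .06616 and Sigma(v) of about .07096, which agrees with the reported output.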
\[ u_i \sim \frac{\theta^P \exp(-\theta u_i)\,u_i^{P-1}}{\Gamma(P)}, \quad u_i \ge 0,\ P > 0,\ \theta > 0. \]
This model is more flexible than the half normal or exponential model in that with two parameters, it
allows both the shape and location to vary independently. (The truncation model does likewise,
but it is considerably more difficult to estimate.) To specify the gamma model, use
-----------------------------------------------------------------------------
Log likelihood function 159.89917
Exponential frontier model
Variances: Sigma-squared(v)= .01147
Sigma-squared(u)= .00568
Sigma(v) = .10709
Sigma(u) = .07539
Stochastic Cost Frontier Model, e = v+u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 157.91523
Chi-sq=2*[LogL(SF)-LogL(LS)] = 3.968
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 22.6569 25.48354 .89 .3740 -27.2899 72.6038
LY| .96069*** .01892 50.77 .0000 .92360 .99777
LY2| .09281*** .01249 7.43 .0000 .06832 .11729
LPKP| 1.65439 1.99409 .83 .4067 -2.25395 5.56272
LPLP| -.00962 .09785 -.10 .9217 -.20140 .18216
LPMP| -.06595 1.31569 -.05 .9600 -2.64465 2.51275
LPEP| -.62841 .63243 -.99 .3204 -1.86795 .61114
LPFP| -.06397*** .02033 -3.15 .0017 -.10381 -.02412
|Variance parameters for compound error
Theta| 13.2651*** 2.90719 4.56 .0000 7.5671 18.9630
Sigmav| .10709*** .00980 10.93 .0000 .08788 .12629
--------+--------------------------------------------------------------------
Figure E62.13 Kernel Density Estimates for Gamma and Exponential Inefficiencies
E65: Data Envelopment Analysis E-44
The normal-exponential model results if P = 1. Computation of the function \(h(r,\varepsilon_i)\) is the obstacle to
estimation. Beckers and Hammond (1987) derived a closed form expression, but the result has never
been operationalized; it is complex in the extreme. Greene (1990) attempted estimation by using a
crude approximation with Simpson's rule, but failed to obtain reasonable results. (See Ritter and
Simar (1997).)
A satisfactory solution is produced by the technique of maximum simulated likelihood. The
integral and its derivatives can be estimated consistently by Monte Carlo simulation. The crucial
result is that \(h(r,\varepsilon_i)\) is the expectation of a random variable;
\[ h(r,\varepsilon_i) = E[z^r \mid z \ge 0] \]
where \(z \sim N[\mu_i, \sigma_v^2]\),
\(\mu_i = -\varepsilon_i - \theta\sigma_v^2\).
Therefore, \(h(r,\varepsilon_i)\) is the expected value of \(z^r\) where z has a truncated at zero normal distribution.
Thus, we estimate \(h(r,\varepsilon_i)\) by using the mean of a sample of draws from this distribution. For given
values of \(\varepsilon_i\) and \(\mu_i\) (i.e., \(y_i, x_i, \beta, \sigma_v, \theta, r\)), \(h(r,\varepsilon_i)\) is consistently estimated by
\[ \hat{h}_i = \frac{1}{Q}\sum_{q=1}^{Q} z_{iq}^r \]
where \(z_{iq}\) is a random draw from the truncated normal distribution with mean parameter \(\mu_i\) and
variance parameter \(\sigma_v^2\). This produces the simulated log likelihood function
which for a given set of draws is a smooth and continuous function of the parameters.
Random draws from the truncated distribution are obtained using Geweke's method as
follows: Let
L = truncation point = 0 for this application
\(\mu\) = the mean of untruncated distribution = \(-\varepsilon_i - \theta\sigma_v^2\)
\(\sigma\) = the standard deviation of the untruncated distribution = \(\sigma_v\)
PL = \(\Phi[(L - \mu)/\sigma]\)
F = one draw from U[0,1]
z = \(\mu + \sigma\Phi^{-1}[PL + F(1 - PL)]\)
Then, z = the draw from the truncated distribution.
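The inverse probability transform above and the simulation estimator of h(r, ε) can be sketched in Python using the standard library's `statistics.NormalDist`. The function names are hypothetical and this is only an illustration, not LIMDEP's implementation. The uniform draws are passed in so that the same fixed set can be reused across evaluations. A draw F must lie strictly between 0 and 1 because `inv_cdf` rejects the endpoints.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def draw_truncated_normal(mu, sigma, F, lower=0.0):
    """Map one uniform draw F in (0, 1) to a draw from N[mu, sigma^2] truncated below."""
    PL = STD_NORMAL.cdf((lower - mu) / sigma)
    z = mu + sigma * STD_NORMAL.inv_cdf(PL + F * (1.0 - PL))
    # Guard against rounding that could push the draw just below the bound.
    return max(z, lower)


def simulate_h(r, eps, sigma_v, theta, uniforms):
    """Estimate h(r, eps) = E[z^r | z >= 0] as the mean of z^r over the draws."""
    mu = -eps - theta * sigma_v ** 2
    draws = (draw_truncated_normal(mu, sigma_v, F) for F in uniforms)
    return sum(z ** r for z in draws) / len(uniforms)
```

For r = 1 the estimate approximates the mean of the truncated normal distribution, μ + σv φ(μ/σv)/Φ(μ/σv), which gives a simple check on the draws.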
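Collecting all terms, then, this produces the simulated log likelihood function:
\[ \log L_S = \log L(\text{normal-exponential}) + n[(P-1)\log\theta - \log\Gamma(P)]
+ \sum_i \log\left\{\frac{1}{Q}\sum_{q=1}^{Q}\left(\mu_i + \sigma_v\Phi^{-1}\left[F_{iq} + (1 - F_{iq})\Phi(-\mu_i/\sigma_v)\right]\right)^{P-1}\right\} \]
where
\(\varepsilon_i = y_i - \beta'x_i\)
\(\mu_i = -\varepsilon_i - \theta\sigma_v^2\)
and \(F_{iq}\) is a fixed set of Q draws from U[0,1] specific to the individual. Derivatives of \(h(r,\varepsilon_i)\) and log
\(h(r,\varepsilon_i)\) are also estimated by simulation. The JLMS efficiency measure has the simple form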
The final consideration is the method of obtaining the draws. The default method is to use
the random number generators. Since this is a very computation intensive model, it is usually more
efficient to use Halton draws – you can use many fewer Halton draws than random draws to obtain
the same quality results. Halton draws are discussed in Section R24.7. To use Halton draws with
this estimator, add
; Halton
to the command. The number of points for either method is specified with
Thus, the selection operates through the heterogeneity component of the production model, not the
inefficiency. (Thus, observation is not viewed as a function of the level of inefficiency.)
The model is fit by maximum simulated likelihood. To request it, use LIMDEP's usual
format for sample selection models,
The model must be the base case, half normal, with no panel data application, no truncation, or
heteroscedasticity, etc. You may control the simulations with ; Halton and ; Pts for the simulation.
Efficiency and inefficiency estimates are saved as with other models with ; Eff and ; Techeff.
However, observations in the nonselected part of the sample are given missing values (-999) for any
of these computations. The PARTIALS and SIMULATE commands do not inherit the selection
model – these commands are not available after fitting this model.
E62.11.1 Application
The following creates a data set that conforms exactly to the assumptions of the model.
CALC ; Ran(123457) $
SAMPLE ; 1-2000 $
CREATE ; z1 = Rnn(0,1) ; z2 = Rnn(0,1) $
CREATE ; v1 = Rnn(0,1) ; v2 = Rnn(0,1) $
CREATE ; e1 = v1 ; e2 = .7071 * (v1+v2) $
CREATE ; ds = z1 + z2 + e1 ; d = ds > 0 $
CREATE ; u = Abs(Rnn(0,1)) ; x1 = Rnn(0,1) ; x2 = Rnn(0,1) $
CREATE ; y = x1 + x2 + e2 - u $
PROBIT ; Lhs = d ; Rhs = one,z1,z2 ; Hold $
FRONTIER ; Lhs = y ; Rhs = one,x1,x2 ; Selection $
-----------------------------------------------------------------------------
Binomial Probit Model
Dependent variable D
Log likelihood function -825.27526
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
D| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Index function for probability
Constant| .03616 .03525 1.03 .3051 -.03294 .10525
Z1| .96314*** .04604 20.92 .0000 .87291 1.05338
Z2| 1.01534*** .04702 21.59 .0000 .92318 1.10750
--------+--------------------------------------------------------------------
Warning 141: Iterations:current or start estimate of sigma nonpositive
Normal exit: 14 iterations. Status=0, F= 1916.202
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable Y
Log likelihood function -1916.20216
Estimation based on N = 2000, K = 6
Inf.Cr.AIC = 3844.4 AIC/N = 1.922
Variances: Sigma-squared(v)= 1.00545
Sigma-squared(u)= 1.07396
Sigma(u) = 1.03632
Sigma(v) = 1.00272
Sigma = 1.44202
Lambda = 1.03351
Sample Selection/Frontier Model
Murphy/Topel Corrected VC Matrix
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 -1662.32532
Chi-sq=2*[LogL(SF)-LogL(LS)] = -507.754
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
-----------------------------------------------------------------------------
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
Y| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -.04492 .10971 -.41 .6822 -.25994 .17011
X1| 1.00102*** .03357 29.82 .0000 .93522 1.06682
X2| .95627*** .03195 29.93 .0000 .89364 1.01890
Sigma(u)| 1.03632*** .13217 7.84 .0000 .77728 1.29537
Sigma(v)| 1.00272*** .05471 18.33 .0000 .89549 1.10995
Rho(w,v)| .77553*** .06187 12.54 .0000 .65427 .89679
--------+--------------------------------------------------------------------
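As a cross-check on the design of this experiment, the CREATE commands above can be mirrored in a few lines of Python. This is only a sketch: it uses Python's own random number generator, so it will not reproduce LIMDEP's draws from Ran(123457), and the helper names simulate and corr are ours, not part of LIMDEP. It illustrates that the construction implies a correlation of .7071 between the selection and frontier disturbances, with unit standard deviations for v and for the variable underlying u. These are the values against which the estimates of Rho(w,v), Sigma(v) and Sigma(u) above can be compared.

```python
import math
import random


def simulate(n=2000, seed=123457):
    """Generate data following the CREATE commands in the application."""
    rng = random.Random(seed)  # not the same random stream as LIMDEP's Ran()
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        v1, v2 = rng.gauss(0, 1), rng.gauss(0, 1)
        e1 = v1                          # selection disturbance, variance 1
        e2 = 0.7071 * (v1 + v2)          # frontier noise, variance about 1
        d = 1 if z1 + z2 + e1 > 0 else 0
        u = abs(rng.gauss(0, 1))         # half normal inefficiency
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y = x1 + x2 + e2 - u
        rows.append((d, y, e1, e2, u))
    return rows


def corr(a, b):
    """Sample correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


rows = simulate()
rho_hat = corr([r[2] for r in rows], [r[3] for r in rows])
share_selected = sum(r[0] for r in rows) / len(rows)
print(round(rho_hat, 3), round(share_selected, 3))
```

The sample correlation should be close to .7071 and roughly half of the observations should be selected, since the index z1 + z2 + e1 is symmetric around zero.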
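(Note for convenience later, we have moved the scale parameters into the structural model.) To set
up the estimator, we now write w in its conditional on v form,
$$w = \rho v + \sqrt{1-\rho^2}\,h, \quad h \sim N[0,1] \text{ independent of } v.$$
Then,
$$\text{Prob}[d = 1 \text{ or } 0 \mid z, v] = \Phi\left[(2d-1)\,\frac{\alpha'z + \rho v}{\sqrt{1-\rho^2}}\right].$$
For the selected observations, d = 1, conditioned on v, the joint density for y and d is the product of
the marginals since conditioned on v, y and d are independent;
$$f(y, d=1 \mid x, z, v) = f(y \mid x, v)\,\text{Prob}(d = 1 \mid z, v).$$
Conditioned on v, $y = \beta'x + \sigma_v v - \sigma_u u$,
where u is the truncation at zero of a standard normal variable, so $f(u) = 2\phi(u)$, u > 0. The Jacobian of
the transformation from u to y is $1/\sigma_u$, so by the change of variable, the conditional density is
$$f(y \mid x, v) = \frac{2}{\sigma_u}\,\phi\left[\frac{(\beta'x + \sigma_v v) - y}{\sigma_u}\right], \quad (\beta'x + \sigma_v v) - y \ge 0.$$
Therefore,
$$f(y, d=1 \mid x, z, v) = \frac{2}{\sigma_u}\,\phi\left[\frac{(\beta'x + \sigma_v v) - y}{\sigma_u}\right]\Phi\left[\frac{\alpha'z + \rho v}{\sqrt{1-\rho^2}}\right].$$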
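To obtain the unconditional density, it is necessary to integrate v out of the conditional density.
Thus,
$$f(y, d=1 \mid x, z) = \int_v \frac{2}{\sigma_u}\,\phi\left[\frac{\sigma_v v - (y - \beta'x)}{\sigma_u}\right]\Phi\left[\frac{\alpha'z + \rho v}{\sqrt{1-\rho^2}}\right] f(v)\,dv.$$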
The relevant term in the log likelihood is log f(y,d=1|x,z). For the nonselected observations, the
contribution to the log likelihood is the log of the unconditional probability of nonselection, which is
$$\text{Prob}(d = 0 \mid z) = \int_v \Phi\left[\frac{-(\alpha'z + \rho v)}{\sqrt{1-\rho^2}}\right] f(v)\,dv.$$
The integrals do not exist in closed form, so these terms cannot be evaluated as is. Before
proceeding, we note the additional complication, $\beta'x + \sigma_v v - y = \sigma_u u > 0$, so the density f(v) is not the
standard normal that intuition might suggest; it is a truncated normal.
The integrals can be computed by simulation. By construction,
$$\int_v g(v)\,f(v)\,dv = E_v[g(v)],$$
so by sampling from the distribution of v, we can compute the function of v and average to obtain the
integrals. In order to sample the draws on v, we note the implied truncation,
$$v \ge (y - \beta'x)/\sigma_v = \varepsilon/\sigma_v, \quad \varepsilon = y - \beta'x.$$
Draws from the truncated normal can be obtained using result (E-1) in Greene (2011). Let $A_r$ equal a
draw from the uniform (0,1) population. The desired draw from the truncated normal distribution
will be
$$v_r = \Phi^{-1}\left[\Phi(\varepsilon/\sigma_v) + A_r\,\Phi(-\varepsilon/\sigma_v)\right].$$
The simulated log likelihood replaces each integral with an average over the draws,
where the draws on $v_{ir}$ are as shown above. Derivatives of this simulated log likelihood are obtained
numerically using finite differences.
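The inverse probability transform for the truncated draws can be sketched in Python as follows. This is a minimal illustration, not LIMDEP's implementation: the function names are ours, the lower truncation point is passed in directly (it plays the role of the residual divided by the standard deviation of v), and ordinary pseudorandom draws are used where LIMDEP would allow Halton sequences.

```python
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def draw_truncated_below(lower, a):
    """Map a uniform(0,1) draw `a` to a standard normal draw with v >= lower."""
    p = STD.cdf(lower) + a * STD.cdf(-lower)
    # Keep p strictly inside (0,1) so that inv_cdf is defined.
    p = min(max(p, 1e-300), 1.0 - 1e-16)
    return STD.inv_cdf(p)


def simulated_mean(g, lower, n_draws=2000, seed=1):
    """Approximate E[g(v) | v >= lower] by averaging g over truncated draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += g(draw_truncated_below(lower, rng.random()))
    return total / n_draws


# Check against a case with a known answer: E[v | v >= c] = pdf(c) / cdf(-c).
c = 0.5
approx = simulated_mean(lambda v: v, c)
exact = STD.pdf(c) / STD.cdf(-c)
print(approx, exact)
```

In the estimator, the function g would be the product of the density and probability terms in the integrals above, evaluated at each draw and averaged for each observation.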
Heteroscedasticity in v and/or u
Truncated normal with nonzero, heterogeneous mean in the underlying U
Heterogeneity in the parameter of the exponential or gamma distribution
Alvarez et al.'s 'scaling model'
This is a common approach. (See, e.g., Greene (2004a,b).) In this chapter, we present two other
methods of introducing observed heterogeneity in the frontier model, in the variance parameters and
in the mean of the underlying inefficiency.
A like result emerges in the truncated normal model. In the exponential model, the mean of $u_i$ equals its
standard deviation, while in the gamma model, it is a multiple, $P^{1/2}$, of it. Thus, in all cases, as regards
$u_i$, the term heteroscedasticity, while not inappropriate, is nonetheless ambiguous. These models cannot
be heteroscedastic without also having a heterogeneous mean. In what follows, therefore, we continue
to use the familiar terminology, but we emphasize the nature of the model as well.
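The link between the mean and the standard deviation of the one sided term can be verified with the standard moment formulas. The sketch below is ours, not part of LIMDEP; it assumes the usual parameterizations (a half normal built from a normal variable with standard deviation sigma_u, an exponential with rate theta, and a gamma with rate theta and shape P).

```python
import math


def half_normal_moments(sigma_u):
    """Mean and standard deviation of u = |U|, U ~ N[0, sigma_u^2]."""
    mean = sigma_u * math.sqrt(2.0 / math.pi)
    sd = sigma_u * math.sqrt(1.0 - 2.0 / math.pi)
    return mean, sd


def exponential_moments(theta):
    """Mean and standard deviation of an exponential variable with rate theta."""
    return 1.0 / theta, 1.0 / theta


def gamma_moments(theta, p):
    """Mean and standard deviation of a gamma variable with rate theta, shape p."""
    return p / theta, math.sqrt(p) / theta


# Doubling the scale doubles both moments, so the ratio mean/sd never changes.
for sigma_u in (0.5, 1.0, 2.0):
    mean, sd = half_normal_moments(sigma_u)
    print(sigma_u, round(mean / sd, 4))
```

In each case the ratio of the mean to the standard deviation is a constant that does not depend on the scale parameter, which is why a model that lets the scale vary across observations also lets the mean vary.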
The models of scale heterogeneity may extend either variance parameter with the
specification of the variance functions
$$\sigma_{u,i}^2 = \exp(\delta'z_i), \quad \sigma_{v,i}^2 = \exp(\gamma'w_i).$$
There is no requirement that the same variables enter the two functions, and either or both may be
heterogeneous. The model specification is
; Heteroscedasticity or ; Het
and either or both of
; Hfv = variables in the variance of v
; Hfu = variables in the variance of u
If either variance is not given, it is assumed to be constant. The variance function is the exponential
format used throughout LIMDEP. If either variance function is unspecified, the implied model is
$\sigma_{j,i}^2 = \exp(\text{constant})$, j = u or v, which is the same as a constant variance. With neither function
specified, the model is the default, normal-half normal stochastic frontier model. It provides identical estimates. (Try it.)
A constant (one) is automatically inserted into both lists if you do not include it. This form may be
used with the normal-half normal and normal-truncated normal models.
where P = 1 in the exponential model. The exponential heteroscedasticity model for $u_i$ is extended to
these two models by using
$$\theta_i = \exp(-\delta'z_i).$$
With this parameterization, the estimates from this model will be comparable to those for the half
normal and truncated normal models. (See the examples below.) To request this form, use
The list should not contain a constant term, one. This may be used in all implementations of the
exponential and gamma models. Note, however, that in the panel data settings, the parameter is assumed
to be time invariant. The values for $z_i$ are taken from the data record for the last period for firm i.
We will return to this subject below. The symmetric component, v, may also be heteroscedastic, as
in the other models, with
$$\sigma_{v,i}^2 = \exp(\gamma'w_i).$$
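The estimator of inefficiency is
$$\hat{E}[u \mid \varepsilon] = \frac{\sigma\lambda}{1+\lambda^2}\left[\frac{\phi(w)}{1-\Phi(w)} - w\right], \quad \sigma^2 = \sigma_v^2 + \sigma_u^2, \quad w = S\varepsilon\lambda/\sigma$$
for the half normal model and
$$\hat{E}[u \mid \varepsilon] = \sigma_v\left[\frac{\phi(w)}{1-\Phi(w)} - w\right], \quad w = (S\varepsilon/\sigma_v + \theta\sigma_v)$$
for the exponential models. These functions are evaluated for each observation at
$\lambda_i = \sigma_{u,i} / \sigma_{v,i}$
and $\sigma_i^2 = \sigma_{u,i}^2 + \sigma_{v,i}^2$
for the half normal model and $\sigma_{v,i}$ and $\theta_i$ likewise in the exponential and gamma models.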
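The half normal version of this computation can be sketched in Python. The function below is our own illustration of the formula as reconstructed here, not LIMDEP code; the argument names are hypothetical. The standard deviations are passed in per observation, so in the heteroscedastic model they would be the values implied by the exponential variance functions for that observation.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def jlms_half_normal(eps, sigma_u, sigma_v, s=1):
    """JLMS estimate of E[u | eps] for the normal-half normal model.

    eps     : residual y - b'x for the observation
    sigma_u : standard deviation of the variable underlying u
    sigma_v : standard deviation of v
    s       : +1 for a production frontier, -1 for a cost frontier
    """
    lam = sigma_u / sigma_v
    sigma = (sigma_u ** 2 + sigma_v ** 2) ** 0.5
    w = s * eps * lam / sigma
    # 1 - cdf(w) is written as cdf(-w), which is more accurate for large w.
    hazard = STD.pdf(w) / STD.cdf(-w)
    return sigma * lam / (1.0 + lam ** 2) * (hazard - w)


# A residual well below the frontier implies more estimated inefficiency.
for eps in (-1.0, 0.0, 1.0):
    print(eps, round(jlms_half_normal(eps, 1.0, 1.0), 4))
```

For a production frontier the estimate falls as the residual rises, and it stays positive for any residual.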
E63.2.4 Application
The estimates below show a production frontier based on the six inputs. The second set of
results presents the heteroscedastic model, with the variance of v a function of the log of the average
stage length and the variance of u depending on the load factor and the log of the number of points
served. We examine the efficiency results, then compute the average partial effects of the
environmental variables on technical efficiency.
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 108.43918
Estimation based on N = 256, K = 9
Variances: Sigma-squared(v)= .01902
Sigma-squared(u)= .01692
Sigma(v) = .13791
Sigma(u) = .13007
Sigma = Sqr[(s^2(u)+s^2(v)]= .18957
Gamma = sigma(u)^2/sigma^2 = .47074
Var[u]/{Var[u]+Var[v]} = .24425
Stochastic Production Frontier, e = v-u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 108.07431
Chi-sq=2*[LogL(SF)-LogL(LS)] = .730
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -2.98823*** .72136 -4.14 .0000 -4.40206 -1.57439
LL| -.42909*** .06315 -6.79 .0000 -.55287 -.30530
LP| .44533*** .09498 4.69 .0000 .25917 .63149
LF| .37257*** .07038 5.29 .0000 .23463 .51052
LE| 2.09473*** .68790 3.05 .0023 .74647 3.44299
LM| .69910*** .07580 9.22 .0000 .55054 .84766
LK| -2.09806*** .76556 -2.74 .0061 -3.59853 -.59759
|Variance parameters for compound error
Lambda| .94309*** .16870 5.59 .0000 .61244 1.27373
Sigma| .18957*** .00064 297.81 .0000 .18832 .19082
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 149.30854
Estimation based on N = 256, K = 12
Inf.Cr.AIC = -274.6 AIC/N = -1.073
Variances: Sigma-squared(v)= .01292
Sigma-squared(u)= .03575
Sigma(v) = .11367
Sigma(u) = .18907
Sigma = Sqr[(s^2(u)+s^2(v)]= .22061
Gamma = sigma(u)^2/sigma^2 = .73450
Var[u]/{Var[u]+Var[v]} = .50132
Variances averaged over observations
Stochastic Production Frontier, e = v-u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 2
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 3
LogL when sigma(u)=0 108.07431
Chi-sq=2*[LogL(SF)-LogL(LS)] = 82.468
Kodde-Palm C*: 95%: 8.761, 99%: 12.483
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -3.29243*** .72664 -4.53 .0000 -4.71662 -1.86824
LL| -.47507*** .08890 -5.34 .0000 -.64932 -.30083
LP| .50435*** .10452 4.83 .0000 .29950 .70920
LF| .53204*** .07550 7.05 .0000 .38406 .68003
LE| 2.36654*** .69245 3.42 .0006 1.00936 3.72372
LM| .53413*** .08670 6.16 .0000 .36419 .70406
LK| -2.43136*** .77258 -3.15 .0016 -3.94558 -.91713
|Parameters in variance of v (symmetric)
Constant| -3.97891*** .86601 -4.59 .0000 -5.67626 -2.28155
LSTAGE| -.06406 .13359 -.48 .6315 -.32590 .19777
|Parameters in variance of u (one sided)
Constant| 9.96191** 4.51238 2.21 .0273 1.11781 18.80600
LOADFCTR| -25.9711*** 9.37571 -2.77 .0056 -44.3471 -7.5950
POINTS| -.00353 .01288 -.27 .7840 -.02877 .02171
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
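The likelihood ratio statistics reported in the two sets of results can be reproduced from the log likelihood values printed in the output. The short Python sketch below is ours; the only inputs are numbers copied from the output above, including the Kodde-Palm 95% critical values.

```python
def lr_statistic(logl_frontier, logl_ols):
    """Chi-sq = 2*[LogL(SF) - LogL(LS)], as labeled in the output."""
    return 2.0 * (logl_frontier - logl_ols)


LOGL_OLS = 108.07431  # "LogL when sigma(u)=0" in both sets of results

homoscedastic = lr_statistic(108.43918, LOGL_OLS)
heteroscedastic = lr_statistic(149.30854, LOGL_OLS)

# 95% critical values copied from the Kodde-Palm lines of the two outputs.
print(round(homoscedastic, 3), homoscedastic > 2.706)
print(round(heteroscedastic, 3), heteroscedastic > 8.761)
```

On these numbers the base model does not reject the hypothesis of no inefficiency at the 95% level, while the heteroscedastic model does.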
The figure below displays the kernel density estimators for the two sets of estimated
inefficiencies. The upper one is for the heteroscedastic model. The figure shows clearly the
influence of the heterogeneity. The means of the two distributions are virtually the same, but the
variance in the heteroscedastic model is considerably higher.
Figure E63.1 Kernel Estimators for Density of E[u|ε] with and without Heteroscedasticity
---------------------------------------------------------------------
Partial Effects for JLMS Estimator in Normal/het SF Model
Partial Effects Averaged Over Observations
* ==> Partial Effect for a Binary Variable
---------------------------------------------------------------------
Partial Standard
(Delta method) Effect Error |t| 95% Confidence Interval
---------------------------------------------------------------------
LSTAGE -.00034 .00071 .48 -.00174 .00105
LOADFCTR .62934 .17576 3.58 .28485 .97382
POINTS .00009 .00031 .28 -.00052 .00069
---------------------------------------------------------------------
where $\sigma_i^2 = \sigma_{u,i}^2 + \sigma_{v,i}^2$,
$\lambda_i = \sigma_{u,i} / \sigma_{v,i}$,
$\sigma_{u,i}^2 = \exp(\delta'z_i)$,
$\sigma_{v,i}^2 = \exp(\gamma'w_i)$,
where S = +1 for a production frontier and -1 for a cost frontier. Likewise, for the truncation model,
We build the structure of the model with two freely varying variance parameters, $\sigma_{u,i}$ and $\sigma_{v,i}$, rather
than the reduced form parameters $\lambda$ and $\sigma$. The use of $\lambda_i$ as a free parameter would not be
appropriate because the numerator and denominator of $\lambda_i$ must be allowed to vary freely and
independently. A like consideration rules out the composed parameter $\sigma_i$. The formulation of the
log likelihood and its derivatives follows the results given earlier for the homogeneous cases. Where
the derivatives with respect to $\lambda$ and $\sigma$ emerge, we use the chain rule to differentiate with respect to
$\sigma_{u,i}$ and $\sigma_{v,i}$ first. Note that the independent parameters $\sigma_u$ and $\sigma_v$ have been absorbed into the
exponential functions. Thus, $\sigma_v^2$ is $\exp(\gamma_0)$. This ensures that the variances are always positive.
The normal-gamma and normal-exponential models are not reparameterized. The log
likelihood for the exponential model with variance heterogeneity is built with $\theta_i = \exp(-\delta'z_i)$.
The sign change in $\theta_i$ is used to make the normal-exponential model comparable to the normal-half
normal model, since $Var[u_i] = 1/\theta_i^2$.
$$y = \beta'x + v - u, \quad u = |U|,$$
$$U \sim N[\mu, \sigma_u^2],$$
$$v \sim N[0, \sigma_v^2].$$
(With a constant term in the model, no similar parameter can be introduced into the distribution of v.)
The command for estimating this model is the FRONTIER command with
; Model = T
The specification of the cost frontier and the estimator of technical inefficiency are requested in the
same fashion,
; Cost
and ; Eff = variable name
Other optional parts of the command are the same as those for the normal-half normal model.
We note, this model is extremely volatile, owing to the rather weak identification of the
parameter $\mu$. It is difficult to distinguish the mean from the variance parameter in this model. In the
truncation model,
$$E[u_i] = \mu + \sigma_u\,\phi(\mu/\sigma_u)/\Phi(\mu/\sigma_u).$$
This implies that $\sigma_u$ and $\mu$ can covary so as to produce little or no variation in the expectation of $u_i$.
The likelihood is not a function of the square of $u_i$, so this mean is the only source of information
about these two parameters. (By totally differentiating the expected value, one can solve for the
implicit relationship, $d\mu/d\sigma_u$, that produces $dE[u_i] = 0$.) The example below suggests how this aspect
of the model influences (or fails to influence) the estimates of inefficiency. For purposes of the JLMS
estimator for the half normal model, when the mean of U is a nonzero $\mu$, the argument to the
function is replaced with
$$w = S\varepsilon\lambda/\sigma - \mu/(\sigma\lambda).$$
E63.3.1 Application
The results below show estimates of a stochastic cost frontier with the half normal then the
truncated normal specifications. The additional parameterization appears to have had a large impact
on the results; the estimates are noticeably different. The plot of the two sets of inefficiency
estimates suggests that the effect of the new specification has been little more than to double the
estimated values from the model; the dashed line in the figure shows the function $u_{TN} = 2u_{HN}$. The
extremely large estimates of $\mu$ and the standard error do suggest that something is amiss with the
model, however.
The commands are:
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 109.49695
Estimation based on N = 256, K = 10
Variances: Sigma-squared(v)= .01896
Sigma-squared(u)= 2.48813
Sigma(v) = .13771
Sigma(u) = 1.57738
Sigma = Sqr[(s^2(u)+s^2(v)]= 1.58338
Gamma = sigma(u)^2/sigma^2 = .99244
Var[u]/{Var[u]+Var[v]} = .97946
Stochastic Production Frontier, e = v-u
Half Normal:u(i)=|U(i)|; frontier model
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 108.07431
Chi-sq=2*[LogL(SF)-LogL(LS)] = 2.845
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -3.11541*** .77143 -4.04 .0001 -4.62739 -1.60343
LL| -.44532*** .07797 -5.71 .0000 -.59814 -.29249
LP| .46908*** .11368 4.13 .0000 .24628 .69188
LF| .37437*** .07465 5.02 .0000 .22807 .52068
LE| 2.20830*** .73883 2.99 .0028 .76023 3.65637
LM| .67741*** .09341 7.25 .0000 .49433 .86048
LK| -2.20620*** .82402 -2.68 .0074 -3.82126 -.59115
|Offset [mean=mu(i)] parameters in one sided error
Mu| -31.5468 5061.203 -.01 .9950 -9951.3228 9888.2292
|Variance parameters for compound error
Lambda| 11.4545 907.8501 .01 .9899 -1767.8991 1790.8081
Sigma| 1.58338 124.7546 .01 .9899 -242.93113 246.09790
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
Descriptive Statistics
--------+---------------------------------------------------------------------
Variable| Mean Std.Dev. Minimum Maximum Cases Missing
--------+---------------------------------------------------------------------
U| .902312 .035500 .703534 .963108 256 0
UT| .925474 .039335 .608274 .972355 256 0
--------+---------------------------------------------------------------------
$$y = \beta'x + v - u, \quad u = |U|,$$
$$U \sim N[\mu, \sigma_u^2],$$
$$v \sim N[0, \sigma_v^2]$$
is due to Stevenson (1980). Note that the inefficiency term is the absolute value of a normally
distributed variable with a nonzero mean. Battese and Coelli proposed an apparently different
formulation of the truncation model:
$$u = \mu + w,$$
$$w > -\mu.$$
This is actually the same model. You can obtain the estimates using this alternative formulation with
; Model = BC95
in place of ; Model = T. The log likelihood for this formulation involves a one to one
reparameterization of the Stevenson model, which has slightly different numerical properties. You
can see this in the application below. The estimated inefficiency and efficiency values produced by
the two models are the same to five or six digits, however.
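The variance parameters reported by the two formulations are related by a one to one mapping, which the following Python sketch makes explicit. It assumes, following the labels in the output, that the Stevenson form reports Lambda (the ratio of the two standard deviations) and Sigma, while the BC95 form reports Gamma (the share of the variance of the underlying U in the total) and SigmaSqd. The function names are ours, and the sketch covers only these variance parameters, not the full log likelihood.

```python
import math


def stevenson_to_bc(lam, sigma):
    """Map (Lambda, Sigma) to (Gamma, SigmaSqd)."""
    gamma = lam ** 2 / (1.0 + lam ** 2)
    return gamma, sigma ** 2


def bc_to_stevenson(gamma, sigma_sq):
    """Map (Gamma, SigmaSqd) back to (Lambda, Sigma)."""
    lam = math.sqrt(gamma / (1.0 - gamma))
    return lam, math.sqrt(sigma_sq)


# Lambda and Sigma as reported in the Stevenson results in this section.
gamma, sigma_sq = stevenson_to_bc(11.4545, 1.58338)
print(round(gamma, 5), round(sigma_sq, 5))
print(bc_to_stevenson(gamma, sigma_sq))
```

Applied to the Stevenson estimates, the mapping reproduces the Gamma value printed in the header of that output, and the inverse mapping returns the original pair.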
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 109.48819
Variances: Sigma-squared(v)= .01918
Sigma-squared(u)= 2.25705
Sigma(v) = .13850
Sigma(u) = 1.50235
Sigma = Sqr[(s^2(u)+s^2(v)]= 1.50872
Gamma = sigma(u)^2/sigma^2 = .99157
Var[u]/{Var[u]+Var[v]} = .97715
Stochastic Production Frontier, e = v-u
Battese/Coelli 1995 truncated normal model
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 1
Deg. freedom for inefficiency model: 2
LogL when sigma(u)=0 108.07431
Chi-sq=2*[LogL(SF)-LogL(LS)] = 2.828
Kodde-Palm C*: 95%: 5.138, 99%: 8.273
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -3.09929*** .76919 -4.03 .0001 -4.60687 -1.59172
LL| -.44370*** .07771 -5.71 .0000 -.59600 -.29140
LP| .46535*** .11351 4.10 .0000 .24288 .68781
LF| .37430*** .07432 5.04 .0000 .22863 .51997
LE| 2.18991*** .73664 2.97 .0030 .74613 3.63369
LM| .67921*** .09322 7.29 .0000 .49651 .86191
LK| -2.18647*** .82171 -2.66 .0078 -3.79700 -.57594
|Offset [mean=z(i)*delta] parameters in one sided error
Constant| -29.6062 4821.053 -.01 .9951 -9478.6972 9419.4848
|Variance parameters for compound error
Gamma| .99157 1.34377 .74 .4606 -1.64216 3.62531
SigmaSqd| 2.27624 363.5754 .01 .9950 -710.31839 714.87086
--------+--------------------------------------------------------------------
(Stevenson formulation)
Log likelihood function 94.86417
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -3.11541*** .77143 -4.04 .0001 -4.62739 -1.60343
LL| -.44532*** .07797 -5.71 .0000 -.59814 -.29249
LP| .46908*** .11368 4.13 .0000 .24628 .69188
LF| .37437*** .07465 5.02 .0000 .22807 .52068
LE| 2.20830*** .73883 2.99 .0028 .76023 3.65637
LM| .67741*** .09341 7.25 .0000 .49433 .86048
LK| -2.20620*** .82402 -2.68 .0074 -3.82126 -.59115
|Offset [mean=mu(i)] parameters in one sided error
Mu| -31.5468 5061.203 -.01 .9950 -9951.3228 9888.2292
|Variance parameters for compound error
Lambda| 11.4545 907.8501 .01 .9899 -1767.8991 1790.8081
Sigma| 1.58338 124.7546 .01 .9899 -242.93113 246.09790
$$\sigma_u = \sigma\lambda / \sqrt{1+\lambda^2}.$$
= /()
produces the log likelihood for this model,
$\mu_i = \delta'z_i$
we simply replace $\mu$ with $\mu_i = \delta'z_i$, then recover the parameter vector $\delta$ from the same transformation
as before.
For purposes of the JLMS estimator for the half normal model, when the mean of U is a
nonzero $\mu$, the argument to the function is replaced with
$$w = S\varepsilon\lambda/\sigma - \mu/(\sigma\lambda).$$
Use
; Rh2 = list of variables
to specify the heterogeneity in mean model, $U_i \sim N[\delta'z_i, \sigma_u^2]$. In formulating this model, though it is
not required, you should include a constant in $z_i$ (the Rh2 variables) so that the homogeneous model
becomes a special case. Also, if you are fitting a panel data version of this, note that the assumption
underlying the model is that the same $u_i$ occurs in every period. Therefore, the $z_i$ should be the
same in every period. LIMDEP will assume this is the case, and only use the Rh2 variables provided
for the first period.
Note that since both variance functions have a free multiplicative constant, you should not include
one in either variable list.
In the absence of the Rh2 list, the mean of the underlying truncated variable is taken to be a
constant to be estimated. This formulation encompasses all of Stevenson (1980), Reifschneider and
Stevenson (1991), Huang and Liu (1994), and Battese and Coelli (1995). (Notwithstanding the
assertion in the Battese and Coelli paper, the latter is not a panel data treatment as observations are
still assumed to be independent.)
To illustrate the truncated normal estimator, we have refit the stochastic frontier production
function with a complete set of firm dummy variables (less the last one) and the load factor variable
in the mean of the underlying distribution. In the second model below, we have made the variance
of v a function of the log of the average stage length. The command set begins with a small repair to
the data set. One of the firms has no observations for the load factor, stage length or points served
variables – they are coded as zero in the data. These observations are bypassed, then the firm
dummies for the fixed effects model are assembled.
SAMPLE ; All $
REJECT ; loadfctr = 0 $
CREATE ; i = Seq(firm) $
CREATE ; Expand(i,0) $
CREATE ; lk = Log(k) $
NAMELIST ; xp = one,lf,lm,le,ll,lp,lk $
FRONTIER ; Lhs = lq ; Rhs = xp ; Model = T ; Rh2 = loadfctr,_i_ $
FRONTIER ; Lhs = lq ; Rhs = xp ; Model = T ; Rh2 = loadfctr,_i_
; Het ; Hfv = lstage $
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 196.20748
Estimation based on N = 256, K = 34
Inf.Cr.AIC = -324.4 AIC/N = -1.267
Model estimated: Aug 22, 2011, 22:29:09
Variances: Sigma-squared(v)= .00960
Sigma-squared(u)= .00389
Sigma(v) = .09799
Sigma(u) = .06241
Sigma = Sqr[(s^2(u)+s^2(v)]= .11618
Gamma = sigma(u)^2/sigma^2 = .28856
Var[u]/{Var[u]+Var[v]} = .12845
Stochastic Production Frontier, e = v-u
Half Normal:u(i)=|U(i)|; frontier model
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 25
Deg. freedom for inefficiency model: 26
LogL when sigma(u)=0 108.07431
Chi-sq=2*[LogL(SF)-LogL(LS)] = 176.266
Kodde-Palm C*: 95%:38.301, 99%: 45.026
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -2.92400*** .68225 -4.29 .0000 -4.26118 -1.58682
LF| .31938*** .09026 3.54 .0004 .14246 .49629
LM| .81647*** .08387 9.73 .0000 .65209 .98086
LE| 1.99934*** .64368 3.11 .0019 .73776 3.26092
LL| -.42790*** .10954 -3.91 .0001 -.64260 -.21321
LP| .42291*** .10529 4.02 .0001 .21654 .62929
LK| -2.07145*** .72267 -2.87 .0042 -3.48786 -.65503
|Offset [mean=mu(i)] parameters in one sided error
LOADFCTR| -.83124 6.87337 -.12 .9037 -14.30280 12.64031
I01| .63250 4.90139 .13 .8973 -8.97405 10.23904
I02| .58118 4.27763 .14 .8919 -7.80282 8.96519
(Firms 3-21 omitted)
I22| .45249 4.00889 .11 .9101 -7.40480 8.30977
I23| .64687 99.45841 .01 .9948 -194.28803 195.58176
I24| -.19804 7.26011 -.03 .9782 -14.42760 14.03152
|Variance parameters for compound error
Lambda| .63686** .28984 2.20 .0280 .06879 1.20494
Sigma| .11618*** .01008 11.53 .0000 .09643 .13593
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 215.58601
Estimation based on N = 256, K = 35
Variances: Sigma-squared(v)= .00634
Sigma-squared(u)= .01037
Sigma(u) = .10183
Sigma(v) = .07961
Sigma = Sqr[(s^2(u)+s^2(v)]= .12926
Variances averaged over observations
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 25
Deg. freedom for inefficiency model: 26
LogL when sigma(u)=0 108.07431
Chi-sq=2*[LogL(SF)-LogL(LS)] = 215.023
Kodde-Palm C*: 95%:38.301, 99%: 45.026
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -1.98442* 1.05055 -1.89 .0589 -4.04346 .07463
LF| .45669*** .11002 4.15 .0000 .24105 .67233
LM| .59013*** .10421 5.66 .0000 .38589 .79437
LE| 1.11856 1.00928 1.11 .2677 -.85959 3.09671
LL| -.29237*** .10923 -2.68 .0074 -.50646 -.07827
LP| .31311** .14333 2.18 .0289 .03220 .59402
LK| -1.14743 1.10875 -1.03 .3007 -3.32054 1.02568
|Mean of underlying truncated distribution
LOADFCTR| -2.20067*** .42161 -5.22 .0000 -3.02701 -1.37433
I01| 1.44767*** .25736 5.63 .0000 .94326 1.95208
I02| 1.39624*** .22401 6.23 .0000 .95718 1.83529
(Firms 3-22 omitted)
I24| 1.29355*** .24998 5.17 .0000 .80360 1.78349
|Scale parms. for random components of e(i)
ln_sgmaU| -2.28443*** .02100 -108.79 .0000 -2.32559 -2.24328
ln_sgmaV| -3.22203*** 1.20573 -2.67 .0075 -5.58522 -.85884
|Heteroscedasticity in variance of symmetric v(i)
LSTAGE| .11855 .19755 .60 .5485 -.26865 .50574
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
The mean and standard deviation of the underlying truncated normal variable $u_i$ are scaled by the
same linear function of the data. We are skeptical of the linear scaling of the variance, and propose
our usual exponential form instead. The linear form may be natural for the mean, but it allows the
variance to be negative, which is unacceptable. The model used here scales the mean and the standard
deviation by exponential functions of the data, each with its own coefficient vector.
The Alvarez model results if the two coefficient vectors are equal. Otherwise, we allow these to be free and to produce another
variant of the frontier model. Note that as stated, this model is now merely a variant of the
normal-truncated normal model with heteroscedasticity in which the variables enter the truncation mean
function in the exponential function rather than linearly.
The equality constrained scaling model is requested with ; Model = Scaling.
Note in this case, Rh2 and Hfu give the same list. To obtain the scaling model without forcing the
equality of the two coefficient vectors, use ; Model = S.
Note, ; Model = Scaling in the equality constrained case and ; Model = S when the equality
constraint is relaxed. (In this formulation, the variable lists could differ.) Constraining the coefficients
in the mean function to zero just produces the heteroscedasticity model.
To constrain the coefficients in the variance function to zero, you would use the available setup for the
truncated normal form, but ; Model = S rather than ; Model = T to obtain the exponential scaling of the mean.
Finally, with both sets of coefficients equal to zero, this is just the standard normal-truncated normal model.
Technical Details
The implementation of the scaling model in LIMDEP is just a version of the truncation
model with heteroscedasticity. The modifications of that model are:
The constant terms in the mean and variance are enforced by the program.
The mean function is exponential.
In the first form of the model, a constraint is imposed that the coefficients in the mean and
variance functions are the same.
As Alvarez et al. note in their paper, this model is not supported by any particular theory of the
frontier framework. They suggest it as a natural extension of the familiar model with truncation.
Rather, they argue that the unnatural form of the model would be the one with different scaling
factors in the mean and variance functions.
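The structure described in these technical details can be sketched in Python. Everything in the block is an illustrative assumption rather than LIMDEP's internal code: we take the mean and standard deviation of the underlying distribution to be base values (mu0 and sigma_u0, named after the Mu_0 and Sigmau_0 rows of the output) multiplied by exponential functions of the data, and the coefficient vectors delta and gamma and the data values are hypothetical.

```python
import math


def scaled_moments(mu0, sigma_u0, delta, gamma, z):
    """Mean and standard deviation of the underlying U for one observation.

    delta : coefficients in the exponential scale of the mean (assumed name)
    gamma : coefficients in the exponential scale of the std. dev. (assumed name)
    z     : the observation's values of the scaling variables
    """
    mean_scale = math.exp(sum(d * zi for d, zi in zip(delta, z)))
    sd_scale = math.exp(sum(g * zi for g, zi in zip(gamma, z)))
    return mu0 * mean_scale, sigma_u0 * sd_scale


# Hypothetical data: (load factor, log stage length) for three observations.
data = [(0.45, 5.8), (0.55, 6.4), (0.65, 7.1)]
coef = (-0.5, -0.01)  # same vector in both functions: the constrained case

for z in data:
    mu_i, sigma_i = scaled_moments(2.5, 0.5, coef, coef, z)
    print(round(mu_i, 4), round(sigma_i, 4), round(mu_i / sigma_i, 4))
```

When the two coefficient vectors are equal, the ratio of the mean to the standard deviation is the same for every observation, so the data change only the scale of the inefficiency distribution and not its shape. Relaxing the constraint lets the ratio vary with the data.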
Application
To illustrate the scaling model, we use the airlines cost data. The cost function is fit with
truncation mean and variance functions that depend on the load factor and (log of) the average stage
length. The equality constraint is imposed in the first model and relaxed in the second.
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LC
Log likelihood function 172.27160
Estimation based on N = 256, K = 13
Variances: Sigma-squared(v)= .01528
Sigma-squared(u)= .00000
Sigma(v) = .12361
Sigma(u) = .00169
Sigma = Sqr[(s^2(u)+s^2(v)]= .12363
Stochastic Frontier Scaling Model
Mean scale factor for E[u] = .6996
Mean scale factor for V[u] = .6996
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 2
Deg. freedom for truncation mean: 2
Deg. freedom for inefficiency model: 5
LogL when sigma(u)=0 157.91523
Chi-sq=2*[LogL(SF)-LogL(LS)] = 28.713
Kodde-Palm C*: 95%:10.371, 99%: 14.325
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 18.9477 27.00668 .70 .4829 -33.9844 71.8798
LY| .95234*** .02117 44.98 .0000 .91084 .99383
LY2| .07740*** .01534 5.04 .0000 .04733 .10747
LPKP| 1.50434 1.86479 .81 .4198 -2.15058 5.15926
LPLP| .12682 .08328 1.52 .1278 -.03640 .29003
LPMP| -.16640 1.21907 -.14 .8914 -2.55574 2.22294
LPEP| -.52809 .60356 -.87 .3816 -1.71105 .65488
LPFP| .00151 .02141 .07 .9436 -.04045 .04348
|Mean of Truncated Distribution, Mu then scale
Mu_0| 2.50985 11.12070 .23 .8214 -19.28633 24.30603
LOADFCTR| -.56559 3.85231 -.15 .8833 -8.11597 6.98479
LSTAGE| -.00823 .05624 -.15 .8837 -.11845 .10200
|Standard Deviation of u: Sigma(u) then scale
Sigmau_0| .00241 9.18604 .00 .9998 -18.00191 18.00673
LOADFCTR| -.56559 3.85231 -.15 .8833 -8.11597 6.98479
LSTAGE| -.00823 .05624 -.15 .8837 -.11845 .10200
|Standard deviation of v
Sigma(v)| .12361 .08711 1.42 .1559 -.04713 .29435
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LC
Log likelihood function 173.52520
Estimation based on N = 256, K = 15
Variances: Sigma-squared(v)= .01334
Sigma-squared(u)= .00121
Sigma(v) = .11551
Sigma(u) = .03476
Sigma = Sqr[(s^2(u)+s^2(v)]= .19230
Stochastic Frontier Scaling Model
Mean scale factor for E[u] = .3459
Mean scale factor for V[u] = .2261
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 2
Deg. freedom for truncation mean: 2
Deg. freedom for inefficiency model: 5
LogL when sigma(u)=0 157.91523
Chi-sq=2*[LogL(SF)-LogL(LS)] = 31.220
Kodde-Palm C*: 95%:10.371, 99%: 14.325
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 11.6452 24.94703 .47 .6406 -37.2501 60.5405
LY| .94078*** .02140 43.97 .0000 .89884 .98272
LY2| .06680*** .01579 4.23 .0000 .03585 .09776
LPKP| .85146 1.94378 .44 .6614 -2.95828 4.66120
LPLP| .16345** .07956 2.05 .0399 .00751 .31939
LPMP| .25417 1.26886 .20 .8412 -2.23275 2.74109
LPEP| -.34167 .62932 -.54 .5872 -1.57511 .89178
LPFP| .00164 .02164 .08 .9395 -.04078 .04406
|Mean of Truncated Distribution, Mu then scale
Mu_0| 1.92288*** .44030 4.37 .0000 1.05991 2.78584
LOADFCTR| -1.74305 4.08382 -.43 .6695 -9.74720 6.26110
LSTAGE| -.01930 .04649 -.42 .6781 -.11042 .07182
|Standard Deviation of u: Sigma(u) then scale
Sigmau_0| .15374 1.11571 .14 .8904 -2.03301 2.34049
LOADFCTR| -14.5014 10.21457 -1.42 .1557 -34.5216 5.5188
LSTAGE| 1.02454 1.26499 .81 .4180 -1.45479 3.50388
|Standard deviation of v
Sigma(v)| .11551*** .00793 14.56 .0000 .09996 .13106
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
$$y = \beta'x + v - u,$$
where y is the observed outcome (goal attainment), $\beta'x + v$ is the optimal, frontier goal (e.g.,
maximal production output or minimum cost) pursued by the individual, $\beta'x$ is the deterministic part
of the frontier and $v \sim N[0,\sigma_v^2]$ is the stochastic part. The two parts together constitute the
'stochastic frontier.' The amount by which the observed individual fails to reach the optimum (the
frontier) is u, where
$$u = |U| \text{ and } U \sim N[0,\sigma_u^2]$$
(change to v + u for a stochastic cost frontier or any setting in which the optimum is a minimum). In
this context, u is the 'inefficiency.' This is the normal-half normal model, which forms the basic
form of the stochastic frontier model. Chapters E62 and E63 developed several versions of the
stochastic frontier model suitable for cross section and pooled data sets. This chapter will develop
versions of the model constructed specifically for panel data.
The panel models developed here will share features with other panel models in LIMDEP, as
presented in Chapters R22-R25. As in other settings, panels in all models may be unbalanced. Panels
are identified by
SETPANEL ; … $
then ; Panel
in the command, or ; Pds = group count
Nearly all of the models to be presented here actually require panel data, but a few will work, albeit
not as well as otherwise, with ; Pds = 1, i.e., with a cross section. This will be specifically noted
below when it is the case. In all models, the cost form as opposed to the production form is
requested with
; Cost
This and other model specifications are generally the same as in the cross sectional cases.
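The Pitt and Lee random effects model is
$$y_{it} = \beta'x_{it} + v_{it} - S u_i,$$
with S = +1 for a production model and -1 for a cost model. The inefficiency component is assumed
to be time invariant. The base case is the normal-half normal model
$$u_i = |U_i|, \quad U_i \sim N[0,\sigma_u^2].$$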
This is a direct extension of the cross section variant discussed earlier. Several model formulations
are grouped in this class. The command for the Pitt and Lee group of models is given by changing
the base case specifications to
Pitt and Lee is the default panel data model. The only necessary change for the default case is
specification of the panel with ; Panel. As in the cross section case, the normal-exponential case is
requested with
; Model = Exponential
(The ; Model = T is not needed.) The truncation model may not be combined with the exponential
specification; it is only supported for the normal-truncated normal form.
NOTE: The gamma model does not have a random effects (panel data) version. The model
extensions, such as the scaling model and sample selection described in Chapter E63 likewise do not
support a Pitt and Lee style random effects version.
There is an important consideration for the truncation version with heterogeneous mean. If
you are fitting a panel data version of this model, note that the assumption underlying the model is
that the same ui occurs in every period. Therefore, the zi must be the same in every period.
LIMDEP will assume this is the case, and only use the Rh2 variables provided for the first period.
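Because only the first period's Rh2 values are used, it can be worth confirming beforehand that each such variable really is constant within every group. The following is a minimal sketch in Python, not LIMDEP code; the function name and the toy data are hypothetical and serve only to illustrate the check.

```python
from collections import defaultdict

def time_varying_groups(ids, z):
    """Return the sorted ids of groups in which z takes more than one value."""
    seen = defaultdict(set)
    for group_id, value in zip(ids, z):
        seen[group_id].add(value)
    return sorted(g for g, values in seen.items() if len(values) > 1)

# Hypothetical toy panel: firm 1 has three periods, firm 2 has two.
ids = [1, 1, 1, 2, 2]
z_ok = [0, 0, 0, 1, 1]    # constant within each firm
z_bad = [0, 0, 0, 1, 0]   # changes within firm 2

print(time_varying_groups(ids, z_ok))    # []
print(time_varying_groups(ids, z_bad))   # [2]
```

An empty list means the variable satisfies the time invariance that the model assumes for the mean of the truncated distribution.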
When the random effects model is estimated, maximum likelihood estimates of the cross
section models are always computed first to obtain the starting values. This will produce a full set of
results which will ignore the panel nature of the data set. A second full set of results will then follow
for the random effects model.
The model estimates retained for all cases are
Use ; Par to retain the additional parameters in b and varb. As seen in the applications below, the
parameters estimated in each case will differ depending on the model formulation. The ancillary
parameters that are estimated for the various models are the same ones saved by the cross section
versions. All models save sy, ybar, nreg, kreg, and logl as well as s, b, varb, etc.
WARNING: Numerous experiments and applications have suggested that the normal-truncated
normal model is a difficult one to estimate. Identification appears to be highly variable, and small
variations in the data can produce large variation in the results. The model often fails to converge
even when convergence of the restricted model with zero underlying mean is routine.
NAMELIST ; x = one, … $
CREATE ; y = the outcome variable $
SETPANEL ; … $
Model 1 = pooled
FRONTIER ; Lhs = y ; Rhs = x $
Model 2 = random effects half normal
FRONTIER ; Lhs = y ; Rhs = x ; Panel $
Model 3 = random effects exponential
FRONTIER ; Lhs = y ; Rhs = x ; Panel ; Model = Exponential $
Model 4 = random effects normal heteroscedastic in u or v only
FRONTIER ; Lhs = y ; Rhs = x ; Panel ; Het ; Hfv = … $
FRONTIER ; Lhs = y ; Rhs = x ; Panel ; Het ; Hfu = … $
Model 5 = random effects normal doubly heteroscedastic
FRONTIER ; Lhs = y ; Rhs = x ; Panel ; Het ; Hfv = … ; Hfu = … $
Model 6 = random effects truncated normal
FRONTIER ; Lhs = y ; Rhs = x ; Panel ; Rh2 = one, … $
Model 7 = random effects truncated normal, singly or doubly heteroscedastic
FRONTIER ; Lhs = y ; Rhs = x ; Panel ; Rh2 = one, …
; Het ; Hfv = … ; Hfu = … $
The Pitt and Lee model forms assume that the inefficiency is time invariant. Thus, the
estimate of ui is repeated for each observation in the group. An example below illustrates.
E64.3.2 Applications
The following illustrates a few of the numerous formats of the random effects frontiers. The
data set used is the Swiss railroad data used in Greene (2011, Table F19.1). These data are provided
with the program as swissrailroads.lpj. The variables used here are
ct = total cost
pk = capital price
pe = electricity price
pl = labor price
q2 = passenger output – passenger km
q3 = freight output – ton km
rack = dummy variable for 'rack rail' in network
tunnel = dummy variable for network with tunnels over 300 meters on average
virage = dummy variable for networks with narrow radius curvature
narrow_t = dummy variable for narrow track (1m as opposed to standard 1.435m).
Preparing the data set includes bypassing one firm for which there is only a single year of data. For
the remaining 49 firms, Ti is a mixture of 3, 7, 10, 12 or 13. Figure E64.1 details the distribution of
group sizes.
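The data preparation step just described (drop groups with a single period, then tabulate the remaining group sizes) can be sketched outside LIMDEP as follows. This is an illustrative Python fragment with hypothetical firm identifiers, not the commands used to build the data set.

```python
from collections import Counter

def drop_single_period_groups(ids):
    """Keep only observations whose group appears more than once."""
    sizes = Counter(ids)
    kept = [i for i in ids if sizes[i] > 1]
    return kept, Counter(kept)

# Hypothetical firm identifiers: firm 2 has a single year of data.
ids = [1, 1, 1, 2, 3, 3]
kept, sizes = drop_single_period_groups(ids)

# Number of firms observed for each group size T_i.
size_distribution = Counter(sizes.values())

print(kept)                      # [1, 1, 1, 3, 3]
print(dict(size_distribution))   # {3: 1, 2: 1}
```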
Descriptive statistics for the data are shown below. Variables with names beginning with 'M' are
firm means, repeated for each year for the firm.
We fit four models to illustrate the estimator, the pooled normal-half normal, pooled normal-
truncated (heterogeneous), basic Pitt and Lee and a full model with time invariant inefficiency,
truncation (heterogeneous) and double heteroscedasticity.
--------+---------------------------------------------------------------------
Variable| Mean Std.Dev. Minimum Maximum Cases Missing
--------+---------------------------------------------------------------------
ID| 25.48760 14.60037 1.0 51.0 605 0
YEAR| 90.91570 3.692372 85.0 97.0 605 0
NI| 12.58347 1.305259 1.0 13.0 605 0
STOPS| 20.42479 18.48285 4.0 121.0 605 0
NETWORK| 39431.66 56642.38 3898.0 376997.0 605 0
LABOREXP| 12801.95 26232.69 951.0 173549.0 605 0
STAFF| 170.3810 333.0317 11.0 1934.0 605 0
ELECEXP| 968.1521 1944.830 14.0 14737.0 605 0
KWH| 7602.221 15608.39 82.0 104923.0 605 0
TOTCOST| 22470.44 42283.57 1534.0 280871.0 605 0
NARROW_T| .676033 .468375 0.0 1.0 605 0
RACK| .234711 .424169 0.0 1.0 605 0
TUNNEL| .188430 .391379 0.0 1.0 605 0
T| 5.915702 3.692372 0.0 12.0 605 0
Q1| 813914.0 1083923 61000.0 6409000 605 0
Q2| .308145D+08 .550599D+08 409000.0 .311000D+09 605 0
Q3| .101934D+08 .527303D+08 150.0 .477000D+09 605 0
CT| 26728.37 49883.51 2120.968 307433.4 605 0
PL| 86051.77 6484.535 60932.91 104930.4 605 0
PE| .157485 .022766 .076344 .265182 605 0
PK| 4534.491 2128.307 1040.323 14466.06 605 0
VIRAGE| .715702 .451452 0.0 1.0 605 0
LABOR| 52.40245 9.598136 20.03025 73.11581 605 0
ELEC| 4.044504 1.422098 .568412 9.311660 605 0
CAPITAL| 43.55305 9.461303 23.88916 77.33154 605 0
LNCT| 11.30622 1.101691 9.462956 14.57019 605 0
LNQ1| 13.06322 1.010039 11.01863 15.67321 605 0
LNQ2| 16.31759 1.339167 12.92147 19.55500 605 0
LNQ3| 12.49439 2.716709 5.010635 19.98343 605 0
LNNET| 3.200860 .908512 1.360464 5.932237 605 0
LNPL| 13.21935 .163565 12.60449 13.77599 605 0
LNPE| -1.859557 .152870 -2.572503 -1.327338 605 0
LNPK| 10.17950 .438886 8.740266 11.37466 605 0
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LNC
Log likelihood function -209.42340
Estimation based on N = 604, K = 7
Inf.Cr.AIC = 432.8 AIC/N = .717
Variances: Sigma-squared(v)= .07332
Sigma-squared(u)= .12333
Sigma(v) = .27077
Sigma(u) = .35119
Sigma = Sqr[(s^2(u)+s^2(v)]= .44345
Gamma = sigma(u)^2/sigma^2 = .62716
Var[u]/{Var[u]+Var[v]} = .37937
Stochastic Cost Frontier Model, e = v+u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 -210.45352
Chi-sq=2*[LogL(SF)-LogL(LS)] = 2.060
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LNC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -10.0907*** 1.14284 -8.83 .0000 -12.3306 -7.8507
LNQ2| .64179*** .01371 46.80 .0000 .61491 .66867
LNQ3| .06855*** .00655 10.46 .0000 .05570 .08139
LPLE| .53971*** .08858 6.09 .0000 .36610 .71333
LPKE| .26045*** .03260 7.99 .0000 .19655 .32435
|Variance parameters for compound error
Lambda| 1.29697*** .13854 9.36 .0000 1.02545 1.56850
Sigma| .44345*** .00056 789.05 .0000 .44235 .44455
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
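The likelihood ratio statistic in the output above can be reproduced by hand from the two reported log likelihoods. The short Python sketch below does the arithmetic and compares the result with the Kodde-Palm critical values printed in the same output; the function and variable names are illustrative only.

```python
def lr_statistic(logl_sf, logl_ols):
    """Chi-sq = 2*[LogL(SF) - LogL(LS)], as labeled in the output."""
    return 2.0 * (logl_sf - logl_ols)

# Values copied from the pooled cost frontier output above.
LOGL_SF = -209.42340
LOGL_OLS = -210.45352
KODDE_PALM_1DF = {"95%": 2.706, "99%": 5.412}

chi_sq = lr_statistic(LOGL_SF, LOGL_OLS)
reject_95 = chi_sq > KODDE_PALM_1DF["95%"]

print(round(chi_sq, 3), reject_95)   # 2.06 False
```

For the pooled model the statistic falls short of the 95% critical value, so the hypothesis of no inefficiency is not rejected in this specification.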
This is the original Pitt and Lee normal-half normal model with time invariant inefficiency.
In comparison to the pooled model above, $\sigma_u$ has nearly tripled (from 0.351 to 0.961) and
$\sigma_v$ has decreased by about two thirds (from 0.271 to 0.079). The assumption of time invariance
of the inefficiency produces a large reallocation of the random components between noise and
inefficiency. This is evident in the kernel estimate below as well.
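The derived quantities in the variance block of the output (Lambda, Sigma, Gamma and the variance share) follow from the two estimated variances. The Python sketch below recomputes them for the pooled and Pitt and Lee results, assuming the half normal variance formula Var[u] = (1 - 2/pi) sigma_u^2; the function name is hypothetical, and small discrepancies from the printed values reflect rounding of the inputs.

```python
import math

def variance_summary(sigma2_u, sigma2_v):
    """Derived variance parameters for the normal-half normal model."""
    var_u = (1.0 - 2.0 / math.pi) * sigma2_u   # variance of u = |U|
    return {
        "lambda": math.sqrt(sigma2_u / sigma2_v),       # sigma(u)/sigma(v)
        "sigma": math.sqrt(sigma2_u + sigma2_v),
        "gamma": sigma2_u / (sigma2_u + sigma2_v),
        "var_share": var_u / (var_u + sigma2_v),        # Var[u]/{Var[u]+Var[v]}
    }

# Sigma-squared(u) and Sigma-squared(v) as reported in the two outputs.
pooled = variance_summary(0.12333, 0.07332)
pitt_lee = variance_summary(0.92297, 0.00621)

for name, result in (("pooled", pooled), ("Pitt and Lee", pitt_lee)):
    print(name, {key: round(value, 4) for key, value in result.items()})
```

The jump in Gamma from roughly 0.63 to 0.99 is the reallocation from noise to inefficiency described in the text.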
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LNC
Log likelihood function 527.11659
Estimation based on N = 604, K = 7
Inf.Cr.AIC = -1040.2 AIC/N = -1.722
Stochastic frontier based on panel data
Estimation based on 49 individuals
Variances: Sigma-squared(v)= .00621
Sigma-squared(u)= .92297
Sigma(v) = .07879
Sigma(u) = .96071
Sigma = Sqr[(s^2(u)+s^2(v)]= .96394
Gamma = sigma(u)^2/sigma^2 = .99332
Var[u]/{Var[u]+Var[v]} = .98183
Stochastic Cost Frontier Model, e = v+u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 -210.45352
Chi-sq=2*[LogL(SF)-LogL(LS)] = 1475.140
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LNC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -7.25643*** .24767 -29.30 .0000 -7.74185 -6.77101
LNQ2| .36259*** .01503 24.12 .0000 .33312 .39205
LNQ3| .01902*** .00240 7.94 .0000 .01432 .02372
LPLE| .64148*** .02112 30.38 .0000 .60009 .68287
LPKE| .30842*** .00700 44.08 .0000 .29471 .32214
|Variance parameters for compound error
Lambda| 12.1932** 5.55909 2.19 .0283 1.2975 23.0888
Sigma(u)| .96071*** .13303 7.22 .0000 .69998 1.22145
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
This is the same model as immediately above, with the additional assumption that the
inefficiency is time invariant. Compared to the previous specification, $\sigma_u$ has now increased by a
factor of 30 while $\sigma_v$ has nearly vanished, falling from 0.27 to 0.005, that is, by a factor of 50.
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LNC
Log likelihood function 532.94237
Estimation based on N = 604, K = 11
Inf.Cr.AIC = -1043.9 AIC/N = -1.728
Variances: Sigma-squared(v)= .00003
Sigma-squared(u)= .76238
Sigma(u) = .87314
Sigma(v) = .00543
Sigma = Sqr[(s^2(u)+s^2(v)]= .87316
Variances averaged over observations
Stochastic frontier based on panel data
Estimation based on 49 individuals
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 1
Deg. freedom for truncation mean: 2
Deg. freedom for inefficiency model: 4
LogL when sigma(u)=0 -210.45352
Chi-sq=2*[LogL(SF)-LogL(LS)] = 1486.792
Kodde-Palm C*: 95%: 8.761, 99%: 12.483
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LNC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -7.26117*** .25317 -28.68 .0000 -7.75738 -6.76496
LNQ2| .36162*** .01558 23.20 .0000 .33107 .39216
LNQ3| .01947*** .00257 7.58 .0000 .01444 .02451
LPLE| .64342*** .02165 29.72 .0000 .60099 .68584
LPKE| .30730*** .00727 42.24 .0000 .29305 .32156
|Mean of underlying truncated distribution
RACK| .81356 .52427 1.55 .1207 -.21399 1.84112
TUNNEL| 1.46353*** .47072 3.11 .0019 .54094 2.38613
|Scale parms. for random components of e(i)
ln_sgmaU| -.17921 .21781 -.82 .4106 -.60611 .24769
ln_sgmaV| -4.94678*** .20426 -24.22 .0000 -5.34711 -4.54644
|Heteroscedasticity in variance of truncated u(i)
VIRAGE| .06076 .04703 1.29 .1964 -.03142 .15294
|Heteroscedasticity in variance of symmetric v(i)
VIRAGE| -.37544 .44206 -.85 .3957 -1.24185 .49097
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
The kernel estimator compares the estimated cost efficiency distributions for the pooled and
basic Pitt and Lee model. The pattern suggested earlier is clearly evident. The same comparison
appears for the truncated normal/heteroscedasticity models. (The estimated cost efficiency results
for the basic Pitt and Lee model and the expanded one are the same to three or four digits.) The
partial listing below shows the estimates for the four models, noting the time invariance of the Pitt
and Lee estimates.
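Let
$$\lambda = \sigma_u^2/\sigma_v^2, \qquad \theta_i = \mu_i/\sigma_u,$$
$$\mu_i = \delta'z_i \text{ for the heterogeneous mean model, or } \mu, \text{ a constant (0) for the simple truncated (half) normal model},$$
$$A_i = 1 + \lambda T_i, \qquad h_i = \theta_i/A_i - S\,T_i\,\bar\varepsilon_i\,\lambda/(\sigma_u A_i),$$
where $\varepsilon_{it} = y_{it} - \beta'x_{it}$ and $\bar\varepsilon_i = (1/T_i)\sum_t \varepsilon_{it}$.
Then, the contribution of individual i to the log likelihood function for the normal-half normal model
is
$$\log L_i = -\frac{T_i}{2}\log 2\pi - T_i\log\sigma_u - \frac{1}{2}\log A_i + \frac{T_i}{2}\log\lambda
- \frac{1}{2}\frac{\lambda}{\sigma_u^2}\sum_{t=1}^{T_i}\varepsilon_{it}^2 + \frac{1}{2}A_i h_i^2
+ \log\Phi\left(h_i\sqrt{A_i}\right) - \frac{1}{2}\theta_i^2 - \log\Phi(\theta_i).$$
For the normal-exponential model, in which $u_i$ has density $\theta\exp(-\theta u_i)$, let
$$h_i = -\left(\theta\sigma_v/T_i + S\,\bar\varepsilon_i/\sigma_v\right).$$
Then
$$\log L_i = \log\theta - \frac{T_i-1}{2}\log 2\pi - (T_i-1)\log\sigma_v - \frac{1}{2}\log T_i
- \frac{1}{2}\frac{1}{\sigma_v^2}\sum_{t=1}^{T_i}\varepsilon_{it}^2 + \frac{1}{2}T_i h_i^2
+ \log\Phi\left(h_i\sqrt{T_i}\right).$$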
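As a numerical check on the normal-truncated (half) normal contribution, the sketch below evaluates a closed form and compares it with brute force integration over the time invariant inefficiency. It is written in Python rather than LIMDEP, and it assumes the composed error eps_it = v_it - S*u_i with u_i drawn from N[mu, sigma_u^2] truncated at zero, lambda = sigma_u^2/sigma_v^2, theta = mu/sigma_u, A = 1 + lambda*T and h = theta/A - S*T*ebar*lambda/(sigma_u*A). The function names and the residuals are hypothetical.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def loglik_closed_form(eps, sigma_u, sigma_v, mu=0.0, S=1):
    """Closed-form log L_i for one group; eps holds the T_i residuals."""
    T = len(eps)
    lam = sigma_u ** 2 / sigma_v ** 2
    theta = mu / sigma_u
    A = 1.0 + lam * T
    ebar = sum(eps) / T
    h = theta / A - S * T * ebar * lam / (sigma_u * A)
    return (-(T / 2.0) * math.log(2.0 * math.pi)
            - T * math.log(sigma_u)
            + (T / 2.0) * math.log(lam)
            - 0.5 * math.log(A)
            - 0.5 * (lam / sigma_u ** 2) * sum(e * e for e in eps)
            + 0.5 * A * h * h
            + math.log(norm_cdf(h * math.sqrt(A)))
            - 0.5 * theta ** 2
            - math.log(norm_cdf(theta)))

def loglik_by_quadrature(eps, sigma_u, sigma_v, mu=0.0, S=1, steps=20000):
    """Same quantity by trapezoidal integration over u on [0, |mu| + 10*sigma_u]."""
    upper = abs(mu) + 10.0 * sigma_u
    width = upper / steps
    total = 0.0
    for k in range(steps + 1):
        u = k * width
        # Truncated normal density of u times the density of each v_it = eps_it + S*u.
        f = norm_pdf((u - mu) / sigma_u) / (sigma_u * norm_cdf(mu / sigma_u))
        for e in eps:
            f *= norm_pdf((e + S * u) / sigma_v) / sigma_v
        total += (0.5 if k in (0, steps) else 1.0) * f
    return math.log(total * width)

# Hypothetical residuals for one firm observed three times.
eps = [0.10, -0.20, 0.05]
print(loglik_closed_form(eps, 0.3, 0.2, mu=0.1, S=1))
print(loglik_by_quadrature(eps, 0.3, 0.2, mu=0.1, S=1))
```

The two printed values should agree to several decimal places; setting S = -1 gives the cost frontier version.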
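The Jondrow estimator, as formulated in Battese and Coelli (1988), is as follows: Let
$$\gamma_i = 1/(1 + \lambda T_i),$$
$$\psi_i^2 = \sigma_u^2\gamma_i,$$
$$E_i = \gamma_i\mu_i + (1 - \gamma_i)(-S\,\bar\varepsilon_i),$$
and $\bar\varepsilon_i = (1/T_i)\sum_t \varepsilon_{it}$.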
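A minimal sketch of how these quantities combine into an estimate of u_i is given below. It assumes the standard result that, conditional on the residuals, u_i is normal with mean E_i and variance psi_i^2 truncated at zero, so that E[u_i | eps_i] = E_i + psi_i*phi(E_i/psi_i)/Phi(E_i/psi_i), with lambda = sigma_u^2/sigma_v^2. The Python function name and the sample residuals are hypothetical; this is not LIMDEP's internal code.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def jlms_time_invariant(eps, sigma_u, sigma_v, mu=0.0, S=1):
    """E[u_i | eps_i1,...,eps_iT] for time invariant u_i (S=+1 production, -1 cost)."""
    T = len(eps)
    lam = sigma_u ** 2 / sigma_v ** 2
    gamma_i = 1.0 / (1.0 + lam * T)
    psi_i = math.sqrt(sigma_u ** 2 * gamma_i)
    ebar = sum(eps) / T
    E_i = gamma_i * mu + (1.0 - gamma_i) * (-S * ebar)
    ratio = E_i / psi_i
    return E_i + psi_i * norm_pdf(ratio) / norm_cdf(ratio)

# Hypothetical cost frontier residuals for one firm over three years.
u_hat = jlms_time_invariant([0.30, 0.25, 0.35], sigma_u=0.96, sigma_v=0.08, S=-1)
cost_efficiency = math.exp(-u_hat)
print(u_hat, cost_efficiency)
```

Because u_i does not vary over time, the same estimate applies to every observation in the group, which is why the Pitt and Lee estimates are repeated within each firm.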
(To change this to a cost frontier, change $u_i$ to $[a_i - \min(a_i)]$.) This bears resemblance to a stochastic
frontier model, though in fact, it is a 'deterministic' frontier model. The signature feature is that $u_i$
equals zero for the 'most efficient' firm in the sample. A natural interpretation of this is that what
we measure with the model is not the absolute inefficiency, but inefficiency of firm i relative to the
other firms in the sample. From the modeler's point of view, this approach has several substantive
advantages and disadvantages: The main advantage is
As illustrated in the results below, this approach tends to produce very large estimates of $u_i$.
The invariance assumption about $u_i$ has been criticized elsewhere. Attempts to relax this assumption
are a recurrent theme in the literature, including the Battese and Coelli and true fixed and random
effects approaches described later. Other early work on the model suggested direct manipulation of
the fixed effects, for example,
Other more recent research (Han, Orea and Schmidt (2005)) has proposed factor analytic forms for
$\alpha_{it}$. The sections to follow will include several of these different approaches.
Application
This Cornwell, Schmidt and Sickles (CSS) approach requires only a linear fixed effects
regression and a few instructions to manipulate the fixed effects. The following analyzes the Swiss
railroad data with this approach. It computes the CSS estimates and compares them to the
unstructured pooled estimates (using the normal-half normal model from Chapter E62) and the Pitt
and Lee model introduced above. The commands for the analysis are as follows:
SAMPLE ; All $
CREATE ; Railroad = id $
CREATE ; If(railroad > 20)railroad = railroad - 1 $ (There is a gap in the data)
HISTOGRAM ; Rhs = railroad
; Title = Number of Observations for Firms in Swiss Railroad Sample $
SETPANEL ; Group = id ; Pds = ti $
REJECT ; ti = 1 $
FRONTIER ; Lhs = lnc ; Cost ; Rhs = x ; Costeff = eusfpool $
CREATE ; pooled = Group Mean(eusfpool, Pds = ti) $
FRONTIER ; Lhs = lnc ; Cost ; Rhs = x ; Panel ; Costeff = pittlee $
REGRESS ; Lhs = lnc ; Rhs = x ; Panel ; Fixed Effects $
CREATE ; ai = alphafe(railroad) $
CALC ; minai = Min(ai) $
CREATE ; css = Exp((minai - ai)) $
CREATE ; Period = Ndx(id,1) $
REJECT ; period#1 $
PLOT ; Lhs = railroad ; Rhs = pooled,css ; Grid ; Fill ; Limits = 0,1
; Vaxis = Estimated Cost Efficiency
; Title = Half Normal vs. Cornwell, Schmidt, Sickles FE Cost Efficiencies $
PLOT ; Lhs = railroad ; Rhs = css,pittlee ; Grid ; Fill ; Limits = 0,1
; Vaxis = Estimated Cost Efficiency
; Title = Pitt and Lee RE vs. Cornwell, Schmidt, Sickles FE Cost Efficiencies $
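The fixed effects manipulation in the commands above (ai, minai and css = Exp(minai - ai)) amounts to measuring each firm against the firm with the smallest estimated effect. The following Python sketch reproduces that transformation for a cost frontier; the firm labels and effect values are hypothetical.

```python
import math

def css_cost_efficiency(alpha):
    """Map fixed effects a_i to exp(-(a_i - min a)), the CSS cost efficiency."""
    a_min = min(alpha.values())
    return {firm: math.exp(a_min - a) for firm, a in alpha.items()}

# Hypothetical estimated fixed effects from a cost function regression.
alpha = {"A": -7.3, "B": -6.9, "C": -7.1}
efficiency = css_cost_efficiency(alpha)

for firm in sorted(efficiency):
    print(firm, round(efficiency[firm], 4))
```

The firm with the smallest effect is assigned an efficiency of exactly one, so every other value is inefficiency relative to the best firm in the sample rather than an absolute measure.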
The results below show the considerable differences in the parameter estimates produced by the
three models. Figure E64.4 demonstrates the expected quite large differences between the time
varying estimates (using the group means) and the time invariant results based on the CSS model.
Figure E64.5 also shows a striking, albeit commonly observed result – the CSS and Pitt and Lee
estimates are virtually identical.
-----------------------------------------------------------------------------
LSDV least squares with fixed effects ....
LHS=LNC Mean = 11.30305
Standard deviation = 1.09984
No. of observations = 604 Degrees of freedom
Regression Sum of Squares = 726.000 52
Residual Sum of Squares = 3.41179 551
Total Sum of Squares = 729.412 603
Standard error of e = .07869
Fit R-squared = .99532 R-bar squared = .99488
Model test F[ 52, 551] = 2254.77325 Prob F > F* = .00000
Diagnostic Log likelihood = 706.21504 Akaike I.C. = -5.00084
Restricted (b=0) = -914.01557 Bayes I.C. = -4.61443
Chi squared [ 52] = 3240.46122 Prob C2 > C2* = .00000
Estd. Autocorrelation of e(i,t) = .668792
--------------------------------------------------
Panel:Groups Empty 0, Valid data 49
Smallest 3, Largest 13
Average group size in panel 12.33
Variances Effects a(i) Residuals e(i,t)
.423441 .006192
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LNC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
LNQ2| .29374*** .02850 10.31 .0000 .23789 .34959
LNQ3| .01612*** .00543 2.97 .0030 .00547 .02676
LPLE| .66452*** .03580 18.56 .0000 .59434 .73469
LPKE| .31777*** .01863 17.05 .0000 .28125 .35430
--------+--------------------------------------------------------------------
(These are the estimated parameters in the estimated pooled stochastic frontier model.)
Constant| -10.0907*** 1.14284 -8.83 .0000 -12.3306 -7.8507
LNQ2| .64179*** .01371 46.80 .0000 .61491 .66867
LNQ3| .06855*** .00655 10.46 .0000 .05570 .08139
LPLE| .53971*** .08858 6.09 .0000 .36610 .71333
LPKE| .26045*** .03260 7.99 .0000 .19655 .32435
|Variance parameters for compound error
Lambda| 1.29697*** .13854 9.36 .0000 1.02545 1.56850
Sigma| .44345*** .00056 789.05 .0000 .44235 .44455
(These are the estimated parameters in the estimated Pitt and Lee model.)
|Deterministic Component of Stochastic Frontier Model
Constant| -7.25643*** .24767 -29.30 .0000 -7.74185 -6.77101
LNQ2| .36259*** .01503 24.12 .0000 .33312 .39205
LNQ3| .01902*** .00240 7.94 .0000 .01432 .02372
LPLE| .64148*** .02112 30.38 .0000 .60009 .68287
LPKE| .30842*** .00700 44.08 .0000 .29471 .32214
|Variance parameters for compound error
Lambda| 12.1932** 5.55909 2.19 .0283 1.2975 23.0888
Sigma(u)| .96071*** .13303 7.22 .0000 .69998 1.22145
Figure E64.5 Estimated Inefficiencies from Cornwell et al. and Pitt and Lee Models
Several formulations are available. In Battese and Coelli's original formulation, the distribution was
half normal and the base specification was
$$u_{it} = \exp[-\eta(t - T)]\,|U_i|,$$
where T is the number of periods in their balanced panel. (Here it would be $T_i$.) They also suggested
The first (linear) form is taken to be the default case for this model. The second is not provided in
this package. The BC92 model is requested with
; Model = BC ; Panel
We note a warning to practitioners. When the data are very consistent with the model, the
Battese and Coelli model produces quite satisfactory results. The framework has been employed in
many recent empirical applications. But, when the data are not of particularly good quality, or this
is the wrong model, extreme results can emerge. The airline data examined in Chapter E63 (and the
WHO data), for example, are a poor fit to this model.
We have labeled this model as 'time dependent' rather than time varying. While the
inefficiency component in the model does vary through time, the variation is systematic with respect
to time. A question pursued in the ongoing literature is the extent to which this model actually
moves away from the time invariant specification of Pitt and Lee. Since there is actual variation, the
result is clearly somewhere between Pitt and Lee and what we have labeled the unstructured 'pooled'
model. If $\eta$ equals zero, Pitt and Lee emerges, so it depends entirely on this parameter. We have
found in some investigations that the end result is actually closer to Pitt and Lee than it is to the
pooled model; that is, there is quite a lot of structure involved in the BC92 model. The example
below illustrates.
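How much structure the time dependent form imposes can be seen by evaluating the scale factor exp[-eta(t - T)] directly. The Python sketch below uses the value Eta = -.00248 reported in the application that follows and a group of 13 periods; the function name is illustrative.

```python
import math

def bc92_scale(eta, T):
    """g_t = exp[-eta*(t - T)] for t = 1,...,T; g_T equals one by construction."""
    return [math.exp(-eta * (t - T)) for t in range(1, T + 1)]

g = bc92_scale(-0.00248, 13)
print(round(g[0], 5), g[-1])          # scale in the first and last periods

g_invariant = bc92_scale(0.0, 13)     # eta = 0 reproduces Pitt and Lee
print(set(g_invariant))               # {1.0}
```

With an estimate this small the scale factor moves only from about 0.97 to 1 over the whole sample period, so the fitted inefficiency is nearly time invariant.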
E64.5.1 Application
To illustrate the Battese and Coelli models, we return to the railroad data used previously.
The base case is the pooled data stochastic cost frontier. This is followed by the Pitt and Lee model
and, finally, by the original Battese-Coelli 'time decay' model:
SAMPLE ; All $
REJECT ; ti = 1 $
FRONTIER ; Lhs = lnc ; Cost ; Rhs = x ; Costeff = eusfpool $
FRONTIER ; Lhs = lnc ; Cost ; Rhs = x ; Model = BC ; Panel ; Costeff = eucbc92 $
DSTAT ; Rhs = eucbc92,eusfpool $
KERNEL ; Rhs = eucbc92,eusfpool
; Title = Estimated Cost Efficiencies - Battese-Coelli 1992 vs. Pooled $
KERNEL ; Rhs = eucbc92,pittlee
; Title = Estimated Cost Efficiencies - Battese-Coelli 1992 vs. Pitt and Lee $
The kernel density estimators are used to compare the efficiency estimates from the pooled data
model to the Battese and Coelli model. The estimates of exp(-E[uit|εi]) from the Battese and Coelli
model are far smaller than those from the pooled model (a mean of 0.515 compared to 0.761 in the
descriptive statistics below); that is, the estimated inefficiencies are far larger. The assumption of
time invariance of the random term is a major component of this model. The second kernel estimator
below compares Battese-Coelli to Pitt-Lee. The correspondence of the two results is striking, albeit
to be expected given the small estimated value of $\eta$ (-.00248).
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LNC
Log likelihood function -209.42340
Estimation based on N = 604, K = 7
Inf.Cr.AIC = 432.8 AIC/N = .717
Variances: Sigma-squared(v)= .07332
Sigma-squared(u)= .12333
Sigma(v) = .27077
Sigma(u) = .35119
Sigma = Sqr[(s^2(u)+s^2(v)]= .44345
Gamma = sigma(u)^2/sigma^2 = .62716
Var[u]/{Var[u]+Var[v]} = .37937
Stochastic Cost Frontier Model, e = v+u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 -210.45352
Chi-sq=2*[LogL(SF)-LogL(LS)] = 2.060
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LNC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -10.0907*** 1.14284 -8.83 .0000 -12.3306 -7.8507
LNQ2| .64179*** .01371 46.80 .0000 .61491 .66867
LNQ3| .06855*** .00655 10.46 .0000 .05570 .08139
LPLE| .53971*** .08858 6.09 .0000 .36610 .71333
LPKE| .26045*** .03260 7.99 .0000 .19655 .32435
|Variance parameters for compound error
Lambda| 1.29697*** .13854 9.36 .0000 1.02545 1.56850
Sigma| .44345*** .00056 789.05 .0000 .44235 .44455
--------+--------------------------------------------------------------------
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LNC
Log likelihood function 530.16177
Estimation based on N = 604, K = 8
Inf.Cr.AIC = -1044.3 AIC/N = -1.729
Stochastic frontier based on panel data
Estimation based on 49 individuals
Variances: Sigma-squared(v)= .00613
Sigma-squared(u)= .97581
Sigma(v) = .07828
Sigma(u) = .98783
Sigma = Sqr[(s^2(u)+s^2(v)]= .99093
Gamma = sigma(u)^2/sigma^2 = .99376
Var[u]/{Var[u]+Var[v]} = .98301
Stochastic Cost Frontier Model, e = v+u
Battese-Coelli Models: Time Varying uit
Time dependent uit=exp[-eta(t-T)]*|U(i)|
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 -210.45352
Chi-sq=2*[LogL(SF)-LogL(LS)] = 1481.231
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LNC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -6.83502*** .27362 -24.98 .0000 -7.37130 -6.29873
LNQ2| .35459*** .01636 21.68 .0000 .32254 .38665
LNQ3| .02183*** .00238 9.17 .0000 .01716 .02649
LPLE| .61516*** .02092 29.40 .0000 .57415 .65617
LPKE| .30931*** .00701 44.09 .0000 .29556 .32306
|Variance parameters for compound error
Lambda| 12.6195*** .01188 1062.18 .0000 12.5962 12.6428
Sigma(u)| .98783*** .15275 6.47 .0000 .68845 1.28721
|Eta parameter for time varying inefficiency
Eta| -.00248*** .00086 -2.89 .0039 -.00416 -.00080
--------+--------------------------------------------------------------------
--------+---------------------------------------------------------------------
Variable| Mean Std.Dev. Minimum Maximum Cases Missing
--------+---------------------------------------------------------------------
EUCBC92| .514566 .231680 .085140 .982112 604 0
EUSFPOOL| .760991 .095229 .478178 .906348 604 0
--------+---------------------------------------------------------------------
Figure E64.6 Kernel Density Estimates for Inefficiencies from Battese and Coelli Model
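The contribution of individual i to the log likelihood function for the Battese and Coelli model is
$$\log L_i = -\frac{T_i}{2}\left(\log 2\pi + \log\sigma^2\right) - \frac{T_i - 1}{2}\log(1-\gamma)
- \frac{1}{2}\log\left[1 + \gamma\left(\sum_{t=1}^{T_i} g_{it}^2 - 1\right)\right]
- \frac{1}{2}\left(\frac{\mu_i}{\sigma_u}\right)^2 - \log\Phi\left(\frac{\mu_i}{\sigma_u}\right)
+ \frac{1}{2}A_i^2 + \log\Phi(A_i) - \frac{1}{2(1-\gamma)\sigma^2}\sum_{t=1}^{T_i}\varepsilon_{it}^2,$$
where
$$\sigma^2 = \sigma_u^2 + \sigma_v^2,$$
$$\gamma = \sigma_u^2/\sigma^2,$$
$$\varepsilon_{it} = y_{it} - \beta'x_{it},$$
$$\mu_i = 0 \text{ or } \mu \text{ or } \delta'w_i,$$
$$g_{it} = \exp[-\eta(t - T_i)] \text{ or } \exp(\eta'z_{it}),$$
$$S = +1 \text{ for a production model and } -1 \text{ for a cost model},$$
$$A_i = \frac{(1-\gamma)\mu_i - S\gamma\sum_{t=1}^{T_i} g_{it}\varepsilon_{it}}
{\left\{\gamma(1-\gamma)\sigma^2\left[1 + \gamma\left(\sum_{t=1}^{T_i} g_{it}^2 - 1\right)\right]\right\}^{1/2}}.$$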
Derivatives of this function are complicated in the extreme, and are omitted here. (Some useful
results for obtaining them are found in Battese and Coelli (1992, 1995).)
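The Jondrow estimator of $u_{it}$ is based on the conditional variance
$$\sigma_{*i}^2 = \frac{\gamma(1-\gamma)\sigma^2}{(1-\gamma) + \gamma\sum_{t=1}^{T_i} g_{it}^2}.$$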
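A sketch of the resulting computation is shown below, under the assumption (following Battese and Coelli (1992)) that, conditional on the residuals, U_i is normal with mean mu* and variance sigma*^2 truncated at zero, where mu* = [(1-gamma)mu - S*gamma*sum(g_it*eps_it)] / [(1-gamma) + gamma*sum(g_it^2)], so that E[u_it | eps_i] = g_it*{mu* + sigma* phi(mu*/sigma*)/Phi(mu*/sigma*)}. The Python function name and the residuals are hypothetical; the variance inputs are taken from the BC92 output above.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def jlms_time_varying(eps, g, sigma2, gamma, mu=0.0, S=1):
    """E[u_it | eps_i] for u_it = g_it*|U_i|, one value per period."""
    sum_g2 = sum(gt * gt for gt in g)
    sum_ge = sum(gt * e for gt, e in zip(g, eps))
    denom = (1.0 - gamma) + gamma * sum_g2
    mu_star = ((1.0 - gamma) * mu - S * gamma * sum_ge) / denom
    sigma_star = math.sqrt(gamma * (1.0 - gamma) * sigma2 / denom)
    ratio = mu_star / sigma_star
    e_u = mu_star + sigma_star * norm_pdf(ratio) / norm_cdf(ratio)
    return [gt * e_u for gt in g]

# Hypothetical cost frontier residuals for a firm observed in T_i = 3 periods.
eps = [0.30, 0.25, 0.35]
eta, T = -0.00248, 3
g = [math.exp(-eta * (t - T)) for t in range(1, T + 1)]

# sigma^2 = .97581 + .00613 and gamma = .99376 from the output above.
u_hat = jlms_time_varying(eps, g, sigma2=0.98194, gamma=0.99376, S=-1)
print([round(math.exp(-u), 4) for u in u_hat])   # estimated cost efficiencies
```

The estimates differ across periods only through g_it, which is the sense in which the variation is systematic with respect to time.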
The default form used earlier is $g(z_{it}) = \exp[-\eta(t - T_i)]$. You may also use a more general form,
$$g(z_{it}) = \exp(\eta'z_{it})$$
where $z_{it}$ contains any desired set of variables. For this extension, use
; Hfu = list of variables
As before, the truncated normal version of the model is also supported. For an example, we have
used
FRONTIER ; Lhs = lnc ; Cost ; Rhs = x ; Model = BC ; Panel ; Costeff = eucbc92h
; Hfu = rack,virage,tunnel $
The estimates of cost efficiency produced by this model are identical to those from the base model in
the previous section.
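One way to see why the efficiency estimates change so little is to evaluate the scale factor exp(eta'z) at the coefficients reported in the output that follows. The Python sketch below does this for the two extreme configurations of the three dummy variables; the dictionary and function names are illustrative.

```python
import math

# Coefficients on RACK, VIRAGE and TUNNEL in u(i,t) = exp{eta*z(i,t)}*|U(i)|,
# copied from the estimation output.
ETA = {"rack": 0.00024, "virage": -0.02096, "tunnel": 0.00219}

def scale_factor(z, eta=ETA):
    """g(z) = exp(eta'z) for a dict of variable values z."""
    return math.exp(sum(eta[name] * z[name] for name in eta))

none_present = scale_factor({"rack": 0, "virage": 0, "tunnel": 0})
all_present = scale_factor({"rack": 1, "virage": 1, "tunnel": 1})
print(none_present, round(all_present, 5))
```

Both factors are within about two percent of one, so the fitted u(i,t) is very close to the time invariant |U(i)| in every configuration.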
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LNC
Log likelihood function 529.63533
Stochastic frontier based on panel data
Estimation based on 49 individuals
Variances: Sigma-squared(v)= .00615
Sigma-squared(u)= .94808
Sigma(v) = .07840
Sigma(u) = .97369
Sigma = Sqr[(s^2(u)+s^2(v)]= .97685
Gamma = sigma(u)^2/sigma^2 = .99356
Var[u]/{Var[u]+Var[v]} = .98247
Stochastic Cost Frontier Model, e = v+u
Battese-Coelli Models: Time Varying uit
Time varying uit=exp[eta*z(i,t)]*|U(i)|
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 3
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 4
LogL when sigma(u)=0 -210.45352
Chi-sq=2*[LogL(SF)-LogL(LS)] = 1480.178
Kodde-Palm C*: 95%: 8.761, 99%: 12.483
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LNC| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -6.89845*** .32923 -20.95 .0000 -7.54374 -6.25316
LNQ2| .35751*** .01591 22.47 .0000 .32632 .38870
LNQ3| .02149*** .00236 9.10 .0000 .01686 .02613
LPLE| .61741*** .02430 25.40 .0000 .56977 .66504
LPKE| .30892*** .00759 40.71 .0000 .29405 .32380
|Variance parameters for compound error
Lambda| 12.4202*** .01108 1120.76 .0000 12.3984 12.4419
Sigma(u)| .97369*** .13513 7.21 .0000 .70884 1.23855
|Coefficients in u(i,t)=[exp{eta*z(i,t)}]*|U(i)|
RACK| .00024 .01743 .01 .9889 -.03392 .03441
VIRAGE| -.02096 .01321 -1.59 .1126 -.04685 .00493
TUNNEL| .00219 .01625 .14 .8926 -.02966 .03405
--------+--------------------------------------------------------------------
(Parameter estimates from base case Battese and Coelli)
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -6.83502*** .27362 -24.98 .0000 -7.37130 -6.29873
LNQ2| .35459*** .01636 21.68 .0000 .32254 .38665
LNQ3| .02183*** .00238 9.17 .0000 .01716 .02649
LPLE| .61516*** .02092 29.40 .0000 .57415 .65617
LPKE| .30931*** .00701 44.09 .0000 .29556 .32306
|Variance parameters for compound error
Lambda| 12.6195*** .01188 1062.18 .0000 12.5962 12.6428
Sigma(u)| .98783*** .15275 6.47 .0000 .68845 1.28721
|Eta parameter for time varying inefficiency
Eta| -.00248*** .00086 -2.89 .0039 -.00416 -.00080
--------+--------------------------------------------------------------------
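The likelihood ratio statistic reported in the output above can be reproduced by hand. The sketch below uses only the two log likelihood values and the Kodde-Palm critical value printed in the listing; the function name is hypothetical.

```python
def lr_statistic(logl_frontier, logl_ols):
    """Chi-sq = 2*[LogL(SF) - LogL(LS)], as labeled in the output."""
    return 2.0 * (logl_frontier - logl_ols)

# Values copied from the heteroscedastic Battese-Coelli output.
chi_sq = lr_statistic(529.63533, -210.45352)

# Compare with the 95% Kodde-Palm critical value shown in the listing.
reject_at_95 = chi_sq > 8.761
```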
$u_i = |N[0, \sigma_u^2]|$.
This model (like the others) is fit by maximum likelihood, not least squares. The normal-half
normal model is applied to the stochastic part of the model. Note that the inefficiency term in this
model is time varying. The heterogeneity may appear in Stevenson's truncated normal model as
follows. This is a true fixed effects, normal-truncated normal model.
In this form, the heterogeneity is still retained in the production function part of the model. Another
possibility is to allow the heterogeneity to enter the mean of the inefficiency distribution rather than
the production function – this seems the most natural of the three forms. In this case,
The mean of the inefficiency distribution shifts in time, but also has a firm specific component.
Finally, the heterogeneity may be shifted to the variance of the inefficiency distribution. In this
form, we have
$y_{it} = \beta'x_{it} + v_{it} - u_{it}$,
$u_{it} = |N[0, \sigma_{uit}^2]|$
$\sigma_{uit}^2 = \sigma_u^2 \exp(\alpha_i + \delta'z_{it})$.
The variables in the variance term may be omitted if only a groupwise heteroscedastic model is
desired. Note this is a half normal model. A model with nonzero underlying mean and variation in
the variance appears to be inestimable. Note that in order to secure identification, this model must
have time varying inefficiency, induced by time variation in the variance.
NOTE: We have had extremely limited success with the second and third forms of the model. The
likelihood function is quite volatile in the parameters of the underlying mean of the truncated
distribution, with the result that the estimated variance parameters generally become negative
in the early iterations and estimation must be halted. This occurs even when very good starting
values are used, which suggests that estimation of this model as stated is likely to be extremely
problematic in all but the most favorable of cases. An alternative approach which is simple, but can
be used only with small panels (up to 100 groups), is suggested below.
In terms of implementation, we note that these forms of the models, though they are new
with LIMDEP, have long been feasible. The panels typically used by researchers in this setting are
often fairly small: our airline data, for example, have only 25 units and the Swiss railroad data have 49
firms. It would always have been possible to create these models simply by adding dummy variables
to the familiar model. However, LIMDEP's implementation of the model obviates this by using the
methodology described in Chapter R23. In principle, this allows up to 100,000 firms in the data set.
Matrices: b = estimate of $\beta$
varb = asymptotic covariance matrix for estimate of $\beta$
alphafe = estimated fixed effects (if ; Par is in the command)
The model must be fit twice. The first model is a pooled data model which provides the starting values
for the second. The second command is identical to the first save for the addition of the panel data
specification. In order to set up the initial values correctly, it is essential that your initial model include
the constant term first in the Rhs list and that the second model specification be identical to the first.
Other options and specifications for the fixed effects models are the same as in other applications. (See
Chapter R23 for details.) The fixed effects command also contains the constant term, but this will be
removed by the command processor later. See the example below for the operation of the command.
NOTE: Starting values must be provided by the first estimator. The specification ; Start = list of
values is not available for this model. You must fit both models each time you fit an FEM. The
starting values are not retained after the FEM is estimated.
All fixed effects forms are estimated by maximum likelihood. You may also fit a two way
fixed effects model
$y_{it} = \alpha_i + \gamma_t + \beta'x_{it} + v_{it} - u_i$, (change to v + u for a stochastic cost frontier),
$u_i = |N[0, \sigma_u^2]|$
where t is an additional, time (period) specific effect. The time specific effect is requested by adding
; Time
For the unbalanced panel, we assume that overall, the sample observation period is
t = 1,2,..., Tmax and that the time variable gives for the specific group, the particular values of t that
apply to the observations. Thus, suppose your overall sample is five periods. The first group is three
observations, periods 1, 2, 4, while the second group is four observations, 2, 3, 4, 5. Then, your
panel specification would be
; Covariance Matrix displays estimated asymptotic covariance matrix (normally not shown),
same as ; Printvc.
This command recovers the estimated fixed effects from the Cornwell et al. model, then replicates
them for each year in the data set. This is used to create the plot of the two sets of estimates of ui
shown below.
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 108.43918
Estimation based on N = 256, K = 9
Inf.Cr.AIC = -198.9 AIC/N = -.777
Model estimated: Aug 17, 2011, 06:36:42
Variances: Sigma-squared(v)= .01902
Sigma-squared(u)= .01692
Sigma(v) = .13791
Sigma(u) = .13007
Sigma = Sqr[(s^2(u)+s^2(v)]= .18957
Gamma = sigma(u)^2/sigma^2 = .47074
Var[u]/{Var[u]+Var[v]} = .24425
Stochastic Production Frontier, e = v-u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 108.07431
Chi-sq=2*[LogL(SF)-LogL(LS)] = .730
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -2.98823*** .72136 -4.14 .0000 -4.40206 -1.57439
LF| .37257*** .07038 5.29 .0000 .23463 .51052
LM| .69910*** .07580 9.22 .0000 .55054 .84766
LE| 2.09473*** .68790 3.05 .0023 .74647 3.44299
LL| -.42909*** .06315 -6.79 .0000 -.55287 -.30530
LP| .44533*** .09498 4.69 .0000 .25917 .63149
LK| -2.09806*** .76556 -2.74 .0061 -3.59853 -.59759
|Variance parameters for compound error
Lambda| .94309*** .16870 5.59 .0000 .61244 1.27373
Sigma| .18957*** .00064 297.81 .0000 .18832 .19082
--------+--------------------------------------------------------------------
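The variance summary at the top of this listing follows directly from Sigma(v) and Sigma(u). The sketch below recomputes the derived lines, assuming the half normal result Var[u] = (1 - 2/pi)*sigma(u)^2. The function and key names are hypothetical; small differences from the printed values reflect rounding of the inputs.

```python
import math

def variance_summary(sigma_v, sigma_u):
    """Recompute the derived variance lines of the FRONTIER output (half normal case)."""
    s2v, s2u = sigma_v ** 2, sigma_u ** 2
    var_u = (1.0 - 2.0 / math.pi) * s2u          # variance of |U| when U ~ N[0, s2u]
    return {
        "sigma": math.sqrt(s2u + s2v),           # Sqr[s^2(u) + s^2(v)]
        "gamma": s2u / (s2u + s2v),              # sigma(u)^2 / sigma^2
        "var_share": var_u / (var_u + s2v),      # Var[u] / {Var[u] + Var[v]}
        "lambda": sigma_u / sigma_v,             # asymmetry parameter
    }

# Sigma(v) and Sigma(u) copied from the pooled airline output.
summary = variance_summary(0.13791, 0.13007)
```

Note that Gamma and the Var[u] share differ because the variance of the one sided term is smaller than sigma(u)^2.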
-----------------------------------------------------------------------------
LSDV least squares with fixed effects ....
LHS=LQ Mean = -1.11237
Standard deviation = 1.29728
No. of observations = 256 Degrees of freedom
Regression Sum of Squares = 426.103 30
Residual Sum of Squares = 3.04876 225
Total Sum of Squares = 429.152 255
Standard error of e = .11640
Fit R-squared = .99290 R-bar squared = .99195
Model test F[ 30, 225] = 1048.21999 Prob F > F* = .00000
Diagnostic Log likelihood = 203.84835 Akaike I.C. = -4.18825
Restricted (b=0) = -429.37729 Bayes I.C. = -3.75896
Chi squared [ 30] = 1266.45126 Prob C2 > C2* = .00000
Estd. Autocorrelation of e(i,t) = .575211
--------------------------------------------------
Panel:Groups Empty 0, Valid data 25
Smallest 2, Largest 15
Average group size in panel 10.24
Variances Effects a(i) Residuals e(i,t)
.030410 .013550
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error t |t|>T* Interval
--------+--------------------------------------------------------------------
LF| .14860 .09677 1.54 .1259 -.04107 .33828
LM| .80497*** .07843 10.26 .0000 .65125 .95868
LE| .68672 .67075 1.02 .3069 -.62792 2.00136
LL| -.15977 .11829 -1.35 .1780 -.39162 .07208
LP| .16227 .09973 1.63 .1050 -.03320 .35774
LK| -.37897 .74689 -.51 .6123 -1.84284 1.08490
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
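The fit statistics in the LSDV listing can be checked from the sums of squares. The sketch below assumes 25 firm effects and 6 slope coefficients, which gives the 30 and 225 degrees of freedom shown in the output. The function name is hypothetical.

```python
import math

def lsdv_fit(rss, tss, n_obs, n_groups, n_slopes):
    """Recompute R-squared, adjusted R-squared, s and F for a one way LSDV regression."""
    df_model = n_groups + n_slopes - 1           # dummy variables plus slopes, less one
    df_resid = n_obs - n_groups - n_slopes
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n_obs - 1) / df_resid
    s = math.sqrt(rss / df_resid)                # standard error of e
    f_stat = ((tss - rss) / df_model) / (rss / df_resid)
    return {"df": (df_model, df_resid), "r2": r2, "adj_r2": adj_r2, "s": s, "F": f_stat}

# Residual and total sums of squares copied from the listing.
fit = lsdv_fit(rss=3.04876, tss=429.152, n_obs=256, n_groups=25, n_slopes=6)
```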
Figure E64.8 plots the Jondrow estimates of $\exp(-E[u_{it}|\varepsilon_{it}])$ from the true fixed effects model
and the estimates of ui from the Cornwell, Schmidt and Sickles model of Section E64.4 for each
firm. Since the true FE estimates vary by period, we have plotted the group means. The implication
of the regression based model is clear in the figure. The estimates of technical efficiency from the
true FEM are generally considerably larger than those from the deterministic model.
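For reference, the Jondrow et al. conditional mean in the normal-half normal case can be sketched as follows. This is the textbook formula, not LIMDEP's internal code; the function names are hypothetical, and S = 1 denotes a production frontier (e = v - u) while S = -1 denotes a cost frontier (e = v + u).

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def jlms(eps, sigma_u, sigma_v, S=1):
    """E[u | eps] for the normal-half normal stochastic frontier model."""
    s2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -S * eps * sigma_u ** 2 / s2       # mean of the underlying normal
    sigma_star = sigma_u * sigma_v / math.sqrt(s2)
    z = mu_star / sigma_star
    return mu_star + sigma_star * norm_pdf(z) / norm_cdf(z)

def technical_efficiency(eps, sigma_u, sigma_v, S=1):
    """exp(-E[u | eps]), the quantity plotted as technical efficiency."""
    return math.exp(-jlms(eps, sigma_u, sigma_v, S))
```

A more negative production residual implies a larger estimated u and therefore a lower efficiency value. For extreme residuals the ratio of the density to the distribution function may need a numerically safer evaluation than this sketch provides.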
use FRONTIER ; Lhs = ... ; Rhs = one, ... ; Rh2 = one, ...
; Model = T $
FRONTIER ; Lhs = ... ; Rhs = one, ... ; Rh2 = one, ...
; FEM ; Panel $
The Rh2 is optional in the first equation if you have only a constant term in the mean of the truncated
distribution. But, you should include it nonetheless so as to ensure the match between the first and
second commands. Also, it is essential that both Rhs and Rh2 include constant terms in the first
positions.
To move the heterogeneity to the mean of the underlying truncated normal distribution,
use FRONTIER ; Lhs = ... ; Rhs = one, ... ; Rh2 = one, ...
; Model = T $
FRONTIER ; Lhs = ... ; Rhs = one, ... ; Rh2 = one, ...
; Model = T
; FEM ; Panel $
Note that this version differs from the earlier one only in the presence of ; Model = T in the second
form and its absence in the first. Again, the variable specifications in the two commands must be
identical, and both must include constant terms in the first position in both lists. As before, you may
use ; Rh2 = one if you do not require variables zit in the mean. (This constant term will be removed
from the fixed effects model, but this common value is used as the starting value for the firm specific
estimates.)
We note, we have had scant success with this model even with a carefully constructed data
set and good starting values. The problem appears to be Newton's method, which must be used for
the general fixed effects program which this is part of. If you have a small panel with no more than
100 groups, an alternative approach appears to work better. You may provide a stratification
variable in the cross section template to request that a set of dummy variables be inserted directly
into the function.
The stratification variable must take the full set of values from 1 to N up to 100 and all groups must
have at least two observations. For the second form, with the heterogeneity embedded in the mean
of the truncated normal distribution, add
; Mean
to the command.
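Before using the stratification approach, it may help to verify the requirements stated above outside of LIMDEP. The sketch below is a hypothetical pre-check, assuming the stratification variable is available as a list of integer codes.

```python
from collections import Counter

def check_stratification(strata, max_groups=100):
    """Return a list of problems with a stratification variable (empty if none found)."""
    counts = Counter(strata)
    n = max(counts)
    problems = []
    if n > max_groups:
        problems.append(f"{n} groups exceeds the limit of {max_groups}")
    missing = [g for g in range(1, n + 1) if g not in counts]
    if missing:
        problems.append(f"codes missing from 1..{n}: {missing}")
    small = sorted(g for g, c in counts.items() if c < 2)
    if small:
        problems.append(f"groups with fewer than two observations: {small}")
    return problems

# Example: three firms with two observations each passes all checks.
ok = check_stratification([1, 1, 2, 2, 3, 3])
```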
This provides four possible forms of the model, which we illustrate with the airline data:
NAMELIST ; x = one,lf,lm,le,ll,lp,lk $
This is a true fixed effects model with normal-truncated normal structure for uit.
This model is the same as the preceding one except now $\mu_i = \delta_1 + \delta_2 loadfctr_i$.
This is a true fixed effects model with the fixed effects appearing in $\mu_i$ rather than in the production
function.
This model is the same as the preceding model except that loadfctr now also appears in the mean of
the truncated variable.
is requested in the same fashion as the normal-truncated normal model, using a stratification variable
in the cross section formulation. (This likelihood function is likewise quite ill behaved, though less
so than the truncation form.) The command is
To continue the earlier example, the following fits a model of heteroscedasticity to the
airline data. The first model has heteroscedasticity and the fixed effects in the variance of ui. The
second is doubly heteroscedastic, again with the fixed effects in the variance of ui.
NAMELIST ; x = one,lf,lm,le,ll,lp,lk $
FRONTIER ; Lhs = lq ; Rhs = x
; Het ; Hfu = one,loadfctr ; Hfv = one ; Str = firm $
FRONTIER ; Lhs = lq ; Rhs = x
; Het ; Hfu = one,loadfctr ; Hfv = one,loadfctr ; Str = firm $
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 182.50025
Variances: Sigma-squared(v)= .00876
Sigma-squared(u)= .04920
Sigma(v) = .09357
Sigma(u) = .22182
Sigma = Sqr[(s^2(u)+s^2(v)]= .24075
Gamma = sigma(u)^2/sigma^2 = .84892
Var[u]/{Var[u]+Var[v]} = .67126
Variances averaged over observations
Stochastic Production Frontier, e = v-u
Stratified by FIRM , 25 groups
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -3.70847*** .75902 -4.89 .0000 -5.19612 -2.22081
LF| .38142*** .08642 4.41 .0000 .21204 .55079
LM| .57659*** .09175 6.28 .0000 .39676 .75642
LE| 2.78934*** .72692 3.84 .0001 1.36459 4.21408
LL| -.41646*** .08641 -4.82 .0000 -.58582 -.24710
LP| .59190*** .11704 5.06 .0000 .36251 .82129
LK| -2.87861*** .80566 -3.57 .0004 -4.45767 -1.29956
|Parameters in variance of v (symmetric)
Constant| -4.73798*** .21921 -21.61 .0000 -5.16764 -4.30833
|Parameters in variance of u (one sided)
Constant| 8.11346 7.80244 1.04 .2984 -7.17903 23.40596
LOADFCTR| -23.6678*** 6.88328 -3.44 .0006 -37.1588 -10.1768
FIRM001| 1.35540 7.37739 .18 .8542 -13.10403 15.81482
FIRM002| .25791 7.25149 .04 .9716 -13.95476 14.47057
FIRM003| .68176 7.22190 .09 .9248 -13.47290 14.83643
(Firms 4-20 omitted)
FIRM021| .73089 7.21226 .10 .9193 -13.40488 14.86666
FIRM022| -.38963 7.46091 -.05 .9584 -15.01274 14.23347
FIRM023| -.63171 7.53984 -.08 .9332 -15.40952 14.14610
FIRM024| -7.77451 41.07339 -.19 .8499 -88.27688 72.72786
--------+--------------------------------------------------------------------
Note: nnnnn.D-xx or D+xx => multiply by 10 to -xx or +xx.
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 190.29998
Estimation based on N = 256, K = 35
Inf.Cr.AIC = -310.6 AIC/N = -1.213
Model estimated: Aug 22, 2011, 22:57:54
Variances: Sigma-squared(v)= .00906
Sigma-squared(u)= .04124
Sigma(v) = .09519
Sigma(u) = .20307
Sigma = Sqr[(s^2(u)+s^2(v)]= .22427
Gamma = sigma(u)^2/sigma^2 = .81986
Var[u]/{Var[u]+Var[v]} = .62318
Variances averaged over observations
Stochastic Production Frontier, e = v-u
Stratified by FIRM , 25 groups
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -3.00340*** .65319 -4.60 .0000 -4.28364 -1.72316
LF| .24071*** .07721 3.12 .0018 .08938 .39204
LM| .60992*** .07600 8.03 .0000 .46096 .75887
LE| 2.19046*** .62677 3.49 .0005 .96202 3.41890
LL| -.38679*** .07314 -5.29 .0000 -.53015 -.24344
LP| .49345*** .09820 5.03 .0000 .30098 .68591
LK| -2.09638*** .69385 -3.02 .0025 -3.45631 -.73646
|Parameters in variance of v (symmetric)
Constant| -13.5487*** 2.64897 -5.11 .0000 -18.7406 -8.3569
LOADFCTR| 15.5221*** 4.48367 3.46 .0005 6.7343 24.3099
|Parameters in variance of u (one sided)
Constant| 8.01865 5.60084 1.43 .1522 -2.95879 18.99609
LOADFCTR| -23.3031*** 6.88508 -3.38 .0007 -36.7976 -9.8086
FIRM001| .88200 5.06220 .17 .8617 -9.03972 10.80373
FIRM002| -.83198 4.67591 -.18 .8588 -9.99660 8.33264
FIRM003| -.18608 4.65296 -.04 .9681 -9.30573 8.93356
(Firms 4-20 omitted)
FIRM021| .35047 4.63405 .08 .9397 -8.73210 9.43303
FIRM022| -.68781 4.83235 -.14 .8868 -10.15903 8.78342
FIRM023| -.96206 4.88186 -.20 .8438 -10.53033 8.60622
FIRM024| -2.86357 4.82675 -.59 .5530 -12.32383 6.59670
--------+--------------------------------------------------------------------
At first look, this appears to be a model with a three part disturbance, which would surely be
inestimable. But, that is incorrect. It is a model with a traditional random effect, but with the
additional feature that the time varying disturbance is not normally distributed. Specifically, the
model may be written in our familiar form for the stochastic frontier model,
The model is estimable by maximum simulated likelihood, as shown below. Contrast this to the Pitt
and Lee form,
$y_{it} = \alpha + \beta'x_{it} + v_{it} \pm u_i$
$v_{it} \sim N[0, \sigma_v^2]$
$u_i = |U_i|$, $U_i \sim N[0, \sigma_u^2]$.
In this form, ui, the time invariant effect, is the inefficiency. In the true random effects model, uit is
the inefficiency, and it is time varying. The latent heterogeneity, the random effect, is wi. Thus, in
the Pitt and Lee model, the 'inefficiency' term also contains all other time invariant unmeasured
sources of heterogeneity. In the true random effects model, these effects appear in wi, and uit picks
up the inefficiency. By this interpretation, we expect (and always find) that estimated
inefficiencies from the Pitt and Lee model are larger than those from the true random effects model,
sometimes far larger. The same result is at work in the difference between the Cornwell et al. fixed
effects model and the true fixed effects model. Figure E64.8 clearly shows the effect at work.
The true random effects model is estimated as a form of random parameters (RP) model, in
which the only random parameter in the model is the constant term. Thus, we write the model in the
canonical RP form
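$y_{it} = \alpha_i + \beta'x_{it} + v_{it} \pm u_{it}$
$v_{it} \sim N[0, \sigma_v^2]$
$u_{it} = |U_{it}|$, $U_{it} \sim N[0, \sigma_u^2]$
$\alpha_i = \alpha + w_i$
$w_i \sim N[0, \sigma_w^2]$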
Details on estimating random parameters models appear in Chapter R24, so they will be omitted
here.
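The idea behind maximum simulated likelihood for this model can be sketched as follows: for each individual, the product of normal-half normal densities over the periods is averaged across draws of the random constant. This is a minimal illustration under that assumption, not LIMDEP's implementation. All function names and data values are hypothetical, and S = 1 denotes a production frontier.

```python
import math
import random

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sf_density(eps, sigma_u, sigma_v, S=1):
    """Normal-half normal density of the compound error eps = v - S*u."""
    sigma = math.hypot(sigma_u, sigma_v)
    lam = sigma_u / sigma_v
    return (2.0 / sigma) * norm_pdf(eps / sigma) * norm_cdf(-S * eps * lam / sigma)

def simulated_loglik_i(y, X, beta, alpha, sigma_w, sigma_u, sigma_v, draws, S=1):
    """Simulated log likelihood contribution of one individual."""
    total = 0.0
    for w in draws:                              # w is a standard normal draw
        alpha_i = alpha + sigma_w * w            # random constant term
        joint = 1.0
        for y_it, x_it in zip(y, X):
            eps = y_it - alpha_i - sum(b * x for b, x in zip(beta, x_it))
            joint *= sf_density(eps, sigma_u, sigma_v, S)
        total += joint
    return math.log(total / len(draws))

# Hypothetical data for one firm observed in three periods.
rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(50)]
y = [1.0, 1.2, 0.9]
X = [[0.5], [0.7], [0.4]]
ll = simulated_loglik_i(y, X, [1.0], 0.5, 0.1, 0.1, 0.1, draws)
```

The full log likelihood would sum these contributions over individuals and be maximized over all parameters. When sigma_w is zero, the contribution collapses to the pooled model's sum of log densities.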
The command structure for the true random effects model is similar to that for the true fixed
effects model. The frontier model must be fit twice, first with no effects to generate the starting
values, then with the effect specified. The commands are
The computation of random parameters models is fairly time consuming because of the simulations.
You can control this in part with
; Pts = number of draws
For exploratory work (or for examples in program documentation), small values such as 25 or 50 are
sufficient. For final results destined for publication, larger values, in the range of several hundred
are advisable. Also, we advise using Halton sequences rather than pseudorandom numbers for the
simulations (see Chapter R24). The parameter is
; Halton
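For readers unfamiliar with Halton sequences, the sketch below generates the standard radical inverse sequence for a prime base and maps it to normal draws through the inverse normal distribution function. It illustrates the general technique only; how LIMDEP constructs its draws (for example, which primes it uses or how many initial points it discards) is not assumed here.

```python
from statistics import NormalDist

def halton(index, base):
    """Return element number `index` (1, 2, ...) of the Halton sequence for a prime base."""
    result, f, i = 0.0, 1.0, index
    while i > 0:
        f /= base
        result += f * (i % base)
        i //= base
    return result

def halton_normal_draws(n, base=2, skip=0):
    """Map n Halton points to standard normal draws, optionally skipping initial points."""
    inv = NormalDist().inv_cdf
    return [inv(halton(k, base)) for k in range(skip + 1, skip + n + 1)]

# Fifty draws, matching the number of points used in the example application.
draws = halton_normal_draws(50, base=2)
```

Because the points cover the unit interval evenly, a given level of simulation accuracy is usually reached with fewer Halton points than pseudorandom draws.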
The random parameters formulation also allows a variety of specifications for the mean of the
underlying uit – the normal-truncated normal model – and for heteroscedasticity. These are
discussed in Section E64.9.
Application
To illustrate the true random effects model, we continue the analysis of the airline data. The
commands below estimate the pooled model, then the true RE model. In like fashion to the analysis
of fixed effects, we then compare the true random effects estimates of inefficiency to the Pitt and Lee
estimates. Figure E64.8 illustrates the general result that the estimated inefficiencies in the true fixed
effects model will differ considerably from those produced by the Cornwell et al. approach to fixed
effects. Figure E64.9 shows the same result for the two approaches to random effects. Numerous
studies in the literature (see Greene (2005) for discussion) have documented the similarity of the
random and fixed approaches – when the same overall structure is used. Thus, Figure E64.10 shows
similar results for the true fixed and random effects models and for the Pitt and Lee and Cornwell et
al. models.
NAMELIST ; x = one,lf,lm,le,ll,lp,lk $
FRONTIER ; Lhs = lq ; Rhs = x ; Panel ; Eff = uplre $
FRONTIER ; Lhs = lq ; Rhs = x ; Par $
FRONTIER ; Lhs = lq ; Rhs = x ; Panel ; RPM ; Eff = utre
; Fcn = one(n) ; Pts = 50 ; Halton $
FRONTIER ; Lhs = lq ; Rhs = x ; Par $
FRONTIER ; Lhs = lq ; Rhs = x ; Panel ; FEM ; Eff = utfe $
DSTAT ; Rhs = uplre,utre $
CREATE ; utrebar = Group Mean(utre, Str = firm) $
PLOT ; Lhs = uplre ; Rhs = utrebar ; Grid
; Title = Group Means of u(i,t) vs. Time Invariant u(i) $
PLOT ; Lhs = utfe ; Rhs = utre ; Grid
; Title = Time Varying FE u(i) vs. Time Varying RE u(i) $
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 156.04955
Estimation based on N = 256, K = 9
Stochastic frontier based on panel data
Estimation based on 25 individuals
Variances: Sigma-squared(v)= .01342
Sigma-squared(u)= .06529
Sigma(v) = .11582
Sigma(u) = .25552
Sigma = Sqr[(s^2(u)+s^2(v)]= .28054
Gamma = sigma(u)^2/sigma^2 = .82955
Var[u]/{Var[u]+Var[v]} = .63879
Stochastic Production Frontier, e = v-u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 108.07431
Chi-sq=2*[LogL(SF)-LogL(LS)] = 95.950
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -1.70327*** .41761 -4.08 .0000 -2.52176 -.88477
LF| .19534** .09759 2.00 .0453 .00407 .38662
LM| .81312*** .06954 11.69 .0000 .67682 .94941
LE| 1.12741*** .34589 3.26 .0011 .44947 1.80534
LL| -.32931*** .07230 -4.55 .0000 -.47102 -.18760
LP| .22206*** .06265 3.54 .0004 .09927 .34485
LK| -.86072** .42646 -2.02 .0436 -1.69657 -.02488
|Variance parameters for compound error
Lambda| 2.20605* 1.31249 1.68 .0928 -.36639 4.77849
Sigma(u)| .25552** .10148 2.52 .0118 .05661 .45442
--------+--------------------------------------------------------------------
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable LQ
Log likelihood function 108.43918
Estimation based on N = 256, K = 9
Variances: Sigma-squared(v)= .01902
Sigma-squared(u)= .01692
Sigma(v) = .13791
Sigma(u) = .13007
Sigma = Sqr[(s^2(u)+s^2(v)]= .18957
Gamma = sigma(u)^2/sigma^2 = .47074
Var[u]/{Var[u]+Var[v]} = .24425
Stochastic Production Frontier, e = v-u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 108.07431
Chi-sq=2*[LogL(SF)-LogL(LS)] = .730
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| -2.98823*** .72136 -4.14 .0000 -4.40206 -1.57439
LF| .37257*** .07038 5.29 .0000 .23463 .51052
LM| .69910*** .07580 9.22 .0000 .55054 .84766
LE| 2.09473*** .68790 3.05 .0023 .74647 3.44299
LL| -.42909*** .06315 -6.79 .0000 -.55287 -.30530
LP| .44533*** .09498 4.69 .0000 .25917 .63149
LK| -2.09806*** .76556 -2.74 .0061 -3.59853 -.59759
|Variance parameters for compound error
Lambda| .94309*** .16870 5.59 .0000 .61244 1.27373
Sigma| .18957*** .00064 297.81 .0000 .18832 .19082
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
These are the estimates of the true random effects model. Note that the variation of the
random terms in the model has been rearranged. In the pooled model, $\sigma_v$ = 0.138 and $\sigma_u$ = 0.130. In
the random effects model, we have $\sigma_v$ = 0.099 and $\sigma_u$ = 0.100, while the estimated scale parameter of
the random constant is $\sigma_w$ = 0.117. The proportional
allocation of the total to u and v has stayed roughly the same, but some additional variation is now
attributed to the random effect. Note that the production function parameters have changed
substantially as well.
-----------------------------------------------------------------------------
Random Coefficients Frontier Model
Dependent variable LQ
Log likelihood function 160.58066
Restricted log likelihood .00000
Chi squared [ 1 d.f.] 321.16131
Significance level .00000
Estimation based on N = 256, K = 10
Inf.Cr.AIC = -301.2 AIC/N = -1.176
Model estimated: Aug 22, 2011, 23:15:44
Unbalanced panel has 25 individuals
Stochastic frontier (half normal model)
Simulation based on 50 Halton draws
Sigma( u) (1 sided) = .09962
Sigma( v) (symmetric) = .09857
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Production / Cost parameters, nonrandom first
LF| .20387*** .05183 3.93 .0001 .10229 .30545
LM| .79450*** .04660 17.05 .0000 .70318 .88583
LE| 1.10745*** .33573 3.30 .0010 .44943 1.76547
LL| -.32691*** .04277 -7.64 .0000 -.41074 -.24308
LP| .22812*** .05403 4.22 .0000 .12223 .33401
LK| -.84947** .38344 -2.22 .0267 -1.60101 -.09794
|Means for random parameters
Constant| -1.83727*** .35442 -5.18 .0000 -2.53191 -1.14263
|Scale parameters for dists. of random parameters
Constant| .11729*** .00934 12.56 .0000 .09898 .13559
|Variance parameter for v +/- u
Sigma| .14015*** .01373 10.21 .0000 .11325 .16705
|Asymmetry parameter, lambda
Lambda| 1.01064** .43792 2.31 .0210 .15234 1.86895
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
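The random coefficients output reports Sigma and Lambda rather than the two standard deviations separately. Assuming the usual definitions sigma^2 = sigma(u)^2 + sigma(v)^2 and lambda = sigma(u)/sigma(v), the sketch below recovers the values printed in the header of the listing. The function name is hypothetical.

```python
import math

def sigmas_from_sigma_lambda(sigma, lam):
    """Recover (sigma_u, sigma_v) from the compound error parameters."""
    sigma_v = sigma / math.sqrt(1.0 + lam ** 2)
    sigma_u = lam * sigma_v
    return sigma_u, sigma_v

# Sigma and Lambda copied from the true random effects output.
sigma_u, sigma_v = sigmas_from_sigma_lambda(0.14015, 1.01064)
```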
Descriptive Statistics
--------+---------------------------------------------------------------------
Variable| Mean Std.Dev. Minimum Maximum Cases Missing
--------+---------------------------------------------------------------------
UPLRE| .221170 .117670 .016992 .435912 256 0
UTRE| .078815 .031677 .026405 .305595 256 0
--------+---------------------------------------------------------------------
Figure E64.10 Comparison of Time Varying Fixed and Random Effects Estimates
The model allows, all at once, a half normal or truncated normal distribution for ui and firmwise and/or
timewise heteroscedasticity in uit. The model form allows parameters to be random in all three parts
of the specification with the single restriction noted below. (Only the variance of the 'disturbance,'
vit, is assumed to be constant. In addition, this model form does not accommodate heteroscedasticity
in vit.) As will be clear in what follows, the true random effects model developed in the previous
section is a special case of this model with nonrandom parameters in $\mu_{it}$ and $\sigma_{uit}^2$ and
only a random constant term in $\beta_i$.
NOTE: The random parameters normal-truncated normal model with heteroscedasticity (in uit) at
the same time is not identified. Only one of these two should be specified. The command parser
will not prevent you from specifying such a model, but it will ultimately be impossible to obtain the
parameter estimates.
The general structure of the random parameters stochastic frontier model is based on the
conditional density
$f(y_{it} | x_{it}, \beta_i) = f(\beta_i'x_{it})$, i = 1,...,N, t = 1,...,Ti
where $\beta_i = \beta + \Delta z_i + \Gamma v_i$
and f(.) is the density for the stochastic frontier regression model. The model assumes that
parameters are randomly distributed with possibly heterogeneous (across individuals) means
$Var[\beta_i | z_i] = \Sigma$.
As noted earlier, the heterogeneity term is optional. In addition, it may be assumed that some of the
parameters are nonrandom by placing rows of zeros in the appropriate places in $\Delta$ and $\Gamma$. The general
form of the random parameter vector $\beta_i$ is also extended to the parameter vectors in the mean and the
variance of uit. The general aspects of random
parameters model estimation in LIMDEP are described in Chapter R24.
The model command for the random parameters form of the stochastic frontier model is as
follows. The first FRONTIER command is mandatory, and is needed to obtain the starting values.
This is a pooled data version of the model. Note that it does not include the heteroscedasticity or
truncation specification, even if the second command does.
(Note, again, only one of the two optional specifications noted should be specified.)
NOTE: For this model, your Rhs list must include a constant term. Though not strictly necessary,
you should also include constants in Rh2 or Hfn if they are specified.
The ; Fcn = specification is used to define the random parameters. It is constructed from
the list of Rhs names as follows: Suppose your model is specified by
This involves five coefficients. Any or all of them may be random; any not specified as random are
assumed to be constant. For those that you wish to specify as random, use the following for
production (cost, profit) function parameters,
There are two other sets of parameters in the model, in the mean of and variance of the one sided
disturbance. To specify random parameters in the underlying mean of the truncated normal variable,
use the following:
(Note square brackets designate the terms in $\mu_{it}$.) For parameters in the computation of the variance
of uit, use
; Fcn = variable name <distribution>,
variable name <distribution>, ...
The difference in the three formulations is in the enclosures, ( ) for production function, [ ] for mean
of the truncated distribution, and <> for the variance of the one sided disturbance. This distinction
is necessary because the lists might have variables in common, and this is the only way to distinguish
them. In particular, it is likely that all three lists would include one, so this device is used to
distinguish the three functions.
Three distributions may be specified. All random variables have mean 0.
Note that each of these is scaled as it enters the distribution, so the variance is only that of the
random draw before multiplication. (See Chapter R23 for discussion of this computation and for
other distributions that can be specified.) The latter two distributions are provided as one may wish
to reduce the amount of variation in the tails of the distribution of the parameters across individuals
and to limit the range of variation. (See Train (2010) for discussion.) For example, to specify that
the constant term and the coefficient on x1 are normally distributed with fixed mean and variance,
and a normally distributed constant in the mean of the truncated distribution, you might use
This specifies that the first and second coefficients are random while the remainder are not. The
parameters estimated will be the mean and standard deviations of the distributions of these two
parameters and the fixed values of the other three.
NOTE: If you use the wrong enclosures for the variables, a diagnostic will appear that the program
does not recognize a variable. For example:
The reason for the diagnostic is that the lf[n] would indicate a specification for the truncation model,
using ; Rh2 = list. But, this command specifies only heteroscedasticity, which is denoted with <>
enclosures. Hence, when the lf[n] is encountered, LIMDEP searches for lf in an Rh2 list, and finding
no such list, issues the diagnostic.
The stochastic frontier model does not support correlated random parameters. The model is
not identified with this extension.
The preceding examples have specified that the mean of the random variable is fixed over
individuals. If there is measured heterogeneity in the means, in the form of
$$E[\beta_{ki}] = \beta_k + \sum_m \delta_{km} z_{mi}$$
where $z_{mi}$ is a variable that is measured for each individual, then the command may be modified to
In the data set, these variables must be repeated for each observation in the group. Since the
coefficients are assumed to be time invariant, the variables in zi must be also.
The variances of the underlying random variables are given earlier: 1 for the normal
distribution, 1/3 for the uniform, and 1/6 for the tent (triangular) distribution. The $\sigma_k$
parameters are standard deviations only for the normal distribution. For the other two distributions,
$\sigma_k$ is a scale parameter. The standard deviation is obtained as $\sigma_k/\sqrt{3}$ for the
uniform distribution and $\sigma_k/\sqrt{6}$ for the triangular distribution. When the parameters are
correlated, the implied covariance matrix is adjusted accordingly. The correlation matrix is unchanged by this.
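The conversion from an estimated scale parameter to a standard deviation can be checked numerically. The following is a minimal sketch, assuming the uniform and triangular draws are taken on [-1, 1] (which gives the variances 1/3 and 1/6 quoted above). The function names are hypothetical and are not part of LIMDEP.

```python
import math
import random

# Variance of the underlying draw before scaling, as quoted in the text.
BASE_VARIANCE = {"normal": 1.0, "uniform": 1.0 / 3.0, "triangular": 1.0 / 6.0}


def implied_std_dev(scale, dist):
    """Standard deviation of scale * draw for the named distribution."""
    return abs(scale) * math.sqrt(BASE_VARIANCE[dist])


def draw(dist, rng):
    """One mean-zero draw; uniform and triangular are assumed to live on [-1, 1]."""
    if dist == "normal":
        return rng.gauss(0.0, 1.0)
    if dist == "uniform":
        return rng.uniform(-1.0, 1.0)
    if dist == "triangular":
        return rng.triangular(-1.0, 1.0, 0.0)
    raise ValueError("unknown distribution: " + dist)


def simulated_std_dev(scale, dist, n=200000, seed=12345):
    """Monte Carlo check of implied_std_dev using n scaled draws."""
    rng = random.Random(seed)
    values = [scale * draw(dist, rng) for _ in range(n)]
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)


if __name__ == "__main__":
    for name in BASE_VARIANCE:
        print(name, implied_std_dev(0.125, name), simulated_std_dev(0.125, name))
```

The simulated and analytic values should agree to two or three decimal places; the scale 0.125 is only an illustrative number.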
Results saved by this estimator are:
Matrices: b = estimate of the full parameter vector
varb = asymptotic covariance matrix for the estimate in b.
beta_i = individual specific parameters, if ; Par is requested.
; Covariance Matrix displays estimated asymptotic covariance matrix (normally not shown),
same as ; Printvc.
; Robust requests a 'sandwich' estimator or robust covariance matrix for TSCS
and several discrete choice models.
Application
We continue the earlier application by fitting the stochastic frontier model with random
parameters. The random parameters truncation model appears to be unidentified in these data, so the
second model fit is with heteroscedasticity. In the first model, the constant and one of the production
coefficients are specified to be random. In the second, these two coefficients and the parameter on the
variable that enters the variance function are all taken to be random. The kernel density estimators
compare the efficiency estimates from the random parameters model to those from the simplest
pooled estimator.
NAMELIST ; x = one,lf,lm,le,ll,lp,lk $
FRONTIER ; Lhs = lq ; Rhs = x ; Eff = u $
FRONTIER ; Lhs = lq ; Rhs = x
; RPM ; Panel ; Pts = 50 ; Halton; Fcn = one(n),lf(n) ; Eff = urp1 $
KERNEL ; Rhs = urp1,u $
FRONTIER ; Lhs = lq ; Rhs = x $
FRONTIER ; Lhs = lq ; Rhs = x ; Hfn = one,loadfctr
; RPM ; Panel ; Pts = 50 ; Halton
; Fcn = one(n),lf(n),loadfctr<n> $
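The commands above request Halton draws for the simulation. As a generic illustration of what a Halton sequence is, the sketch below builds one from the radical inverse function. It is not LIMDEP's internal routine: the choice of prime bases, any discarded initial points, and the mapping of the uniform points to normal draws are not shown in this section, and the function names are hypothetical.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, base, skip=0):
    """First n_points elements of the Halton sequence for one prime base.

    skip is an optional number of initial points to discard (an assumption;
    the text does not say whether the program discards any).
    """
    start = 1 + skip
    return [radical_inverse(i, base) for i in range(start, start + n_points)]


if __name__ == "__main__":
    # One sequence per random parameter, each with its own prime base.
    print(halton(5, 2))
    print(halton(5, 3))
```

Each point lies strictly between 0 and 1, and successive points fill the unit interval more evenly than pseudorandom draws, which is the usual motivation for using a small number of Halton points.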
-----------------------------------------------------------------------------
Random Coefficients Frontier Model
Dependent variable LQ
Log likelihood function 161.33196
Restricted log likelihood .00000
Chi squared [ 2 d.f.] 322.66392
Significance level .00000
Estimation based on N = 256, K = 11
Inf.Cr.AIC = -300.7 AIC/N = -1.174
Model estimated: Aug 22, 2011, 23:28:18
Unbalanced panel has 25 individuals
Stochastic frontier (half normal model)
Simulation based on 50 Halton draws
Sigma( u) (1 sided) = .10598
Sigma( v) (symmetric) = .09399
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Production / Cost parameters, nonrandom first
LM| .81447*** .04526 18.00 .0000 .72577 .90317
LE| 1.16342*** .31391 3.71 .0002 .54817 1.77867
LL| -.33712*** .04111 -8.20 .0000 -.41769 -.25654
LP| .24213*** .04782 5.06 .0000 .14841 .33585
LK| -.94502*** .35520 -2.66 .0078 -1.64119 -.24886
|Means for random parameters
Constant| -1.89056*** .33140 -5.70 .0000 -2.54009 -1.24103
LF| .21430*** .05277 4.06 .0000 .11088 .31773
|Scale parameters for dists. of random parameters
Constant| .12526*** .00926 13.53 .0000 .10711 .14341
LF| .04979*** .00823 6.05 .0000 .03366 .06592
|Variance parameter for v +/- u
Sigma| .14165*** .01265 11.20 .0000 .11686 .16645
|Asymmetry parameter, lambda
Lambda| 1.12768*** .42335 2.66 .0077 .29792 1.95743
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
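The header values in this output can be reproduced from the reported Sigma and Lambda. The sketch below uses the definitions printed elsewhere in these listings, Sigma = Sqr[s^2(u)+s^2(v)] and Lambda = su/sv, together with the standard AIC formula, -2 logL + 2K. The function names are hypothetical.

```python
import math


def split_sigma(sigma, lam):
    """Recover (sigma_u, sigma_v) from sigma = sqrt(su^2 + sv^2) and lambda = su/sv."""
    sigma_v = sigma / math.sqrt(1.0 + lam * lam)
    sigma_u = lam * sigma_v
    return sigma_u, sigma_v


def aic(log_likelihood, n_params):
    """Akaike information criterion, -2 logL + 2K."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Values copied from the output listing above.
sigma_u, sigma_v = split_sigma(0.14165, 1.12768)
aic_value = aic(161.33196, 11)

print("sigma_u =", round(sigma_u, 5), " sigma_v =", round(sigma_v, 5))
print("AIC =", round(aic_value, 1), " AIC/N =", round(aic_value / 256, 3))
```

Up to rounding in the printed estimates, the results match the reported Sigma(u) = .10598, Sigma(v) = .09399, AIC = -300.7 and AIC/N = -1.174.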
Figure E64.11 shows the distributions of the estimates of inefficiencies from the random parameters
model and the simple, pooled fixed parameters model. The figure suggests that the RP formulation
is moving some of the variation of the outcome variable out of the inefficiency term and into the
production model, in the form of parameter variation.
Figure E64.11 Kernel Density Estimator for Random Parameters Model Inefficiencies
-----------------------------------------------------------------------------
Random Coefficients FrntrTrn Model
Dependent variable LQ
Log likelihood function 199.14429
Estimation based on N = 256, K = 13
Unbalanced panel has 25 individuals
Stochastic frontier, truncation/hetero.
Simulation based on 50 Halton draws
Estimated parameters of efficiency dstn
s(u) = .189842 s(v)= .07165
avgE[u|e]= .10986 avgE[TE|e]= .90303
Lambda = su/sv = 2.64974
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
LQ| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Nonrandom parameters
LM| .62243*** .04223 14.74 .0000 .53966 .70521
LE| .38353 .28063 1.37 .1717 -.16649 .93355
LL| -.36579*** .03589 -10.19 .0000 -.43614 -.29544
LP| .15282*** .04217 3.62 .0003 .07017 .23547
LK| -.16125 .31392 -.51 .6075 -.77652 .45401
suONE| 9.05239*** 1.65934 5.46 .0000 5.80014 12.30464
|Means for random parameters
Constant| -1.17144*** .29799 -3.93 .0001 -1.75549 -.58739
LF| .49011*** .04904 9.99 .0000 .39398 .58623
suLOADFC| -16.4160*** 3.47560 -4.72 .0000 -23.2281 -9.6039
|Scale parameters for dists. of random parameters
Constant| .12591*** .00859 14.65 .0000 .10906 .14275
LF| .01186** .00593 2.00 .0456 .00023 .02350
suLOADFC| 1.47653*** .36192 4.08 .0000 .76718 2.18589
|Sigma(v) from symmetric disturbance.
Sigma(v)| .07165*** .00670 10.69 .0000 .05851 .08478
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
where the unobservable, time invariant factor, 'mi', is labeled 'management' in their paper. By
treating the unobserved factor as a random component in the model, the authors develop a stochastic
frontier model in which the resultant functional form is such that all random parameters are functions
of the same single random effect, wi, and the wi appears in squared form in the equation as well. In
generic terms, this model is a random parameters stochastic frontier model with random constant
term and first order terms, and nonrandom second order terms in a translog model. The functional
form is
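$$\log y_{it} = \alpha_i + \sum_{k=1}^{K} \beta_{k,i} \ln x_{it,k} + \sum_{k=1}^{K} \sum_{m=1}^{K} \beta_{km} \ln x_{it,k} \ln x_{it,m} + v_{it} - u_{it}$$
$$\alpha_i = \alpha + \theta_{\alpha} w_i + \left(\tfrac{1}{2}\,\theta_{\alpha\alpha} w_i^2\right)$$
$$\beta_{k,i} = \beta_k + \theta_k w_i$$
$$w_i \sim N[0,1]$$
$$v_{it} \sim N[0, \sigma_v^2]$$
$$u_{it} = \left|\, N[0, \sigma_u^2] \,\right|$$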
This model is specified simply by creating the necessary variables, then building a random
parameters model with the two additional specifications,
; Common ; Mgt
The ; Common specification alone is generic, and applies to all random parameters models. Use it
to specify that the same random component appears in all random parameters. The ; Mgt
specification has no function outside the frontier model. It is used only with the frontier model to
specify this particular form. For example, consider the following three factor translog model:
(It is always necessary to fit the frontier model with fixed parameters first to generate the starting
values.)
An extension of this model that the authors considered was intended to ameliorate the
probable correlation between the random effect wi and the independent variables (factors). The
Mundlak approach to this problem is to incorporate the group means of the variables in the model.
For this model, they proposed
where fi is now the structural random variable that drives the random parameters. This extension is
requested with
; Means
(The program deduces internally which variables are nonconstant and should be used.)
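For readers who want to see what the Mundlak device adds to the data, the following minimal sketch computes group means of a variable and repeats each mean on every observation of its group. The program does this internally when ; Means is specified; the function and variable names here are hypothetical.

```python
from collections import defaultdict


def group_means(group_ids, values):
    """Return, for each observation, the mean of `values` within its group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for g, v in zip(group_ids, values):
        totals[g] += v
        counts[g] += 1
    # The group mean is repeated on every row of the group, so it is time invariant.
    return [totals[g] / counts[g] for g in group_ids]


if __name__ == "__main__":
    farm = [1, 1, 1, 2, 2, 2]                 # hypothetical farm identifiers
    feed = [0.2, 0.4, 0.6, 1.0, 1.5, 2.0]     # hypothetical log input values
    print(group_means(farm, feed))
```

Because the means are constant within a group, they satisfy the requirement that variables entering the means of time invariant random parameters be time invariant as well.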
Application
The following is the Alvarez, Arias and Greene application. The data consist of six years of
observations on 247 Spanish dairy farms. The output, yit, is milk production. The four inputs, x1, x2,
x3 and x4, are feed, land, labor and cows. Commands for fitting the model are as follows. (We have
restricted the number of iterations and the number of replications for the purposes of this numerical
illustration.) Both models (with and without the Mundlak adjustment) are shown.
The first set of results is the pooled stochastic frontier model with no extensions or
modifications.
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable YIT
Log likelihood function 851.16734
Estimation based on N = 1482, K = 15
Variances: Sigma-squared(v)= .00876
Sigma-squared(u)= .02831
Sigma(v) = .09359
Sigma(u) = .16825
Sigma = Sqr[(s^2(u)+s^2(v)]= .19253
Gamma = sigma(u)^2/sigma^2 = .76371
Var[u]/{Var[u]+Var[v]} = .54012
Stochastic Production Frontier, e = v-u
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 829.23705
Chi-sq=2*[LogL(SF)-LogL(LS)] = 43.861
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
YIT| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 11.6942*** .00529 2209.86 .0000 11.6838 11.7046
X1| .60483*** .02133 28.35 .0000 .56302 .64664
X2| .02246** .01140 1.97 .0489 .00011 .04480
X3| .02336* .01245 1.88 .0606 -.00104 .04776
X4| .44945*** .01172 38.34 .0000 .42647 .47242
X11| .59297*** .13525 4.38 .0000 .32789 .85806
X12| -.17183*** .04842 -3.55 .0004 -.26673 -.07693
X13| .20033*** .06903 2.90 .0037 .06502 .33563
X14| -.32993*** .07299 -4.52 .0000 -.47297 -.18688
X23| .00386 .04203 .09 .9268 -.07852 .08624
X24| .06473** .03009 2.15 .0314 .00576 .12369
X34| -.07096* .03853 -1.84 .0655 -.14648 .00455
X44| .20854*** .04328 4.82 .0000 .12373 .29336
|Variance parameters for compound error
Lambda| 1.79780*** .10292 17.47 .0000 1.59608 1.99951
Sigma| .19253*** .00011 1715.95 .0000 .19231 .19275
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
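The variance summary and the likelihood ratio statistic in this listing follow directly from the reported Sigma(u), Sigma(v) and log likelihoods. The sketch below reproduces them, using the half normal result Var[u] = (1 - 2/pi) * sigma_u^2. The function names are hypothetical.

```python
import math


def variance_summary(sigma_u, sigma_v):
    """Return (gamma, share) for the normal-half normal model.

    gamma = sigma_u^2 / (sigma_u^2 + sigma_v^2)
    share = Var[u] / (Var[u] + Var[v]), with Var[u] = (1 - 2/pi) * sigma_u^2
    """
    s2u = sigma_u ** 2
    s2v = sigma_v ** 2
    gamma = s2u / (s2u + s2v)
    var_u = (1.0 - 2.0 / math.pi) * s2u
    share = var_u / (var_u + s2v)
    return gamma, share


def lr_statistic(logl_frontier, logl_ols):
    """Likelihood ratio statistic 2*[LogL(SF) - LogL(LS)]."""
    return 2.0 * (logl_frontier - logl_ols)


# Values copied from the listing above.
gamma, share = variance_summary(0.16825, 0.09359)
lr = lr_statistic(851.16734, 829.23705)

print("gamma =", round(gamma, 5), " Var[u] share =", round(share, 5))
print("LR =", round(lr, 3), " exceeds 95% Kodde-Palm value:", lr > 2.706)
```

The gap between gamma (about .76) and the variance share (about .54) arises because sigma_u^2 is the variance of the underlying normal variable U, not of the inefficiency u = |U|.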
+---------------------------------------------+
| Random Coefficients Frontier Model |
| Dependent variable YIT |
| Log likelihood function 1327.58807 |
| Estimation based on N = 1482, K = 21 |
| Sample is 6 pds and 247 individuals |
+---------------------------------------------+
-----------------------------------------------------------------------------
All parameters have the same random effect
Alvarez/Arias/Greene Fixed Mgt. SF Model
Stochastic frontier (half normal model)
Simulation based on 25 Halton draws
Sigma( u) (1 sided) = .09355
Sigma( v) (symmetric) = .05799
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
YIT| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Production / Cost parameters, nonrandom first
X11| .19550** .08392 2.33 .0198 .03101 .35999
X12| -.00410 .02903 -.14 .8876 -.06100 .05279
X13| -.03972 .04116 -.96 .3346 -.12039 .04095
X14| -.08681** .04220 -2.06 .0397 -.16952 -.00410
X23| .02377 .02534 .94 .3483 -.02590 .07344
X24| -.01893 .01743 -1.09 .2775 -.05310 .01524
X34| .02550 .02305 1.11 .2684 -.01967 .07067
X44| .09988*** .02339 4.27 .0000 .05403 .14572
|Means for random parameters
Constant| 11.6506*** .00445 2620.80 .0000 11.6418 11.6593
X1| .65048*** .01227 53.03 .0000 .62643 .67452
X2| .03525*** .00681 5.17 .0000 .02190 .04861
X3| .04531*** .00759 5.97 .0000 .03043 .06019
X4| .40147*** .00646 62.16 .0000 .38881 .41413
|Coefficients on unobservable fixed management
Constant| .12579*** .00238 52.96 .0000 .12114 .13045
X1| -.02248* .01218 -1.85 .0649 -.04635 .00139
X2| .00767 .00851 .90 .3676 -.00902 .02436
X3| .00794 .00939 .85 .3979 -.01047 .02635
X4| -.00967 .00657 -1.47 .1410 -.02255 .00320
Alpha_mm| -.02835*** .00414 -6.85 .0000 -.03646 -.02024
|Variance parameter for v +/- u
Sigma| .11007*** .00289 38.04 .0000 .10439 .11574
|Asymmetry parameter, lambda
Lambda| 1.61332*** .11959 13.49 .0000 1.37893 1.84771
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
+---------------------------------------------+
| Random Coefficients Frontier Model |
| Dependent variable YIT |
| Log likelihood function 1273.63070 |
| Sample is 6 pds and 247 individuals |
+---------------------------------------------+
-----------------------------------------------------------------------------
All parameters have the same random effect
Alvarez/Arias/Greene Fixed Mgt. SF Model
Stochastic frontier (half normal model)
Simulation based on 25 Halton draws
Sigma( u) (1 sided) = .12577
Sigma( v) (symmetric) = .05376
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
YIT| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Production / Cost parameters, nonrandom first
X11| -.06957 .08521 -.82 .4142 -.23658 .09743
X12| .00164 .02989 .05 .9562 -.05693 .06022
X13| .31592*** .04339 7.28 .0000 .23087 .40097
X14| -.08946* .04767 -1.88 .0606 -.18289 .00398
X23| -.02088 .02784 -.75 .4533 -.07545 .03369
X24| -.04357** .01912 -2.28 .0227 -.08103 -.00610
X34| -.15581*** .02350 -6.63 .0000 -.20187 -.10975
X44| .16310*** .02763 5.90 .0000 .10895 .21725
|Means for random parameters
Constant| 11.6829*** .00449 2601.72 .0000 11.6741 11.6917
X1| .60260*** .02198 27.41 .0000 .55951 .64569
X2| .05221*** .01636 3.19 .0014 .02015 .08427
X3| .10728*** .02775 3.87 .0001 .05290 .16166
X4| .39780*** .01047 38.00 .0000 .37728 .41832
|Coefficients on unobservable fixed management
Constant| .11398*** .00235 48.52 .0000 .10937 .11858
X1| -.05393*** .01134 -4.76 .0000 -.07616 -.03171
X2| .03061*** .00916 3.34 .0008 .01265 .04857
X3| .01309 .01202 1.09 .2760 -.01046 .03665
X4| .01621** .00707 2.29 .0218 .00236 .03007
Alpha_mm| -.03575*** .00368 -9.72 .0000 -.04296 -.02855
|Variance parameter for v +/- u
Sigma| .13678*** .00368 37.19 .0000 .12957 .14399
|Asymmetry parameter, lambda
Lambda| 2.33925*** .14491 16.14 .0000 2.05524 2.62326
|Variable Means in Unobserved Management
X1_bar| -.12466 .22073 -.56 .5722 -.55728 .30796
X2_bar| .00045 .15758 .00 .9977 -.30839 .30930
X3_bar| .01632 .25437 .06 .9489 -.48224 .51487
X4_bar| .15107 .11332 1.33 .1825 -.07102 .37316
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
where 'j' indicates class j. The truncation and heteroscedasticity models are not supported by this
estimator. However, the Battese and Coelli model, in which
(As in other panel data settings, it is necessary to fit the pooled model first to compute the starting
values.)
The Battese and Coelli models may be specified here with
; Model = BC
; Model = BC
; Hfu = one, heteroscedasticity variables
For this model, you must fit the identical Battese and Coelli model without the latent class
specification first. The application below demonstrates.
The basic form of the latent class model assumes that the class probabilities are fixed values.
You may make them dependent on time invariant variables, wi with
Some particular variables computed for the latent class model are
; List
An example appears below. You can also use the ; Rst = list option to structure the latent class
model so that different variables appear in different classes or that certain coefficients are equal
across classes. Examples are given in Chapter E20.
Estimates retained by this model include:
Note that b and varb involve J(K+2) estimates. Two additional matrices are created,
Standard Model Specifications for the Latent Class Stochastic Frontier Model
This is the full list of general specifications that are applicable to this model estimator.
; Covariance Matrix displays estimated asymptotic covariance matrix (normally not shown),
same as ; Printvc.
; Robust requests a 'sandwich' estimator or robust covariance matrix for TSCS and
several discrete choice models.
Application
The airline data used in the preceding examples are clearly not compatible with this model;
no configuration of the equation produces meaningful results. To illustrate the estimator, we have
borrowed the Spanish dairy data used in the previous section. The following commands fit a two
class, Battese and Coelli decay model.
NAMELIST ; x = one,x1,x2,x3,x4 $
FRONTIER ; Lhs = yit ; Rhs = x
; Model = BC
; Pds = 6 $
FRONTIER ; Lhs = yit ; Rhs = x
; Model = BC
; LCM ; Pts = 2 ; Pds = 6 ; List $
-----------------------------------------------------------------------------
Limited Dependent Variable Model - FRONTIER
Dependent variable YIT
Log likelihood function 1390.20024
Stochastic frontier based on panel data
Estimation based on 247 individuals
Variances: Sigma-squared(v)= .00549
Sigma-squared(u)= .03940
Sigma(v) = .07413
Sigma(u) = .19848
Sigma = Sqr[(s^2(u)+s^2(v)]= .21187
Gamma = sigma(u)^2/sigma^2 = .87759
Var[u]/{Var[u]+Var[v]} = .72263
Stochastic Production Frontier, e = v-u
Battese-Coelli Models: Time Varying uit
Time dependent uit=exp[-eta(t-T)]*|U(i)|
LR test for inefficiency vs. OLS v only
Deg. freedom for sigma-squared(u): 1
Deg. freedom for heteroscedasticity: 0
Deg. freedom for truncation mean: 0
Deg. freedom for inefficiency model: 1
LogL when sigma(u)=0 809.67610
Chi-sq=2*[LogL(SF)-LogL(LS)] = 1161.048
Kodde-Palm C*: 95%: 2.706, 99%: 5.412
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
YIT| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Deterministic Component of Stochastic Frontier Model
Constant| 11.7882*** .00716 1646.05 .0000 11.7742 11.8022
X1| .62230*** .01365 45.59 .0000 .59555 .64905
X2| .06001*** .01069 5.61 .0000 .03905 .08096
X3| .05708*** .01454 3.93 .0001 .02858 .08557
X4| .35510*** .00700 50.69 .0000 .34137 .36883
|Variance parameters for compound error
Lambda| 2.67761*** .02351 113.88 .0000 2.63152 2.72369
Sigma(u)| .19848*** .00060 332.72 .0000 .19731 .19965
|Eta parameter for time varying inefficiency
Eta| .08030*** .00432 18.60 .0000 .07184 .08877
--------+--------------------------------------------------------------------
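The time path implied by the estimated Eta can be tabulated from the decay form printed in the listing, uit = exp[-eta(t-T)]*|U(i)|. The sketch below computes the multiplier exp[-eta(t-T)] for each of the T = 6 periods; the function name is hypothetical.

```python
import math


def decay_multipliers(eta, n_periods):
    """Multipliers exp[-eta * (t - T)] for t = 1, ..., T in the Battese-Coelli model."""
    return [math.exp(-eta * (t - n_periods)) for t in range(1, n_periods + 1)]


# Eta copied from the listing above; the panel has six periods.
multipliers = decay_multipliers(0.08030, 6)

for period, value in enumerate(multipliers, start=1):
    print(period, round(value, 4))
```

With a positive Eta, the multiplier is largest in the first period and falls to exactly 1 in the last period, so estimated inefficiency declines over time toward |U(i)|.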
+---------------------------------------------------+
| Stochastic Frontier Model Variance Parameters |
| Class Lambda Sigma Sigma(u) Sigma(v) |
| 1 .020709 .711607 .014734 .711454 |
| 2 .050840 .928393 .047139 .927195 |
+---------------------------------------------------+
=============================================================================
Predictions computed for the group with the largest posterior probability
Obs. Periods Estimated inefficiencies, E[u|v -/+ u]
=============================================================================
Ind.= 1 J* = 1 P(j)= .889 .111
01-06 .3105 .2554 .2100 .1727 .1421 .1168
Ind.= 2 J* = 2 P(j)= .295 .705
01-06 .0813 .0757 .0706 .0657 .0613 .0571
Ind.= 3 J* = 2 P(j)= .012 .988
01-06 .2254 .2100 .1957 .1824 .1699 .1584
Ind.= 4 J* = 1 P(j)= .955 .045
01-06 .1778 .1463 .1203 .0989 .0814 .0669
Ind.= 5 J* = 1 P(j)= .650 .350
01-06 .2453 .2018 .1659 .1365 .1122 .0923
Ind.= 6 J* = 2 P(j)= .138 .862
01-06 .0517 .0482 .0449 .0418 .0390 .0363
Ind.= 7 J* = 1 P(j)= .985 .015
01-06 .3010 .2476 .2036 .1674 .1377 .1132
Ind.= 8 J* = 2 P(j)= .165 .835
01-06 .0561 .0523 .0487 .0454 .0423 .0394
Ind.= 9 J* = 2 P(j)= .450 .550
01-06 .0134 .0125 .0116 .0108 .0101 .0094
Ind.= 10 J* = 1 P(j)= .999 .001
01-06 .1039 .0855 .0703 .0578 .0475 .0391
(Farms 11-247 omitted)
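The class-specific decay parameters are not shown in this excerpt, but they can be backed out of the listed paths. The sketch below assumes the listed values follow the decay form uit = exp[-eta(t-T)]*|U(i)| exactly, apart from rounding, so that the ratio of successive estimates equals exp(-eta). The function name is hypothetical.

```python
import math


def implied_eta(path):
    """Average decay rate implied by a path of inefficiency estimates.

    If u_t = exp[-eta * (t - T)] * U, then u_{t+1} / u_t = exp(-eta), so
    eta = -log(u_last / u_first) / (number of steps).
    """
    steps = len(path) - 1
    return -math.log(path[-1] / path[0]) / steps


# Paths copied from the listing above.
farm_1 = [0.3105, 0.2554, 0.2100, 0.1727, 0.1421, 0.1168]   # assigned to class 1
farm_2 = [0.0813, 0.0757, 0.0706, 0.0657, 0.0613, 0.0571]   # assigned to class 2

eta_class_1 = implied_eta(farm_1)
eta_class_2 = implied_eta(farm_2)
print(round(eta_class_1, 3), round(eta_class_2, 3))
```

The two farms imply clearly different decay rates, which is consistent with each path being computed from the parameters of the class with the largest posterior probability for that farm.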
The optimization program seeks the optimal weights to maximize the 'efficiency' of firm s subject to
the restriction that the efficiencies of all firms are less than or equal to one, and that all weights are
nonnegative. Because the objective function is homogeneous of degree zero – any multiple of the
weights produces the same solution – it is normalized with a restriction such as $v'x_i = 1$, where $v$ denotes the input weights.
Transforming and simplifying the problem a bit produces the equivalent program,
An equivalent form of the problem is the envelopment form (hence the name),
The value of $\theta_i$ is the input oriented technical efficiency score for the ith firm
$$TE_{INPUT,i} = \theta_i.$$
It measures the extent to which the firm could reduce inputs to obtain the same output, relative to
other firms in the sample. Note that the program is solved for each firm in the sample, so an efficiency
score $\theta_i$ is generated for each firm. For some firms in the sample, the efficiency score will be 1.0.
This indicates firms deemed to be technically efficient. Otherwise, $\theta_i < 1$.
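In the special case of one input, one output and constant returns to scale, the linear program has a closed form solution: each firm's score is its output-input ratio divided by the best ratio in the sample. The sketch below illustrates this special case only; it is not a general DEA solver and not LIMDEP's routine, and the data and function name are hypothetical.

```python
def crs_efficiency(inputs, outputs):
    """Input oriented CRS efficiency scores for single-input, single-output data.

    With one input and one output, the constant returns frontier is the ray
    through the firm with the largest output/input ratio, so the score is the
    firm's own ratio relative to that best ratio.
    """
    productivity = [y / x for x, y in zip(inputs, outputs)]
    best = max(productivity)
    return [p / best for p in productivity]


if __name__ == "__main__":
    x = [2.0, 4.0, 5.0, 10.0]   # hypothetical input quantities
    y = [2.0, 2.0, 5.0, 6.0]    # hypothetical output quantities
    print(crs_efficiency(x, y))
```

Firms on the frontier receive a score of 1.0 and all others receive a score below 1, matching the interpretation of the efficiency score given above. With several inputs or outputs, or under variable returns to scale, the linear program must be solved for each firm.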
The preceding formulation includes an implicit assumption of constant returns to scale
(CRS). The assumption is relaxed to variable returns to scale (VRS), by adding a restriction
$$\sum_s \lambda_s = 1.$$
Variable returns to scale is the standard assumption in contemporary applications. This provides a
means by which the 'scale efficiency' of the firm can be measured. Let $\theta_i^C$ denote the technical
efficiency measure obtained assuming constant returns and $\theta_i^V$ be the variable returns to scale
counterpart. Then, the 'scale efficiency' may be measured by
This can be computed using the results of the two different programs after computation. A
'nonincreasing returns to scale' (NRS) version of the program can be obtained by changing the adding
up restriction to
$$\sum_s \lambda_s \le 1.$$
An alternative view of the optimization process is to consider the extent to which outputs
could conceivably be increased using the same inputs – again relative to the standard of other firms
in the sample. The linear program which produces this solution is
As before, to allow for variable returns to scale (VRS), we add $\sum_s \lambda_s = 1$. In this program, $x_i^*$ gives the
cost minimizing vector of inputs for output $y_i$ and input prices $w_i$. The cost efficiency for the ith firm is
then the ratio
$$0 < CE_i = w_i' x_i^* / w_i' x_i \le 1.$$
Allocative efficiency may be measured using
$$0 < AE_i = CE_i / TE_{INPUT,i} \le 1.$$
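The decomposition of cost efficiency into technical and allocative parts is simple arithmetic once the linear programs have been solved. The sketch below assumes the cost minimizing input vector and the input oriented technical efficiency score are already available; all numbers and function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def cost_efficiency(prices, observed_inputs, cost_min_inputs):
    """CE = minimum cost / observed cost at the firm's own input prices."""
    return dot(prices, cost_min_inputs) / dot(prices, observed_inputs)


def allocative_efficiency(ce, te_input):
    """AE = CE / TE, the part of cost inefficiency not due to technical inefficiency."""
    return ce / te_input


# Hypothetical firm: two inputs, prices w, observed inputs x, cost minimizing inputs x_star.
w = [1.0, 2.0]
x = [4.0, 3.0]
x_star = [2.0, 2.0]
te = 0.8  # hypothetical input oriented technical efficiency from the DEA program

ce = cost_efficiency(w, x, x_star)
ae = allocative_efficiency(ce, te)
print("CE =", ce, " AE =", ae)
```

Since cost efficiency cannot exceed technical efficiency, the allocative efficiency measure also lies between 0 and 1.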
We will define the components for the three programs defined earlier. Note, first, for convenience,
we define the data matrices, Y and X. Y is an $N \times M$ matrix of outputs whose ith row is the vector of
outputs for firm i; X is the $N \times K$ matrix of inputs, defined likewise. For an individual firm, we define
yi to be the $M \times 1$ column vector of outputs for firm i; thus, yi is the transpose of the ith row of Y.
Likewise, xi is the column vector of K inputs for firm i, the transpose of the ith row of X. Finally,
the column vector of weights is $\lambda = (\lambda_1, \ldots, \lambda_N)'$. Thus,
Finally, we note once again, the programs about to be defined are solved for each firm to obtain the
efficiency scores. (In fact, $\lambda$ should be indexed by firm, since it is recomputed each time. For
convenience, we have omitted this subscript.) We use the symbol ∞K and ∞M to indicate a vector
whose each element equals infinity (or sometimes minus infinity) and boldface 1 or 0 to indicate a
vector of ones or zeros with a subscript to indicate the number of elements. Finally, our tableaus
include the VRS restriction, which may be suppressed by the user for the CRS form.
With all this in place, we can define the solutions to the optimization problems just by
identifying the components of the linear programming problems. These are as follows:
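$$d_L = \begin{bmatrix} 0_N \\ 0 \end{bmatrix}, \quad c = \begin{bmatrix} 0_N \\ 1 \end{bmatrix}, \quad \begin{bmatrix} \lambda \\ \theta_i \end{bmatrix}, \quad d_U = \begin{bmatrix} 1_N \\ 1 \end{bmatrix}$$
$$b_L = \begin{bmatrix} -\infty_K \\ y_i \\ 1 \end{bmatrix}, \quad A = \begin{bmatrix} X' & -x_i \\ Y' & 0_M \\ 1_N' & 0 \end{bmatrix}, \quad b_U = \begin{bmatrix} 0_K \\ \infty_M \\ 1 \end{bmatrix}$$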
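$$d_L = \begin{bmatrix} 0_N \\ 1 \end{bmatrix}, \quad c = \begin{bmatrix} 0_N \\ 1 \end{bmatrix}, \quad \begin{bmatrix} \lambda \\ \phi_i \end{bmatrix}, \quad d_U = \begin{bmatrix} 1_N \\ \infty \end{bmatrix}$$
$$b_L = \begin{bmatrix} -\infty_K \\ 0_M \\ 1 \end{bmatrix}, \quad A = \begin{bmatrix} X' & 0_K \\ Y' & -y_i \\ 1_N' & 0 \end{bmatrix}, \quad b_U = \begin{bmatrix} x_i \\ \infty_M \\ 1 \end{bmatrix}$$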
Allocative Efficiency
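$$d_L = \begin{bmatrix} 0_N \\ 0_K \end{bmatrix}, \quad c = \begin{bmatrix} 0_N \\ w_i \end{bmatrix}, \quad \begin{bmatrix} \lambda \\ x_i^* \end{bmatrix}, \quad d_U = \begin{bmatrix} 1_N \\ \infty_K \end{bmatrix}$$
$$b_L = \begin{bmatrix} -\infty_K \\ y_i \\ 1 \end{bmatrix}, \quad A = \begin{bmatrix} X' & -I_K \\ Y' & 0_{M \times K} \\ 1_N' & 0_K' \end{bmatrix}, \quad b_U = \begin{bmatrix} 0_K \\ \infty_M \\ 1 \end{bmatrix}$$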
One final note, DEA requires a fair amount of computation. The linear program involves
M+K+1 constraints and N+1 activities, and it is computed once for each of the N firms in the sample.
The amount of computation increases with the square of N. The particular computations are quite
fast, however
; CRS
to the command. The nonincreasing returns to scale form ($\sum_i \lambda_i \le 1$) is requested with
; NRS
The program computes the DEA efficiency scores (input and output oriented, and economic
efficiency), and stores them as variables and as matrices. (See the description in the next section.) If
you would like to see a listing of the scores on your screen, in the output window, add
; List
to the command. The list of 'peer' firms for each observation (see Section E65.5.1 below) may be
requested by adding
; Peers
to the command. Finally, to obtain bootstrapped confidence limits for the estimator, add
+---------------------------------------------------------------------------+
| Data Envelopment Analysis |
| Output Variables: MILK |
| Input Variables: COWS LAND LABOR FEED |
| Underlying Technology assumes VARIABLE Returns to Scale. |
+---------------------------------------------------------------------------+
| Estimated Efficiencies: Mean Std.Deviation Minimum Maximum |
| Technical Efficiency ======= ============= ======= ======= |
| Input Oriented .8301 .1416 .4823 1.0000 |
| Output Oriented .7388 .1268 .3875 1.0000 |
| Sample Size: 1482 Observations. 1482 Complete observations |
| Efficiencies saved as variables DEAEFF_O, DEAEFF_I and DEAEFF_E |
| Efficiencies saved as matrices DEA_EFFO, DEA_EFFI and DEA_EFFE |
| Incomplete observations are filled with zeros for efficiency values. |
+---------------------------------------------------------------------------+
As noted, the computed efficiency scores are saved in two places. In the data area, they are saved as
variables deaeff_i and deaeff_o, plus deaeff_e if you provide input prices for the economic efficiency analysis.
The same results are saved as matrices, dea_effo, dea_effi, dea_effe. Note that in both occurrences,
the estimator is bypassing missing and bad (nonpositive) data. If any of the variables used in the
analysis are missing, the observation is assigned an efficiency score of 0.0. The matrices will have
row dimension equal to the original sample size, before the bypass of missing values.
The example below includes a listing of the efficiency scores. The observation identifier
shows I = the sequence number of the observation used in the analysis. The R = value shows,
instead, the actual location of the observation in the raw data set. I will not equal R if you have used
a subset of the data (e.g., with SAMPLE or REJECT), or if the program has bypassed missing data
– the listing will only show the complete observations. If you have included observation labels, e.g.,
firm names, in your data set, these observation and row identifiers will be replaced with the
observation names for your data set.
For a second example, the following analyzes the Christensen and Greene (1976) electricity
generation data. For these data, we have the input prices, so we do the full analysis.
+---------------------------------------------------------------------------+
| Data Envelopment Analysis |
| Output Variables: OUTPUT |
| Input Variables: LABOR CAPITAL FUEL |
| Price Variables: LPRICE CPRICE FPRICE |
| Underlying Technology assumes VARIABLE Returns to Scale. |
+---------------------------------------------------------------------------+
| Estimated Efficiencies: Mean Std.Deviation Minimum Maximum |
| Technical Efficiency ======= ============= ======= ======= |
| Input Oriented .7692 .1390 .3464 1.0000 |
| Output Oriented .7657 .1467 .2960 1.0000 |
| Economic Efficiency .4331 .1965 .1411 1.0000 |
| Allocative Effic. .5473 .1754 .1796 1.0000 |
| Sample Size: 123 Observations. 123 Complete observations |
| Efficiencies saved as variables DEAEFF_O, DEAEFF_I and DEAEFF_E |
| Efficiencies saved as matrices DEA_EFFO, DEA_EFFI and DEA_EFFE |
| Incomplete observations are filled with zeros for efficiency values. |
| Compute allocative efficiency as technical divided by economic efficiency |
+---------------------------------------------------------------------------+
It is always interesting to compare the DEA results with those obtained using the stochastic
frontier model. The following fits a translog stochastic frontier production function for the
Christensen and Greene data, computes the technical efficiencies, and plots them against the DEA
efficiency scores. As has been widely documented, the results are not so close to each other as one
might hope.
E65.5.2 Application
The following uses all the features of the routine save for the Malmquist TFP computation
and the allocative efficiency routine. The sample data are in an Excel spreadsheet:
+---------------------------------------------------------------------------+
| Data Envelopment Analysis |
| Output Variables: CAMERAS VIDEO WARRANTY |
| Input Variables: FLOOR STAFF |
| Underlying Technology assumes CONSTANT Returns to Scale. |
+---------------------------------------------------------------------------+
| Estimated Efficiencies: Mean Std.Deviation Minimum Maximum |
| Technical Efficiency ======= ============= ======= ======= |
| Input Oriented .9132 .1270 .6387 1.0000 |
| Output Oriented .9132 .1270 .6387 1.0000 |
| Sample Size: 11 Observations. 11 Complete observations |
| Efficiencies saved as variables DEAEFF_O, DEAEFF_I and DEAEFF_E |
| Efficiencies saved as matrices DEA_EFFO, DEA_EFFI and DEA_EFFE |
| Incomplete observations are filled with zeros for efficiency values. |
+---------------------------------------------------------------------------+
SAMPLE ; All $
REJECT ; Small > 0 $
CREATE ; dalebar = Group Mean(dale, Str = country) $
CREATE ; hexpbar = Group Mean(hexp, Str = country) $
CREATE ; educbar = Group Mean(educ, Str = country) $
REJECT ; year # 1997 $
CREATE ; logdbar = Log(dalebar) $
CREATE ; loghbar = Log(hexpbar) $
CREATE ; logebar = Log(educbar) $
FRONTIER ; Lhs = logdbar ; Rhs = one,loghbar,logebar ; Techeff = effsfa $
FRONTIER ; Lhs = dalebar ; Rhs = hexpbar,educbar ; Alg = DEA $
DSTAT ; Rhs = effsfa,deaeff_i,deaeff_o ; Output = 2 $
PLOT ; Lhs = effsfa ; Rhs = deaeff_i ; Grid
; Title = SFA Efficiencies vs. DEA Input Efficiencies $
PLOT ; Lhs = effsfa ; Rhs = deaeff_o ; Limits=.4,1.1 ; Grid
; Title = SFA Efficiencies vs. DEA Output Efficiencies $
CREATE ; sfarank = Rnk(effsfa) $
CREATE ; dearanki = Rnk(deaeff_i) $
CREATE ; dearanko = Rnk(deaeff_o) $
CALC ; List ; Rkc(sfarank,dearanki)
; Rkc(sfarank,dearanko)
; Rkc(dearanki,dearanko) $
PLOT ; Lhs = sfarank ; Rhs = dearanki
; Endpoints = 0,200 ; Limits = 0,200 ; Grid
; Title = Ranks of SFA Efficiencies vs. DEA Input Efficiencies $
PLOT ; Lhs = sfarank ; Rhs = dearanko
; Endpoints = 0,200 ; Limits = 0,200 ; Grid
; Title = Ranks of SFA Efficiencies vs. DEA Output Efficiencies $
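The ranking steps in the commands above compare the ordering of the units rather than the levels of their scores. The Python sketch below shows one way such a comparison can be computed, on the assumption that the rank correlation requested with Rkc is of the Spearman type (the correlation of the two rank vectors); that reading is not confirmed here. The function names and the five scores are hypothetical. Ties receive their average rank, which matters for DEA because several units usually sit at exactly 1.0.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def rank_correlation(a, b):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / (va * vb) ** 0.5

# Hypothetical efficiency scores for five units (not the sample used above).
sfa = [0.88, 0.81, 0.95, 0.90, 0.84]
dea = [0.70, 0.40, 1.00, 1.00, 0.55]
print(average_ranks(dea))
print(round(rank_correlation(sfa, dea), 4))
```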
+---------------------------------------------------------------------------+
| Data Envelopment Analysis |
| Output Variables: DALEBAR |
| Input Variables: HEXPBAR EDUCBAR |
| Underlying Technology assumes VARIABLE Returns to Scale. |
+---------------------------------------------------------------------------+
| Estimated Efficiencies: Mean Std.Deviation Minimum Maximum |
| Technical Efficiency ======= ============= ======= ======= |
| Input Oriented .6138 .2089 .2059 1.0000 |
| Output Oriented .8794 .1124 .5061 1.0000 |
| Sample Size: 191 Observations. 191 Complete observations |
| Efficiencies saved as variables DEAEFF_O, DEAEFF_I and DEAEFF_E |
| Efficiencies saved as matrices DEA_EFFO, DEA_EFFI and DEA_EFFE |
| Incomplete observations are filled with zeros for efficiency values. |
+---------------------------------------------------------------------------+
Descriptive Statistics
--------+---------------------------------------------------------------------
Variable| Mean Std.Dev. Minimum Maximum Cases Missing
--------+---------------------------------------------------------------------
EFFSFA| .882053 .059219 .801579 .982272 191 0
DEAEFF_I| .613836 .208905 .205870 1.0 191 0
DEAEFF_O| .879363 .112447 .506133 1.0 191 0
--------+---------------------------------------------------------------------
--------+--------------------------
Cor.Mat.| EFFSFA DEAEFF_I DEAEFF_O
--------+--------------------------
EFFSFA| 1.00000 .70610 .75911
DEAEFF_I| .70610 1.00000 .72559
DEAEFF_O| .75911 .72559 1.00000
Figure E65.4 Plot of Ranks of SFA Efficiency Scores vs. Ranks of DEA Scores
$$M_{i,O}(t,t+1) = \frac{TE_i(t+1 \mid t) \times TE_i(t+1 \mid t+1)}{TE_i(t \mid t) \times TE_i(t \mid t+1)}$$
where $TE_i(r \mid s)$ indicates the earlier defined output oriented technical efficiency index for firm i, using
inputs $x_{i,r}$ and producing outputs $y_{i,r}$, relative to production (and input usage) for firms based in period s.
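The index combines four efficiency scores for the same firm: each period's data measured against each period's frontier. The Python sketch below carries out that arithmetic for the ratio as it is printed above. The function name, its argument names and the four input values are hypothetical. The `geometric_mean` flag is an addition for illustration only: in the productivity literature (Färe et al.) the Malmquist index is usually defined as the square root of this ratio, and the flag makes it easy to see both versions.

```python
def malmquist_index(te_next_on_t, te_next_on_next, te_t_on_t, te_t_on_next,
                    geometric_mean=False):
    """Malmquist index from four output oriented technical efficiencies.

    te_a_on_b stands for TE(a | b): period-a data measured against the
    period-b frontier.
    """
    ratio = (te_next_on_t * te_next_on_next) / (te_t_on_t * te_t_on_next)
    return ratio ** 0.5 if geometric_mean else ratio

# Hypothetical scores for one firm; not taken from any sample in the text.
as_printed = malmquist_index(0.95, 0.90, 0.85, 0.80)
root_form = malmquist_index(0.95, 0.90, 0.85, 0.80, geometric_mean=True)
print(round(as_printed, 4), round(root_form, 4))
```

A value above one indicates that the firm's position relative to the two frontiers improved between t and t + 1; a value below one indicates a decline.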
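This index is computed using the following program:
$$d_L = \begin{pmatrix} 0_N \\ 0 \end{pmatrix},\quad c = \begin{pmatrix} 0_N \\ 1 \end{pmatrix},\quad \begin{pmatrix} \lambda \\ \theta_{ir} \end{pmatrix},\quad d_U = \begin{pmatrix} +\infty_N \\ +\infty \end{pmatrix}$$
$$b_L = \begin{pmatrix} -\infty_K \\ 0_M \end{pmatrix},\quad A = \begin{pmatrix} X_s & 0_K \\ Y_s & -y_{ir} \end{pmatrix},\quad b_U = \begin{pmatrix} x_{ir} \\ +\infty_M \end{pmatrix}$$
This uses the constant returns to scale form. Also, since the period r output and input vectors for firm i
will not appear in $Y_s$ and $X_s$ when r does not equal s, $\theta_{ir}$ need not be larger than one. Note that this
requires solution of four linear programs for each firm in each period, so the total number of programs to
solve will be 4NT. Each is quite fast, so overall, the computations do not take long. In the sample of
247 firms and six periods, the nearly 6,000 programs, each involving 248 activities and six constraints,
took about 10 seconds.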
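Two points in this paragraph can be checked with a few lines of Python: the count of programs, and the remark that the scalar in the program (the quantity subscripted ir) need not exceed one when the firm's own period is not the reference period. The sketch below is not the routine's algorithm. It uses the special case of one input and one output under constant returns to scale, where the frontier is the ray through the reference firm with the largest output/input ratio, so the expansion factor has a closed form and no LP solver is needed. The function names and data are hypothetical.

```python
def number_of_programs(n_firms, n_periods):
    """Four linear programs per firm per period, the 4NT stated in the text."""
    return 4 * n_firms * n_periods

def expansion_factor(x_ir, y_ir, xs, ys):
    """Largest proportional output expansion for (x_ir, y_ir) that stays on or
    below the CRS frontier of the reference data (xs, ys).

    Single input, single output only. A value of 1.0 means the point lies on
    the reference frontier; below 1.0 means it lies beyond that frontier.
    """
    best_ratio = max(y / x for x, y in zip(xs, ys))
    return best_ratio / (y_ir / x_ir)

print(number_of_programs(247, 6))          # farms and periods from the text

# Hypothetical period-s reference data: three firms.
xs, ys = [2.0, 4.0, 5.0], [2.0, 6.0, 5.0]
inside = expansion_factor(2.0, 2.0, xs, ys)    # a period-s firm, r = s
outside = expansion_factor(4.0, 7.0, xs, ys)   # a later observation, r != s
print(round(inside, 4), round(outside, 4))
```

When r equals s the firm is part of its own reference set, so the factor cannot fall below one. When r differs from s nothing prevents the observation from lying beyond the period s frontier, which is the case the text describes.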
These computations are carried out for each firm in each period save the last one, and produce an
N×T matrix of TFP values, one row for each firm, one column for each period. The TFP value for the last
period is recorded as 1.0, though this is just a space filler.
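The layout just described can be mimicked in a few lines of Python. This is a sketch of the bookkeeping only, not of the routine itself: each firm contributes T - 1 computed values, a filler of 1.0 completes the row, and period averages are taken over the computed columns. The function names are hypothetical, the averages are assumed to be arithmetic means across firms, and the two rows of values are those listed for the first two farms in the output shown later in this section.

```python
def tfp_matrix(changes_by_firm, filler=1.0):
    """Append the placeholder final-period column to each firm's T-1 values."""
    return [list(row) + [filler] for row in changes_by_firm]

def period_means(matrix):
    """Arithmetic mean of every column except the placeholder last one."""
    n = len(matrix)
    return [sum(row[t] for row in matrix) / n for t in range(len(matrix[0]) - 1)]

# TFP values for firms 1 and 2, periods 1 to 5, copied from the listing.
changes = [
    [1.1301, 1.1002, 0.9736, 1.0291, 1.0901],
    [1.0528, 1.0343, 1.0212, 1.0109, 1.0416],
]
table = tfp_matrix(changes)
print(len(table), len(table[0]))              # rows (firms), columns (periods)
print([round(m, 4) for m in period_means(table)])
```

With only two firms the means will not match the period averages reported for the full sample of 247.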
To compute the Malmquist TFP indices, you will require a panel of data, at least two periods, for
each of N firms. Unlike other panel data routines in LIMDEP, this computation always requires a
balanced panel. Every firm must be observed in the same T periods. Also, this routine has no procedures
for avoiding missing or invalid data such as zero values for inputs or outputs. The balanced panel must be
'clean' before computation begins. To request the computations, just add
; Pds = t, the fixed number of periods.
Nothing else need be changed. There is no bootstrap feature (; Nbt = 0); the computations assume
constant returns to scale (; CRS is the default and cannot be changed) and no allocative efficiency (; Rh2
is ignored).
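Since the routine does no screening of its own, it can be worth verifying the panel before the computation is requested. The Python sketch below is one hypothetical way to do that outside LIMDEP. It assumes the data are available as (firm, period, values) records with periods coded 1 to T, and it treats missing, zero and negative inputs or outputs as invalid (the text names missing and zero values; rejecting negatives is an added assumption).

```python
def check_balanced_panel(rows, n_periods):
    """Return a list of problems; an empty list means balanced and clean.

    rows: iterable of (firm_id, period, values), where values holds the
    inputs and outputs for that firm in that period.
    """
    problems = []
    periods_by_firm = {}
    for firm, period, values in rows:
        periods_by_firm.setdefault(firm, []).append(period)
        for v in values:
            if v is None or v != v or v <= 0:   # missing, NaN, zero or negative
                problems.append(f"firm {firm}, period {period}: invalid value {v!r}")
    expected = list(range(1, n_periods + 1))
    for firm in sorted(periods_by_firm):
        observed = sorted(periods_by_firm[firm])
        if observed != expected:               # gaps or duplicated periods
            problems.append(
                f"firm {firm}: observed in periods {observed}, expected {expected}")
    return problems

# Hypothetical two-period panel: firm 2 has a zero output and lacks period 2.
rows = [
    (1, 1, (10.0, 3.0)), (1, 2, (11.0, 3.5)),
    (2, 1, (8.0, 0.0)),
]
for line in check_balanced_panel(rows, n_periods=2):
    print(line)
```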
To illustrate the Malmquist computations, we reexamine the sample of 247 Spanish dairy farms
observed for six years. The output is milk production. Inputs are cows, land, labor and feed.
The following results are displayed. In addition, a matrix containing the full table, named malmqist, is
created.
==============================================================================
Malmquist TFP Index for Productivity Change
Panel contained 247 firms each observed in 6 periods
Full Results saved as matrix MALMQIST
==============================================================================
Average results across firms, by period:
==============================================================================
Period: 1 2 3 4 5
TFP 1.0476 1.0233 1.0247 1.0298 1.0349
==============================================================================
Individual calculations by firm
(Only 8 periods can be displayed. TFP for the final period is not computed.)
==============================================================================
Observation 1 2 3 4 5 6 7 8
Firm = 1 1.1301 1.1002 .9736 1.0291 1.0901 1.
Firm = 2 1.0528 1.0343 1.0212 1.0109 1.0416 1.
Firm = 3 1.0525 1.0383 .9477 1.0465 1.0395 1.
Firm = 4 1.1418 1.0129 1.0079 .9829 1.0476 1.
Firm = 5 1.1192 1.0240 1.0082 1.0245 1.0641 1.
Firm = 6 .9871 1.0073 .9785 1.0322 1.0464 1.
Firm = 7 .9851 1.1484 1.1599 .8054 1.1110 1.
Firm = 8 1.0746 .9796 .9636 1.0671 .9753 1.
Firm = 9 .8977 1.1496 .9818 1.0500 .9867 1.
Firm = 10 1.0105 1.1507 .9751 1.0055 1.0469 1.
Firm = 11 1.1276 .9867 .9636 1.0826 .9873 1.
Firm = 12 1.0310 1.1020 .9822 1.0438 .9914 1.
Firm = 13 1.0549 1.1263 .9221 1.0723 1.1945 1.
Firm = 14 .9408 1.0740 .9938 .9739 1.0336 1.
Firm = 15 .8952 .7156 1.5056 .8614 .9204 1.
(Rows 66 – 247 omitted).