CHAPTER ONE

AN INTRODUCTION TO DATA ENVELOPMENT ANALYSIS

VINCENT CHARLES AND MUKESH KUMAR

Abstract
The introductory chapter begins with a brief introduction to data
envelopment analysis (DEA) and its origin, followed by the basic
assumptions. Further, the concept of efficiency is introduced, through
worked examples and graphs, to researchers unfamiliar with the technique.
Finally, the basic models of DEA are provided for direct reference for
those interested in efficiency evaluation, ranking, and benchmarking of
decision-making units (DMUs).

1.1 Introduction
Economics and operations research have common interests in several
research fields, including the analysis of the production possibilities for
micro units. The stochastic frontier approach (SFA) (parametric) and the
data envelopment analysis (DEA) (non-parametric) models have emerged
as two alternative developments of ideas that originated with Farrell
(1957). Grosskopf (1986) noted that the parametric approach has been
developed mainly by economists, whereas the nonparametric has been left
to those in operations research. The popularity of DEA over econometric
approaches lies in its flexibility to incorporate the existence of multiple
inputs and multiple outputs readily without any assumption on the
functional form.
Several features make DEA a relatively superior tool for the
evaluation of efficiency. Firstly, a typical statistical approach
is characterised as a central tendency approach and it evaluates producers
relative to an average producer. In contrast, DEA is an extreme-point
method and compares each producer with only the best producer(s).
Secondly, DEA does not require any underlying assumption of a
functional form relating inputs to outputs. Given the set of inputs and
outputs of different firms, it constructs its own functional form. Thus, it
avoids the danger of misspecification of the frontier technology. In contrast,
the econometric approach assumes a functional form, such as Cobb-Douglas
or Translog, relating inputs to output. Thirdly, the parametric
approach estimates the efficiency of firms producing a single output with a
set of multiple inputs, whereas DEA readily incorporates the existence
of multiple outputs. Fourthly, in the parametric approach, the decomposition
of the error term into two parts, one representing stochastic error and the
other representing inefficiency, is not useful for datasets of fewer than
100 observations (Aigner, Lovell, & Smith, 1997). DEA, on the other
hand, works well with a small sample size. As a rule of thumb, the
minimum sample size required for DEA is three times
the sum of the number of inputs and outputs (Nunamaker, 1985; Raab &
Lichty, 2002).
Recent years have seen a great variety of applications of DEA in almost
every field, including agriculture, banking¹, benchmarking, education,
energy and environment², economy, government, health, insurance,
information technology, marketing, operations, public policy, human
resources, manufacturing, retailing, regulation, services and tourism, and
others. Several authors have surveyed the general DEA literature³ and
provided scenarios for DEA methodology⁴ development in different time
periods for a range of issues. The strong growth of DEA research in recent
years has produced a very large literature. The rapid
increase in its popularity can be inferred from the fact that Seiford (1994),
in his DEA bibliography, listed no fewer than 472 published articles and
accepted Ph.D. dissertations, some dated even as early as 1992.
Emrouznejad et al. (2008) provided a detailed bibliography of DEA
literature published in various journals/book chapters/proceedings since
1978, which clearly shows an exponential growth in the literature. In one
of the more recent survey papers, Liu et al. (2012) reported that up to the
year 2009, the field had accumulated approximately 4,500 papers in the ISI
Web of Science Database.

1.2 The Origin of Data Envelopment Analysis


The specific research strand of efficiency measurement for production
units in the field of operations research took off with Measuring the
Efficiency of Decision Making Units by Charnes, Cooper, and Rhodes
(1978)⁵ as the seminal paper (Førsund & Sarafoglou, 2002). However, the
intellectual root of DEA in economics can be traced all the way back to the
early 1950s. In the aftermath of World War II, linear programming (LP)
came to be recognised as a powerful tool for economic analysis. The
papers in the Cowles Commission monograph, Activity Analysis of
Production and Resource Allocation, edited by Koopmans (1951),
recognised the commonality between the existence of nonnegative prices and
quantities in a Walras-Cassel economy and the mathematical programming
problem of optimising an objective function subject to a set of linear
inequality constraints. Koopmans (1951) defined a point in the commodity
space as efficient whenever an increase in the net output of one product
can be achieved only at the cost of a decrease in the net output of another
product. In view of its obvious similarity with the condition for Pareto
optimality, this definition is known as the Pareto-Koopmans condition of
technical efficiency. In the same year, Debreu (1951) defined the
coefficient of resource utilisation as a measure of technical efficiency for
the economy as a whole, and any deviation of this measure from unity was
interpreted as deadweight loss suffered by society due to inefficient
utilisation of resources.
Farrell (1957) made a path-breaking contribution through the seminal
work The Measurement of Productive Efficiency by constructing an LP
model using actual input-output data for a sample of firms, the solution of
which yields a numerical measure of the technical efficiency of an individual
firm in the sample. He demonstrated that economic efficiency could be
decomposed into allocative efficiency and technical efficiency.
Technical efficiency reflects the ability of a firm to obtain the maximal
output from a given set of inputs, whereas allocative efficiency
reflects the ability of a firm to use the inputs in optimal proportions, given
the prices of the resources. The idea of Farrell (1957) can be illustrated
with a simple example involving firms using two inputs (X1 and X2) to
produce a single output (Y) under the assumption of constant returns to
scale (CRS). The assumption of CRS implies that a radial increase in the input
vector causes the same proportional increase in the output vector
(doubling of all inputs leads to doubling of all outputs).
A further assumption is that the efficient production function is known.
The curve SS′ in Figure 1-1 represents the unit isoquant of an efficient
producer. To produce a unit of output, let a firm use the quantities of
inputs denoted by the point A. If we draw a line from the origin to the
point A, it will intersect the efficient isoquant at the point B. This means
that if inputs can be reduced equiproportionately, the efficient point will be
at point B, which must lie on the efficient isoquant SS′. Thus, point B
represents the combination of inputs in the same proportion as in point A
but with a lesser amount of both inputs to produce a unit level of output.
Only the fraction OB/OA of the inputs is now needed to produce the same level of
output or, in other words, OA/OB times as much output can be produced from
the given level of both the inputs. The technical efficiency of the firm can
be measured by the ratio

TE = OB/OA

where TE ≤ 1. TE = 1 indicates that the firm is technically efficient,
whereas TE < 1 indicates that the firm is technically inefficient.

Figure 1–1 Technical, allocative and economic efficiency

In the above definition of efficiency, the role of input prices in
measuring efficiency is not considered. In order to assess the efficient
allocation of inputs in terms of input prices, let us introduce the price line
or isocost line PP′ in Figure 1-1. It represents the minimum cost required,
given the prices of inputs, for the use of the same proportion of inputs as is used
at point B. Thus, the ratio OC/OB gives the measure of price efficiency or
allocative efficiency.

AE = OC/OB

where the distance BC represents the reduction in production costs that
would occur if production were to occur at the allocatively (and
technically) efficient point B′ instead of at the technically efficient, but
allocatively inefficient, point B.
Taking technical and allocative efficiency together, the
ratio OC/OA gives the measure of overall efficiency or economic
efficiency (EE).

EE = OC/OA

where the distance AC can also be interpreted in terms of cost reduction. It
should be noted that the product of technical and allocative efficiency
provides the economic efficiency and all three measures are bounded by 0
and 1.

\[
TE \times AE = \frac{OB}{OA} \times \frac{OC}{OB} = \frac{OC}{OA} = EE
\]
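
As a quick numerical check of this identity, the following minimal sketch plugs in hypothetical distances along the ray from the origin through A. The values 10, 8, and 6 are illustrative assumptions and are not read from Figure 1-1.

```python
# Hypothetical distances along the ray from the origin through A.
# These numbers are assumptions for illustration, not data from the chapter.
OA, OB, OC = 10.0, 8.0, 6.0

te = OB / OA   # technical efficiency
ae = OC / OB   # allocative (price) efficiency
ee = OC / OA   # economic (overall) efficiency

print(f"TE = {te:.2f}, AE = {ae:.2f}, EE = {ee:.2f}")
print("TE x AE equals EE:", abs(te * ae - ee) < 1e-12)
```

Because OC ≤ OB ≤ OA along the ray, each of the three ratios stays between 0 and 1, as stated above.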

There are two empirical approaches to the measurement of efficiency
based on the above concepts of technical and allocative efficiency. The
first, favoured by most economists, is parametric (either stochastic or
deterministic), where the form of production function (or isoquant) is
either assumed to be known or is estimated statistically. However, in many
cases, the functional form of production function (or isoquant) is
unknown. In the nonparametric approach, no functional form is assumed a
priori and the piecewise linear convex isoquant is constructed empirically
from observed inputs and outputs, as is shown in Figure 1-2.

Figure 1–2 Piecewise linear convex isoquant

If Farrell’s (1957) article is taken as the seminal work, the fundamental
research reported in 1978 is undoubtedly the basis for subsequent
developments in the nonparametric approach to evaluating technical
efficiency. In subsequent work, Charnes and Cooper (1985) provided
the formal definition of efficiency as follows:

100% efficiency is attained for a production unit only when

a) None of its outputs can be increased without either (i)
   increasing one or more of its inputs, or (ii) decreasing some
   of its other outputs.
b) None of its inputs can be decreased without either (i)
   decreasing some of its outputs, or (ii) increasing some of its
   other inputs.

This definition is in accordance with the economist’s concept of Pareto
(Pareto-Koopmans) optimality. If there is no way of establishing a true or
theoretical model of efficiency, that is, the absolute standard, then the
definition needs to be adapted so that it refers to levels of efficiency
relative to known levels attained elsewhere in similar circumstances.
Charnes and Cooper (1985) thus provided the further definition:

100% relative efficiency is attained by any (unit) only when comparisons
with other relevant (units) do not provide the evidence of inefficiency in
the use of any input or output.

1.3 Evaluating Efficiency: Numerical Examples and Graphic Presentation
The heart of DEA lies in creating the best virtual producer for
each real producer. If a given producer, A, is capable of producing Y(A)
units of output with X(A) units of inputs, then other producers should also
be able to do the same if they were to operate efficiently. Similarly, if
producer B is capable of producing Y(B) units of output with X(B) units of
inputs, then other producers should also be capable of doing the same.
Producers A, B, and others can then be combined to form a composite
producer with composite inputs and composite outputs. Since this
composite producer does not necessarily exist, it is called a virtual
producer.
The producer is considered inefficient if virtual output > actual
output for a given input level or virtual input < actual input for a given
output level. The former is known as output-oriented technical efficiency,
whereas the latter is known as input-oriented technical efficiency. These
two orientations of efficiency measurement address two different
questions:

• The input-oriented measures of technical efficiency address the
  question, By how much can input be reduced while attaining the
  current level of output?
• The output-oriented measures of technical efficiency address the
  question, By how much can output be increased while keeping the
  level of current input fixed?

1.3.1 Technical efficiency: one input-one output case

To understand the concept of efficiency, we begin with a simple
example of a one-input and one-output case. Consider four branches of
Interbank in Peru for the evaluation of efficiency. For each branch, we
have a single output measure, personal transactions (PT, measured in
thousands), and a single input measure, managerial staff (measured in
numbers), as shown, respectively, in columns 3 and 4 of Table 1-1a. For
example, for the Chacarilla branch in one year, there were 50,000 personal
transactions and 17 staff members were employed.

Table 1–1a Input (#1) and Output (#1) of Bank Branches and their
Relative Efficiency (Input-Oriented)

Code  Branch        PT (Y1)  Staff (X1)  PT/Staff (Y1/X1)  Relative Efficiency
A     Santa Anita   130      20          130/20 = 6.50     6.50/6.50 = 1.00
B     Chacarilla    50       17          50/17 = 2.94      2.94/6.50 = 0.45
C     El Polo       85       18          85/18 = 4.72      4.72/6.50 = 0.73
D     Jockey Plaza  25       11          25/11 = 2.27      2.27/6.50 = 0.35

Here, branches can be seen as taking inputs and converting them (with
varying degrees of efficiency) into outputs. The commonly used method to
measure efficiency is a ratio, which requires a straightforward calculation
based on data on a single output and input. In this case, the single ratio can
be obtained by dividing the number of personal transactions (Y1) by the number of
staff (X1), as shown in column 5 of Table 1-1a. One can observe that Santa
Anita (A) has the highest ratio of personal transactions per staff member
(6.50), whereas Jockey Plaza (D) has the lowest ratio of personal
transactions per staff member (2.27).
As Santa Anita (A) has the highest ratio of 6.50, all other branches can
be compared to it and their relative efficiency calculated with respect to
the Santa Anita branch. To do this, the ratio of each branch is divided by
6.50 (the value for Santa Anita), as shown in the last column of Table 1-1a,
to obtain the relative efficiency of each branch. One can observe that the
value of efficiency varies between 0 and 1. The only branch that is
efficient is Santa Anita (A), with an efficiency score of 1. The most
inefficient branch is Jockey Plaza (D), with an efficiency score of 0.35.
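
The ratio calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in Table 1-1a; the dictionary layout and the function name `relative_efficiency` are assumptions made for this example.

```python
# Branch data from Table 1-1a: (personal transactions in thousands, staff).
branches = {
    "A": (130, 20),  # Santa Anita
    "B": (50, 17),   # Chacarilla
    "C": (85, 18),   # El Polo
    "D": (25, 11),   # Jockey Plaza
}

def relative_efficiency(data):
    """Divide each branch's output/input ratio by the best observed ratio."""
    ratios = {code: y / x for code, (y, x) in data.items()}
    best = max(ratios.values())
    return {code: ratio / best for code, ratio in ratios.items()}

scores = relative_efficiency(branches)
for code, score in scores.items():
    print(code, round(score, 2))
```

Rounded to two decimals, the scores reproduce the last column of Table 1-1a.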
Similarly, one can obtain the efficiency of bank branches by using the
ratio of number of staff (X1) to number of personal transactions (Y1), as
shown in the fifth column of Table 1-1b. It can be seen that Santa Anita
(A) incurs the lowest cost per personal transaction, whereas Jockey Plaza
(D) incurs the highest cost per personal transaction.

Table 1–1b Input (#1) and Output (#1) of Bank Branches and their
Relative Efficiency (Output-Oriented)

Code  Branch        PT (Y1)  Staff (X1)  Staff/PT (X1/Y1)  (Staff/PT)/Min(Staff/PT)  Relative Efficiency
A     Santa Anita   130      20          20/130 = 0.15     0.15/0.15 = 1.00          1/1.00 = 1.00
B     Chacarilla    50       17          17/50 = 0.34      0.34/0.15 = 2.27          1/2.27 = 0.44
C     El Polo       85       18          18/85 = 0.21      0.21/0.15 = 1.40          1/1.40 = 0.71
D     Jockey Plaza  25       11          11/25 = 0.44      0.44/0.15 = 2.93          1/2.93 = 0.34

As Santa Anita (A) has the lowest per unit cost of personal transactions
(i.e., 0.15), we can compare all other branches to it and calculate their
relative efficiency with respect to the Santa Anita branch. To do this, we
can divide the ratio of each branch by 0.15 (the value for Santa Anita), as
shown in column 6. However, in this case, the efficiency of each branch (shown
in column 7) is the reciprocal of the value presented in column 6.
The concept of input and output-oriented technical efficiency is
illustrated in Figure 1-3. The output, number of personal transactions
(Y1), is represented on the Y-axis, whereas the input, number of staff (X1),
is shown on the X-axis. The input-output combinations of bank branches
are represented by branch codes, A, B, C, and D. The straight line from the
origin through point A represents the frontier technology, which shows the
maximum attainable level of output (number of personal transactions),
given the level of input (number of staff). Given the frontier technology
OA, all other branches are inefficient except the Santa Anita (A) branch.

Figure 1–3 Technical efficiency: one-input and one-output case

Let the reference branch be Jockey Plaza (D), which is using OM
amounts of input to achieve the current level of output, MD. If Jockey
Plaza (D) were technically efficient, the input that is required to achieve
the current level of output would have been ON. The technical efficiency
under input-orientation is defined as the maximum proportional reduction
in input for a given level of output.

TEI = ON/OM = 3.84/11 = 0.35

The subscript I indicates that the efficiency is input-oriented.
The efficiency of Jockey Plaza (D) can also be evaluated from the
output-orientation. Given the current level of input OM, the maximum
potential level of output that could be achieved by branch D is MD′. Under
the output-orientation, the technical efficiency is the reciprocal of the
maximum proportional expansion of output (i.e., the reciprocal of the ratio
of distances MD′ to MD).

TEO = MD/MD′ = 25/71.50 = 0.35

The subscript O indicates that the efficiency is output-oriented.
It should be noted that under CRS, the efficiency under output
orientation is exactly the same as the efficiency under input orientation.
This can be easily inferred from Figure 1-3 through the property of
similar triangles, as given below, where the third step uses NP = MD:

\[
\triangle OPN \sim \triangle OD'M
\;\Rightarrow\; \frac{NP}{ON} = \frac{MD'}{OM}
\;\Rightarrow\; \frac{NP}{MD'} = \frac{ON}{OM}
\;\Rightarrow\; \frac{MD}{MD'} = \frac{ON}{OM}
\;\Rightarrow\; TE_O = TE_I
\]
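
The equality of the two orientations under CRS can also be checked numerically for Jockey Plaza (D). The sketch below assumes, as in the text, that the CRS frontier is the ray through Santa Anita (A); the variable names are chosen for this example only.

```python
# CRS frontier: output per staff member at Santa Anita (A), 130/20.
frontier_slope = 130 / 20
staff_d, output_d = 11, 25          # Jockey Plaza (D): OM and MD

efficient_input = output_d / frontier_slope    # ON: input needed on the frontier
potential_output = staff_d * frontier_slope    # MD': output attainable with OM

te_input = efficient_input / staff_d           # ON / OM
te_output = output_d / potential_output        # MD / MD'

print(round(te_input, 2), round(te_output, 2))
```

Both ratios reduce to 25 / (11 × 6.5), which is why the two orientations coincide when the frontier is a ray through the origin.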

1.3.2 Technical efficiency: one input-two outputs case

Suppose now that we have two output measures (number of personal
transactions, PT, and number of business transactions, BT, completed) and the
same single input measure (number of staff), as shown in Table 1-2a. For
example, for the branch, Chacarilla (B), in one year, 50,000 transactions
relating to personal accounts and 25,000 transactions relating to business
accounts were conducted, and 17 staff members were employed. Here, the
performance of the branches needs to be assessed on how efficiently they
use their single input (number of staff) to produce the two distinct
categories of transaction outputs.

Table 1–2a Input (#1) and Outputs (#2) of Bank Branches and their
Output-Input Ratios

Code  Branch        PT (Y1)  BT (Y2)  Staff (X1)  PT/Staff (Y1/X1)  BT/Staff (Y2/X1)
A     Santa Anita   130      55       20          6.50              2.75
B     Chacarilla    50       25       17          2.94              1.47
C     El Polo       85       60       18          4.72              3.33
D     Jockey Plaza  25       15       11          2.27              1.36

The output-oriented technical efficiency of each branch in producing
two outputs can be examined by dividing each of its outputs by its input.
As one can observe, Santa Anita (A) has the highest ratio of personal
transactions per staff member, whereas El Polo (C) has the highest ratio of
business transactions per staff member. Chacarilla (B) and Jockey Plaza
(D) do not compare so well with Santa Anita (A) and El Polo (C), so are
presumably underperforming. Thus, Chacarilla (B) and Jockey Plaza (D)
are relatively less efficient at using their given input resource (number of
staff) to produce the two outputs, number of personal transactions and number of
business transactions.
One problem with comparison via ratios in this case is that different
ratios can give a different picture and it is difficult to combine the entire
set of ratios into a single numeric judgment. For example, consider
Chacarilla (B) and Jockey Plaza (D): Chacarilla is (2.94/2.27) = 1.29 times
as efficient as Jockey Plaza at personal transactions but only (1.47/1.36) = 1.08
times as efficient at business transactions. Thus, the challenge here is to
combine these numbers into a single judgment.
Figure 1-4a shows the X-axis representing the personal transactions per
staff member (Y1/X1) and the Y-axis representing the business transactions
per staff member (Y2/X1). The observed level of two outputs of a bank
branch is represented by its branch code. The frontier technology is
formed by connecting the observations A and C and further extending the
line horizontally from point C to the Y-axis and extending the line
perpendicularly from point A to the X-axis.

Figure 1–4a Output technical efficiency: one input – two outputs case

Any branches on the frontier are technically efficient (TEO = 1).
Hence, for our example, Santa Anita (A) and El Polo (C) are efficient.
However, this does not mean that the performance of Santa Anita (A) and El
Polo (C) could not be improved. In fact, what we can say is that on the
evidence (data) available, we have no idea of the extent to which their
performance can be improved. Clearly, Chacarilla (B) and Jockey Plaza
(D) are inefficient (TEO < 1).
If the branch, Jockey Plaza (D), is placed under evaluation, the ratio
personal transactions/business transactions = (25/15) = 1.67, that is, there are
1.67 personal transactions for every business transaction. Mathematically, the
value 1.67 is also the ratio of personal transactions per staff
member/business transactions per staff member, that is, 2.27/1.36 = 1.67.
Numerically, we can measure the (relative) efficiency of Jockey Plaza (D)
by the following ratio: length of line from origin to Jockey Plaza
(OD)/length of line from origin through Jockey Plaza to efficient frontier
(OR) = 2.646/6.126 = 0.432.
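
The radial measure OD/OR can be reproduced by intersecting the ray from the origin through D with the frontier segment joining A and C. The following is a minimal sketch under the assumption that D projects onto that segment (the returned position confirms this); the helper `project_on_line` is a hypothetical name introduced for this example.

```python
# Outputs per staff member from Table 1-2a: (PT/Staff, BT/Staff).
A = (130 / 20, 55 / 20)   # Santa Anita
C = (85 / 18, 60 / 18)    # El Polo
D = (25 / 11, 15 / 11)    # Jockey Plaza, the branch under evaluation

def project_on_line(point, p, q):
    """Solve t * point = p + s * (q - p) for (t, s) with Cramer's rule.

    t is the radial scaling that moves `point` onto the line through p and q;
    0 <= s <= 1 means the projection falls on the segment between p and q.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    det = dx * point[1] - dy * point[0]
    t = (dx * p[1] - dy * p[0]) / det
    s = (point[0] * p[1] - point[1] * p[0]) / det
    return t, s

expansion, position = project_on_line(D, A, C)
te_output = 1 / expansion   # OD / OR

print(round(te_output, 3), round(position, 3))
```

Here `expansion` is the factor by which both outputs of D would have to grow to reach the frontier, so its reciprocal is the output-oriented efficiency score.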
The input-oriented technical efficiency of each branch can be examined by
dividing its input by each of its outputs, as shown in Table 1-2b. Here
we can see that Santa Anita (A) uses the fewest staff per personal
transaction (output Y1), whereas El Polo (C) uses the fewest staff per
business transaction (output Y2). On the other hand,
Jockey Plaza (D) uses the most staff per transaction for both
the outputs.

Table 1–2b Input (#1) and Outputs (#2) of Bank Branches and their
Input-Output Ratios

Code  Branch        PT (Y1)  BT (Y2)  Staff (X1)  Staff/PT (X1/Y1)  Staff/BT (X1/Y2)
A     Santa Anita   130      55       20          0.15              0.36
B     Chacarilla    50       25       17          0.34              0.68
C     El Polo       85       60       18          0.21              0.30
D     Jockey Plaza  25       15       11          0.44              0.73

Let the X-axis represent the number of staff per number of personal
transactions (X1/Y1) and the Y-axis represent the number of staff per
number of business transactions (X1/Y2). The frontier is formed by joining
A to C and extending the line from point A (parallel to the Y-axis) and
point C (parallel to the X-axis). Clearly, the branches A and C are efficient
(TEI = 1) as they lie on the frontier of the isoquant, whereas the other two
branches, B and D, are technically inefficient (TEI < 1) as they lie above
the isoquant. The further away a branch is from the isoquant, the more
inefficient it is.
The input-oriented technical efficiency of branch D as shown in
Figure 1-4b can be obtained by taking the following ratio: length of line
from origin to efficient frontier (OD′)/length of line from origin to Jockey
Plaza (OD) = 0.368/0.852 = 0.432.

Figure 1–4b Input technical efficiency: one input – two outputs case

1.3.3 Technical efficiency: two inputs-one output case

Now let us assume that each branch is producing the single output Y1
(number of personal transactions) with the help of two inputs, namely X1
(number of staff) and X2 (size of the branch), as shown in Table 1-3a. For
example, Chacarilla (B) completes 50,000 personal transactions with the
help of 17 staff members in a branch of 80 square metres.

Table 1–3a Inputs (#2) and Output (#1) of Bank Branches and their
Output-Input Ratios

Code  Branch        PT (Y1)  Staff (X1)  Size (X2)  PT/Staff (Y1/X1)  PT/Size (Y1/X2)
A     Santa Anita   130      20          150        6.50              0.87
B     Chacarilla    50       17          80         2.94              0.63
C     El Polo       85       18          40         4.72              2.13
D     Jockey Plaza  25       11          45         2.27              0.56

Let us divide the output (Y1) by each of its inputs X1 and X2 to obtain
the output per unit of each input. This is reflected in the last two columns
of Table 1-3a. One can observe that Santa Anita (A) produces the highest
number of units of output per unit of staff (X1), whereas El Polo (C)
produces the highest number of units of output per unit of size (X2). On the
other hand, Jockey Plaza (D) produces the least number of units of output
per unit of each of the inputs. In Figure 1-5a, the X-axis represents the
personal transactions per staff (Y1/X1) and the Y-axis represents the
personal transactions per unit of size (Y1/X2). The production frontier is
formed by connecting the observations A and C and further extending the
line from the point C to the Y-axis and from the point A to the X-axis.

Figure 1–5a Output technical efficiency: two inputs – one output case

Clearly, branches A and C are technically efficient (TEO = 1) and the
other two branches, B and D, are technically inefficient (TEO < 1). In order
to derive the efficiency score of D, a line can be drawn from the origin
passing through the inefficient observation D to the frontier. The
output-oriented technical efficiency of Jockey Plaza (D) is the ratio of the length
of line from origin to Jockey Plaza (OD)/length of line from origin through
Jockey Plaza to efficient frontier (OR) = 0.994/2.338 = 0.425.
Similarly, to obtain the input-oriented technical efficiency, let us
divide each input by the output to derive the cost in terms of each input for every
unit of output. This is shown in the last two columns of Table 1-3b. One
can observe that Santa Anita (A) incurs the lowest cost in terms of staff
per personal transaction, whereas El Polo (C) incurs the lowest cost in
terms of size per personal transaction. On the other hand, Jockey Plaza (D)
incurs the highest cost in terms of both the inputs per personal
transaction.

Table 1–3b Inputs (#2) and Output (#1) of Bank Branches and their
Input-Output Ratios

Code  Branch        PT (Y1)  Staff (X1)  Size (X2)  Staff/PT (X1/Y1)  Size/PT (X2/Y1)
A     Santa Anita   130      20          150        0.15              1.15
B     Chacarilla    50       17          80         0.34              1.60
C     El Polo       85       18          40         0.21              0.47
D     Jockey Plaza  25       11          45         0.44              1.80

In Figure 1-5b, the X-axis represents the number of staff per number of
personal transactions (X1/Y1) and the Y-axis represents size of branch per
number of personal transactions (X2/Y1). The isoquant is formed by
connecting the observations A and C and further extending the line upward
from point A (parallel to the Y-axis) and towards the right from the point C
(parallel to the X-axis).

Figure 1–5b Input technical efficiency: two inputs – one output case


The observations that lie on the frontier of the isoquant are technically
efficient (TEI = 1), whereas the observations that lie above and to the right of
the isoquant are technically inefficient (TEI < 1). Thus, the technically
efficient branches are Santa Anita (A) and El Polo (C), and the technically
inefficient branches are Chacarilla (B) and Jockey Plaza (D). In order to
evaluate the efficiency of the Jockey Plaza (D) branch, we can draw a line
from the origin passing through the isoquant to the observation D. The
efficiency of Jockey Plaza (D) can then be measured by the following
ratio: length of line from origin to the isoquant (OD′)/length of line from
origin to the observed point (OD) = OD′/OD = 0.787/1.853 = 0.425.
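
The input-oriented score of D can be reproduced by contracting D radially until it meets the isoquant segment joining A and C. This is a sketch under the assumption that D projects onto that segment; `contraction_to_isoquant` is a hypothetical helper name used only here.

```python
import math

# Inputs per personal transaction from Table 1-3b: (Staff/PT, Size/PT).
A = (20 / 130, 150 / 130)   # Santa Anita
C = (18 / 85, 40 / 85)      # El Polo
D = (11 / 25, 45 / 25)      # Jockey Plaza, the branch under evaluation

def contraction_to_isoquant(point, p, q):
    """Return (theta, s) such that theta * point = p + s * (q - p)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    det = dx * point[1] - dy * point[0]
    theta = (dx * p[1] - dy * p[0]) / det
    s = (point[0] * p[1] - point[1] * p[0]) / det
    return theta, s

theta, position = contraction_to_isoquant(D, A, C)
od = math.hypot(*D)        # distance from the origin to D
od_prime = theta * od      # distance from the origin to D' on the isoquant

print(round(od_prime, 3), round(od, 3), round(theta, 3))
```

The factor `theta` is the proportion to which both inputs of D could be scaled down while still producing the same output, which is the ratio OD′/OD.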

1.3.4 Returns to scale and technical efficiency

The issue of returns to scale concerns what happens to units’ outputs
when the firms change the amount of inputs that they are using to produce
their outputs. So far, we have discussed the cases of frontier technology
under CRS wherein a radial increase in the input vector causes the same
proportional increase in the output vector. However, the assumption of
CRS could be considered restrictive in many situations, and a true
frontier production function might, in fact, be more appropriately represented
by variable returns to scale (VRS), which encompasses both
increasing and decreasing returns to scale.
In increasing returns to scale (IRS), a radial increase in the input vector
causes a larger proportional radial increase in the output vector. In other
words, a doubling of all inputs leads to more than a doubling of all
outputs. On the other hand, in decreasing returns to scale (DRS), a radial
increase in the input vector causes a smaller proportional radial increase
in the output vector. In other words, a doubling of all inputs leads to less
than a doubling of all outputs.
To examine the efficiency under CRS and VRS, let us take the example
of the one input-one output case (as in Table 1-1a) with two additional
bank branches, San Borja (E) and San Isidro (F). Table 1-4 provides the
input and output of all six branches.

Table 1–4 Input (#1) and Output (#1) of Bank Branches

Code  Branch        PT (Y1)  Staff (X1)
A     Santa Anita   130      20
B     Chacarilla    50       17
C     El Polo       85       18
D     Jockey Plaza  25       11
E     San Borja     28       9
F     San Isidro    70       10

In Figure 1-6, the output Y1 (number of personal transactions) is
measured on the Y-axis, and the input X1 (number of staff) is measured on
the X-axis. Input-output combinations of all six bank branches are shown
by branch codes A, B, C, D, E, and F, respectively. The straight line from
the origin passing through the point F represents the production frontier
under the CRS assumption. Along this line, a radial increase in the input
vector causes the same proportional radial increase in the output vector. The
shape of the frontier under VRS is formed by the bank branches E, F, and
A. Thus, a piecewise line connecting the points M*, E, F, and A, and the
extension represents the production frontier under VRS. Different
segments of such a production frontier show different returns to scale. On
the segment EF, a given percentage increase in input (X1) leads to a larger
percentage increase in output (Y1), and thus, it reflects the IRS of the
production frontier. On the other hand, for the segment FA, a given percentage
increase in input (X1) leads to a smaller percentage increase in output
(Y1), and thus, it reflects the DRS of the production frontier. The point F
reflects the CRS where a given percentage increase in input (X1) leads to exactly
the same percentage increase in output (Y1).
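
The returns-to-scale reading of the two frontier segments can be checked from the Table 1-4 data by comparing percentage changes in output and input between the segment endpoints. The functions `scale_elasticity` and `classify` below are illustrative helpers assumed for this sketch, not part of any DEA library.

```python
# (staff, personal transactions in thousands) for the VRS-efficient branches.
E, F, A = (9, 28), (10, 70), (20, 130)

def scale_elasticity(start, end):
    """Percentage change in output divided by percentage change in input."""
    pct_input = (end[0] - start[0]) / start[0]
    pct_output = (end[1] - start[1]) / start[1]
    return pct_output / pct_input

def classify(elasticity):
    """Label a segment by comparing its elasticity with one."""
    if elasticity > 1:
        return "IRS"
    if elasticity < 1:
        return "DRS"
    return "CRS"

ef = scale_elasticity(E, F)
fa = scale_elasticity(F, A)
print("EF:", round(ef, 2), classify(ef))
print("FA:", round(fa, 2), classify(fa))
```

Moving from E to F raises input by about 11% and output by 150%, while moving from F to A doubles input but raises output by less than 100%, which matches the IRS and DRS labels given above.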

Figure 1–6 Technical efficiency under variable returns to scale

Under CRS, the only branch that is efficient is San Isidro (F).
However, under VRS, two more branches, namely Santa Anita (A) and
San Borja (E) become efficient. Thus, three branches, namely, Chacarilla
(B), El Polo (C), and Jockey Plaza (D) are inefficient, irrespective of the
scale of technology. Let us consider Chacarilla (B), which is inefficient
under both CRS and VRS assumptions.

The output-oriented technical efficiency of Chacarilla (B) as shown in
Table 1-5 is measured by the amount by which output could be increased
without requiring extra inputs. Given the current level of input as ON,
branch B can improve the level of output by BB* under CRS and by BB′ under
VRS. The difference between these two, that is B′B*, is attributed to the
scale effect. Thus, the output-oriented technical efficiency is the ratio
NB/NB* under CRS and NB/NB′ under VRS. The scale efficiency is the
ratio NB′/NB*, obtained by taking the ratio of technical efficiency under
CRS to technical efficiency under VRS.
The input-oriented technical efficiency is measured by the amount by
which input consumption could be reduced at the current level of output.
Given the current level of output as NB, branch B can reduce the level of
input consumption (X1) by LN under CRS and by MN under VRS. The
difference between the two, that is LM, is attributed to the scale effect.
Thus, the input-oriented technical efficiency is the ratio OL/ON under
CRS and OM/ON under VRS. The scale efficiency is the ratio OL/OM,
obtained by taking the ratio of technical efficiency under CRS to technical
efficiency under VRS. Table 1-5 shows the calculation of the above three
types of efficiency of branch Chacarilla (B).

Table 1–5 Efficiency of Chacarilla (B)

Efficiency           Output-oriented              Input-oriented
TE_CRS               NB/NB* = 50/119 = 0.420      OL/ON = 7.143/17 = 0.420
TE_VRS               NB/NB′ = 50/112 = 0.446      OM/ON = 9.523/17 = 0.560
SE = TE_CRS/TE_VRS   NB′/NB* = 0.941              OL/OM = 0.750
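The arithmetic behind Table 1-5 can be reproduced in a few lines. The sketch below only recomputes the ratios from the distances quoted in the text and the table; the variable and function names are illustrative choices, not part of the chapter.

```python
# Distances read from the text and Table 1-5 for branch Chacarilla (B).
NB = 50.0         # observed output of B
NB_STAR = 119.0   # output on the CRS frontier at B's input
NB_PRIME = 112.0  # output on the VRS frontier at B's input
ON = 17.0         # observed input of B
OL = 7.143        # input on the CRS frontier at B's output
OM = 9.523        # input on the VRS frontier at B's output


def efficiencies(te_crs: float, te_vrs: float) -> dict:
    """Return TE under CRS and VRS plus scale efficiency SE = TE_CRS / TE_VRS."""
    return {"TE_CRS": te_crs, "TE_VRS": te_vrs, "SE": te_crs / te_vrs}


output_oriented = efficiencies(NB / NB_STAR, NB / NB_PRIME)
input_oriented = efficiencies(OL / ON, OM / ON)

for label, scores in (("output", output_oriented), ("input", input_oriented)):
    print(label, {name: round(value, 3) for name, value in scores.items()})
```

Because SE is a ratio of two ratios that share the same numerator, the output-oriented SE collapses to NB′/NB* and the input-oriented SE to OL/OM, as stated in the table.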

1.3 Basic Models in Data Envelopment Analysis


This section provides the basic DEA models of CCR (Charnes et al.,
1978) and BCC (Banker, Charnes, & Cooper, 1984), under both the output and
input orientations discussed so far with numerical examples and graphs.
Further, the extensions of the basic DEA models for super-efficiency and
benchmarking are presented.

1.3.1 Basic DEA models

Let us assume that there are n DMUs. Each DMUj (j = 1, 2, ..., n)
consumes a vector of inputs, $x_j = (x_{1j}, x_{2j}, \ldots, x_{mj})^{T}$, to
produce a vector of outputs, $y_j = (y_{1j}, y_{2j}, \ldots, y_{sj})^{T}$. The
superscript T represents transpose. The DMU to be evaluated is designated
as DMU0 and its input-output vector is denoted as $(x_0, y_0)$.
The output-oriented CCR model, in line with Charnes et al. (1978), can
be defined as follows, and involves a two-stage DEA process:

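\[
\begin{aligned}
\max\ & \varphi + \varepsilon \left( \sum_{r=1}^{s} s_r^{+} + \sum_{i=1}^{m} s_i^{-} \right) \\
\text{s.t.}\ & \varphi\, y_{r0} - \sum_{j=1}^{n} y_{rj} \lambda_j + s_r^{+} = 0, \quad r = 1, \ldots, s, \\
& \sum_{j=1}^{n} x_{ij} \lambda_j + s_i^{-} = x_{i0}, \quad i = 1, \ldots, m, \\
& \lambda_j \ge 0,\ s_r^{+} \ge 0,\ s_i^{-} \ge 0, \quad j = 1, \ldots, n,\ r = 1, \ldots, s,\ i = 1, \ldots, m,
\end{aligned}
\tag{1.1}
\]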

and φ is otherwise unconstrained. Here, $\lambda_j$ represents the structural
variables, $s_r^{+}$ and $s_i^{-}$ represent slacks, and $\varepsilon > 0$ is a
non-Archimedean infinitesimal, which is defined to be smaller than any
positive real number.

Definition 1.1: (DEA efficiency) DMU0 is DEA efficient if and only if
the following two conditions are satisfied:
(i) $\max \varphi = \varphi^{*} = 1$; (ii) $s_r^{+*} = s_i^{-*} = 0$ for all
$i, r$ (all slacks are zero), where * designates an optimum.
Definition 1.2: (Weak DEA efficiency) DMU0 is weakly DEA efficient if
and only if the following two conditions are satisfied:
(i) $\max \varphi = \varphi^{*} = 1$; (ii) $s_r^{+*} \neq 0$ and/or
$s_i^{-*} \neq 0$ (not all slacks are zero) for some $i$ and $r$ in some
alternative optima, where * designates an optimum.
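For the special case of one input and one output (m = s = 1), System (1.1) can be evaluated without an LP solver, because the largest output attainable from input x0 under CRS is x0 times the best observed output-to-input ratio. The minimal sketch below uses that shortcut. The four DMUs and the function name `ccr_output_phi` are hypothetical and chosen only for illustration, and the second-stage slack maximisation is not modelled.

```python
# Hypothetical (input, output) data for four DMUs; not taken from the chapter.
DMUS = {"U1": (2.0, 2.0), "U2": (4.0, 8.0), "U3": (8.0, 12.0), "U4": (6.0, 6.0)}


def ccr_output_phi(dmus: dict, name: str) -> float:
    """Optimal phi of System (1.1) when there is one input and one output."""
    x0, y0 = dmus[name]
    best_ratio = max(y / x for x, y in dmus.values())  # slope of the CRS frontier
    return best_ratio * x0 / y0


for name in DMUS:
    phi = ccr_output_phi(DMUS, name)
    status = "efficient" if abs(phi - 1.0) < 1e-9 else "inefficient"
    print(f"{name}: phi* = {phi:.3f}, TE = {1.0 / phi:.3f} ({status})")
```

With these illustrative numbers only U2 attains an optimal value of 1; for general multi-input, multi-output data the full linear program would be needed.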
Similarly, the input-oriented CCR model, in line with Charnes et al.
(1978), can be defined as follows, and involves a two-stage DEA process:

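\[
\begin{aligned}
\min\ & \phi - \varepsilon \left( \sum_{r=1}^{s} s_r^{+} + \sum_{i=1}^{m} s_i^{-} \right) \\
\text{s.t.}\ & \sum_{j=1}^{n} y_{rj} \lambda_j - s_r^{+} = y_{r0}, \quad r = 1, \ldots, s, \\
& \phi\, x_{i0} - \sum_{j=1}^{n} x_{ij} \lambda_j - s_i^{-} = 0, \quad i = 1, \ldots, m, \\
& \lambda_j \ge 0,\ s_r^{+} \ge 0,\ s_i^{-} \ge 0, \quad j = 1, \ldots, n,\ r = 1, \ldots, s,\ i = 1, \ldots, m,
\end{aligned}
\tag{1.2}
\]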

and ϕ is otherwise unconstrained. Here, $\lambda_j$ represents the structural
variables, $s_r^{+}$ and $s_i^{-}$ represent slacks, and $\varepsilon > 0$ is a
non-Archimedean infinitesimal that is defined to be smaller than any
positive real number.

Definition 1.3: (DEA efficiency) DMU0 is DEA efficient if and only if
the following two conditions are satisfied:
(i) $\min \phi = \phi^{*} = 1$; (ii) $s_r^{+*} = s_i^{-*} = 0$ for all
$i, r$ (all slacks are zero), where * designates an optimum.
Definition 1.4: (Weak DEA efficiency) DMU0 is weakly DEA efficient if
and only if the following two conditions are satisfied:
(i) $\min \phi = \phi^{*} = 1$; (ii) $s_r^{+*} \neq 0$ and/or
$s_i^{-*} \neq 0$ (not all slacks are zero) for some $i$ and $r$ in some
alternative optima, where * designates an optimum.
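With a single input and a single output, System (1.2) reduces to comparing the output-to-input ratio of DMU0 with the best observed ratio. As a worked check for Chacarilla (B), taking the CRS frontier slope as NB*/ON = 119/17 = 7 from Table 1-5 (a sketch that holds for the one-input, one-output case only, with the slack stage ignored):

```latex
\[
\phi^{*} \;=\; \frac{y_{0}/x_{0}}{\max_{j}\,(y_{j}/x_{j})}
\;=\; \frac{50/17}{119/17} \;=\; \frac{50}{119} \;\approx\; 0.420,
\qquad
\varphi^{*} \;=\; \frac{1}{\phi^{*}} \;=\; \frac{119}{50} \;=\; 2.38 .
\]
```

The value 0.420 matches both OL/ON and NB/NB* in Table 1-5, which illustrates that under CRS the input-oriented and output-oriented optimal values are reciprocals of each other.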
System (1.1) and System (1.2) assume that the best-practice frontier
exhibits CRS, that is, that a best-practice DMU is both technically and
scale efficient. If scale inefficiency is allowed in best-practice DMUs,
one can assume VRS and incorporate the convexity constraint
$\sum_{j=1}^{n} \lambda_j = 1$ into System (1.1) and System (1.2) to obtain
the efficiency under output-orientation and input-orientation, respectively.
The DEA efficiency models under VRS are popularly known as BCC models
(Banker et al., 1984). (For a complete discussion on standard DEA models,
please refer to Cooper, Seiford, & Tone, 2000.) Once the VRS scores are
obtained and scale efficiency scores are computed, the analysis can be
taken a step further. This involves determining whether a particular DMU
is experiencing IRS, DRS, or operating at the most productive scale size.
To make this assessment, DEA is repeated with nonincreasing returns to
scale (NIRS) by incorporating the restriction
$\sum_{j=1}^{n} \lambda_j \le 1$ in System (1.1) (or in System (1.2) under
input-orientation).
It should be noted that, by definition, NIRS implies CRS or DRS. When
the VRS score equals the CRS score, the DMU is said to be operating at the
most productive scale size. Otherwise, if the score for a particular DMU
under VRS equals the NIRS score, then that DMU must be operating under
DRS; if the score under VRS is not equal to the NIRS score, the DMU is
operating under IRS (Coelli, Rao, & Battese, 1998).
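The CRS, VRS, and NIRS comparison described above can be sketched for the one-input, one-output case. With a single input, an optimal solution needs at most two active DMUs, so the frontier output at a given input can be found by enumerating single DMUs and pairs rather than calling an LP solver. The data and the names `max_output`, `scores`, and `returns_to_scale` are hypothetical, and slacks are ignored.

```python
from itertools import combinations

# Hypothetical (input, output) data for four DMUs; not taken from the chapter.
DMUS = {"U1": (2.0, 2.0), "U2": (4.0, 8.0), "U3": (8.0, 12.0), "U4": (6.0, 6.0)}


def max_output(points, x0, rts):
    """Largest output attainable from input x0 under 'crs', 'vrs' or 'nirs'."""
    if rts == "crs":  # lambda_j >= 0 only: scale the best ratio to x0
        return x0 * max(y / x for x, y in points)
    # lambda_j = 1 on a single DMU that uses no more input than x0
    candidates = [y for x, y in points if x <= x0]
    # convex combination of two DMUs whose inputs bracket x0
    for (xa, ya), (xb, yb) in combinations(points, 2):
        if xa != xb and min(xa, xb) <= x0 <= max(xa, xb):
            w = (xb - x0) / (xb - xa)  # weight on the first DMU of the pair
            candidates.append(w * ya + (1.0 - w) * yb)
    if rts == "nirs":  # sum of lambdas may fall below 1: shrink a larger DMU
        candidates += [y * x0 / x for x, y in points if x >= x0]
    return max(candidates)


def scores(dmus, name):
    """Output-oriented technical efficiency (1/phi*) under each RTS assumption."""
    x0, y0 = dmus[name]
    points = list(dmus.values())
    return {rts: y0 / max_output(points, x0, rts) for rts in ("crs", "vrs", "nirs")}


def returns_to_scale(te, tol=1e-9):
    """Classify a DMU with the CRS/VRS/NIRS comparison described in the text."""
    if abs(te["crs"] - te["vrs"]) < tol:
        return "MPSS"
    return "DRS" if abs(te["vrs"] - te["nirs"]) < tol else "IRS"


for name in DMUS:
    te = scores(DMUS, name)
    print(name, {k: round(v, 3) for k, v in te.items()}, returns_to_scale(te))
```

In this illustrative data set the smallest DMU is classified as IRS, the DMU with the best output-to-input ratio as operating at the most productive scale size, and the two larger DMUs as DRS.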

1.3.2 Super efficiency DEA model

The basic DEA models of CCR and BCC are further extended for
ranking and benchmarking the efficient DMUs. To simplify the discussion,
we restrict the presentation of super-efficiency and benchmarking DEA
models to output-orientation.
System (1.1) divides the DMUs into inefficient and efficient ones.
However, as all efficient DMUs receive the same efficiency score of 1, it
is not possible to distinguish among the efficient DMUs. To overcome this
problem, Andersen and Petersen (1993) proposed the super-efficiency
ranking method for efficient DMUs only. The super-efficiency approach
measures the extent to which the inputs of an efficient DMU can be
increased (or its outputs decreased) without the DMU becoming inefficient.
The super-efficiency model is identical to the DEA models described above,
except that the DMU under evaluation is excluded from the reference set.
This allows a DMU to be located above the efficient frontier, that is, to be
super-efficient among the efficient DMUs. Therefore, the super-efficiency
score for an efficient DMU can in principle take any value greater than or
equal to 1.
The output-oriented CCR super-efficiency model, in line with
Andersen and Petersen (1993), can be defined as follows and involves a
two-stage DEA process:

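\[
\begin{aligned}
\max\ & \varphi + \varepsilon \left( \sum_{r=1}^{s} s_r^{+} + \sum_{i=1}^{m} s_i^{-} \right) \\
\text{s.t.}\ & \varphi\, y_{r0} - \sum_{j=1,\, j \neq 0}^{n} y_{rj} \lambda_j + s_r^{+} = 0, \quad r = 1, \ldots, s, \\
& \sum_{j=1,\, j \neq 0}^{n} x_{ij} \lambda_j + s_i^{-} = x_{i0}, \quad i = 1, \ldots, m, \\
& \lambda_j \ge 0,\ s_r^{+} \ge 0,\ s_i^{-} \ge 0, \quad j = 1, \ldots, n,\ j \neq 0,\ r = 1, \ldots, s,\ i = 1, \ldots, m.
\end{aligned}
\tag{1.3}
\]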

This procedure makes a ranking of efficient DMUs possible (i.e., the
higher the value, the higher the rank). It can be seen that the scores for
inefficient DMUs remain the same as those obtained from System (1.1).
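The effect of removing the evaluated DMU from the reference set can be illustrated for one input and one output under CRS, where the optimal φ is again the best reference ratio scaled to the DMU's own input and output. The data and the function name `ccr_phi` below are hypothetical, and slacks are ignored in this sketch.

```python
# Hypothetical (input, output) data for four DMUs; not taken from the chapter.
DMUS = {"U1": (2.0, 2.0), "U2": (4.0, 8.0), "U3": (8.0, 12.0), "U4": (6.0, 6.0)}


def ccr_phi(dmus: dict, name: str, exclude_self: bool) -> float:
    """Optimal phi for one input and one output; System (1.3) if exclude_self."""
    x0, y0 = dmus[name]
    reference = [xy for key, xy in dmus.items() if not (exclude_self and key == name)]
    best_ratio = max(y / x for x, y in reference)  # best ratio in the reference set
    return best_ratio * x0 / y0


for name in DMUS:
    standard = 1.0 / ccr_phi(DMUS, name, exclude_self=False)
    super_eff = 1.0 / ccr_phi(DMUS, name, exclude_self=True)
    print(f"{name}: standard = {standard:.3f}, super-efficiency = {super_eff:.3f}")
```

Only the efficient unit changes: its reciprocal score rises above 1 because it is now compared with the next-best ratio, while the inefficient units keep their standard scores.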

1.3.3 Benchmarking DEA model

Let E* represent the benchmarks or the best-practice DMUs identified
by System (1.1). In line with Zhu (2002), Cook, Seiford, and Zhu (2004),
and Xue and Harker (2002), the output-oriented variable benchmarking
super-efficiency model under CRS can be represented in System (1.4):

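\[
\begin{aligned}
\max\ & \delta_0 + \varepsilon \left( \sum_{r=1}^{s} s_{0r}^{+} + \sum_{i=1}^{m} s_{0i}^{-} \right) \\
\text{s.t.}\ & \delta_0\, y_{r0} - \sum_{j \in E^{*},\, j \neq 0} y_{rj} \lambda_j + s_{0r}^{+} = 0, \quad r = 1, \ldots, s, \\
& \sum_{j \in E^{*},\, j \neq 0} x_{ij} \lambda_j + s_{0i}^{-} = x_{i0}, \quad i = 1, \ldots, m, \\
& \delta_0 \ge 0,\ \lambda_j \ge 0,\ s_{0r}^{+} \ge 0,\ s_{0i}^{-} \ge 0, \quad j \in E^{*},\ r = 1, \ldots, s,\ i = 1, \ldots, m.
\end{aligned}
\tag{1.4}
\]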

One can obtain the output-oriented variable benchmarking super-efficiency
model under VRS by incorporating $\sum_{j \in E^{*},\, j \neq 0} \lambda_j = 1$
into System (1.4). Similarly, one can obtain input-oriented super-efficiency
and benchmarking DEA models.
The reciprocal of the optimal value φ* of System (1.3), 1/φ*, provides
the super-efficiency score. The standard output-oriented VRS
super-efficiency model can be obtained by adding the restriction on the sum
of the intensity variables, that is, $\sum_{j=1,\, j \neq 0}^{n} \lambda_j = 1$,
in System (1.3). However, the super-efficiency model under the assumption
of VRS may be infeasible (Lee, Chu, & Zhu, 2011), unlike the CRS
super-efficiency model in System (1.3). Seiford and Zhu (1999) provided the
necessary and sufficient conditions for infeasibility of super-efficiency
models and further showed that infeasibility must occur in the case of a
super-efficiency model under VRS. They found that infeasibility in
output-oriented super-efficiency occurs when the inputs of the evaluated
DMU are outside the production possibility set spanned by the inputs of the
remaining DMUs. This approach is discussed further in Section 7.2 of
Chapter 7.
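The infeasibility condition can be seen directly in the one-input case. Once the evaluated DMU is removed and the weights must sum to 1, the remaining DMUs can meet the input constraint only if at least one of them uses no more input than the evaluated DMU. The sketch below checks just this condition; the data and the function name `vrs_super_feasible` are hypothetical.

```python
# Hypothetical (input, output) data for four DMUs; not taken from the chapter.
DMUS = {"U1": (2.0, 2.0), "U2": (4.0, 8.0), "U3": (8.0, 12.0), "U4": (6.0, 6.0)}


def vrs_super_feasible(dmus: dict, name: str) -> bool:
    """True if the output-oriented VRS super-efficiency model has a solution.

    With one input and weights summing to 1, the remaining DMUs can satisfy
    the input constraint only if at least one of them uses no more input
    than the DMU under evaluation.
    """
    x0, _ = dmus[name]
    return any(x <= x0 for key, (x, _) in dmus.items() if key != name)


for name in DMUS:
    print(name, "feasible" if vrs_super_feasible(DMUS, name) else "infeasible")
```

Here the DMU with the smallest input is the one for which the model is infeasible, since every convex combination of the other units requires more input than it uses.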

Notes
1 Fethi and Pasiouras (2010) provided an extensive survey of efficiency and
productivity studies in the banking sector published in various research
journals covering the period between 1998 and 2008.
2 Zhou, Ang, and Poh (2008) presented a literature survey on the application
of DEA to energy and environmental studies.
3 A detailed bibliographic update on DEA can be found in Seiford (1997) for
the period 1978-1996, Emrouznejad, Parker, and Tavares (2008) for the period
1978-2006, and Liu, Lu, Lu, and Lin (2012) for the period 1978-2010.
4 A detailed survey on developments of DEA models can be found in Seiford and
Thrall (1990), Seiford (1996), Cooper, Seiford, Tone, and Zhu (2007), and
Cook and Seiford (2009).
5 The first DEA model, developed by Charnes, Cooper, and Rhodes in 1978,
is popularly known as the CCR model.

References
Aigner, D., Lovell, C.A.K., & Schmidt, P. (1977). Formulation and
estimation of stochastic production function models. Journal of
Econometrics, 6, 21-37.
Andersen, P., & Petersen, N.C. (1993). A procedure for ranking efficient
units in data envelopment analysis. Management Science, 39(10),
1261-1265.
Banker, R.D., Charnes, A., & Cooper, W.W. (1984). Some models for
estimating technical and scale inefficiencies in data envelopment
analysis. Management Science, 30(9), 1078-1092.
Charnes, A., & Cooper, W.W. (1985). Preface to topics in data
envelopment analysis. Annals of Operations Research, 2, 59-94.
Charnes, A., Cooper, W.W., & Rhodes, E. (1978). Measuring the
efficiency of decision making units. European Journal of Operational
Research, 2(6), 429-444.
Coelli, T., Rao, D.S.P., & Battese, G.E. (1998). An Introduction to
Efficiency Analysis. London, UK: Kluwer Academic.
Cook, W.D., & Seiford, L.M. (2009). Data envelopment analysis (DEA)–
thirty years on. European Journal of Operational Research, 192(1), 1-
17.
Cook, W.D., Seiford, L.M., & Zhu, J. (2004). Models for performance
benchmarking: Measuring the effect of e-business activities on banking
performance. Omega, 32(4), 313-322.
Cooper, W.W., Seiford, L.M., & Tone, K. (2000). Data Envelopment
Analysis: A Comprehensive Reference Text with Models, Applications,
References, and DEA-solver Software. Boston, MA: Kluwer
Academic.
Cooper, W.W., Seiford, L.M., Tone, K., & Zhu, J. (2007). Some models
and measures for evaluating performances with DEA: Past
accomplishments and future prospects. Journal of Productivity
Analysis, 28(3), 151-163.
Debreu, G. (1951). The coefficient of resource utilisation. Econometrica,
19(3), 273-292.
Emrouznejad, A., Parker, B.R., & Tavares, G. (2008). Evaluation of
research in efficiency and productivity: a survey and analysis of the
first 30 years of scholarly literature in DEA. Socio-Economic Planning
Sciences, 42(3), 151-157.
Farrell, M.J. (1957). The measurement of productive efficiency. Journal of
the Royal Statistical Society, Series A (General), 120(3), 253-290.
Fethi, D.M., & Pasiouras, F. (2010). Assessing bank efficiency and
performance with operational research and artificial intelligence
techniques: a survey. European Journal of Operational Research,
204(2), 189-198.
Førsund, F.R., & Sarafoglou, N. (2002). On the origins of data
envelopment analysis. Journal of Productivity Analysis, 17(1-2), 23-40.
Grosskopf, S. (1986). The role of the reference technology in measuring
productive efficiency. Economic Journal, 96(382), 499-513.
Koopmans, T.C. (1951). Analysis of production as an efficient
combination of activities. In: T. C. Koopmans (Ed.), Activity Analysis
of Production and Allocation, pp. 33-97. New York, NY: John Wiley
& Sons.
Lee, H-S., Chu, C-W., & Zhu, J. (2011). Super-efficiency DEA in the
presence of infeasibility. European Journal of Operational Research,
212(1), 141-147.
Liu, J.S., Lu, L.Y.Y., Lu, W-M., & Lin, B.J.Y. (2012). Data envelopment
analysis 1978-2010: A citation-based literature survey. Omega, 41(1),
3-15.
Nunamaker, T.R. (1985). Using data envelopment analysis to measure the
efficiency of non-profit organizations: a critical evaluation.
Managerial and Decision Economics, 6(1), 50-58.
Raab, R., & Lichty, R. (2002). Identifying sub-areas that comprise a
greater metropolitan area: The criterion of country relative efficiency.
Journal of Regional Science, 42(3), 579-594.
Seiford, L.M. (1994). A DEA bibliography (1978-1992). In: A. Charnes,
W.W. Cooper, A. Y. Lewin, & L.M. Seiford (Eds.), Data Envelopment
Analysis: Theory, Methodology and Application, pp. 437-470. Boston,
MA: Kluwer Academic.
Seiford, L.M. (1997). A bibliography for data envelopment analysis
(1978-1996). Annals of Operations Research, 73, 393-438.
Seiford, L.M., & Thrall, R.M. (1990). Recent developments in DEA–the
mathematical programming approach to frontier analysis. Journal of
Econometrics, 46(1-2), 7-38.
Seiford, L.M., & Zhu, J. (1999). Infeasibility of super efficiency data
envelopment analysis models. INFOR, 37(2), 174-187.
Xue, M., & Harker, P.T. (2002). Note: Ranking DMUs with infeasible
super-efficiency DEA models. Management Science, 48(5), 705-710.
Zhou, P., Ang, B.W., & Poh, K.L. (2008). A survey of data envelopment
analysis in energy and environmental studies. European Journal of
Operational Research, 189(1), 1-18.
Zhu, J. (2002). Quantitative Models for Performance Evaluation and
Benchmarking: Data Envelopment Analysis with Spreadsheets and
DEA Excel Solver. Boston, MA: Kluwer Academic.

Authors' Note
Vincent Charles and Mukesh Kumar, CENTRUM Católica, Graduate
School of Business, Pontificia Universidad Católica del Perú, Jr. Daniel
Alomía Robles 125-129, Los Álamos de Monterrico, Santiago de Surco,
Lima 33.
Correspondence concerning this work should be addressed to Vincent
Charles, Email: vcharles@pucp.pe
The authors would like to thank Varinia Gonzales Zúniga and Juan
Carlos Paliza at the CENTRUM Investigación for their able assistance.
The authors are also grateful for the comments and suggestions made by
the reviewers.
