
Data Envelopment Analysis

(A systematic tutorial using MS Excel)

1. Introduction
How would you compare the stores in a retail chain and identify the best, average and worst performers? And even once you know
which stores are average or poor performers, how would you identify the areas that need improvement, and by how much?

Gauging performance with a simple revenue-cost comparison is easy, but that comparison does not hold for stores that differ in
aspects such as store size, local population density or affluence of the surrounding area. We would like a model that can
incorporate such complexities.

2. Table of Contents
I. What is DEA?
 • Input/output oriented approach
 • CRS and VRS
II. Visualize the “envelope” – Example
III. Multiple input & output measures – Step by step tutorial
 • Dataset
 • Objective and Method
 • Formulation
 • Assumptions
 • Steps to perform DEA in Excel using the Solver add-in
IV. Conclusion

What is DEA?

Data Envelopment Analysis (DEA) is a nonparametric method in operations research and economics for the estimation of
production frontiers. It is used to empirically measure productive efficiency of decision-making units (or DMUs) based on input
and output metrics.

DEA can be performed using 2 methodologies: input-oriented and output-oriented models.

 • An input-oriented model looks at the amount by which inputs can be proportionally reduced, with outputs fixed
 • An output-oriented model looks at the amount by which outputs can be proportionally expanded, with inputs fixed

DEA can be conducted under either of two returns-to-scale assumptions. Constant returns to scale (CRS) holds when increasing the
inputs leads to a proportional increase in the outputs. Variable returns to scale (VRS) holds when an increase in inputs does not
result in a proportional change in the outputs.

Example – 1 (1 input and 1 output model):

Let’s try to visualize (2-d plot) the envelope for comparing different store units in a retail environment using 1 input and 1 output
measure.

Suppose we want to compare one input and one output across multiple outlets of a retail chain, for example Café Coffee Day
(CCD). CCD wants to compare the performance of its outlets by comparing expense (input) to revenue (output), i.e. a 1-1 model.
The corresponding 2-d graph will look like the one in Figure 1.

Figure 1: 1 input - 1 output model

S1 to S8 are 8 CCD stores, which are compared against each other by 2 methods: VRS and CRS.

In VRS, a curve is drawn connecting the outermost points S1, S2, S3 & S4, creating an “envelope” for S5, S6, S7 & S8. All stores
falling under the curve are inefficient compared to their fellow stores (S1, S2, S3 & S4).

In CRS, a line is drawn from the origin to the outermost point (S2 in this case) such that this line doesn’t intersect the VRS
curve (a tangent to the curve). As per the CRS line, only store S2 is efficient, while all other stores are considered inefficient.

The CRS approach aims at a simple proportion: store S2, with an expense of 150 INR and revenue of 250 INR, is the most efficient
CCD store of all, so every other store should achieve the same 3:5 expense-to-revenue ratio in order to become efficient. VRS, on
the other hand, looks for the envelope in space, i.e. stores S1 to S4.
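
The CRS proportion can be sketched in a few lines of Python. This is only an illustration: the figures for S2 come from the text above, while the other two stores and the function name crs_scores are hypothetical, made up to show the arithmetic. In the 1 input - 1 output case, a store's CRS score is its revenue/expense ratio divided by the best ratio in the group.

```python
# Expense (input) and revenue (output) in INR. Only S2's figures come from the
# text; the other two stores are hypothetical values made up for illustration.
stores = {
    "S2": (150, 250),
    "hypothetical A": (200, 250),
    "hypothetical B": (100, 120),
}


def crs_scores(data):
    """Single-input, single-output CRS efficiency: each store's
    revenue/expense ratio divided by the best ratio in the group."""
    ratios = {name: revenue / expense for name, (expense, revenue) in data.items()}
    best = max(ratios.values())
    return {name: ratio / best for name, ratio in ratios.items()}


scores = crs_scores(stores)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Here S2 scores 1.00 because it has the best ratio, and the hypothetical stores score 0.75 and 0.72, i.e. they would have to cut expense to 75% and 72% of the current level to match S2's 3:5 ratio at the same revenue.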

When there is more than one input or output metric, comparison gets complicated and can no longer be depicted as in Figure 1. Of
the 2 methods (CRS and VRS), VRS is often used, since it does not require outputs to scale in proportion to inputs, an assumption
that rarely holds in a real-world environment.

Example – 2 (Multiple input and output measures)

Now we will take an example that compares units on multiple input and output measures, and walk through it step by step, with
figures & formulas for better understanding.

Dataset: Below is hypothetical data for a set of retail stores. The input and output measures identified for the analysis are:

 • Input measures – Store size (100 sq mt), Cost (INR K), No. of resources
 • Output measures – Revenue (INR K), Customer Satisfaction Score (1-10)

Table 1. Store's metrics

          INPUT                               OUTPUT
Stores    Store Size    Cost      No. of      Revenue   Customer
          (100 sq mt)   (INR K)   Resources   (INR K)   Satisfaction Score
Store 1   5             35.00     5           145       9
Store 2   7             80.00     6           114       5
Store 3   6             74.00     4           106       8
Store 4   9             55.00     6           125       10
Store 5   8             55.00     12          130       8
Store 6   7             93.00     7           200       4
Store 7   4             86.00     6           200       10
Store 8   5             80.00     5           101       6

Objective – To compare the stores based on the metrics identified as input and output for the measure of efficiency.

Method – Input oriented variable returns to scale (VRS) model where inputs can be proportionally reduced for inefficient units,
while keeping the outputs fixed.

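Formulation/Model Construction –

Objective -

 • Minimize $\theta_k$ for the $k^{\text{th}}$ store, where $k \in \{1, \dots, 8\}$

Subject to -

 • Input Constraints
$$\sum_{i=1}^{8} \left( \alpha_i x_m \cdot \lambda_i \right) \le \theta_k \cdot \alpha_k x_m \quad \text{where } m \in \{1, 2, 3\}$$

 • Output Constraints
$$\sum_{i=1}^{8} \left( \alpha_i y_n \cdot \lambda_i \right) \ge \alpha_k y_n \quad \text{where } n \in \{1, 2\}$$

 • Weights Constraints
$$\sum_{i=1}^{8} \lambda_i = 1 \quad \text{(constraint applicable for VRS calculation)}$$

$$0 \le \lambda_i \le 1 \text{ where } i \in \{1, \dots, 8\} \quad \text{and} \quad 0 \le \theta_k \le 1 \text{ where } k \in \{1, \dots, 8\}$$

Where α = store, x = input, y = output, λ = weight; i.e. $\alpha_i x_m$ is input m of store i and $\alpha_i y_n$ is output n of store i.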
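
For readers who want to cross-check the Excel results, the same linear program can be sketched in Python. This is a minimal sketch, not the method used in this tutorial: it assumes SciPy is installed and uses scipy.optimize.linprog with the HiGHS solver instead of Excel Solver's Simplex LP. The function name vrs_input_efficiency is our own; the data are taken from Table 1.

```python
from scipy.optimize import linprog

# Table 1: inputs = (store size, cost, no. of resources);
# outputs = (revenue, customer satisfaction score).
INPUTS = [(5, 35, 5), (7, 80, 6), (6, 74, 4), (9, 55, 6),
          (8, 55, 12), (7, 93, 7), (4, 86, 6), (5, 80, 5)]
OUTPUTS = [(145, 9), (114, 5), (106, 8), (125, 10),
           (130, 8), (200, 4), (200, 10), (101, 6)]


def vrs_input_efficiency(k):
    """Solve the input-oriented VRS model for store index k (0-based).

    Decision variables are [lambda_1, ..., lambda_8, theta_k].
    Returns (theta, lambdas).
    """
    n = len(INPUTS)
    c = [0.0] * n + [1.0]  # objective: minimise theta

    a_ub, b_ub = [], []
    # Input constraints: sum_i lambda_i * x_im - theta * x_km <= 0
    for m in range(len(INPUTS[0])):
        a_ub.append([INPUTS[i][m] for i in range(n)] + [-INPUTS[k][m]])
        b_ub.append(0.0)
    # Output constraints, negated into <= form: -sum_i lambda_i * y_in <= -y_kn
    for j in range(len(OUTPUTS[0])):
        a_ub.append([-OUTPUTS[i][j] for i in range(n)] + [0.0])
        b_ub.append(-OUTPUTS[k][j])

    a_eq = [[1.0] * n + [0.0]]  # VRS constraint: weights sum to 1
    b_eq = [1.0]
    bounds = [(0.0, 1.0)] * (n + 1)  # 0 <= lambda_i <= 1 and 0 <= theta <= 1

    res = linprog(c, A_ub=a_ub, b_ub=b_ub, A_eq=a_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[-1], list(res.x[:n])


scores = [vrs_input_efficiency(k)[0] for k in range(len(INPUTS))]
for store, theta in enumerate(scores, start=1):
    print(f"Store {store}: theta = {theta:.3f}")
```

The efficiency scores should agree with the ones Excel Solver produces. The weights (λ) may differ for a store that has more than one optimal solution, since a different solver can stop at a different optimal vertex.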

Assumptions –
1. All Stores considered for analysis are measured on common metrics
2. Stores belong to the same geography
3. Data is captured along the same time duration for all stores
4. The input metrics considered have scope of reduction
5. External factors such as strikes, curfews or natural calamities do not impact the metrics

Steps to perform DEA in MS Excel (MS Excel is chosen because it makes the formulation of the model easier to understand)

Step – 1: Identify the input and output measures in the model
We need to choose the measures such that inputs are quantities we want to decrease and outputs are quantities we want to
increase; hence we must understand the business objective before finalizing the measures. E.g., a business would not be
interested in reducing revenue; however, it would like to reduce overall cost.
In a scenario where the output is something that should ideally be reduced, such as customer complaints, we need to
transform it into its reciprocal, i.e. (1/Customer Complaints).
Refer to Table 1 for the input and output measures.
Step – 2: Setting up Excel for formulation
Set up Excel for formulating the problem in the format below.

Table 2. Setting up excel

There are 8 stores in total for which DEA is conducted, hence there will be 8 iterations. For Store 1 (1st iteration) we will
set up Excel Solver by following the step-by-step guide provided below.
The above sheet (Table 2) has 11 cells that contain formulas (refer to Table 3).

Table 3. Formulas in Cells

I11 – Sum of weights (λ) – VRS weights constraint
C15:E15 – Sum Product of weights with the relevant input measure (e.g. sum product of store size with weights in
cell C15) - LHS of input constraints
C17:E17 – Product of the store-in-consideration input measure with the efficiency (θ) (e.g. product of store size with
the efficiency in cell C17) - RHS of input constraints for Store 1
F15:G15 – Sum Product of weights with the relevant output measure (e.g. sum product of Revenue with weights in
cell F15) - LHS of output constraints
F17:G17 – These are simply references to the respective outputs of Store 1 (e.g. the Revenue of Store 1 in
cell F17) - RHS of output constraints for Store 1
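
To make the role of the 11 formula cells concrete, here is a small Python sketch that computes the same quantities for a given set of weights (λ) and efficiency (θ). The helper names sumproduct and constraint_cells are our own, and the mapping to cell addresses in the comments follows the list above; the data are from Table 1.

```python
# Table 1 data, one row per store, in worksheet order.
INPUTS = [(5, 35, 5), (7, 80, 6), (6, 74, 4), (9, 55, 6),
          (8, 55, 12), (7, 93, 7), (4, 86, 6), (5, 80, 5)]
OUTPUTS = [(145, 9), (114, 5), (106, 8), (125, 10),
           (130, 8), (200, 4), (200, 10), (101, 6)]


def sumproduct(weights, column):
    """Equivalent of Excel's SUMPRODUCT for two equal-length ranges."""
    return sum(w * v for w, v in zip(weights, column))


def constraint_cells(k, lambdas, theta):
    """Values the formula cells would hold for store index k (0-based)."""
    input_cols = list(zip(*INPUTS))    # one tuple per input measure
    output_cols = list(zip(*OUTPUTS))  # one tuple per output measure
    return {
        "sum_of_weights": sum(lambdas),                                   # I11
        "input_lhs": [sumproduct(lambdas, col) for col in input_cols],    # C15:E15
        "input_rhs": [theta * x for x in INPUTS[k]],                      # C17:E17
        "output_lhs": [sumproduct(lambdas, col) for col in output_cols],  # F15:G15
        "output_rhs": list(OUTPUTS[k]),                                   # F17:G17
    }


# Trial values for Store 1: all weight on Store 1 itself and theta = 1.
cells = constraint_cells(0, [1, 0, 0, 0, 0, 0, 0, 0], 1.0)
for name, value in cells.items():
    print(name, value)
```

With these trial values every LHS equals its RHS, so all constraints hold with equality; Solver's job is to find the weights and the smallest θ for which the constraints still hold.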
Step – 3a: Solver inputs
1. Choose Solver from the Data ribbon (Figure 2)

Figure 2. Select Solver

2. Set Objective: minimize the Efficiency score θ (cell: $I$16)
3. By Changing Variable Cells: Weights (λ) & Efficiency score θ (cells: $I$3:$I$10, $I$16)
4. Subject to the Constraints:
a. Input Constraints: $C$15:$E$15 <= $C$17:$E$17
b. Output Constraints: $F$15:$G$15 >= $F$17:$G$17
c. VRS Constraint: $I$11 = 1 (sum of all weights)
5. Make Unconstrained Variables Non-Negative: checked (yes)
6. Select a Solving Method: Simplex LP
7. Click Solve
8. Click OK

Figure 3. Solver arguments


Step – 3b: Solver output (λ, θ)
We will record the output of the solver, which is the Efficiency Score (θ) and Weights (λ) for Store 1.

Table 4. Iteration-1 Output

VRS       λ (Store 1)   θ (Efficiency Score)
Store 1   1             1.00
Store 2   0
Store 3   0
Store 4   0
Store 5   0
Store 6   0
Store 7   0
Store 8   0

The value of λ is distributed amongst all the stores with which the store in question is comparable. The stores identified as
comparable are also known as its “Peers”, which means that the store in question can only be compared to its peers.
In this case, the value of λ is 1 against Store 1 and the rest are zeros, which means Store 1 is only comparable to itself and
hence achieves the maximum efficiency score (θ) of 1.

Step – 4: Iterations
Repeat the above process from Step-1 to Step-3a for the other stores (Store 2 to Store 8) and record the output as
mentioned in Step-3b. The result will be as per Table 5.

Table 5. VRS output of all stores

λ (columns: store being evaluated; rows: its peers)

VRS       Store 1   Store 2   Store 3   Store 4   Store 5   Store 6   Store 7   Store 8
Store 1   1.000     0.359                         1.000
Store 2
Store 3             0.513     1.000                                             0.500
Store 4                                 1.000
Store 5
Store 6
Store 7             0.128                                   1.000     1.000     0.500
Store 8

θ

VRS       Efficiency Score
Store 1   1.000
Store 2   0.769
Store 3   1.000
Store 4   1.000
Store 5   0.636
Store 6   0.925
Store 7   1.000
Store 8   1.000

Let’s understand Table 5. We will interpret the output of weights (λ) vertically for each store.

Store 1: Comparable to itself only with efficiency score of 1 (Efficient)


Store 2: Comparable to its peers Store 1, Store 3 & Store 7 with efficiency score of 0.769 (Inefficient)
Store 3: Comparable to itself only with efficiency score of 1 (Efficient)
Store 4: Comparable to itself only with efficiency score of 1 (Efficient)
Store 5: Comparable to its peer Store 1 only with efficiency score of 0.636 (Inefficient)
Store 6: Comparable to its peer Store 7 only with efficiency score of 0.925 (Inefficient)
Store 7: Comparable to itself only with efficiency score of 1 (Efficient)
Store 8: Comparable to its peers Store 3 & Store 7 with efficiency score of 1 (Efficient)
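
The vertical reading of Table 5 can also be expressed as a short Python sketch. The dictionaries below transcribe the non-zero weights and efficiency scores from Table 5; the function name describe and the tolerance are our own choices for illustration.

```python
# Table 5, read column by column: for each evaluated store, its non-zero weights.
WEIGHTS = {
    "Store 1": {"Store 1": 1.000},
    "Store 2": {"Store 1": 0.359, "Store 3": 0.513, "Store 7": 0.128},
    "Store 3": {"Store 3": 1.000},
    "Store 4": {"Store 4": 1.000},
    "Store 5": {"Store 1": 1.000},
    "Store 6": {"Store 7": 1.000},
    "Store 7": {"Store 7": 1.000},
    "Store 8": {"Store 3": 0.500, "Store 7": 0.500},
}

# VRS efficiency scores (theta) from Table 5.
THETA = {
    "Store 1": 1.000, "Store 2": 0.769, "Store 3": 1.000, "Store 4": 1.000,
    "Store 5": 0.636, "Store 6": 0.925, "Store 7": 1.000, "Store 8": 1.000,
}


def describe(store, tol=1e-6):
    """Return (peers other than the store itself, efficiency status)."""
    peers = [p for p, w in WEIGHTS[store].items() if w > tol and p != store]
    status = "Efficient" if THETA[store] >= 1 - tol else "Inefficient"
    return peers, status


for store in WEIGHTS:
    peers, status = describe(store)
    compared_to = ", ".join(peers) if peers else "itself only"
    print(f"{store}: comparable to {compared_to}; theta = {THETA[store]:.3f} ({status})")
```

A store with no peers other than itself lies on the envelope, while the weights of an inefficient store always point at stores that are themselves efficient.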

Step – 5: Weights (λ) to achieve efficiency

Once we find the VRS efficiency scores of all stores, the next obvious question is “how would stores with efficiency
score (θ) < 1 become efficient?” Here, the peers of inefficient stores and their corresponding weights (λ) will help to
achieve the same.
Since this is an input-oriented VRS model of DEA, where outputs are fixed while we try to minimize the inputs, we
will calculate the optimal inputs required for inefficient stores to attain maximum efficiency.

Table 6. Optimal Input calculation

Calculation of optimal inputs: SumProduct of the peers’ weights with their corresponding inputs.
E.g., cell C13 in Table 6 calculates the optimal value of store size for Store 2 as the SumProduct of Store 2’s weights
column and the store size input column. Similarly for the other inputs, cost and no. of resources.

Table 7. Scope of reduction in inputs for Store 2

Table 7 suggests the optimal values of inputs and the required reduction of inputs for Store 2 to be at par with its peers and
achieve the maximum efficiency.

Similarly, for stores 5 & 6, we can calculate the optimal inputs.
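
The SumProduct calculation of optimal inputs can be sketched in Python as follows. The inputs come from Table 1 and the peer weights from Table 5 (rounded to three decimals, so results are approximate); the function name optimal_inputs is our own.

```python
# Inputs from Table 1: (store size in 100 sq mt, cost in INR K, no. of resources).
INPUTS = {
    "Store 1": (5, 35.0, 5),
    "Store 2": (7, 80.0, 6),
    "Store 3": (6, 74.0, 4),
    "Store 5": (8, 55.0, 12),
    "Store 6": (7, 93.0, 7),
    "Store 7": (4, 86.0, 6),
}

# Peer weights (lambda) of the inefficient stores, read from Table 5.
PEER_WEIGHTS = {
    "Store 2": {"Store 1": 0.359, "Store 3": 0.513, "Store 7": 0.128},
    "Store 5": {"Store 1": 1.0},
    "Store 6": {"Store 7": 1.0},
}


def optimal_inputs(store):
    """SumProduct of the store's peer weights with the peers' inputs."""
    targets = [0.0, 0.0, 0.0]
    for peer, weight in PEER_WEIGHTS[store].items():
        for m, value in enumerate(INPUTS[peer]):
            targets[m] += weight * value
    return targets


for store in PEER_WEIGHTS:
    targets = optimal_inputs(store)
    reductions = [actual - target for actual, target in zip(INPUTS[store], targets)]
    print(store,
          "optimal:", [round(t, 3) for t in targets],
          "reduction:", [round(r, 3) for r in reductions])
```

With the rounded weights, Store 2's optimal inputs come out at roughly 5.39 (store size), 61.5 (cost) and 4.6 (resources), each about 77% of its current value, in line with its efficiency score of 0.769.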

Conclusion: As we have seen in this tutorial, DEA is a powerful method to compare multiple
teams/units/outlets/stores etc. It not only provides the efficiency score but also suggests the required changes in the
inputs (for input-oriented DEA). We discussed two types of returns-to-scale: CRS and VRS. There are two more types
of returns-to-scale, DRS (Decreasing Returns to Scale) and IRS (Increasing Returns to Scale), but they are out of
the scope of this tutorial. We hope this tutorial helps you get started with DEA. We would like to hear from you, so please
let us know if you need help understanding any part of this write-up.

Authors –

Himanshu Tripathi
Working as a Data Scientist

Email:

Himanshut15@iimb.ernet.in
