
Vertical Integration and Supply Chains

Shresth Garg¹, Pulak Ghosh², and Brandon Tan¹

¹ Department of Economics, Harvard University
² Indian Institute of Management Bangalore

February 2020

Abstract
Vertical integration is central to understanding patterns of economic
activity, but there has been limited empirical work measuring the extent
to which firms own and utilize direct upstream and downstream produc-
tion links for sourcing physical inputs. We use administrative data from
Karnataka, India on the universe of good shipments between any two es-
tablishments to answer this question. Uniquely, we can identify if two
establishments are under joint ownership allowing us to map the flow of
goods both within and across firms. We find that 11% of input value can
be sourced from vertically integrated upstream establishments. Of this
potential 11%, between 30 and 40% of trade actually materializes. This
suggests that the supply of physical goods along the production chain is
an important rationale for vertical integration. Notably, within the set of
vertically integrated firms, firms which source at least one product from
within account for over three-quarters of economic activity. We look at
factors associated with the decision to source a given product from within
and find that firm size, distance to outside and within firm suppliers,
frequency of input requirement, product relationship specificity, volume,
R&D requirements and competition both upstream and downstream are
important factors. We also look at factors associated with the ownership
of a vertically integrated establishment and find that firm size, product
specificity, R&D requirements and competition matter. Finally, we
estimate an establishment level gravity specification to assess the impor-
tance of integration for trade. While both distance and state borders are
important, integration emerges as by far the strongest driver of sourcing
decisions.

1 Introduction
Vertically integrated firms play a large and important role in the economy. There
exists a rich literature focused on integration decisions and their consequences,
but we know very little empirically about the nature of these vertical relation-
ships. Theory puts forward many possible rationales for the existence of these
relationships such as mitigating contracting frictions (Coase, 1937; Williamson,
1971), scale and scope economies (Stigler, 1951; Novak and Stern, 2009), or
strategic motives related to consolidating or extending market power (Perry,
1989; Rey and Tirole, 2007; Bresnahan and Levin, 2012). However, uncovering
the relative importance of different theories requires data on intra-firm activity
in vertically integrated firms which is often unavailable. In this paper, we use
administrative data on the movement of physical goods both within and outside
the firm from Karnataka, India to measure the extent to which firms own and
utilize direct upstream and downstream production links for sourcing physical
inputs.

We use administrative data from Karnataka, one of the largest states in India
with a population of 61 million people and GDP greater than $220 billion USD.
We observe the universe of physical shipments, above a threshold (∼ $700 USD),
between any two establishments through or within the state. Uniquely, we can
identify if two establishments are under joint ownership allowing us to map the
flow of goods both within and across firms. We measure the extent to which
direct production links exist within a firm, and find that downstream establishments
can potentially source 11% of their input value from an integrated
upstream establishment. We then measure trade along these production links
and find that between 30 and 40% of the potential trade materializes. Further,
within the set of vertically integrated firms, a large number source at least one
product from within. These firms make up over three-quarters of economic ac-
tivity. To our knowledge, this is the first paper to provide empirical evidence
on within-firm production chains using linked establishment to establishment
transaction data at the product level. The literature so far has been largely
limited to identifying production linkages using industry level input-output ta-
bles, while our data allows us to do so at the firm level.

We run a series of robustness checks to verify our benchmark results. The first
measurement challenge is defining buyers and suppliers for a given product in
our data. We identify an establishment as a buyer of an input if its total inward
shipment value of that product exceeds its total outward shipment value mul-
tiplied by some threshold. Similarly, we identify an establishment as a seller of
an input if its total outward shipment value of that product exceeds its total in-
ward shipment value multiplied by some threshold. Our results are robust to the
thresholds that are picked at this stage. Next, we identify vertically integrated
production links by whether a downstream establishment has a “potential” up-
stream supplier owned by the same firm for a particular input. Our results are
similar under different definitions and scale requirements for the existence of a
“potential” supplier owned by the same firm. Our results are also robust to
excluding products that are not an establishment’s “primary input”, suppliers
that do not ship frequently, and suppliers that are not within a buyer’s district.

For our main analysis, we use four-digit product codes. We provide additional
results using different levels of aggregation for products. First, we aggregate all
the products in 21 different broad categories and provide results on the share
of within-firm sourcing for each category. Second, we ensure that our 4 digit
product categories are sufficiently narrow. We repeat our analysis for observa-
tions which report an 8 digit product code. As larger firms are more likely to
report 8 digit product codes this sample selection is endogenous. Reassuringly,
the proportion of within-firm sourcing remains approximately the same.

Our results relate to the literature on vertical integration emerging as an efficient
outcome in the presence of contracting frictions. Frictions may lead to
problems of hold-up (Williamson, 1971; Williamson, 1975), incentives (Grossman
and Hart, 1986; Hart and Moore, 1990) or decision making externalities
like double marginalization (Spengler, 1950). Inherent in all these theories of
vertical integration is the supposition that integrated upstream and downstream
units transact with each other. We find results suggesting that transactions in
physical inputs are an important rationale for vertical integration.

After establishing that firms utilize their internal networks, we explore the de-
cision of utilizing existing within-firm production links. This is an important
margin as many firms face this decision. 60 percent of economic activity takes
place in firms that are vertically integrated in at least one product. We explore
the effect of distance, product specificity, product R&D requirement, transac-
tion frequency, and upstream and downstream competition on the decision to
source from within. To our knowledge, this is the first paper to explore this
firm decision.

A large literature on gravity models in trade highlights the importance of distance
in determining patterns of trade (Anderson and Van Wincoop, 2004).
While it is attractive to source from an integrated supplier, the advantage de-
creases the further the integrated supplier is and conversely the closer an outside
firm supplier is. We find results in line with this prediction. Doubling the dis-
tance from an integrated supplier reduces within-firm sourcing by 5 percentage
points while doubling the distance from an outside supplier increases within-firm
sourcing by 7 - 12 percentage points. Shipments that travel twice the distance
are 4 - 5 percentage points more likely to be externally supplied.

When investments are relationship-specific, contracting frictions may lead to
under-investment on the part of the transacting parties (Klein, Crawford, and
Alchian, 1978, Williamson, 1979, Williamson, 1985, Grossman and Hart, 1986,
Hart and Moore, 1990). If vertical integration occurs to reduce the amount
of under-investment, we will find that firms are more likely to source products
that are relationship-specific from within. We use the measure of specificity
developed in Rauch (1999) to test this prediction. A product is categorized as
relationship-specific if it is not listed on an exchange. We find that within-firm
sourcing for specific products is 7 - 9 p.p. greater than for non-specific prod-
ucts. Further, the literature suggests that there may be under-investment by
the contracting parties if they do not internalize the benefit to the other. This
would drive higher sourcing from within-firm establishments for products where
investment is important. We find that doubling product R&D investment in-
tensity increases the within-firm sourcing by 8 p.p.

Williamson (1985) emphasizes that the frequency of transactions can be important,
and that one-time transactions are less likely to be sourced from within.
Goods needed frequently may be more critical in the production process and any
disruption may result in a bottleneck in production. We construct a measure
of transaction frequency at the product level, which measures the proportion of
months that a firm sources a given input. We find that the frequency of ship-
ment is important in explaining the decision to source from within. Receiving a
shipment monthly instead of quarterly results in a 7 percentage point increase
in within-firm sourcing. Volume, measured by the number of shipments, also
matters, and doubling the number of shipments results in 2.6 - 5 p.p. higher
within-firm shipments.

In the Lucas (1988) model, managerial talent drives the firm size distribution. If
larger firms have better managerial talent and sourcing from within is a decision
made by the management, then larger firms are more likely to source from within.¹
We find that larger firms in terms of total value, number of establishments, and
number of products are far more likely to take advantage of their integrated
supply networks.

The literature in industrial organization emphasizes market competition as an
important determinant of the decision to vertically integrate. In a similar vein,
we explore whether the sourcing decision is affected by the market competition
in the upstream or the downstream industry. We use the Herfindahl-Hirschman
Index (HHI) at the product level as a measure of competition.

Higher competition among the suppliers of a product, including the integrated
supplier, has an ambiguous effect on the decision by the downstream establishment
to source from within. While more competition increases the options
available to the downstream establishment making it more likely to source from
outside the firm, the downstream establishment is also a captured buyer for the
upstream establishment in the face of higher competition. Empirically, we find
that more competition upstream increases within-firm sourcing. A 0.1 increase
in HHI decreases the share of within-firm sourcing by 4.4 percentage points.
¹ In Melitz (2003), firms need to incur a fixed cost before exporting, and more productive
firms are more likely to export. Similarly, if there are fixed costs of sourcing from within then
it may be more beneficial for larger firms to source from within due to scale economies.

We similarly analyze competition in the downstream industry. Ideally, we would
like to conduct the analysis at the product level, similar to the upstream case. If
an establishment faces higher competition for a product, we would like to know
if it is more likely to source the inputs for that product from within. How-
ever, while we see inputs at the establishment level we do not see them at the
product level, i.e., for an establishment that sells multiple products we see the
total inputs purchased, but not inputs purchased for each of the products separately.
We get around this issue by assigning each downstream establishment a
market competition measure which is the weighted average of the HHI over all
its output products, where we weight each product’s HHI by its output value. We
find that establishments operating in a more competitive environment are more
likely to source inputs from outside the firm. A 0.1 increase in our weighted
HHI measure faced by the downstream firm decreases within-firm sourcing by
3.6 percentage points. This result was not obvious ex-ante. If firms operate in
a more competitive industry, then having an upstream supplier provides a com-
petitive advantage by reducing the incidence of double marginalization hence
making it more likely that firms would source from within. On the other hand,
input quality and price matter more in more competitive industries and firms
are more likely to source from the highest quality supplier available which may
be outside the firm.
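
Written out, the weighting described above is a value-weighted mean of product-level indices. The expression below is only a sketch in illustrative notation that does not appear elsewhere in the paper: $v_{jp}$ is assumed to denote the output value of product $p$ at downstream establishment $j$, $O(j)$ the set of its output products, and $HHI_p$ the product-level index.

```latex
\[
\overline{HHI}_j = \frac{\sum_{p \in O(j)} v_{jp}\, HHI_p}{\sum_{p \in O(j)} v_{jp}}
\]
```

As a purely hypothetical example, an establishment with output values of 300 and 100 in products whose HHI is 0.2 and 0.6 would be assigned (300 × 0.2 + 100 × 0.6) / 400 = 0.3.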

For our main analysis, we focused on the decision of the downstream buyers
to source from within the firm. Another way to get at the question would be
to look at upstream suppliers that have an integrated downstream buyer and
measure the proportion of their sales within the firm. These results would di-
verge if the size distributions of upstream and downstream establishments are
different. We find that when measured from the suppliers’ side, our results are
approximately the same.

Most of the discussion so far has focused on the intensive margin decision of
utilizing an existing link. We also look at what is associated with the existence
of an integrated seller for a given product, or the extensive margin decision
over vertical ownership. We find that larger firms are more likely to have an
integrated seller. If an input is more relationship-specific there is a higher like-
lihood of having an integrated seller. More competitive product categories are
less likely to have an integrated buyer. Finally, if a product is more R&D in-
tensive it is more likely to have an integrated seller.

Finally, we run an establishment-level gravity specification to assess the relative


importance of distance, state borders, and vertical integration in affecting the
volume of trade. In our model, a downstream establishment chooses to source
an input from a set of potential upstream establishments that we observe in the
data. The sourcing decision is affected by the distance from the downstream
buyer, whether the upstream supplier is integrated and whether the upstream
supplier is in the same state. On average, a downstream establishment has
over 3000 potential suppliers and the share of sourcing with any one is small
at 0.02%. For the average firm in our data, a one standard deviation reduction
in distance from the supplying establishment increases the share of sourcing to
0.026%, removing state border barriers increases the share to 0.07%, and ver-
tical integration of the supplying establishment increases the share to 2.03%.
While both distance and state borders are important, integration emerges as by
far the strongest driver of sourcing decisions.
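
To put these magnitudes on a common scale, each reported share can be divided by the 0.02% baseline. This is simple arithmetic on the rounded figures quoted above, not an additional estimate:

```latex
\[
\frac{0.026}{0.02} = 1.3, \qquad \frac{0.07}{0.02} = 3.5, \qquad \frac{2.03}{0.02} = 101.5
\]
```

On these rounded figures, integration scales the baseline share by roughly a factor of 100, against factors of 1.3 and 3.5 for distance and state borders respectively.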

Our results pertain to a specific context, and care must be taken in generalizing them
to other settings. We are studying firm behavior in a developing country where
contracting frictions might be especially high. Boehm and Oberfield (2018)
study how differences in contract enforcement across states in India distort production
behavior. They find that in industries that tend to rely more heavily on
relationship-specific intermediate inputs, plants in states with more congested
courts shift their expenditures away from intermediate inputs and appear to be
more vertically integrated. Congestion in courts may also push firms to source
inputs from within rather than relying on outside suppliers. Indeed, Atalay,
Hortaçsu, and Syverson (2014) find that firms are much less likely to source
inputs from integrated suppliers in the United States. Ramondo, Rappoport,
and Ruhl (2016) study the question for MNCs based out of the US and find that
most MNCs do not engage in trade with affiliates. Atalay, Hortaçsu, Li, et al.
(2019) estimate a gravity specification similar to ours at the seller-destination
level to measure the importance of distance and existence of integrated suppliers
for within firm trade in the United States. They also find that firm boundaries
serve as significant barriers to trade, though the magnitude of our results indicates
that they are more important in our context.² Our different results, in a
developing country context, highlight the importance of taking into account the
legal and contracting environment when studying firm behavior.

The remainder of the paper presents our empirical results in more detail. It is
organized as follows. In Section 2 we describe the data. In Sections 3 and 4 we
describe variable definitions and construction. In Section 5, we present our results
on the ownership and utilization of vertically integrated links. In Section 6, we
analyze when firms utilize their vertically integrated networks. In Section 7,
we study the firm integration decision. In Section 8, we quantify the relative
importance of distance, state borders, and vertical integration in affecting the
volume of trade. We conclude in Section 9.
² Both Atalay, Hortaçsu, and Syverson (2014) and Atalay, Hortaçsu, Li, et al. (2019) use
data from the US Commodity Flow Survey (CFS). Their data contains a sample of 40 randomly
selected shipments per quarter for 9000 multi-unit firms. Their data does not allow
them to distinguish if a shipment is within firm. To get around this issue, they combine information
on the destination of the shipment with information on the presence of an integrated
establishment at the destination. Unlike the CFS, our data allows us to precisely classify each
shipment as being within or outside the firm.

2 Data
In India, every registered business is required to submit an electronic document
(known as an e-way bill)³ to the government prior to any movement of goods
valued above the threshold of Rs. 50,000 (∼ $700 USD). This includes any
good transported by road, air, railways, or water vessel. If the consignor is a
registered taxpayer, they are responsible for generating an e-way bill. If they
are not registered, then generating the e-way bill becomes the responsibility of
the consignee or the person transporting goods. Notably, the bill is generated
even if goods are shipped to a different establishment within the firm. The law
was introduced to increase tax compliance and reduce shipping times. Gov-
ernment officials have the authority to intercept any conveyance to verify the
e-way bill or the e-way bill number for all inter- and intra-state shipments. The
penalty for non-compliance is Rs. 10,000 (∼ $141 USD) or the value of the tax
evaded, whichever is greater. In its first phase, the law covered only interstate
shipments and in later phases was expanded to include intra-state shipments as
well.⁴

We use administrative data on e-way bills from the state of Karnataka. Kar-
nataka was the first state to roll out this bill at the intra-state shipment level,
starting on April 1, 2018. Our dataset covers the universe of bills from April 1,
2018 to August 29, 2019.

For each e-way bill, we observe the date of shipment, each firm’s tax ID (GSTIN),
distance, the ZIP code (PIN code) of the sender and the receiver, and the total
value of the shipment.⁵ A given shipment can contain multiple goods. For each
good within a shipment, we observe its HS product code, its total value, and
quantity. Firms report either 4 digit or 8 digit HS product codes, so for most
of our analysis, we define our products by their 4 digit code. However, as a
robustness check, we repeat our analysis on the subset of observations for which
we observe 8 digit codes.

We provide some descriptive statistics in Tables 1, 2, and 3. We observe around
1.2 million firms and 1300 products. An average firm operates in 1.76 locations,
is associated with 5 products, and makes 186 shipments in the period we observe.
The average value of a shipment is around Rs. 200,000 (∼ $2,820 USD).
The average total value of outward shipments for a given firm is Rs. 15 million
(∼ $210,000 USD). For each product, we observe on average around 5,500
establishments and 80,000 inward shipments.
³ The filing of e-way bills was mandated with the rollout of the Goods and Services Tax in
India starting in 2017. There is a small but growing literature studying the impact of the tax
regime. See for example Agarwal et al. (2019) and Leemput (2020).
⁴ For more information refer to the information provided at https://cleartax.in/s/eway-bill-gst-rules-compliance.
⁵ We top-code all values at the 99th percentile.

3 Measuring Vertical Integration and Within-Firm Sourcing
This section explains how we use our data to measure vertical integration and
whether an input is sourced from within the firm.

3.1 Identifying Firm Establishments


Each registered firm in the state is assigned a unique tax ID (GSTIN). All establishments⁶
owned by the same firm report the same GSTIN in their e-way
bills when transporting goods. We also observe the PIN code of each sender and
receiver. Each PIN code is mapped to exactly one delivery post office. There are
over 150 thousand distinct PIN codes in India, and each corresponds on average to
an area of 21.22 square kilometers and a population of roughly 8000 people. We
define an establishment as a GSTIN-PIN code pair. Therefore, if we see a firm
shipping from 5 different PIN codes, we say that the firm has 5 establishments.
We identify about 2 million establishments according to our definition. Note
that while we are not able to separate establishments that operate in the same
PIN code, these are geographic areas that are very small and may effectively
operate as a single entity.
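
The definition above can be sketched as follows. This is a minimal illustration rather than the code used for the paper: the record layout, the identifiers, and the function name `establishments_by_firm` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (GSTIN, PIN code) observed on e-way bills.
# The identifiers are made up for illustration.
records = [
    ("GSTIN_A", "560001"),
    ("GSTIN_A", "560001"),  # same pair again: still one establishment
    ("GSTIN_A", "560034"),
    ("GSTIN_B", "560001"),  # same PIN code, different firm
]

def establishments_by_firm(rows):
    """Map each GSTIN to the set of PIN codes it is observed at.

    Every distinct (GSTIN, PIN code) pair counts as one establishment.
    """
    pins = defaultdict(set)
    for gstin, pin in rows:
        pins[gstin].add(pin)
    return dict(pins)

firm_pins = establishments_by_firm(records)
n_establishments = sum(len(p) for p in firm_pins.values())
```

In this toy input, the first firm has two establishments and the second has one, so two firms sharing a PIN code are still counted separately.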

3.2 Identifying Buyers and Suppliers For Each Product


Each e-way record for a shipment may contain multiple products. We observe
the HS code for all products within each shipment. Some small firms (turnover
less than $700,000) report only 2 digits of the HS code, while larger firms
report 4 or 8 digits. For our main analysis, we focus on 4 digit HS codes,
aggregating the 8 digit codes to the 4 digit level and removing all observations at
the 2 digit level, which account for less than 0.4 percent of total shipment value
in our data. Additionally, for robustness, we repeat our analysis on the subset of
observations for which we see an 8 digit code, using all 8 digits. We have
about 1,295 products in our data with 4 digit HS codes, and 10,610 products
with 8 digit HS codes.

Next, we determine the set of inputs and outputs for each establishment. It
is possible that an establishment both ships in and ships out a given product.
Thus, we define an establishment to be a “net-buyer” of a good if the total
inward shipment value of that product observed in our data exceeds the total
outward shipment value multiplied by some threshold. In our preferred
specification, we use a threshold of 1.2, so a given product is an input to an
establishment or the establishment is a “net-buyer” of that good if its total in-
ward shipment value is greater than 1.2 times its total outward shipment value.
⁶ We use the word establishment to refer to a single unit within the firm. For example, if
a firm owns two factories then we will say that the firm has two establishments.

Similarly, we define an establishment to be a “net-seller” of a good if the total
outward shipment value of that product observed in our data exceeds 1.2 times
its total inward shipment value. We provide results for three such thresholds
for robustness (1, 1.2 and 1.5) and find that our results are not sensitive to the
choice of threshold.
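
A minimal sketch of this classification rule is given below, with the net-seller rule taken as the mirror image of the net-buyer rule. The function name `classify` and the "neither" label for pairs that satisfy neither inequality are assumptions made for illustration, not part of the paper.

```python
def classify(in_value, out_value, threshold=1.2):
    """Classify an establishment-product pair by its net trade position.

    in_value / out_value: total inward / outward shipment value of the product.
    threshold: multiplier from the text (1, 1.2 and 1.5 are the reported values).
    Pairs satisfying neither inequality are labelled "neither" (an assumption).
    """
    if in_value > threshold * out_value:
        return "net-buyer"
    if out_value > threshold * in_value:
        return "net-seller"
    return "neither"

# Hypothetical values for illustration.
examples = [classify(130.0, 100.0), classify(100.0, 90.0), classify(0.0, 50.0)]
```

Under the 1.2 threshold, an establishment with inward value 100 and outward value 90 is left unclassified, whereas a threshold of 1 would make it a net-buyer.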

In Tables 4 and 5, we present some descriptive statistics on buyers and sellers.
For a given product, about 80 percent of the observed establishments are buyers
and about 20 percent are suppliers. Each firm on average sells 1.5 products and
buys 7 products.

3.3 Measuring Vertical Integration


In this paper, we do not rely on industry-level links based on I-O tables to proxy
for vertical integration. Instead, we directly measure whether a downstream es-
tablishment has a potential upstream supplier within the firm for a particular
input. If it does, then we determine that a direct vertically integrated produc-
tion link for that input exists. An analogous variable construction follows for
whether an upstream establishment has a potential downstream buyer, which
we compute for robustness.

We consider each establishment-product pair for which the establishment is a
“net-buyer” in our data as defined in the previous section. In our most liberal
definition of vertical integration (“Upstream Integrated: Exists”), we determine
that a “net buyer” establishment has an integrated upstream source for that
product if there exists another establishment within the firm which is a “net-
seller”. However, it may be the case that the upstream supplier is much smaller
in scale than the downstream establishment, in which case the firm could not
source from within. We define another measure of vertical integration (“Upstream
Integrated: Large”) which requires that in addition to an upstream supplier or
“net-seller” existing within the firm, the largest of these integrated suppliers
must be at a sufficient scale relative to the demand of the downstream establishment.
As it is ambiguous what should be the ‘sufficient scale’ required, we
provide results for different threshold values, 0.5 and 1. A threshold value of 0.5
here would mean that the largest integrated supplier’s total sales are at least
50% of what the downstream establishment requires. Our preferred specification
(“Upstream Integrated: Total”) falls somewhere between these two measures.
The scale is important, but it could be that multiple integrated establishments
together can meet the requirements of the downstream buyer. We sum over
the out-value of all the upstream suppliers within the firm and ask if this total
supply is large enough relative to the demand of the downstream establishment.
Again, we provide results for different threshold values, 0.5 and 1. Our preferred
specification uses 0.5 as the threshold, so we determine that a “net-buyer” es-
tablishment has an integrated upstream source for a given input if the total
sales of that input over all integrated suppliers are at least 50% of what the
downstream establishment buys.
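
The three definitions can be compared side by side in a short sketch. The function name, argument names, and example numbers below are hypothetical, and the code is only an illustration of the definitions as stated, not the paper's implementation.

```python
def upstream_integrated(buyer_demand, sibling_seller_sales, scale=0.5):
    """Return the three indicators for one net-buyer establishment-product pair.

    buyer_demand: total inward shipment value of the input at the buyer.
    sibling_seller_sales: list of outward values of the same product, one per
        net-seller establishment owned by the same firm.
    scale: sufficient-scale threshold (0.5 or 1 in the text).
    """
    exists = len(sibling_seller_sales) > 0
    # "Large": the largest single integrated supplier must be big enough.
    large = exists and max(sibling_seller_sales) >= scale * buyer_demand
    # "Total": integrated suppliers taken together must be big enough.
    total = exists and sum(sibling_seller_sales) >= scale * buyer_demand
    return {"exists": exists, "large": large, "total": total}

# Hypothetical buyer needing 100 with two integrated suppliers selling 30 each.
flags = upstream_integrated(100.0, [30.0, 30.0])
```

Here neither supplier alone reaches 50% of demand, so the "Large" indicator is false, but together they do, so the "Total" indicator is true.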

Similarly, we determine whether an upstream establishment has a potential
downstream buyer by considering each establishment-product pair for which
the establishment is a “net-seller” in our data and checking for the existence of
a downstream “net-buyer” (“Downstream Integrated: Exists”), and also condi-
tioning on establishment-level scale (“Downstream Integrated: Large”) or total
scale (“Downstream Integrated: Total”).

3.4 Measuring Utilization of Within-Firm Production Chains


Our data allows us to measure the extent to which vertically integrated firms
utilize their within-firm direct production links. First, we consider a shipment
i to be “internal” if both the sender and receiver on the e-way bill have the same
unique tax ID (GSTIN). Then for each establishment (j) and product (p) pair
with an integrated upstream supplier, we compute the share of total inputs which are sourced
within the firm by summing over all “internal” in-shipments of product p to
establishment j and dividing by the sum of all in-shipments of product p to
establishment j.

\[
\mathit{WithinShare}_{jp} = \frac{\sum_{i \in \chi(j)} \mathit{value}_{ip} \cdot \mathbf{1}\{\mathit{sender}_i = j\}}{\sum_{i \in \chi(j)} \mathit{value}_{ip}}
\]
where $\mathit{WithinShare}_{jp}$ is the share of shipments of product $p$ to establishment
$j$ which are from within firm suppliers, $\mathit{value}_{ip}$ is the total value of product $p$
in shipment $i$, $\mathit{sender}_i$ is the sending establishment of shipment $i$, and $\chi(j)$ is
the set of shipments to establishment $j$.
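
As a rough sketch of this computation, the snippet below treats a shipment as internal when the sender and receiver GSTINs match, as described in the text. The tuple layout, the identifiers, and the function name `within_share` are hypothetical.

```python
from collections import defaultdict

# Hypothetical product lines of e-way bills:
# (sender GSTIN, receiver GSTIN, receiver PIN code, product, value)
lines = [
    ("F1", "F1", "560001", "7208", 300.0),  # internal: same GSTIN
    ("F2", "F1", "560001", "7208", 700.0),  # external
    ("F1", "F1", "560001", "3901", 50.0),   # internal
]

def within_share(product_lines):
    """Share of inward value sourced from same-GSTIN senders,
    for each (receiving establishment, product) pair."""
    internal = defaultdict(float)
    total = defaultdict(float)
    for s_gstin, r_gstin, r_pin, product, value in product_lines:
        key = ((r_gstin, r_pin), product)
        total[key] += value
        if s_gstin == r_gstin:
            internal[key] += value
    return {k: internal[k] / total[k] for k in total if total[k] > 0}

shares = within_share(lines)
```

In this toy input, the establishment sources 300 of 1,000 in value of product 7208 internally, giving a share of 0.3, and all of product 3901.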

We can also aggregate to the firm level and measure the extent to which a given
firm k utilizes its vertically integrated production links by taking the weighted
average of $\mathit{WithinShare}_{jp}$ over each establishment-product pair associated with
firm k which has an upstream integrated supplier.

\[
\mathit{FirmUtilization}_k = \frac{\sum_{jp \in \Delta(k)} \left[ \mathit{WithinShare}_{jp} \cdot \mathbf{1}\{\mathit{UpstreamIntegrated}_{jp} = 1\} \cdot \sum_{i \in \chi(j)} \mathit{value}_{ip} \right]}{\sum_{jp \in \Delta(k)} \left[ \mathbf{1}\{\mathit{UpstreamIntegrated}_{jp} = 1\} \cdot \sum_{i \in \chi(j)} \mathit{value}_{ip} \right]}
\]
where $\mathit{FirmUtilization}_k$ is the share of potential vertically integrated sourcing
that is realized, $\Delta(k)$ is the set of all establishment-product pairs associated
with firm $k$, and $\mathit{UpstreamIntegrated}_{jp}$ is an indicator for whether a potential
upstream supplier exists within the firm for a particular input based on our
definitions from the previous section.
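
The firm-level measure is a value-weighted mean restricted to integrated pairs, which can be sketched as below. The function name, the tuple layout, and the choice to return `None` for firms without any integrated pair are assumptions made for illustration.

```python
def firm_utilization(pairs):
    """Value-weighted mean of WithinShare over the integrated pairs of one firm.

    pairs: iterable of (within_share, upstream_integrated, inward_value),
        one entry per establishment-product pair of the firm.
    Returns None when no integrated pair has positive inward value
    (an assumption; the text does not say how such firms are treated).
    """
    pairs = list(pairs)  # allow generators, since we iterate twice
    num = sum(share * value for share, integrated, value in pairs if integrated)
    den = sum(value for _, integrated, value in pairs if integrated)
    return num / den if den > 0 else None

# Hypothetical firm with two integrated pairs and one non-integrated pair.
util = firm_utilization([
    (0.3, True, 1000.0),
    (1.0, True, 50.0),
    (0.0, False, 500.0),  # excluded: no integrated upstream supplier
])
```

The non-integrated pair does not enter either sum, so the result is (300 + 50) / 1050, or one third.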

4 Other Variable Construction


We also construct several additional variables for our analysis.

4.1 Distance
Each e-way bill reports the distance over which the goods are transported. This
is the measure of distance we use when we compute the average distance over
which goods are transported for supply relationships which exist, for instance
the average shipment distance for a given input for some establishment. However,
we are also interested in counterfactual distances for supply relationships
that fail to materialize. Particularly, we are interested in the distances to po-
tential suppliers within and outside the firm boundary for each “net buyer”
establishment-product pair, as well as how far a given establishment is from
other establishments owned by the same firm. Here, our measure of distance
between two establishments is the distance between their PIN code centroids.
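
The text does not state how the distance between two centroids is computed. One common choice, shown here purely as an assumption, is the great-circle (haversine) distance between the centroid coordinates. The function name and the Earth radius of 6371 km are illustrative choices, not values from the paper.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometers between two points given in degrees.

    Assumed here as one possible distance between PIN code centroids;
    the paper does not specify the metric.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# One degree of latitude along a meridian is roughly 111 km.
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
```

A road-network distance would be closer to what the e-way bills report for realized shipments, so the two measures need not coincide.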

4.2 Input Requirements


We define several variables to capture a given establishment’s requirements for
a given input p. We proxy for volume by summing over the total value of
product p in every inward shipment. We also count the number of inward
shipments that carry the product p. We proxy for the value of the input by
computing the average value of product p in an inward shipment which carries
product p. Finally, we also measure the time-frequency of the input requirement
by counting the number of months for which an inward shipment that carries
product p is made and dividing by the total months which the establishment
exists in our data – the share of months with a shipment of product p.
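
These four variables can be sketched for a single establishment-input pair as follows. The function name, the dictionary keys, and the month labels are hypothetical, and the example values are made up.

```python
def input_requirements(inward, months_in_data):
    """Summarize one establishment's requirement for one input.

    inward: list of (month, value) for inward shipments carrying the product,
        where month is any hashable label such as "2018-04".
    months_in_data: number of months the establishment appears in the data
        (must be positive).
    """
    values = [v for _, v in inward]
    n = len(values)
    total = sum(values)
    return {
        "volume": total,                      # total inward value
        "n_shipments": n,                     # number of inward shipments
        "avg_value": total / n if n else 0.0, # average value per shipment
        "month_share": len({m for m, _ in inward}) / months_in_data,
    }

req = input_requirements(
    [("2018-04", 100.0), ("2018-04", 50.0), ("2018-07", 150.0)],
    months_in_data=12,
)
```

Under this measure, an input received every month has a share of 1, while an input received once a quarter has a share of about one third.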

4.3 Product Relationship Specificity


We define a measure for the importance of relationship-specific investments
across products. Using data from Rauch (1999), we identify whether or not the
input is sold on an organized exchange and whether or not it is reference priced in
a trade publication. If an input is sold on an organized exchange, then the market
for this input is thick and the scope for hold-up is limited: if a buyer attempts
to renegotiate a lower price, the seller can simply take the input and sell
it to another buyer. Classifying only exchange-traded inputs as non-specific is
our preferred definition. If an input is not sold on an exchange, it may still be
reference priced in trade publications, which are purchased by potential buyers
and sellers of the input. Trade publications are only produced if there is a
sufficient number of purchasers of the publication. Therefore, if an input is
reference priced in a publication, this indicates that there exists a reasonably
large number of potential buyers and sellers of the input. Inputs not sold on an
exchange but reference priced in trade publications can thus be thought of as
having an intermediate level of relationship-specificity, which we use in a
secondary definition for robustness. Rauch’s original classification uses the
4-digit SITC Rev. 2 system. Each industry is coded as being in one of the
following three categories: sold on an exchange, reference priced, or neither.
We use a 4-digit SITC to 10-digit HS concordance from Feenstra and Hanson (1996)
to match the classification to our data. Rauch has both a liberal estimate and a
conservative estimate. We use the liberal estimate in our main specification, and
our results are not affected by this choice.

4.4 R&D Intensity


We use data from Nunn and Trefler (2013) who calculated R&D intensity using
firms in Bureau van Dijk’s Orbis dataset. R&D intensity is defined as the ratio of
R&D expenditures to total sales. The data are provided at the HS 6 digit level, so we
aggregate this measure to the HS 4 digit level by taking the mean over all HS
6 digit intensity measures which correspond to a given HS 4 digit code.
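
The aggregation step can be sketched as follows, assuming that an HS 4 digit code is the first four characters of the HS 6 digit code. The codes and intensity values are invented for illustration and are not the Nunn and Trefler (2013) data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical HS 6 digit R&D intensities (R&D expenditures / total sales)
hs6_intensity = {"870321": 0.04, "870322": 0.06, "300490": 0.12}


def aggregate_to_hs4(intensity_by_hs6):
    """Average HS 6 digit intensities within each HS 4 digit prefix."""
    groups = defaultdict(list)
    for hs6_code, intensity in intensity_by_hs6.items():
        groups[hs6_code[:4]].append(intensity)
    return {hs4_code: mean(values) for hs4_code, values in groups.items()}


hs4_intensity = aggregate_to_hs4(hs6_intensity)
print(hs4_intensity)
```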

4.5 Size
We proxy for firm or establishment size by its total in-shipment value. We also
count the number of products that a firm or establishment buys and sells. For
firms, we additionally count the number of establishments that it owns.

4.6 Competition
We use the Herfindahl-Hirschman Index (HHI) at the product level as a measure
of competition. We calculate this measure by squaring the market share of each
firm competing in a market and then summing the resulting numbers.

To define market concentration, one needs to define a market within which firms
are assumed to compete. We define the market to be either the state or a district
in the state. We define the market share of firm j in product p to be its share
of total outward shipment value in market m:

$$ s_{jpm} = \frac{\sum_{i \in \chi(j)} value_{ip}}{\sum_{j' \in \Phi(m)} \sum_{i \in \chi(j')} value_{ip}} $$

where $s_{jpm}$ is the market share of firm j in product p in market m, $value_{ip}$ is
the total value of product p in shipment i, $\chi(j)$ is the set of shipments from firm
j, and $\Phi(m)$ is the set of firms in market m.

Our market concentration measure is:

$$ HHI_{pm} = \sum_{j \in \Phi(m)} s_{jpm}^{2} $$


The HHI ranges from close to 0 to 1, moving from a very large number of very small
firms to a single monopolistic producer.
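
A minimal sketch of the market share and HHI calculation defined above is given below. The firms, product codes, market label and values are hypothetical; only the formulas follow the definitions in the text.

```python
from collections import defaultdict

# Hypothetical outward shipments: (firm, product, market, value)
outward = [
    ("A", "7208", "KA", 60.0),
    ("B", "7208", "KA", 30.0),
    ("C", "7208", "KA", 10.0),
    ("A", "3901", "KA", 5.0),
]


def hhi_by_product_market(records):
    """Sum of squared firm market shares for each (product, market) pair."""
    firm_value = defaultdict(float)
    market_value = defaultdict(float)
    for firm, product, market, value in records:
        firm_value[(firm, product, market)] += value
        market_value[(product, market)] += value
    hhi = defaultdict(float)
    for (firm, product, market), value in firm_value.items():
        share = value / market_value[(product, market)]
        hhi[(product, market)] += share ** 2
    return dict(hhi)


hhi = hhi_by_product_market(outward)
print(hhi)
```

A product with a single seller in the market, such as the second product in this toy data, gets an HHI of 1.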

Our preferred specification defines a market as the state, since there is
a substantial amount of across-district trade. Our results are robust to this
choice.

5 Results: Ownership and Utilisation
In this section, we present our empirical results related to the ownership of
vertically integrated production chains and the utilisation of these networks.

5.1 Ownership of Vertically Integrated Production Chains


By directly observing all inputs and outputs for each product at every estab-
lishment in our data, we can measure the extent to which firms own vertically
integrated production links. In Table 6, we present the results for our various
definitions of vertical integration as described in Section 3. Our preferred spec-
ification is using “Upstream/Downstream Integrated: Total” with a threshold
of 1.2, where we determine that a potential vertical link for a given product
exists if the total out/input value of the firm is greater than 1.2 times its total
in/output value for that product.

We find that downstream establishments can potentially source 11% of the total
input value from an integrated upstream establishment. When we only consider
firms that operate in more than one location, or multi-establishment firms, 13%
of the total input value can be sourced from an integrated upstream supplier.
We also take the suppliers’ perspective and find that upstream establishments
can potentially sell 10% of their total output value to an integrated downstream
establishment. This measure increases to 11% when we only consider multi-
establishment firms.

In Table 7, we show that a large share of firms in the economy are vertically
integrated. Firms that can source at least one product from within make up
60% of economic activity.

5.2 Utilization of Vertically Integrated Production Chains


A large share of trade can take place within vertically integrated firms. Next,
we measure the extent to which this trade materializes.

5.2.1 Baseline Results


Among input purchases for which a vertically integrated upstream supplier exists,
we measure the share of input value that is sourced within the firm.

Our main results for this section are reported in Table 8. In this table, we con-
sider firms located within the state, as we see their complete transaction data.
As explained in Section 3, we define a firm to be a seller of a product if its total
out-value for that product exceeds a multiple of its total in-value. Each column
refers to a different threshold value for the multiple. We report results for 1,
1.2 and 1.5. Each row corresponds to a different definition of the firm having
an integrated supplier. In the first two rows, we consider the case where the
firm owns a single upstream seller at sufficient scale, using threshold values of
0.5 and 1 respectively, while in the next two rows we look at whether the total
capacity of integrated sellers is sufficiently large.

In this table, we report the weighted average of $WithinShare_{jp}$, where we weight
by establishment j’s total purchase value of product p. Thus, each number
represents the share of total trade that takes place within firms, out of the total
potential trade that can take place within the firm. Our measure ranges between
32% and 35% across specifications.

We also report the unweighted average in Table 9. The unweighted average treats
each establishment’s realization of $WithinShare_{jp}$ as an independent draw.
Thus, this number represents the average share at the establishment level. The
results for this measure are similar to the weighted share, ranging from 37% to
40% across specifications.
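
The difference between the two averages can be illustrated with a small sketch. The establishment-product rows below are hypothetical; the weighting by total purchase value follows the description above.

```python
# Hypothetical rows: (establishment, product, total purchase value, value sourced within firm)
pairs = [
    ("e1", "7208", 1000.0, 200.0),
    ("e2", "7208", 100.0, 80.0),
    ("e3", "3901", 400.0, 0.0),
]


def within_share_averages(rows):
    """Return (value-weighted, unweighted) averages of the within-firm share."""
    shares = [within / total for _, _, total, within in rows]
    weights = [total for _, _, total, _ in rows]
    weighted = sum(s * w for s, w in zip(shares, weights)) / sum(weights)
    unweighted = sum(shares) / len(shares)
    return weighted, unweighted


weighted_avg, unweighted_avg = within_share_averages(pairs)
print(round(weighted_avg, 4), round(unweighted_avg, 4))
```

In this toy example the unweighted average is higher because the small buyer sources most of its input from within, which mirrors the way the two measures can diverge when within-firm sourcing varies with purchase size.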

5.2.2 Sample Selection Robustness


In addition to checking our measures with various threshold levels, we run a
series of further robustness checks on sample selection.

First, we remove from the sample the observations where the establishment
sources the product fewer than three times. Such purchases may represent one-off
transactions for which it may not be worth sourcing from within. In Table A1, we
find that our baseline measures do not change much and range between 30 and 34
percent. We also report our unweighted results in Table A2.

Second, we consider only the primary inputs for each establishment. It may be
that minor inputs are not important enough to be sourced within the firm. We
define an establishment’s primary input to be the product with the largest total
inward shipment value. We find similar results to the baseline. The results are reported
in Table A5 and our measure ranges from 33 to 38 percent. We also report our
unweighted results in Table A6.

Third, we consider only potential suppliers within a downstream establishment’s
district. It may be that distance prohibits firms from utilizing their within-firm
production links. In Table A3, we show that our baseline results are robust,
ranging from 31 to 46 percent. We also report our unweighted results in Table
A4.

5.2.3 Robustness to Scale Requirement


Up to this point we have considered cases where the integrated suppliers must
be of sufficient scale for the downstream establishment to consider sourcing from
within. In Table A7 we check the robustness of our results to adopting a very
liberal definition of having an integrated supplier. We consider an establishment
to have an integrated supplier if there is any net-seller of the input within the
firm, irrespective of the scale of the net-seller. We find that the unweighted
average of within-firm sourcing is around 37% while the weighted average is
around 19%. This likely represents a lower bound on the amount of within-firm
sourcing.

5.2.4 Within Firm Downstream Selling


We also consider the utilization of vertical production networks from the
perspective of an upstream supplier. We present analogous results to the baseline,
except that we now measure, among output for which a vertically integrated
downstream buyer exists, the share of output value that is sold within the firm.
These results would diverge from the baseline if the size distributions of
upstream and downstream establishments differ.

In Table 11, we report that the share of total sales that takes place within firms,
out of the total potential sales that can take place within the firm ranges from
29% to 38% for different specifications. We also report the unweighted average
in Table 12 which ranges from 42% to 48%.

5.2.5 Firm Level Utilization


Up to this point, we have treated each establishment within a firm indepen-
dently. We also consider sourcing at the firm level, dividing vertically integrated
firms into firms that source at least one product from within and those which
do not source any product from within. We present our results in Table 10. Vertically
integrated firms which source at least one product from within account for 72%
to 76% of economic activity amongst all vertically integrated firms. We also
present the unweighted share of firms in Table A8.

5.2.6 Results by Product Category


We report the within-firm numbers for 21 aggregated product categories in
Table 14. We find substantial variation, with the within-firm share ranging from 10%
to 90%. Looking at the weighted average, we find that mineral products
and wood/cork/straw articles are least likely to be sourced from within, while
precious metals and stones and arms and ammunition are most likely to be
sourced from within.

We also report the unconditional share of within-firm sourcing in the table. This
is the amount of trade for the product category as a proportion of total trade
in the economy. The number ranges from 0.02 for wood/cork/straw articles to
0.19 for precious metals and stones.

5.2.7 Robustness to Product Code Level
For most of our specifications, we use 4 digit HS codes to define our product
categories. However, there may be a concern that defining the product at 4
digit level is too broad. For example, we may wrongly classify a firm as having
an integrated supplier by looking at the 4 digit level, if products within a given
4 digit category are not substitutable. This would bias our results downward.
To see how much of a concern this is, we repeat our baseline analysis using the
observations for which we have 8 digit HS codes. As larger firms are more likely
to report 8 digit HS codes, selection into this subsample is endogenous.

We report our results in Tables A9 and A10. The share of within-firm sourcing,
when weighted by value, lies between 41% and 46%, while the unweighted number lies
between 30% and 32%. These numbers are similar to our earlier results, suggesting
that product definition is unlikely to be a big concern.

6 Results: When Do Firms Utilize Their Vertically Integrated Networks?
We have established that firms utilize their internal networks. Next, we explore
this intensive margin decision that firms make about whether to utilize their
existing within-firm production links. This is an important margin that many
firms face: 60 percent of economic activity takes place in firms that are vertically
integrated in at least one product. In this section, we look at what factors are
associated with a firm’s decision to source from within given that a potentially
vertically integrated supplier exists.

6.1 Distance
In Table 15 we explore how distance affects the firm’s decision to source from
within. In columns (1) and (2) we find that doubling the distance from where
the input is sourced decreases within-firm shipments by around 5 percentage
points. Thus, input suppliers outside the firm are on average located further
than the input suppliers within the firm. This is consistent with different estab-
lishments within a firm locating close to each other. Looking at firm dispersion,
we compute a measure of how dispersed a firm is, i.e., for a given establishment
what is the average distance to all other establishments within the firm. In
columns (3) and (4) we find that firms that are more geographically dispersed
are more likely to source from outside.

In columns (5) and (6), we look at the impact of distance to net-sellers within
and outside the firm. We find that doubling the average distance to integrated
sellers reduces within-firm sourcing by around 5 p.p., while doubling the average
distance to outside firm sellers increases the within-firm sourcing by 7 - 12 p.p.
The attenuation in sourcing with increasing distance is higher for outside firm
suppliers, i.e., firms are less elastic to distance for within-firm suppliers.

6.2 Relationship Specificity


In Table 16, we look at how product relationship specificity drives a firm’s de-
cision to source from within. Using Rauch (1999)’s classification, we find that
if a product is listed on an exchange, it is 7 - 9 p.p. less likely to be sourced
from within. Thus, in line with theoretical predictions, firms are more likely to
source more specific products from within the firm.

In columns (2) and (4) we broaden the definition of non-specific products to
include products that are either listed on an exchange or are reference priced.
The omitted category, in this case, is differentiated products. For this definition,
we find that the firm is between 4 - 5 p.p. less likely to source from within. The
reduction in coefficient size is consistent with products that are reference-priced
but not traded on exchanges being more ‘specific’ than products that are traded
on exchanges.

6.3 Frequency and Volume


In Table 17, we find that products which are sourced more frequently tend to
be sourced from within-firm suppliers. Going from sourcing the product once
per quarter to monthly increases the within-firm sourcing by 7 p.p. This can
be due to multiple reasons. If there are contracting costs associated with every
shipment, then it may make sense to source from within. The product can also
be a crucial input into the production process and may lead to bottlenecks in
production in case of delays.

We also explore the impact of the number of shipments in Table 18, and find a
positive effect. Doubling the number of shipments increases within-firm sourcing
by 2.6 - 5 p.p.

6.4 Competition
In Table 19 we explore how upstream and downstream competition affects
within-firm sourcing. As outlined previously, upstream competition has an am-
biguous prediction on the level of within-firm sourcing. On the one hand, it
increases the options available to the downstream establishment, making it less
likely that the product will be sourced from within. On the other hand, in the
face of higher competition, the downstream establishment is a captive buyer
for the upstream establishment, which may increase the amount of within-firm
sourcing. Empirically, we find that higher upstream market concentration
increases within-firm sourcing. A 0.1 increase in the HHI (that is, less upstream
competition) increases within-firm sourcing by 4 p.p.

In columns (3) and (4), we look at the impact of downstream competition.
We find that if the downstream firm faces a more competitive environment,
it reduces within-firm sourcing. One potential mechanism is that competition
forces the firm to look for better inputs in terms of quality which are more likely
available outside. However, we do not test for the potential mechanisms that
can be at play in generating the result.

6.5 R&D
In Table 20 we look at whether firms are more likely to source products that
are R&D intensive from within. Highly R&D-intensive products may require
upfront investment by the supplier. However, a non-integrated supplier does not
internalize all of the benefits of this investment. This makes it more likely that
such products will be sourced from within. Using the measure of R&D intensity
from Nunn and Trefler (2013), we find that products which are more intensive in
R&D requirement are more likely to be sourced from within. Doubling the R&D
requirement increases the share of within-firm sourcing by about 8 percentage
points.

6.6 Firm Scale


Finally, we look at the impact of firm scale on within-firm sourcing in Table
21. Larger firms have more opportunities to source from within and as they
operate at a larger scale, stand to benefit more from sourcing from within. It
is also likely that larger firms have greater managerial capacity, and therefore,
are more likely to exploit the opportunity to source from within. On the other
hand, larger firms are likely to have lower costs of contracting, as they may have
easier access to the legal system in the case of a dispute.

Empirically, we find that larger firms, measured in terms of value of shipments,
number of products or number of locations, tend to source more from within.
Doubling the value of shipments increases within-firm sourcing by 0.7
p.p., doubling the number of products that the firm deals in increases it by 1.6
p.p. and doubling the number of locations that the firm operates in increases it
by 5 percentage points.

7 Firm Integration Decision


Our results in the previous section focused on the decision to source conditional
on the firm ownership structure. In this section, we explore what is associated
with the existence of a within-firm supplier, i.e., what firm and product
characteristics are associated with the decision to own a vertically integrated
upstream firm. We report results in Tables 22 and 23.

Table 22 reports the establishment-product level regression results with an
indicator for the existence of a vertically integrated seller as the outcome variable.
The mean value of the outcome variable is 0.10. We find that doubling the total
value of trade for a product increases the probability of existence of an
integrated supplier by 0.7 percentage points. Doubling the number of shipments
made increases the probability by 2.5 p.p. If a product is not relationship-specific,
i.e., if it is listed on an exchange, the probability that there will be an
integrated seller goes down by 1.1 p.p. Market competition is associated with a
lower probability of having an integrated supplier: an increase in upstream
and weighted downstream HHI of about 0.1 increases the probability of existence
by 1.9 and 3.4 p.p., respectively. We do not find a significant coefficient
for distance to outside sellers. Finally, doubling R&D intensity is associated
with a 10 p.p. increase in the probability of an integrated supplier existing.

Above we reported that higher trade in a product is associated with a higher
probability of having an integrated seller. We also verify this at the firm level.
At the firm level, we aggregate the existence of integrated sellers across the different
product categories that the firm operates in by taking a weighted average. In
Table 23, we show that larger firms, in terms of total inward shipping value, number
of products and number of locations operated in, are more likely to have an
integrated seller. Doubling these measures increases the probability of owning
a vertically integrated supplier by 0.3, 0.7 and 4.5 p.p., respectively.

8 How Valuable are Vertical Links?


The results presented in the previous sections suggest that the benefits from
vertical integration are substantial. In this section, we develop a model of firm
sourcing and estimate an establishment-input level gravity specification to quan-
tify the extent to which firm boundaries are barriers to trade relative to distance
and state borders.

Our identification comes from revealed preference, where establishments trade
off within-firm ownership against distance and being in the same state. Specifically,
from observing bilateral trade flows, we measure how much more likely a given
establishment is to source from a vertically integrated supplier, relative to a
geographically close supplier or a within-state supplier.

8.1 Model of Sourcing


Establishment j decides to source some unit value input from K potential
suppliers. The cost minimization problem of the establishment is:

$$ \min_{k} \; \tau_{jk} + c_{jk} + \epsilon_{jk}, $$

where $\tau_{jk}$ captures the transportation costs associated with sourcing from
supplier k, $c_{jk}$ is the contracting cost, and $\epsilon_{jk}$ is an
establishment-supplier specific idiosyncratic shock. We assume that $\epsilon_{jk}$
follows an EV1 distribution, yielding the following expression for the share of
input establishment j sources from k:

$$ E[X_{jk}/X_{j}] = \frac{\exp(\tau_{jk} + c_{jk})}{\sum_{k'} \exp(\tau_{jk'} + c_{jk'})} \qquad (1) $$
We parameterize $\tau_{jk}$ as follows:

$$ \tau_{jk} = \alpha_{\tau 0} + \alpha_{\tau 1} \log(distance_{jk}) + \alpha_{\tau 2} 1_{jk}(withinfirm) \cdot \log(distance_{jk}) + \alpha_{\tau 3} 1_{jk}(withinstate) \cdot \log(distance_{jk}) \qquad (2) $$

where $distance_{jk}$ is the distance in meters from establishment j to k,
$1_{jk}(withinfirm)$ is an indicator for whether the supplying establishment belongs
to the same firm as establishment j, and $1_{jk}(withinstate)$ is an indicator for
whether the supplying establishment is located within the same state.

We parameterize cjk as follows:

$$ c_{jk} = \alpha_{c0} + \alpha_{c2} 1_{jk}(withinfirm) + \alpha_{c3} 1_{jk}(withinstate). $$


Therefore, sourcing from within the firm reduces both the per-unit distance
cost of sourcing and the fixed contracting cost. Substituting the parameterized
expressions into Equation 1 we get our estimating equation,

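$$ E[X_{jk}/X_{j}] = \exp\big(\alpha_{0} + \alpha_{1} \log(distance_{jk}) + \alpha_{2} 1_{jk}(withinfirm) \cdot \log(distance_{jk}) + \alpha_{3} 1_{jk}(withinstate) \cdot \log(distance_{jk}) + \alpha_{4} 1_{jk}(withinfirm) + \alpha_{5} 1_{jk}(withinstate) + \gamma_{j}\big) \qquad (3) $$

where $\gamma_{j}$ is a fixed effect for the buying establishment, which absorbs the
denominator in Equation 1.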
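
To make the share formula in Equation 1 concrete, the sketch below evaluates it for three hypothetical suppliers. The coefficient values are assumptions for illustration only: they echo the rounded baseline estimates quoted in Section 8.2 with the interaction terms set to zero, and the supplier distances and indicators are invented.

```python
import math

# Illustrative coefficients (assumed); the buyer fixed effect cancels in the shares.
COEF = {"log_distance": -0.12, "within_firm": 4.6, "within_state": 1.3}


def index_value(distance_m, within_firm, within_state, coef=COEF):
    """Linear index inside exp(.), with the interaction coefficients set to 0."""
    return (coef["log_distance"] * math.log(distance_m)
            + coef["within_firm"] * within_firm
            + coef["within_state"] * within_state)


def sourcing_shares(indices):
    """Equation 1: exp(v_k) divided by the sum of exp(v_k') over all suppliers."""
    top = max(indices)  # subtracting the maximum avoids overflow
    weights = [math.exp(v - top) for v in indices]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical suppliers: (distance in meters, same firm?, same state?)
suppliers = [(50_000, 1, 1), (50_000, 0, 1), (400_000, 0, 0)]
shares = sourcing_shares([index_value(*s) for s in suppliers])
print([round(s, 4) for s in shares])
```

Because the first two suppliers differ only in ownership, the ratio of their shares equals the exponential of the within-firm coefficient.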

8.2 Estimation
We estimate Equation 3 with a multinomial pseudo maximum likelihood estima-
tor, implemented via a Poisson regression. For each downstream establishment-
input pair, we define the set of potential suppliers to be all “net sellers” of the
input which operate at a sufficient scale relative to the demand of the downstream
establishment. As before, we define “net sellers” of the input to be
establishments with total outward shipment value exceeding 1.2 times their total inward
shipment value, and the “sufficient scale” condition requires that total sales of
the input be at least 50% of what the downstream establishment buys.
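
The construction of the candidate supplier set can be sketched as below. This is a hedged illustration: the establishment names and values are hypothetical, and "total sales" is assumed to mean total outward shipment value of the input.

```python
def potential_suppliers(candidates, buyer_purchase_value,
                        net_seller_multiple=1.2, scale_share=0.5):
    """Return names of establishments that are net sellers at sufficient scale.

    candidates: iterable of (name, outward_value, inward_value) for one input.
    """
    selected = []
    for name, outward_value, inward_value in candidates:
        is_net_seller = outward_value > net_seller_multiple * inward_value
        has_scale = outward_value >= scale_share * buyer_purchase_value
        if is_net_seller and has_scale:
            selected.append(name)
    return selected


# Hypothetical establishments trading one input
candidates = [
    ("mill_a", 900.0, 100.0),    # net seller and large enough
    ("trader_b", 500.0, 480.0),  # outward value below 1.2 times inward value
    ("shop_c", 40.0, 0.0),       # net seller, but too small for this buyer
]
chosen = potential_suppliers(candidates, buyer_purchase_value=200.0)
print(chosen)
```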

In our baseline specification, we fix α2 and α3, the coefficients on the
distance-within firm and within state-within firm interaction terms, to be equal to 0. We
present our estimates in Table 24. We estimate that the elasticities of bilateral
trade flows with respect to distance and being in the same state are -0.12 and
1.3, respectively. The elasticity with respect to within-firm ownership is 4.6. To
interpret the coefficients, we consider the average establishment in our sample
which has many potential suppliers for each of its inputs and a baseline share
of sourcing with any given supplying establishment at 0.02%. A one standard
deviation reduction in distance from the supplying establishment increases the
share of sourcing to 0.026%. Moving the supplying establishment within state
increases the share of sourcing to 0.07%. Vertical ownership increases the share
of sourcing to 2.03%. Vertical integration of a supplying establishment increases
the share of sourcing much more than either a reduction in distance or being
within the same state indicating that firm boundaries are particularly important
barriers to trade in our setting.
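
The within-state and within-firm figures above can be approximately reproduced from the rounded coefficients. The sketch assumes that, because the baseline share is very small, the share simply scales by the exponential of the coefficient (the change in the denominator of Equation 1 is ignored). With the rounded estimates this gives about 0.07% and roughly 2%, close to the reported 0.07% and 2.03%; the small gap presumably reflects rounding of the coefficients.

```python
import math

baseline_share_pct = 0.02  # baseline share with a given supplier, in percent
coef_within_state = 1.3    # rounded estimate quoted in the text
coef_within_firm = 4.6     # rounded estimate quoted in the text


def scaled_share(share_pct, coefficient):
    """Scale a small baseline share by exp(coefficient)."""
    return share_pct * math.exp(coefficient)


share_within_state = scaled_share(baseline_share_pct, coef_within_state)
share_within_firm = scaled_share(baseline_share_pct, coef_within_firm)
print(round(share_within_state, 3), round(share_within_firm, 2))
```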

In Table 25, we estimate α2 and α3, the coefficients on the distance-within firm
and within state-within firm interaction terms. We find that α2 is positive.
This indicates that the relationship between vertical integration and
the trade volume is stronger for more distant locations. Similarly, we find that
α3 is positive, suggesting that the relationship between vertical
integration and the trade volume is stronger for within-state sourcing.

We also consider the analogous decision to sell to potential buyers from the per-
spective of an upstream supplier. In Appendix Tables A11 and A12, we present
our estimates and find similar results.

Atalay, Hortaçsu, Li, et al. (2019) estimate a similar gravity specification with
data on seller-destination trade flows and find that the elasticity of bilateral
trade flows with respect to the addition of a same-firm establishment in a
destination is 0.89.⁷ Their main result is that having an additional vertically
integrated establishment in a given destination ZIP code has the same effect on
shipment volumes as a 40% reduction in distance. We replicate their
specification with our data in Appendix Table A13. In our setting, the elasticity of
bilateral trade flows with respect to operating a same-firm establishment in the
destination is 0.97.⁸ This implies that having a vertically integrated
establishment in a given destination has the same effect on shipment volumes as a 91%
reduction in distance. This is substantially larger than that found by Atalay,
Hortaçsu, Li, et al. (2019) in the US context.

⁷ Calculated by multiplying the same-firm ownership share coefficient (2.828) from a Poisson
regression and $\frac{1}{1+r}$, where r is the average number of potential recipients in a
destination (0.315).
⁸ Calculated by multiplying the same-firm ownership share coefficient (2.411) from a Poisson
regression and the average same-firm ownership share given that a vertically integrated firm
exists in the destination (0.402). Following Atalay, Hortaçsu, Li, et al. (2019), we carry out
a simple calculation to compute the magnitude relative to distance:
$\exp\left(\frac{0.402 \times 2.411}{-0.401}\right)$, where -0.401 is the coefficient on
log(distance).
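
The arithmetic behind the 0.97 elasticity and the 91% distance equivalent can be checked with the three numbers quoted in the footnote. The variable names are ours; the calculation is the one described there.

```python
import math

coef_same_firm_share = 2.411  # Poisson coefficient on same-firm ownership share
avg_same_firm_share = 0.402   # average share when an integrated firm exists
coef_log_distance = -0.401    # coefficient on log(distance)

elasticity = coef_same_firm_share * avg_same_firm_share
# Ratio of distances that produces the same change in shipment volume
distance_ratio = math.exp(elasticity / coef_log_distance)
equivalent_reduction = 1.0 - distance_ratio
print(round(elasticity, 2), round(equivalent_reduction, 2))
```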

9 Conclusion
We use administrative data from a large state in India on the movement of phys-
ical inputs both within and between firms. We make five main contributions.

First, we map out the supply network within and across the firms, focusing on
the dimension of vertical flow of physical inputs. We document that approxi-
mately 11% of inputs, by value, can be sourced from within under the current
ownership structure.

Second, we measure how much of this trade materializes and find that around 30
- 40% of the potential trade takes place. This number is economically significant,
suggesting that sourcing of physical inputs from within the firm is important.
The result is robust to a variety of sampling and measurement choices.

Third, we explore the factors that are associated with the decision to source
from within. We find that distance to suppliers within and outside the firm,
frequency and volume of product requirement, market competition in upstream
and downstream industries, product specificity, product R&D intensity and
firm scale all matter in explaining the decision.

Fourth, we look at the extensive margin decision and see what is associated with
the existence of an integrated supplier. We find that larger firms are more likely to
have an integrated supplier. At the product level, larger trade value in the
product, higher specificity, higher R&D intensity, and lower competition all increase
the probability of having an integrated supplier.

Finally, we estimate an establishment-level gravity specification to assess the
importance of integration for trade. We find that while both distance and state
borders are important, integration emerges as the strongest driver of sourcing
decisions.

References
Agarwal, Sumit et al. (June 2019). “Tax-Pass-through, Pricing Strategy and
Consumer Spending Dynamics: The Indian GST Experience”. In: SSRN
Electronic Journal.
Anderson, James E. and Eric Van Wincoop (Sept. 2004). “Trade costs”. In:
Journal of Economic Literature 42.3, pp. 691–751.

Atalay, Enghin, Ali Hortaçsu, Mary Jialin Li, et al. (Nov. 2019). “How Wide Is
the Firm Border?” In: The Quarterly Journal of Economics 134.4, pp. 1845–
1882.
Atalay, Enghin, Ali Hortaçsu, and Chad Syverson (2014). “Vertical integration
and input flows”. In: American Economic Review 104.4, pp. 1120–1148.
Boehm, Johannes and Ezra Oberfield (2018). “Misallocation in the Market for
Inputs: Enforcement and the Organization of Production”. In: National Bu-
reau of Economic Research Working Paper Series No. 24937.
Bresnahan, Timothy and Jonathan Levin (2012). “Vertical Integration and Mar-
ket Structure”. In: NBER Working Papers.
Coase, R. H. (Nov. 1937). “The Nature of the Firm”. In: Economica 4.16,
pp. 386–405.
Feenstra, Robert C and Gordon H Hanson (1996). “Globalization, Outsourcing,
and Wage Inequality”. In: The American Economic Review 86.2, pp. 240–
245.
Grossman, Sanford and Oliver Hart (Aug. 1986). “The Costs and Benefits of
Ownership: A Theory of Vertical and Lateral Integration”. In: Journal of
Political Economy 94.4, pp. 691–719.
Hart, Oliver and John Moore (1990). Property Rights and the Nature of the
Firm. Tech. rep. 6, pp. 1119–1158.
Klein, Benjamin, Robert G. Crawford, and Armen A. Alchian (Oct. 1978). “Ver-
tical Integration, Appropriable Rents, and the Competitive Contracting Pro-
cess”. In: The Journal of Law and Economics 21.2, pp. 297–326.
Leemput, Eva Van (2020). “A Passage to India: Quantifying Internal and Ex-
ternal Barriers to Trade”.
Lucas, Robert E (1988). “On the Mechanics of Economic Development”. In:
Journal of Monetary Economics 22, pp. 3–42.
Melitz, Marc J. (2003). “The Impact of Trade on Intra-Industry Reallocations
and Aggregate Industry Productivity”. In: Econometrica 71.6, pp. 1695–
1725.
Novak, Sharon and Scott Stern (2009). “Complementarity Among Vertical In-
tegration Decisions: Evidence from Automobile Product Development”. In:
Management Science 55.2, pp. 311–332.
Nunn, Nathan and Daniel Trefler (Oct. 2013). “Incomplete contracts and the
boundaries of the multinational firm”. In: Journal of Economic Behavior
and Organization 94, pp. 330–344.
Perry, Martin (1989). Chapter 4 Vertical integration: Determinants and effects.
Ramondo, Natalia, Veronica Rappoport, and Kim J. Ruhl (Jan. 2016). “In-
trafirm trade and vertical fragmentation in U.S. multinational corporations”.
In: Journal of International Economics 98, pp. 51–59.
Rauch, James E. (June 1999). “Networks versus markets in international trade”.
In: Journal of International Economics 48.1, pp. 7–35.
Rey, Patrick and Jean Tirole (2007). Chapter 33 A Primer on Foreclosure.
Spengler, Joseph J (1950). Vertical Integration and Antitrust Policy. Tech. rep.
4, pp. 347–352.

Stigler, George J (1951). The Division of Labor is Limited by the Extent of the
Market. Tech. rep. 3, pp. 185–193.
Williamson, O E (1985). The Economic Institutions of Capitalism: Firms, Mar-
kets, Relational Contracting. Free Press.
Williamson, Oliver (1975). Markets and hierarchies, analysis and antitrust im-
plications : a study in the economics of internal organization. Free Press,
p. 286.
Williamson, Oliver E (1971). “The Vertical Integration of Production: Market
Failure Considerations”. In: American Economic Review 61.2, pp. 112–123.
— (Oct. 1979). “Transaction-Cost Economics: The Governance of Contractual
Relations”. In: The Journal of Law and Economics 22.2, pp. 233–261.

10 Tables

Table 1: Sample Description

Variable                     Count
Firms                    1,191,778
Firm-Locations           2,094,156
Firm-Location-Products   9,054,299
Products                     1,295
Locations                   35,026
Locations in State           6,938
Districts in State              40
Notes: This table provides counts for each of the units in our data. Firms are a tax
identification number (GSTIN), locations are zipcodes (6 digit PIN codes), products are 4
digit HS product codes, and districts are 3-digit PIN codes.

Table 2: Descriptive Statistics: Firm Level

Statistic                                    Mean          St. Dev.   Pctl(25)     Median     Pctl(75)
Number of Locations                          1.76             19.78          1          1            1
Number of Products                           5.34             12.17          1          2            5
Number of Shipments                        186.63         12,504.52       2.00       7.00        34.00
Average Value of Shipments             202,291.10        368,527.30  50,352.47  89,000.00   187,995.50
Total Value of Inward Shipments     15,820,502.00  1,016,284,634.00     50,000  339,871.5  2,283,888.0
Total Value of Outward Shipments    15,818,995.00  1,216,740,171.00          0          0    735,155.2

Notes: This table provides descriptive statistics for the data aggregated to a firm level, where a
firm is identified by a unique tax id (GSTIN). We report the average number of locations the firm
operates in, the average number of product categories that the firm trades, the average number of
shipments, the average value for each shipment and average value of inward and outward shipments.
The numbers are reported in the local currency, INR.

Table 3: Descriptive Statistics: Product Level

Statistic                                        Mean           St. Dev.           Pctl(25)             Median           Pctl(75)
Total Value of Inward Shipments     13,271,724,237.00  39,532,758,125.00     201,668,508.00   1,556,990,204.00   8,291,778,989.00
Total Value of Outward Shipments    13,271,724,237.00  39,532,758,125.00     201,668,508.00   1,556,990,204.00   8,291,778,989.00
Number of Firms                              4,824.37           8,816.63              365.5              1,541              5,039
Number of Firm-Locations                     1,310.88           1,439.70              279.5                808            1,816.5
Number of Buying Firm-Locations              5,545.84          11,001.99              320.5              1,543            5,487.5
Number of Selling Firm-Locations             1,325.57           2,367.00              138.5                477            1,433.5
Number of In-Shipments                      88,434.74         271,974.10           1,636.00          11,951.00          63,445.00
Number of Out-Shipments                     80,958.96         207,496.70           1,618.00          11,743.00          59,994.50

Notes: This table provides descriptive statistics aggregated to the product level. The product is
defined as a 4 digit HS product code. Total Value of Inward Shipments is defined as the sum of all
inward shipment values for that product. Total Value of Outward Shipments is defined as the sum
of all outward shipment values for that product. Firms are identified by a unique tax id.
Firm-locations are a tax id - zip code pair. We define buying and selling firm-locations according
to Section 3. The Number of In/Out Shipments is the count of shipments for each product.

Table 4: Descriptive Statistics: Buyers and Sellers - Product-Establishment
Level

Statistic Mean St. Dev.


Net Seller (1.0) 0.20 0.40
Net Buyer (1.0) 0.80 0.40
Net Seller (1.2) 0.19 0.39
Net Buyer (1.2) 0.79 0.41
Net Seller (1.5) 0.18 0.39
Net Buyer (1.5) 0.79 0.41
Notes: This table reports the share of product - establishment pairs which are net sellers
and net buyers in the data according to our definition from Section 3. An establishment is
defined to be a Net Buyer (1/1.2/1.5), if the total in-value of the product is at least
1/1.2/1.5 times the total out value of the product. An establishment is defined to be a Net
Seller (1/1.2/1.5), if the total out-value of the product is at least 1/1.2/1.5 times the total
in-value of the product.
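
The net buyer and net seller rule described in these notes can be sketched in a few lines. This is only an illustration of the stated thresholds, not the authors' code: the function name `classify` and the treatment of ties and zero values are assumptions.

```python
def classify(in_value, out_value, threshold=1.2):
    """Return the labels that apply to one product-establishment pair.

    in_value / out_value: total inward / outward shipment value of the product.
    threshold: 1.0, 1.2 or 1.5, as in the table rows.
    Assumption: both labels can apply when the two values are equal
    and the threshold is 1.0; the notes do not say how ties are handled.
    """
    labels = set()
    if in_value >= threshold * out_value:
        labels.add("net_buyer")
    if out_value >= threshold * in_value:
        labels.add("net_seller")
    return labels


# An establishment receiving 130 and shipping 100 of a product is a
# net buyer at the 1.2 threshold; at 110 vs 100 it is neither.
print(classify(130, 100, 1.2))
print(classify(110, 100, 1.2))
```

Raising the threshold can only remove labels, which is consistent with the shares in the table falling slightly from the 1.0 to the 1.5 rows.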

Table 5: Descriptive Statistics: Number of Input and Outputs - Firm Level

Statistic Mean St. Dev.


Net Seller (1.0) 1.33 5.60
Net Buyer (1.0) 7.09 14.46
Net Seller (1.2) 1.41 5.80
Net Buyer (1.2) 7.04 14.40
Net Seller (1.5) 1.26 5.41
Net Buyer (1.5) 6.98 14.34
Notes: This table reports the average number of products for which a firm is a buyer or seller
according to our various definitions from Section 3. An establishment is defined to be a Net
Buyer (1/1.2/1.5), if the total in-value of the product is at least 1/1.2/1.5 times the total
out value of the product. An establishment is defined to be a Net Seller (1/1.2/1.5), if the
total out-value of the product is at least 1/1.2/1.5 times the total in-value of the product.

Table 6: Potential Within Firm Sourcing

Variable                          All   Multi-Establishment Firms
Upstream Integrated: Large        0.10  0.12
Upstream Integrated: Total        0.11  0.13
Upstream Integrated: Indicator    0.25  0.30
Downstream Integrated: Large      0.06  0.07
Downstream Integrated: Total      0.10  0.11
Downstream Integrated: Indicator  0.46  0.53
Notes: This table reports the total volume of trade that can be conducted within firm
boundaries. Different rows correspond to different definitions of a downstream establishment
having an integrated upstream supplier. See Section 3. Column 1 reports the volume as
a proportion of total trade. Column 2 conditions on the set of firms which are
multi-establishment.

Table 7: Share of Vertically Integrated Firms

Integration Definition 1 1.2 1.5


Exists 0.65 0.63 0.62
Large: 0.5 0.62 0.61 0.60
Large: 1.0 0.61 0.59 0.58
Total: 0.5 0.62 0.61 0.60
Total: 1.0 0.61 0.59 0.58
Notes: This table reports the weighted share of economic activity which happens within
vertically integrated firms, as a share of total firm activity. Different rows correspond to
different definitions of vertical integration, as defined in Section 3. Large 0.5/1.0 requires
that an individual vertically integrated supplier must have output greater than 0.5/1.0 times
the downstream buyer’s inputs. Total 0.5/1.0 requires that all vertically integrated suppliers
must have total output greater than 0.5/1.0 times the downstream buyer’s inputs. Each
column reports a different threshold requirement for being a buyer or a seller. An
establishment is defined to be a Net Buyer (1/1.2/1.5), if the total in-value of the product is
at least 1/1.2/1.5 times the total out value of the product. An establishment is defined to be
a Net Seller (1/1.2/1.5), if the total out-value of the product is at least 1/1.2/1.5 times the
total in-value of the product.
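
The "Large" and "Total" scale requirements in these notes can be written as two simple checks. The sketch below is an illustration under stated assumptions: the function name `integration_flags` and its inputs are hypothetical, and "Exists" is taken to mean that at least one same-firm net seller exists with no scale requirement.

```python
def integration_flags(supplier_outputs, buyer_input, scale=0.5):
    """Flag whether a buyer has an integrated supplier under each definition.

    supplier_outputs: output values of the same-firm net sellers of the product.
    buyer_input: the downstream buyer's input value of that product.
    scale: 0.5 or 1.0, as in the table rows.
    """
    # Exists: any same-firm supplier at all (assumed: no scale requirement).
    exists = len(supplier_outputs) > 0
    # Large: one individual supplier is big enough on its own.
    large = any(out > scale * buyer_input for out in supplier_outputs)
    # Total: the suppliers are big enough when their output is pooled.
    total = sum(supplier_outputs) > scale * buyer_input
    return {"exists": exists, "large": large, "total": total}


# Two suppliers producing 30 each cannot individually cover half of an
# input need of 100, but together they can.
print(integration_flags([30, 30], 100, scale=0.5))
```

With non-negative outputs, "Large" implies "Total" and "Total" implies "Exists", which matches the weakly declining shares down the rows of the table.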

Table 8: Share of Within Firm Upstream Sourcing: Weighted

Integration Definition 1 1.2 1.5


Large: 0.5 0.32 0.30 0.30
Large: 1.0 0.34 0.33 0.33
Total: 0.5 0.31 0.30 0.31
Total: 1.0 0.34 0.32 0.33
Notes: This table reports the share of sourcing that takes place within the firm conditional
on a vertically integrated upstream firm existing for various definitions and thresholds, as
defined in Section 3. We weight by total inward shipment value. Each row reports the share
according to a different definition for integration or for a potential vertically integrated
supplier existing. Large 0.5/1.0 requires that an individual vertically integrated supplier
must have output greater than 0.5/1.0 times the downstream buyer’s inputs. Total 0.5/1.0
requires that all vertically integrated suppliers must have total output greater than 0.5/1.0
times the downstream buyer’s inputs. Each column reports a different threshold requirement
for being a buyer or a seller. An establishment is defined to be a Net Buyer (1/1.2/1.5), if
the total in-value of the product is at least 1/1.2/1.5 times the total out value of the
product. An establishment is defined to be a Net Seller (1/1.2/1.5), if the total out-value of
the product is at least 1/1.2/1.5 times the total in-value of the product.

Table 9: Share of Within Firm Upstream Sourcing: Unweighted

Integration Definition 1 1.2 1.5


Large: 0.5 0.40 0.40 0.40
Large: 1.0 0.39 0.39 0.39
Total: 0.5 0.40 0.40 0.40
Total: 1.0 0.39 0.39 0.39
Notes: This table reports the share of sourcing that takes place within the firm conditional
on a vertically integrated upstream firm existing for various definitions and thresholds, as
defined in Section 3. We compute unweighted shares. Each row reports the share according
to a different definition for integration or for a potential vertically integrated supplier
existing. Large 0.5/1.0 requires that an individual vertically integrated supplier must have
output greater than 0.5/1.0 times the downstream buyer’s inputs. Total 0.5/1.0 requires
that all vertically integrated suppliers must have total output greater than 0.5/1.0 times the
downstream buyer’s inputs. Each column reports a different threshold requirement for being
a buyer or a seller. An establishment is defined to be a Net Buyer (1/1.2/1.5), if the total
in-value of the product is at least 1/1.2/1.5 times the total out value of the product. An
establishment is defined to be a Net Seller (1/1.2/1.5), if the total out-value of the product
is at least 1/1.2/1.5 times the total in-value of the product.
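
The gap between the weighted and unweighted shares comes from how product-establishment pairs are averaged. A minimal sketch, assuming the weighted share is a mean of pair-level shares weighted by total inward shipment value (the function name and input layout are hypothetical):

```python
def within_firm_shares(pairs):
    """Compute unweighted and value-weighted within-firm sourcing shares.

    pairs: list of (within_firm_value, total_inward_value), one entry per
    product-establishment pair that has an integrated upstream supplier.
    Pairs with zero inward value are skipped.
    """
    valid = [(w, t) for w, t in pairs if t > 0]
    # Unweighted: simple mean of the pair-level shares.
    unweighted = sum(w / t for w, t in valid) / len(valid)
    # Weighted by inward value: equals total within-firm value over total value.
    weighted = sum(w for w, _ in valid) / sum(t for _, t in valid)
    return unweighted, weighted


# A small pair sourcing everything internally and a large pair sourcing
# nothing internally give a high unweighted but low weighted share.
print(within_firm_shares([(10, 10), (0, 90)]))
```

An unweighted share above the weighted one, as in the tables, would under this reading indicate that smaller input flows are more likely to be sourced within the firm.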

Table 10: Share of Vertically Integrated Firms That Ship Within Firm:
Weighted

Integration Definition 1 1.2 1.5


Exists 0.72 0.72 0.73
Large: 0.5 0.74 0.75 0.75
Large: 1.0 0.75 0.75 0.76
Total: 0.5 0.74 0.75 0.75
Total: 1.0 0.75 0.75 0.76
Notes: This table reports the weighted share of vertically integrated firms that ship at least
one product within the firm for various definitions, as defined in Section 3. Each row reports
the share according to a different definition for integration or for a potential vertically
integrated supplier existing. Large 0.5/1.0 requires that an individual vertically integrated
supplier must have output greater than 0.5/1.0 times the downstream buyer’s inputs. Total
0.5/1.0 requires that all vertically integrated suppliers must have total output greater than
0.5/1.0 times the downstream buyer’s inputs. Each column reports a different threshold
requirement for being a buyer or a seller. An establishment is defined to be a Net Buyer
(1/1.2/1.5), if the total in-value of the product is at least 1/1.2/1.5 times the total out value
of the product. An establishment is defined to be a Net Seller (1/1.2/1.5), if the total
out-value of the product is at least 1/1.2/1.5 times the total in-value of the product.

Table 11: Share of Within Firm Downstream Sales: Weighted

Integration Definition 1 1.2 1.5


Large: 0.5 0.33 0.32 0.33
Large: 1.0 0.37 0.38 0.38
Total: 0.5 0.29 0.29 0.30
Total: 1.0 0.32 0.33 0.34
Notes: This table reports the share of sales that takes place within the firm conditional on a
vertically integrated downstream firm existing for various definitions and thresholds, as
defined in Section 3. We weight by total inward shipment value. Each row reports the share
according to a different definition for integration or for a potential vertically integrated
buyer existing. Large 0.5/1.0 requires that an individual vertically integrated buyer must
have input greater than 0.5/1.0 times the upstream seller’s outputs. Total 0.5/1.0 requires
that all vertically integrated buyers must have total input greater than 0.5/1.0 times the
upstream seller’s outputs. Each column reports a different threshold requirement for being a
buyer or a seller. An establishment is defined to be a Net Buyer (1/1.2/1.5), if the total
in-value of the product is at least 1/1.2/1.5 times the total out value of the product. An
establishment is defined to be a Net Seller (1/1.2/1.5), if the total out-value of the product
is at least 1/1.2/1.5 times the total in-value of the product.

Table 12: Share of Within Firm Downstream Sales: Unweighted

Integration Definition 1 1.2 1.5


Large: 0.5 0.47 0.47 0.48
Large: 1.0 0.43 0.44 0.44
Total: 0.5 0.47 0.48 0.48
Total: 1.0 0.42 0.42 0.43
Notes: This table reports the share of sales that takes place within the firm conditional on a
vertically integrated downstream firm existing for various definitions and thresholds, as
defined in Section 3. We compute unweighted shares. Each row reports the share according
to a different definition for integration or for a potential vertically integrated buyer existing.
Large 0.5/1.0 requires that an individual vertically integrated buyer must have input greater
than 0.5/1.0 times the upstream seller’s outputs. Total 0.5/1.0 requires that all vertically
integrated buyers must have total input greater than 0.5/1.0 times the upstream seller’s
outputs. Each column reports a different threshold requirement for being a buyer or a seller.
An establishment is defined to be a Net Buyer (1/1.2/1.5), if the total in-value of the
product is at least 1/1.2/1.5 times the total out value of the product. An establishment is
defined to be a Net Seller (1/1.2/1.5), if the total out-value of the product is at least
1/1.2/1.5 times the total in-value of the product.

Table 13: Section Level Descriptive Statistics

Section  Section Name                                                                      Firms    Firm Locations  Volume (INR bn)  Sellers  Buyers   Upstream Integrated: Total
1        Live Animals and Animal Products                                                  18,348   26,926          114              4,791    14,495   459
2        Vegetable Products                                                                95,329   124,722         792              28,113   72,879   2,000
3        Animal or Vegetable Fats and Oils                                                 22,353   32,539          317              4,264    19,424   521
4        Prepared Foodstuffs, Beverages, Spirits and Vinegar, Tobacco                      81,908   128,727         1,029            20,052   67,748   2,264
5        Mineral Products                                                                  124,587  235,038         930              23,437   106,430  2,877
6        Chemicals and Para-Chemical Products                                              206,710  321,905         1,429            57,993   170,206  6,419
7        Plastics and Rubber                                                               240,793  354,105         974              70,453   194,207  6,948
8        Animal Hides and Skins                                                            36,978   57,987          64               10,101   28,201   1,256
9        Wood, Cork, Straw and Articles thereof                                            69,359   104,727         155              21,336   54,961   2,151
10       Pulp of Wood, Paper, Paperboard, and Printed Products                             97,222   147,843         321              26,899   78,604   2,552
11       Textiles                                                                          275,939  362,863         1,826            119,294  193,341  6,473
12       Footgear, Headgear, Umbrellas etc.                                                31,141   47,063          143              11,336   22,207   1,192
13       Articles made of Minerals, Stone, Plaster, Cement and Ceramic and Glass Products  115,944  188,158         244              31,397   92,981   3,355
14       Precious Metals and Stone                                                         8,102    11,912          31               2,859    5,651    318
15       Base Metals and Articles thereof                                                  296,596  501,463         3,171            106,650  235,061  11,361
16       Machinery and Mechanical Appliances                                               423,106  738,885         3,423            140,887  351,028  20,193
17       Vehicles and Transport Equipment                                                  58,825   86,060          1,410            19,154   43,361   1,793
18       Photographic, Music, Medical equipment and Clocks                                 100,466  153,144         381              29,163   81,378   3,575
19       Arms and Ammunitions                                                              1,532    1,939           5                733      878      67
20       Miscellaneous Manufactured Products                                               129,754  209,443         287              36,793   101,509  4,522
21       Art                                                                               96,954   141,341         131              33,957   71,603   2,518
Notes: This table reports descriptive statistics for different aggregated product categories. For these broad categories, we report the number of
firms which either buy or sell the product, the number of establishments that buy or sell the product, the total volume of trade, number of sellers,
and the total number of sellers which are integrated.
Table 14: Section Level Share Within Firm

Section  Section Name                                                                      Share Within Firm - Unweighted  Share Within Firm - Weighted  Share Within Firm - All
1        Live Animals and Animal Products                                                  0.61                            0.41                          0.09
2        Vegetable Products                                                                0.54                            0.42                          0.06
3        Animal or Vegetable Fats and Oils                                                 0.63                            0.29                          0.06
4        Prepared Foodstuffs, Beverages, Spirits and Vinegar, Tobacco                      0.53                            0.39                          0.07
5        Mineral Products                                                                  0.20                            0.27                          0.04
6        Chemicals and Para-Chemical Products                                              0.36                            0.31                          0.05
7        Plastics and Rubber                                                               0.37                            0.30                          0.05
8        Animal Hides and Skins                                                            0.42                            0.28                          0.07
9        Wood, Cork, Straw and Articles thereof                                            0.27                            0.10                          0.02
10       Pulp of Wood, Paper, Paperboard, and Printed Products                             0.43                            0.22                          0.04
11       Textiles                                                                          0.38                            0.40                          0.09
12       Footgear, Headgear, Umbrellas etc.                                                0.40                            0.20                          0.06
13       Articles made of Minerals, Stone, Plaster, Cement and Ceramic and Glass Products  0.20                            0.16                          0.03
14       Precious Metals and Stone                                                         0.40                            0.79                          0.19
15       Base Metals and Articles thereof                                                  0.36                            0.16                          0.02
16       Machinery and Mechanical Appliances                                               0.46                            0.34                          0.07
17       Vehicles and Transport Equipment                                                  0.34                            0.21                          0.03
18       Photographic, Music, Medical equipment and Clocks                                 0.35                            0.24                          0.05
19       Arms and Ammunitions                                                              0.31                            0.90                          0.14
20       Miscellaneous Manufactured Products                                               0.36                            0.21                          0.05
21       Art                                                                               0.26                            0.35                          0.04
Notes: This table reports the share of within firm trade for 21 different aggregated product
categories. Columns (3) and (4) report the share as a proportion of potential within firm
trade that can take place. Column (5) reports the total within firm trade as a proportion of
total trade.
Table 15: Impact of Distance on Within Firm Shipping

Dependent variable: Share of Shipments Within Firm (Percent)

                                              (1)         (2)         (3)         (4)         (5)         (6)
Log Mean Distance                             −4.869∗∗∗   −5.466∗∗∗
                                              (0.033)     (0.033)
Firm Dispersion                                                       −7.444∗∗∗   −7.685∗∗∗
                                                                      (0.036)     (0.035)
Log Average Distance to Within Firm Sellers                                                   −4.833∗∗∗   −4.854∗∗∗
                                                                                              (0.032)     (0.031)
Log Average Distance to Outside Firm Sellers                                                  7.602∗∗∗    12.335∗∗∗
                                                                                              (0.167)     (0.316)
District FE                                   Yes         Yes         Yes         Yes         Yes         Yes
Product FE                                    No          Yes         No          Yes         No          Yes
Observations                                  1,075,224   1,075,224   1,033,873   1,033,873   1,004,406   1,004,406
R2                                            0.152       0.254       0.171       0.269       0.158       0.255
Adjusted R2                                   0.151       0.253       0.171       0.268       0.158       0.254

∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Notes: This table reports estimates from a linear regression of the share of within firm sourcing on various distance measures at the
product-establishment level. The share of within firm sourcing is defined by “Upstream Integrated: Total” with a scale requirement of 0.5 and a
buying/selling threshold of 1.2 according to Section 3. Columns (1) and (2) have self reported shipment distance as the independent variable.
Columns (3) and (4) have firm dispersion as the independent variable. We construct firm dispersion for an establishment as the average distance to
other establishments within the firm (see Section 4). Columns (5) and (6) have the average distance to within and outside firm net-sellers of the
product. All specifications control for district fixed effects; even-numbered columns add product fixed effects.
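
Firm dispersion, as described in the notes, is the average distance from an establishment to the other establishments of the same firm. The sketch below shows one way to compute it. The use of great-circle (haversine) distance between latitude/longitude points is an assumption made for illustration; the notes do not state which distance measure is used, and both function names are hypothetical.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))  # 6371 km: mean Earth radius


def firm_dispersion(index, coords):
    """Average distance from establishment `index` to the firm's other establishments.

    coords: list of (lat, lon) for all establishments of one firm.
    Returns None for single-establishment firms, where dispersion is undefined.
    """
    others = [c for i, c in enumerate(coords) if i != index]
    if not others:
        return None
    return sum(haversine_km(coords[index], c) for c in others) / len(others)


# Three establishments on the equator, one degree of longitude apart.
print(firm_dispersion(0, [(0.0, 0.0), (0.0, 1.0), (0.0, -1.0)]))
```

In the regressions the dispersion measure would then enter as a regressor at the product-establishment level.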
Table 16: Impact of Specificity on Within Firm Shipping

Dependent variable: Share of Shipments Within Firm (Percent)

                                        (1)         (2)         (3)         (4)
Listed on Exchange                      −9.336∗∗∗               −7.168∗∗∗
                                        (0.220)                 (0.219)
Listed on Exchange or Reference Priced              −5.154∗∗∗               −4.117∗∗∗
                                                    (0.119)                 (0.118)
In Total Value                                                  −2.089∗∗∗   −2.091∗∗∗
                                                                (0.015)     (0.015)
District FE                             Yes         Yes         Yes         Yes
Section FE                              Yes         Yes         Yes         Yes
Observations                            1,049,661   1,049,661   1,049,661   1,049,661
R2                                      0.162       0.162       0.177       0.177
Adjusted R2                             0.161       0.161       0.176       0.176

∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Notes: This table reports estimates from a linear regression of the share of within firm
sourcing on various specificity measures at the product-establishment level. The share of
within firm sourcing is defined by “Upstream Integrated: Total” with a scale requirement of
0.5 and a buying/selling threshold of 1.2 according to Section 3. The independent variable
in column (1) is an indicator for whether the product is listed on an exchange. The independent
variable in column (2) is an indicator for whether the product is listed on an exchange or is
reference priced. Columns (3) and (4) add In Total Value as a control. All specifications control for
district and section fixed effects.

Table 17: Impact of Frequency on Within Firm Shipping

Dependent variable: Share of Shipments Within Firm (Percent)

                  (1)         (2)         (3)
Frequency Index   27.259∗∗∗   16.291∗∗∗   12.564∗∗∗
                  (0.170)     (0.164)     (0.163)
District FE       No          Yes         Yes
Product FE        No          No          Yes
Observations      1,075,224   1,075,224   1,075,224
R2                0.023       0.143       0.239
Adjusted R2       0.023       0.142       0.237

∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Notes: This table reports estimates from a linear regression of the share of within firm
sourcing on the time frequency of shipments at the product-establishment level. The share
of within firm sourcing is defined by “Upstream Integrated: Total” with a scale requirement
of 0.5 and a buying/selling threshold of 1.2 according to Section 3. Frequency Index is the
fraction of months with at least one inward shipment for a given product. Columns (2) and
(3) include district fixed effects. Column 3 includes product fixed effects.
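
The Frequency Index defined in the notes is straightforward to compute from shipment dates. A minimal sketch, assuming months are represented as "YYYY-MM" strings and that the length of the sample window (`months_in_sample`) is supplied by the user, since the notes do not state it:

```python
def frequency_index(shipment_months, months_in_sample):
    """Fraction of months with at least one inward shipment of a product.

    shipment_months: one "YYYY-MM" label per inward shipment (repeats allowed).
    months_in_sample: number of months in the observation window (an assumption;
    the window length is not given in the notes).
    """
    if months_in_sample <= 0:
        raise ValueError("months_in_sample must be positive")
    # Several shipments in the same month count once.
    return len(set(shipment_months)) / months_in_sample


# Three shipments falling in two distinct months of a 12-month window.
print(frequency_index(["2018-04", "2018-04", "2018-06"], 12))
```

The index lies between 0 and 1, so the coefficient in column (1) compares an input needed every month with one that is never needed, in percentage points of within-firm share.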

Table 18: Impact of Volume on Within Firm Shipping

Dependent variable: Share of Shipments Within Firm (Percent)

                          (1)         (2)         (3)
Log Number of Shipments   5.073∗∗∗    3.238∗∗∗    2.635∗∗∗
                          (0.028)     (0.027)     (0.028)
District FE               No          Yes         Yes
Product FE                No          No          Yes
Observations              1,095,636   1,095,636   1,095,636
R2                        0.029       0.147       0.243
Adjusted R2               0.029       0.146       0.241

∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Notes: This table reports estimates from a linear regression of the share of within firm
sourcing on the log of the number of individual shipments at the product-establishment level. The
share of within firm sourcing is defined by “Upstream Integrated: Total” with a scale
requirement of 0.5 and a buying/selling threshold of 1.2 according to Section 3. Columns (2)
and (3) include district fixed effects. Column 3 includes product fixed effects.

Table 19: Impact of Competition on Within Firm Shipping

Dependent variable: Share of Shipments Within Firm (Percent)

                          (1)         (2)         (3)          (4)
Upstream HHI              43.768∗∗∗   42.364∗∗∗
                          (0.660)     (0.669)
Weighted Downstream HHI                           −38.911∗∗∗   −36.989∗∗∗
                                                  (1.076)      (1.066)
District FE               Yes         Yes         Yes          Yes
Product Section FE        No          Yes         No           Yes
Observations              1,054,980   1,054,980   607,824      607,824
R2                        0.145       0.166       0.151        0.171
Adjusted R2               0.144       0.165       0.150        0.170

∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Notes: This table reports estimates from a linear regression of the share of within firm
sourcing on upstream and downstream competition at the product-establishment level. The
share of within firm sourcing is defined by “Upstream Integrated: Total” with a scale
requirement of 0.5 and a buying/selling threshold of 1.2 according to Section 3. HHI is the sum
of squared firm out-value shares for each product. Upstream HHI is the HHI for the establishment’s
input product. Weighted Downstream HHI is the mean over the HHIs for all the
establishment’s outputs weighted by out-value. All specifications include district fixed
effects. Columns (2) and (4) include product section fixed effects.
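
The two concentration measures in the notes can be sketched as follows, assuming the standard Herfindahl-Hirschman definition (the sum of squared market shares) computed from firm out-values for a product. The function names and input layouts are hypothetical.

```python
def hhi(out_values):
    """Herfindahl-Hirschman index from the out-values of all firms selling a product.

    Uses the standard definition: the sum of squared shares, between 1/n and 1.
    Returns None if nothing is sold.
    """
    total = sum(out_values)
    if total <= 0:
        return None
    return sum((v / total) ** 2 for v in out_values)


def weighted_downstream_hhi(outputs):
    """Out-value-weighted mean of product HHIs over an establishment's outputs.

    outputs: list of (establishment_out_value, product_hhi), one per output product.
    """
    total = sum(v for v, _ in outputs)
    return sum(v * h for v, h in outputs) / total


# Two equal-sized sellers give an HHI of 0.5; a monopoly gives 1.0.
print(hhi([50, 50]), hhi([100]))
# An establishment selling mostly into the monopolized product market.
print(weighted_downstream_hhi([(30, 1.0), (10, 0.5)]))
```

A higher value means a more concentrated market, so the positive Upstream HHI coefficient reads as more within-firm sourcing when few outside suppliers exist.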

Table 20: Impact of R & D on Within Firm Shipping

Dependent variable: Share of Shipments Within Firm (Percent)

                     (1)         (2)         (3)
R & D Intensity      9.461∗∗∗    7.215∗∗∗    8.031∗∗∗
                     (0.317)     (0.379)     (0.376)
Log In Total Value                           −2.118∗∗∗
                                             (0.015)
District FE          Yes         Yes         Yes
Section FE           No          Yes         Yes
Observations         1,023,113   1,023,113   1,023,113
R2                   0.137       0.157       0.173
Adjusted R2          0.137       0.157       0.172

∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Notes: This table reports estimates from a linear regression of the share of within firm
sourcing on the log of R&D intensity at the product-establishment level. The share of within
firm sourcing is defined by “Upstream Integrated: Total” with a scale requirement of 0.5 and
a buying/selling threshold of 1.2 according to Section 3. All columns contain district fixed
effects. Columns (2) and (3) add product section fixed effects. Column (3) also adds the log of
in total value as a control.

Table 21: Impact of Firm Scale on Within Firm Shipping

Dependent variable: Share of Shipments Within Firm (Percent)

                              (1)        (2)        (3)
Log Total In-Shipment Value   0.708∗∗∗
                              (0.043)
Log Number of Products                   1.609∗∗∗
                                         (0.116)
Log Number of Locations                             5.035∗∗∗
                                                    (0.159)
Observations                  50,994     50,994     50,994
R2                            0.005      0.004      0.019
Adjusted R2                   0.005      0.004      0.019

∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Notes: This table reports estimates from a linear regression of the share of within firm
sourcing on various measures of firm size at the firm level. The share of within firm sourcing
is defined by “Upstream Integrated: Total” with a scale requirement of 0.5 and a
buying/selling threshold of 1.2 according to Section 3. We aggregate to the firm level by
taking a weighted average by total in-value. Total In-Shipment Value is the sum of the value
of all of the firm’s inward shipments. Number of Products is the number of products that
the firm either buys or sells. Number of Locations is the number of zip codes in which the firm
operates. We take logs of all independent variables.

Table 22: Existence of Integrated Seller

Dependent variable: Integrated Seller Exists (Percent)

                             (1)         (2)         (3)         (4)         (5)         (6)         (7)
Log In Total Value           0.690∗∗∗
                             (0.002)
Log Number of Shipments                  2.543∗∗∗
                                         (0.006)
Listed on Exchange                                   −1.171∗∗∗
                                                     (0.047)
Upstream HHI                                                     18.538∗∗∗
                                                                 (0.373)
Weighted Downstream HHI                                                      33.790∗∗∗
                                                                             (0.394)
Distance to Outside Sellers                                                              −0.032
                                                                                         (0.070)
R&D Intensity                                                                                        10.419∗∗∗
                                                                                                     (0.091)
District FE                  Yes         Yes         Yes         Yes         Yes         Yes         Yes
Product FE                   Yes         Yes         NA          NA          NA          Yes         NA
Observations                 9,211,930   9,211,910   8,783,388   3,628,272   3,882,435   9,065,002   8,594,585
R2                           0.058       0.062       0.019       0.105       0.080       0.043       0.021
Adjusted R2                  0.058       0.062       0.019       0.104       0.080       0.043       0.021

∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Notes: This table reports estimates from a linear regression of an indicator for having a vertically integrated supplier on various measures at the
product establishment level. The indicator for having an integrated seller uses a scale requirement of 0.5 and a buying/selling threshold of 1.2
according to Section 3. Log In Total Value is the log of total inward shipments of the product at the establishment level. Log Number of Shipments
is the log of the number of shipments of a product that an establishment receives. Listed on Exchange is an indicator for if the product is traded on
an exchange. Upstream HHI is HHI for the establishment’s input product. Weighted Downstream HHI is the mean over the HHIs for all the
establishment’s outputs weighted by out-value. Distance to Outside sellers is the log of average distance to non-integrated net-sellers of the
product. R&D intensity is measured at the product level. All specifications contain district fixed effects, and product fixed effects are included unless the
independent variable is defined at the product level.
Table 23: Existence of Integrated Seller – Firm Level

Dependent variable: Integrated Seller Exists (Percent)

                          (1)         (2)         (3)
Log In Total Value        0.362∗∗∗
                          (0.004)
Log Number of Products                0.747∗∗∗
                                      (0.009)
Log Number of Locations                           4.579∗∗∗
                                                  (0.015)
Constant                  −3.625∗∗∗   0.447∗∗∗    −0.369∗∗∗
                          (0.056)     (0.013)     (0.010)
Observations              968,244     968,244     968,244
R2                        0.008       0.008       0.083
Adjusted R2               0.008       0.008       0.083

∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Notes: This table reports estimates from a linear regression of the indicator variable for
having at least one vertically integrated upstream supplier on various variables at the firm
level. The indicator for having an integrated seller uses a scale requirement of 0.5 and a
buying/selling threshold of 1.2 according to Section 3. In Total Value is the total inward
shipment value for the firm. Number of Shipments is the total count of shipments that a
firm receives. Number of locations is the number of zipcodes that the firm operates in.

Table 24: Relative Value of Vertical Integration

Dependent Variable: Share of Sourcing


Model: (1) (2) (3) (4)
Variables
ln(Distance) -0.2866∗∗∗ -0.2059∗∗∗ -0.1283∗∗∗ -0.2035∗∗∗
(0.0036) (0.0051) (0.0068) (0.0075)
Within-Firm 4.5848∗∗∗ 4.6134∗∗∗ 5.0779∗∗∗
(0.0468) (0.0448) (0.0487)
Within-State 1.2803∗∗∗
(0.0319)
Buyer FE Yes Yes Yes Yes
Sample All All All Karnataka
Fit statistics
Observations 28,396,595 28,396,595 28,396,595 11,006,471
Squared Correlation 0.002 0.024 0.034 0.063
Pseudo R2 0.09555 0.18408 0.19783 0.22613
BIC 418,821.53 406,214.92 404,286.11 309,719.89
One-way (Buyer FE) standard-errors in parentheses.
Signif Codes: ***: 0.01, **: 0.05, *: 0.1
Notes: This table reports estimates from a Poisson regression of the share of sourcing of an
input on distance, an indicator for same firm ownership, and an indicator for being in the
same state. Each observation is a buyer-potential supplier pair. Potential suppliers are
establishments who meet the scale requirement of 0.5 and a buying/selling threshold of 1.2
according to Section 3. Columns (1), (2) and (3) include the full sample. Column (4)
restricts the sample to establishments in Karnataka. All specifications include buying
establishment fixed effects.
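
Because the specification is a Poisson regression, coefficients are easier to read after exponentiating. The sketch below assumes the usual log link, under which an indicator with coefficient b multiplies the expected sourcing share by exp(b), and the coefficient on ln(Distance) is an elasticity. The two numbers are the first reported Within-Firm coefficient and the second reported ln(Distance) coefficient in the table; pairing them with the same column is an assumption based on the flattened layout.

```python
from math import exp


def multiplicative_effect(coef):
    """Factor by which the expected outcome changes when an indicator switches on,
    under a Poisson model with a log link."""
    return exp(coef)


def distance_scaling(elasticity, distance_ratio):
    """Factor by which the expected outcome changes when distance is multiplied
    by distance_ratio, given the coefficient on ln(Distance)."""
    return distance_ratio ** elasticity


within_firm = multiplicative_effect(4.5848)   # common ownership, all else equal
doubling = distance_scaling(-0.2059, 2.0)     # effect of doubling distance

print(round(within_firm, 1), round(doubling, 3))
```

Under these assumptions common ownership raises the expected sourcing share by a factor of roughly 98, while doubling distance lowers it by about 13 percent, which illustrates why integration dominates distance in the estimates.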

Table 25: Relative Value of Vertical Integration – Interactions

Dependent Variable: Share of Sourcing


Model: (1) (2) (3)
Variables
Within-Firm 4.1607∗∗∗ 1.4264∗∗∗ 0.6811∗∗
(0.1113) (0.2329) (0.2771)
ln(Distance) -0.2200∗∗∗ -0.1569∗∗∗
(0.0053) (0.0078)
ln(Distance) × Within-Firm 0.0383∗∗∗ 0.0395∗∗∗
(0.0092) (0.0113)
Within-State 1.1569∗∗∗ 0.8991∗∗∗
(0.0304) (0.0338)
Within-State × Within-Firm 3.6252∗∗∗ 3.8051∗∗∗
(0.2322) (0.2341)
Fixed-Effects
Buyer FE Yes Yes Yes
Fit statistics
Observations 28,396,595 28,396,595 28,396,595
Squared Correlation 0.025 0.039 0.036
Pseudo R2 0.18422 0.20037 0.20376
BIC 406,230.41 403,923.82 403,508.77
One-way (Buyer FE) standard-errors in parentheses.
Signif Codes: ***: 0.01, **: 0.05, *: 0.1
Notes: This table reports estimates from a Poisson regression of the share of sourcing of an
input on distance, an indicator for same firm ownership, and an indicator for being in the
same state. Each observation is a buyer-potential supplier pair. Potential suppliers are
establishments who meet the scale requirement of 0.5 and a buying/selling threshold of 1.2
according to Section 3. All specifications include buying establishment fixed effects.

A Appendix Tables

Table A1: Share of Within Firm Upstream Sourcing: Weighted (At least 3
shipments)

Integration Definition 1 1.2 1.5


Large: 0.5 0.32 0.30 0.30
Large: 1.0 0.34 0.33 0.33
Total: 0.5 0.31 0.30 0.30
Total: 1.0 0.34 0.33 0.33

Table A2: Share of Within Firm Upstream Sourcing: Unweighted (At least 3
shipments)

Integration Definition 1 1.2 1.5


Large: 0.5 0.46 0.46 0.46
Large: 1.0 0.47 0.47 0.47
Total: 0.5 0.46 0.46 0.46
Total: 1.0 0.48 0.48 0.47

Table A3: Share of Within Firm Upstream Sourcing: Weighted (Within District)

Integration Definition 1 1.2 1.5


Large: 0.5 0.32 0.31 0.31
Large: 1.0 0.36 0.34 0.34
Total: 0.5 0.32 0.31 0.31
Total: 1.0 0.35 0.34 0.34

Table A4: Share of Within Firm Upstream Sourcing: Unweighted (Within District)

Integration Definition 1 1.2 1.5


Large: 0.5 0.44 0.44 0.45
Large: 1.0 0.44 0.44 0.44
Total: 0.5 0.44 0.44 0.45
Total: 1.0 0.44 0.44 0.44

Table A5: Share of Within Firm Upstream Sourcing: Weighted (Primary Input)

Integration Definition 1 1.2 1.5


Large: 0.5 0.35 0.34 0.34
Large: 1.0 0.38 0.37 0.38
Total: 0.5 0.34 0.33 0.34
Total: 1.0 0.38 0.37 0.38

Table A6: Share of Within Firm Upstream Sourcing: Unweighted (Primary Input)

Integration Definition 1 1.2 1.5


Large: 0.5 0.33 0.34 0.35
Large: 1.0 0.33 0.34 0.35
Total: 0.5 0.33 0.34 0.35
Total: 1.0 0.33 0.34 0.35

Table A7: Share of Within Firm Upstream Sourcing: No Supplier Scale Requirement

Integration Definition 1 1.2 1.5


Exists (Unweighted) 0.37 0.37 0.37
Exists (Weighted) 0.19 0.19 0.19

Table A8: Share of Vertically Integrated Firms That Ship Within Firm: Unweighted

Integration Definition 1 1.2 1.5


Exists 0.24 0.25 0.25
Large: 0.5 0.27 0.27 0.28
Large: 1.0 0.25 0.26 0.26
Total: 0.5 0.27 0.27 0.28
Total: 1.0 0.25 0.26 0.26

Table A9: Share of Within Firm Upstream Sourcing: Weighted (8 digit product)

Integration Definition 1 1.2 1.5


Large: 0.5 0.42 0.41 0.42
Large: 1.0 0.46 0.44 0.45
Total: 0.5 0.42 0.41 0.42
Total: 1.0 0.45 0.45 0.46

Table A10: Share of Within Firm Upstream Sourcing: Unweighted (8 digit product)

Integration Definition 1 1.2 1.5


Large: 0.5 0.32 0.32 0.32
Large: 1.0 0.31 0.30 0.30
Total: 0.5 0.32 0.32 0.32
Total: 1.0 0.31 0.30 0.30

Table A11: Relative Value of Vertical Integration - Seller Side

Dependent Variable: Share of Selling


Model: (1) (2) (3) (4)
Variables
ln(Distance) -0.2909∗∗∗ -0.1362∗∗∗ -0.1113∗∗∗ -0.1408∗∗∗
(0.0037) (0.0067) (0.0074) (0.0085)
Within-Firm 5.9774∗∗∗ 6.0644∗∗∗ 6.6033∗∗∗
(0.0706) (0.0711) (0.0775)
Within-State 0.9312∗∗∗
(0.0411)
Buyer FE Yes Yes Yes Yes
Sample All All All Karnataka
Fit statistics
Observations 96,950,332 96,950,332 96,950,332 55,789,480
Squared Correlation 0.004 0.024 0.029 0.052
Pseudo R2 0.11514 0.21028 0.21543 0.24579
BIC 430,387.65 416,840.68 416,143.05 345,162.69
One-way (Buyer FE) standard-errors in parentheses.
Signif Codes: ***: 0.01, **: 0.05, *: 0.1

Table A12: Relative Value of Vertical Integration - Interactions - Seller Side

Dependent Variable: Share of Selling


Model: (1) (2) (3)
Variables
Within-Firm 3.6677∗∗∗ 3.9013∗∗∗ 0.2977
(0.0981) (0.2057) (0.2348)
ln(Distance) -0.2375∗∗∗ -0.2386∗∗∗
(0.0051) (0.0054)
ln(Distance) × Within-Firm 0.2471∗∗∗ 0.2725∗∗∗
(0.0100) (0.0110)
Within-State 0.7972∗∗∗ 0.4627∗∗∗
(0.0361) (0.0395)
Within-State × Within-Firm 2.8669∗∗∗ 3.5799∗∗∗
(0.2092) (0.2135)
Fixed-Effects
Buyer FE Yes Yes Yes
Fit statistics
Observations 96,950,332 96,950,332 96,950,332
Squared Correlation 0.031 0.041 0.043
Pseudo R2 0.21611 0.21554 0.22533
BIC 416,045.89 416,127.34 414,802.72
One-way (Buyer FE) standard-errors in parentheses.
Signif Codes: ***: 0.01, **: 0.05, *: 0.1

Table A13: Relative Value of Vertical Integration - Replication of Atalay (2019)

Dependent Variable: Share of Selling


Model: (1) (2) (3) (4)
Variables
ln(Distance) -0.4014∗∗∗ -0.2689∗∗∗ -0.4297∗∗∗ -0.2904∗∗∗
(0.0032) (0.0047) (0.0027) (0.0044)
Within-Firm Share 2.4108∗∗∗ 3.7557∗∗∗ -1.1890∗∗∗ 1.1702∗∗∗
(0.1335) (0.1136) (0.1594) (0.2076)
ln(Distance) × Within-Firm Share 0.4139∗∗∗ 0.2694∗∗∗
(0.0158) (0.0195)
Fixed-Effects
Buyer FE Yes Yes Yes Yes
Destination FE No Yes No Yes
Fit statistics
Observations 15,110,867 6,544,751 15,110,867 6,544,751
Squared Correlation 0.008 0.017 0.011 0.019
Pseudo R2 0.1399 0.18976 0.14675 0.19207
BIC 360,040.35 389,539.01 359,191.81 389,301.87
One-way (Buyer FE) standard-errors in parentheses.
Signif Codes: ***: 0.01, **: 0.05, *: 0.1
