Analysing The Global Interindustry Trade (2000-2014) - An Exercise Through A Network Perspective

Master in Big Data Analytics & Social Mining - Università di Pisa, academic year 2019-2020
Student: Lorenzo Lodi
Supervisor: Prof. Andrea Passarella
Internship host institution: IRPET
CONTENTS
Introduction: network features and focuses of the research
1) Intuitive insights from network visualizations and general properties of the aggregate and
external graphs
2) Community Detection on the aggregate and on the external graphs: looking for global value
chain and international clusters of production
3) Studying centrality on the external graph: relevant sectors and countries in the international
trade of intermediate goods
3.1) Analysis of the “Degree Centrality”
3.2) Analysis of the “Strength” centrality
3.3) Analysis of the “Betweenness” centrality
3.4) Analysis of the “Pagerank” centrality
Conclusion
Appendix
1) Conceptual structure of the dataset, its main features and building of the networks
1.1) Country codes and names encompassed by WIOD
1.2) Sector codes and names encompassed by WIOD
2) Degree distributions
3) Indices
3.1) Clustering coefficient
3.2) Betweenness
3.3) Pagerank
4) Community detection
4.1) Community membership of the automotive communities (external network)
References
Introduction: network features and focuses of the research.
During the last decades the main driver of global trade has been the growth of transactions
involving intermediate goods and services or, in other terms, the increasing relevance of the Global
Value Chain (GVC). Along this track, production has become more fragmented, i.e. goods and services
are increasingly produced by integrating inputs from different sectors, both nationally and
internationally, rather than within a single sector. As many studies have put forth [1][2][3],
network analysis has proved a useful tool for studying the evolution of this phenomenon. In my
thesis I applied this approach to the 2016 release of the World Input Output Database (see
appendix 1), which covers the years from 2000 to 2014, 44 countries (43 countries plus a “rest of
the world” aggregate) and 56 sectors. From these data I derived two kinds of weighted and directed
networks, or graphs, composed on average of about 2000 nodes: (1) an aggregate network, including
all the transactions among sectors, irrespective of whether they take place between sectors of the
same country or of different countries; (2) an external network, encompassing only connections
among sectors of different countries. In these networks, edges indicate commercial relations among
sectors: inward links are imports and outward links are exports, while their weight represents the
value of the transactions in million dollars. As a first step, I plotted the aggregate network for
three selected years, in order to assess intuitively whether it is possible to speak of a
global/international value chain or whether exchanges of intermediate goods and services are still
nationally based; I then evaluated the general evolution of some network properties, comparing the
aggregate and the external networks. Next, I ran a community detection on the aggregate graph for
selected years, trying to identify value chains which are not nationally based. Subsequently, I
looked for relevant communities in the external graph as well, in order to find which sectors and
countries are more integrated with each other in the international trade of intermediate goods.
Finally, I focused on the external graph in order to estimate the relevance of sectors and
countries in the international trade of intermediate goods and services, analysing some centrality
measures. All these issues were investigated in a diachronic perspective, i.e. showing their
evolution along the 15 years covered by the dataset, or selecting three key years: 2000, the year
before China's accession to the WTO, considered a turning point in global trade; 2007, the year
before the Wall Street crash; and 2014, the most recent year available, after the Great Crisis
of 2008.
1) Intuitive insights from network visualizations and general properties of the aggregate and
external graphs
Fig.1 shows the aggregate network for the year 2000. The visualization algorithm I used draws an
edge shorter the heavier it is, so the closer two sectors are, the higher is the value of the
intermediate transactions between them. The dense, well separated regions identifiable in the graph
consist of transactions among nodes/sectors of the same country. A first relevant conclusion we can
draw from fig.1 is thus that in 2000 national transactions were bigger than international ones, and
therefore that value chains were still mainly nationally based in that period. We can also identify
a core and a periphery in the graph: the former consists of the countries which have high-weighted
connections, while the latter consists of those characterized by less relevant links. There are of
course geographical reasons behind the relevance of economic transactions and the closeness among
sectors/countries: we can intuitively distinguish two major groups, represented by the EU and by
the East Asian/North American economies. Nevertheless, the peripheral position of countries such as
Spain or Turkey with respect to their European neighbours cannot be explained through geography
alone. If we then compare the external network (fig.1 right) with the aggregate one for 2000,
dividing nodes into three geographic areas (Asia, America and Europe), we cannot recognize clear,
well separated regions associated with each geographic area, although some denser regions are
visible.
Figure 1: "aggregate network" (left), "external network" (right) 2000
Proceeding with the visualizations (fig.2), we can graphically grasp the huge increase in the trade
of intermediate goods that occurred between 2000 and 2007, both in terms of value and in terms of
connectivity. The aggregate graph of 2007 (fig.2 left) indeed appears less loose, and denser, than
the one of 2000: national clusters are still recognizable, but they are closer to each other,
signalling the growing value of international exchanges in intermediate goods; the overall number
of edges also seems to grow. However, the 2007 and the 2014 graphs look quite similar, which
suggests that between 2007 and 2014 the growth of the international exchange of intermediate goods
and services slowed down (fig.3 left). The external networks confirm the huge increase in the
number and weight/value of transactions in 2007 with respect to 2000 (fig.2 right), with a relative
slowdown of growth between 2007 and 2014 (fig.3 right). Moreover, in 2007 and 2014 too, well
separated regions are not recognizable, though sectors of the same region tend to be closer to each
other.
Figure 2: "aggregate network" (left) and "external network" (right) 2007
Figure 3: "aggregate network" (left) and "external network" (right) 2014
We can substantiate the qualitative insights derived from the visualizations in quantitative terms.
Fig.4 compares the average edge weight of the aggregate network with that of the external network
over the 15 years; the measure is obtained by dividing the total weight of each network (external
and aggregate) by its total number of edges. Unsurprisingly, the values are higher for the
aggregate graph, which also includes national transactions, still more relevant than international
ones. However, as fig.4 displays, the average weight increases more in the external network than in
the aggregate one: its value in 2014 is 260% of the value in 2000, while for the aggregate graph
the figure is 240%. This indicates that international transactions have been the most dynamic
component of the recent growth of transactions in intermediates. After the Great Crisis, however,
the average weight of the aggregate graph continues to increase steadily, while the measure for the
external graph seems to stagnate.
Figure 4: “average weight growth” ext. and agg. graphs 2000-2014
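The average-weight measure described above can be sketched in a few lines of Python. This is only
an illustration on invented numbers, not the thesis data or code: the function name
`average_edge_weight`, the helper `is_external` and the "COUNTRY_sector" node labels are
assumptions made for the example.

```python
def average_edge_weight(edges):
    """Total weight of a network divided by its number of edges."""
    if not edges:
        return 0.0
    total_weight = sum(weight for _source, _target, weight in edges)
    return total_weight / len(edges)


def is_external(edge):
    """True when exporter and importer belong to different countries.

    Assumes hypothetical node labels of the form "COUNTRY_sector".
    """
    source, target, _weight = edge
    return source.split("_")[0] != target.split("_")[0]


# Toy directed edge list: (exporting sector, importing sector, value in million dollars).
toy_edges = [
    ("DEU_r20", "FRA_r20", 120.0),  # international
    ("DEU_r16", "DEU_r20", 300.0),  # domestic
    ("CHN_r17", "USA_r17", 180.0),  # international
]

aggregate_avg = average_edge_weight(toy_edges)
external_avg = average_edge_weight([e for e in toy_edges if is_external(e)])
print(aggregate_avg, external_avg)
```

Computing the same quantity year by year and dividing by the 2000 value would give a growth index
comparable to the one plotted in fig.4.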
As fig.5 indicates, transactions increase not only in value/weight but also in number, i.e.
intersectoral exchanges grow and the fragmentation of production increases. The metric I use to
grasp this feature is the ratio of the total number of edges, irrespective of their weight and
direction, to the total number of nodes, which I call density. In terms of growth, the density
increases more rapidly in the external graph, almost doubling from 2000 to 2014, while in the
aggregate graph it becomes only almost 50% bigger over the same period. Of course, the number of
inward and outward connections of each node (degree) is not equally distributed: the degree
distribution shows many nodes with low connectivity and a few hubs with high connectivity,
especially in the external networks. However, the degree distributions do not fit a power law (see
appendix 2). I also calculated the average clustering coefficient for each network, both aggregate
and external, over time, in order to assess whether the growth in connectivity corresponded to a
growth in the complexity of connections. The clustering coefficient of a node is the ratio of the
triangles to the triples identifiable in its connections: a triple is a pair of nodes both
connected to the node, and a triangle is a triple whose two nodes are also connected to each other
(see appendix 3.1). As shown by fig.6, in the aggregate network the average clustering is higher
than in the external one (unsurprisingly, given the visible presence of nationally based groups of
highly interconnected nodes) but its value tends to decrease, while in the external graph it grows,
suggesting the increasing relevance of international clusters of production.
Figure 5: “average density growth” agg. and ext. networks 2000-2014
Figure 6: “average clustering coefficient” agg. and ext. networks 2000-2014
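The two indicators discussed above, the edges-to-nodes ratio and the clustering coefficient of a
node, can be illustrated with a small dependency-free sketch. The function names and the toy graph
are assumptions; in particular, counting each listed edge once in `density` and ignoring edge
direction in `local_clustering` are simplifying choices of this example, not necessarily those of
the thesis.

```python
from itertools import combinations


def density(edges):
    """Number of edges divided by number of nodes (the ratio called density in the text)."""
    nodes = {node for source, target in edges for node in (source, target)}
    return len(edges) / len(nodes) if nodes else 0.0


def local_clustering(edges, node):
    """Triangles through `node` divided by triples centred on `node` (direction ignored)."""
    neighbours = {}
    for source, target in edges:
        if source != target:
            neighbours.setdefault(source, set()).add(target)
            neighbours.setdefault(target, set()).add(source)
    own = neighbours.get(node, set())
    triples = len(own) * (len(own) - 1) // 2  # pairs of neighbours of the node
    if triples == 0:
        return 0.0
    # A triple is closed (a triangle) when the two neighbours are themselves connected.
    triangles = sum(1 for a, b in combinations(own, 2) if b in neighbours[a])
    return triangles / triples


# Toy graph: A trades with B, C and D; only B and C also trade with each other.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

print(density(toy_edges))
print(local_clustering(toy_edges, "A"))
```

Averaging `local_clustering` over all nodes would give a network-level figure of the kind plotted
in fig.6.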
2) Community Detection on the aggregate and on the external graphs: looking for global
value chain and international clusters of production
As we have seen in paragraph 1, value chains still seem far from global: in the aggregate network,
the denser regions of the graph correspond to national economies. We next examine the issue through
community detection. A network can be partitioned into communities, i.e. groups of nodes such that
many edges connect nodes within the same community and few connect nodes of different communities.
If the aggregate network were completely globalized, i.e. if any pair of industries in the world
had an equal chance of being connected, we should not expect to detect any significant community
structure. For this task I used the Infomap method, which is based on random walks and allows us to
keep the graph weighted and directed, unlike other methods that focus on maximizing modularity in
undirected networks (see appendix 4). First I applied the method to the aggregate network, then I
searched for stable communities by comparing the communities of different years through Jaccard
similarity. As expected, communities and countries almost coincide in the aggregate graph. This is
shown by the grid plots below (fig.7,8,9), where sectors are on the x axis and countries on the
y axis; each cell thus corresponds to a country-sector, while each colour, or gradient, defines a
community. Each coloured band in the grid therefore represents a nationally based community.
However, scattered cells of the same colour are also visible, showing that some communities do not
coincide with nations. I therefore marked the most recognizable international communities with
contoured rectangles. In particular, since 2000 we can recognize a community centred around German
“heavy industry” (red contoured rectangles in fig.7,8,9), i.e. basic metal (r15), metal products
(r16), automotive (r20) and machinery & equipment (r19), and involving Austria, Czech Republic,
Belgium, Poland (in automotive and machinery & equipment), the Netherlands (basic metal,
automotive), Slovakia, France, Great Britain, Spain and Switzerland (automotive). In the subsequent
years, as fig.8 and 9 show, other countries join and Great Britain leaves, while some of the
original countries, such as the Eastern European ones, become integrated also through basic metal,
metal products and machinery & equipment, a sign of the growing complexity of the relations between
these countries and Germany between 2000 and 2014. Another stable international community is found
in textiles, the “German centred textile community” (yellow contoured rectangles in fig.7,8,9; the
textile sector is r6), involving Eastern European countries, France, Austria and Germany since
2000; in 2014 Spain and Portugal join and Great Britain leaves.
Figure 7: “community structure” 2000 agg. network
Shifting to the external graph, in paragraph 1 I found that its visual shape shows no obvious
regional clusters. Nevertheless, sectors of the same region tend to be closer to each other, and
different denser regions, though not well separated, do exist, suggesting the presence of
communities. As a first exercise, I dealt with this issue at a coarser level of granularity,
deriving from the external graph a nation-to-nation weighted undirected graph with 44 nodes, each
corresponding to a country, and with each edge representing the gross trade flows between two
countries (in a few words: the weight of the edge connecting A and B is the sum of the total
imports of A from B and of the total exports from A to B). Subsequently, I applied a community
detection to this network through the Fastgreedy method. I found 4 major communities (fig.10). The
most interesting insight is the shift of Luxembourg and Great Britain from the “Core Europe”
community to the “Northern European” one, which occurred between 2000 and 2007. I also identified a
temporary shift of the Netherlands and Belgium from the “Core Europe” community to the
“Mediterranean Europe” one, while between 2007 and 2014 Romania integrates into the German based
community, in parallel with the move of Switzerland from the latter to the “Mediterranean Europe”
community.
Figure 10: evolution of communities in the nation-to-nation network
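The construction of the nation-to-nation graph described above (edge weight between A and B equal
to the imports of A from B plus the exports from A to B) can be sketched as follows. The function
names, the "COUNTRY_sector" label scheme and the toy flows are assumptions introduced for
illustration only.

```python
from collections import defaultdict


def country_of(node):
    """Country code of a node, assuming hypothetical labels like "DEU_r20"."""
    return node.split("_")[0]


def nation_to_nation(edges):
    """Collapse directed sector-to-sector flows into undirected country-pair weights."""
    gross_trade = defaultdict(float)
    for source, target, weight in edges:
        a, b = country_of(source), country_of(target)
        if a == b:
            continue  # domestic flows do not belong to the external graph
        # frozenset makes (A, B) and (B, A) the same key, so both directions are summed.
        gross_trade[frozenset((a, b))] += weight
    return dict(gross_trade)


toy_edges = [
    ("DEU_r20", "FRA_r20", 120.0),  # German exports to France
    ("FRA_r19", "DEU_r20", 80.0),   # German imports from France
    ("DEU_r16", "ITA_r19", 50.0),
    ("DEU_r16", "DEU_r20", 999.0),  # domestic, ignored
]

trade = nation_to_nation(toy_edges)
print(trade[frozenset(("DEU", "FRA"))])
```

The resulting dictionary is an undirected weighted edge list on which a modularity-based method
such as Fastgreedy could then be run.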
Finally, I went back to the original external network, using the Infomap community detection
method and searching for stable communities by comparing the communities of different years through
Jaccard similarity, considering as “stable” the communities with no more than 50% variation of the
original nodes from one year to another. In the end I identified 16 stable communities, each related
to a sector (the most important ones are labelled in the alluvial graph below). I thus found that
trade relations are still mainly based on relations among sectors, but that they are increasingly
internationalized, involving a growing number of country-sectors. As we see from fig.11, in fact,
the period between 2007 and 2014 shows a growing integration of sectors and countries into
communities, marked by the coloured edges entering communities from the group of nodes previously
not involved in any community. Analysing the membership of the communities, these nodes are mostly
sectors of emerging countries such as Turkey, India and Brazil (see appendix 4.1 for the
“automotive communities”). We also notice a huge influx of sectors from the “Non Community World”
into the “Metal Products” community, corresponding to the growing reliance of this sector on
ferrous scrap (included in r26: Waste Collection etc.). I also identified two region-based
communities for automotive: as fig.12 details, since 2007 some major Asian players such as China,
Taiwan, Korea, Japan and India have shifted from the “Automotive Pacific Community” to the
European one.
Figure 11: evolution of communities ext. network
Figure 12: shifts in the “automotive communities”
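The matching of communities across years through Jaccard similarity can be illustrated with the
sketch below. It reads the criterion "no more than 50% variation of the original nodes" as a
Jaccard similarity of at least 0.5, which is an assumption of this example; the function names,
the threshold parameter and the toy communities are likewise hypothetical.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union of two node sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_stable(communities_before, communities_after, threshold=0.5):
    """Map each earlier community to its most similar later one, if similar enough."""
    matches = {}
    for name, members in communities_before.items():
        best = max(
            communities_after,
            key=lambda label: jaccard(members, communities_after[label]),
            default=None,
        )
        if best is not None and jaccard(members, communities_after[best]) >= threshold:
            matches[name] = best
    return matches


# Invented memberships for two years.
communities_2000 = {
    "automotive": {"DEU_r20", "FRA_r20", "ESP_r20"},
    "textile": {"ITA_r6", "DEU_r6"},
}
communities_2007 = {
    "c1": {"DEU_r20", "FRA_r20", "POL_r20"},
    "c2": {"CHN_r6", "IND_r6", "ITA_r6"},
}

stable = match_stable(communities_2000, communities_2007)
print(stable)
```

In this toy case only the automotive community survives: it shares two of four nodes with "c1",
while the textile community shares one of four nodes with its closest match.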
3) Studying centrality on the external graph: relevant sectors and countries in the
international trade of intermediate goods.
In order to assess the relevance of the various sectors and countries in the international trade of
intermediate goods, a first way to proceed is to analyse the number of their import and export
relations. From a network standpoint, we can accomplish this task by calculating the degree of the
nodes, i.e. the number of their inward and outward edges, respectively defined as “in-degree” and
“out-degree”. A second approach to the importance of countries and sectors in the international
trade of intermediate goods is to compute the value of their import and export relations. The
measure I used for this is the weighted degree, or strength, which is the sum of the weights of the
inward (“in-strength”) and outward (“out-strength”) edges of each node. Degree and strength thus
evaluate the relevance of countries/sectors from different angles, but it is also interesting to
assess the importance of nodes taking into account both their connectivity and the value of their
connections. In fact, a sector may export/import to/from many destinations/origins, but only a few
of them may be relevant in terms of value. Vice versa, a sector may have a high value in
import/export relations, but with only a few countries. These considerations are relevant because
the sensitivity of a sector to external shocks, or its criticality for the system, depends both on
its relevance in terms of value and in terms of connectivity. A measure able to capture both
dimensions is a weighted version of the betweenness centrality, which scores each node on the basis
of the number of minimum-cost paths which pass through it (see appendix 3.2); a path is the
sequence of connections which separates two nodes, while its cost is related to the weights of
those connections: the higher the weight of the edges constituting a path, the smaller the cost of
taking that path. In order to have a high betweenness, i.e. to have many cost-minimizing paths
passing through it, a node should have both many and highly weighted inward and outward
connections, i.e. it should be relevant both in terms of connectivity and in terms of the value of
its imports and exports. However, if one wants to concentrate on the criticality of a sector for
the system, betweenness could be flawed. In fact, a sector may have a high betweenness because it
exports a lot to many sectors and imports from many sectors, without being a relevant destination
from the standpoint of the nodes exporting to it; moreover, these nodes could be not so relevant
for the whole network. A measure which allows us to deal with this issue is the Pagerank
centrality, which scores nodes on the basis of their probability of being visited by a “random
walker” (see appendix 3.3) and balances (1) the importance of a sector as importer in terms of
connectivity, (2) the importance for the network of the sectors exporting to that sector, and
(3) the importance of that sector for the nodes which export to it. Having said that, we can
proceed to the results of my analysis using the different measures defined here.
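The difference between degree and strength can be made concrete with a short sketch on an invented
edge list. The function name `degree_and_strength` and the toy nodes are assumptions; edges are
read as (exporter, importer, value).

```python
from collections import defaultdict


def degree_and_strength(edges):
    """In/out degree (number of links) and in/out strength (sum of weights) per node."""
    scores = {
        "in_degree": defaultdict(int),
        "out_degree": defaultdict(int),
        "in_strength": defaultdict(float),
        "out_strength": defaultdict(float),
    }
    for source, target, weight in edges:
        scores["out_degree"][source] += 1       # one more export relation
        scores["in_degree"][target] += 1        # one more import relation
        scores["out_strength"][source] += weight
        scores["in_strength"][target] += weight
    return scores


# A exports little to many partners, E exports a lot to a single partner.
toy_edges = [
    ("A", "B", 10.0),
    ("A", "C", 10.0),
    ("A", "D", 10.0),
    ("E", "B", 500.0),
]

scores = degree_and_strength(toy_edges)
print(scores["out_degree"]["A"], scores["out_strength"]["A"])
print(scores["out_degree"]["E"], scores["out_strength"]["E"])
```

Here A ranks first by out-degree and E by out-strength, which is the kind of divergence between
the two rankings that the paragraph above points out.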
3.1 Analysis of the “Degree Centrality” on the external network
If we consider the top 20 sectors for “Out Degree Centrality” (bar-chart in fig.13), we see that
Germany has the largest number of sectors among the most central ones, while the USA leads in “In
Degree Centrality” (bar-chart in fig.14) but has only 2 sectors in the “Out Degree” ranking. China
appears in neither the “out” nor the “in degree” ranking in 2000, but in 2014 it is the second most
important country in “out degree” and the third in “in degree”. Japan instead goes from the third
position in 2000 to disappearing in 2014 as far as “out degree” is concerned, while it keeps only
one sector in the “in degree” ranking. Interestingly (table in fig.13), some smaller economies show
a high level of forward linkages in global trade for some relevant manufacturing sectors, namely
chemicals (r11) for the Netherlands in 2000 and 2007, and machinery and equipment (r19) for Italy
in 2014. In general, the two rankings (“out” and “in”) capture different sectors: the most numerous
industries in the “out degree” rankings are sectors such as chemicals (r11), electronics (r17) and
machinery & equipment (r19), dominated by the USA, Germany and China, while the “in degree” ranking
(table in fig.14) also captures many construction sectors (r27) and some government sectors (r51;
Great Britain and USA).
Figure 13: “top 20 sectors” for Out-degree
Figure 14: “top 20 sectors” for In-degree
We can also look at the relevance of countries in international trade by summing the “in” and
“out” degree scores of all their sectors. I did this for a selection of countries, which we can
divide into developed (USA, Japan, Germany, Great Britain, France and Italy) and emerging (China,
Brazil, India, Korea, Taiwan, Russia, Mexico). I separated them into two plots in order to ease the
visualization. Looking at fig.15, we see that since 2001 Germany leads in out-degree, followed by
the USA, while China increases steadily and Japan stagnates. The other developed countries,
moreover, seem to be more relevant exporters in terms of out-degree than the emerging ones,
although after the crisis their measures tend to stagnate, unlike those of the emerging countries,
which seem to keep increasing (in particular South Korea). As for “in degree” (fig.16), the USA
keeps the leadership throughout the 15 years considered, followed by Germany and China, while for
the other countries the trends and the relative positions seem to mirror the “out degree” plot,
except for Russia and Taiwan, whose exports seem to have a higher connectivity than their imports.
Figure 15: evolution of aggregate Out-degree for developed and emerging countries 2000-2014
Figure 16: evolution of aggregate In-degree for developed and emerging countries 2000-2014
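The country-level aggregation used for these plots, i.e. summing a sector-level score over all the
sectors of a country, can be sketched as below. The function name, the "COUNTRY_sector" labels and
the scores are invented for the example.

```python
from collections import defaultdict


def aggregate_by_country(sector_scores):
    """Sum a node-level score over all nodes sharing the same country prefix."""
    totals = defaultdict(float)
    for node, score in sector_scores.items():
        country = node.split("_")[0]  # assumes labels like "DEU_r19"
        totals[country] += score
    return dict(totals)


# Hypothetical out-degree scores of three country-sectors.
toy_out_degree = {"DEU_r19": 40, "DEU_r20": 35, "USA_r17": 50}

country_totals = aggregate_by_country(toy_out_degree)
print(country_totals)
```

The same helper applies unchanged to strength, betweenness or Pagerank scores, since it only needs
a mapping from nodes to numbers.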
3.2 Analysis of the “Strength” centrality
Looking at the top 20 sectors for “Out strength” and “In strength”, we see a similar dynamic
related to the growing relevance of China, which appears in the rankings from 2007, before becoming
the leading country in “out strength” in 2014 (bar-chart in fig.17). The USA still dominates the
“in strength” ranking (bar-chart in fig.18), where China is second. Germany is second in 2000 and
third in 2007 and 2014, both in the “out” and in the “in” diagrams. Japan strikingly reduces its
importance, especially in the “out strength” bar-chart. The sectors included in the tables (fig.17,
18) partly mirror the ones captured by the degree, except for a bigger presence of sectors like
quarrying and mining (r4) in the “out” ranking: in the “out degree” one, in fact, only Russia was
included, while here Canada and Norway are also present, probably because they are very important
in terms of value but, in terms of connectivity, more dependent than Russia on their regional
markets (respectively North America and Europe). As important importers of inputs, instead, the
“in” table shows some refinery sectors (r10), not captured by the “in degree” tables, in particular
those of emerging and/or Asian countries such as China, India and Japan. These insights confirm our
assumption that a sector can be a relevant node in terms of weight, but not necessarily in terms of
connections.
Figure 17: “top 20 sectors” for Out-strength
Figure 18: “top 20 sectors” for In-strength
Repeating what I did with the degree, I summed the “Out” and “In-strength” of all the sectors of
each country, comparing the evolution of developed and emerging countries in the system as
exporters and importers of intermediates in terms of value. As fig.19 displays, since 2014 China
has become the biggest exporter and, since 2010, the biggest importer, overtaking the USA. Germany,
which in 2000 was at the same level as Japan, has since grown faster than the latter both in “Out”
and in “In strength” (fig.19,20). Looking at the emerging countries in the right plots of
fig.19,20, their relevance as exporters steadily increases, in particular for Russia and Korea. The
developed countries instead tend to stagnate, especially after the crisis. This holds in particular
for Italy, which is overtaken by Taiwan and Korea (which since 2010 have also overtaken France and
Great Britain) and is almost reached by India, Mexico and Brazil. Staying on the right plots, but
shifting to In-Strength, the trends are similar, except for Russia, which is at the bottom of the
ranking, while Italy is overtaken by India in 2010.
Figure 19: evolution of aggregate Out-strength for developed and emerging countries 2000-2014
Figure 20: evolution of aggregate In-strength for developed and emerging countries 2000-2014
3.3 Analysis of the “Betweenness” centrality
Turning to betweenness, the country which leads the ranking of the 20 most central sectors
(fig.21) is Germany, which accounts for 5, 8 and 8 sectors respectively in 2000, 2007 and 2014. The
USA has 7, 4 and 4, ranking second. China, instead, has only 2, 3 and 3 sectors in the same years
and is third. Looking at the top 20 sectors in terms of betweenness, it is interesting that there
are new entries with respect to the top 20 for degree and strength, such as Great Britain in
financial services (r41), Luxembourg in the same sector since 2007 and, interestingly, Ireland in
computer services (r40) in 2014, maybe signalling the growing investments made by big-tech
companies in that country in recent years. Summing the betweenness scores of all sectors for
emerging and developed countries, as I did with the previous measures (fig. 22), we see that in
aggregate terms too Germany is the leading country since 2003, followed by the USA, whose total
betweenness centrality, however, steadily declines. China is third, but its value progressively
grows, while Japan is far below and stagnates. As for the other developed and emerging countries,
we observe a decline of Great Britain and Italy and a slight improvement of France, along with a
persisting divergence between developed and emerging countries, though counterbalanced by the
growth of Russia and South Korea.
Figure 21: “top 20 sectors” for betweenness
Figure 21: evolution of aggregate betweenness for developed and emerging countries 2000-2014
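The idea of a weighted betweenness in which heavier edges are cheaper to traverse can be sketched
with a compact version of Brandes' algorithm. Taking the cost of an edge as the reciprocal of its
weight is an assumption of this example (the text only states that higher weights mean lower
costs); the function name and the toy graph are hypothetical, and weights are assumed strictly
positive.

```python
import heapq
from collections import defaultdict


def weighted_betweenness(edges):
    """Betweenness on a directed graph where the cost of an edge is 1 / weight."""
    adjacency = defaultdict(list)
    nodes = set()
    for source, target, weight in edges:
        adjacency[source].append((target, 1.0 / weight))
        nodes.update((source, target))

    score = dict.fromkeys(nodes, 0.0)
    for origin in nodes:
        # Dijkstra from `origin`, counting the number of cheapest paths (sigma).
        dist = {origin: 0.0}
        sigma = defaultdict(float)
        sigma[origin] = 1.0
        preds = defaultdict(list)
        settled, done = [], set()
        heap = [(0.0, origin)]
        while heap:
            d, v = heapq.heappop(heap)
            if v in done:
                continue  # stale heap entry
            done.add(v)
            settled.append(v)
            for nxt, cost in adjacency[v]:
                nd = d + cost
                if nxt not in dist or nd < dist[nxt]:
                    dist[nxt] = nd
                    sigma[nxt] = sigma[v]
                    preds[nxt] = [v]
                    heapq.heappush(heap, (nd, nxt))
                elif nd == dist[nxt]:
                    sigma[nxt] += sigma[v]
                    preds[nxt].append(v)
        # Back-propagate dependencies from the farthest nodes to the origin.
        delta = defaultdict(float)
        for w in reversed(settled):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != origin:
                score[w] += delta[w]
    return score


# The heavy route A -> B -> C (cost 0.02) beats the light direct link A -> C (cost 0.1).
toy_edges = [("A", "B", 100.0), ("B", "C", 100.0), ("A", "C", 10.0)]
betweenness = weighted_betweenness(toy_edges)
print(betweenness)
```

In the toy graph B gets all the credit, because the only cheapest path between two other nodes
(from A to C) passes through it.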
3.4 Analysis of the “Pagerank” centrality
As for Pagerank centrality, the game is still about the USA, China and Germany, which have the
highest number of sectors in the top 20 (respectively 6-5-4, 1-4-5 and 5-4-3) in the three years
considered (fig.22). The most interesting thing, however, is that the top 20 ranking for Pagerank
captures the relevance of some sectors not included in the other rankings, or of sectors we have
already seen in the previous tables but which are now associated with different countries. In
particular, in 2000 and 2007 we can observe the presence of the Chinese textile sector (r6), the
only one belonging to China in the 2000 ranking, before the country became a leading player in
other sectors too. Moreover, we can see in the tree-maps below the Norwegian and Danish water
transport sectors (respectively in 2007 and in 2000-2007; Denmark has the biggest container
shipping operator in the world, Maersk). They disappear in 2014, however, marking the huge effect
of the slowdown in world trade, 40% of which takes place through water transport. The relevance of
the British health sector (r53) is also grasped (it is even the most central sector in 2007). In
addition, in 2007 the Irish pharmaceutical sector is captured. The construction sectors of some
small economies such as Belgium (2014) and Denmark (2007) are identified as well. Considering the
top 4 sectors for 2014 is also insightful: the USA government (r51), China's construction sector
(r27) (China responded to the Great Crisis with huge investments in infrastructure), and the
German and USA automotive sectors (r20). Looking at the overall trend obtained by summing the
Pagerank scores of all sectors for each country, it is interesting to notice that the evolution of
the USA Pagerank mirrors that of China, which reaches the former in 2013 before overtaking it. In
general, the score is declining for all the developed countries, and in particular for Great
Britain, while for the emerging ones it slightly increases, with the exception of Taiwan.
Figure 22: “top 20 sectors” for Pagerank
Figure 23: evolution of aggregate Pagerank for developed and emerging countries 2000-2014
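The "random walker" behind Pagerank can be illustrated with a plain power iteration on a weighted,
directed edge list. This is a generic textbook formulation, not necessarily the implementation or
the parameters used for the results above: the damping factor 0.85, the number of iterations, the
function name and the toy flows are all assumptions.

```python
def weighted_pagerank(edges, damping=0.85, iterations=100):
    """Power iteration: a walker follows out-edges with probability proportional to weight."""
    nodes = sorted({node for source, target, _w in edges for node in (source, target)})
    out_strength = dict.fromkeys(nodes, 0.0)
    for source, _target, weight in edges:
        out_strength[source] += weight

    rank = dict.fromkeys(nodes, 1.0 / len(nodes))
    for _ in range(iterations):
        # Rank held by nodes without out-edges is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if out_strength[n] == 0.0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = dict.fromkeys(nodes, base)
        for source, target, weight in edges:
            new_rank[target] += damping * rank[source] * weight / out_strength[source]
        rank = new_rank
    return rank


# Edges point from exporter to importer, so rank accumulates on important importers.
toy_edges = [
    ("A", "C", 90.0),
    ("A", "B", 10.0),
    ("B", "C", 50.0),
    ("C", "A", 20.0),
]

pagerank = weighted_pagerank(toy_edges)
print(pagerank)
```

In the toy network C ranks highest: it absorbs most of A's exports and all of B's, and A itself is
made important by being the only destination of C.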
Conclusion
From a general standpoint, a major finding of my research is a huge increase in the fragmentation
and value of the GVC from 2000 to 2007. After the Great Crisis, instead, the process slows down,
both in the aggregate network and in the external one. The community detection on the aggregate
network shows that production networks are still nationally centred, but that two international
value chains led by Germany, “heavy industry” and “textiles”, have existed since 2000 and have
expanded. On the external network, instead, communities substantially correspond to sectors, except
for two communities related to automotive, which are regionally based. Moreover, a growing number
of country-sectors are involved in international networks. Interestingly, some relevant automotive
sectors shift from the “Pacific Automotive” community to the “EU” one. Relevant insights have also
been found by exploring communities in a country based external graph, such as the shift of Great
Britain from the “Core Europe” community to the “Northern European” one. As for the importance of
sectors in the international trade of intermediate goods (described by the external network), the
analysis shows that the “game” is about sectors of the USA, Germany and China. However, the
respective relevance of each country depends on how we conceive centrality/importance in the
network, and the same holds for the balance between developed and emerging countries. In addition,
different indicators of centrality capture different sectors as relevant. In particular, strength
and degree mainly grasp the relevance of sectors such as electronics, metal products, chemicals and
machinery & equipment, related to the three major players. Betweenness, instead, also signals the
relevance of sectors such as computer services (Ireland) and financial services (Great Britain and
Luxembourg). The same happens with Pagerank, which captures among the most central industries
sectors such as Danish water transport (from 2000 to 2007). Finally, it is interesting that the top
4 sectors in terms of Pagerank in 2014 reflect the ones which probably boosted global demand after
the Great Crisis, i.e. the USA government sector (including the military one), the Chinese
construction sector, and the USA and German automotive sectors.
Appendix
1) conceptual structure of the dataset, its main features and building of the
networks
Input–output analysis is the name given to an analytical framework developed by Wassily Leontief
in the late 1930s, in recognition of which he received the Nobel Prize in Economic Science in 1973.
The fundamental information used in input–output analysis concerns the flows of products from
each industrial sector, considered as a producer, to each of the sectors, itself and others, considered
as consumers. This basic information from which an input–output model is developed is contained
in an interindustry transactions table. The rows of such a table describe the distribution of a
producer’s output throughout the economy. The columns describe the composition of inputs
required by a particular industry to produce its output [4]. These interindustry exchanges of goods
constitute the shaded portion of the table depicted in the figure below, which is the section of the
input-output table relevant to our analysis.
Figure 24: structure of a simple input-output table
The database I used, the World Input Output Database (WIOD), follows this general model but is more
complex, taking into account 56 sectors for each of the 43 countries considered (plus the “rest of
the world”, conceived as a national economy) [5]. The figure below [6] summarizes the structure of a
table provided by WIOD, where the interindustry transaction table is composed of 44x44 sub-matrixes,
or blocks, each of size 56x56. I outlined it with a red rectangle, while the blue dashed
rectangles highlight the diagonal blocks, related to transactions among sectors of the same
country. Therefore, the matrix corresponding to the red rectangle represents the basis of my
aggregate graph. In order to obtain the external graph, instead, I replaced with zero the values
in the diagonal blocks, keeping only transactions among sectors of different countries.
Figure 25: structure of a WIOD table
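The construction of the external matrix described above can be sketched as follows. This is a minimal illustration in plain Python on a toy matrix with 2 countries and 2 sectors each (the WIOD table has 44 blocks of 56 sectors per side); the function name zero_diagonal_blocks and the values are hypothetical and are not taken from the code used for this work.

```python
def zero_diagonal_blocks(matrix, block_size):
    """Return a copy of `matrix` where the square diagonal blocks
    (transactions among sectors of the same country) are set to zero."""
    return [
        [0.0 if row // block_size == col // block_size else value
         for col, value in enumerate(values)]
        for row, values in enumerate(matrix)
    ]


# Toy aggregate matrix: 2 countries x 2 sectors (illustrative values only).
aggregate = [
    [10.0, 2.0, 3.0, 0.0],
    [1.0, 20.0, 0.0, 4.0],
    [5.0, 0.0, 30.0, 6.0],
    [0.0, 7.0, 8.0, 40.0],
]

# Only the off-diagonal blocks (cross-country flows) survive.
external = zero_diagonal_blocks(aggregate, block_size=2)
for row in external:
    print(row)
```

With the real data, the same rule would presumably be applied with block_size=56 on the matrix loaded through Pandas or Numpy.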
WIOD provides data in different formats; I used the xlsb files, which I manipulated and cleaned
through the Python libraries Pandas and Numpy. Then, I interpreted the matrixes I obtained as
weighted adjacency matrixes (fig. 26), where each cell with a value different from zero represents
a weighted connection. I also set to zero the connections with weights lower than 50, as a
rudimentary threshold for relevant connections, aiming to facilitate the visualization and the
community detection (explained later). Then, I built the networks through the Python library
I-Graph and the function Graph.TupleList(), after transforming my data into an edge-list; in fact,
I could not find a function in Python I-Graph able to obtain a weighted and directed graph from an
adjacency matrix. An edge-list (fig. 26) can be interpreted as a list of tuples, where the first
element of a tuple indicates the source node, the second the target node and the third the weight
of the edge connecting them. Subsequently, I plotted the graph in SVG format through the function
write_svg().
Figure 26: adjacency list (left), adjacency matrix (right)
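The step from adjacency matrix to edge-list can be sketched as follows. This is a minimal example in plain Python under stated assumptions: the helper name matrix_to_edge_list, the toy values and the choice of labels are hypothetical, rows are read as sources and columns as targets, and the threshold of 50 is applied by dropping the cells below it.

```python
def matrix_to_edge_list(matrix, labels, threshold=50.0):
    """Turn a weighted adjacency matrix into (source, target, weight) tuples.

    Rows are read as producers (sources), columns as consumers (targets).
    Zero cells and cells below `threshold` are dropped.
    """
    edges = []
    for row, values in enumerate(matrix):
        for col, weight in enumerate(values):
            if weight != 0 and weight >= threshold:
                edges.append((labels[row], labels[col], weight))
    return edges


# Toy external matrix; labels follow the country-code + sector-code pattern.
labels = ["DEUr20", "DEUr28", "ITAr20", "ITAr28"]
matrix = [
    [0.0, 0.0, 120.0, 30.0],
    [0.0, 0.0, 0.0, 75.0],
    [60.0, 10.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

edge_list = matrix_to_edge_list(matrix, labels)
print(edge_list)

# With python-igraph installed, a list of this shape can be passed to
# Graph.TupleList(edge_list, directed=True, weights=True).
```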
1.1) Country codes & names encompassed in WIOD tables

country code country name
IND India
RUS Russia
BRA Brazil
CZE Czech Republic
MLT Malta
NLD Netherlands
POL Poland
AUS Australia
CYP Cyprus
DNK Denmark
LVA Latvia
KOR South Korea
MEX Mexico
ESP Spain
AUT Austria
CAN Canada
PRT Portugal
DEU Germany
HRV Croatia
USA United States
LUX Luxembourg
SVK Slovakia
HUN Hungary
EST Estonia
IDN Indonesia
ITA Italy
IRL Ireland
FRA France
CHE Switzerland
BGR Bulgaria
FIN Finland
SVN Slovenia
ROU Romania
JPN Japan
TWN Taiwan
GRC Greece
BEL Belgium
NOR Norway
LTU Lithuania
CHN China
GBR Great Britain
SWE Sweden
TUR Turkey
ROW Rest of the World
1.2) Sector codes & names encompassed in WIOD tables
sector code sector name
r1 Crop and animal production, hunting and related service activities
r2 Forestry and logging
r3 Fishing and aquaculture
r4 Mining and quarrying
r5 Manufacture of food products, beverages and tobacco products
r6 Manufacture of textiles, wearing apparel and leather products
r7 Manufacture of wood and of products of wood and cork, except furniture; manufacture of articles of straw and plaiting materials
r8 Manufacture of paper and paper products
r9 Printing and reproduction of recorded media
r10 Manufacture of coke and refined petroleum products
r11 Manufacture of chemicals and chemical products
r12 Manufacture of basic pharmaceutical products and pharmaceutical preparations
r13 Manufacture of rubber and plastic products
r14 Manufacture of other non-metallic mineral products
r15 Manufacture of basic metals
r16 Manufacture of fabricated metal products, except machinery and equipment
r17 Manufacture of computer, electronic and optical products
r18 Manufacture of electrical equipment
r19 Manufacture of machinery and equipment n.e.c.
r20 Manufacture of motor vehicles, trailers and semi-trailers
r21 Manufacture of other transport equipment
r22 Manufacture of furniture; other manufacturing
r23 Repair and installation of machinery and equipment
r24 Electricity, gas, steam and air conditioning supply
r25 Water collection, treatment and supply
r26 Sewerage; waste collection, treatment and disposal activities; materials recovery; remediation activities and other waste management services
r27 Construction
r28 Wholesale and retail trade and repair of motor vehicles and motorcycles
r29 Wholesale trade, except of motor vehicles and motorcycles
r30 Retail trade, except of motor vehicles and motorcycles
r31 Land transport and transport via pipelines
r32 Water transport
r33 Air transport
r34 Warehousing and support activities for transportation
r35 Postal and courier activities
r36 Accommodation and food service activities
r37 Publishing activities
r38 Motion picture, video and television programme production, sound recording and music publishing activities; programming and broadcasting activities
r39 Telecommunications
r40 Computer programming, consultancy and related activities; information service activities
r41 Financial service activities, except insurance and pension funding
r42 Insurance, reinsurance and pension funding, except compulsory social security
r43 Activities auxiliary to financial services and insurance activities
r44 Real estate activities
r45 Legal and accounting activities; activities of head offices; management consultancy activities
r46 Architectural and engineering activities; technical testing and analysis
r47 Scientific research and development
r48 Advertising and market research
r49 Other professional, scientific and technical activities; veterinary activities
r50 Administrative and support service activities
r51 Public administration and defence; compulsory social security
r52 Education
r53 Human health and social work activities
r54 Other service activities
r55 Activities of households as employers; undifferentiated goods- and services-producing activities of households for own use
r56 Activities of extraterritorial organizations and bodies
2) Degree distributions:
Figure 27: degree distribution for selected years, agg. and ext. networks
The histograms, in particular those of the external graphs, could suggest a power-law distribution,
being quite skewed toward extreme values, while in the aggregate graphs the frequency tends to
decrease more slowly across degree levels. However, I tested all the distributions, both of the
aggregate and of the external graph, with the function fit_power_law() of the R version of the
I-Graph library, selecting 20 as minimum degree, which I found reasonable looking at my data. The
p-value of the Kolmogorov-Smirnov test returned by fit_power_law(), applied to each distribution,
indicated that the hypothesis that the data follow a power law was to be rejected: I always
obtained p-values smaller than 0.05.
3) indices
3.1) Clustering coefficient:
Given a network, the local clustering coefficient of a node is the ratio of the triangles to the
triples identifiable at that node; the triples are the pairs of neighbours of the node, whether or
not they are connected to each other, and the triangles are the pairs of neighbours that are also
connected to each other. The average clustering coefficient is the average value of the clustering
coefficient of the nodes of a network:
$$\text{Average Cl. Coefficient} = \frac{1}{N} \sum_{i} \frac{\text{n° triangles node}_i}{\text{n° triples node}_i}$$

$N$ = nodes in the graph
Figure 28: components of the clustering coefficient calculation
There are also directed and weighted versions of the clustering coefficient, and algorithms able to
calculate them, but I was interested in assessing the complexity of the graph in very general terms;
thus, in order to calculate that measure I used the transitivity_avglocal_undirected() function of
I-Graph, which simply ignores the directions of edges.
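The ratio of triangles to triples can be illustrated with a minimal sketch in plain Python. It is not the I-Graph implementation: the function names and the toy edges are hypothetical, edge directions are ignored as in the text, and nodes with fewer than two neighbours are counted as 0, which is an assumption of this sketch and may differ from the library's default.

```python
from itertools import combinations


def undirected_neighbours(edges):
    """Build neighbour sets from directed edges, ignoring their direction."""
    neighbours = {}
    for source, target in edges:
        if source == target:
            continue  # self-loops do not form triples
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)
    return neighbours


def local_clustering(neighbours, node):
    """Ratio of triangles to triples centred on `node`."""
    around = neighbours[node]
    triples = len(around) * (len(around) - 1) // 2
    if triples == 0:
        return 0.0  # assumption: fewer than 2 neighbours counts as 0
    triangles = sum(1 for u, v in combinations(around, 2) if v in neighbours[u])
    return triangles / triples


def average_clustering(neighbours):
    """Mean of the local coefficients over all nodes."""
    return sum(local_clustering(neighbours, n) for n in neighbours) / len(neighbours)


# Toy graph: a triangle a-b-c plus a pendant node d attached to a.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")]
neighbours = undirected_neighbours(edges)
print(average_clustering(neighbours))
```

In the toy graph, node a has three triples and one triangle, b and c have one triple that is also a triangle, and d has no triples, so the average is (1/3 + 1 + 1 + 0) / 4.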
3.2) Betweenness:
I computed the betweenness centrality through the betweenness() function of the Python library
I-Graph, which follows the formula below, calculated iteratively, for each node, by the algorithm:
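$$\text{Betweenness}_v = \sum_{s \neq t \neq v \in V} \frac{\sigma(s, t \mid v)}{\sigma(s, t)}$$

s ≠ t ≠ v ∈ V (for all possible pairs of nodes which are not v)
σ(s, t) = number of paths with minimum cost from s to t; σ(s, t|v) = number of those paths passing through v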
The figure below summarizes the fact that passing through a node minimizes the cost of a path
between two nodes if the sum of the inward and outward connections of that node from and to
those nodes is higher than the weight of the path directly connecting those two nodes (or, in
general, than that of any other path linking those two nodes through other nodes).
Figure 29: paths among nodes
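The counting of minimum-cost paths behind the betweenness formula can be sketched as follows. This is a brute-force illustration in plain Python, not the algorithm used by I-Graph; the function names and the toy graph are hypothetical, and the sketch assumes that every edge carries a positive integer cost (how trade flows are turned into costs is outside its scope).

```python
import heapq


def shortest_paths_from(adjacency, source):
    """Dijkstra from `source`: minimum cost and number of minimum-cost
    paths to every reachable node (edge costs must be positive)."""
    cost = {source: 0}
    count = {source: 1}
    settled = set()
    heap = [(0, source)]
    while heap:
        c, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        for v, w in adjacency[u].items():
            new_cost = c + w
            if v not in cost or new_cost < cost[v]:
                cost[v] = new_cost
                count[v] = count[u]
                heapq.heappush(heap, (new_cost, v))
            elif new_cost == cost[v]:
                count[v] += count[u]  # another path with the same cost
    return cost, count


def betweenness(adjacency):
    """Sum over pairs (s, t) of the share of minimum-cost paths through v."""
    nodes = list(adjacency)
    paths = {s: shortest_paths_from(adjacency, s) for s in nodes}
    scores = {v: 0.0 for v in nodes}
    for v in nodes:
        cost_v, count_v = paths[v]
        for s in nodes:
            cost_s, count_s = paths[s]
            if s == v or v not in cost_s:
                continue
            for t in nodes:
                if t in (s, v) or t not in cost_s or t not in cost_v:
                    continue
                # v lies on a minimum-cost path from s to t
                if cost_s[v] + cost_v[t] == cost_s[t]:
                    scores[v] += count_s[v] * count_v[t] / count_s[t]
    return scores


# Toy directed graph: going from A to C through B (cost 2) beats A -> C (cost 3).
adjacency = {"A": {"B": 1, "C": 3}, "B": {"C": 1}, "C": {}}
scores = betweenness(adjacency)
print(scores)
```

Integer costs are used so that the equality test between path costs is exact; with floating-point costs a tolerance would be needed.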
3.3) Pagerank centrality:

In order to calculate the Pagerank, I used the pagerank() function of Python I-Graph. To
understand this centrality measure, let us assume a graph like the one below: the probability of
going to a node depends on the probability of being in the nodes which reach it, which in turn
depends on the probability of those nodes being reached from the other nodes in the network, and so
forth (the probability of reaching a node from another node is related to the number of connections
that go from the latter to the former, and vice-versa for the probability of being reached). In
other words, the probabilities of the nodes being visited by a “random surfer” recursively depend
on the probabilities of the other nodes being visited. Thus: how to calculate and rank them?
Figure 30: prob. to reach nodes for a random surfer
The problem can be solved iteratively, assuming that at 𝑡 = 0 the probability of being in a node
is equal for all nodes, i.e. 1⁄𝑁, where 𝑁 is the number of nodes of the network; alternatively, we
can say that the Pagerank score of all nodes is initially 1⁄𝑁. At 𝑡 + 1, the probability of going
to a node 𝑖 is therefore related to the probability of being in the nodes 𝑗 (targeting 𝑖), as
calculated for 𝑡 = 0, and to the probability of going from 𝑗 to 𝑖, which depends on the number (or,
in the weighted Pagerank, on the weights) of the links that go from 𝑗 to 𝑖. The iteration is
repeated until the ranking of probabilities stabilizes. In formula: the Pagerank of a node 𝑖 at
time 𝑡 + 1 depends on the Pagerank at 𝑡 of the nodes 𝑗 which reach it, each multiplied by the
weight 𝑤𝑖𝑗 of the edge from 𝑗 to 𝑖 and divided by 𝐿(𝑗), the number of edges (or the sum of their
weights, in the weighted Pagerank) going out from 𝑗; 𝑗 ∈ M(𝑖) refers to the set of nodes 𝑗 with an
edge to 𝑖. Actually, a damping factor 𝑑 (default: 0.85) is added to the calculation, aiming to score
also the unconnected nodes.
$$PR_{i;t=0} = \frac{1}{N}$$

$$PR_{i;t+1} = \frac{1-d}{N} + d \times \sum_{j \in M(i)} \frac{PR(j;t)\, w_{ij}}{L(j)}$$
In synthesis, in order to have a high Pagerank a node has to be targeted by many nodes which are in
turn important (i.e. with high Pagerank) and which have a relatively small number or weight of
outward connections, indicating that the connections to the receiving node are important from the
standpoint of the targeting nodes [7]. That is the sense behind our statement about Pagerank
(paragraph 5) as able to balance (1) the importance of a sector as importer in terms of
connectivity, (2) the importance for the network of the sectors exporting to that sector and (3)
the importance of that sector for the nodes which export to it.
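The iteration described in this section can be sketched in plain Python as follows. It is a minimal illustration, not the I-Graph implementation: the function name, the toy edges and the stopping tolerance are hypothetical, edge weights are assumed positive, and nodes without outward connections spread their score uniformly, which is an assumption of this sketch.

```python
def pagerank(edges, d=0.85, tol=1e-12, max_iter=1000):
    """Weighted Pagerank by repeated updating.

    `edges` is a list of (source, target, weight) tuples with positive weights.
    """
    nodes = sorted({node for source, target, _ in edges for node in (source, target)})
    n = len(nodes)

    # L(j): total weight of the connections going out from each node.
    out_weight = dict.fromkeys(nodes, 0.0)
    for source, _, weight in edges:
        out_weight[source] += weight

    # t = 0: every node starts with probability 1/N.
    scores = dict.fromkeys(nodes, 1.0 / n)
    for _ in range(max_iter):
        # Assumption: nodes with no outward edge spread their score uniformly.
        dangling = sum(scores[v] for v in nodes if out_weight[v] == 0.0)
        updated = dict.fromkeys(nodes, (1.0 - d) / n + d * dangling / n)
        for source, target, weight in edges:
            updated[target] += d * scores[source] * weight / out_weight[source]
        change = sum(abs(updated[v] - scores[v]) for v in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Toy graph: A is targeted by B and C, and B is targeted only by A.
edges = [("B", "A", 2.0), ("C", "A", 1.0), ("A", "B", 1.0)]
scores = pagerank(edges)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, round(score, 4))
```

In the toy graph, C is reached by no node and therefore keeps only the (1 − d)/N share, while A ranks first because it is targeted by both other nodes.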
4) Community detection
In order to accomplish the community detection, I used two methods. The first is Infomap, which I
applied to the aggregate and the external network. It is associated with the community_infomap()
function of Python I-Graph and is based on the “map equation”, which identifies the community
structure of a graph by partitioning it iteratively and looking for the partition minimizing the
“description length” of the path of a “random surfer” in the network [8]. Intuitively, for a given
partition, the less a random surfer moves from one module to another and the more he visits the
nodes internal to a module rather than those of the other modules, the more probable it is that the
given partition is designated as the best community structure of the graph. I also used the
Fastgreedy method (Python I-Graph function: clusters_fast_greedy()) in order to accomplish a
community detection on the nation-to-nation graph of gross flows. This method is based on the
maximization of modularity: for a given partition, modularity measures how much the structure of
the partition differs from a model constituted by the same groups of nodes, but randomly connected
[9].
Once I detected the communities, I defined as “stable” the ones with a Jaccard Similarity equal to
or greater than 0.50, in order to analyse their evolution over time. In practice, the output of the
above-mentioned algorithms, as I implemented them, is a list of lists of nodes, each of the inner
lists representing a community. However, applying the functions to different network-years, the
problem of recognizing the same communities across different network-years emerges. Thus, I
iteratively compared the elements within the lists of lists I obtained, considering as the same
community (or “stable” over time) the lists of nodes with a Jaccard similarity equal to or greater
than 0.50. The Jaccard Similarity is simply the ratio of the “intersection” to the “union” of the
elements of two sets, thus the ratio of the number of nodes shared by two communities to the
number of nodes comprised in the union of their nodes.
$$\text{Jaccard Similarity}(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
Figure 31: components of the Jaccard similarity
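The matching of communities across network-years can be sketched as follows. This is a minimal illustration in plain Python: the function names and the toy communities are hypothetical and do not reproduce the detected memberships; only the 0.50 threshold comes from the text.

```python
def jaccard(community_a, community_b):
    """Ratio of shared nodes to the nodes in the union of two communities."""
    a, b = set(community_a), set(community_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_stable(communities_first, communities_second, threshold=0.50):
    """Pair the communities of two network-years whose Jaccard similarity
    reaches `threshold`; returns (index_first, index_second, similarity)."""
    matches = []
    for i, first in enumerate(communities_first):
        for j, second in enumerate(communities_second):
            similarity = jaccard(first, second)
            if similarity >= threshold:
                matches.append((i, j, similarity))
    return matches


# Toy lists of lists, shaped like the output described above.
year_one = [["a", "b", "c"], ["d", "e", "f"]]
year_two = [["d", "e", "f", "g"], ["a", "b", "c", "h"], ["x", "y"]]

stable = match_stable(year_one, year_two)
print(stable)
```

Each pair here shares 3 nodes out of 4 in the union, so both communities of the first year are recognized as stable, while the third community of the second year has no counterpart.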
4.1) community membership of the automotive communities (external network)
sectors 2000 comm. 2000 sectors 2007 comm. 2007 sectors 2014 comm. 2014
GBRr20 EU automotive GBRr20 EU automotive GBRr20 EU automotive
DEUr20 EU automotive DEUr20 EU automotive DEUr20 EU automotive
CZEr20 EU automotive CZEr20 EU automotive CZEr20 EU automotive
FRAr20 EU automotive FRAr20 EU automotive FRAr20 EU automotive
ITAr20 EU automotive ITAr20 EU automotive ITAr20 EU automotive
SVNr20 EU automotive SVNr20 EU automotive SVNr20 EU automotive
HUNr20 EU automotive HUNr20 EU automotive HUNr20 EU automotive
AUTr20 EU automotive AUTr20 EU automotive AUTr20 EU automotive
BELr20 EU automotive BELr20 EU automotive BELr20 EU automotive
DEUr28 EU automotive DEUr28 EU automotive DEUr28 EU automotive
ESPr20 EU automotive ESPr20 EU automotive ESPr20 EU automotive
ESPr28 EU automotive ESPr28 EU automotive ESPr28 EU automotive
GBRr28 EU automotive GBRr28 EU automotive GBRr28 EU automotive
ITAr28 EU automotive ITAr28 EU automotive ITAr28 EU automotive
POLr20 EU automotive POLr20 EU automotive POLr20 EU automotive
NLDr20 EU automotive NLDr20 EU automotive NLDr20 EU automotive
SWEr20 EU automotive SWEr20 EU automotive SWEr20 EU automotive
CHEr20 EU automotive CHEr20 EU automotive CHEr20 EU automotive
CHEr28 EU automotive CHEr28 EU automotive CHEr28 EU automotive
AUTr28 EU automotive AUTr28 EU automotive AUTr28 EU automotive
DNKr28 EU automotive DNKr28 EU automotive DNKr28 EU automotive
PRTr20 EU automotive PRTr20 EU automotive PRTr20 EU automotive
TURr20 EU automotive TURr20 EU automotive TURr20 EU automotive
NORr28 EU automotive NORr28 EU automotive NORr28 EU automotive
SVKr20 EU automotive SVKr20 EU automotive SVKr20 EU automotive
NORr20 EU automotive NORr20 EU automotive NORr20 EU automotive
CANr20 Pacific automotive CANr20 Pacific automotive CANr20 Pacific automotive
CHNr20 Pacific automotive CHNr20 Pacific automotive CHNr20 EU automotive
JPNr20 Pacific automotive JPNr20 Pacific automotive JPNr20 EU automotive
KORr20 Pacific automotive KORr20 Pacific automotive KORr20 EU automotive
TWNr20 Pacific automotive TWNr20 Pacific automotive TWNr20 EU automotive
USAr20 Pacific automotive USAr20 Pacific automotive USAr20 Pacific automotive
AUSr20 Pacific automotive AUSr20 Pacific automotive AUSr20 Pacific automotive
IDNr20 Pacific automotive IDNr20 Pacific automotive IDNr20 EU automotive
MEXr20 Pacific automotive MEXr20 Pacific automotive MEXr20 Pacific automotive
BRAr20 Pacific automotive BRAr20 Pacific automotive BRAr20 Pacific automotive
INDr20 Pacific automotive INDr20 Pacific automotive INDr20 EU automotive
USAr28 Pacific automotive USAr28 Pacific automotive USAr28 Pacific automotive
AUSr28 Pacific automotive AUSr28 Pacific automotive AUSr28 EU automotive
TURr29 EU automotive TURr29 EU automotive
BRAr13 Pacific automotive BRAr13 Pacific automotive
SVNr13 EU automotive SVNr13 EU automotive
RUSr20 EU automotive RUSr20 EU automotive
POLr28 EU automotive POLr28 EU automotive
CANr35 Pacific automotive CANr35 Pacific automotive
MEXr28 Pacific automotive MEXr28 Pacific automotive
TURr30 EU automotive
CANr34 Pacific automotive
MEXr39 Pacific automotive
ROUr20 EU automotive
BGRr13 EU automotive
BGRr20 EU automotive
RUSr28 EU automotive
POLr23 EU automotive
BRAr28 EU automotive
POLr30 EU automotive
DNKr20 EU automotive
FINr20 EU automotive
CZEr28 EU automotive
ESTr20 EU automotive
IRLr20 EU automotive
INDr28 EU automotive
MEXr34 Pacific automotive
References
[1] Cerina F, Zhu Z, Chessa A, Riccaboni M, World Input-Output Network, PLoS ONE 10(7), 2015.
Available at: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0134025
[2] Amador J, Cabral S, A Network Analysis of Value Added Trade, 2015. Available at:
https://www.researchgate.net/publication/277312869_A_network_analysis_of_value_added_trade
[3] Luu D T, Napoletano M, Fagiolo G, Roventini A, Sgrignoli P, Uncovering the Network Complexity
in Input-Output Linkages among Sectors in European Countries, Working Paper, 2017. Available at:
http://www.isigrowth.eu/2018/05/19/uncovering-the-network-complexity-in-input-output-linkages-among-sectors-in-european-countries-2/
[4] Miller R E, Blair P D, Input-Output Analysis: Foundations and Extensions, Cambridge University
Press, Cambridge, 2009.
[5] World Input Output Database (WIOD), 2016 release. Available at:
http://www.wiod.org/release16
[6] Timmer M P, Dietzenbacher E, Los B, Stehrer R, De Vries G J, An Illustrated User Guide to the
World Input–Output Database: the Case of Global Automotive Production, Review of International
Economics, April 2015.
[7] Page L, Brin S, The PageRank Citation Ranking: Bringing Order to the Web, Stanford, 1998.
Available at: http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf
[8] Bohlin L, Edler D, Lancichinetti A, Rosvall M, Community detection and visualization of
networks with the map equation framework. Available at:
https://www.mapequation.org/assets/publications/mapequationtutorial.pdf
[9] Newman M E J, Girvan M, Finding and evaluating community structure in networks, Physical
Review E, February 2004.
- Coding languages, packages and tools:

Python Igraph: https://igraph.org/python/
R Igraph: https://igraph.org/r/
Numpy: https://numpy.org/doc/
Pandas: https://pandas.pydata.org/pandas-docs/stable/
Matplotlib: https://matplotlib.org/contents.html
Squarify (used for the treemaps of the betweenness and pagerank):
https://github.com/laserson/squarify
RAWgraph (used for the alluvial plots of the community detection of the external graph):
https://rawgraphs.io/