Computer Networks
journal homepage: www.elsevier.com/locate/comnet

Article history:
Received 8 February 2018
Revised 14 September 2018
Accepted 29 October 2018
Available online 3 November 2018

Keywords:
Post-Disaster recovery
Satellite and terrestrial networks
Data evacuation
Software-Defined networks
Mega constellations
Traffic engineering

Abstract

Large-scale disasters can severely disrupt Information Technology (IT) infrastructure, e.g., Data Centers. Earthquakes, hurricanes, tsunamis, and other natural catastrophes may lead to such a scenario. Man-made threats, e.g., a High-Altitude Electromagnetic Pulse (HEMP), can also provoke such an aftermath. In the HEMP case, however, the aftermath might include damage even to non-terrestrial IT infrastructure, such as satellites. After any disaster, it is important that critical data located in the affected region be evacuated to secure locations, where it can be useful for emergency operations, mission-critical activities, rescue and relief efforts, and society and businesses in general. To minimize the time it takes to perform this evacuation, we must use all available resources as efficiently as possible. This includes using the remaining satellites to connect the affected regions of the network to the unaffected ones. Utilizing the Software-Defined Network (SDN) paradigm applied to satellite networks, we propose an algorithm that can be executed by the SDN controller. This algorithm generates an evacuation plan for data located in possibly-isolated terrestrial systems, such as Data Centers, through the satellite network, towards final destinations in the main network. The evacuation plan is a transmission schedule that maximizes the amount of evacuated data. Considering the current industrial interest in mega satellite constellations, we compare how two constellations, of 66 and 720 satellites, perform in terms of amount of data evacuated. Our results show how the evacuation is affected by different satellite constellation configurations (i.e., buffers, inter-satellite link capacities, etc.). Since our approach allows for Traffic Engineering (TE) to be performed, we also demonstrate how it enables fair resource utilization among different affected infrastructures during data evacuation. Our illustrative examples also compare our method to an approach designed for Delay-Tolerant-Vehicle networks and show how our solution can evacuate up to 60% more data after a disaster.

© 2018 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.comnet.2018.10.019
R.B. R. Lourenço, G.B. Figueiredo and M. Tornatore et al. / Computer Networks 148 (2019) 88–100 89
most impactful disasters. In this study, we focus on the occurrence of a HEMP, but our contributions are applicable to other disasters.

Fig. 1 depicts a possible impact zone of a HEMP detonation. This zone might span hundreds of miles, in which electronic equipment might be completely damaged. In the example, nodes 28 and 29 stop functioning, which forces all communication coming from and going to nodes 34, 35, and 36 to go through the link from node 33 to node 40. As already observed in past disasters, a peak in offered traffic commonly occurs right after the disaster. After the Great Hanshin-Awaji earthquake of 1995 (a.k.a. the Kobe earthquake), for example, a peak of 50x the usual traffic created a network congestion that lasted for five days [4]. Hence, it is very likely that the link between node 33 and node 40 is heavily congested after the disaster. In practice, this scenario is similar to a situation where nodes 34, 35, and 36 become completely disconnected from the main network, i.e., where the network becomes fragmented.

After any disaster, it is important that critical data located in the affected region be evacuated to safe destination storages in the unaffected part of the network (referred to as the main network), where it might be useful to rescue efforts, mission-critical operations, and society and businesses in general. To perform the evacuation, the use of alternative means of communication, such as satellites, is very important, whether the network has been severely damaged (as in Fig. 1) or even fragmented. Nonetheless, since even the satellite constellation might have been damaged — particularly after the detonation of a HEMP — it might be the case that satellite services become intermittent.

Low-Earth Orbit (LEO) satellites are constantly moving around orbits of low altitude (typically around 2,000 km for circular orbits). Accordingly, such systems are able to provide low-latency communication. To stay in such orbits, LEO satellites must maintain constant motion with respect to Earth's surface, taking them around 90 min to revolve around the globe. Thus, to provide continuous coverage to any point on Earth's surface, multiple satellites are necessary. In contrast, one Geostationary (GEO) satellite can cover a region of Earth's surface continuously, albeit at a distance of around 36,000 km. A group of satellites working in concert is called a constellation. Although LEO satellites are more susceptible (than higher-orbit satellites, such as GEO) to being harmed by events such as a HEMP, even in the dire scenario of constellation fragmentation, there are still different reasons why LEO satellites are good candidates to evacuate critical data, as follows:

• Alternative evacuation means, such as physical evacuation of hard drives, would incur long delays;
• Higher-orbit unaffected satellites (e.g., GEO) would likely be used by emergency operations for always-on communications;
• Satellite Ground Stations (GSs) can be quickly deployed by emergency operations and can possibly be located in radiation-protected silos; and
• Currently, there are commercial plans to deploy at least two mega LEO constellations, one having more than 700 satellites [5] and another more than 4000 [6].

As Software-Defined Networking (SDN) technology is introduced to satellite networks [7], it allows for complex orchestration of the network elements, which further motivates the use of satellites to evacuate critical data after disasters. As will be further explored, we utilize the SDN control plane to compute a data-evacuation strategy and, accordingly, to enforce that strategy on the network elements. The SDN control plane also enables the orchestration of different network domains [8], creating an arrangement of different networks (or domains) that allows for the maximization of the evacuated data, even when multiple network fragments are disconnected from the main network. In this last scenario, the control plane can use Traffic Engineering (TE) techniques to achieve fairness between different isolated components, avoiding unnecessary competition for potentially-scarce post-disaster satellite resources.

We focus on a scenario where a part of the terrestrial network is severely damaged. Our contributions, however, also apply to a strenuous scenario where the terrestrial network becomes fragmented (and maybe even the satellite network, too), as a possible aftermath of a HEMP. We focus on actions to be taken after the HEMP detonation.

To devise an efficient data-evacuation plan after a disaster, we leverage properties of satellite networks and Delay-Tolerant Networks (DTN). We utilize knowledge such as the geographical locations of terrestrial network infrastructure and satellite orbits to formulate an evacuation strategy from systems within the affected regions to destinations in the main network. This evacuation strategy consists of a transmission schedule that determines when satellite GSs should transmit to specific satellites (and vice versa), and when satellites should transmit to other satellites (as well as when they should store data in their buffers). This schedule is such that it maximizes the amount of data evacuated from the affected region using satellite links that were not affected by the HEMP detonation. It also contains time information, so that data that must reach its destination more urgently can be sent at the appropriate transmissions during evacuation. Our numerical examples show how the evacuation plan performs for different sizes of constellations, and how TE can be beneficial to the evacuation. Our examples also demonstrate how our method can evacuate up to 60% more data after a disaster, when compared to a solution specifically designed for Delay-Tolerant-Vehicle Networks.

The rest of this work is organized as follows: Section 2 briefly reviews the literature related to the impact of disasters on IT infrastructure. Section 3 analyzes the impact of HEMPs on satellites. Section 4 details the system architecture and problem statement. Section 5 describes how our algorithm works, and Section 6 discusses the scalability of the algorithm. Section 7 discusses the lessons learned using practical numerical examples. Section 8 concludes the study.

2. Related works

In this section, we review the relevant literature. To the best of our knowledge, no other work has focused on evacuating data from a disaster-affected region by utilizing both terrestrial and aerial networks. Thus, in Section 2.1, we review studies that have focused on evacuating data from disaster-affected regions (either through wired or wireless networks, without using both). Because this study utilizes methods similar to algorithms proposed for DTN, in Section 2.2, we briefly review some studies that have focused on routing in these environments.

2.1. Post-Disaster data evacuation and first response

The vulnerability of terrestrial networks to disasters is a problem that has been addressed in different studies. The importance of disaster resiliency in communication networks was analyzed in [9]. The resiliency of Wavelength-Division Multiplexed (WDM) networks to disasters was investigated in [10]. The problem of disaster survivability in optical networks was investigated in [2]. A progressive recovery approach for virtual infrastructure services after disasters was investigated in [11]. A method to protect anycast and unicast flows against attacks and disasters was studied in [12]. The effects of electromagnetic pulses on communication networks have been considered in [13]. None of the works above has focused on actions to be taken after disasters in order to retrieve important information that was stored in the affected region, which is the subject of this work.
[Fig. 4 (graphic not reproduced): (a) an example topology with nodes Isolated, GS A, GS B, GS D, GS E, DC, Sat 1, and Sat 2; (b) the contact timeline over time units 1–8, with contacts GS A→Sat 1, Sat 1→GS D, GS B→Sat 2, and Sat 2→GS E; (c) the resulting Contact Graph with atomized contact labels such as i: 1s, 1Mbit and k: 3s, 2Mbit.]

Fig. 5. Event-Driven Graph representation of the Contact Graph from Fig. 4. Bold edges are buffers. Solid edges are non-intermittent links. Dashed edges are intermittent links. Dotted lines connect a dummy source to all occurrences of the Isolated vertices and all DC vertices to a dummy sink. Edges with zero capacity are not shown. All capacities are in Mbit.
The partially-overlapping contact of Fig. 4b can be represented by the smaller completely-overlapping contacts of each interval (slice) i, j, k, l, m, and n. The output of the Contact Atomization step is a list of either non-overlapping or completely-overlapping contacts, each labeled α: δ, γ, where α identifies the slice, δ the instant when the contact happens, and γ the data transmission capacity of the contact. In Fig. 4b, contact GS A → Sat 1 is atomized to edges i: 1s, 1Mbit; j: 2s, 1Mbit; and k: 3s, 2Mbit.

5.3. Contact graph

This step consists of generating a graph GC that represents the topology of the network, where contacts between satellites and GSs (i.e., intermittent links) are represented by edges labeled according to the moment they occur and with capacity equal to the link transmission rate multiplied by the contact duration. In this example, since all contacts are atomized, the Contact Graph shown in Fig. 4c has more intermittent edges than the original topology of Fig. 4a. Note that we also include non-intermittent links in the Contact Graph. To do so, instead of atomizing and labeling these edges, we set their time label to "−" and their capacity equal to the link data rate (see edges Isolated → GS A, Isolated → GS B, GS D → DC, and GS E → DC in Fig. 4c).

5.4. Event-Driven graph

Based on the Contact Graph GC generated above, we now create an Event-Driven Graph GT. The current example of Fig. 4c can be transformed into the Event-Driven Graph shown in Fig. 5. The Event-Driven Graph can be understood as a 3D structure. The first dimension (X) represents intermittent links, the second dimension (Y) represents non-intermittent links, and the third dimension (Z) represents buffer edges.

GT is a directed graph that removes the time dimension from GC's edges by representing each physical node A with a set of vertices, one for each instant ti when node A participates in some atomized contact (each of these vertices in GT being labeled "A,ti", where the set of ti are all the times when an atomized contact involving node A happens). When a contact happens at ti and the very next contact happens at tj, we connect A,ti to A,tj with an edge representing node A's buffer (thus, the capacity of such an edge is equal to that node's buffer). In Fig. 5, vertices Sat1,1, Sat1,2, Sat1,3, and Sat1,5 represent Sat1 of the Contact Graph in Fig. 4c. Note that the bold edges Sat1,1 → Sat1,2, Sat1,2 → Sat1,3, and Sat1,3 → Sat1,5 represent the buffer of Satellite 1. Accordingly, an intermittent edge between nodes A and B with a labeled time ti from graph GC is translated to an edge from A,ti to B,ti in GT. The capacity of such an edge is the same for both GC and GT. In our example, edge A,1 → Sat1,1 of Fig. 5 represents atomized contact GS A → Sat1, labeled i: 1s, 1Mbit, of the graph in Fig. 4c.

Note that we consider non-intermittent links in the Event-Driven Graph GT. To this end, if there is a contact at time t between nodes A and B, we add vertices representing all nodes C, D, E, ... that are reachable from A, and all nodes J, K, L, ... that are reachable from B, through non-intermittent links in GC. Each of these vertices carries the label t, for every t when there is a contact in a node A that can reach them. The capacities of the edges that interconnect these vertices of GT are equal to the equivalent GC edge capacity multiplied by the duration of the contact that starts at t. Indeed, this consists of replicating the whole contiguous network attached at nodes A and/or B whenever there is a contact involving these nodes. Thus, GT will have a set of vertices for each node C reachable from some node A for every time t that A participates in a contact with some other node. Like before, the vertices representing C at each t, t′, t′′, etc., are also connected by edges leaving C,t′ and reaching C,t′′ with capacity equal to the buffer of C. The vertices B,1 and B,2 of the graph in Fig. 5, for example, are only included at the t = 2 XY-plane because of the non-intermittent links Isolated → GS A and Isolated → GS B of the graph in Fig. 4. Because GS B has no buffer, there is no edge connecting B,1 and B,2 in Fig. 5.

Since bidirectional connections prevail in both terrestrial and satellite networks, we introduce a method to represent bidirectional contacts in the Event-Driven Graph GT. Because the contact atomization process slices partially-overlapping contacts, several atomized contacts end up happening simultaneously. Compared to the overall evacuation period, terrestrial links, satellite-to-satellite links, and satellite-to-GS links have negligibly-small propagation delay; thus, traversing from one node to any other at one single instant t is assumed to be possible (given that there is a route between them). This process is performed in the translation from the undirected Contact Graph of Fig. 6a to the directed Event-Driven Graph of Fig. 6b. In this example, edges S1 → T0, T1 → S0, F1 → T0, and T1 → F0 refer to the same instant t. Also, in Fig. 6, we omit the contacts that nodes S and T will perform in future t′ and t′′ (i.e., the edges of vertices S0,t′ and T0,t′′ are not shown), but the buffer edges connecting these nodes at t to their respective representations in the following t′ and t′′ are shown. The capacities of the edges S0 → S1, T0 → T1, and F0 → F1 are the switching capacities SWA of each node A (in this work, we consider these infinite). The link between nodes T and F is non-intermittent. Thus, nodes F0,t and F1,t represent F at instant t. Note that it is possible to flow data from F0 to S1 in the same instant t.

We present pseudo-code for this step in Algorithm 1, which creates the directed Event-Driven Graph GT from the bidirectional Contact Graph GC. The notations S0,t/S1,t, T0,t/T1,t, Q0,t/Q1,t, P0,t/P1,t, U0,t/U1,t, and W0,t/W1,t all refer to different bidirectional event-driven vertices, such as the ones exemplified in Fig. 6. The algorithm first creates a copy of the Contact Graph with all intermittent edges removed. Then, the outer for loop of line 5 goes through all intermittent edges of the original Contact Graph. For each edge, it adds a pair of vertices representing the edge's source and destination (connecting them as was discussed) to the set AllVertices. It also adds corresponding edges to AllEdges. For the source and the destination nodes of each intermittent edge, it goes through (for loops of lines 11 and 19) each edge in the connected component of the copied Contact Graph (without intermittent edges) that the node is part of. Then, it adds a pair of vertices representing each of the graph component's nodes to AllVertices and edges to AllEdges. Once all intermittent edges have been considered, the rest of the pseudo-code adds vertices and then edges to the Event-Driven Graph (by first ordering by time-of-contact and then traversing the sets of all edges and vertices).

As we will discuss in the next section, the performance of this step is directly proportional to the size of its input, the Contact Graph. Similarly, the performance of the flow-maximization algorithm, discussed in the next step, is directly dependent on the size of the Event-Driven Graph generated in this step. Since both the Contact and Event-Driven Graphs can be large, it is important to implement this algorithm efficiently.

5.5. Flow maximization

So far, all the procedures shown can be applied in the same way whether the network was fragmented or not (and, if so, whether it was fragmented into two or more fragments). Now, however, some of these cases have to be treated separately.

To maximize the amount of data evacuated from the affected Data Centers to the target Data Center in the main connected network, we execute a flow-maximization algorithm on top of the Event-Driven Graph created. The output of this step is a node-to-node transmission schedule. Thus, we consider two possible post-HEMP scenarios:
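To make the time-expansion idea of Section 5.4 concrete, here is a minimal, self-contained sketch. It is illustrative Python, not the paper's Java implementation and not Algorithm 1 itself; all node names, capacities, and the `(u, v, t, cap)` contact format are assumptions for the example. Each atomized contact becomes an edge (u,t) → (v,t), consecutive appearances of a node are chained by buffer edges, and dummy source/sink vertices attach to the isolated systems and the destination Data Centers.

```python
from collections import defaultdict

def build_event_driven_graph(contacts, buffers, sources, sinks):
    """Build a time-expanded (Event-Driven) graph from atomized contacts.

    contacts: list of (u, v, t, cap) -- atomized contact from u to v at
              instant t, carrying capacity cap (e.g., in Mbit).
    buffers:  dict node -> buffer capacity (data a node may store between
              consecutive contact instants).
    sources:  nodes in the affected (isolated) region.
    sinks:    destination Data Center nodes in the main network.
    Returns a dict edge (tail, head) -> capacity, whose vertices are
    (node, t) pairs plus the dummy 'SRC' and 'SNK' vertices.
    """
    edges = {}
    times = defaultdict(set)  # node -> instants at which it has a contact
    for u, v, t, cap in contacts:
        times[u].add(t)
        times[v].add(t)
        # Contact edge (u,t) -> (v,t); merge capacities of repeated slices.
        edges[((u, t), (v, t))] = edges.get(((u, t), (v, t)), 0) + cap

    # Chain consecutive appearances of each node with buffer edges.
    for node, ts in times.items():
        ordered = sorted(ts)
        for t1, t2 in zip(ordered, ordered[1:]):
            if buffers.get(node, 0) > 0:
                edges[((node, t1), (node, t2))] = buffers[node]

    # A dummy source feeds every occurrence of an affected node, and every
    # destination Data Center occurrence drains into a dummy sink, so a
    # single max-flow computation covers the whole evacuation period.
    inf = float("inf")
    for node in sources:
        for t in times[node]:
            edges[("SRC", (node, t))] = inf
    for node in sinks:
        for t in times[node]:
            edges[((node, t), "SNK")] = inf
    return edges
```

Running a standard max-flow algorithm from `SRC` to `SNK` on the returned capacitated digraph then yields the transmission schedule: each saturated contact edge (u,t) → (v,t) says how much data u should send to v at instant t, and flow on a buffer edge says how much data the node stores between contacts.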
…constellations might evacuate data from the terrestrial topology presented in Fig. 1. The first one, consisting of 66 satellites evenly distributed in six orbits (loosely based on the current IRIDIUM system [32]), generates a Contact Graph that contains about 100 vertices and 7000 edges for an evacuation period of 6 hours. Its corresponding Event-Driven Graph contains about 160,000 vertices and 440,000 edges. The second constellation, of 720 satellites (evenly distributed in 18 planes, loosely based on [5]), for the same evacuation period, yields a Contact Graph of 700 vertices and 30,000 edges. Its corresponding Event-Driven Graph contains almost 4 million vertices and more than 10 million edges.

We present some techniques, based on [33], to help shrink the size of the Event-Driven Graph. Shrinking this graph reduces memory consumption and makes the flow-maximization algorithm run faster. These techniques perform shrinking of vertices into super-vertices and pruning of edges, while maintaining important information about contact times so that the final transmission schedule can still be precisely generated; thus, they do not remove accuracy from the evacuation plan. These techniques can always be used. However, they might take some time to execute and, in a small constellation, this overhead might not compensate for the improvement in the max-flow execution time.

[Fig. 6 (graphic not reproduced): panel (a) shows the bidirectional Contact Graph with nodes S (bufferS), T (bufferT), and F (bufferF), with intermittent edges labeled t, Cap = X; t′, Cap = Y; and t′′, Cap = W, and a non-intermittent edge labeled −, Cap = Z. Panel (b) shows the corresponding event-driven vertices S0,t/S1,t, T0,t/T1,t, and F0,t/F1,t, with switching edges SWS, SWT, and SWF, buffer edges bufferS and bufferT, and contact edges of capacity X and Z·δ.]
Fig. 6. The bidirectional Contact Graph in 6a can be represented as the Event-Driven Graph in 6b. The duration of the slice that starts at t is δ.

6.2.1. Event-Driven graph pruning

We create the pruned graph GPT from the Event-Driven Graph GT by traversing the latter starting at the Source node and recording in a set SReach all the vertices that were reached. Then, we traverse GTInv (the edge-inverted version of GT) starting at the Sink and recording in another set SReachInv all the reached vertices. Finally, we check ∀v ∈ VT whether v was reached in both traversals, i.e., whether v ∈ SReach and v ∈ SReachInv. If not, we remove v and all its incident edges from GT. The resulting graph is GPT. To create GPT from the GT of Fig. 5, vertices B,1; B,2; A,5; A,6; E,2; E,3; E,5; and D,8 are pruned. This procedure does not affect the evacuation calculation because the vertices and edges that are pruned are either not reachable from the affected system (i.e., there is no sequence of transmissions possible that allows such a node to be reached at that time) or, if they are reachable, data cannot leave those nodes and reach the destination Data Centers (i.e., there is no sequence of transmissions leaving that node at that time and reaching the destination Data Centers).

7. Illustrative numerical examples

In this section, we present and compare some illustrative results of the proposed approach. The algorithm was implemented in Java, using Orekit [34] for orbit calculations and the JGraphT library [35] to aid in graph-related computations.

We start by analyzing which factors impact the data evacuation the most (Section 7.1). Then, we study how TE can provide fairness among multiple different network fragments in Section 7.2 (i.e., if the network is severely damaged, to the point of disconnecting multiple elements). Finally, in Section 7.3, we compare our solution to a traditional DTN-based approach that could also be utilized to evacuate data after a disaster.
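The pruning described in Section 6.2.1 amounts to one forward reachability pass from the dummy source and one backward pass from the dummy sink, keeping only vertices found by both. A minimal sketch follows (illustrative Python, not the paper's Java implementation; the edge-dict graph representation is an assumption):

```python
def prune_event_driven_graph(edges, source, sink):
    """Keep only vertices reachable from `source` AND co-reachable to `sink`.

    edges: dict (tail, head) -> capacity.
    Returns the pruned edge dict. Removing vertices that fail either test
    cannot change the max-flow value, since no augmenting path can use them.
    """
    fwd, bwd = {}, {}
    for (u, v) in edges:
        fwd.setdefault(u, []).append(v)   # forward adjacency
        bwd.setdefault(v, []).append(u)   # inverted adjacency (G_T^Inv)

    def reach(start, adj):
        # Iterative depth-first search over an adjacency dict.
        seen, stack = {start}, [start]
        while stack:
            for w in adj.get(stack.pop(), []):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    # SReach ∩ SReachInv: vertices on some source-to-sink path.
    keep = reach(source, fwd) & reach(sink, bwd)
    return {(u, v): c for (u, v), c in edges.items()
            if u in keep and v in keep}
```

For example, a vertex with no outgoing path to the sink (data could arrive there but never leave in time) is dropped along with its incident edges, mirroring how B,1; B,2; A,5; A,6; E,2; E,3; E,5; and D,8 are pruned from the graph of Fig. 5.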
[Figure (caption not recovered): "66 Constellation: Data Evacuated After 12 Hours", with configurations such as "No-buffer, No-ISL".]
• The combination of buffers and ISLs performs the best, but ISLs
evacuate from a GS during a contact also varies. The amount of
have more impact than buffers;
data received from a GS is also subject to the size of the satellite’s
• ISLs are more beneficial than buffers when the constellations
buffer, to the amount of data it can retransmit to other satellites
are not damaged, most notably for the 720 constellation with-
(i.e., its ISL), and, ultimately, to the capacity of the terrestrial net-
out damage. This benefit fades away as the fraction of dam-
work through which the evacuated data will flow before reaching
aged satellites increases, because the damaged satellites create
its destination Data Center.
holes in the covered area, minimizing the number of occasions
In Fig. 9, we explore how different buffers and ISL configura-
in which a GS in the affected region and a GS in the main net-
tions affect the amount of evacuated data for increasing Down-
work are covered by the constellation simultaneously. Hence,
link/Uplink bandwidths after 12 hours of evacuation for the un-
as the satellite network gets more damaged, the satellites start
damaged scenario of each constellation. In each graph, two groups
acting as physical data carriers (using their buffers).
of results are shown: for buffer sensitivity analysis, a group with
• The 720 constellation evacuates more data in the configurations
ISLs and a group without ISLs (dashed lines); for ISL sensitivity
“Buffer, No-ISL”, “No-buffer, ISL”, and “Buffer, ISL”. This is due to
analysis, a group with buffers and a group without buffers (dashed
a smaller reduction in the overall covered area for the 720 con-
lines).
stellation and to the higher amount of total transmitting capac-
In Fig. 9a, the impact of different buffer sizes is investigated
ity in that constellation;
for the 66 constellation. The sizes of buffer has a limited effect on
• A lack of buffers and ISLs hinders more the performance of the
the performance of this constellation (specially in the presence of
720 constellation than the 66 constellation. This is because one
ISLs), as most of its transmissions occur by relaying communica-
satellite in the 720 constellation rarely covers a GS in the af-
tion from GSs in the affected region to GSs in the main network
fected region and, at the same time, a GS in the main network
when both of them are inside the covered area. Thus, the amount
since its coverage area is smaller.
of evacuated data continuously increases as the throughput of the
The performance of the two constellations differs mainly due ground-to-satellite links grows, to the point where it is capped by
to the different covered area per satellite. As the coverage areas the maximum throughput of the terrestrial network.
of satellites in each constellation have different sizes (smaller in Fig. 9b shows the impact of different ISL capacities in the evac-
the 720 constellation), it takes less time for satellites in this con- uation strategy. As in Fig. 9a, the performance of the evacua-
stellation to cross it than it takes for satellites in the 66 constel- tion strategy is more impacted by the capacity of the ground-to-
lation to cross their coverage area. Because of this difference, the satellite links. Fig. 9b shows that the presence of ISLs has a much
amount of data that a satellite in each of these configurations can higher impact than the presence of buffers, while the capacities of
R.B. R. Lourenço, G.B. Figueiredo and M. Tornatore et al. / Computer Networks 148 (2019) 88–100 97
10 30 50 70 90 10 30 50 70 90
Downlink/Uplink (Gbps) Downlink/Uplink (Gbps)
(a) (b)
10 30 50 70 90 10 30 50 70 90
Downlink/Uplink (Gbps) Downlink/Uplink (Gbps)
(c) (d)
Fig. 9. Total evacuated data for the 66 constellation and the 720 constellation for different ground-to-satellite link throughputs.
the ISLs (i.e., when they are present) do not yield much difference, which gets saturated when the Downlink/Uplink capacity is larger
whereas the capacities of the buffers in Fig. 9a have greater influ- than 70 Gbps.
ence on the performance. The above results show that ISLs have a higher impact on the
Fig. 9c also explores different buffer sizes, but for the 720 con- overall amount of evacuated data. This impact is more intense in
stellation. As satellites in this constellation rarely cover a GS in the the 720 constellation. However, in both constellations, the benefits
affected region and a GS in the main network at the same time, the of the ISLs decline when the constellation is severely damaged, to
performance of this constellation is deeply tied to the presence of the point where satellites start acting as physical carriers of data.
buffers and ISLs. The biggest difference in performance occurs be-
tween the group with ISLs and the group without (again, the curve 7.2. Scenario b: Traffic engineering with two fragments
with ISLs and larger buffers perform best). In this case, however,
the amount of evacuated data converges to the maximum evacu- We show how TE — enabled by our approach — can be worth-
ation capacity of the terrestrial network with a smaller increase while. The satellite network considered is the 66 constellation with
in the ground-to-satellite link throughput than in the 66 constella- a 15% damage.
tion. Note, however, that such maximum is reached by three differ- Configuration: In this scenario, the network of Fig. 1 is frag-
ent configurations (i.e., 10 TB buffers and 10 Gbps ISLs; 1 TB buffers mented. Nodes 40 and 35 also stop working. Thus, the terres-
and 10 Gbps ISLs; and 10 TB buffers and no ISLs). The other config- trial network is divided in three: the main network; a component
urations do not achieve that maximum by simply increasing the consisting of nodes 34 and a Data Center; and third component
ground-to-satellite link capacities, i.e., the satellite network gets formed by nodes 36 and another Data Center. DC34 and DC36 (dis-
saturated before the terrestrial network does in these configura- connected from each other) are trying to evacuate their data to the
tions. main network through the impaired LEO satellite constellation to
Finally, Fig. 9d describes the impact of different ISL capacities which they are connected through GSs at nodes 34 and 36 (re-
for the 720 constellation. This plot confirms how the transmis- spectively). This topology is shown in Fig. 10.
sion rate of ISLs is more crucial in increasing the performance of We compare a Max Min Fair (MMF) fractional flow allocation
the network than the presence of buffers for the undamaged 720 constellation. Note how the performances of the 20 Gbps ISLs with buffers, the 30 Gbps ISLs with buffers, and the 30 Gbps ISLs without buffers are very similar. This shows that the limiting factor for these configurations is the capacity of the terrestrial network.

98 R.B. R. Lourenço, G.B. Figueiredo and M. Tornatore et al. / Computer Networks 148 (2019) 88–100

Fig. 10. Post-disaster topology of Scenario B. The network becomes fragmented because nodes 40 and 35 fail as well.

Fig. 11. Scenario B: Data evacuated using MMF vs. a non-TE earliest delivery (shortest-path) approach.

Fig. 12. Data evacuated using our solution vs. a Delay-Tolerant-Vehicle network approach. Our approach evacuates data to different DCs, while a DTN-based approach only evacuates data to DC 1, since it is the only one directly connected to a GS (see Fig. 1).

Fig. 13. Comparison of the amount of data evacuated after 12 hours using our solution relative to that of a DTN-based approach.

policy to an earliest delivery (shortest-path) non-TE approach, similar to the scenarios presented in the previous section. In both cases, flows can be fractioned across different paths. MMF brings flows up to a Pareto-optimal equilibrium, i.e., a state in which no flow can be further increased without forcing some other flow to be lowered. The non-TE approach is the theoretical maximum that a distributed protocol (e.g., Contact Graph Routing [28]) can achieve without utilizing extra tools to implement fairness. Non-TE flows can be routed through ISLs, are aware of the full contact plan, and prioritize shortest paths in this plan.

The results shown in Fig. 11 demonstrate how TE can be implemented in all configurations of Table 1. They show that a non-TE approach to evacuating data does not provide fairness among different components. Without TE, DC34 gets to evacuate significantly more data at the expense of DC36 evacuating significantly less.
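As a side note, the max-min fair allocation that MMF converges to can be computed by the classic progressive-filling procedure. The sketch below is our own minimal illustration of that procedure, not the paper's implementation; the flow routes and link capacities are invented for the example:

```python
def max_min_fair(flows, capacity):
    """Progressive filling. flows: {flow: set of links it traverses};
    capacity: {link: Gbps}. Returns the max-min fair rate of every flow."""
    rate = {f: 0.0 for f in flows}
    active = set(flows)              # flows whose rate can still grow
    residual = dict(capacity)        # spare capacity left on each link
    while active:
        # Number of active flows sharing each link.
        users = {l: sum(1 for f in active if l in flows[f]) for l in residual}
        # Largest equal increment every active flow can still receive.
        delta = min(residual[l] / users[l]
                    for f in active for l in flows[f])
        for f in active:
            rate[f] += delta
        for l in residual:
            residual[l] -= delta * users[l]
        # Freeze every flow that crosses a now-saturated link.
        saturated = {l for l in residual if residual[l] <= 1e-9}
        active = {f for f in active if not (flows[f] & saturated)}
    return rate

# Two flows share link X; flow B is additionally limited by link Y.
rates = max_min_fair({"A": {"X"}, "B": {"X", "Y"}},
                     {"X": 10.0, "Y": 3.0})
# rates == {"A": 7.0, "B": 3.0}: B freezes when Y saturates, A takes the rest.
```

Each iteration raises all still-active flows by the same increment until some link saturates, which is exactly the Pareto condition above: no flow can grow without shrinking another.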
DC34 ends up blocking resources from DC36, because its GS gets
to see satellites first. This effect accumulates such that, after 12
hours, DC36 evacuates 46% less data than it would if MMF had been utilized instead.

The summed amount of data evacuated from both DCs 34 and 36 is the same in both the MMF and non-TE approaches, an observation elucidated in [30]: the max-flow over multiple sources is the same as the max-flow over a single dummy source connected to those multiple sources. Because of the geographical proximity between the GSs of nodes 35 and 34, both have access to the same satellite resources; thus, MMF is able to share capacity between the two DCs equally.

7.3. Comparison with DTN approaches

We now study how our solution compares to a generic DTN-based approach, as in [24]. As mentioned in Section 2, the main difference between [24] and our work is that, during the calculation of the evacuation plan, we consider the terrestrial topology that connects GSs to one another.

If either destination DCs or affected DCs are not directly connected to a local GS, the DTN-based approach cannot be directly utilized; thus, no evacuation schedule can be calculated (whereas our method can still evacuate data in such a situation). Because of this, in this section, we first study a scenario where only one destination DC is locally connected to a GS (Fig. 12), and then a scenario where all destination and affected DCs are connected to local GSs (Fig. 13).

In the results of Fig. 12, we utilize a 1 TB buffer, 10 Gbps ISLs, and the 720 constellation (as described in Table 1, in line "Buffer, ISL"). The terrestrial network is shown in Fig. 1, and all links have 100 Gbps capacity.

Fig. 12 demonstrates how our solution outperforms a DTN-based approach in a scenario where only one of the destination DCs (i.e., the DC at node 1) is directly connected to a GS, while the other DCs are only accessible through the terrestrial network. Our approach is able to evacuate data to DCs 1, 8, 14, and 22, while the DTN-based approach can only reach DC 1. As a result, our solution evacuates 60% more data than the DTN-based approach. GSs will not necessarily be collocated with DCs; thus, for a proper evacuation of data after a disaster, our approach should be utilized instead of approaches that require DCs to be directly connected to GSs.
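The dummy-source construction from [30], used in the comparison above, is straightforward to illustrate: add a super-source with unbounded edges toward every real source, then run any single-source max-flow algorithm. The sketch below is a minimal Edmonds-Karp implementation on an invented toy topology; the node names and capacities are ours, not the paper's:

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp max flow on a nested dict {u: {v: capacity}}."""
    res = defaultdict(lambda: defaultdict(float))   # residual capacities
    for u in cap:
        for v, c in cap[u].items():
            res[u][v] += c
    flow = 0.0
    while True:
        # BFS for the shortest augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in list(res[u].items()):
                if c > 1e-9 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow                              # no augmenting path left
        # Collect the path edges and push the bottleneck amount of flow.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push                        # reverse (residual) edge
        flow += push

# Toy scenario: two GSs evacuate through a shared satellite segment to a DC.
net = {"gs34": {"sat": 8.0}, "gs36": {"sat": 8.0}, "sat": {"dc": 10.0}}
# Dummy super-source feeding every real source through unbounded edges
# turns the multi-source problem into a standard single-source one.
net["S"] = {"gs34": float("inf"), "gs36": float("inf")}
total = max_flow(net, "S", "dc")    # 10.0: limited by the sat->dc bottleneck
```

However the 10 units of total flow are split between the two GSs, their sum cannot change, which is why MMF and the non-TE policy evacuate the same aggregate amount of data.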
We also explore an opposite scenario where all DCs are locally connected to GSs. Even in such a scenario, we note that our method performs similarly to or better than a DTN-based approach. In fact, if the three following factors are true, our solution outperforms a DTN-based approach:

1. The connection between a DC and its local GS provides less bandwidth than the bandwidth between the GS and the affected DC provided by the satellite network (i.e., more data is being downloaded through some GS than its local DC can absorb);
2. The local GS is also connected to a terrestrial network that can evacuate data to other, non-local DCs (i.e., because the DC cannot absorb all the data being downloaded by the local GS, the GS must be able to evacuate the excess data to other DCs through the terrestrial network); and
3. Connections between the other DCs and their local GSs are not completely utilized by the data being evacuated through those GSs (i.e., at least one other DC in the network can absorb the data being downloaded through the GS referenced above).

In the results of Fig. 13, we use the same topology as Fig. 1; however, only nodes 1, 30, 34, and 36 have DCs and GSs (DCs 34 and 36 evacuate data to the others). We also consider a 720 constellation where ISL and uplink/downlink capacities are all 100 Gbps (the same capacity as the terrestrial network links).

An example of the three factors above occurring is as follows: (1) affected nodes 34 and 36 are evacuating data through the satellite network; (2) the disaster created a hole in the satellite coverage, and this hole is, at a certain moment, over GS 1, making it unable to download any data from the satellite network (while GS 30 is still fully connected to the satellite network); and (3) at that particular moment, the amount of data being downloaded from the satellite network is larger than what its local DC (30) can absorb. Thus, GS 30 must reach DC 1 through the terrestrial network so that the evacuated data that cannot be locally stored in DC 30 can be retransmitted through the terrestrial network and stored in DC 1.

In Fig. 13, note that our approach performs just slightly better than the DTN-based solution for a scenario without damage to the satellite network. This is because our study utilizes evenly-distributed satellites in the network, with orbits that allow satellite coverage to be uniform. However, as Fig. 13 shows, when satellite coverage is not so uniform, e.g., due to satellite failures (i.e., the 15% or 30% damaged constellations), the factors listed above occur frequently, resulting in more evacuated data with our solution when compared to the DTN-based approach.2 Thus, our method achieves high volumes of evacuated data not only when some DCs do not have access to local GSs, but also when all DCs are locally connected to GSs.

2 Note that, in practical satellite networks, because coverage is not as uniform as the coverage of our simulations, the three factors listed in this section might occur even without damage to the satellite constellation.

8. Conclusion

We studied how to utilize satellite networks to reestablish communication after a disaster in order to evacuate data from distressed regions in a post-disaster scenario. We focused on the occurrence of a High-Altitude Electromagnetic Pulse (HEMP), but our contribution is also applicable to other types of disasters (namely, the undamaged satellite network scenario). We proposed an algorithm which creates a node-to-node transmission schedule that maximizes the amount of data evacuated from affected regions to the main network. A possible architecture on top of which our proposed algorithm can function was also shown. We demonstrated how several factors impact the performance of the data-evacuation plan in two satellite networks of different constellation sizes. In the simulated scenarios, it is clear that ISLs have a higher contribution to the amount of evacuated data on undamaged satellite networks; and, as constellations are more damaged, buffers start playing an important role. We showed how Traffic Engineering can be implemented to enforce fairness among different disconnected components. We compared our method to a DTN solution and showed that we can achieve up to 60% higher throughput.

There is a growing interest in utilizing aerial platforms for communication purposes. The success in using these aerial platforms in post-disaster scenarios depends on efficient routing and scheduling strategies that allow the maximum amount of data to be evacuated. Future work includes studying how other aerial platforms with flexible trajectories (such as planes and drones) might be used to minimize bottlenecks in the data evacuation plan. Also, the relationship between buffer sizes, ISL throughputs, ground-to-satellite throughputs, and terrestrial network capacities must be investigated to devise an appropriate dimensioning of these elements, particularly when the goal of evacuating a specific amount of data in a certain amount of time is considered.

Acknowledgments

R. Lourenço was funded by CAPES Foundation (Proc. 13220-13-6). M. Tornatore acknowledges the research support from COST Action CA15127. This work was supported in part by the Defense Threat Reduction Agency grant HDTRA1-14-1-0047. We thank Dr. Paul Tandy of DTRA for many helpful discussions. We also thank the anonymous reviewers for their helpful comments, which significantly improved the paper.

References

[1] R.B.R. Lourenco, G.B. Figueiredo, M. Tornatore, B. Mukherjee, Post-disaster data evacuation from isolated data centers through LEO satellite networks, in: Proc. IEEE ICC, 2017.
[2] M.F. Habib, M. Tornatore, F. Dikbiyik, B. Mukherjee, Disaster survivability in optical communication networks, Comput. Commun. 36 (6) (2013) 630–644.
[3] J.S. Foster Jr., et al., Report of the commission to assess the threat to the United States from electromagnetic pulse (EMP) attack: critical national infrastructures, Technical Report, DTIC Document, 2008.
[4] F. Ranghieri, M. Ishiwatari, Learning from megadisasters: lessons from the great east Japan earthquake, World Bank Publications, 2014.
[5] A. Vance, The new space race: one man's mission to build a galactic internet, 2015, http://www.bloomberg.com/news/features/2015-01-22/the-new-space-race-one-man-s-mission-to-build-a-galactic-internet-i58i2dp6.
[6] D. Mosher, SpaceX just asked permission to launch 4,425 satellites, 2016, https://www.businessinsider.com/spacex-internet-satellite-constellation-2016-11.
[7] B. Barritt, T. Kichkaylo, K. Mandke, A. Zalcman, V. Lin, Operating a UAV mesh internet backhaul network using temporospatial SDN, in: IEEE Aerospace Conference, 2017, pp. 1–7.
[8] S. Fichera, et al., On experimenting 5G: testbed set-up for SDN orchestration across network cloud and IoT domains, in: 2017 IEEE NetSoft, 2017, pp. 1–6, doi:10.1109/NETSOFT.2017.8004245.
[9] B. Mukherjee, M.F. Habib, F. Dikbiyik, Network adaptability from disaster disruptions and cascading failures, IEEE Commun. Mag. 52 (5) (2014) 230–238.
[10] P.K. Agarwal, et al., The resilience of WDM networks to probabilistic geographical failures, IEEE/ACM Trans. Netw. 21 (5) (2013) 1525–1538.
[11] M. Pourvali, F. Gu, K. Liang, K. Shaban, N. Ghani, Progressive recovery of virtual infrastructure services in optical cloud networks after large disasters, in: Proc. OSA Optical Fiber Communication Conference, 2016, p. W1B.6.
[12] J. Rak, K. Walkowiak, Reliable anycast and unicast routing: protection against attacks, Telecommun. Syst. 52 (2013) 889–906.
[13] S. Neumayer, E. Modiano, Network reliability with geographically correlated failures, in: Proc. IEEE INFOCOM, 2010, pp. 1–9.
[14] M. Liu, et al., The last minute: efficient data evacuation strategy for sensor networks in post-disaster applications, in: Proc. IEEE INFOCOM, 2011, pp. 291–295.
[15] S. Ferdousi, M. Tornatore, M.F. Habib, B. Mukherjee, Rapid data evacuation for large-scale disasters in optical cloud networks, J. Opt. Commun. Netw. 7 (12) (2015) B163–B172.
[16] A. Bianco, L. Giraudo, D. Hay, Optimal resource allocation for disaster recovery, in: Proc. IEEE GLOBECOM, 2010, pp. 1–5.
[17] J. Yao, P. Lu, Z. Zhu, Minimizing disaster backup window for geo-distributed multi-datacenter cloud systems, in: Proc. IEEE ICC, 2014, pp. 3631–3635.
[18] X. Xie, Q. Ling, P. Lu, W. Xu, Z. Zhu, Evacuate before too late: distributed backup in inter-DC networks with progressive disasters, IEEE Trans. Parallel Distrib. Syst. PP (99) (2017) 1–1.
[19] L. Ma, et al., E-Time early warning data backup in disaster-aware optical inter-connected data center networks, IEEE/OSA J. Opt. Commun. Netw. 9 (2017) 536–545.
[20] M. Erdelj, E. Natalizio, K.R. Chowdhury, I.F. Akyildiz, Help from the sky: leveraging UAVs for disaster management, IEEE Pervas. Comput. 16 (2017) 24–32.
[21] D. Câmara, Cavalry to the rescue: drones fleet to help rescuers operations over disasters scenarios, in: Proc. IEEE Conference on Antenna Measurements Applications (CAMA), 2014.
[22] I.F. Akyildiz, P. Wang, S.-C. Lin, SoftAir: a software defined networking architecture for 5G wireless systems, Comput. Netw. 85 (2015) 1–18.
[23] D. Hay, P. Giaccone, Optimal routing and scheduling for deterministic delay tolerant networks, in: Proc. IEEE WONS, 2009.
[24] F. Malandrino, C. Casetti, C.F. Chiasserini, M. Fiore, Optimal content downloading in vehicular networks, IEEE Trans. Mob. Comput. 12 (7) (2013) 1377–1391.
[25] Y. Zeng, K. Xiang, D. Li, A.V. Vasilakos, Directional routing and scheduling for green vehicular delay tolerant networks, Wireless Netw. 19 (2) (2013) 161–173.
[26] W. Brown, W. Hess, J. Van Allen, Collected papers on the artificial radiation belt from the July 9, 1962, nuclear detonation, J. Geophys. Res. 68 (1963) 605–606.
[27] D.J. Kessler, B.G. Cour-Palais, Collision frequency of artificial satellites: the creation of a debris belt, J. Geophys. Res. 83 (A6) (1978) 2637–2646.
[28] G. Araniti, et al., Contact graph routing in DTN space networks: overview, enhancements and performance, IEEE Commun. Mag. 53 (3) (2015) 38–46.
[29] C. Sheng, Y. Tao, Worst-case I/O-efficient skyline algorithms, ACM Trans. Database Syst. 37 (4) (2012) 26:1–26:22.
[30] N. Megiddo, Optimal flows in networks with multiple sources and sinks, Math. Program. 7 (1) (1974) 97–107.
[31] J.B. Orlin, Max flows in O(nm) time, or better, in: Proc. ACM STOC, 2013.
[32] S.R. Pratt, R.A. Raines, C.E. Fossa, M.A. Temple, An operational and performance overview of the IRIDIUM low earth orbit satellite system, IEEE Commun. Surv. 2 (2) (1999) 2–10.
[33] F. Liers, G. Pardella, Simplifying maximum flow computations: the effect of shrinking and good initial flows, Discrete Appl. Math. 159 (17) (2011) 2187–2203.
[34] V. Pommier-Maurussane, L. Maisonobe, Orekit: an open-source library for operational flight dynamics applications, in: Proc. ESA ICATT, 2010.
[35] B. Naveh, JGraphT, 2011. Internet: http://jgrapht.sourceforge.net.

Rafael B. R. Lourenço received the B.Eng. degree in communication and networks engineering from the University of Brasilia, DF, Brazil, in 2012, and the Ph.D. degree in computer science from the University of California, Davis, CA, USA, in 2018. In the Summers of 2016 and 2017 he interned at Google, working with their Software-Defined Networks team. He was a grantee of the Brazilian Science Without Borders Scholarship during his Ph.D.