Professional Documents
Culture Documents
fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
1
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
2
Recovering
completely disconnect nodes from others but rather adjust
a. t=0 b. t=1
weights to connect more loosely with nodes with a higher
likelihood of infection. Instead of fixing the weights for the
whole spreading process, in the weight adaptation scheme, Weight restoring
each agent dynamically updates their weight in response to Weight decreasing
the state of the neighboring nodes. Weight adaptation is
different from changing the infection rate. The infection rate
d. t=3 c. t=2
is usually considered to be decided by some interior factors
like physiological or immunological states of individual. The
weight between two nodes is usually used to describe how Fig. 2. Weight adaptation scheme to mitigate the infection for the node in
the center over an undirected network. The line width indicates the weight.
strongly two nodes are connected. Changing the weight can Red nodes are the infected ones while blue nodes are the susceptible ones.
be interpreted as an exterior change. end Intro The susceptible node decreases the weight of its connection to an infected
We consider a directed weighted network where the nodes node. Once the infected node is recovered, the weight of the connection to
the recovered node will be restored as it is shown by process c to d. The
and the edges represent the agents and the connections be- purple link and the green link are the ones who adapt the weights.
tween the agents respectively. The directed connection be-
tween two nodes can be considered as one agent acquiring
information/data/packet from another agent. The weight be- each agent’s costate encodes the information about the network
tween two agents quantifies the frequency or the volume of structure and the infection of the whole network.
communication between two agents [6]. The original weight We use a centralized optimal control problem to serve as a
is pre-designed by multilateral agreement among agents to benchmark problem to study the efficiency of the decentral-
achieve certain goals or to optimize the system performance ized problem. Under centralized policies, the system operator
when there is no infection. For example, in distributed es- develops optimal weight adaptation scheme to achieve social
timation or learning problems over networks such as [6], optimum. Compared with the centralized solution, the open-
one agent needs to communicate with its neighboring agents loop NE solution is not the best from a system point of
at a sufficient rate to find the global estimate of the state. view since in the game, agents consider only their own cost.
The optimal weighting on the edges quantitatively captures Such inefficiency caused by selfish behavior of agents has a
the minimum required frequency of contacting neighboring significant impact on network and service management. One
nodes. As illustrated in Fig. 2, when there are wide-spreading example is the congestion in traffic network caused by selfish
virus or cyber attacks, the agent can decrease the likelihood drivers [7]. To address the inefficiency, we propose a dynamic
of being infected by reducing their weight with infected penalty approach by designing a mechanism in which each
neighbors. The agent then restores the connections when the agent pays for the infection cost of all agents that are reachable
infected neighbors are recovered. Deviation of the weights to him/her. We show that with this mechanism, the open-loop
from the optimal ones introduces cost induced by performance NE policy achieves the social optimum.
degradation and system inefficiency. Infected agents may not The equilibrium analysis and the mechanism design lead
function normally. The agents and the network system will to a distributed algorithm for the network operator and the
suffer losses. Thus, it is essential to consider the trade-off agents to compute the optimal weight adaptation where each
between malfunction cost caused by infection and inefficiency agent only has to know local information. We summarize the
or performance degradation cost caused by weight deviation. principal contributions as follows:
In this paper, an N-person nonzero-sum differential game-
based model is proposed to model the virus spreading and the 1) We propose a differential game model to develop a virus-
agents’ adaptive response to virus infection. This model cap- resistant weight adaptation scheme for cyber networks
tures the non-cooperative behaviors among agents, dynamic formed by a group of self-interested agents.
properties of spreading process, and the complexity of the 2) We study the structure of the open-loop NE for the
Paper’s local interactions. We characterize the Nash equilibrium (NE) differential game over complex networks and show the
sections for the game and investigate the network effects under the weight adaptation rule is based on the agents’ and its
non-cooperative strategies. We observe that under the open- out-neighbors’ infection level as well as the costate.
loop NE, each agent updates his weight based on its own 3) We discuss the inefficiency of the NE. A dynamic
infection level and its out-neighbors infection level as well as penalty scheme is proposed to achieve social optimum
the corresponding component of its costate. When the agent’s for the whole network.
own infection level is high, it does not care much about 4) An implementable distributed virus resistance algorithm
the weight of links to infected out-neighbors. When its out- is proposed to compute the NE-based control policy.
neighbor’s infection level is high, it lowers more weight of the Game theory has long been a useful tool to design strategies
corresponding connection. The corresponding component of on network systems for virus resistance purposes [3], [8], [9].
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
3
In [8], the authors have proposed a network formation game the weight of the edge from node i to j. We assume that graph
that balances multiple partially conflicting objectives such as G has no self-loops.
the cost of installing links, the performance of the network and We denote the original weight adjacency matrix by W o =
the resistance to virus. In their work, an undirected unweighted [wioj ] ∈ R N ×N . Let Ni,o
out (N in ) be the set of out-neighbors
i,o
static network is formed. Hayel et.al. in [3] have studied (in-neighbors) under the original optimal weight pattern W o .
large population game with heterogeneous types of individuals.
They focus on group behavior of certain type in stead of
B. Virus Spreading Model
individual behavior. Besides game theory, other tools such as
impulse control [10], optimal control [11], and optimization With the fact that cyber network nodes do not have human-
[12] have been used to design strategies to mitigate malware like auto-antibody/vaccination which can prevent individual
attacks and virus spreading. from being infected again, we study the so-called susceptible-
Virus spreading over adaptive networks has first been stud- infected-susceptible (SIS) models. Consider a population of N
ied by Gross et al. in [13]. They investigated adaptive behavior agents. Each agent can be either susceptible (S) or infected (I).
in a homogeneous way where the whole network takes the Infected individuals infect others at rate βi ≥ 0. The intensity
same adaption. Based on the work on epidemic spreading over of interaction between vi and v j is described by the weight
time-varying networks [14], optimal control method has been wi j ∈ R. Denote wi = (wi1, ..., wi N )0 ∈ R N . We assume that
utilized to find the optimal time-varying topology response for the weight is bounded by w̄i j ∈ R. If vi ∈ V is susceptible
the network system in [11]. However, the centralized optimal while v j ∈ V is infected, there is possibility that vi will be
control method is not practical and lack of incentive. The infected after the interaction. In addition, each infected agent
effect of heterogeneous weight adaptation on virus spreading returns to the susceptible state at some rate σi . The state of a
has been studied by Yun et al. in [15]. In [15], the authors node i at time t ≥ 0 is a binary random variable Xi (t) ∈ {0, 1},
have proposed a weight adaptation rule without taking cost with Xi (t) = 0 (Xi (t) = 1), indicating that agent i is susceptible
into consideration. The weight adaptation rule is based on the (infected). The state vector of all N agents is denoted by
infection level of the whole network. X(t) = (X1 (t), X2 (t), ..., X N (t))0 ∈ {0, 1} N . With the adaptive
Vaccination and immunity have been studying for control of weight wi j (t) from agent i to j, the stochastic state transitions
virus spreading over decades [10], [12], [16]. But vaccination of node vi from time t to t + ∆t can be written as follows:
may not be efficient for some malware and virus due to P(Xi (t + ∆t) = 1|Xi (t) = 0, X(t))
their fast upgrading and undetectability. Also, getting every ÕN
individual vaccinated is costly. Quarantining [5] is equivalent = wi j β j X j (t)∆t + o(∆t), (1)
to removal of all connections of one agent. Compared with j=1
weight adaptation scheme, it is overreacting to disconnect all P(Xi (t + ∆t) = 0|Xi (t) = 1, X(t)) = σi ∆t + o(∆t).
links since connection with healthy agents cause no harm.
The paper is organized as follows. In Sect. II, preliminaries The model (1) is computationally challenging under large-
are given and the N-person nonzero-sum differential game scale networks due to the exponentially increasing size of the
framework is introduced. Section III describes the open-loop state space. Hence, we resort to mean-field approximation of
NE of the differential game and the weight adaptation scheme. the Markov process [14], [17], [18]. Denote xi (t) ∈ [0, 1] as the
Sect. IV studies the efficiency of the NE solution. Comparisons probability of agent i being infected at time t. The mean-field
of the differential game-based weight adaptation scheme with approximation then provides
the optimal control based scheme and other numerical results N
Õ
are given in Sect. V. Conclusions are contained in Sect. VI. xÛi (t) = (1 − xi (t)) wi j (t)β j x j (t) − σi xi (t), (2)
j=1
II. P RELIMINARIES AND P ROBLEM F ORMULATION for i = 1, 2, ..., N. To write this dynamics equation in a more
In this section, we introduce notations and preliminary compact form, denote x(t) = (x1 (t), ..., x N (t))0. We have
results needed in our derivations. Along the way, we describe xÛ (t) = G(x(t), W(t)), (3)
and develop the problem formulation.
where G(·, ·) : R N × R N ×N → R N which can be written as
G(x(t), W(t)) = (W(t)B−D)x(t)−Xd (t)W(t)Bx(t) where W(t) =
A. Graph Theory [wi j (t)] N ×N , B = diag(β1, ..., β N ), D = diag(σ1, ..., σ2 ), and
A weighted, directed graph can be defined by a triple G , Xd (t) = diag(x1 (t), ..., x N (t)).
(V, E, W). V , {v1, v2, ..., v N } represents a set of N nodes. According to the discussion in [14], the n-intertwined model
Define N , {1, ..., N }. A set of directed edges is denoted by (2) gives an upper-bound for the exact probability of infection,
E ⊆ V × V. The set of in-neighbors of node i is defined as xi (t). However, the mean-field approximation consider herein,
Niin , { j | j ∈ V, ( j, i) ∈ E}. Denote by | · | the cardinality while it is an approximation, is well constructed because the
of a set. So, the in-degree of vi is |Niin |. Similarly, the set scale of networks, i.e., N in our model is large and we focus
of out-neighbors of vi is Niout , { j | j ∈ V, (i, j) ∈ E}. The on the cases where β/σ is above the threshold [18].
out-degree of vi is |Niout |. The weight adjacency matrix G is The graph and the epidemic spreading process can be
denoted by an N × N matrix W , [wi j ] where wi j refers to viewed as physical constraints. The agents in the network
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
4
are coupled by these constraints while trying to minimize and controls. Nodes interact with their neighbors. The network
their own cost. Such behaviors lead to differential games over topology is captured by wi j , i ∈ N, j ∈ N . The time-varying
networks, which will be introduced in the following section. property of the network is described by wi j (t) for t ∈ [0, T].
Remark 2. Information structure determines the state infor-
C. Differential Game Over Networks mation gained and recalled by players at time t. The reasons
As we mentioned in Section I, the self-interested agents aim why we adopt open-loop policies are three-fold. First, the
to minimize their own costs. One cost arise from malfunction obtained open-loop policy can be implemented as a feedback
caused by infection. Another cost for agent i is to describe policy [19] as is shown in Section III. Since the dynamics (3)
inefficiency or degradation of system performance caused by is determined, the state at any time can be computed and used
deviation from the original weight wioj for all j ∈ N . We to determine the control policy. Second, to obtain a strongly
consider the original weight as an optimal weight under which time-consistent optimal and individual feedback policies, we
the agent can achieve the most benefit. have to resort to techniques of dynamic programming. How-
For agent i, the infection cost function, given by fi : ever, a direct application of dynamic programming will not
[0, 1] → R+ , is a function of xi (t) ∈ [0, 1]. fi is assumed yield an individual feedback policy. Also, computation of the
to be monotonically increasing to capture the loss of being feedback control law derived from Hamilton–Jacobi–Bellman
infected. A weight cost function for edge from i to j is given equation requires solving nonlinear PDEs which increases the
by gi j (wi j (t) − wioj ) where gi j : R → R+ is convex. The difficulty of distributed implementation. Third, obtaining open-
function satisfies gi j (w) = 0 at and only at w = 0 for all loop policy resorts to maximum principle which well presents
i, j ∈ N because the original weight is optimal to the agent the structure of the optimal solution. This helps us to analyze
when there is no infection. It is optimal in terms of the tradeoff the inefficiency of the NE and obtain a penalty function to
between price and performance. The marginal cost of deviation achieve social optimum as is shown in Section IV.
from the optimal weight will increase as the distance from the
adapted weight to the optimal weight increases. Considering
a time duration from 0 to T ≥ 0, the cost function of agent i
III. A NALYTIC R ESULTS
during time interval [0, T] is given as follows by
∫T N
Õ The solutions to the N-person non-cooperative nonzero-sum
Ji = fi (xi (t)) + gi j (wi j (t) − wioj )dt. (4) differential game (5) played with an open-loop information
0 j=1 structure are open-loop Nash equilibria.
As each node determines its own weight adaptation policy, Definition 1. The weight adaptation trajectories or say the
it naturally leads to a differential game framework defined as control trajectories {w∗i , i ∈ N } constitute an open-loop NE
follows. Consider N agents in the network as N players with solution of the differential game (5) if the inequalities
an index set N = {1, ..., N }. The duration of the evolution of
the game is given by the time interval [0, T]. Denote x(t) = J1 (w∗1, w∗2, ..., w∗N ) ≤ J1 (w1, w∗2, ..., w∗N )
(x1 (t), ..., x N (t))0. Let X = {x ∈ R N |xi ∈ [0, 1], ∀i ∈ N } be
the permissible set of the states. For each fixed t ∈ [0, T], J2 (w∗1, w∗2, ..., w∗N ) ≤ J2 (w∗1, w2, ..., w∗N )
x(t) ∈ X. Let wi (t) = (wi1, ..., wi N ) be the controls of player .. (6)
.
i. The admissible control set for player i is Si = {0 ≤ wi j ≤
w̄i j , ∀ j ∈ N }, i.e., for each fixed t ∈ [0, T], wi ∈ Si ⊂ R N . A JN (w∗1, w∗2, ..., w∗N ) ≤ JN (w∗1, w∗2, ..., w N )
differential equation is given by (2) whose solution describes
the state trajectory of the game corresponding to the N-tuple hold for all control trajectories wi (t) ∈ Si, t ∈ [0, T]. We denote
of control functions {wi (t), 0 ≤ t ≤ T, i ∈ N } and the given xi∗ (t), t ∈ [0, T] the associated state trajectory for i ∈ N .
initial state x0 , (x1 (0), ..., x N (0))0 = (x10, ..., x N 0 )0. Define a
The definition states that at the open-loop NE, no agents
set-valued function ηi (·) for each i ∈ N to characterize the
have incentives to deviate unilaterally away from the optimal
information pattern of player i. We consider the open-loop
trajectory from time 0 to time T.
pattern in our case where ηi (t) = {x0 }, t ∈ [0, T]. We can state
our problem as the following differential game problem: To obtain the necessary conditions for the open-loop NE,
we make two mild assumptions.
min Ji (wi )
wi ∈Si Assumption 1. For each i ∈ N , the infection cost function
(5)
s.t. (2), fi (·) is to be of C 1 class.
where Ji (·) : R N → R and xi (0) = xi0, i = 1, 2, ..., N. Each Assumption 2. For each i, j ∈ N , the weight deviation cost
player aims to find a control policy µi (t, xi0 ) to generate a function gi j (·) is to be of C 1 class.
weight trajectory wi (t). Such control policies are open-loop
ones that depend on the initial condition of the individual state. Each player i ∈ N can decide to receive data or packets
from any other agent. The following observation narrows down
Remark 1. The game defined by (5) is a differential game the set of possible solutions of the open-loop NE.
over networks where the cost only depends on their own state
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
5
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
6
continuous for every i ∈ N, j ∈ N which means that there is always on the way to meet the minimum cost. Also, from
no switching in the optimal weight adaptation. theorem 3 (iii), we know for agent i who has no in-neighbors,
its out-link ui∗j (t) will never be 0 if αi β j /σi ≤ gi0j (−wioj ).
Remark 4. From (13), we know the weight between agents
This can be readily shown by φi j = pii (1 − xi∗ (t))β j x ∗j (t) ≤
i and j may be adapted to zero at certain time as one
(αi /σi )(1 − xi∗ )β j x ∗j (t) < αi β j /σi ≤ gi0j (−wioj ).
can see that ui∗j (t) = 0 if −φi j (t) ≤ gi0j (−ωioj ). That means
the connection between agent i and j may be disconnected
IV. I NEFFICIENCY OF NASH E QUILIBRIUM
temporarily which will be restored according to Theorem
3. For agent i, if all its out-links have weight zero, i.e., It is well known that the non-cooperative NE in nonzero-
−φi j (t) ≤ gi0j (−ωioj ) for all j ∈ Ni,o
out , we can view this agent sum games is generally inefficient [22]. There is need to
as being quarantined from infection. We say being quarantined develop a mechanism to attain a higher social welfare or
from infection because there might still be in-links connecting lower aggregate costs through cooperation behavior [23]. The
to agent i which means here, the concept of being quarantined notion of the price of anarchy has been introduced in [24]
is different from the concept in undirected graph. Besides, the to quantify the inefficiency. In the network, the social cost is
weight adaptation scheme we have proposed is different from the aggregate costs of all players. Let u = {u1, ..., u N } where
ou t
quarantining in a sense that the weight adaptation scheme ui (t) ∈ R | Ni, o | be the weight control variable for the whole
does not need to completely disconnect nodes from all other network with admissible set Uo = {u : ui j (t) ∈ [0, ωioj ], ∀i ∈
N, j ∈ Ni,o out , t ∈ [0, T]}. Denote by uo = {uo, ..., uo } the
others but rather adjust weights to connect more loosely with 1 N
particular nodes with a higher likelihood of infection. social optimal solution. The social optimum can be attained
by solving the optimal control problem:
Corollary 1. If gi j (·) is concave, i.e., the marginal cost of
deviation increases as the adapted weight becomes farther ∫T Õ
N N
Õ Õ
away from the optimal weight, the optimal control policy can min Jo = fi (xi (t))+ gi j (ui j (t) − wioj )dt
u∈Uo
be given as follows: 0 i=1 i=1 j ∈Ni,ou
o
t
Õ
gi j (−wioj ) s.t. xÛi (t) = (1 − xi (t)) ui j (t)β j x j (t) − σi xi (t),
0, φi j (t) ≥ ,
wioj
=
ui∗j (t) j ∈Ni,ou t
gi j (−wioj ) (14) o
wioj ,
φi j (t) < wioj , xi (0) = xi0, i = 1, 2, ..., N,
out , where φ (t) is defined in Theorem 2.
(15)
for i ∈ N, j ∈ Ni,o ij
ou t ou t
where Jo : R | N1, o | × · · · × R | NN, o | → R. The application of
We can see that if gi j (·) is concave, the optimal control maximum principle gives the following: the optimal control
policy switches between 0 and wioj . In this paper, we focus uo (t) and corresponding trajectory xo (t) must satisfy the
our study on the case when gi j (·) is convex. following so-called canonical equations:
Before stepping into the numerical computation of the
Õ
xÛio (t) = (1 − xio ) uioj (t)β j x oj − σi xio (t), xio (0) = xi0,
open-loop NE candidates, we go into further analysis and
j ∈Ni,ou
o
t
obtain other structural results that would be beneficial for an
(16)
insightful understanding of the weight adaptation mechanism.
λ(t)
Û = Γ(t, xo, uo, ..., uo )λ(t) + γ, λ(T) = 0,
1 N (17)
Theorem 3. The costate function and the open-loop control
u (t) = arg min H(t, xo (t), λ(t), u(t)),
o
(18)
trajectories have the following properties: u∈Uo
(i) Along the open-loop NE trajectory, pi j (t) ≥ 0 holds for for all i ∈ N , where Γ(t) is the same with the one given in
all i, j ∈ N, j , i and all t ∈ [0, T]. Furthermore, pii (t) (12) for the dynamics of the costate in the differential game
stays positive for all i ∈ N and all t ∈ [0, T). problem and γ(t) = [− f10(x1 (t)), − f20(x2 (t)), ..., − f N0 (x N )]0, the
(ii) The open-loop NE control trajectory ui∗j (t), for i ∈ N, j ∈ Hamiltonian of the optimal control problem is defined as
Ni,oout , satisfies u∗ (t) = w o at and only at t = T and for
ij ij H(t, x(t), λ(t), u(t))
t < T, ui∗j (t) < wioj .
in | = 0, i.e., the in-degree of player i is zero in N N
(iii) If |Ni,o Õ Õ Õ
= fi (xi (t))+ gi j (ui j (t) − wioj )
the original graph, under linear infection cost functions
i=1 i=1 j ∈Ni,ou t
(19)
fi (xi (t)) = αi xi (t), the components pii (t) are bounded o
above by αi /σi . That is, pii (t) ≤ αi /σi for t ∈ [0, T]. N
Õ
Õ
+ λi (t) (1 − xi (t)) ui j (t)β j x j (t) − σi xi (t) ,
(iv) If |Ni,o out | = 0, i.e., the out-degree of player i is zero in
ou t
the original graph, under linear infection cost function i=1 j ∈Ni, o
where fi (xi (t)) = αi xi (t), the costate component pii (t) is
and λ(·) : [0, T] → R N is the costate function, λi is its ith
strictly monotonically decreasing over t.
component. The Hamiltonian of the optimal control problem
Proof: See Appendix B-D. (19) is different from the individual Hamiltonian defined in
Theorem 3 indicates that during the time interval [0, T), the (11). But they are related. The Hamiltonian of the optimal
agents, with an incentive to lower their own costs, adapt their control problem includes the cost of all agents over the
weight accordingly to impede the spreading of virus. After network instead of just individual’s cost. Also, the costate λ(t)
the prescribed alert duration [0, T], a recovery of topology is corresponds to the state of all agents x(t). The counterpart of
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
7
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
8
Algorithm 1 DVR the penalty-based differential game is inline with the solution
o ; W ; .
Input: xi0, i ∈ N; βi , σi , i ∈ N; fi (·), gi j (·), i ∈ N, j ∈ Ni,ou t o
of the optimal control problem defined in (15).
Step 1: Each i ∈ N selects ui (t) ∈ Ui arbitrarily.
Step 2: Forward part: Obtain xi (t) from xi0 and (8) for each i ∈ N.
Initially, the input data includes initial infection data: xi0
if solving social problem (15) then for all i ∈ N ; infection rate βi , recovery rate σi for all i ∈
Step 3: Backward part: Obtain λ(t) from pÛ i (t) = Γ(t)pi (t) + γ̂i (t) and N ; the original topology W o ; the cost functions fi (·), gi j (·)
p(T ) = 0. out and a stopping value to stop the
else
for all i ∈ N, j ∈ Ni,o
Step 30 : Backward part: Obtain pi (t), i ∈ N from pÛ i (t) = Γ(t)pi (t) + algorithm. In the first step, each player arbitrarily selects a
γi (t) and pi (T ) = 0. continuous control trajectory within the admissible control set
end if
Step 4: Control update: Update control ûi (t) based on (13).
Ui for every out-link it has: ui j (t) for each i ∈ N , for all
j ∈ Ni,o out , and reports the weight adaptation scheme u (t) to
Step 5: Control policy check: i
if k û − u k ∞ ≥ then the network operator. In step 2, each player utilizes the initial
Set u = û. Return to Step 2.
else
infection data xi0, i ∈ N and the control policy ui, i ∈ N ,
Set u∗i = ûi . u∗i is the optimal weight adaptation scheme for agent i. solve (8) forward in time to obtain xi, i ∈ N and report it
end if to the network operator. If the system aims to achieve the
social optimal control problem, then the algorithm goes into
step 3. Otherwise, the algorithm steps into step 30. In step 3,
and the optimal control analysis, an algorithm is proposed the system operator utilizes the reported ui and xi, i ∈ N , the
to compute the optimal weight adaptation trajectory for the infection damage cost fi (·), i ∈ N to compute p(t) backward
system operator and the agents. based on (22) and sends pi (t) back to the corresponding player
i. In step 30, the system operator utilizes the reported ui and
xi, i ∈ N , the infection damage cost fi (·), i ∈ N , computes
A. Preliminaries
pi (t), i ∈ N backward based on (10) and sends pi (t) back to the
In the simulation, the infection cost function fi (·) is given corresponding player i. In the next step, each player updates
to be linear in xi (t), i.e., fi (xi (t)) = αi xi (t). Here, we set its control based on (13) which only requires its out-neighbors
αi = α, ∀i ∈ N . The weight adaptation cost is taken to be infection information and reports the updated control policy
quadratic where gi j (ui j − wioj ) = (1/2)di j (ui j − wioj )2 for all to the network operator. Denote by ûi (t) the updated control
out . Unless otherwise stated, let α = 1, d =
i ∈ N, j ∈ Ni,o i ij policy. If k û − uk∞ ≥ , the algorithm goes back to step 2.
d = 0.2 for all i ∈ N, j ∈ Ni,o out . Note that under this setting,
Otherwise, the latest updated policy ûi is the optimal control
assumptions 1 and 2 hold and gi j (·) is even and convex. policy for agent i.
The original network in the simulation is a bi-directional
scale-free network with 150 agents generated based on the
Barabási-Albert model [27], [28]. We select this model since C. Numerical Results
many kinds of computer networks, including the internet In this subsection, we present the numerical results. First,
and the web graph of the World Wide Web, have scale- we show the dynamics of the costate function for all players.
free properties. We generate the network by following the Then, we show the evolution of the weight adaptation, the
growth and preferential attachment properties given in section infection and the costate of selected agents to see individuals’
VII of [27]. For simplicity, the original weight is set to be behaviors. Second, we give the comparisons between the
wioj = 1 for all edge (i, j). Let hk in i hk out i be the average optimal control based-weight adaptation scheme (this scheme
in-degree (out-degree) of the network we generated. We have is equivalent to the penalized differential game based-weight
hk in i = hk out i = 7.545. adaptation scheme) and the differential game-based weight
For simplicity, we take same infection rates and curing rates adaptation scheme. The optimal control based adaptation
for all players. Unless otherwise stated, let βi = β = 0.04 and scheme is from solving optimal control problem (15). The
σi = σ = 0.1. From the result in [30] and the fact that the two schemes together with the case of weight adaptation. are
largest real part of the eigenvalues of matrix (W o B − D) is compared in terms of the total cost and the infection level of
0.5368, we say that the virus epidemic will outbreak in the the whole network.
original network. The initial infection level is also set to be From (13) and φi j (t) := pii (t)(1 − xi∗ (t))β j x ∗j (t), we know
the same for all players, xi0 = 0.16 for all i ∈ N . Table I is a that the weight adaptation of player i is based on its own
summary of the setups. infection, its out-neighbors, and the costate component pii .
The infection of player i and its neighbors are just local
information. From (10) and (12), we can see the effect of the
B. Computational Algorithm whole network’s situation is conveyed by costate component
Note that we aim to propose an implementable distributed pii to the weight adaptation strategy of player i. Thus, we
virus resistance (DVR) algorithm. Based on the algorithm pro- investigate the dynamics of pii in Fig. 4, where the costate
posed in [29] for computation of open-loop NE for nonzero- component pii ’s dynamics for all agents are plotted. As we
sum differential games, we present, in algorithm 1, the DVR can see, pii (t) is positive for all i ∈ N during the whole
algorithm to compute the candidate open-loop NE solutions time interval which corroborates Theorem 3. For most of the
for the differential game described by (7) and the penalty- players, the value of pii is high at the very beginning and
based differential game defined in Theorem 4. The solution of then decreases to 0. One interpretation is that players are more
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
9
TABLE I
PARAMETERS USED FOR NUMERICAL STUDY UNLESS OTHERWISE STATED .
αi xi 1
2 di j (ui j − wioj )2 1 0.2 150 1 7.545 7.545 0.04 0.1 20
35 12 1
u1,150
u150,1 p1,1
1.2 x1 10 p150,150
30 x150 0.8
1
25 8
0.8 0.6
6
20 0.6 0.4
4
15 0.4
0.2 ū1
0.2 2
ū50
10 ū150
0 0 0
0 10 20 0 10 20 0 10 20
5 t t t
0 Fig. 5. The weight adaptation, the infection and the costate of agents 1, 50
0 5 10 15 20 and 150 over time. The first plot shows the weight adaptation of edge (1, 150)
pii (t) and edge (150, 1) and the infection of agent 1 and 150. The second plot of Fig.
5 shows the costate dynamics of the two agents. The Í third gives the average
Fig. 4. The dynamics of costate pi i (t) for all players i ∈ N. weight adaptation of agents 1, 50 and 100, where ūi = j ∈N ou t ui j /|Ni,ou t
o |. i, o
u83,1
infected due to its large degree. We can see that all weights 0.75
u1,2
u1,83 0.2 u83,60 0.2
equal 1 at and only at t = T = 20, which corroborates Theorem u1,150 u83,93
0.7 0 0
3. The weight u150,1 (t) is reduced to 0 for some time. This 0 10
t
20 0 10
t
20 0 10
t
20
the network’s performance and lowering the infection. So, Differential Game Based Adaptation
No Weight Adaptation
3000
the agents with higher out-degrees may reduce less weight Optimal Control Based Adaptation
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
10
No Adaptation Optimal Control Based Adaptation Game Based Adaptation distributed algorithm over large-scale networks to control the
0.8 0.8
α=0.1 α=0.5 macroscopic behaviors of the virus spreading over networks.
0.7 0.7
Numerical examples have been used to illustrate the virus-
Infection Level
Infection Level
0.6 0.6
0.5 0.5 resistance of the proposed scheme and the inefficiency of the
0.4 0.4 Nash equilibrium. The differential game approach achieves a
0.3 0.3
better performance than its centralized counterpart in terms
0.2 0.2
0.1 0.1
of the mitigation of virus spreading. One future direction for
0 5 10 15 20 0 5 10 15 20
t t this work would be to study the steady behavior of long-term
virus-resistance scheme where the duration of virus spreading
0.8 0.8
0.7
α=0.8
0.7
α=1.2 is sufficiently long.
Infection Level
Infection Level
0.6 0.6
0.5 0.5
A PPENDIX A
0.4 0.4
0.3 0.3
L EMMAS
0.2 0.2
Lemma 1. The dynamics equation G(x(t), W(t)) is uniformly
0.1 0.1
0 5 10
t
15 20 0 5 10
t
15 20
Lipschitz in x and wi for each i ∈ N .
Proof. From (3), we have
Fig. 8. The dynamics of the whole network’s infection under differential
game-based weight adaptation scheme, the optimal control-based weight
kG(x(t), W(t)) − G(x̂(t), W(t))k∞
adaption scheme, and the scheme without weight adaptation. =k(W(t)B − D)x(t) − (W(t)B − D)x̂(t)
− Xd (t)W(t)Bx(t) + X̂d (t)W(t)Bx̂(t)k∞
Jo under the three schemes for different α. We observe that ≤k(I N − Xd (t))W(t)B(x(t) − x̂(t))k∞
no adaptation scheme cause the most total cost. For different + kD(x(t) − x̂(t))k∞ + k(Xd (t) − X̂d (t))W(t)Bx(t)k∞
values of α, the NE-based scheme always incurs a higher cost
≤2β M kW(t)k∞ kx(t) − x̂(t))k∞ + δ M kx(t) − x̂(t))k∞
than the optimal control-based scheme, which indicates the
inefficiency of the NE solution. From the plot, we see that a ≤ max{2β M , δ M }kx(t) − x̂(t))k∞ (1 + kW(t)k∞ ),
higher α causes more inefficiency. where β M B maxi βi and δ M B maxi δi , and I N B
Fig. 8 is presented to show the virus-resistance of the diag(1, 1, · · · , 1). Since wi j is bounded by w̄i j , (1 + kW(t)k ∞ )
proposed schemes. The black line shows the infection level is bounded. So, G is uniformly Lipschitz in x. The proof for
for the case without adaptation, the blue line shows the case G is uniformly Lipschitz in wi for all i ∈ N can be obtained
with the game-based scheme, and the green line shows the case by following the similar steps.
with the optimal control-based scheme. Even though the game-
based scheme is inefficient in terms of minimizing the total Lemma 2. Let xi , i ∈ N be the corresponding solution to the
cost, it outperforms the optimal control-based scheme since ODEs in (2). For all i ∈ N , given 0 < xi (0) ≤ 1, xi (t) ∈ (0, 1)
the infection level under the game-based scheme is always holds for all t ∈ (0, ∞).
lower than the infection level under the optimal control-
Proof. The proof follows Lemma 1 of [14]. It is clear that
based scheme. No matter in which case, the scheme we have
xi (t) is a continuous function of time. When xi (0) = 1,
proposed has proven to be virus-resistant and generated a
then from (2), we have xÛi (0) < 0, which means once xi (t)
lower total cost than the scheme without adaptation did.
reaches 1, it cannot stay there. On the other hand, when
xi (0) ∈ (0, 1), the solution xi (t) would always lie in (0, 1).
VI. C ONCLUSION AND F UTURE W ORK Otherwise, suppose that there exists t1 such that xi (t1 ) = 0 or
In this paper, we have established a differential game frame- xi (t1 ) = 1. In the first case, note that xÛi (t) ≥ −σi xi (t) holds for
work to develop decentralized virus-resistant mechanisms over the time interval (0, t1 ], which gives xi (t1 ) ≥ xi (0)e−σi t1 > 0.
complex networks. We have shown that weight adaptation It yields a contradiction. In the second case where xi (t1 ) = 1,
policies allow nodes to change weights to mitigate their we have xi (t) < 1 over time interval [0, t1 ). So, we have
infection. The differential game approach has captured the xÛi (t1− ) ≥ 0 which contradicts the fact obtained from (2) that
strategic and dynamic behaviors of a large number of self- xÛi (t1 ) < 0.
interested agents over time-varying networks. Each player
adapts its weight based on its own infection and its out- A PPENDIX B
neighbors infection. It has been observed that the higher levels PROOF
of its out-neighbors’ infection lead to lower weights. The effect
of non-local behaviors on the adaptation strategy has been A. Proof of Observation 1
encoded in the costate function. We have discussed the ineffi- Proof. Given the original weight pattern W o , suppose that
ciency of the open-loop Nash equilibrium and have proposed wioj = 0, i.e., j < Ni,o
out and w ∗ (t) > 0. Obviously, player i can
ij
a penalty-based mechanism to achieve efficiency by imposing lower his own cost by deviating wi j (t) from wi∗j (t) to 0, which
local costs induced by reachable nodes. The differential game contradicts the fact that wi∗j (t) is the open-loop NE. A similar
framework has enabled the design and implementation of a statement can be made for the case when wi∗j > wioj .
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
11
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCNS.2019.2931862, IEEE
Transactions on Control of Network Systems
12
for agent 5 is {1, 4, 5, 2, 3}. Thus, the dynamics of p5 under [13] Gross, T., DLima, C.J.D. and Blasius, B., “Epidemic dynamics on
the differential game given in Corollary 2 can be written as an adaptive network”. Physical review letters, vol. 96, no. 20, 2006,
p.208701.
pÛ51 p51 − f 0(x1 ) [14] Paré, P.E., Beck, C.L. and Nedich, A., “Epidemic processes over time-
pÛ54 " u # p − f10(x ) varying networks”. IEEE Transactions on Control of Network Systems,
Γ 03×2 54 4 4 vol. 5, no. 3, pp. 1322-1334, Sept. 2018.
pÛ55 = lR5 p55 + − f 0(x5 ) ,
r 5 [15] Feng, Y., Ding, L., Huang, Y.H., Zhang, L. “Epidemic spreading on
pÛ52 ΓR5 ΓR
weighted networks with adaptive topology based on infective informa-
5
p52
0
tion”. Physica A: Statistical Mechanics and its Applications, vol. 463,
pÛ53 p53 0
2016, pp.493-502.
where [16] Huang, Y., Ding, L. and Feng, Y. “A novel epidemic spreading model
with decreasing infection rate based on infection times”. Physica A:
u∗ β x ∗ + σ1
Í
0 0 Statistical Mechanics and its Applications, vol. 444, 2016, pp.1041-
j ∈ {2,3} 1j j j
1048.
ΓR5 = −(1 − x1 )u14 β4 u4j β j x j + σ4 0 ,
u
∗ ∗ Í ∗ ∗
[17] Draief, M. and Massouli, L. Epidemics and rumours in complex net-
j ∈ {3,5} works. Cambridge University Press, 2010.
0 −(1 − x4∗ )u45 ∗ β σ5 [18] Van Mieghem, P., Omic, J. and Kooij, R., “Virus spread in networks”.
5
IEEE/ACM Transactions on Networking 17, no. 1 2009, pp. 1-14.
−(1 − x1∗ )u12 ∗ β
2 0 0 [19] Başar, T., Olsder, G.J. Dynamic noncooperative game theory, SIAM,
ΓlR5 = ∗ β ,
−(1 − x1∗ )u13 3 −(1 − x4 )u43 β3 0
∗ ∗ 1999.
[20] Young, D. M. Iterative solution of large linear systems. Elsevier, 2014.
u23 β3 x3∗ + σ2
∗
0 [21] Lin, C. T., “Structural controllability”. IEEE Transactions on Automatic
ΓrR5 = ∗ β .
−(1 − x2∗ )u23 3 σ3
Control, vol. 19, no. 3, 1974, pp. 201-208.
[22] Dubey, P.“Inefficiency of Nash equilibria”. Mathematics of Operations
Thus, if we let ci (t) = j ∈R i, o \{i } f j (x j ), the dynamics of pii
Í
Research, vol. 11, no. 1, 1986, pp. 1-8.
described by (22) is consistent with the dynamics of the ith [23] Başar, T., Zhu, Q. “Prices of anarchy, information, and cooperation in
differential games”. Dynamic Games and Applications, vol. 1, no. 1,
component of λ described by (17). By solving the optimization 2011, pp. 50-73.
problem (18), we know that the optimal control problem shares [24] Roughgarden, T., Tardos, É.“Bounding the inefficiency of equilibria in
the same control rule (13) with the differential game problem. nonatomic congestion games”. Games and Economic Behavior, vol. 47,
no. 2, 2004, pp.389-403.
Since pii (t) = λi (t) for every t ∈ [0, T], we can see the [25] Fonseca-Morales, A. and Hernndez-Lerma, O., “Potential differential
statement in Corollary 2 holds. games.” Dynamic Games and Applications, vol. 8, no. 2, 2018, pp. 254-
279.
R EFERENCES [26] Monderer, D., and Lloyd S. S., “Potential games.” Games and Economic
Behavior, vol. 14, no. 1, 1996, pp. 124-143.
[1] Farwell, J.P., Rohozinski, R. “Stuxnet and the future of cyber war”. [27] Albert, R., Barabási, A.L. “Statistical mechanics of complex networks”.
Survival, vol. 53, no. 1, 2011, pp.23-40. Reviews of Modern Physics, vol. 74, no. 1, 2002, p. 47.
[2] Falliere, N., Murchu, L.O. and Chien, E., “W32. stuxnet dossier”. White [28] Barabási, A.L., Albert, R. “Emergence of scaling in random networks”.
paper, Symantec Corp., Security Response, vol. 5, no. 6, 2011, p.29. Science, vol. 286, no. 5439, 1999, pp. 509-512.
[3] Hayel, Y., Zhu, Q. “Epidemic protection over heterogeneous networks [29] Holt, D., Muktjndan, R.“A Nash algorithm for a class of non-zero sum
using evolutionary Poisson games”. IEEE Transactions on Information differential games”. International Journal of Systems Science, vol. 2, no.
Forensics and Security, vol. 12, no. 8, 2017, pp. 1786-1800. 4, 1972, pp. 379-387.
[4] Guo, D., Trajanovski, S., van de Bovenkamp, R., Wang, H. and [30] Ganesh, A., Massoulié, L. and Towsley, D. “The effect of network
Van Mieghem, P. ”Epidemic threshold and topological structure of topology on the spread of epidemics,” In Proc. of the 24th Annual
susceptible−infectious−susceptible epidemics in adaptive networks”. Joint Conference of the IEEE Computer and Communications Societies
Physical Review E, vol. 88, no. 4, 2013, p.042802. (INFOCOM 2005), vol. 2, 2005, pp. 1455-1466.
[5] Khouzani, M. H. R., Altman E. and Sarkar, S. “Optimal quarantining
of wireless malware through reception gain control.” IEEE Transactions
on Automatic Control, vol. 57, no. 1, 2012, pp. 49–61.
[6] Zhu, Q., Fung, C, Boutaba, R. and Başar, T. “GUIDEX: A game-
theoretic incentive-based mechanism for intrusion detection networks.” Yunhan Huang received the B.Eng. degree in
IEEE Journal on Selected Areas in Communications, vol. 30, no. 11, Automation with honor from Wuhan University,
pp. 2220-2230, 2012. Wuhan, China, in 2016. From 2016-2017, he was a
[7] Wang, X., Xiao N., Xie, L., Frazzoli, E. and Rus, D. “Analysis of Price project officer at the School of Electrical and Elec-
of Total Anarchy in Congestion Games via Smoothness Arguments.” tronic Engineering, Nanyang Technological Univer-
IEEE Transactions on Control of Network Systems, vol. 4, no. 4, pp. sity. He is currently studying for a Ph.D in Electrical
876-885, Dec. 2017. Engineering, at New York University, NY, USA.
[8] Trajanovski, S., Kuipers, F.A., Hayel, Y., Altman, E. and Van Mieghem,
P. “Designing virus-resistant, high-performance networks: a game-
formation approach”. IEEE Transactions on Control of Network Systems,
vol. 5, no. 4, pp. 1682-1692, Dec. 2018.
[9] Hayel, Y., Trajanovski, S., Altman, E., Wang, H. and Van Mieghem, P.
“Complete game-theoretic characterization of SIS epidemics protection
strategies”. In Proc. of 2014 IEEE 53rd Annual Conference on Decision
and Control (CDC), pp. 1179-1184.
[10] Shulgin, B., Stone, L. and Agur, Z. “Pulse vaccination strategy in the Quanyan Zhu (SM02-M14) received B. Eng. in
SIR epidemic model”. Bulletin of mathematical biology, vol. 60, no. 6, Honors Electrical Engineering from McGill Univer-
2018, pp.1123-1148. sity in 2006, M.A.Sc. from University of Toronto
[11] Hu, P., Ding, L., Hadzibeganovic, T. “Individual-based optimal weight in 2008, and Ph.D. from the University of Illinois
adaptation for heterogeneous epidemic spreading networks”. Commu- at Urbana-Champaign (UIUC) in 2013. After stints
nications in Nonlinear Science and Numerical Simulation, 63, 2018, at Princeton University, he is currently an associate
pp.339-355. professor at the Department of Electrical and Com-
[12] Preciado V. M., Zargham, M., Enyioha, C., Jadbabaie, A., and Pappas, puter Engineering, New York University. His current
G. J., “Optimal Resource Allocation for Network Protection Against research interests include Internet of Things, cyber-
Spreading Processes,” IEEE Transactions on Control of Network Sys- physical systems, game theory, machine learning,
tems, vol. 1, no. 1, pp. 99-108, March 2014. cyber deception, network optimization and control.
2325-5870 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.