
On the Privacy Guarantees of Gossip Protocols in General Networks

Richeng Jin, Student Member, IEEE, Yufan Huang, Student Member, IEEE, Huaiyu Dai, Fellow, IEEE

arXiv:1905.07598v2 [cs.CR] 5 Feb 2021

Abstract—Recently, the privacy guarantees of information dissemination protocols have attracted increasing research interest, among which gossip protocols assume vital importance in various information exchange applications. In this work, we study the privacy guarantees of gossip protocols in general networks in terms of differential privacy and prediction uncertainty. First, lower bounds of the differential privacy guarantees are derived for gossip protocols in general networks in both synchronous and asynchronous settings. The prediction uncertainty of the source node given a uniform prior is also determined. For the private gossip algorithm, the differential privacy and prediction uncertainty guarantees are derived in closed form. Moreover, considering that these two metrics may be restrictive in some scenarios, relaxed variants are proposed. It is found that source anonymity is closely related to some key network structure parameters in the general network setting. Then, we investigate information spreading in wireless networks with unreliable communications, and quantify the tradeoff between differential privacy guarantees and information spreading efficiency. Finally, considering that the attacker may not be present at the beginning of the information dissemination process, the scenario of delayed monitoring is studied and the corresponding differential privacy guarantees are evaluated.

Index Terms—Information spreading, gossip protocols, differential privacy, prediction uncertainty.

I. INTRODUCTION

It is well known that most people are six or fewer social connections away from each other. Recently, the explosive development of the Internet and social networks has made it easy for people to disseminate their information to the rest of the world. Gossip protocols, in which networked nodes randomly choose a neighbor to exchange information, have been widely adopted in various applications for information dissemination due to their simplicity and efficiency. For instance, they can be used to spread and aggregate information in dynamic networks like mobile networks, wireless sensor networks, and unstructured P2P networks [1]–[3]. Combined with stochastic gradient descent methods, gossip protocols are also adapted to implement distributed machine learning [4], [5]. In particular, the authors of [5] propose to transmit differentially private gradient information through gossip protocols. Nonetheless, they focus on the privacy of the shared gradient information rather than the anonymity of the source.

With the rising concerns over privacy exposure, information sources often prefer to stay anonymous while disseminating sensitive information. Gossip protocols are believed to provide a certain form of source anonymity, since most nodes are not informed directly by the source, and the origin of the information becomes increasingly blurred as the spreading proceeds. In this regard, source identification and protection in gossip protocols have attracted significant research interest (see [6], [7] and the references therein). However, the existing approaches usually assume specific network structures (e.g., tree graphs) and attacking techniques (e.g., the maximum likelihood estimator) and do not easily generalize.

To study the privacy of gossip protocols in a formal and rigorous setting, the concept of differential privacy [8], which was originally introduced in data science, is adapted to measure the source anonymity of gossip protocols in [9]. However, that study is restricted to complete networks, which may not be a good model in practice. For example, practical networks often have a network diameter much larger than 1 (41 for the Facebook network [10]).

In this work, we extend the study of the fundamental limits on the privacy of gossip-based information spreading protocols to general networks. Our main contributions are summarized as follows.

1) Lower bounds of the differential privacy guarantees of general gossip protocols are derived for general networks in both synchronous and asynchronous settings. The prediction uncertainty of the source node given a uniform prior is also determined.
2) For the private gossip algorithm [9], the differential privacy guarantees and prediction uncertainty are derived in closed form. In addition, considering that the original differential privacy and prediction uncertainty may be restrictive in some scenarios, candidate set based differential privacy and prediction uncertainty variants are proposed. Numerical results are presented to show the privacy guarantees of the private gossip algorithm in different network structures.
3) The privacy guarantees of standard gossip and private gossip protocols are further studied in a wireless setting, where communications are assumed to be unreliable. It is found that wireless interference can enhance the differential privacy while slowing down the spreading process. Through analysis and simulations, the tradeoff between the differential privacy guarantees and the information spreading efficiency is revealed.
4) The effect of the additional uncertainty induced by delayed monitoring on the privacy guarantees is shown.

R. Jin, Y. Huang and H. Dai are with the Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, USA, 27695 (e-mail: rjin2, yhuang20, hdai@ncsu.edu).

The remainder of this work is organized as follows. Section II introduces the system model. The privacy guarantees of gossip protocols in general networks are presented in Section III. The privacy-spreading tradeoff of gossip protocols in wireless networks is discussed in Section IV. Section V investigates the privacy guarantees of gossip protocols in the delayed monitoring scenario. Conclusions and future work are discussed in Section VI.
II. SYSTEM MODEL
[Fig. 1: Sensor Monitoring and Observations. A monitored network with uninformed and informed nodes, sensors, and the attacker. Observed events versus actual events take the form (i, t) in the synchronous setting (e.g., observed (i = 4, t = 0), (i = 2, t = 1), (i = 1, t = 3) versus actual (i = 4, t = 0), (i = 4, t = 1), (i = 2, t = 1)) and the form (i|t) in the asynchronous setting (e.g., observed (i = 4|t = 0.5s), (i = 2|t = 1.2s), (i = 1|t = 2.2s) versus actual (i = 6|t = 0.2s), (i = 4|t = 0.5s), (i = 2|t = 1.2s)).]

A. Gossip Protocol

In this work, we investigate the privacy of the information source in gossip-based information spreading. The goal is to measure the capability of gossip protocols in keeping the information source anonymous. Specifically, given a connected network G = (V, E) of arbitrary topology, where V = {0, 1, ..., n − 1} is the node set and E is the set of connecting edges, a node (the source) initially possesses a piece of information and needs to deliver it to all the other nodes in the network. All the nodes are assumed to share the same communication protocol, gossip. Each time an informed node i performs gossip, it will contact one of its neighboring nodes j ∈ N_i uniformly at random (i.e., with probability 1/d_i, where d_i is the degree of node i). The whole information dissemination process terminates after all the nodes are informed. As in [9], we focus on gossip protocols based on the "push" action¹ in this work, and consider the following two specific gossip protocols.

1) Standard Gossip: All informed nodes remain active (i.e., continuously performing gossip) during the spreading process.
2) Private Gossip [9] (i.e., Algorithm 1): Once an active informed node (initially it is the source) performs gossip, it turns inactive, and the newly informed node takes over the source role.

Algorithm 1 Private Gossip Algorithm [9]
1: Require: The number of nodes n, the source node k.
2: Ensure: The informed node set I = {0, 1, ..., n − 1}
3: Initialization: Informed node set I ← {k}, active node set Ac ← {k}
4: while |I| < n do
5:   The active node in Ac performs gossip and another node j is informed
6:   I ← I ∪ {j}, Ac ← {j}
7: end while
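Algorithm 1 is compact enough to simulate directly. The sketch below is our own illustration (plain Python; the adjacency-list encoding, function name, and seed are assumptions, not from the paper): exactly one node is active at any time, and each push both informs the contacted neighbor and hands over the active role.

```python
import random

def private_gossip(adj, source, seed=0):
    """Run Algorithm 1 (private gossip) on a graph given as an
    adjacency list {node: [neighbors]}; return the contact sequence."""
    rng = random.Random(seed)
    informed = {source}
    active = source
    trace = []                        # sequence of (speaker, contacted) pushes
    while len(informed) < len(adj):
        j = rng.choice(adj[active])   # contact a uniform random neighbor
        trace.append((active, j))
        informed.add(j)               # j is now informed (possibly again)
        active = j                    # the active role moves to j
    return trace

# 4-node ring graph (the topology used in Fig. 4 of the paper)
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
trace = private_gossip(ring, source=0)
```

Because the active role is handed over on every push, the trace forms a walk on the graph: each contacted node is the next speaker, and the spread finishes once the walk has visited every node.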
domized algorithm R with domain N|χ| is (, δ)-differentially
private if for all S ⊆ Range(R ) and for any two databases
B. Time Model x, y that differ on a single element [8]:
Both synchronous and asynchronous time models are consid- P r[R (x) ∈ S] ≤ e P r[R (y) ∈ S] + δ, (1)
ered. In the former, all nodes share a global discrete time
clock. Each time the clock ticks, all active informed nodes where parameter  ≥ 0 is the privacy budget while δ ≥ 0 is the
perform the gossip action simultaneously, and the informed tolerance level for the violation of the  bound. Specifically,
given the privacy budget  and the tolerance level δ, Eq. (1)
1 In the corresponding “pull” action, uninformed nodes are active and try to implies that the randomized algorithm guarantees that the
solicit the information from informed nodes. The “push” action is dominant privacy loss is bounded by  with a probability of at least
for information spreading in social and mobile networks. In addition, such a
study is also conservative in the sense that it gives the attacker an advantage 1 − δ. Intuitively, the smaller the  and the δ, the better the
by only monitoring the “push” actions. privacy. Consider a source indicator database of the format
D^(i) = [0, ..., d_i = 1, ..., 0] with exactly one nonzero value d_i = 1 if node i is the source. Given D ≜ {D^(i)}_{i=0}^{n−1} and the graph G as the input, a gossip protocol can be treated as a randomized algorithm with the output set S (i.e., the range) consisting of all possible observation sequences by the attacker during the execution of the protocol.

Definition 1. Given a general network G, a gossip protocol is (ε, δ)-differentially private in G if for all observations S ⊆ S and for any two source indicator vectors D^(i), D^(j), i, j ∈ V:

p_G^(i)(S) ≤ e^ε p_G^(j)(S) + δ,  (2)

where p_G^(i)(S) = Pr[S|G, D^(i)] is the conditional probability of an observation event S given the network graph G and the source indicator vector D^(i).

In this work, considering the fact that, due to the topological and observation model constraints, there may exist some (rare) events S such that p_G^(j)(S) = 0 (e.g., if S_{i,0} is the observed event that node i performs gossip at time 0 in the synchronous setting, then p_G^(j)(S_{i,0}) = 0, ∀j ≠ i), an additional tolerance level is needed to ensure the privacy guarantees. Thus, for the privacy guarantees of general gossip protocols, we mainly focus on the study of the tolerance level δ. For the private gossip algorithm (i.e., Algorithm 1) in the asynchronous setting, we are able to derive the corresponding differential privacy level ε and give some analysis.

In addition to differential privacy, it is also desirable to study privacy guarantees of information dissemination protocols from a more pertinent perspective, i.e., source identification through prediction or detection. Reusing the above example, there always exist some events S such that p_G^(i)(S) > 0 for some i but p_G^(j)(S) = 0, ∀j ≠ i ∈ V, which satisfy an arbitrary privacy budget ε with a tolerance level of δ (if p_G^(i)(S) ≤ δ). However, the identity of the source (i.e., node i) can still be easily inferred. Therefore, it is further required that some prediction uncertainty be guaranteed for a given differentially private protocol, which is defined as [9]:

Definition 2. Given a general network G, the prediction uncertainty of a gossip protocol is defined for a uniform prior p_G(I_0) on source nodes and any i ∈ {0, 1, ..., n − 1} as:

c = min_{i, S⊆S} [ p_G(I_0 ≠ {i}|S) / p_G(I_0 = {i}|S) ] = min_{i, S⊆S} [ 1/p_G(I_0 = {i}|S) − 1 ], ∀p_G^(i)(S) > 0,  (3)

where I_0 stands for the initial informed node set and its element represents the source node.

Remark 1. The connection between prediction uncertainty and differential privacy is illustrated below. If the attacker obtains an observation S, differential privacy measures the probabilities of it observing S given different sources, while prediction uncertainty considers the posterior probabilities of different sources given S. Especially, because of the uniform prior p_G(I_0), p_G(I_0 ≠ {i}|S)/p_G(I_0 = {i}|S) = Σ_{j≠i} p_G^(j)(S) / p_G^(i)(S) holds by Bayes' formula. Prediction uncertainty is an appealing metric in this study as it measures the privacy guarantees from the source prediction perspective with a much smaller cardinality than the classic privacy budget (which requires the study of all pairs of p_G^(i)(S) and p_G^(j)(S)). Moreover, given a prediction uncertainty c, it can be shown that p_G(I_0 = {i}|S) ≤ 1/(c + 1), ∀i, S; therefore a larger c indicates better source anonymity.

In order to further remedy the aforementioned issue of differential privacy, a relaxed differential privacy variant, termed differential privacy within a candidate set, is proposed. Its definition is given as follows.

Definition 3. Given a general network G, a gossip protocol is (ε, δ)-differentially private within a candidate set Q in G if for all observations S ⊆ S and for any two source indicator vectors D^(i), D^(j), i, j ∈ Q ⊆ V:

p_G^(i)(S) ≤ e^ε p_G^(j)(S) + δ,  (4)

where p_G^(i)(S) = Pr[S|G, D^(i)] is the conditional probability of an observation event S given the network graph G and the source indicator vector D^(i).

Remark 2. The only difference between differential privacy and differential privacy within a candidate set is that the privacy guarantee is ensured for the source nodes falling in a candidate set Q instead of the whole network V. Note that the notion of differential privacy is highly conservative (considering the worst-case scenario). If there exists a node i such that p_G^(i)(S) = 0, the probability of node i being the source is 0 from the attacker's perspective. In such a case, the attacker can always differentiate node i from other nodes, which means that the differential privacy guarantee will be violated with a high probability. However, on the one hand, this does not necessarily mean that the gossip protocols cannot provide effective source privacy protection. Particularly, it may be difficult for the attacker to distinguish any pair of other nodes except node i in the network (e.g., p_G^(j)(S) = p_G^(z)(S), ∀j ≠ z ∈ V \ {i}). On the other hand, in practice, it may not be necessary for the source node to hide itself in the whole network. Instead, it may be enough to make itself indistinguishable among a subset of the network (e.g., its neighbors). Our definition of differential privacy within a candidate set measures the privacy guarantee of the gossip protocols in such scenarios. Especially, as long as p_G^(j)(S) > 0, ∀j ∈ Q, δ = 0 is always feasible for differential privacy within the candidate set Q. In addition, when Q = V, it is equivalent to the original definition of differential privacy. Therefore, it can be understood as a relaxed version of differential privacy.

Similarly, the prediction uncertainty within a candidate set is defined as follows.

Definition 4. Given a general network G, the prediction uncertainty of a gossip protocol within the candidate set Q is
defined for a uniform prior p_G(I_0) on source nodes over the candidate set Q as:

c = min_{i∈Q, S⊆S} [ Σ_{j≠i∈Q} p_G(I_0 = {j}|S) / p_G(I_0 = {i}|S) ], ∀p_G^(i)(S) > 0.  (5)

[Fig. 2: Node 1 and node 6 are more distinguishable in the right network.]

[Fig. 3: Tolerance level v.s. network diameter D_G (α = 0.3, n = 50, ε = 0.01).]
III. PRIVACY OF GOSSIP PROTOCOLS IN GENERAL NETWORKS

In this section, the privacy guarantees of gossip protocols are investigated. For the general gossip protocols, general results concerning the lower bounds of the tolerance level δ and the upper bounds of the prediction uncertainty c are obtained. For the private gossip algorithm in the asynchronous setting, the pure version of differential privacy (δ = 0) is feasible, and the corresponding differential privacy level ε and prediction uncertainty c are derived.
A. General Gossip Protocols

In this subsection, the privacy guarantees of general gossip protocols in general networks are studied. To facilitate the following analysis, we need the following lemma and the definition of decay centrality.

Lemma 1. Given any gossip protocol in a graph G, let S ⊆ S and suppose there are two constants w_G^(i)(S), w_G^(j)(S) such that p_G^(i)(S) ≥ w_G^(i)(S) and p_G^(j)(S) ≤ w_G^(j)(S). If the gossip protocol satisfies (ε, δ)-differential privacy, then δ ≥ max_{S,i,j} ( w_G^(i)(S) − e^ε w_G^(j)(S) ).

Lemma 1 readily follows from the definition of differential privacy; its proof is omitted in the interest of space.

Definition 5. [11] Given a network G and a decay parameter β, 0 < β < 1, the decay centrality of node i is defined as

C_β(i) = Σ_{j≠i} β^{d(i,j)},  (6)

where d(i, j) is the length of the shortest path between nodes i and j.
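Definition 5 is straightforward to compute with a breadth-first search. The helper below is our own sketch (plain Python; names are assumptions), evaluated on a 5-node star where the hub's central position shows up as a larger decay centrality.

```python
from collections import deque

def decay_centrality(adj, i, beta):
    """C_beta(i) = sum over j != i of beta**d(i, j), with d(i, j) the
    shortest-path (hop) distance, computed by breadth-first search."""
    dist = {i: 0}
    q = deque([i])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return sum(beta ** d for node, d in dist.items() if node != i)

# 5-node star: the hub (node 0) reaches every leaf in one hop
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
hub = decay_centrality(star, 0, 0.5)    # 4 * 0.5 = 2.0
leaf = decay_centrality(star, 1, 0.5)   # 0.5 + 3 * 0.25 = 1.25
```

Plugged into the bound (8) of Theorem 1 below, the star with α = 0.5 gives c ≤ min_i C_{0.5}(i)/0.5 = 1.25/0.5 = 2.5, the minimum being attained at a leaf.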
Remark 3. Decay centrality measures the ease of a node reaching out to other nodes in the network. A large decay centrality indicates the central positioning of a node and its easiness to reach other nodes. The difficulty increases as β decreases.

Our main result concerning the privacy guarantees of general gossip protocols in a general network is given below.

Theorem 1. Given a connected network G with n nodes and diameter D_G = max_{i,j∈V, i≠j} d(i, j), and considering the observation model described in Section II-C with parameter α, if a gossip protocol satisfies (ε, δ)-differential privacy for any ε ≥ 0 and c-prediction uncertainty, then we have δ ≥ α and c = 0 in the synchronous setting. In the asynchronous setting,

δ ≥ max( α − e^ε (1 − α)^{D_G}, α − e^ε (1 − α)/(n − 1) )  (7)

and

c ≤ min_{i∈V} C_{1−α}(i)/α,  (8)

where C_{1−α}(i) is the decay centrality of node i with decay parameter 1 − α.

Proof: Please see Appendix A.

Remark 4. Some interpretations of the results of Theorem 1 are in order. It can be observed that the asynchronous setting provides better privacy guarantees than the synchronous setting, since the attacker has less information (i.e., the timing of the events) in this case. Note that differential privacy considers the worst-case scenario. In the synchronous setting, when the attacker detects the activity of a node at time 0, it can infer that the corresponding node is the source immediately. Therefore, the prediction uncertainty is 0 due to this worst-case event, and the privacy guarantees are determined by the attacker's sensing capability α in the synchronous setting. In the asynchronous setting, however, the attacker cannot directly infer the source solely based on the first-observed event due to the lack of associated timing. A counterexample can be found in Fig. 1.

As a result, the structure of the network plays an important role in the asynchronous setting. In the context of information spreading, if two nodes are further apart, it takes more time for the information to be spread from one to the other; this duration gives the attacker more opportunities to differentiate the detected events, which leads to potentially higher privacy loss of the source node's identity. For instance, in the left network of Fig. 2, considering the event S_{1,0}, i.e., node 1's activity being the first observed event by the attacker in the asynchronous setting, the probability of this event given that the source is 6 is p_G^(6)(S_{1,0}) ≤ (1 − α) according to (26), and the probability of this event given that the source is 1 is p_G^(1)(S_{1,0}) ≥ α. But in the right network, the corresponding probabilities are p_G^(6)(S_{1,0}) ≤ (1 − α)^5 and p_G^(1)(S_{1,0}) ≥ α, which makes S_{1,0} a more distinguishable event in the right network. Therefore, the network diameter D_G, as the distance measure of the whole network, captures the potential privacy loss and becomes a key factor of the differential privacy lower bound in (7); an example of the relationship between the differential privacy tolerance level bound and the network diameter is shown in Fig. 3. The same logic is reflected in the prediction uncertainty given in (8). The smaller the decay centrality a network has (i.e., the nodes are more distant
from each other), the more likely the attacker can identify the source node through its observations. Therefore, the inherent network structure imposes certain limits on privacy preservation concerning the source node identity, which applies to all information spreading protocols and calls for other privacy protection mechanisms, to be further explored in future work.

In addition, it can be seen that as the attacker's sensing capability α increases, the privacy guarantees decrease (i.e., δ increases and c decreases). In particular, for an omnipresent attacker with α = 1, we have δ = 1 and c = 0 even in the asynchronous setting.

Finally, given (7), the lower bound of the differential privacy level ε in the asynchronous setting for a given δ can be obtained as follows:

e^ε ≥ max( (α − δ)/(1 − α)^{D_G}, (α − δ)(n − 1)/(1 − α) ).  (9)

[Fig. 4: The index of the active node in Algorithm 1 for a ring graph with 4 nodes. Each node has two neighbors and the probability of it contacting each neighboring node is 1/2. For instance, if node 1 is active (i.e., the Markov chain is in state 1) at time m − 1, then the probabilities of node 2 and node 4 being active at time m are both 1/2.]

B. Private Gossip Algorithm

In this subsection, the differential privacy level ε (with the tolerance level δ = 0) and the prediction uncertainty of Algorithm 1 in the asynchronous setting are examined.²

²Note that in the synchronous setting, δ = 0 may not be achievable.

In this case, since only one node is active at each time slot, the index of the active node can be modeled as a Markov chain with a transition probability matrix A. The (j, i)-th entry of A, denoted by A[j, i], measures the probability of node j contacting node i once it becomes active. Fig. 4 shows an exemplary Markov chain of a ring graph with 4 nodes. In addition, to facilitate the discussion, we further define another matrix Â_i as follows:

Â_i[j, k] = { A[j, k], if j ≠ i;  0, if j = i and j ≠ k;  1, if j = k = i. }  (10)

Remark 5. Note that Â_i corresponds to the transition probability matrix of the Markov chain obtained by setting node i as an absorbing state (i.e., node i will not push its message to any other nodes). Let Â_i^m denote the m-th power of the matrix Â_i. Then, given the source node j, Â_i^m[j, i] and Â_i^{m−1}[j, i] are the probabilities of node i being active at time m and m − 1, respectively. Note that if node i is active at time m − 1, it is active at time m with probability 1 since it is an absorbing state. In this sense, given the source node j, the probability of node i being active for the first time at time m is Â_i^m[j, i] − Â_i^{m−1}[j, i].

To facilitate the discussion, we introduce the following quantity.

Definition 6. The probability of the secret message spreading from the source node j to another node i, denoted by P(j → i), is defined as the probability that, given the source node j, node i becomes active before the attacker observes the first event.

Remark 6. In the asynchronous setting, P(j → i) measures the similarity between node j and node i from the attacker's perspective. Intuitively, given source node j, if node i becomes active before the attacker observes the first event, the attacker cannot differentiate node j and node i based on its observation. The larger the P(j → i), the more difficult it is for the attacker to differentiate node j and node i.

With such consideration, the following lemma can be proved.

Lemma 2. For Algorithm 1, given a general network G with source node j and the observation model described in Section II-C with parameter α, the probability of the secret message spreading from the source node j to another node i for the first time is given by

P(j → i) = α Σ_{m=0}^{∞} (1 − α)^m Â_i^m[j, i] = α(I − (1 − α)Â_i)^{−1}[j, i],  (11)

where I is an identity matrix and Â_i is the transition probability matrix defined in (10).

Proof: Please see Appendix B.

Given Lemma 2, the privacy guarantees of Algorithm 1 can be shown as follows.

Theorem 2. Given a general network G and the observation model described in Section II-C with parameter α, Algorithm 1 is (ε, 0)-differentially private in the asynchronous setting, where

ε = ln( max_{j≠i∈V} 1/(α(I − (1 − α)Â_i)^{−1}[j, i]) ).  (12)

The prediction uncertainty of Algorithm 1 in the asynchronous setting is given by

c = min_{i∈V} Σ_{j≠i∈V} α(I − (1 − α)Â_i)^{−1}[j, i].  (13)

Proof: Please see Appendix C.
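The closed forms (11)–(13) are direct to evaluate numerically. The sketch below is our own (NumPy; the paper's simulations use NetworkX instead, and the function names are assumptions): it builds A and Â_i for the 4-node ring of Fig. 4 and computes P(j → i), ε and c.

```python
import numpy as np

def transition_matrix(adj):
    """A[j, i] = 1/d_j if i is a neighbor of j (uniform random push)."""
    n = len(adj)
    A = np.zeros((n, n))
    for j, nbrs in adj.items():
        for i in nbrs:
            A[j, i] = 1.0 / len(nbrs)
    return A

def privacy_of_private_gossip(adj, alpha):
    """Evaluate eqs. (11)-(13): P(j -> i), epsilon, and c for Algorithm 1."""
    n = len(adj)
    A = transition_matrix(adj)
    P = np.zeros((n, n))                     # P[j, i] = P(j -> i)
    for i in range(n):
        Ah = A.copy()                        # eq. (10): make node i absorbing
        Ah[i, :] = 0.0
        Ah[i, i] = 1.0
        M = alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * Ah)
        P[:, i] = M[:, i]                    # eq. (11)
    off = ~np.eye(n, dtype=bool)
    eps = float(np.log(1.0 / P[off].min())) # eq. (12)
    c = min(float(P[:, i].sum() - P[i, i]) for i in range(n))  # eq. (13)
    return P, eps, c

ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
P, eps, c = privacy_of_private_gossip(ring, alpha=0.5)
# For this ring: P(j -> i) is 2/7 for neighbors and 1/7 across the ring,
# so eps = ln 7 and c = 2/7 + 2/7 + 1/7 = 5/7.
```

The distance dependence is visible already in this toy case: the antipodal pair has the smallest P(j → i) and therefore sets ε, matching the discussion of the diameter's role above.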
 
Remark 7. Note that ε = ln( max_{j≠i∈V} 1/P(j → i) ) and P(j → i) = α Σ_{m=0}^{∞} (1 − α)^m Â_i^m[j, i]. For any m < d(j, i), it can
be verified that Â_i^m[j, i] = 0. On the other hand, (1 − α)^m decreases exponentially. In this sense, P(j → i) is supposed to decrease (and therefore ε will increase) as d(j, i) increases, which verifies our discussion about the importance of the network structure (i.e., the diameter) on differential privacy.

[Fig. 5: ε v.s. number of nodes (α = 0.5).]

[Fig. 6: c v.s. number of nodes (α = 0.5).]

[Fig. 7: Diameter v.s. number of nodes.]

In particular, given the (ε, 0)-differential privacy guarantee, the corresponding tolerance level δ(ε′) can be obtained for any ε′ > 0 using the following corollary.

Corollary 1. Any (ε, 0)-differentially private mechanism is also (ε′, δ(ε′))-differentially private for all ε′ > 0, where

δ(ε′) = Φ(−ε′/µ_1 + µ_1/2) − e^{ε′} Φ(−ε′/µ_1 − µ_1/2),  (14)

and µ_1 is the solution to the following equation:

Φ(−ε/µ_1 + µ_1/2) − e^{ε} Φ(−ε/µ_1 − µ_1/2) = 0.  (15)

Proof: Please see Appendix D.
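The conversion in Corollary 1 is easy to tabulate once µ_1 is known. The sketch below is our own illustration (Python, with Φ built from math.erf). The value of µ_1 here is an assumed placeholder — in the corollary it comes from solving (15) — so the numbers only demonstrate the shape of the tradeoff, with δ(ε′) decaying as ε′ grows.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def delta_of(eps_prime, mu1):
    """Eq. (14): the tolerance level delta(eps') implied by mu1."""
    return (Phi(-eps_prime / mu1 + mu1 / 2.0)
            - math.exp(eps_prime) * Phi(-eps_prime / mu1 - mu1 / 2.0))

mu1 = 0.5  # placeholder value; the corollary obtains mu1 from eq. (15)
table = {ep: delta_of(ep, mu1) for ep in (0.5, 1.0, 2.0)}
# delta shrinks quickly as the privacy budget eps' is relaxed
```

This mirrors the usual reading of an (ε, δ) curve: a larger budget ε′ can be claimed at the cost of a smaller residual tolerance δ(ε′).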
Fig. 5 and Fig. 6 show the differential privacy level ε and prediction uncertainty c of Algorithm 1 for several well-known graph topologies, respectively. In particular, in a complete graph, every node is connected to all the other nodes in the graph. A star graph is a graph in which a central node is connected to all the other nodes and the central node is their only neighbor. A ring graph is a graph that consists of a single cycle in which every node has exactly two edges incident with it. For the grid graph, we consider a two-dimensional square grid. These four graphs are deterministic (i.e., given the number of nodes, the graph structures are fixed). We also consider three random graphs, i.e., the regular graph, the Erdős-Rényi (ER) graph and the Geometric Random (GR) graph.

An ER graph is a graph in which each node is randomly connected to another node with a certain probability. A GR graph is constructed by randomly placing the nodes in some metric space with the Euclidean distance and connecting two nodes by an edge if and only if their distance is less than a specified parameter. A regular graph is a graph in which every node is randomly connected to a fixed number of nodes. In our simulation, we generate these three random graphs using the NetworkX package in Python such that the average degrees are 10.

It can be observed that as the number of nodes increases, the differential privacy level ε increases for all the graphs. On the one hand, in general, as the number of nodes increases, the probability of each node i being active given the source node j (and therefore P(j → i)) decreases.³ On the other hand, the inequality in (2) should be satisfied for any pair of nodes. That being said, as the number of nodes increases, more inequalities need to be satisfied, which enforces a larger ε.

³Note that in private gossip, only one node is active at each time slot and the active node randomly selects its neighbors to push the message.

For prediction uncertainty, different graphs exhibit different trends. One possible reason is that for the complete graph and the regular graph, almost all the nodes are the same from the attacker's viewpoint. As a result, as the number of nodes increases, it becomes more difficult to identify the source node. On the other hand, since prediction uncertainty considers the worst-case scenario, as the number of nodes increases, the probability that there exists a node different from the other nodes (e.g., with a small degree) increases in the ER graph and the GR graph. As a result, the prediction uncertainty decreases as the number of nodes increases.

Given the same number of nodes, it can be seen from Fig. 5 and Fig. 6 that the regular graph and the complete graph (a special regular graph with degree n − 1) perform well in terms of both differential privacy and prediction uncertainty. This may be due to the fact that in these graphs every node has the same degree while none of the nodes are distant from the others. As a result, all the nodes look the same from the attacker's perspective. It is worth mentioning that although the ring graph and the grid graph can be considered as regular
graphs with degrees of 2 and 4, respectively,⁴ the diameters of these two graphs are large (which also leads to small decay centrality, since a large diameter indicates a large shortest path length between the nodes). Therefore, the attacker can still differentiate two distant nodes. Fig. 7 and Fig. 8 show the diameters and the decay centrality of the aforementioned four graphs for a comparison.

[Fig. 8: Decay centrality v.s. number of nodes (α = 0.5).]

[Fig. 9: ε v.s. rewiring probability (n = 500, α = 0.5).]

To further examine the idea that regular graphs with small diameters have good differential privacy and prediction uncertainty performance, the impact of edge rewiring on the privacy guarantees of regular graphs is examined and the results are presented in Fig. 9 and Fig. 10. Particularly, a random regular graph with 500 nodes and a degree of 10/20/50 is first generated. Then, each edge (x, y) in the graph is replaced by another edge (x, z) with a rewiring probability, where z is randomly sampled from the node set V. As the rewiring probability increases, more irregularity is introduced. It can be observed that the differential privacy level ε increases and the prediction uncertainty c decreases, which supports our conjecture.

In summary, given the number of nodes, a graph usually has a smaller ε and a larger c (and therefore better differential […] necessarily be good measures for source anonymity. To further illustrate this idea, the differential privacy and prediction uncertainty of Algorithm 1 within a candidate set is examined. More specifically, the following theorem can be proved.

[Fig. 10: c v.s. rewiring probability (n = 500, α = 0.5).]

Theorem 3. Given a general network G and the observation model described in Section II-C with parameter α, Algorithm 1 is (ε, 0)-differentially private within the candidate set Q, where

ε = ln( max_{k∉Q, j≠i∈Q} max{ 1/(α(I − (1 − α)Â_i)^{−1}[j, i]), (I − (1 − α)Â_k)^{−1}[j, k] / ((I − (1 − α)Â_k)^{−1}[i, k]) } ).  (16)

The prediction uncertainty of Algorithm 1 within the candidate set Q is given by

c = min_{i∈Q, k∉Q} min{ Σ_{j≠i∈Q} α(I − (1 − α)Â_i)^{−1}[j, i], ( Σ_{j≠i∈Q} α(I − (1 − α)Â_i)^{−1}[j, k] ) / ( α(I − (1 − α)Â_i)^{−1}[i, k] ) }.  (17)

Proof: Please see Appendix E.
are presented in Fig. 9 and Fig. 10. Particularly, a random
regular graph with 500 nodes and a degree of 10/20/50 is first Remark 8. We note that in (16) (similarly (17)), the first term
generated. Then, each edge (x, y) in the graph is replaced corresponds to the events for which the attacker observes the
by another edge (x, z) with a rewiring probability, where z activities from the nodes in the candidate set before those from
is randomly sampled from the node set V . As the rewiring the nodes outside the candidate set. Therefore, it is similar to
probability increases, more irregularity is introduced. It can that in Theorem 2. The second term corresponds to the events
be observed that the differential privacy level  increases and for which attacker first observes the activities from the nodes
the prediction uncertainty c decreases, which supports our outside the candidate set.
conjecture. Fig. 11 and Fig. 12 show the differential privacy level  and
In summary, given the number of nodes, a graph usually prediction uncertainty c of Algorithm 1 for the three random
has a smaller  and a larger c (and therefore better differential graphs within a candidate set, respectively. The average degrees
privacy and prediction uncertainty) if it has a small diameter of all the graphs are set as 10. In particular, the candidate
and all the nodes have similar degrees. set Q contains a randomly selected source node and its 1-hop
However, since both differential privacy and prediction neighbors. 15,000 Monte Carlo runs are performed for each
uncertainty consider the worst case scenario, they may not graph, and the average  and c within the candidate set Q are
4 Strictly speaking, the 2-dimensional square grid graph is not regular.
presented.
However, it is close to a regular graph since most of the nodes have the same Different from the results in Fig. 5 and Fig. 6, the GR graph
degree of 4. performs better than the ER graph and the regular graph. This
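The quantities in Theorem 3 reduce to a handful of linear solves, so they are easy to evaluate numerically. The sketch below is a minimal, hypothetical implementation: it assumes Â_i can be modeled as the random-walk matrix of the graph with node i made absorbing, which matches the role Â_i plays in Lemma 2; the paper's exact construction of Â_i from Section II should be substituted in practice.

```python
import numpy as np

def resolvent(P, i, alpha):
    """(I - (1-alpha) A_hat_i)^{-1}, where A_hat_i is a stand-in for the
    paper's matrix: the walk matrix P with node i made absorbing (assumption)."""
    n = P.shape[0]
    A = P.copy()
    A[i, :] = 0.0
    A[i, i] = 1.0
    return np.linalg.inv(np.eye(n) - (1.0 - alpha) * A)

def eps_candidate_set(P, Q, alpha):
    """Differential privacy level within candidate set Q, per Eq. (16)."""
    n = P.shape[0]
    ratios = []
    for i in Q:                      # first term: a node inside Q is observed first
        M = resolvent(P, i, alpha)
        ratios += [1.0 / (alpha * M[j, i]) for j in Q if j != i]
    for k in range(n):               # second term: a node outside Q is observed first
        if k in Q:
            continue
        M = resolvent(P, k, alpha)
        ratios += [M[j, k] / M[i, k] for i in Q for j in Q if j != i]
    return float(np.log(max(ratios)))

# toy example: 4-node cycle, candidate set = a node and its 1-hop neighbors
P = np.array([[0, .5, 0, .5], [.5, 0, .5, 0], [0, .5, 0, .5], [.5, 0, .5, 0]])
print(eps_candidate_set(P, [0, 1, 3], alpha=0.5))
```

Replacing the loop over k ∉ Q with the full node set recovers the global level of Theorem 2, so the same routine can be used to compare the two metrics.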
Fig. 11: ε v.s. number of nodes (α = 0.5). [Plot omitted; curves: ER graph, GR graph, and regular graph, 1-hop neighbors.]

Fig. 12: c v.s. number of nodes (α = 0.5). [Plot omitted; curves: ER graph, GR graph, and regular graph, 1-hop neighbors.]

Fig. 13: ε v.s. average degree (n = 500, α = 0.5). [Plot omitted; curves: ER graph, GR graph, and regular graph, 1-hop neighbors.]

Fig. 14: c v.s. average degree (n = 500, α = 0.5). [Plot omitted; curves: ER graph, GR graph, and regular graph, 1-hop neighbors.]
This is because, in the GR graph, there exist some clusters that are well connected locally. In addition, the 1-hop neighboring nodes of a node are likely to be in the same cluster. As a result, it is more difficult for the attacker to distinguish two nodes in the candidate set for the GR graph.

Remark 9. The above results show that the original differential privacy and prediction uncertainty may have limitations in measuring the privacy guarantees of gossip protocols. Particularly, Fig. 5 and Fig. 6 show that the regular graph has a smaller differential privacy level ε and a larger prediction uncertainty c than the GR graph. However, this does not necessarily mean that the regular graph always provides better privacy protection. Fig. 11 and Fig. 12 show that the GR graph performs better than the regular graph in terms of differential privacy and prediction uncertainty within the 1-hop neighbors. In practice, the attacker can often obtain some side information about the source. For instance, the source of the leaked information about a company usually has a close connection to the company. In this case, the source may not be interested in hiding itself among all the nodes in the whole network, but instead among those closely related to the company. That being said, if the source node cares more about hiding itself among a subset of nodes (e.g., its 1-hop neighbors), differential privacy and prediction uncertainty within the corresponding candidate set may serve as better privacy metrics.

Fig. 13 and Fig. 14 show the impact of average degree on ε and c for the three random graphs. It can be observed that, as the average degree increases, the ε's of the ER graph and the regular graph first increase and then decrease. Intuitively, ε depends on two parameters: the size of the candidate set and the connectivity of the graphs. As the average degree increases, the size of the candidate set becomes larger, while the graphs become better connected. When the average degree is small, the impact of the candidate set size dominates that of the connectivity. As a result, increasing the average degree leads to a larger ε. When the average degree is large enough, the impact of the connectivity dominates that of the candidate set size. Therefore, a larger average degree corresponds to a smaller ε. The same analysis can be applied to the prediction uncertainty. On the other hand, for the GR graph, as the average degree increases, the number of nodes in the candidate set increases. Since the nodes in the candidate set are usually well connected (like a complete graph), it is similar to the complete graph with an increasing number of nodes. As a result, the trends of the curves for the GR graph are similar to those of the complete graph in Fig. 5 and Fig. 6.

IV. PRIVACY-SPREADING TRADEOFF OF GOSSIP PROTOCOLS IN WIRELESS NETWORKS

Considering that in many real-world applications, the information spreading between two nodes may be realized through wireless communications [2], [12], the privacy guarantees of gossip protocols in wireless networks are investigated in this section. It is assumed that the communications between the network nodes and between the attacker and its deployed
sensors are prone to errors due to various interferences. To simplify the analysis, a failure probability is considered in this setting: due to interferences, the communications between two nodes will fail with a probability of f during the gossip step, and it is assumed that the attacker fails to receive a report from any of its deployed sensors about the detected events with the same probability f.⁵ Note that the failure probability f, induced by detrimental effects in wireless channels, is different from the detection probability α, which is due to the limitation in the eavesdropping capability (e.g., computation power) of the sensors. In this case, the privacy guarantees of gossip protocols are characterized in the following theorem.

5 As a first work in this area, this simplified assumption is adopted to facilitate the characterization of the tradeoff between privacy and spreading speed. More realistic assumptions concerning two different but correlated failure probabilities [13] warrant further study.

Theorem 4. Considering the same setting as in Theorem 1, with the additional constraint that both the legitimate communication and the adversarial reporting fail with a probability f, the gossip-based protocols can guarantee (ε, δ)-differential privacy with δ ≥ α(1 − f) and c-prediction uncertainty with c = 0 in the synchronous setting, and δ ≥ max[α(1 − f) − e^ε(1 − α(1 − f))^{D_G}, α(1 − f) − e^ε (1 − α(1 − f))/(n − 1)] and c ≤ min_{i∈V} C_{1−α(1−f)}(i)/(α(1 − f)) in the asynchronous setting.

The privacy guarantees of Algorithm 1 are characterized in the following theorem.

Theorem 5. Consider the same setting as in Theorem 2, with the additional constraint that both the legitimate communication and the adversarial reporting fail with a probability f. Algorithm 1 is (ε, 0)-differentially private in the asynchronous setting, where

ε = ln max_{j≠i∈V} 1 / (α(1 − f)(I − (1 − α(1 − f))Â_i)^{-1}[j, i]).   (18)

The prediction uncertainty of Algorithm 1 in the asynchronous setting is given by

c = min_{i∈V} Σ_{j≠i∈V} α(1 − f)(I − (1 − α(1 − f))Â_i)^{-1}[j, i].   (19)

The proofs of the above theorems readily follow from the previous results, and the details are omitted in the interest of space.

Adding artificial noise is a typical way to enhance privacy in practical applications [14]. In wireless networks, interference is a natural source for privacy enhancement, as it hampers the attacker's observations of the network activities, which can be further strengthened through approaches such as friendly jamming [15]. However, the information spreading process is impeded as well in such scenarios. The information spreading time of the standard and the private gossip protocols in this case is given below.

Theorem 6. In a wireless network G in which the communications fail with a probability of f, we have
1) In the synchronous setting, the private gossip takes C_G/(1 − f) rounds on average to inform all nodes in the network, where C_G is the cover time of a random walk in network G.
2) In the asynchronous setting, the private gossip takes C_G/(1 − f) time on average, while the standard gossip takes T_as/(1 − f) time on average to finish spreading, where T_as is the spreading time of standard gossip when the communication is perfect.

Proof: Please see Appendix F.

Remark 10. For standard gossip in the synchronous setting, multiple random walks can exist during the spreading process, which renders the analysis of unreliable spreading challenging in general networks, but we conjecture that a similar result as in the asynchronous setting may hold.

The above results indicate a trade-off between privacy and spreading speed of gossip protocols, which is further explored through simulations below. In particular, following the existing literature in information spreading (e.g., [16], [17]), ER networks and GR networks with a total number of n = 100000 nodes and an average node degree of 10 are considered. Each point in the following figures is obtained through simulations with 5 network instances and 100 Monte Carlo runs for each instance. The average 90% spreading time is considered [12]. The privacy-spreading tradeoffs for ER and GR networks for standard gossip in the synchronous and asynchronous settings are shown in Fig. 15 and Fig. 16, respectively. It is assumed that α = 0.5 and privacy budget ε = 1 without loss of generality. The corresponding privacy lower bounds δ on the x-axis are calculated for the considered ER and GR networks using Theorem 4, given the failure probability f (a one-to-one correspondence). Similar results are obtained for private gossip and omitted here due to the space constraint.

Fig. 15: DP v.s. spreading speed in the synchronous setting. [Plot omitted; average spreading time v.s. δ (top axis: f) for the ER and GR networks, with sweet operation points marked.]

Remark 11. Through analysis, it can be seen that the spreading time is inversely proportional to 1 − f, while the privacy lower bound δ is proportional to 1 − f. From Fig. 15 and Fig. 16, it can be seen that when δ increases from 0.05 to 0.1, for GR networks, the average spreading time decreases from around 3600 and 2600 to 1800 and 1300 in the synchronous and asynchronous settings, respectively. This means that we can trade a small loss of privacy for dramatic improvement in spreading time.
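The shape of this tradeoff already follows from the closed-form scalings: by Theorem 4 the synchronous privacy lower bound scales as δ ≥ α(1 − f), while by Theorem 6 the expected spreading time scales as 1/(1 − f). A small illustrative sketch (the baseline time T0 is an assumed placeholder, not a value from the paper):

```python
import numpy as np

alpha = 0.5        # detection probability, as in the simulations
T0 = 1000.0        # assumed spreading time under perfect communication (placeholder)

f = np.linspace(0.0, 0.9, 10)      # communication failure probability
delta = alpha * (1.0 - f)          # synchronous DP lower bound (Theorem 4)
T = T0 / (1.0 - f)                 # expected spreading time scaling (Theorem 6)

for fi, d, t in zip(f, delta, T):
    print(f"f = {fi:.1f}:  delta >= {d:.2f},  spreading time ~ {t:.0f}")
```

Raising f strengthens privacy (smaller δ) exactly as it slows spreading, which is the tension behind the sweet operation points in Fig. 15 and Fig. 16.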
On the other hand, for ER networks or GR networks with large δ (small f), the average spreading time increases slowly as δ decreases. Therefore, the privacy guarantees of gossip protocols can be strengthened with a small loss of spreading time (e.g., the sweet operation points in Fig. 15 and Fig. 16), which suggests that methods like adding artificial noise can be useful in privacy-preserving information spreading.

Fig. 16: DP v.s. spreading speed in the asynchronous setting. [Plot omitted; average spreading time v.s. δ (top axis: f) for the ER and GR networks, with sweet operation points marked.]

V. PRIVACY OF GOSSIP PROTOCOLS IN DELAYED MONITORING

In reality, the attacker may not monitor the whole information spreading process right from the beginning. In this section, we try to quantify the differential privacy of general gossip protocols when the monitoring is delayed. To avoid complication, it is assumed that the communications between nodes and the reception at the attacker are perfect. In addition, the attacker knows the global time in the synchronous setting, or the number of communications that have occurred in the asynchronous setting, since the beginning of information spreading.

Theorem 7. Considering the same setting as in Theorem 1, if the attacker starts monitoring the information spreading process t rounds (or t steps of gossip communications in the asynchronous case) after it begins and t < D_G, the gossip-based protocols can guarantee (ε, δ)-differential privacy with δ ≥ (1/d_max^t) α in the synchronous setting. In the asynchronous setting,

δ ≥ max[ α/(d_max^t (t + 1)!) − e^ε (1 − α)^{D_G − t}, α/(d_max^t (t + 1)!) − e^ε (1 − α/(d_max^t (t + 1)!))/(n − 1), 0 ],   (20)

in which d_max = max_{i∈V} d_i is the largest node degree.

Proof: Please see Appendix G.

Remark 12. Gossip protocols are not able to protect the source's identity effectively during the early stage of information spreading. As the spreading process continues, more and more randomness is introduced, leading to stronger and stronger privacy. Therefore, in delayed monitoring, it becomes more difficult for the attacker to identify the source node as the delay increases.

Theorem 8. Given a general network G and the observation model described in Section II-C with parameter α, if the attacker starts monitoring the information spreading process t steps of gossip communications after it begins, Algorithm 1 is (ε, 0)-differentially private in the asynchronous setting, where

e^ε = max_{i∈V, j≠z∈V} (A^t[j, i] + Σ_{k≠i∈V} A^t[j, k] α(I − (1 − α)Â_i)^{-1}[k, i]) / (A^t[z, i] + Σ_{k≠i∈V} A^t[z, k] α(I − (1 − α)Â_i)^{-1}[k, i]).   (21)

The prediction uncertainty of Algorithm 1 in the asynchronous setting is given by

c = min_{i∈V, z∈V} (Σ_{j≠i∈V} [A^t[j, z] + Σ_{k≠z∈V} A^t[j, k] α(I − (1 − α)Â_z)^{-1}[k, z]]) / (A^t[i, z] + Σ_{k≠z∈V} A^t[i, k] α(I − (1 − α)Â_z)^{-1}[k, z]).   (22)

Proof: Please see Appendix H.

Remark 13. Note that in (21), α(I − (1 − α)Â_i)^{-1}[k, i] = P(k → i) = α Σ_{m=0}^{∞} (1 − α)^m Â_i^m[k, i]. For any k ≠ i, Â_i^0[k, i] = I[k, i] = 0. Therefore,

α(I − (1 − α)Â_i)^{-1}[k, i] = α Σ_{m=1}^{∞} (1 − α)^m Â_i^m[k, i] ≤ α Σ_{m=1}^{∞} (1 − α)^m = 1 − α.   (23)

As a result,

A^t[j, i] + Σ_{k≠i∈V} A^t[j, k] α(I − (1 − α)Â_i)^{-1}[k, i] < A^t[j, i] + Σ_{k≠i∈V} A^t[j, k] = 1.   (24)

On the other hand,

A^t[z, i] + Σ_{k≠i∈V} A^t[z, k] α(I − (1 − α)Â_i)^{-1}[k, i] > A^t[z, i] min_{k≠i∈V} α(I − (1 − α)Â_i)^{-1}[k, i] + Σ_{k≠i∈V} A^t[z, k] α(I − (1 − α)Â_i)^{-1}[k, i] ≥ min_{k≠i∈V} α(I − (1 − α)Â_i)^{-1}[k, i].   (25)

Comparing Eq. (21) with Eq. (12), we can see that the ε in Theorem 8 is smaller than that in Theorem 2, i.e., in delayed monitoring, the differential privacy guarantee is enhanced. A similar result can be obtained for the prediction uncertainty.

Fig. 17 and Fig. 18 show the impact of delay time t on the differential privacy level ε and the prediction uncertainty c, respectively. It can be observed that as the delay time increases, ε (c) quickly decreases (increases) for the ER graph, the regular graph and the complete graph. This is because, in these graphs, within the delay time t, the information is quickly spread to the rest of the nodes in the graph, which makes it difficult for the attacker to identify the source.
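Eq. (21) is also straightforward to evaluate: it only needs the t-step transition matrix A^t of the pre-monitoring spread and the same resolvent as in Theorem 2. A minimal sketch under stated assumptions (A is taken to be the one-step random-walk matrix of the graph, and Â_i is modeled as that matrix with node i made absorbing; the paper's Section II definitions should be used in practice):

```python
import numpy as np

def delayed_eps(P, t, alpha):
    """DP level of Algorithm 1 under t-step delayed monitoring, per Eq. (21).
    P doubles as A (pre-monitoring transition) and as the base of A_hat_i."""
    n = P.shape[0]
    At = np.linalg.matrix_power(P, t)
    best = 0.0
    for i in range(n):
        Ahat = P.copy()
        Ahat[i, :] = 0.0
        Ahat[i, i] = 1.0                                  # node i absorbing (assumption)
        M = alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * Ahat)
        w = M[:, i].copy()                                # w[k] = P(k -> i), Lemma 2
        w[i] = 1.0                                        # weight 1 on k = i in Eq. (21)
        q = At @ w                                        # q[j] = numerator of (21) for source j
        for j in range(n):
            for z in range(n):
                if j != z:
                    best = max(best, q[j] / q[z])
    return float(np.log(best))

# 3-node cycle: the level drops as the monitoring delay grows (cf. Remark 13)
P = np.array([[0, .5, .5], [.5, 0, .5], [.5, .5, 0]])
print(delayed_eps(P, 0, 0.5), delayed_eps(P, 2, 0.5))
```

With t = 0 the matrix A^t is the identity and the routine reduces to the ε of Theorem 2, which is a useful consistency check.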
For the GR graph, however, during the delay time, it is likely that the information remains in the same cluster that the source node belongs to. As a result, it is still not difficult for the attacker to distinguish the nodes across clusters.

Fig. 17: ε v.s. delay time (n = 500, α = 0.5). [Plot omitted; curves: complete graph, ER graph, GR graph, regular graph.]

Fig. 18: c v.s. delay time (n = 500, α = 0.5). [Plot omitted; curves: complete graph, ER graph, GR graph, regular graph.]

VI. CONCLUSIONS AND FUTURE WORKS

In this work, the privacy guarantees of gossip-based protocols in general networks are investigated. Bounds on the differential privacy and prediction uncertainty of gossip protocols in general networks are obtained. For the private gossip algorithm, the differential privacy and prediction uncertainty guarantees are derived in closed form. It is found that these two metrics are closely related to some key network structure parameters, such as network diameter and decay centrality, in the (arguably) more interesting asynchronous setting. Moreover, considering that differential privacy and prediction uncertainty may be restrictive in some scenarios, the relaxed variants of these two metrics are proposed. In wireless networks, through a simplified modeling of unreliable communications, the tradeoff between privacy and spreading efficiency is revealed, and it is suggested that natural or artificial interference can enhance the privacy of gossip protocols at the cost of a decrease in spreading speed. Finally, in delayed monitoring, it is verified that the privacy of gossip protocols is enhanced as the delay time increases, and the corresponding effect is quantified.

Many interesting problems remain open in this line of research besides those already mentioned above. Investigating more appropriate privacy metrics and designing more effective privacy-aware gossip protocols for different observation models (e.g., network snapshot [6]) and network structures are interesting future directions.

REFERENCES

[1] R. Chandra, V. Ramasubramanian, and K. Birman, "Anonymous gossip: Improving multicast reliability in mobile ad-hoc networks," in Proceedings of the 21st International Conference on Distributed Computing Systems, 2001, pp. 275-283.
[2] A. D. Dimakis, A. D. Sarwate, and M. J. Wainwright, "Geographic gossip: Efficient averaging for sensor networks," IEEE Transactions on Signal Processing, vol. 56, no. 3, pp. 1205-1216, 2008.
[3] A. J. Ganesh, A.-M. Kermarrec, and L. Massoulié, "Peer-to-peer membership management for gossip-based protocols," IEEE Transactions on Computers, vol. 52, no. 2, pp. 139-149, 2003.
[4] P. Bianchi and J. Jakubowicz, "Convergence of a multi-agent projected stochastic gradient algorithm for non-convex optimization," IEEE Transactions on Automatic Control, vol. 58, no. 2, pp. 391-405, 2013.
[5] Y. Liu, J. Liu, and T. Basar, "Differentially private gossip gradient descent," in Proceedings of the 57th IEEE Conference on Decision and Control, 2018, pp. 2777-2782.
[6] J. Jiang, S. Wen, S. Yu, Y. Xiang, and W. Zhou, "Identifying propagation sources in networks: State-of-the-art and comparative studies," IEEE Communications Surveys & Tutorials, vol. 19, no. 1, pp. 465-481, 2016.
[7] G. Fanti, P. Kairouz, S. Oh, K. Ramchandran, and P. Viswanath, "Hiding the rumor source," IEEE Transactions on Information Theory, vol. 63, no. 10, pp. 6679-6713, 2017.
[8] C. Dwork and A. Roth, "The algorithmic foundations of differential privacy," Foundations and Trends in Theoretical Computer Science, vol. 9, no. 3-4, pp. 211-407, 2014.
[9] A. Bellet, R. Guerraoui, and H. Hendrikx, "Who started this rumor? Quantifying the natural differential privacy guarantees of gossip protocols," arXiv preprint arXiv:1902.07138, 2019.
[10] L. Backstrom, P. Boldi, M. Rosa, J. Ugander, and S. Vigna, "Four degrees of separation," in Proceedings of the 4th Annual ACM Web Science Conference, 2012, pp. 33-42.
[11] M. O. Jackson, Social and Economic Networks. Princeton University Press, 2010.
[12] H. Zhang, Z. Zhang, and H. Dai, "Gossip-based information spreading in mobile networks," IEEE Transactions on Wireless Communications, vol. 12, no. 11, pp. 5918-5928, 2013.
[13] M. C. Vuran and I. F. Akyildiz, "Spatial correlation-based collaborative medium access control in wireless sensor networks," IEEE/ACM Transactions on Networking, vol. 14, no. 2, pp. 316-329, 2006.
[14] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, "Deep learning with differential privacy," in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 308-318.
[15] J. P. Vilela, M. Bloch, J. Barros, and S. W. McLaughlin, "Wireless secrecy regions with friendly jamming," IEEE Transactions on Information Forensics and Security, vol. 6, no. 2, pp. 256-266, 2011.
[16] B. Min, S.-H. Gwak, N. Lee, and K.-I. Goh, "Layer-switching cost and optimality in information spreading on multiplex networks," Scientific Reports, vol. 6, p. 21392, 2016.
[17] A. Picu, T. Spyropoulos, and T. Hossmann, "An analysis of the information spreading delay in heterogeneous mobility DTNs," in 2012 IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), 2012, pp. 1-10.
[18] J. Dong, A. Roth, and W. J. Su, "Gaussian differential privacy," arXiv preprint arXiv:1905.02383, 2019.
Appendices

APPENDIX A
PROOF OF THEOREM 1

Proof: First, for the synchronous setting, let S_{i,0} be the event that node i's activity is observed by the attacker's sensors at time 0. Then, the probability that such an event happens given that the source node is i is p_G^{(i)}(S_{i,0}) = α. If the source node is any other node j ≠ i, p_G^{(j)}(S_{i,0}) = 0, since node i cannot initialize a communication if it is not a source node at time 0. Therefore, δ ≥ α and c = 0.

In the asynchronous setting, let S_{i,0} be the event that node i's activity is observed by the attacker's sensors as its first observed event. It can be seen that, if the source node is i, then p_G^{(i)}(S_{i,0}) = p_G^{(i)}(S_{i,0} | T_{i,0}) p_G^{(i)}(T_{i,0}) + p_G^{(i)}(S_{i,0} | T̄_{i,0}) p_G^{(i)}(T̄_{i,0}) ≥ α, where T_{i,0} stands for the event that the source node is detected during its first communication. If the source node is j, we can consider the following event, denoted as O_{d(i,j)}: there is no communication detected by the sensors in the network after d(i, j) gossip actions have been executed from the beginning. Then we have

p_G^{(j)}(S_{i,0}) = p_G^{(j)}(S_{i,0} ∩ O_{d(i,j)}) + p_G^{(j)}(S_{i,0} ∩ Ō_{d(i,j)}) = p_G^{(j)}(S_{i,0} ∩ O_{d(i,j)}) ≤ p_G^{(j)}(O_{d(i,j)}) = (1 − α)^{d(i,j)},   (26)

where the second equality is due to the fact that S_{i,0} ∩ Ō_{d(i,j)} = ∅, as it takes at least d(i, j) communications for the information to be delivered to node i from node j. Since p_G^{(i)}(S_{i,0}) ≥ α, by applying Lemma 1, we have

δ ≥ max_{i,j} (α − e^ε (1 − α)^{d(i,j)}) = α − e^ε (1 − α)^{D_G}.   (27)

On the other hand, since Σ_{j∈V} p_G^{(i)}(S_{j,0}) = 1, there exists a node l ∈ V such that

p_G^{(i)}(S_{l,0}) ≤ (1/(n − 1)) Σ_{j∈V, j≠i} p_G^{(i)}(S_{j,0}) = (1 − p_G^{(i)}(S_{i,0}))/(n − 1) ≤ (1 − α)/(n − 1).   (28)

This implies δ ≥ α − e^ε (1 − α)/(n − 1). By Eq. (27), we have

δ ≥ max[ α − e^ε (1 − α)^{D_G}, α − e^ε (1 − α)/(n − 1) ].   (29)

Meanwhile, as we have p_G^{(j)}(S_{i,0}) ≤ (1 − α)^{d(i,j)}, the detection uncertainty can be calculated as

c = min_{i,S} (Σ_{j≠i} p_G^{(j)}(S) / p_G^{(i)}(S)) ≤ min_i (Σ_{j≠i} (1 − α)^{d(i,j)} / α) = min_{i∈V} C_{1−α}(i)/α.   (30)

APPENDIX B
PROOF OF LEMMA 2

Proof: Let P_m(j → i) denote the probability that node i is active at time m for the first time before the attacker observes the first event, given the source node j. Then we have

P(j → i) = Σ_{m=1}^{∞} P_m(j → i).   (31)

Recall from Remark 5 that the probability of node i being active at time m for the first time is Â_i^m[j, i] − Â_i^{m−1}[j, i]. Therefore, we have

P_m(j → i) = (1 − α)^m [Â_i^m[j, i] − Â_i^{m−1}[j, i]].   (32)

Plugging (32) into (31) yields

P(j → i) = Σ_{m=1}^{∞} (1 − α)^m [Â_i^m[j, i] − Â_i^{m−1}[j, i]]
= Σ_{m=1}^{∞} (1 − α)^m Â_i^m[j, i] − Σ_{m=1}^{∞} (1 − α)^{m+1} Â_i^m[j, i] − (1 − α)Â_i^0[j, i]
= Σ_{m=1}^{∞} α(1 − α)^m Â_i^m[j, i] − (1 − α)I[j, i]
= α Σ_{m=0}^{∞} (1 − α)^m Â_i^m[j, i],   (33)

where the last equality uses I[j, i] = Â_i^0[j, i] = 0 for j ≠ i. Since Â_i is a right stochastic matrix, it is known that its eigenvalues satisfy |λ_{Â_i}| ≤ 1. Therefore, the eigenvalues of (1 − α)Â_i are strictly less than 1 in magnitude. As a result, (33) can be written alternatively as

P(j → i) = α Σ_{m=0}^{∞} (1 − α)^m Â_i^m[j, i] = α(I − (1 − α)Â_i)^{-1}[j, i],   (34)

which completes the proof.

APPENDIX C
PROOF OF THEOREM 2

Proof: Let S_{i,0} denote the event such that node i's activity is observed by the attacker as its first observation. Then, for any j ≠ i we have

p_G^{(j)}(S_{i,0}) = P(j → i) p_G^{(i)}(S_{i,0}) ≤ p_G^{(i)}(S_{i,0}).   (35)

On the other hand, by Lemma 2,

p_G^{(i)}(S_{i,0}) = (1/P(j → i)) p_G^{(j)}(S_{i,0}) = (1/(α(I − (1 − α)Â_i)^{-1}[j, i])) p_G^{(j)}(S_{i,0}).   (36)
Therefore, according to Definition 1,

e^ε = max_{j≠i∈V} 1/(α(I − (1 − α)Â_i)^{-1}[j, i]).   (37)

Taking the logarithm on both sides of (37) gives the results in Theorem 2.

Regarding the prediction uncertainty, we have

p_G(I_0 ≠ {i} | S_{i,0}) / p_G(I_0 = {i} | S_{i,0})
= Σ_{j≠i∈V} p_G^{(j)}(S_{i,0}) / p_G^{(i)}(S_{i,0})
= Σ_{j≠i∈V} α(I − (1 − α)Â_i)^{-1}[j, i] p_G^{(i)}(S_{i,0}) / p_G^{(i)}(S_{i,0})
= Σ_{j≠i∈V} α(I − (1 − α)Â_i)^{-1}[j, i].   (38)

For any k ≠ i ∈ V, it can be shown that

p_G(I_0 ≠ {i} | S_{k,0}) / p_G(I_0 = {i} | S_{k,0})
= Σ_{j≠i∈V} p_G^{(j)}(S_{k,0}) / p_G^{(i)}(S_{k,0})
= (p_G^{(k)}(S_{k,0}) + Σ_{j≠i,k∈V} p_G^{(j)}(S_{k,0})) / p_G^{(i)}(S_{k,0})
= (p_G^{(k)}(S_{k,0}) + Σ_{j≠i,k∈V} α(I − (1 − α)Â_k)^{-1}[j, k] p_G^{(k)}(S_{k,0})) / (α(I − (1 − α)Â_k)^{-1}[i, k] p_G^{(k)}(S_{k,0}))
= (1 + Σ_{j≠i,k∈V} α(I − (1 − α)Â_k)^{-1}[j, k]) / (α(I − (1 − α)Â_k)^{-1}[i, k])
≥ (Σ_{j≠k∈V} α(I − (1 − α)Â_k)^{-1}[j, k]) / (α(I − (1 − α)Â_k)^{-1}[i, k])
≥ Σ_{j≠k∈V} α(I − (1 − α)Â_k)^{-1}[j, k]
= p_G(I_0 ≠ {k} | S_{k,0}) / p_G(I_0 = {k} | S_{k,0}).   (39)

As a result,

c = min_{i, S⊆S} (p_G(I_0 ≠ {i} | S) / p_G(I_0 = {i} | S)) = min_i Σ_{j≠i∈V} α(I − (1 − α)Â_i)^{-1}[j, i],   (40)

which completes the proof.

APPENDIX D
PROOF OF COROLLARY 1

Proof: The proof of Corollary 1 readily follows from the following results on Gaussian differential privacy.

Theorem 9. [18] A mechanism is µ-Gaussian differentially private if and only if it is (ε, δ(ε))-differentially private for all ε ≥ 0, where

δ(ε) = Φ(−ε/µ + µ/2) − e^ε Φ(−ε/µ − µ/2).   (41)

Given any (ε, 0)-differentially private mechanism, the corresponding µ₁ can be obtained by (41). Applying µ₁ and the above theorem completes the proof. Interested readers may refer to [18] for more details.

APPENDIX E
PROOF OF THEOREM 3

Proof: Let S_{i,0} denote the event such that node i's activity is observed by the attacker as its first observation. For any j ≠ i ∈ Q, similar to the proof of Theorem 2, we have

p_G^{(j)}(S_{i,0}) = α(I − (1 − α)Â_i)^{-1}[j, i] p_G^{(i)}(S_{i,0}) ≤ p_G^{(i)}(S_{i,0}),   (42)

and

p_G^{(i)}(S_{i,0}) = (1/(α(I − (1 − α)Â_i)^{-1}[j, i])) p_G^{(j)}(S_{i,0}).   (43)

For any k ∉ Q and j ≠ i ∈ Q, we have

p_G^{(j)}(S_{k,0}) = α(I − (1 − α)Â_k)^{-1}[j, k] p_G^{(k)}(S_{k,0}),   (44)

and

p_G^{(i)}(S_{k,0}) = α(I − (1 − α)Â_k)^{-1}[i, k] p_G^{(k)}(S_{k,0}).   (45)

Therefore,

p_G^{(j)}(S_{k,0}) = (α(I − (1 − α)Â_k)^{-1}[j, k] / (α(I − (1 − α)Â_k)^{-1}[i, k])) p_G^{(i)}(S_{k,0}).   (46)

As a result,

e^ε = max_{k∉Q, j≠i∈Q} { 1/(α(I − (1 − α)Â_i)^{-1}[j, i]), (I − (1 − α)Â_k)^{-1}[j, k] / ((I − (1 − α)Â_k)^{-1}[i, k]) }.   (47)

Taking the logarithm on both sides of (47) gives the result in Theorem 3.

Regarding the prediction uncertainty, recall that for any S ⊆ S, we have

p_G(I_0 ≠ {i} | S) / p_G(I_0 = {i} | S) = Σ_{j≠i∈Q} p_G^{(j)}(S) / p_G^{(i)}(S).   (48)

Similar to the proof of Theorem 2, for any event S_{i,0} ⊆ S and j ≠ i ∈ Q, it can be shown that

p_G(I_0 ≠ {i} | S_{i,0}) / p_G(I_0 = {i} | S_{i,0})
= Σ_{j≠i∈Q} p_G^{(j)}(S_{i,0}) / p_G^{(i)}(S_{i,0})
= Σ_{j≠i∈Q} α(I − (1 − α)Â_i)^{-1}[j, i] p_G^{(i)}(S_{i,0}) / p_G^{(i)}(S_{i,0})
= Σ_{j≠i∈Q} α(I − (1 − α)Â_i)^{-1}[j, i].   (49)
For any event S_{k,0} ⊆ S such that k ∈ Q,

p_G(I_0 ≠ {i} | S_{k,0}) / p_G(I_0 = {i} | S_{k,0}) ≥ p_G(I_0 ≠ {k} | S_{k,0}) / p_G(I_0 = {k} | S_{k,0}).   (50)

For any event S_{k,0} ⊆ S such that k ∉ Q, we have

p_G(I_0 ≠ {i} | S_{k,0}) / p_G(I_0 = {i} | S_{k,0})
= Σ_{j≠i∈Q} p_G^{(j)}(S_{k,0}) / p_G^{(i)}(S_{k,0})
= Σ_{j≠i∈Q} α(I − (1 − α)Â_k)^{-1}[j, k] p_G^{(k)}(S_{k,0}) / (α(I − (1 − α)Â_k)^{-1}[i, k] p_G^{(k)}(S_{k,0}))
= (Σ_{j≠i∈Q} α(I − (1 − α)Â_k)^{-1}[j, k]) / (α(I − (1 − α)Â_k)^{-1}[i, k]).   (51)

Therefore,

c = min_{i∈Q, k∉Q} { Σ_{j≠i∈Q} α(I − (1 − α)Â_i)^{-1}[j, i], (Σ_{j≠i∈Q} α(I − (1 − α)Â_k)^{-1}[j, k]) / (α(I − (1 − α)Â_k)^{-1}[i, k]) },   (52)

which completes the proof.

APPENDIX F
PROOF OF THEOREM 6

Proof: The spreading process can be considered as a Markov chain with each state representing the number of informed nodes N_inf in the graph. Let X_k ≜ {N_inf = k} denote the k-th state of the Markov chain. For standard gossip in the asynchronous setting and private gossip (in both the synchronous and the asynchronous settings), due to the fact that only one node in the graph is active at any time during the spreading process, the state can only move from X_k to X_{k+1} for all k ∈ {0, 1, ..., n − 1}. Each gossip action can be considered as a Bernoulli trial (successful if a previously uninformed node is informed). Given a failure probability of f, the corresponding success probability is reduced by a factor of (1 − f). In this sense, the expected interstate time, which is essentially the expected time between two consecutive successful gossip actions, is amplified by a factor of 1/(1 − f). As a result, the expected time to reach the final state, i.e., the expected spreading time, is 1/(1 − f) times that of the perfect communication scenario. Finally, private gossip is a single random walk in both the synchronous and asynchronous settings, and C_G is the expected time to inform all nodes in the graph.

APPENDIX G
PROOF OF THEOREM 7

Proof: In the synchronous setting, consider two nodes i, j such that d(i, j) = D_G, and the event that node i's activity is observed by the attacker at the moment when it starts monitoring, which is denoted as S_{i,0}. Considering another node k such that d(k, i) = t, the probability that i is informed at round t is

p_G^{(k)}(i ∈ I_t) ≥ Σ_{p_{k→i}: L(p_{k→i})=t} Π_{m∈p_{k→i}} (1/d_m) ≥ (1/d_max)^t,   (53)

where p_{k→i} is a path from node k to node i and L(p_{k→i}) is the length of this path. Then p_G^{(k)}(S_{i,0}) ≥ (1/d_max)^t α. It is clear that p_G^{(j)}(S_{i,0}) = 0, since it takes at least D_G > t rounds for the information to be delivered to node i from node j. Therefore, by Lemma 1, δ ≥ (1/d_max)^t α.

In the asynchronous setting, again, consider two nodes i, j such that d(i, j) = D_G, and let S_{i,0} denote the event that node i's activity is the first one observed by the attacker. If j is the source node, denote the set of informed and active nodes after t steps of communications as INA_t(j). From this set, find the node k ∈ INA_t(j) that has the shortest path to node i. Clearly, it requires at least d(k, i) (≥ D_G − t) steps for the information to reach node i from any node in INA_t(j). Consider O_{INA_t→i} as the event that no communication is observed by the attacker during the process that the information flows from INA_t(j) to node i. Then,

p_G^{(j)}(S_{i,0}) = p_G^{(j)}(S_{i,0} ∩ O_{INA_t→i}) ≤ p_G^{(j)}(O_{INA_t→i}) ≤ (1 − α)^{d(k,i)} ≤ (1 − α)^{D_G−t}.   (54)

Also, considering another node l such that d(l, i) = t, the probability that node i is informed at the t-th step from the beginning of information spreading is

p_G^{(l)}(i ∈ I_t) ≥ Σ_{p_{l→i}: L(p_{l→i})=t} (Π_{m∈p_{l→i}} 1/d_m) (1/t!) ≥ 1/(d_max^t t!),   (55)

where 1/t! is the probability that all nodes in a path p_{l→i} are activated (i.e., their clocks tick) in a fixed order so that the information reaches node i after t steps from node l. Finally, the probability that node i is activated and its gossip action is observed by the attacker is α/(t + 1). Therefore, p_G^{(l)}(S_{i,0}) ≥ α/(d_max^t (t + 1)!). By Lemma 1 and the same logic as Eq. (28), we have Eq. (20).

APPENDIX H
PROOF OF THEOREM 8

Proof: Let p_{G,t}^{(i)}(S_{i,0}) denote the probability of the attacker observing S_{i,0} given source node i in the scenario where the attacker starts monitoring the information spreading process t steps of gossip communications after it begins. In this sense, p_{G,0}^{(j)}(S_{i,0}) is the probability of the event S_{i,0} if the source is node j and the attacker starts monitoring at the beginning of the information spreading process. In the following, the relationship between p_{G,t}^{(j)}(S_{i,0}) and p_{G,0}^{(j)}(S_{i,0}) is derived, after which analysis similar to the proof of Theorem 2 can be applied to obtain the differential privacy level ε. Particularly, for any j ∈ V, we have

p_{G,t}^{(j)}(S_{i,0}) = Σ_{k∈V} A^t[j, k] p_{G,0}^{(k)}(S_{i,0})
= A^t[j, i] p_{G,0}^{(i)}(S_{i,0}) + Σ_{k≠i∈V} A^t[j, k] p_{G,0}^{(k)}(S_{i,0})
= A^t[j, i] p_{G,0}^{(i)}(S_{i,0}) + Σ_{k≠i∈V} A^t[j, k] α(I − (1 − α)Â_i)^{-1}[k, i] p_{G,0}^{(i)}(S_{i,0}).   (56)

Therefore,

e^ε = max_{i∈V, j≠z∈V} p_{G,t}^{(j)}(S_{i,0}) / p_{G,t}^{(z)}(S_{i,0}) = max_{i∈V, j≠z∈V} (A^t[j, i] + Σ_{k≠i∈V} A^t[j, k] α(I − (1 − α)Â_i)^{-1}[k, i]) / (A^t[z, i] + Σ_{k≠i∈V} A^t[z, k] α(I − (1 − α)Â_i)^{-1}[k, i]).   (57)

Regarding the prediction uncertainty, recall that for any j ∈ V and S_{z,0} ⊆ S,

p_{G,t}^{(j)}(S_{z,0}) = A^t[j, z] p_{G,0}^{(z)}(S_{z,0}) + Σ_{k≠z∈V} A^t[j, k] α(I − (1 − α)Â_z)^{-1}[k, z] p_{G,0}^{(z)}(S_{z,0}).   (58)

Then, for any i, z ∈ V, we have

p_G(I_0 ≠ {i} | S_{z,0}) / p_G(I_0 = {i} | S_{z,0})
= Σ_{j≠i∈V} p_{G,t}^{(j)}(S_{z,0}) / p_{G,t}^{(i)}(S_{z,0})
= (Σ_{j≠i∈V} [A^t[j, z] + Σ_{k≠z∈V} A^t[j, k] α(I − (1 − α)Â_z)^{-1}[k, z]]) / (A^t[i, z] + Σ_{k≠z∈V} A^t[i, k] α(I − (1 − α)Â_z)^{-1}[k, z]),   (59)

which completes the proof.
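As a quick numerical sanity check, the series-to-resolvent step used throughout the appendices ((33)-(34), and implicitly in (42)-(52) and (56)-(59)) holds for any right stochastic matrix. The sketch below uses a random stochastic matrix as a stand-in for Â_i:

```python
import numpy as np

# Check: alpha * sum_{m>=0} (1-alpha)^m A^m == alpha * (I - (1-alpha) A)^{-1}
rng = np.random.default_rng(0)
n, alpha = 5, 0.5
A = rng.random((n, n))
A /= A.sum(axis=1, keepdims=True)     # right stochastic: rows sum to 1

series = np.zeros((n, n))
term = np.eye(n)                      # the m = 0 term, (1-alpha)^0 A^0
for _ in range(200):                  # geometric convergence: (1-alpha)^200 is negligible
    series += term
    term = (1.0 - alpha) * (term @ A)

closed_form = np.linalg.inv(np.eye(n) - (1.0 - alpha) * A)
print(np.max(np.abs(alpha * series - alpha * closed_form)))
```

The printed maximum deviation is at the level of floating-point round-off, confirming that the truncated series and the matrix inverse agree.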
