
Local Silencing Rule for Randomized Gossip Algorithms
Ali Daher
McGill University
Montréal, Canada
Email: ali.daher@mail.mcgill.ca

Michael G. Rabbat
McGill University
Montréal, Canada
Email: michael.rabbat@mcgill.ca

Vincent K. N. Lau
HKUST
Clear Water Bay, Hong Kong
Email: eeknlau@ust.hk
Abstract—The problem of disseminating information in a network in a decentralized way has received a lot of attention in recent years. Gossip algorithms are among the most popular approaches to this problem. In existing randomized gossip algorithms, nodes continue gossiping even when their values are close to their neighbors' values, which induces redundant transmissions. In practice this behavior may be pathological, but it is difficult to specify a tight silencing condition without assuming knowledge of the initial node values. In this paper, we show that by locally specifying a silencing condition at each node, the algorithm requires fewer transmissions for certain initial conditions, and never uses more iterations than a gossip algorithm based on the classical approach.
I. INTRODUCTION
Gossip algorithms are simple, fully-decentralized protocols for in-network information processing and information dissemination. They have received a lot of attention in the signal processing, systems and control, information theory, and theoretical computer science communities during the past five years because they are simple to implement, robust to unreliable wireless network conditions and changing topologies, and have no bottlenecks or single points of failure.
In this paper, we focus on the average consensus problem, where each node initially has a measurement, and the goal is to compute the average of all these measurements at all nodes in the network. Although the average is an extremely simple function, previous work has shown that it can be used as a building block to support much more complex tasks, including source localization, compression, subspace tracking, and optimization ([1], [19], [20]). Randomized gossip [4] solves the average consensus problem in the following manner. Each node maintains and updates a local estimate of the average, which it initializes with its own measurement. Each node also runs an independent random (Poisson) clock. When the clock at a node i ticks, signaling the start of a new iteration, it contacts one of its neighbors (chosen randomly); they exchange estimates, and then update their estimates by fusing the previous estimate with the new information obtained from the neighbor.
Previous studies of randomized gossip for information processing have focused on studying scaling laws (how many messages are needed as the network size tends to infinity), and on developing efficient randomized gossip algorithms for typical models of wireless network topologies such as 2-d grids and random geometric graphs. Since each wireless transmission typically consumes a significant amount of battery, characterizing the number of transmissions used is an important goal, and this number of transmissions is proportional to the number of gossip iterations executed. Much previous work has focused on characterizing the ε-averaging time, which is the worst-case number of iterations the algorithm must be run to guarantee, with high probability, that the estimates of the average at all nodes are within ε of the true average, relative to the initial condition.¹ In particular, these studies are based on the worst-case initial condition. They suggest rough guidelines for the number of iterations to execute, but because the bounds are pessimistic by design, the number of iterations specified can be significantly larger than the actual number of iterations required to get an accurate estimate at all nodes. If one had an accurate model for typical initial conditions across the network, then a more careful analysis of the expected run-time could be carried out, and, through the use of large-deviation techniques, one could determine a more accurate bound on the number of iterations required. However, accurate models for measurements are often not available, especially when deploying wireless sensor networks for exploratory monitoring and surveying.
This paper describes implicit local silencing rules for randomized gossip algorithms with theoretical performance guarantees. Rather than fixing a total number of iterations to execute in advance, each node monitors its estimate and decides to become silent when the estimate has not changed significantly after a prescribed number of iterations. When a node decides to be silent, it no longer initiates gossip exchanges with neighbors when its clock ticks. We prove that the proposed scheme will stop almost surely after a finite number of iterations. We also show how the final error can be controlled by adjusting the parameters of our silencing rule. The final error guaranteed by our algorithm is also absolute, rather than being relative to the initialization.
II. BACKGROUND AND PROBLEM SETUP
Let the graph G = (V, E) denote the communication topology of a network with n = |V| nodes and edges (i, j) ∈ E ⊆ V² if and only if nodes i and j communicate directly. We assume that G is connected. We take V = {1, . . . , n} to index the nodes. Let x_i(0) ∈ R denote the initial value at node i ∈ V. In randomized gossip, nodes iteratively exchange information and update their estimates, x_i(t). Our goal is to estimate the average x̄ = (1/n) Σ_{i=1}^n x_i(0) at every node of the network; that is, we would like x_i(t) → x̄ for all i as t → ∞.

¹A precise definition is given in Section II.
Following [4], [24], we adopt an asynchronous update model where each node runs an independent Poisson clock that ticks at a rate of 1 per unit time. In this model, the probability that two clocks tick at precisely the same time instant is zero. Let t_k denote the time of the kth clock tick in the network, and let i(k) denote the index of the node at which this tick occurs. It is easy to show, using properties of Poisson processes, that the sequence of nodes i(1), i(2), . . . , i(k), . . . is independent and uniformly distributed over V, since all nodes' clocks tick at the same rate. Moreover, via simple probabilistic arguments [15], [16], one can show that each block of O(n log n) consecutive nodes in the sequence {i(k)}_{k=1}^∞ contains every node in V with high probability.
In the randomized gossip algorithm described in [4], when i(k)'s clock ticks at time t_k, it contacts a neighboring node, which we will denote by j(k), according to a pre-specified distribution P_{i,j} = Pr(i contacts j | i ticked). Then i(k) and j(k) update their values by setting

    x_{i(k)}(t_k) = x_{j(k)}(t_k) = (1/2)(x_{i(k)}(t_{k-1}) + x_{j(k)}(t_{k-1})),    (1)

and all nodes v ∈ V \ {i(k), j(k)} hold their estimates at x_v(t_k) = x_v(t_{k-1}). The probability P_{i,j} can only be positive if there is a connection (i, j) ∈ E between nodes i and j. Let A_i = {j : (i, j) ∈ E} denote the set of neighbors of i. Often, we use the natural random walk probabilities P_{i,j} = 1/|A_i| for the graph G.
We assume that i(k) and j(k) exchange information instantaneously at time t_k. As mentioned above, no two clocks tick simultaneously, so we can order the events sequentially: t_1 < t_2 < · · · < t_k < · · ·. To simplify notation, we write x_i(k) instead of x_i(t_k) in the sequel, and we refer to the operations taking place at time t_k as the kth iteration.

We note that this problem setup, with local clocks operating at a rate of 1 tick per unit time, is purely for the sake of analysis. In practice, one would tune the clock rate taking into consideration radio transmission rates, packet lengths, node transmission ranges, the average number of neighbors per node, and interference patterns, and the rates could be chosen sufficiently large so that no two gossip events interfere with high probability. Determining the appropriate rate is beyond the scope of this paper and is an interesting open problem.
Pseudo-code for simulating randomized gossip is shown in Algorithm 1. The silencing condition recommended in previous work is to fix a maximum number of iterations to execute based on the worst-case initial condition and size of the network. In particular, previous work has analyzed the ε-averaging time, T_ε(P), for gossip algorithms. Let x(t) ∈ R^n denote the estimates at each node at time t stacked into a vector, and
Algorithm 1 Randomized Gossip
1: Initialize: {x_i(0)}_{i∈V} and k = 1
2: repeat
3:   Draw i(k) uniformly from V
4:   Draw j(k) according to {P_{i,j}}_{j∈V}
5:   x_{i(k)}(k) ← (1/2)(x_{i(k)}(k-1) + x_{j(k)}(k-1))
6:   x_{j(k)}(k) ← (1/2)(x_{i(k)}(k-1) + x_{j(k)}(k-1))
7:   for all v ∈ V \ {i(k), j(k)} do
8:     x_v(k) = x_v(k-1)
9:   end for
10:  k ← k + 1
11: until some silencing condition is satisfied
let x̄ denote a vector with all entries equal to the average, x̄. Then the ε-averaging time for the algorithm defined by neighbor-selection probabilities P is defined as

    T_ε(P) = sup_{x(0)} inf{ t : Pr( ‖x(t) − x̄‖ / ‖x(0)‖ ≥ ε ) ≤ ε };    (2)

that is, T_ε(P) is the smallest time t for which the error ‖x(t) − x̄‖ is small relative to the initial condition x(0), with high probability, for the worst-case (and, thus, any) initial condition x(0). Note that the matrix of probabilities P captures the network topology G, since P_{i,j} > 0 only if (i, j) ∈ E, and so the averaging time depends strongly on the network topology.
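As a concrete illustration of the update (1) and of the convergence behavior a silencing condition must detect, the following minimal Python sketch simulates Algorithm 1 on a small ring network with the natural random walk probabilities P_{i,j} = 1/|A_i|. The topology, node count, and iteration budget are illustrative choices, not taken from the paper.

```python
import random

def randomized_gossip(x0, neighbors, num_iters, rng=None):
    """Simulate Algorithm 1: at each iteration a uniformly chosen node
    contacts a uniformly chosen neighbor and the pair average their values."""
    rng = rng or random.Random(0)
    x = list(x0)
    n = len(x)
    for _ in range(num_iters):
        i = rng.randrange(n)          # activated node i(k), uniform over V
        j = rng.choice(neighbors[i])  # contacted neighbor j(k), P_ij = 1/|A_i|
        avg = 0.5 * (x[i] + x[j])     # pairwise average, as in update (1)
        x[i] = x[j] = avg
    return x

# Ring of 8 nodes; pairwise averaging preserves the sum, so the
# estimates drift toward the average xbar of the initial values.
n = 8
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
x0 = [float(i) for i in range(n)]
xbar = sum(x0) / n
x = randomized_gossip(x0, neighbors, num_iters=2000)
err = max(abs(v - xbar) for v in x)   # worst-case node error after 2000 iterations
```

After enough iterations every node's estimate is close to x̄; the practical question addressed in this paper is how a node can detect this locally.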
The 2-dimensional random geometric graph [10], [18] is a typical model for connectivity in wireless networks: n nodes are placed in the unit square, and two nodes are connected if the distance between them is less than the connectivity radius r(n) = Θ(√(log(n)/n)). Gupta and Kumar [10] showed that this choice of connectivity radius guarantees the network is connected with high probability. It was shown in [4] that, for random geometric graphs, the ε-averaging time is

    T_ε(P) = Θ(n log ε⁻¹)    (3)

time units, regardless of whether P is the natural probabilities or is optimized with respect to the topology. Since each node ticks once per time unit, in expectation, this means that the network is silenced after Θ(n² log ε⁻¹) iterations. Each iteration involves two transmissions, so this result implies that the total number of transmissions required to gossip scales quadratically in the size of the network.
Motivated to achieve better scaling, previous work has focused on developing generalizations and variations on the randomized gossip algorithm described above (see [1], [2], [6], [7], [11]-[14], [17], [21]-[23], [25] and references therein). These algorithms have significantly improved the scaling laws, and existing state-of-the-art schemes require a total number of transmissions that scales linearly or nearly-linearly (e.g., as n polylog(n)) in the network size.

However, a very practical problem remains unsolved: how can nodes locally determine when their estimate is accurate enough to be silent? The analyses involving ε-averaging time are asymptotic and order-wise, and the constants in bounds such as (3) are generally unknown. This bound defines accuracy as ‖x(t) − x̄‖ ≤ ε‖x(0)‖, relative to the magnitude of the initial condition, ‖x(0)‖, and so one must also bound this magnitude to guarantee an error of the form ‖x(t) − x̄‖ ≤ ε. Moreover, the time T_ε(P) is based on the worst-case initial condition. In practice, this condition may be pathological, but it is difficult to specify a tighter time without assuming knowledge of the distribution of initial conditions, which is generally not available in practice.

In a practical implementation of randomized gossip, one would like to fix a desired level of accuracy ε > 0 in advance and have the algorithm run for as many iterations as are needed to ensure that ‖x(k) − x̄‖ ≤ ε with high probability. The next section proposes a modification of randomized gossip which incorporates a local silencing rule, allowing nodes to adaptively determine when to be silenced. Subsequent sections analyze this local silencing rule and provide theoretical guarantees.
III. ALGORITHM AND MAIN RESULTS
As nodes gossip, using the algorithm described in the previous section, their local estimates change over time. Previous results [4] show that gossip converges asymptotically, in the sense that the error ‖x(k) − x̄‖ vanishes as k → ∞. Intuitively, once x(k) is close to x̄, the changes to each node's estimate become small. In particular, each node should be able to examine the recent history of its iterations and determine when to become silent.
We propose a local silencing rule based on two parameters: a tolerance τ > 0 and a positive integer C. In addition to maintaining a local estimate, node i also maintains a count c_i(k), which is initialized to c_i(0) = 0. Each time a node gossips, it tests whether its local estimate has changed by more than τ in absolute value. If the change was less than or equal to τ then the count c_i(k) is incremented, and if the change was greater than τ then c_i(k) is reset to 0. Note that the test only occurs at nodes i(k) and j(k) for iteration k, and all other nodes hold their counts fixed.

After the absolute change in the estimate at node i has been less than τ for C consecutive gossip rounds, or equivalently, when c_i(k) ≥ C, this node ceases to initiate gossip rounds when its clock ticks. In order to avoid premature silencing, even if c_i(k) ≥ C, if node i is contacted by a neighbor then it will still gossip and test whether its value has changed. In this manner, even if the count has reached c_i(k_0) ≥ C at iteration k_0, if at a future iteration k_1 > k_0 node i gossips and its estimate changes by more than τ, then it will reset c_i(k_1) = 0 and resume actively gossiping. If all nodes reach counts c_i(k) ≥ C, then no node will initiate another round of gossip and we say the algorithm has been silenced. Pseudo-code for simulating randomized gossip with the proposed local silencing rule is given in Algorithm 2. Note that the test at line 8 only needs to be performed once when simulating the
Algorithm 2 Randomized Gossip with Local Silencing Rule
1: Initialize: {x_i(0)}_{i∈V}, c_i(0) = 0 for all i ∈ V, and k = 1
2: repeat
3:   Draw i(k) uniformly from V
4:   if c_{i(k)}(k-1) < C then
5:     Draw j(k) according to {P_{i,j}}_{j∈V}
6:     x_{i(k)}(k) ← (1/2)(x_{i(k)}(k-1) + x_{j(k)}(k-1))
7:     x_{j(k)}(k) ← (1/2)(x_{i(k)}(k-1) + x_{j(k)}(k-1))
8:     if |x_{i(k)}(k) − x_{i(k)}(k-1)| ≤ τ then
9:       c_{i(k)}(k) = c_{i(k)}(k-1) + 1
10:      c_{j(k)}(k) = c_{j(k)}(k-1) + 1
11:    else
12:      c_{i(k)}(k) = 0
13:      c_{j(k)}(k) = 0
14:    end if
15:    for all v ∈ V \ {i(k), j(k)} do
16:      x_v(k) = x_v(k-1)
17:      c_v(k) = c_v(k-1)
18:    end for
19:    k ← k + 1
20:  else
21:    for all v ∈ V do
22:      x_v(k) = x_v(k-1)
23:      c_v(k) = c_v(k-1)
24:    end for
25:  end if
26: until c_v(k) ≥ C for all v ∈ V
algorithm, since

    |x_{i(k)}(k) − x_{i(k)}(k-1)|                                        (4)
    = |(1/2)x_{i(k)}(k-1) + (1/2)x_{j(k)}(k-1) − x_{i(k)}(k-1)|          (5)
    = |(1/2)x_{i(k)}(k-1) − (1/2)x_{j(k)}(k-1)|                          (6)
    = |x_{j(k)}(k) − x_{j(k)}(k-1)|.                                     (7)

Of course, in a decentralized implementation of the proposed approach, each of the nodes i(k) and j(k) would perform the test in parallel.
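The rule can also be rendered in simulation form. The following is an illustrative Python sketch of Algorithm 2, not the authors' code: the ring topology and the values of tau and C are example choices, and note that a node with c_i ≥ C still participates when contacted, so its counter can be reset, as the rule requires.

```python
import random

def gossip_lsr(x0, neighbors, tau, C, max_iters=200_000, rng=None):
    """Sketch of Algorithm 2: a node stops initiating gossip once its
    estimate has changed by at most tau in C consecutive gossip rounds."""
    rng = rng or random.Random(1)
    n = len(x0)
    x = list(x0)
    c = [0] * n                       # consecutive small-change counters c_i(k)
    transmissions = 0
    for _ in range(max_iters):
        if all(cv >= C for cv in c):  # every node silenced: stop
            break
        i = rng.randrange(n)          # clock tick at node i(k)
        if c[i] >= C:                 # silenced nodes do not initiate...
            continue
        j = rng.choice(neighbors[i])  # ...but j may itself be a silenced node
        avg = 0.5 * (x[i] + x[j])
        change = abs(avg - x[i])      # equals |x_j(k) - x_j(k-1)| by (4)-(7)
        x[i] = x[j] = avg
        transmissions += 2            # one exchange = two transmissions
        if change <= tau:
            c[i] += 1
            c[j] += 1
        else:
            c[i] = 0                  # a large change reactivates both nodes
            c[j] = 0
    return x, transmissions

# Spike initialization on a 10-node ring: one node differs dramatically.
n = 10
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
x0 = [0.0] * n
x0[0] = 100.0
x, tx = gossip_lsr(x0, neighbors, tau=0.05, C=8)
spread = max(x) - min(x)   # estimates end up close together at silencing
```

In this sketch, far-away nodes silence immediately (their neighbors' values match) and reactivate only when the spike's energy reaches them, which is the transmission-saving behavior discussed in Section V.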
A number of questions immediately come to mind about the proposed silencing rule: Are we guaranteed that all nodes eventually stop gossiping? If they are all silenced, what is the error in their estimates? Our main theoretical results answer these questions, as summarized in Theorem 1 below. The final error depends on characteristics of the network topology, and so we first introduce some notation. For a graph G = (V, E) with n = |V| nodes, let A ∈ R^{n×n} denote the adjacency matrix; i.e., A_{i,j} = 1 if and only if the graph contains the edge (i, j) ∈ E. Also, let D denote a diagonal matrix whose ith element D_{i,i} = |A_i| is equal to the degree of node i. The graph Laplacian of G is the matrix L = D − A. Our bounds depend on the network topology through: (1) the second smallest eigenvalue of L, which we denote by λ₂; (2) the number of edges N = |E| in the network; and (3) the maximum degree, d_max = max_i D_{i,i}. We assume that nodes know the maximum degree in the network, d_max.
Theorem 1: Let ε > 0 be given. Assume that ‖x(0)‖ < ∞, and assume that P_{i,j} correspond to the natural random walk probabilities on G. After running randomized gossip (Algorithm 2) with silencing rule parameters

    C = ⌈d_max(log(d_max) + 2 log(n))⌉    (8)
    τ = √( λ₂ ε² / (8N(C − 1)²) ),        (9)

the following two statements hold.
a) All nodes eventually stop gossiping almost surely; i.e., with probability one, there exists a K such that c_i(k) ≥ C for all i ∈ V and all k ≥ K.
b) Let K = min{k : c_i(k) ≥ C for all i ∈ V} denote the iteration when all nodes stop gossiping. With probability at least 1 − 1/n, the final error is bounded by

    ‖x(K) − x̄‖ ≤ ε.    (10)
The proof of Theorem 1 is given in Section IV, but first, a few remarks are in order. Note the roles played by the two silencing rule parameters, τ and C. Recall that C is the number of consecutive times each node must pass the test |x_i(k) − x_i(k-1)| ≤ τ before silencing. We need C to be sufficiently large so that nodes are not silenced prematurely, and the choice of C above ensures, with high probability, that before silencing, each node has recently gossiped with all of its immediate neighbors and none of these updates resulted in a significant change to its estimate. This ultimately guarantees that the desired level of accuracy ε is achieved with high probability. In the simulation results presented in Section V we show that even taking C = ⌈d_max log d_max⌉ generally suffices to achieve the target accuracy. On the other hand, the parameter τ allows us to control the final level of accuracy, ε.

Another question of interest is: How long will it take until all nodes are silenced? We investigate this issue via simulation in Section V. Intuitively, because nodes only stop initiating gossip rounds when their values are already close enough to their neighbors', the rate of convergence of Algorithm 2 is essentially the same as that of randomized gossip without the local silencing rule (Algorithm 1). However, for certain initial conditions, using the local silencing rule can result in significant savings in terms of the number of transmissions by temporarily silencing certain nodes when they have nothing interesting to tell their neighbors. For example, consider an initial condition where all nodes have x_i(0) = 0 except one node that differs dramatically, e.g., x_1(0) = 1000. In this case, most nodes will have the same value as their neighbors initially, and so they will cease to gossip until the energy from node 1 propagates and reaches their region of the network. We revisit this point and illustrate it further in Section V.

Finally, note that there is an overhead associated with using a local silencing rule, in the following sense. Even if the network is initialized to a consensus (i.e., x(0) = x̄), a minimum number of gossip rounds must occur before the network is silenced. This is the price one must pay for using a decentralized silencing rule, and this price is precisely C, the number of rounds each node must participate in before it decides to become silent. In grids, d_max = Θ(1), and so C = Θ(log n). For random geometric graphs, d_max = Θ(log n) with high probability, and so C = Θ(log(n) log log(n)). In any case, this is no worse than the best known scaling laws for randomized gossip algorithms in wireless networks.
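For a sense of scale, the overhead C in (8) is easy to evaluate numerically; the degrees and network size below are illustrative, not drawn from the paper's simulations.

```python
import math

def silencing_rounds(d_max, n):
    """C from (8): consecutive small-change rounds required before a node
    silences. The ceiling makes C an integer, as the counter requires."""
    return math.ceil(d_max * (math.log(d_max) + 2 * math.log(n)))

# Grid-like topology (constant degree) vs. a random geometric graph
# (degree on the order of log n), both with n = 1000 nodes.
C_grid = silencing_rounds(d_max=4, n=1000)
C_rgg = silencing_rounds(d_max=round(math.log(1000)), n=1000)
```

With constant degree, C grows like log n, while for degree Θ(log n) it grows like log(n) log log(n), matching the scaling discussed above.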
IV. ANALYSIS
A. Guaranteed Silencing
We begin by proving part (a) of Theorem 1, which claims that all nodes eventually become silent. Consider the squared error ‖x(k) − x̄‖² after iteration k. Since two nodes average their values whenever they gossip, we are guaranteed that ‖x(k) − x̄‖² is non-increasing, and we can quantify the decrease at iteration k in terms of the values at nodes i(k) and j(k).

Lemma 1: After i(k) and j(k) gossip at iteration k,

    ‖x(k) − x̄‖² = ‖x(k-1) − x̄‖² − (1/2)(x_{i(k)}(k-1) − x_{j(k)}(k-1))².    (11)
From equations (6) and (7), we can also make the following interesting observations about the relationship between the values at nodes i(k) and j(k) immediately after they gossip.

Lemma 2: After i(k) and j(k) gossip at iteration k,
a) |x_{i(k)}(k) − x_{i(k)}(k-1)| > τ if and only if |x_{j(k)}(k) − x_{j(k)}(k-1)| > τ;
b) |x_{i(k)}(k) − x_{i(k)}(k-1)| > τ if and only if |x_{i(k)}(k-1) − x_{j(k)}(k-1)| > 2τ.
Let I{A} denote the indicator function of the event A. Since, by design, all nodes' clocks tick according to independent Poisson processes with identical rates, it follows that all nodes tick infinitely often, or lim sup_{k→∞} I{i(k) = v} = 1 for all v ∈ V. In particular, pathological sample paths (e.g., where one node ticks consecutively an infinite number of times, or where one node's clock does not tick for an unbounded period of time) occur with probability zero. Formally, since Pr(i(k) = v) = 1/n for all nodes v ∈ V, from the second Borel-Cantelli Lemma [8] it follows that v's clock ticks infinitely often with probability 1.
Suppose, for the sake of a contradiction, that claim (a) of Theorem 1 does not hold, and the network is never silenced. This implies that there exists a node v ∈ V such that lim sup_{k→∞} I{c_v(k) < C} = 1; i.e., v never reaches a state where it permanently stops initiating gossip iterations. According to steps 8-14 of Algorithm 2, one of two things happens each time v participates in a gossip round: either the absolute change in its estimate is small and it increments c_v(k), or the absolute change is larger than τ and it resets c_v(k) = 0. Since v participates in infinitely many gossip rounds and lim sup I{c_v(k) < C} = 1, it must be that v resets its counter infinitely often; i.e., lim sup I{c_v(k) = 0} = 1. Let k_1, k_2, . . . denote the iterations when v resets its counter. Each time v resets its counter, it gossiped and the change was greater than τ. By Lemma 2, this implies that each time v resets its counter, the absolute difference between x_v(k_l) and the value of the node it gossiped with is at least 2τ, and by Lemma 1, this implies that the squared error ‖x(k_l) − x̄‖² decreases by at least 2τ² at that iteration. By assumption, the initial condition has finite norm, ‖x(0)‖ < ∞, which implies that the initial squared error is also finite, ‖x(0) − x̄‖² < ∞. If the squared error decreases by at least 2τ² each time v resets its counter, and if it resets its counter infinitely often, then ‖x(k) − x̄‖² → −∞ as k → ∞. However, this is a contradiction, since ‖x(k) − x̄‖² ≥ 0 by definition. Hence it cannot happen that some node gossips infinitely often, and so all nodes eventually are silenced, which proves claim (a) of Theorem 1.
B. Error When Silenced
Next, we prove part (b) of Theorem 1, which bounds the error ‖x(K) − x̄‖ at the time K when all nodes become silent. Our proof of the error bound involves two main steps. First, we show that the choice of C = Θ(d_max log d_max) ensures that when all nodes are silenced, their estimates are relatively close to those of all of their immediate neighbors. Then we show that if all nodes' estimates are close to their neighbors', then they must be close to the average.

The first part of the proof is based on a standard result from the study of occupancy problems, and in particular the Coupon Collector's problem [15], [16]. In this problem, there are d different types of coupons. At each iteration, the coupon collector is given a new coupon drawn uniformly from a pool of coupons. The following is a standard tail bound for the number of iterations required to collect all types of coupons.

Lemma 3 (Coupon Collector [15], [16]): Let T be the number of iterations it takes the coupon collector to get one of each of the d types of coupons, and let β ≥ 1. Then

    Pr(T > βd log d) ≤ d^{1−β}.    (12)
In particular, this bound suggests that after T = Θ(d log d) iterations, the collector has one of each coupon with high probability. We apply this result to guarantee that a node has recently gossiped with each one of its neighbors, without seeing a significant change, before it becomes silent. In particular, for each node, we map its neighbors to coupons, and require that it collect one coupon from each neighbor (which it does only when gossiping with that neighbor results in an absolute change of less than τ) before silencing, with high probability. Consequently, when a node is silenced, with high probability, its estimate was recently close to those of all of its neighbors: if node i is silenced at iteration K_i, then min_{l=0,...,C−1} |x_i(K_i − l) − x_j(K_i − l)| ≤ 2τ for all neighbors j ∈ A_i. Unfortunately, this is not sufficient to guarantee that |x_i(K) − x_j(K)| ≤ 2τ for all pairs (i, j) ∈ E, since it could happen that after i and j gossip with each other for the last time, i still gossips with another neighbor. However, we can guarantee that these differences do not grow too large.
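The tail bound of Lemma 3 can be probed with a short Monte Carlo experiment; the values of d, β, and the trial count below are illustrative, and the slack in the final comparison accounts for sampling noise.

```python
import math
import random

def coupon_collect_time(d, rng):
    """Draw coupons uniformly from d types until every type has been seen;
    return the number of draws T."""
    seen, t = set(), 0
    while len(seen) < d:
        seen.add(rng.randrange(d))
        t += 1
    return t

# Estimate Pr(T > beta * d * log d) and compare it with the bound d^(1-beta).
rng = random.Random(2)
d, beta, trials = 20, 2.0, 4000
threshold = beta * d * math.log(d)
exceed = sum(coupon_collect_time(d, rng) > threshold
             for _ in range(trials)) / trials
bound = d ** (1 - beta)   # here d^(1-beta) = 1/20 = 0.05
```

The empirical exceedance frequency should land at or below the d^{1−β} bound, up to Monte Carlo fluctuation.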
Lemma 4: If C = d_max(log d_max + 2 log n), then at the time K = inf{k : c_i(k) ≥ C for all i ∈ V} when the network is silenced, with probability at least 1 − 1/n,

    |x_i(K) − x_j(K)| ≤ 2(C − 1)τ    (13)

for all pairs of neighboring nodes (i, j) ∈ E.

Proof: Let β ≥ 1 be a constant whose value is to be determined, and let B_i denote the event that node i was silenced before contacting all of its neighbors in the last C rounds. We associate with each node i a coupon collector trying to collect d_i = |A_i| coupons, so that B_i = {T_i > βd_i log d_i}. By Lemma 3,

    Pr(B_i) ≤ d_i^{1−β} ≤ d_max^{1−β}.    (14)

Applying the union bound, we can bound the probability that some node was silenced without having contacted all of its neighbors in the last C rounds by

    Pr(∪_{i∈V} B_i) ≤ Σ_{i∈V} Pr(B_i) ≤ n d_max^{1−β}.    (15)

Then, taking β = 1 + 2 log(n)/log(d_max), and accordingly setting

    C = βd_max log d_max = d_max(log d_max + 2 log n),    (16)

we have that, with probability at least 1 − 1/n, all nodes gossip with all of their neighbors in the iterations when c_i(k) goes from 1 to C. By Lemma 2, when i(k) and j(k) increment their counts, c_{i(k)}(k) and c_{j(k)}(k), we know that |x_{i(k)}(k-1) − x_{j(k)}(k-1)| ≤ 2τ. Moreover, immediately after they gossip, x_{i(k)}(k) = x_{j(k)}(k). Suppose that nodes i(k) and j(k) set c_{i(k)}(k) = 1 and c_{j(k)}(k) = 1 at iteration k. In the worst case, they each gossip C − 1 more times with different neighbors and their estimates change by τ each time, moving in opposite directions (e.g., x_{i(k)}(k) increasing and x_{j(k)}(k) decreasing). Then their final estimates have drifted apart by 2(C − 1)τ. Since this is true for every pair of neighboring nodes when they are silenced, we have proved the Lemma.
We restrict P_{i,j} to the natural random walk probabilities on G in order to apply the standard form of the Coupon Collector's problem, where all coupons have identical probability. The above result can be immediately generalized to other distributions P_{i,j} by applying variations of the weighted Coupon Collector's problem [3].

We have established that when the network is silenced, all nodes have estimates at most 2(C − 1)τ from their neighbors' with high probability. Next, we show that this implies all nodes are close to the average. Even though neighboring nodes have similar estimates, the differences between estimates can propagate across the network. We will quantify how much this error can propagate in terms of characteristics of the network topology.
Recall that A ∈ R^{n×n} denotes the adjacency matrix of G, that D is the diagonal matrix with D_{i,i} = |A_i| equal to the degree of node i, and that L = D − A is the graph Laplacian. For a vector x ∈ R^n, it is easy to verify that

    x^T L x = Σ_{i∈V} Σ_{j∈A_i} (x_i − x_j)².    (17)

The result of Lemma 4 can be applied to each term on the right-hand side of (17) to bound the magnitude of the Laplacian quadratic form. The following lemma relates the quadratic form on the left-hand side of (17) to the squared error ‖x − x̄‖².
Lemma 5: Let λ₁ ≤ λ₂ ≤ · · · ≤ λ_n denote the eigenvalues of L sorted in ascending order. Then

    (1/λ_n) x^T L x ≤ ‖x − x̄‖² ≤ (1/λ₂) x^T L x.    (18)
Proof: The proof follows from basic principles of linear algebra and spectral graph theory. Let {u_i ∈ R^n}_{i=1}^n denote the orthonormal eigenvectors of L, with u_i being the eigenvector corresponding to eigenvalue λ_i.

A well-known fact from spectral graph theory (see, e.g., [5], [9]) is that, for a connected graph G, the smallest Laplacian eigenvalue is λ₁ = 0, and the corresponding orthonormal eigenvector is u₁ = (1/√n)1, where 1 denotes the vector of all 1s. Expanding L in terms of its eigendecomposition, we see that

    x^T L x = x^T ( Σ_{i=1}^n λ_i u_i u_i^T ) x    (19)
            = Σ_{i=2}^n λ_i ⟨x, u_i⟩²,             (20)

where ⟨x, u⟩ = x^T u denotes the inner product between x and u.
Next, consider the squared distance ‖x − x̄‖² from x to its corresponding average consensus vector x̄. Recall that the average consensus vector x̄ can be written in terms of x as

    x̄ = (1/n) 1 1^T x = u₁ u₁^T x = ⟨x, u₁⟩ u₁.    (21)

Since the eigenvectors u_i form an orthonormal basis for R^n, we can expand x in terms of the u_i:

    x = Σ_{i=1}^n ⟨x, u_i⟩ u_i.    (22)

Then, it is clear that subtracting x̄ from x simply cancels out the portion of x spanned by u₁, leaving x − x̄ = Σ_{i=2}^n ⟨x, u_i⟩ u_i. Thus, the squared error is easily expressed in terms of the eigenbasis {u_i} as

    ‖x − x̄‖² = Σ_{i=2}^n ⟨x, u_i⟩².    (23)
Compare equations (20) and (23). To complete the proof, observe that because we have ordered the eigenvalues in ascending order, λ_i/λ_n ≤ 1 and λ_i/λ₂ ≥ 1 for all i = 2, . . . , n. Thus,

    Σ_{i=2}^n (λ_i/λ_n) ⟨x, u_i⟩² ≤ Σ_{i=2}^n ⟨x, u_i⟩² ≤ Σ_{i=2}^n (λ_i/λ₂) ⟨x, u_i⟩²,    (24)

which is what we wanted to show.
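The sandwich (18) can be checked numerically without an eigensolver by using a cycle graph, whose Laplacian eigenvalues are known in closed form: λ_k = 2 − 2cos(2πk/n). The sketch below, with an arbitrary test vector of our choosing, accumulates the quadratic form x^T L x over the cycle's edges and verifies both inequalities.

```python
import math

# Cycle graph C_n: Laplacian eigenvalues are 2 - 2*cos(2*pi*k/n), k = 0..n-1.
n = 9
lams = sorted(2 - 2 * math.cos(2 * math.pi * k / n) for k in range(n))
lam2, lamn = lams[1], lams[-1]    # second-smallest and largest eigenvalues

x = [math.sin(1.7 * i) + 0.3 * i for i in range(n)]   # arbitrary test vector
xbar = sum(x) / n
sq_err = sum((v - xbar) ** 2 for v in x)              # ||x - xbar||^2

# Quadratic form x^T L x accumulated over the cycle's edges (i, i+1 mod n).
xLx = sum((x[i] - x[(i + 1) % n]) ** 2 for i in range(n))

lower = xLx / lamn   # (1/lambda_n) x^T L x
upper = xLx / lam2   # (1/lambda_2) x^T L x
```

Both bounds should bracket the squared error for any test vector, with the lower bound tight only when x concentrates on the top eigenvector.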
Now, to complete the proof of Theorem 1(b), we just need to put the various pieces together. Recall that N = |E| denotes
Fig. 1. Distribution histogram of the edge differences |x_i(K) − x_j(K)| for a 0/100 initial condition in a 200-node network with C = d_max log(d_max). (Axes: edge difference at convergence vs. number of edges; the predefined local error is marked.)
the number of edges in G, and note that the sum on the right-hand side of (17) contains two terms for each edge (once from i to j, and once from j to i). Combining Lemma 4 and Lemma 5 gives the error bound

    ‖x(K) − x̄‖² ≤ (1/λ₂) Σ_{i∈V} Σ_{j∈A_i} (x_i(K) − x_j(K))²    (25)
                ≤ 8N(C − 1)²τ² / λ₂,                              (26)

which holds with probability at least 1 − 1/n. Plugging in the expression for τ from the statement of Theorem 1 yields the desired bound, and thus completes the proof.
V. SIMULATION RESULTS
In this section we present simulation results for gossip with the local silencing rule under different network initializations and network sizes. Important criteria for judging the effectiveness of any gossip algorithm are the number of transmissions required for convergence as well as the relative error ‖x(k) − x̄‖ / ‖x(0) − x̄‖ when silenced. Unless otherwise specified, we use a random geometric graph with 200 nodes.
Figure 1 shows the histogram of the edge differences at convergence, |x_i(K) − x_j(K)|, for an initial condition where half the nodes, located in the same region, have x_i(0) = 0 and the other half differ dramatically, e.g., x_i(0) = 100 (we refer to this setting as the 0/100 initialization). We can conclude that, with very high probability, when C = d_max log(d_max) all the edge differences are below τ. For the realization of the graph depicted in Figure 1, the percentage of links having a difference above τ is equal to 0.15%.
Repeating the same simulation for a Spike initialization, where all nodes have x_i(0) = 0 except one node that differs dramatically, e.g., x_1(0) = 1000, all the edge differences at convergence lie below the threshold τ = 0.1. This
Fig. 2. Number of transmissions versus τ for different node initializations (0/100, GB, IID, Slope, Spike). Each point on this graph corresponds to the average number of transmissions until silencing for C = d_max log d_max and for values of τ ranging from 0.01 to 0.5.
shows that the local silencing rule achieves highly accurate
convergence for the case of Spike initialization
Decreasing C forces the gossip to be silenced prematurely
before achieving convergence, by simulating the same settings
as in Figure 1 with C= log(d
max
) we obtain that the percent-
age of links having a difference above is equal to 3.24%.
We next examine the number of transmissions to convergence with respect to ε for several network sizes and initializations. We evaluate performance for a Slope (linearly-varying) field, a field with the Spike signal, the 0/100 initialization, Gaussian Bumps, and an i.i.d. node initialization with mean 0 and variance 1. As can be seen from Figure 2, the local silencing rule reduces the number of transmissions for all types of initial condition. The reduction achieved by increasing ε is strikingly higher for the i.i.d. and Spike initializations, while it is less pronounced for the 0/100 and Slope initializations. Roughly speaking, the nearly constant differences between neighbors cause GossipLSR to provide minimal gain for the Slope initialization.
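One plausible way to reproduce such comparisons is sketched below. The counter-based silencing condition here is an illustrative reading of a local silencing rule (a node goes passive after observing all neighbor differences below ε on C consecutive wake-ups, and reactivates if a large difference reappears); it is not claimed to be the paper's exact specification, and the ring topology, Spike values, and seeds are arbitrary.

```python
import math
import random

def local_silencing_gossip(adj, x0, eps, C, max_iters=200000, seed=2):
    """Randomized gossip with a hypothetical local silencing counter.

    count[i] tracks consecutive wake-ups of node i at which every neighbor
    difference was below eps; once count[i] >= C the node wakes up but stays
    passive (no transmission).  A difference >= eps resets the counter, so a
    passive node can reactivate if the field around it changes.
    """
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    count = [0] * n
    transmissions = 0
    for _ in range(max_iters):
        i = rng.randrange(n)
        if not adj[i]:
            continue
        if all(abs(x[i] - x[j]) < eps for j in adj[i]):
            count[i] += 1
        else:
            count[i] = 0
        if all(c >= C for c in count):
            break  # every node satisfies its local rule: network silenced
        if count[i] >= C:
            continue  # node i is passive: it wakes up but does not transmit
        j = rng.choice(adj[i])
        x[i] = x[j] = (x[i] + x[j]) / 2.0
        transmissions += 1
    return x, transmissions

n = 20
adj = [[(k - 1) % n, (k + 1) % n] for k in range(n)]  # ring, so d_max = 2
x0 = [1000.0] + [0.0] * (n - 1)                       # Spike initialization
C = max(1, round(2 * math.log(2)))                    # C = d_max log(d_max)
x, tx = local_silencing_gossip(adj, x0, eps=0.1, C=C)
print(tx)
```

Because passive wake-ups are counted by the loop but not by `transmissions`, the gap between the loop index at the break and `tx` separates iterations from actual transmissions.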
In Figure 3 we observe the number of transmissions to convergence with respect to ε for different network sizes; we evaluate the average number of transmissions for each value of ε ranging from 0.01 to 0.5 at intervals of 0.01. Figure 3 clarifies how the number of transmissions is affected by network size: larger networks achieve a smaller reduction in the number of transmissions for the same silencing parameter ε.
In Figure 4 we plot the performance of the local silencing rule for three different values of ε as a function of the number of transmissions. As can be seen, the number of transmissions for ε = 0 matches that of the randomized gossip algorithm.
Fig. 3. Number of transmissions with respect to ε for different network sizes (N = 50, 200, 400). Each point corresponds to the average number of transmissions for a given value of ε, where C = d_max log(d_max).
Fig. 4. Relative error vs. number of transmissions for different values of ε, where C = d_max log(d_max), in a 200-node network deployed according to an RGG topology. The curve for ε = 0 corresponds to randomized gossip without GossipLSR.
Observe also the reduction achieved by the local silencing rule in terms of the number of transmissions; recall that as ε decreases, the local condition becomes tighter, which implies a larger number of transmissions and a smaller global relative error.
As mentioned previously, the local silencing rule reduces the number of transmissions. Once convergence is reached, there is a period, which we define as the latency, during which nodes wake up randomly but do not transmit (since they are passive and their values satisfy the silencing rule). During this time, the relative error changes very slowly. Figure 5 shows the number of iterations required by the local silencing rule to reach convergence.
Fig. 5. Number of iterations corresponding to different values of ε, where C = d_max log(d_max), in a 200-node network deployed according to an RGG topology with a Spike initial condition. The curve for ε = 0 corresponds to randomized gossip without GossipLSR.
Comparing Figures 4 and 5, we observe that the number of iterations is higher than the number of transmissions, which illustrates the latency effect of GossipLSR.
VI. CONCLUSION AND FUTURE WORK
One of the major challenges for next-generation wireless systems is to be as resource-efficient as possible. Battery power in wireless sensor networks can be conserved by reducing the number of wireless transmissions, for most network topologies. This paper presents and investigates a modified model for information gossiping in networks: the local silencing rule. The main advantage of this approach is that it yields an accurate and simple gossip algorithm while simultaneously reducing the number of transmissions. We proved convergence theoretically and analyzed performance in terms of convergence speed and number of transmissions for different network sizes, topologies, and initializations.
The work in this paper paves the way for several new directions. Future work includes a theoretical analysis of the number of transmissions saved by the local silencing rule compared to classical randomized gossip. We are also keen to extend the idea of gossiping with a local silencing rule to gossip models beyond randomized gossip, such as geographic gossip [7], GGE [25], and multi-scale gossip [23]; these will be interesting and challenging problems for future work.
REFERENCES
[1] T. Aysal, E. Yildiz, A. Sarwate, and A. Scaglione. Broadcast gossip algorithms for consensus. IEEE Trans. Signal Processing, 57(7):2748–2761, Jul. 2009.
[2] F. Benezit, A. Dimakis, P. Thiran, and M. Vetterli. Gossip along the way:
Order-optimal consensus through randomized path averaging. In Proc.
Allerton Conf. on Comm., Control, and Comp., Urbana-Champaign, IL,
Sep. 2007.
[3] P. Berenbrink and T. Sauerwald. The weighted coupon collector's problem and applications. In Proc. COCOON, Niagara Falls, NY, Jul. 2009.
[4] S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah. Randomized gossip algorithms. IEEE Trans. Inf. Theory, 52(6):2508–2530, Jun. 2006.
[5] F. Chung. Spectral Graph Theory. American Math. Society, 1997.
[6] A. Dimakis, S. Kar, J. Moura, M. Rabbat, and A. Scaglione. Gossip
algorithms for distributed signal processing. to appear, Proceedings of
the IEEE, Jan. 2011.
[7] A. Dimakis, A. Sarwate, and M. Wainwright. Geographic gossip: Efficient averaging for sensor networks. IEEE Trans. Signal Processing, 56(3):1205–1216, Mar. 2008.
[8] R. Durrett. Probability: Theory and Examples. Cambridge Univ. Press,
4th edition, 2010.
[9] C. Godsil and G. Royle. Algebraic Graph Theory. Springer-Verlag,
2001.
[10] P. Gupta and P. R. Kumar. Critical power for asymptotic connectivity.
In Proc. IEEE Conf. on Decision and Control, Tampa, FL, Dec. 1998.
[11] B. Johansson and M. Johansson. Faster linear iterations for distributed
averaging. In Proc. IFAC World Congress, Seoul, South Korea, Jul.
2008.
[12] K. Jung, D. Shah, and J. Shin. Fast gossip through lifted Markov
chains. In Proc. Allerton Conf. on Comm., Control, and Comp., Urbana-
Champaign, IL, Sep. 2007.
[13] E. Kokiopoulou and P. Frossard. Polynomial filtering for fast convergence in distributed consensus. IEEE Trans. Signal Processing, 57(1):342–354, Jan. 2009.
[14] W. Li and H. Dai. Location-aided distributed averaging algorithms.
In Proc. Allerton Conf. on Comm., Control, and Comp., Urbana-
Champaign, IL, Sep. 2007.
[15] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge Univ. Press, 2005.
[16] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge
Univ. Press, 1995.
[17] B. Oreshkin, M. Coates, and M. Rabbat. Optimization and analysis of distributed averaging with short node memory. IEEE Trans. Signal Processing, 58(5):2850–2865, May 2010.
[18] M. Penrose. Random Geometric Graphs. Oxford University Press, 2003.
[19] M. Rabbat, J. Haupt, A. Singh, and R. Nowak. Decentralized compres-
sion and predistribution via randomized gossiping. In Proc. ACM/IEEE
Conf. on Information Processing in Sensor Networks, Nashville, TN,
April 2006.
[20] M. Rabbat, R. Nowak, and J. Bucklew. Robust decentralized source
localization via averaging. In Proc. IEEE ICASSP, Philadelphia, PA,
March 2005.
[21] V. Saligrama, M. Alanyali, and O. Savas. Distributed detection in sensor networks with packet loss and finite capacity links. IEEE Trans. Signal Processing, 54(11):4118–4132, Nov. 2006.
[22] O. Savas, M. Alanyali, and V. Saligrama. Efficient in-network processing through information coalescence. In Proc. Dist. Comp. in Sensor Sys., San Francisco, Jun. 2006.
[23] K. Tsianos and M. Rabbat. Fast decentralized averaging via multi-
scale gossip. In Proc. IEEE Conf. on Distributed Computing in Sensor
Systems, Santa Barbara, CA, Jun. 2010.
[24] J. Tsitsiklis, D. Bertsekas, and M. Athans. Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Trans. Automatic Control, AC-31(9):803–812, Sep. 1986.
[25] D. Üstebay, B. Oreshkin, M. Coates, and M. Rabbat. Greedy gossip with eavesdropping. IEEE Trans. Signal Processing, 58(7):3765–3776, Jul. 2010.