
Hindawi Publishing Corporation

International Journal of Distributed Sensor Networks


Volume 2016, Article ID 4156358, 17 pages
http://dx.doi.org/10.1155/2016/4156358

Research Article
Data Gathering Techniques for Wireless Sensor Networks:
A Comparison

Giuseppe Campobello, Antonino Segreto, and Salvatore Serrano


Department of Engineering, University of Messina, Contrada Di Dio, Sant’Agata, 98166 Messina, Italy

Correspondence should be addressed to Giuseppe Campobello; gcampobello@unime.it

Received 2 November 2015; Revised 3 February 2016; Accepted 4 February 2016

Academic Editor: Haiping Huang

Copyright © 2016 Giuseppe Campobello et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.

We study the problem of data gathering in wireless sensor networks and compare several approaches belonging to different research fields; in particular, signal processing, compressive sensing, information theory, and networking related data gathering techniques are investigated. Specifically, we derive a simple analytical model able to predict the energy efficiency and reliability of different data gathering techniques. Moreover, we carry out simulations to validate our model and to compare the effectiveness of the above schemes by systematically sampling the parameter space (i.e., number of nodes, transmission range, and sparsity). Our simulation and analytical results show that there is no single best data gathering technique for all possible applications and that the trade-off between energy consumption and reliability should drive the choice of the data gathering technique to be used. In this context, our model can be a useful tool.

1. Introduction

Wireless sensor networks (WSNs) are composed of a large number of tiny, low-power, and cheap wireless sensors deployed in a geographic area to perform distributed tasks, for example, to monitor a physical phenomenon [1]. In 2004, MIT Technology Review ranked WSNs as the number one emerging technology [2], and today they are effectively employed in many applications, such as surveillance (e.g., real-time area audio or video surveillance), security (e.g., detection of biological agents or toxic chemicals), habitat monitoring (e.g., environmental measurement of temperature, pressure, or mechanical vibration), home automation, military systems, and, in general, scientific experiments.

In a typical WSN topology, we can distinguish between ordinary wireless sensor nodes and base stations named sinks. The sink is usually connected to a power supply and is capable of performing more complex operations than the ordinary nodes. Ordinary wireless sensor nodes, which are capable of transferring processed or raw sensed data to the sink, are instead, for economic reasons, usually powered by small batteries that in most application scenarios are difficult or even impossible to replace or recharge. So, in contrast to many other wireless devices (e.g., cellular phones, PDAs, and laptops), the energy supplied to a wireless sensor node is usually not expected to be renewed during the life of the WSN. For this reason, each sensor node is required to work under very low power consumption conditions.

In general, to design a highly energy-efficient WSN, it is extremely important to take into account capture, transmission, and routing issues, that is, the data gathering techniques that specify how ordinary sensors gather information and deliver it to the sink. As a consequence, data gathering is the main and most critical function provided by a WSN.

The main aim of this paper is to compare some state-of-the-art data gathering techniques, considering their trade-off between reliability (i.e., packet loss and reconstruction error) and energy consumption (i.e., network lifetime) and taking into account both compression and networking aspects. To the best of our knowledge, this is the first paper that considers such a comparison for data gathering techniques belonging to different research fields (i.e., signal processing, compressive sensing, information theory, and networking related techniques are discussed and compared in this paper).

Specifically, we derive a simple analytical model able to predict the energy efficiency and reliability of several data gathering techniques. Moreover, we carry out simulations to validate our model and to compare the effectiveness of the above schemes by systematically sampling the parameter space (i.e., number of nodes, transmission range, and sparsity).

The rest of the paper is organized as follows. In Section 2, we present a summary of related works. In Section 3, further details about existing data gathering techniques are provided, highlighting their advantages and drawbacks. In Section 4, the simulation scenario used for comparisons is detailed and an analytical model able to predict the energy efficiency and reliability of different data gathering techniques is derived. In Section 5, the metrics used to compare data gathering techniques are introduced. In Section 6, simulation results are provided and the developed analytical model is validated. Finally, in Section 7, some concluding remarks and future works are drawn.

For the sake of clarity, the symbols and notations used throughout the paper are reported in the Notations section.

2. Related Works

In the past few years, several data gathering techniques have been proposed for WSNs with the main aim of reducing energy consumption by exploiting correlations among sensory data. We can distinguish two broad categories: compression-oriented and networking-oriented. The first category, named compression-oriented, is focused on maximizing network lifetime by taking advantage of data compression techniques [3–10]. In particular, [3, 4] analyze different lossless compression schemes for WSNs exploiting the temporal correlation in the sampled signals; in [5, 6], the authors exploit spatial correlation by using distributed source coding techniques based on the Slepian-Wolf theorem; finally, [7–10] investigate the fundamental limits of data gathering techniques based on the new paradigm of compressive sensing [11, 12]. A comprehensive review of existing data compression approaches in WSNs is provided in [13].

Further details about compression-oriented data gathering techniques will be provided in Section 3. In particular, signal processing, compressive sensing, and information theory related techniques are discussed, respectively, in Sections 3.1, 3.2, and 3.3.

Since radio transmission is the primary source of power consumption in WSNs, a second category of data gathering techniques, named networking-oriented, has dealt with the problem of maximizing network lifetime by taking into account network protocols and, more specifically, forwarding/routing mechanisms [14–17].

In particular, in [14], the authors show how it is possible to maximize the lifetime of a WSN by exploiting routing algorithms. In [15], the authors study the problem of forest construction for maximizing the network lifetime and adopt a simple data aggregation model where an intermediate sensor can aggregate multiple incoming messages into a single outgoing message. A different approach is proposed in [16], where a smart splitting technique is used in order to achieve different trade-offs between reliability and energy saving. Finally, in [17], the authors address the problem of maximizing network lifetime by also taking latency and reliability into account.

However, with the exception of [15], the above papers do not perform any kind of data aggregation, with the aim of not introducing extra delay.

As shown in [18–21], by combining data aggregation and routing mechanisms, efficient data gathering schemes can be obtained. In particular, in [18], the problem of jointly optimizing routing and data aggregation is investigated; in [19], the authors combine data compression and multipath routing techniques to obtain a reliable and low-latency data aggregation scheme; in [20], an energy-balanced data gathering and aggregating scheme is proposed which integrates a clustering hierarchical structure with compressive sensing to optimize and balance the amount of data transmitted; finally, in [21] a data gathering technique based on the network coding paradigm is proposed. Further details about the above works will be provided in Section 3.4.

Several other papers compare signal processing techniques in WSNs from energy efficiency and network lifetime perspectives, and several works highlight the effect of using different routing mechanisms for data gathering. Nevertheless, techniques belonging to different research fields such as compressive sensing, information theory, and networking are seldom evaluated against one another, and this is the main goal of this paper.

Only recently have such comparisons started, for instance, in [22], where lossy data aggregation techniques are evaluated and compared in terms of reconstruction errors and energy consumption. However, the authors do not consider the impact of network reliability (i.e., packet loss) nor compare networking-based data gathering techniques such as those based on the network coding paradigm.

The aim of this paper is to fill this gap by investigating the effectiveness of all the above data gathering techniques also in terms of reliability. In particular, an analytical model able to predict the energy efficiency and reliability of different data gathering techniques is derived.

3. Data Gathering Techniques

We can classify data gathering schemes on the basis of the research field from which the technique used to exploit correlation among sensor nodes is drawn, that is,

(i) signal processing;
(ii) compressive sensing;
(iii) information theory;
(iv) networking.

Techniques belonging to the above fields are discussed in the next subsections by highlighting their advantages and drawbacks.

3.1. Signal Processing Techniques. Frequently, high correlations (spatial and/or temporal) among sensor readings exist. In this case, it is inefficient to deliver the entire raw data to the destination [8, 9], and signal processing, in particular Transforms and Encoding Compression (TEC) techniques, can be exploited in order to reduce the amount of data to send.

In the case of local TEC techniques, a node collects measurements following the Shannon-Nyquist sampling theorem; these measurements are transformed and properly encoded, and the output of such a transformation is stored in the payload of one or more packets and sent to the sink. In particular, either lossy or lossless techniques can be used depending on the particular application scenario.

With lossy techniques [3], the original data are compressed discarding some of the original information; this allows achieving higher compression ratios, but at the receiver side the data can only be reconstructed with a certain accuracy.

However, in some types of monitoring, the accuracy of the observations is critical for understanding the underlying physical processes. In other cases, it is not possible to have a priori knowledge of the magnitude of the observational errors that can be tolerated without affecting correct data gathering. Moreover, some application domains (e.g., body area networks (BANs), in which sensor nodes permanently monitor and log vital signs) demand sensors with high accuracy and cannot tolerate measurements corrupted by lossy compression processes.

In all these kinds of WSNs, lossless data gathering is essential and desirable. Examples of local lossless compression schemes have been proposed in [4, 23, 24].

Lossy compression techniques have been evaluated and compared in terms of reconstruction errors and energy consumption in [22]; therefore, in this paper we concentrate our attention on lossless techniques.

For the sake of space, we do not consider distributed TEC techniques in this paper and refer the reader to [13]. Nevertheless, the major distributed signal processing approaches applied to WSNs will be discussed in the next subsections.

3.2. Compressive Sensing. Compressive sensing (CS) is a new paradigm introduced by Candes and Tao [11] and Donoho [12] used to capture and compress signals in WSNs, where compression and sampling are merged and carried out at the same time. Basically, CS compresses a signal while acquiring data at its information rate (without relying on the Shannon-Nyquist sampling theorem). CS theory states that if a signal is sparse or compressible in a certain basis, then it can be reconstructed from a small number of linear measurements by solving an 𝑙1-based convex optimization problem [7].

More precisely, let us define 𝑘-sparse signals 𝑥 = (𝑥1, . . . , 𝑥𝑛)ᵀ as signals that can be expressed as 𝑥 = Ψ𝛼, where Ψ is an orthonormal transform and 𝛼 is a vector with at most 𝑘 ≪ 𝑛 nonzero entries; CS theory states that 𝑥 can be recovered from 𝑚 = 𝑂(𝑘 log(𝑛/𝑘)) linear combinations of measurements obtained as 𝑦 = Φ𝑥, where Φ is an 𝑚 × 𝑛 matrix.

Note that, since 𝑘 is small in comparison to 𝑛, 𝑚 can be much smaller than 𝑛 and therefore high compression ratios can be achieved using CS (i.e., by transmitting the CS measurements 𝑦 instead of the raw data 𝑥).

Reconstruction is achieved by solving an optimization problem of the following form:

\[
\arg\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad y = A\alpha, \tag{1}
\]

where 𝐴 = ΦΨ. Once the above problem is solved, 𝑥 can be recovered as 𝑥 = Ψ𝛼. Alternatively, when noisy measurements are considered, the following optimization problem can be used:

\[
\arg\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad \|y - A\alpha\|_2^2 \leq \epsilon, \tag{2}
\]

where 𝜖 bounds the noise.

Several algorithms exist which are able to solve the above optimization problem (Basis Pursuit [25], OMP [26], and CoSaMP [27], to name just a few), and several theoretical results describe when these algorithms recover sparse solutions. In particular:

(i) As proved in [28], a signal 𝑥 can be recovered with high probability if Φ satisfies the Restricted Isometry Property (RIP). Formally, a matrix Φ satisfies the RIP if for all 𝑘-sparse signals 𝑥 there exists a 𝛿𝑘 ∈ (0, 1) such that

\[
(1 - \delta_k)\|x\|_2^2 \leq \|\Phi x\|_2^2 \leq (1 + \delta_k)\|x\|_2^2. \tag{3}
\]

Examples of matrices that satisfy the RIP condition are the ±1 Bernoulli matrix and the Gaussian matrix with 𝜙𝑖,𝑗 ∼ 𝑁(0, 𝜎Φ²) and 𝜎Φ² = 1/𝑚.

(ii) As proved in [29], in the noise-free case, exact recovery with a Gaussian matrix can be obtained if

\[
m = m^{*} = 2k \log\left(\frac{n}{k}\right) + \frac{5}{4}k + 1. \tag{4}
\]

We will exploit the above results to derive a reliability model for CS.

CS can be applied in cluster-based WSNs by considering that each sensor node in a cluster sends its reading 𝑥𝑖 to the cluster head, which multiplies the received readings by random coefficients 𝜙𝑖,𝑗, generating 𝑚 weighted sums 𝑦𝑖 = ∑𝑗 𝜙𝑖,𝑗𝑥𝑗 with 𝑖 ∈ [1, . . . , 𝑚]. Next, the values 𝑦𝑖, named CS measurements, are sent to the sink through one or multiple packets. On the basis of CS theory, under the sparsity condition, by collecting a sufficient number of CS measurements the sink will be able to reconstruct the original sensor data 𝑥𝑖.

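To make the encoding and recovery steps concrete, the following minimal Python sketch mimics a cluster head forming Gaussian CS measurements and a sink running a greedy decoder. It is not the authors' implementation: the signal is assumed sparse directly in the canonical basis (Ψ = I), the measurement count follows (4), and Orthogonal Matching Pursuit stands in for the 𝑙1 solvers cited above.

```python
# Minimal CS sketch: Gaussian sensing at the cluster head, OMP at the sink.
import numpy as np

def num_measurements(n, k):
    # m* = 2k log(n/k) + (5/4)k + 1, the noise-free bound of (4)
    return int(np.ceil(2 * k * np.log(n / k) + 1.25 * k + 1))

def cs_encode(x, m, rng):
    # Gaussian sensing matrix with variance 1/m, as suggested for the RIP
    phi = rng.normal(0.0, np.sqrt(1.0 / m), size=(m, x.size))
    return phi, phi @ x                      # y_i = sum_j phi_ij * x_j

def omp_decode(phi, y, k):
    # Greedy surrogate for problem (1); adequate for a sketch
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(phi.T @ residual)))
        support.append(j)
        coef, *_ = np.linalg.lstsq(phi[:, support], y, rcond=None)
        residual = y - phi[:, support] @ coef
    x_hat = np.zeros(phi.shape[1])
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
n, k = 128, 5
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.normal(0, 10, k)   # k-sparse readings
m = num_measurements(n, k)
phi, y = cs_encode(x, m, rng)
print(m, np.max(np.abs(omp_decode(phi, y, k) - x)))          # recovery error (typically ~0)
```
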

CS can also be applied in tree-based WSNs by considering that each source node includes in its packet the sensing information, which is the product of its acquired value and a random coefficient, and then sends it to its next-hop node [30]. In this way, CS compression can be performed with low complexity at the source nodes [9] and the data traffic over the network is reduced [8]. However, in the latter case a high number of hops is needed, that is, ℎ = 𝑂(𝑛/𝑘), which may lead to high network latency.

In [7], comparisons of CS-based and conventional signal processing techniques for WSNs have been carried out in terms of energy efficiency and network lifetime.

However, there are several challenges that must be addressed in order to use CS:

(i) The decoding time for reconstruction can be 𝑂(𝑛³) and therefore prohibitively expensive for large networks. Less expensive algorithms exist (e.g., matching pursuit), but they provide less stable recovery and weaker error bounds on the recovered solution.

(ii) CS assumes that the sensed data have a known constant sparsity, ignoring the fact that the sparsity of real signals varies in the temporal and spatial domains. In particular, the sparsifying basis Ψ is assumed to be given and fixed over time, but this is not the case in a realistic WSN scenario, where the signal of interest is unknown and its statistical characteristics can vary over time.

(iii) CS-based techniques introduce non-negligible losses (recovery errors), reducing reliability, and work best for large-scale networks (at least a thousand nodes).

(iv) Quantization effects: CS theory has mostly focused on real-valued measurements, but in practice measurements must be represented with a finite number of bits. As a consequence, a trade-off exists between the number of measurements 𝑚 and the number of bits per measurement 𝑏CS.

In this paper, we concentrate on the last two problems by analyzing the effect of sparsity and quantization on energy saving and reliability. Further details on CS and how to exploit it for WSNs will be given in the next sections.

3.3. Information Theory Related Techniques. In order to exploit the correlation of data concurrently acquired by different sensors, distributed source coding (DSC) techniques, inspired by the Slepian-Wolf theorem, can be applied [2]. DSC techniques imply that each sensor node sends its compressed output to the sink for joint decoding. This means that the nodes need to cooperate in groups of two or three so that one node provides the side information and another one can compress its information down to the Slepian-Wolf or the Wyner-Ziv limit. Furthermore, DSC approaches are also difficult to apply in such scenarios since they work under the assumption that the statistical characteristics of the underlying data distribution are known in advance [9, 31].

The most practical and well-known implementation of DSC is DISCUS [5, 32], where sensor nodes are divided into clusters. For each cluster, one node (the cluster head) sends uncompressed data (as side information), while all the other nodes transmit encoded (i.e., compressed) data. To encode data, a sensor node first divides all possible values into disjoint sets (named bins) so that values in the same bin have a minimum distance 𝑑. Each piece of sensory data is then compressed with a code that identifies the unique bin where the sampled value lies.

To better explain how DISCUS works, let us consider a simple example. Let us suppose that (quantized) measurements are in the integer range [0, 7] and that all data sensed by different sensors at (almost) the same time differ by at most ±1. Without DSC compression, three bits are needed by each sensor to represent its sensory data. Instead, in the case of DSC, only the cluster head sends a three-bit value. The other sensor nodes can split the possible values into four bins {0, 4}; {1, 5}; {2, 6}; {3, 7}, so that values in the same bin have a minimum distance 𝑑 = 4, and encode them, respectively, with {00}, {01}, {10}, {11}. So if the sink receives 01 from a sensor node, it knows that only two values are possible, that is, {1, 5}. Now let us suppose that the sink also receives the value 6 (properly encoded) from the cluster head; in this case, the value 01 is immediately interpreted as 5 without ambiguity, as a consequence of the fact that the sensed data can differ by at most 1.

In the above example, only the cluster head transmits 3 bits for each measurement, while for all the other nodes 2 bits are enough; therefore, compression is achieved.

However, DSC relies on the assumption that the statistical characteristics (i.e., the correlation function) of the underlying data are known a priori, which is difficult to obtain in practical scenarios. For instance, the simple DISCUS scheme discussed above works only if the difference between the value sampled by the cluster head and those of all the other nodes in the same cluster is less than 𝑑/2. Moreover, losing the side information (i.e., the cluster head data) will cause fatal errors at the decoder, that is, low reliability.

A simple way to improve reliability is to retransmit the cluster head packets several times, but this reduces the compression efficiency. Therefore, a trade-off exists between energy consumption and reliability, depending on the maximum allowed number of retransmissions. In this paper, we investigate such a trade-off.

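The binning of the example above is easy to express in code. The toy sketch below (hypothetical helper names, not the DISCUS reference implementation) encodes a reading as its 2-bit bin index and lets the sink resolve the ambiguity with the cluster-head value, under the stated assumption that readings differ by at most ±1.

```python
# Toy DISCUS-style binning for integer readings in [0, 7].
D = 4                                   # minimum intra-bin distance

def encode(value):
    return value % D                    # bin index: {0,4}->0, {1,5}->1, {2,6}->2, {3,7}->3

def decode(bin_index, side_info):
    # pick the bin member closest to the cluster-head reading (side information)
    candidates = [bin_index, bin_index + D]
    return min(candidates, key=lambda v: abs(v - side_info))

side_info = 6                           # 3-bit value sent by the cluster head
print(decode(encode(5), side_info))     # -> 5, recovered from only 2 transmitted bits
```
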

3.4. Networking Techniques

3.4.1. Routing-Based Techniques. Since radio transmission is the primary source of power consumption at the nodes, the design of energy-efficient routing is another important topic in the design of a data gathering technique. The basic idea is to route packets through paths that minimize the overall energy consumption for delivering a packet from the source to the destination. The problem focuses on computing the flows and transmission powers that maximize the lifetime of the network, which is the time at which the first node in the network runs out of energy [18]. Specifically, the energy consumption rate per unit of information transmitted by each node depends on the choice of the next hop, that is, the routing decision. This choice can influence the energy required to reach the sink [14].

One of the most recent works addressing the problem of maximizing network lifetime by taking the routing mechanism into account is [17]. The authors try to achieve both low latency and high reliability. They construct a data gathering tree based on a reliability model, schedule data transmissions for the links on the tree, and assign a transmission power to each link accordingly. However, they do not perform any kind of data aggregation or data compression, with the aim of not introducing extra delay.

Data aggregation can be performed on top of the routing algorithm. The aggregation function is usually performed by extracting some statistical values (e.g., maximum, minimum, and average) and then transmitting only these [15]. In this way, it is possible to reduce the amount of communicated data in dense sensor networks and reduce the power consumption. However, this technique loses much of the structure of the original acquired data.

In particular, the authors of [15] study the problem of forest construction for maximizing the network lifetime. They adopt a simple data aggregation model and assume that an intermediate sensor can aggregate multiple incoming B-bit messages, together with its own message, into a single outgoing message. Moreover, they provide a polynomial-time algorithm to build the tree and demonstrate that it is close to optimal.

In [14, 33, 34], the author considered the problem of maximizing the lifetime of WSNs by routing algorithms, recasting it as a linear programming problem solvable in polynomial time. The proposed algorithm is a shortest-cost-path routing whose link cost is a combination of the transmission and reception energy consumption and the residual energy levels at the two end nodes.

In [18], the authors try to jointly optimize routing and data aggregation so that the network lifetime can be extended along two dimensions. In the first dimension, the traffic across the network is reduced by data aggregation, so that the power consumption of the nodes close to the sink node can be reduced. In the second dimension, the traffic is balanced to avoid overwhelming the bottleneck nodes. A smoothing function is used to approximate the original maximization function by exploiting the special structure of the network. The necessary and sufficient conditions for achieving the optimality of this smoothing function were derived and a distributed gradient algorithm was accordingly designed.

Yang et al. propose in [35–37] a joint design of energy replenishment and data gathering that exploits mobility. The SenCar, a multifunctional mobile entity, periodically chooses a subset of sensors to visit based on their energy status. It uses wireless energy transmission to deliver energy to the visited sensors and, meanwhile, collects data from nearby sensors via short-range multihop communications and conveys these data to the sink.

3.4.2. Network Coding Based Techniques. A different approach exploits the network coding (NC) paradigm.

NC is an effective information transmission approach originally introduced by Ahlswede et al. [38] to improve the network capacity of multicast networks. Differently from the classical store-and-forward paradigm, where nodes simply replicate and forward incoming packets, with NC the intermediate nodes in the network have the ability to forward functions of the received packets (e.g., linear combinations). In this manner, throughput gain, robustness, and energy saving can be achieved by exploiting the fact that each newly generated packet carries information contained in several original packets [39].

NC has also received increasing attention in WSNs as a promising tool to improve network lifetime and reliability by exploiting the broadcast nature of the wireless channel [40]. However, most of the techniques developed so far [41–43], whilst being useful for data dissemination (e.g., traffic from the sink node to the sensor nodes), cannot be applied to data collection, which is the most important traffic in WSNs. In fact, to apply NC to data gathering in WSNs, some issues have to be solved.

(i) Header Overhead. NC schemes are mostly based on random linear codes [44, 45], which allow implementing them in a distributed manner but introduce a large overhead because the coefficients used for the linear combinations must be specified in the packet header. The header size is proportional to the number of aggregated packets, which, in the specific case of data collection in WSNs, could be equal to the number of nodes in the network.

(ii) All-or-Nothing Problem. When 𝑛 packets are combined using NC, the sink has to receive at least 𝑛 packets in order to be able to recover the original information. Thus, even if the sink receives 𝑛 − 1 packets, it cannot recover any information. Instead, graceful degradation is desirable in WSNs.

(iii) Delay. The delay introduced by NC might be prohibitive for large networks, where a large number of packets must be combined and decoded. Instead, many sensor network applications, for instance, WSNs developed for control/automation or real-time audio/video streaming, require small bounded delays.

(iv) Duty Cycling. Most NC schemes are based on overhearing; that is, nodes should remain in active mode to participate in NC-based routing, which increases the energy consumption of the sensor nodes. So, it is difficult to couple the NC paradigm with the duty-cycling techniques commonly used in WSNs.

(v) Reliability. Full (or at least high) reliability is desirable in sensor networks and is mandatory in several scenarios, for instance, in new scientific experiments, where the accuracy of observations is critical, or in biomedical applications, where it is necessary to ensure that important details are not lost, causing errors in medical diagnosis. When random codes are used, even in a reliable network, the original messages can be retrieved only with "high probability" (though not with certainty), and high probability is achieved through the use of large finite fields (i.e., large coefficients and therefore large headers).

(vi) Complexity. NC techniques should be simple in order to cope with the low computational and memory resources of sensor nodes.

We refer the reader to [46] for further considerations on the applicability of NC to WSNs.

The above issues have been addressed in [16], where the authors proposed a new forwarding technique for WSNs based on the Chinese Remainder Theorem (CRT), able to achieve different trade-offs between reliability and energy saving.

Basically, CRT can be seen as a splitting technique able to transform an integer number 𝑍 into a vector of smaller numbers named CRT components, {𝑧𝑖}. CRT components are obtained from the number 𝑍 using modular arithmetic as 𝑧𝑖 = 𝑍 (mod 𝑝𝑖), where the 𝑝𝑖 (with 𝑖 ∈ [1, . . . , 𝑁CRT]) are prime numbers (or at least pairwise coprime integers).

CRT states that every integer number 𝑍 can be exactly recovered from its CRT components if the product of the prime numbers 𝑃 = ∏_{𝑗=1}^{𝑁CRT} 𝑝𝑗 satisfies the condition

\[
P > Z \tag{5}
\]

(henceforward named the reconstruction condition). In particular, the CRT states that 𝑍 can be recovered through a simple linear combination as

\[
Z = \sum_{j=1}^{N_{\mathrm{CRT}}} c_j z_j \ (\mathrm{mod}\ P). \tag{6}
\]

The coefficients 𝑐𝑗 are given by 𝑐𝑗 = 𝑄𝑗 ⋅ 𝑞𝑗, where 𝑄𝑗 = 𝑃/𝑝𝑗 and 𝑞𝑗 is its modular inverse, obtained by solving 𝑞𝑗𝑄𝑗 = 1 (mod 𝑝𝑗).

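The splitting and reconstruction of (5)-(6) can be sketched in a few lines of Python (hypothetical helper names; the relayer and sink roles are only simulated here). Each component is a residue 𝑧𝑗 = 𝑍 mod 𝑝𝑗, and the sink recombines the residues with the coefficients 𝑐𝑗 = 𝑄𝑗𝑞𝑗.

```python
# CRT splitting and reconstruction, following (5)-(6).
from math import prod

def crt_split(Z, primes):
    return [Z % p for p in primes]            # CRT components z_j

def crt_reconstruct(components, primes):
    P = prod(primes)
    Z = 0
    for z_j, p_j in zip(components, primes):
        Q_j = P // p_j
        q_j = pow(Q_j, -1, p_j)               # modular inverse: q_j * Q_j = 1 (mod p_j)
        Z += z_j * Q_j * q_j                  # c_j = Q_j * q_j
    return Z % P

primes = [1019, 1021, 1031, 1033]             # four primes whose product P exceeds Z
Z = 987654321098                              # any integer with Z < P
assert crt_reconstruct(crt_split(Z, primes), primes) == Z
```
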
CRT can be applied in WSNs to split packets produced by away from the source along the path to the sink are specifi-
sensor nodes. Such smaller packets (i.e., CRT components) cally called one-hop relayers.
can be sent through different paths by exploiting path diver- We assume that, independently of the specific data gath-
sity of WSNs. The fact that relayers nodes forward smaller ering technique, relayers transmit packets through a load-
packets allows reducing energy consumption [47]. balancing shortest-path scheme; that is, a node in cluster
Moreover, CRT has several advantages in comparison to CLℎ+1 will select randomly a reachable node in the next
other NC techniques: cluster toward the sink (CLℎ ), and this forwarding scheme is
repeated until the sink is reached. In this manner, information
(i) The set of prime numbers {𝑝𝑖 } can be chosen so that reaches the sink with the minimum number of hops.
information produced by the sensor nodes can be
reconstructed even if only a fraction of the CRT com- 4.2. Data Gathering Model. Until now, we have classified data
ponents are received by the sink, by improving relia- gathering techniques on the basis of the data aggregation
bility and solving the all-or-nothing problem. technique used. However, data gathering techniques can be
(ii) Differently from coefficients used for NC techniques, classified also considering the factors that drive data acquisi-
the set of prime numbers can be obtained directly by tion. In particular, four broad categories can be distinguished
the sink (i.e., CRTs avoid header explosion). [50]: event-driven, time-driven, query-based, and hybrid.
(iii) CRT can be efficiently combined with duty-cycling In event-driven category, data are generated when an
techniques [48] and distributed compression algo- event of interest occurs, while in the time-driven category
rithms [19] to achieve an efficient data aggregation data are periodically sent to the sink at constant interval of
technique. time; in query-based category, data are collected according to
sink requests. Finally, the hybrid approach is a combination
Considering the above advantages and the fact that this of one or more of the above.
paper is focused on data gathering techniques for WSNs, we For simulation purpose, with the aim of evaluating energy
will consider CRT as representative of networking-based data consumptions and reliability, all the above categories can be
gathering techniques. unified with an abstraction of the concept of event, that is, by
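As a stand-in for the distributed IM exchange, the simulation-oriented sketch below assigns each node to cluster CLℎ, where ℎ − 1 is its minimum hop count from the sink, and derives the next-hop lists used by the forwarding scheme. It is a centralized BFS abstraction of the procedure in [16, 49], not the protocol itself.

```python
# Centralized BFS abstraction of the IM-based cluster formation.
from collections import deque

def build_clusters(neighbors, sink):
    # neighbors: dict node -> list of nodes within transmission range R
    cluster = {sink: 1}                       # the sink belongs to CL1
    queue = deque([sink])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in cluster:              # the first IM received fixes the cluster
                cluster[v] = cluster[u] + 1
                queue.append(v)
    # next hops of a node in CL_{h+1} are its neighbors in CL_h
    next_hops = {u: [v for v in neighbors[u] if cluster.get(v) == cluster[u] - 1]
                 for u in cluster}
    return cluster, next_hops
```
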
We assume that the above initialization procedure is carried out only once, so we neglect the related energy consumption.

In the following, nodes along the path from a source to the sink are referred to as relayers, and nodes located one hop away from the source along the path to the sink are specifically called one-hop relayers.

We assume that, independently of the specific data gathering technique, relayers transmit packets through a load-balancing shortest-path scheme; that is, a node in cluster CLℎ+1 randomly selects a reachable node in the next cluster toward the sink (CLℎ), and this forwarding scheme is repeated until the sink is reached. In this manner, information reaches the sink with the minimum number of hops.

4.2. Data Gathering Model. Until now, we have classified data gathering techniques on the basis of the data aggregation technique used. However, data gathering techniques can also be classified considering the factors that drive data acquisition. In particular, four broad categories can be distinguished [50]: event-driven, time-driven, query-based, and hybrid.

In the event-driven category, data are generated when an event of interest occurs, while in the time-driven category data are periodically sent to the sink at constant intervals of time; in the query-based category, data are collected according to sink requests. Finally, the hybrid approach is a combination of one or more of the above.

For simulation purposes, with the aim of evaluating energy consumption and reliability, all the above categories can be unified with an abstraction of the concept of event, that is, by simply considering that data must be sent as a consequence of an event. In particular, in a query-based network an event is triggered by the reception of a query message, while in time-driven networks the event can be associated with the rising clock edge of the sampling unit or, more practically, with the moment when a sufficient number of measures has been collected and a packet is ready to be transmitted.

Considering the above abstraction, energy consumption and network reliability can be evaluated in terms of the number of events (i.e., packets sent), without taking into account what drives the event.

Therefore, in our simulation scenarios we consider that 𝐸V events randomly occur in the sensor network and that, for each event, 𝑁𝑚 nodes recognize the event and generate a packet. More precisely, we assume that only the nodes inside the circular area of radius 𝑟 centered at the location of the event detect the event and therefore need to send a packet. Henceforward, we call the circular area related to an event a cell.

In event-driven networks, usually small packets are sent to specify that an event has been detected (a single 𝑤-bit word could be sufficient in most cases). Instead, in the case of time-driven or query-driven networks, packets represent 𝑀 measures collected in the time interval between two events. Both cases can be taken into account by considering that for each event raw information of 𝑀𝑤 bits has to be sent by each node, fixing 𝑀 = 1 for event-driven networks and 𝑀 ≥ 1 in the case of time-driven or query-driven networks.

With the aim of reducing energy consumption (i.e., the overall number of bits sent), raw data are not directly transmitted; instead, data are processed according to the chosen data gathering technique. More precisely, we have the following.

(i) TEC. Nodes using TEC techniques exploit temporal correlation to reduce the number of bits. Here we do not consider a specific TEC technique but assume that the compression factor 𝐹TEC of the TEC technique used is known. As a consequence, using a TEC technique, each piece of raw data of 𝑤 bits is represented after compression with 𝑏TEC = 𝑤/𝐹TEC bits and for each event a node must transmit 𝐿TEC = 𝑀𝑤/𝐹TEC bits. So, considering that 𝑁𝑚 nodes sense the event, the overall number of bits transmitted for each event when TEC is used is

\[
B_{\mathrm{TEC}} = N_m L_{\mathrm{TEC}} = \frac{N_m M w}{F_{\mathrm{TEC}}}. \tag{7}
\]

As already stated, we assume that packets are transmitted through a load-balancing shortest-path scheme; that is, a node in cluster CLℎ+1 randomly selects a reachable node in the next cluster toward the sink (CLℎ), and this forwarding scheme is repeated until the sink is reached. In this manner, information reaches the sink with the minimum number of hops.

(ii) DSC. According to DISCUS, we assume that only one node for each cell (henceforward named the cell head) sends uncompressed measures (i.e., side information) in a packet of 𝑀𝑤 bits, while all the other nodes send compressed packets of 𝑀𝑏DSC = 𝑀𝑤/𝐹DSC bits. Also in this case we assume that all packets are transmitted through a load-balancing shortest-path scheme. However, to improve reliability we assume that the cell head transmits its packets 𝑁𝑟,DSC times. So, considering that 𝑁𝑚 nodes sense the event, the overall number of bits transmitted for each event when DSC is used is

\[
B_{\mathrm{DSC}} = N_{r,\mathrm{DSC}} M w + (N_m - 1)\frac{M w}{F_{\mathrm{DSC}}}. \tag{8}
\]

(iii) CS. We assume that the cell head collects the packets of the other nodes in the same cell and sends them by applying CS. More precisely, the collected measures can be represented by a matrix X of 𝑁𝑚 × 𝑀 values of 𝑤 bits each, where the 𝑖th column x𝑖 represents the measures taken by the 𝑁𝑚 nodes almost at the same time. Considering that such values are highly correlated in both space and time, by taking a proper transform we obtain with high probability a sparse vector. For instance, we can assume that the DCT is applied to each column vector x𝑖 and that only 𝑘 DCT coefficients are nonzero. In this case, the cell head needs to send 𝑚 = 𝑂(𝑘 log(𝑁𝑚/𝑘)) measurements for each column, that is, 𝑀 ⋅ 𝑚 measurements for each event.

For simulation purposes, we suppose that CS measurements are represented by 𝑏CS bits and that those measurements are sent through 𝑚 packets of 𝐿CS = 𝑀 ⋅ 𝑏CS bits each through a load-balancing shortest-path scheme. So the overall number of bits transmitted by the cell head for each event when CS is used is

\[
B_{\mathrm{CS}} = m L_{\mathrm{CS}} = m M b_{\mathrm{CS}}. \tag{9}
\]

Other choices are possible without altering the overall number of transmitted bits 𝐵CS; for instance, we could have considered 𝑀 packets of 𝑚 ⋅ 𝑏CS bits, but the previous choice simplifies comparisons. In particular, we will show that with the above choice the comparison results are independent of 𝑀, so our results are valid for both event-driven (𝑀 = 1) and time/query-driven (𝑀 ≫ 1) techniques.

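As a quick numeric check of (7)-(9), the snippet below evaluates the per-event bit counts for one purely illustrative parameter set (the values of 𝑁𝑚, 𝑀, 𝑤, the compression factor, 𝑁𝑟,DSC, 𝑏CS, and 𝑘 are assumptions, not the paper's simulation settings); the number of CS measurements 𝑚 is taken from (4).

```python
# Illustrative per-event bit counts from (7)-(9); all parameter values are assumed.
import math

N_m, M, w, F, N_r_dsc, b_cs, k = 45, 50, 12, 4.0, 1, 15, 5

B_raw = N_m * M * w                                   # bits without any processing
B_tec = N_m * M * w / F                               # (7)
B_dsc = N_r_dsc * M * w + (N_m - 1) * M * w / F       # (8)
m = math.ceil(2 * k * math.log(N_m / k) + 1.25 * k + 1)
B_cs = m * M * b_cs                                   # (9), with m from (4)
print(B_raw, B_tec, B_dsc, B_cs)
```
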
(iv) CRT. CRT is exploited as shown in [16]. In particular, we suppose that for each event the 𝑁𝑚 source nodes send their packets to a common set of one-hop relayers named CRT relayers. Henceforward, we indicate by 𝑁CRT the number of CRT relayers.

As in the case of CS, the collected measures can be represented by a matrix X of 𝑁𝑚 × 𝑀 values of 𝑤 bits each, where the 𝑖th column x𝑖 represents the measures taken by the 𝑁𝑚 nodes almost at the same time.

CRT relayers process the data of each column x𝑖 in two steps:

(1) In the first step, the received data are compressed with a compression algorithm, obtaining a binary sequence 𝑆.

(2) In the second step, CRT is applied to improve reliability by splitting the binary sequence 𝑆 so that each CRT relayer forwards a CRT component.

It is worth noting that each CRT relayer independently compresses the received packets, obtaining the same sequence 𝑆.

This is possible mainly because, since all the relayers receive the same data set and apply the same compression algorithm, the compressed sequence 𝑆 they obtain is the same. Henceforward, we indicate by 𝑤𝑆 the length of the compressed sequence 𝑆.

The CRT relayers then split the binary sequence 𝑆 they have constructed and forward it. Specifically, the sequence 𝑆 is interpreted as an integer 𝑍𝑆 = ∑_{𝑖=0}^{𝑤𝑆−1} 𝑠𝑖 ⋅ 2^𝑖 (where the 𝑠𝑖 are the bits of 𝑍𝑆) and, by properly choosing the set of prime numbers {𝑝𝑗}, each CRT relayer calculates and forwards the corresponding CRT component 𝑧𝑗 = 𝑍𝑆 (mod 𝑝𝑗).

Note that ⌈log2(𝑝𝑗)⌉ is the number of bits needed to represent 𝑧𝑗, so the overall number of bits transmitted by the CRT relayers is

\[
B_{\mathrm{CRT}} = M \sum_{j=1}^{N_{\mathrm{CRT}}} \lceil \log_2(p_j) \rceil. \tag{10}
\]

From the theory of CRT, the sink will be able to reconstruct all raw measurements from the CRT components provided that the reconstruction condition is satisfied (i.e., ∏_{𝑗=1}^{𝑁CRT} 𝑝𝑗 ≥ 2^{𝑤𝑆}).

Note that the reconstruction condition can be satisfied by multiple sets of prime numbers; however, to reduce the number of bits needed to represent the values 𝑧𝑗, and therefore the overall number of bits sent, it is preferable to choose the smallest possible set of primes, which we refer to as the Minimum Primes Set (MPS). For instance, if 𝑁CRT = 4 and 𝑤𝑆 = 40, the MPS is {1019, 1021, 1031, 1033}; in fact, this is the set of the smallest four consecutive primes that satisfy the relationship ∏_{𝑗=1}^{𝑁CRT} 𝑝𝑗 ≥ 2⁴⁰.

However, when the set of primes is chosen as above, the message can be reconstructed if and only if all the CRT components are correctly received by the sink. So, to take into account the possible losses due to wireless medium unreliability, we use the MPS with 𝑓 admissible failures (MPS − 𝑓), that is, the set of the smallest consecutive primes that satisfy the reconstruction condition even if 𝑓 CRT components are lost. As shown in [16], when 𝑤𝑆, 𝑁CRT, and 𝑓 are fixed, the MPS − 𝑓 set is unique, so the CRT relayers can obtain the MPS − 𝑓 in a distributed manner.

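The following sketch shows one way a relayer could derive the MPS − 𝑓 deterministically: slide a window of 𝑁CRT consecutive primes until the product of the 𝑁CRT − 𝑓 smallest ones (the worst case when 𝑓 components are lost) exceeds 2^{𝑤𝑆}. This is our reading of the rule in [16], not code taken from that paper.

```python
# Deterministic MPS-f search (sketch): smallest run of N_CRT consecutive primes
# whose product, after removing the f largest, still exceeds 2**w_S.
def primes_from(start):
    n = max(start, 2)
    while True:
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

def mps_f(n_crt, w_s, f):
    candidate = []
    for p in primes_from(2):
        candidate.append(p)
        if len(candidate) > n_crt:
            candidate.pop(0)                   # keep a sliding window of n_crt primes
        if len(candidate) == n_crt:
            survivors = candidate[: n_crt - f] # product when the f largest are lost
            prod = 1
            for q in survivors:
                prod *= q
            if prod >= 2 ** w_s:
                return list(candidate)

print(mps_f(4, 40, 0))   # -> [1019, 1021, 1031, 1033], the example given in the text
```
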

(i.e., maximum lossless compression factor). As a conse-


4.3. Source Model. As shown in [13] and references therein, the differences between two consecutive samples of several real-world quantities (temperature, humidity, solar radiation, etc.) fit well with Gaussian distributions. So, in this paper we consider that the sensed data 𝑥𝑖 are approximated by a Gaussian distribution and that they are correlated in both space and time. This choice is also motivated by the fact that several analytical results are well known for the Gaussian distribution and can be readily exploited to obtain the maximum lossless compression factor for correlated Gaussian sources.

As is well known, when compression of discrete sources is considered, Shannon's entropy 𝐻 gives the lossless compression limit. For Gaussian correlated data, under suitable assumptions and without loss of generality, it can be shown that, considering 𝑋1, . . . , 𝑋𝑁 obtained from the quantization of continuous Gaussian variables 𝑌1, . . . , 𝑌𝑁, the joint entropy is [51]

\[
H(X_1, \ldots, X_N) = h(Y_1, \ldots, Y_N) = \frac{1}{2}\log_2\left((2\pi e)^N \cdot |\Sigma|\right), \tag{11}
\]

where |Σ|, known as the generalized variance, is the determinant of the covariance matrix Σ and ℎ(⋅) is the differential entropy. (Rigorously speaking, we have to distinguish between the differential entropy ℎ(𝑌) of a continuous source 𝑌, i.e., before the A/D conversion, and Shannon's information entropy 𝐻(𝑋) of a discrete source 𝑋, i.e., after the quantization introduced by the A/D conversion; however, it is straightforward to prove that their values coincide when a unitary quantization step is considered, as done in this paper.) In particular, for Gaussian sources with the same correlation coefficient 𝜌𝑐 and variance 𝜎², the generalized variance is |Σ| = 𝜎^{2𝑁} ⋅ [1 + (𝑁 − 1)𝜌𝑐] ⋅ (1 − 𝜌𝑐)^{𝑁−1} and therefore

\[
H = N \log_2\left(\sqrt{2\pi e (1 - \rho_c)} \cdot \sigma\right) + \log_2\left(\sqrt{\frac{1 + (N - 1)\rho_c}{1 - \rho_c}}\right). \tag{12}
\]

Moreover, considering that for a broad range of values (i.e., 𝜌𝑐 ∈ [0, 0.99], 𝑁 ≥ 8) the second term is negligible, it follows that

\[
H \approx N \log_2\left(\sqrt{2\pi e (1 - \rho_c)} \cdot \sigma\right). \tag{13}
\]

Therefore, the ideal (maximum) lossless compression factor for Gaussian variables, considering blocks of 𝑁 correlated values of 𝑤 bits each, can be obtained as

\[
F_{C,\mathrm{ideal}} = \frac{w \cdot N}{H} = \frac{w}{\log_2\left(\sqrt{2\pi e (1 - \rho_c)} \cdot \sigma\right)}. \tag{14}
\]

Throughout the paper, we assume that

\[
F_{\mathrm{DSC}} = F_{\mathrm{TEC}} = F_{C,\mathrm{ideal}} \tag{15}
\]

(i.e., the maximum lossless compression factor). As a consequence, we have

\[
b_{\mathrm{DSC}} = b_{\mathrm{TEC}} = \log_2\left(\sqrt{2\pi e (1 - \rho_c)} \cdot \sigma\right). \tag{16}
\]

As regards CS, the actual compression factor is related to the sparsity level 𝑠 = 𝑘/𝑁𝑚; in fact,

\[
F_{\mathrm{CS}} = \frac{N_m M w}{B_{\mathrm{CS}}} = \frac{w}{b_{\mathrm{CS}}} \cdot \frac{1}{2s\log(1/s) + (5/4)s + 1/N_m}. \tag{17}
\]

Note that 𝐹CS is a decreasing function of 𝑠. So we consider two cases: an ideal sparsity level 𝑠ideal such that 𝐹CS = 𝐹𝐶,ideal and a slightly greater value 𝑠′ = 1.2 ⋅ 𝑠ideal.

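The snippet below plugs one assumed setting (𝜎 = 8, 𝜌𝑐 = 0.55, 𝑤 = 12, 𝑏CS = 15, 𝑁𝑚 = 45) into (14), (16), and (17) to show the ideal compression factor and how 𝐹CS shrinks as the sparsity level grows; the logarithm in (17) is taken as natural, consistent with the bound in (4).

```python
# Numeric illustration of (14), (16), and (17) for one assumed parameter set.
import math

sigma, rho_c, w, b_cs, N_m = 8.0, 0.55, 12, 15, 45

b_ideal = math.log2(math.sqrt(2 * math.pi * math.e * (1 - rho_c)) * sigma)  # (16)
F_ideal = w / b_ideal                                                       # (14)

def F_cs(s):
    return (w / b_cs) / (2 * s * math.log(1 / s) + 1.25 * s + 1 / N_m)      # (17)

print(round(b_ideal, 2), round(F_ideal, 2))          # ~4.47 bits/sample, F_C,ideal ~ 2.7
print(round(F_cs(0.03), 2), round(F_cs(0.10), 2))    # F_CS falls as s grows
```
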

Finally, in the case of CRT, we consider that the simple MinDiff algorithm proposed in [4] is used for compression.

Basically, MinDiff encodes a set of uncompressed data 𝑈 = {𝑥𝑖} with another set of compressed data 𝐶 = {𝜇, 𝑑1, . . . , 𝑑𝑛}, where 𝜇 = min{𝑥𝑖} is the minimum of the values in 𝑈 and the 𝑑𝑖 are the differences 𝑑𝑖 = 𝑥𝑖 − 𝜇, represented with 𝑏𝑑 = ⌈log2(max{𝑑𝑖} + 1)⌉ bits each.

The number of bits 𝑏𝑑 needed to represent the set of differences and the value of 𝜇 are necessary for proper reconstruction, and therefore an overhead of 𝑤 + log2(𝑤) bits must be considered. Therefore, its compression factor, considering blocks of 𝑁 values of 𝑤 bits each, can be obtained as

\[
F_{C,\mathrm{MinDiff}} = \frac{w \cdot N}{w + \log_2(w) + N \cdot b_d}. \tag{18}
\]

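A compact implementation of the MinDiff encoding described above is given below (a sketch based on the description in the text, not the code of [4]); it also reports the compression factor of (18) for the chosen block.

```python
# MinDiff sketch: encode a block U = {x_i} as (mu, b_d, d_1, ..., d_n) and report
# the compression factor of (18), including the w + log2(w) bits of overhead.
import math

def mindiff_encode(block, w):
    mu = min(block)
    diffs = [x - mu for x in block]
    b_d = math.ceil(math.log2(max(diffs) + 1)) if max(diffs) > 0 else 1
    bits_out = w + math.log2(w) + len(block) * b_d        # overhead + differences
    return (mu, b_d, diffs), (w * len(block)) / bits_out  # payload, F_C,MinDiff

def mindiff_decode(mu, diffs):
    return [mu + d for d in diffs]

block = [501, 503, 499, 502, 500, 498, 503, 501]          # example 12-bit readings
payload, factor = mindiff_encode(block, w=12)
assert mindiff_decode(payload[0], payload[2]) == block
print(round(factor, 2))
```
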
For comparison purpose, we decided to evaluate energies
4.4. Energy Model. Similarly to other works (e.g., [16, 19]), considering nodes belonging to cluster 2 (i.e., CL2 ).
we consider a simple energy model where for each bit to be We restrict our analysis to the nodes of the second cluster
transmitted a node spends an energy equal to 𝑒𝑏 . Apparently, for two reasons. Firstly, these nodes are the most critical as
it seems that the model neglects the energy needed for they represent the sinks neighbors. In fact, if these nodes run
computation and for reception but this is not true if we out of energy, the sink remains isolated. Secondly, network
reflect on the fact that in sensor networks the number of bits lifetime is defined as the time until the first node in the
transmitted, the number of bits received, and the number of network dies and with high probability, if not certainty, this
processing operations are all proportional to the number of node belongs to CL2 considering that all messages are routed
sensed measures. So energy needed for computation and for to the sink through these nodes.
reception can be easily included in 𝑒𝑏 . Finally, considering that network lifetime is related to
For instance, let us suppose that a node for sensing and the maximum energy consumed by a node in this paper,
processing 𝑀 measures of 𝑤 bits needs an energy equal to we investigate also the energy reduction factor related to the
𝑀𝑤 ⋅ 𝑒𝑐 and that, using a proper compression technique with maximum energies:
a compression factor equal to 𝐹𝑐 , it reduces the number of bits
to be transmitted from 𝑀𝑤 to 𝑀𝑤/𝐹𝑐 . In this case, the overall 𝐸RAW,max − 𝐸𝑋,max
ERF𝑋,max = . (22)
energy is 𝑀𝑤 ⋅ 𝑒𝑐 + (𝑀𝑤/𝐹𝑐 ) ⋅ 𝑒𝑇𝑋 which we can rewrite as 𝐸RAW,max
(𝑀𝑤/𝐹𝑐 ) ⋅ 𝑒𝑏 considering 𝑒𝑏 = 𝐹𝑐 𝑒𝑐 + 𝑒𝑇𝑋 .
Finally, if also the energy needed for reception must be Concerning reliability, we consider that a node fails to
included and it differs from the energy used for transmission, forward a packet with probability 𝑝𝑒 and evaluate the ratio
considering that for almost all nodes the number of bits 𝑃𝑅,𝑋 between the number of raw measurements that are
received is equal to the number of bits transmitted, it will be obtained by the sink and the number of raw measurements
sufficient to use 𝑒𝑏 = 𝐹𝑐 𝑒𝑐 + 𝑒𝑇𝑋 + 𝑒𝑅𝑋 . generated from source nodes or, equivalently,
Therefore, the main simplification introduced by our
𝑀lost,𝑋
model is that we consider 𝑒𝑏 distance-independent; that is, 𝑃𝑅,𝑋 = 1 − , (23)
we do not consider that 𝑒𝑇𝑋 could be adaptively changed by 𝑀 ⋅ 𝑁𝑚
the MAC layer on the basis of distance between source and
where 𝑀lost,𝑋 is the number of raw measurements that are lost
destination node.
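The folding of the sensing/processing and reception costs into the single per-bit figure 𝑒𝑏, as argued above, amounts to the following one-liner (symbols as in the text; the actual energy values depend on the specific platform and are not specified here).

```python
def e_b(F_c, e_c, e_tx, e_rx=0.0):
    # charge per transmitted bit so that (M*w/F_c) * e_b equals
    # M*w*e_c + (M*w/F_c) * (e_tx + e_rx), as derived in the text
    return F_c * e_c + e_tx + e_rx
```
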
5. Performance Metrics

In order to estimate the energy efficiency of the above techniques, let us introduce the energy reduction factor, ERF𝑋, which represents the percentage reduction of the energy spent using a specific data gathering technique (𝐸𝑋) as compared to the case when raw measures are directly sent (𝐸RAW). This metric is defined as

\[
\mathrm{ERF}_X = 100 \cdot \frac{E_{\mathrm{RAW}} - E_X}{E_{\mathrm{RAW}}}, \tag{19}
\]

where 𝑋 ∈ {TEC, CS, CRT, DSC}.

When an ideal lossless network is considered, ERF𝑋 can be evaluated as

\[
\mathrm{ERF}_{X,\mathrm{ideal}} = 100 \cdot \frac{N_m M w \cdot e_b - B_X \cdot e_b}{N_m M w \cdot e_b}, \tag{20}
\]

where 𝑒𝑏 is the energy spent by a node to transmit a single bit and 𝐵𝑋 is the overall number of bits transmitted with the specific data gathering technique, as derived in Section 4 (see (7)–(10)).

For instance, in the case of TEC-based data gathering, 𝐵TEC = 𝑁𝑚𝑀𝑏TEC = 𝑁𝑚𝑀𝑤/𝐹TEC and, as a consequence,

\[
\mathrm{ERF}_{\mathrm{TEC}} = 100 \cdot \left(1 - \frac{1}{F_{\mathrm{TEC}}}\right). \tag{21}
\]

However, in the case of a lossy network, the numbers of transmitted and received bits differ, and the expected energy reduction factor has to be expressed taking into account the actual number of bits forwarded.

For comparison purposes, we decided to evaluate energies considering nodes belonging to cluster 2 (i.e., CL2). We restrict our analysis to the nodes of the second cluster for two reasons. Firstly, these nodes are the most critical, as they are the sink's neighbors: if these nodes run out of energy, the sink remains isolated. Secondly, network lifetime is defined as the time until the first node in the network dies and, with high probability, if not certainty, this node belongs to CL2, considering that all messages are routed to the sink through these nodes.

Finally, considering that network lifetime is related to the maximum energy consumed by a node, in this paper we also investigate the energy reduction factor related to the maximum energies:

\[
\mathrm{ERF}_{X,\max} = \frac{E_{\mathrm{RAW},\max} - E_{X,\max}}{E_{\mathrm{RAW},\max}}. \tag{22}
\]

Concerning reliability, we consider that a node fails to forward a packet with probability 𝑝𝑒 and evaluate the ratio 𝑃𝑅,𝑋 between the number of raw measurements obtained by the sink and the number of raw measurements generated by the source nodes or, equivalently,

\[
P_{R,X} = 1 - \frac{M_{\mathrm{lost},X}}{M \cdot N_m}, \tag{23}
\]

where 𝑀lost,𝑋 is the number of raw measurements that are lost due to network and/or reconstruction errors.

𝑃𝑅,TEC can be easily estimated by assuming a perfect decoding technique where all received data are correctly decoded by the sink. In fact, if ℎ is the number of hops needed to reach the sink and 𝑝𝑒 is the probability that a node fails to forward a packet, the probability that a packet is lost is 𝑝𝑛 = 1 − (1 − 𝑝𝑒)ℎ and therefore the expected number of lost data is 𝑀lost,TEC = 𝑁𝑚𝑀𝑝𝑛. As a consequence,

\[
P_{R,\mathrm{TEC}} = 1 - \frac{M_{\mathrm{lost,TEC}}}{M \cdot N_m} = 1 - p_n = (1 - p_e)^h. \tag{24}
\]

Note that the reliability of TEC techniques depends only on the network parameters 𝑝𝑒 and ℎ and cannot be improved without relying on channel coding techniques (e.g., FEC).

Differently from TEC techniques, for all the other data gathering techniques reliability can be improved with a proper setting of the design parameters. Nevertheless, a trade-off exists between reliability and energy saving, as briefly discussed below.

Figure 1: 𝑃𝑅,CRT for different values of 𝑁CRT and 𝑓 when 𝑝𝑛 = 0.04.

Figure 2: 𝑃𝑅,DSC for different values of 𝑁𝑚, 𝑁𝑟,DSC, and 𝑝𝑒 when ℎ = 4 (curves for 𝑁𝑚 ∈ {10, 100} and 𝑝𝑒 ∈ {0.01, 0.05}; 𝑃𝑅,TEC shown for reference).

(i) CRT. In the case of CRT data gathering, the prime numbers can be selected so that all raw measures can be reconstructed even if at most 𝑓 CRT components are lost.

As shown in [16], when 𝑓 is fixed, the reliability can be estimated as

\[
P_{R,\mathrm{CRT}} = \sum_{i=0}^{f} \binom{N_{\mathrm{CRT}}}{i} p_n^i (1 - p_n)^{N_{\mathrm{CRT}} - i}, \tag{25}
\]

where 𝑝𝑛 = 1 − (1 − 𝑝𝑒)ℎ is the probability that a CRT component is lost.

As a general rule, high reliability can be obtained by fixing 𝑓 so that 𝑓 = 𝑁CRT𝑝𝑛 + 𝑘𝑓, where 𝑘𝑓 is a small constant on the order of √(𝑁CRT𝑝𝑛). This result can be justified by considering that 𝑃𝑅,CRT (see (25)) can be approximated by the cumulative distribution function of a normal variable with mean 𝑁CRT𝑝𝑛 and variance 𝑁CRT𝑝𝑛(1 − 𝑝𝑛).

For instance, in Figure 1 we show the reliability 𝑃𝑅,CRT for different values of 𝑁CRT and 𝑓 when 𝑝𝑛 = 0.04. As it is possible to observe, small values of 𝑓 (e.g., 𝑓 = 6) are sufficient to achieve high values of reliability (>0.99).

Higher reliability can be obtained by further increasing the value of the parameter 𝑓. However, by increasing 𝑓 the energy consumption increases too, so in the next section we investigate the trade-off between reliability and energy consumption for different values of 𝑓.

(ii) DSC. Also in the case of DSC, the probability of losing a packet is 𝑝𝑛 = 1 − (1 − 𝑝𝑒)ℎ. However, DSC-compressed measures cannot be recovered if the side information (i.e., the packets generated by the cell head) is not received. So, in order to improve the reliability, we consider that the cell head transmits the side information 𝑁𝑟,DSC times. In this case, DSC reliability can be evaluated as

\[
P_{R,\mathrm{DSC}} = 1 - \frac{N_{\mathrm{lost,DSC}}}{N_m}, \tag{26}
\]

where 𝑁lost,DSC is the number of lost packets, taking into account that all packets related to the same event are lost if none of the cell head packets arrives. Note that the expected value of 𝑁lost,DSC is \(\overline{N}_{\mathrm{lost,DSC}} = N_m p_n^{N_{r,\mathrm{DSC}}} + (N_m - 1)\left(1 - p_n^{N_{r,\mathrm{DSC}}}\right)p_n\).

In Figure 2, we compare the reliability of TEC and DSC for two different values of the loss probability per hop (𝑝𝑒 = 0.01 and 𝑝𝑒 = 0.05) and a fixed number of hops, ℎ = 4, when different values of 𝑁𝑟,DSC and different numbers of source nodes 𝑁𝑚 are considered.

As it is possible to observe, by fixing 𝑁𝑟,DSC = 3 we have 𝑃𝑅,DSC ≥ 𝑃𝑅,TEC for a broad range of 𝑝𝑒 and 𝑁𝑚 (i.e., 𝑝𝑒 ∈ {0.01, 0.05} and 𝑁𝑚 ∈ {10, 100}). Note also that further increasing 𝑁𝑟,DSC does not improve 𝑃𝑅,DSC much. This can be justified by considering that, for high values of 𝑁𝑟,DSC, that is, 𝑁𝑟,DSC → ∞, it follows that 𝑃𝑅,DSC ≤ 1 − (𝑁𝑚 − 1)𝑝𝑛/𝑁𝑚 = 1 − 𝑝𝑛 + 𝑝𝑛/𝑁𝑚 ≈ 𝑃𝑅,TEC.

Obviously, reliability increases with the number of retransmissions 𝑁𝑟,DSC, but at the cost of reduced energy saving. So in the next section we investigate the trade-off between reliability and energy consumption for different values of 𝑁𝑟,DSC.

(iii) CS. Indicating with 𝑚 the number of packets sent with CS, the probability of receiving at least 𝑚∗ packets is

\[
\tilde{P}_{\mathrm{CS}} = \sum_{i=0}^{m - m^{*}} \binom{m}{i} p_n^i (1 - p_n)^{m - i}. \tag{27}
\]

Therefore, by increasing 𝑚 it is possible to guarantee that, with the desired probability 𝑃̃CS, at least 𝑚∗ packets are received by the sink. As a general rule, high reliability can be obtained by fixing 𝑚 so that 𝑚 = 𝑚∗/(1 − 𝑝𝑛) + 𝑘𝑚, where 𝑘𝑚 is a small constant.

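Both (25) and (27) are binomial tail sums and can be checked directly; the snippet below reproduces the kind of numbers discussed around Figure 1 (𝑝𝑛 = 0.04, 𝑓 = 6) and applies the rule 𝑚 = 𝑚∗/(1 − 𝑝𝑛) + 𝑘𝑚 for an assumed 𝑚∗.

```python
# Binomial-tail evaluation of (25) and (27); m* = 30 is an assumed example value.
from math import comb

def p_r_crt(n_crt, f, p_n):
    # (25): probability that at most f of the N_CRT components are lost
    return sum(comb(n_crt, i) * p_n**i * (1 - p_n)**(n_crt - i) for i in range(f + 1))

def p_cs_net(m, m_star, p_n):
    # (27): probability that at least m* of the m CS packets arrive
    return sum(comb(m, i) * p_n**i * (1 - p_n)**(m - i) for i in range(m - m_star + 1))

p_n = 0.04
print(round(p_r_crt(50, 6, p_n), 4))          # > 0.99, consistent with Figure 1
m_star = 30
m = round(m_star / (1 - p_n)) + 2             # the rule m = m*/(1 - p_n) + k_m
print(m, round(p_cs_net(m, m_star, p_n), 4))
```
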
In this case, DSC reliability can be evaluated as
However, differently from all previous techniques, CS
𝑁lost,DSC reliability is not related only to the number of received packets
𝑃𝑅,DSC = 1 − , (26)
𝑁𝑚 so that 𝑃CS , henceforward named network reliability, is not

In fact, CS techniques are based on reconstruction algorithms that could introduce errors, so that the reconstructed values 𝑥̂𝑖 could differ from the original raw data 𝑥𝑖. Nevertheless, if we assume that the raw data 𝑥𝑖 are quantized values (as usual in WSNs, where raw measures come from ADCs), we can state that quantized measures can be exactly recovered if the reconstruction error |𝑥̂𝑖 − 𝑥𝑖| is smaller than the quantization error Δ𝑥/2, where Δ𝑥 is the quantization step used for quantizing 𝑥𝑖.

Therefore, in our simulations, the reconstruction error |𝑥̂𝑖 − 𝑥𝑖| is evaluated for each measure, and the raw data value is considered lost if this error is greater than Δ𝑥/2. From the simulation point of view, the actual CS reliability can thus be evaluated as

\[
P_{R,\mathrm{CS}} = 1 - \frac{M_{\mathrm{lost,CS}}}{M \cdot N_m}, \tag{28}
\]

where 𝑀lost,CS is the number of lost measures considering both packet loss and reconstruction errors.

We show in the next section that by choosing 𝑏CS = 𝑤 + 3 and 𝑚 = 𝑚∗/(1 − 𝑝𝑛) + 2 we can obtain 𝑃𝑅,CS ≈ 1.

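The per-measure accounting behind (28) reduces to a comparison against half a quantization step; a minimal sketch of this check (not the C++ simulator code) is as follows.

```python
# A measure counts as recovered only if the reconstruction error is within half a
# quantization step; otherwise it contributes to M_lost,CS in (28).
import numpy as np

def p_r_cs(x_raw, x_hat, delta_x=1.0):
    x_raw, x_hat = np.asarray(x_raw), np.asarray(x_hat)
    lost = int(np.sum(np.abs(x_hat - x_raw) > delta_x / 2))
    return 1.0 - lost / x_raw.size
```
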
Two reconstruction algorithms are considered for CS in 𝜌 0.1
this paper: ideal (i.e., oracle-based) reconstruction, where R 55 m
r 18 m
exact positions of nonzero values are assumed to be known
pe 0
at the sink node, and CoSaMP [27]. To distinguish among 𝜌c 0.55
them, we indicate their reliability as 𝑃𝑅,CSI and 𝑃𝑅,CoSaMP , M 50
respectively. f 0
Nr,DSC 1
m m∗
6. Simulations Results
Figure 3: ERF of data gathering techniques in the case of reliable
In this section, we compare data gathering techniques in network (𝑝𝑒 = 0), low variance (𝜎 ∈ {8–64}), and medium cor-
terms of energy consumptions and reliability. relation (𝜌𝑐 = 0.55).
The results have been obtained through a custom C++
simulator. For each set of parameters mean results are
reported considering 20 random topologies where nodes are The same results are obtained for different values of 𝑀,
uniformly distributed in a square area of size 600 × 600 [m2 ], that is, by changing the number of raw measures per packet.
with density 𝜌 [nodes/m2 ]. We also assume that 𝐸V events This can be easily justified by the fact that 𝐸RAW = 𝑁𝑚 𝑀𝑤𝑒𝑏
randomly occur in a faraway cluster (e.g., CL5 so that ℎ = 4 and 𝐸𝑋 = 𝐵𝑋 𝑒𝑏 are both proportional to 𝑀 so their ratio and
hops are needed to reach the sink) and that each event is therefore also the ERF (see (19)) are independent of 𝑀. So
detected by 𝑁𝑚 = 𝜌𝜋𝑟2 source nodes (where 𝑟 is the sense simulation results for different values of 𝑀 are not shown for
radius). the sake of space.
If not otherwise stated, 𝐸V = 300 events are considered On the basis of the previous results, we can state that in
and raw data are represented with 𝑤 = 12 bits (which is a the case of reliable networks TEC and DSC have a greater ERF
typical number for ADCs used in sensor networks). and therefore it seems that they should be preferred to the
6.1. ERF with Reliable Networks. To assess the simulator, we first analyzed an ideal (fully reliable) WSN and evaluated the ERF for different values of σ and ρ_c when the different data gathering techniques (TEC, CRT, DSC, and CS) are considered. In particular, for CS two cases have been considered: an ideal case (CSI) where the sparsity level s = k/N_m is fixed to the minimum value s = s_ideal such that F_{C,CS} = F_{C,ideal} (see (17) and (14)) and a second case (CS2) where a slightly greater value s′ = 1.2 · s_ideal is used.

As shown in Figures 3 and 4, the results obtained through the analytical model (see (20)) and those reported by the simulator are very close to each other for all the values of σ and ρ_c considered. These results confirm the validity of our model.

[Figure 3: ERF of data gathering techniques in the case of reliable network (p_e = 0), low variance (σ ∈ {8–64}), and medium correlation (ρ_c = 0.55). Curves: simulation and model results for TEC, DSC, CRT, CSI, and CS2. Simulation parameters: ρ = 0.1, R = 55 m, r = 18 m, p_e = 0, ρ_c = 0.55, M = 50, f = 0, N_{r,DSC} = 1, m = m*.]

The same results are obtained for different values of M, that is, by changing the number of raw measures per packet. This can be easily justified by the fact that E_RAW = N_m M w e_b and E_X = B_X e_b are both proportional to M, so their ratio, and therefore also the ERF (see (19)), is independent of M (a short derivation is reported below). Simulation results for different values of M are therefore not shown for the sake of space.

On the basis of the previous results, we can state that in the case of reliable networks TEC and DSC have a greater ERF, and therefore it seems that they should be preferred to the other analyzed data gathering techniques. However, as shown in the next section, this is no longer true when reliability is an issue.

Note that in our simulations ERF_DSC and ERF_TEC are almost the same because we fixed the same values for the spatial and temporal correlation coefficients. Obviously, when the spatial and temporal correlations are not the same, different results can be obtained.

Finally, note that although CRT appears to be the worst in terms of ERF, it is the only data gathering technique where an actual compression algorithm (MinDiff) has been considered (for all the other techniques, ideal compression factors have been assumed). So the previous simulation results allow quantifying the penalty of using a simple compression algorithm (MinDiff) instead of more complex techniques (at least when Gaussian data are considered).
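The independence of M can be checked explicitly. Assuming only that the overall number of transmitted bits can be written as B_X = M β_X for some factor β_X that does not depend on M (our shorthand, consistent with B_X being proportional to M), the energy ratio that enters (19) reduces to

\[
\frac{E_X}{E_{\mathrm{RAW}}} = \frac{B_X e_b}{N_m M w e_b} = \frac{M \beta_X}{N_m M w} = \frac{\beta_X}{N_m w},
\]

which contains no M; hence any ERF defined through this ratio, as in (19), takes the same value for every M.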

[Figure 4: ERF of data gathering techniques in the case of reliable network (p_e = 0), high variance (σ ∈ {32–256}), and high correlation (ρ_c = 0.97). Curves: simulation and model results for TEC, DSC, CRT, CSI, and CS2. Simulation parameters: ρ = 0.1, R = 55 m, r = 18 m, p_e = 0, ρ_c = 0.97, M = 50, f = 0, N_{r,DSC} = 1, m = m*.]

[Figure 5: ERF of data gathering techniques in the case of reliable network (p_e = 0) and lower sensing radius (r = 12 m). Curves: simulation and model results for TEC, DSC, CRT, CSI, and CS2. Simulation parameters: ρ = 0.1, R = 55 m, r = 12 m, p_e = 0, ρ_c = 0.55, M = 50, f = 0, N_{r,DSC} = 1, m = m*.]

By comparing Figures 3 and 5, we can see that CRT and CS achieve different values of ERF when the sensing radius r changes. In particular, ERF_CSI is no longer able to reach the same values as ERF_DSC for low values of r.

This can be justified by considering that when r decreases, the sparsity level s ∝ 1/N_m increases and, as a consequence, the compression factor F_CS decreases too (see (17)).

It is worth noting that our analytical model is able to anticipate this result.

Also in the case of CRT, the ERF slightly decreases for lower values of r, due to the fact that when N_m decreases the overhead of the MinDiff algorithm is more relevant and F_{C,MinDiff} is not able to approach F_{C,ideal} (see (18)).

Similar considerations can be made about the node density ρ. In fact, as can be observed by comparing Figures 3 and 6, ERF_CSI and ERF_CRT decrease for lower values of ρ (i.e., lower values of N_m).

It is worth noting that Figures 6 and 5 report quite similar ERF values despite different values of ρ and r being used for simulation. This result can be justified by considering that network density and sensing radius have been changed without altering the overall number of source nodes (i.e., N_m = π · ρ · r² ≈ 45 in both cases).

6.2. ERF with Unreliable Networks. In Figures 7 and 8, the reliability and the ERF of the data gathering techniques for p_e = 0.01 and different values of σ are reported.

On the basis of the simulation results, we can state that even in the case of unreliable networks TEC and DSC have a greater ERF (see Figure 8). However, their reliability is fully determined by the packet loss probability and cannot be improved; instead, by using CRT and CS, higher reliability can be achieved by increasing f and m, respectively.

In particular, as shown in Figure 7, by fixing f = 8 a reliability higher than 0.975 can be achieved for CRT, and even higher values can be obtained using CS when m = m*/(1 − p_n) + 2 CS measures are sent.

It is important to note that the reliability plotted for CS is the actual reliability obtained after reconstruction with an ideal (oracle-based) reconstruction algorithm (i.e., P_{R,CSI} and not P̃_{R,CS}).

By choosing m so that m = m*/(1 − p_n) + 2, the sink receives, with high probability (i.e., P̃_{R,CS} > 0.99), at least m* CS measures, that is, the minimum number of measures sufficient for reconstruction, and therefore the original measurements can be perfectly recovered (so that P_{R,CS} = 1). In the next subsection, we will show that this is true only if b_CS = w + 3, as chosen for our simulations.
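The dimensioning rule for m used above can be captured in a one-line helper. The following C++ fragment is an illustration of the rule only (the names m_star and p_n are chosen here): it sizes m so that, despite a loss probability p_n, at least m* measures reach the sink with high probability.

#include <cmath>

// m = m*/(1 - p_n) + 2, rounded up: the margin makes it very likely that at
// least m_star of the transmitted CS measures survive the losses.
int cs_measurements_to_send(int m_star, double p_n)
{
    return static_cast<int>(std::ceil(m_star / (1.0 - p_n))) + 2;
}

For instance, with a hypothetical m* = 30 and p_n = 0.05, the nodes would transmit ceil(31.58) + 2 = 34 measures.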
[Figure 6: ERF of data gathering techniques in the case of reliable network (p_e = 0) and lower density (ρ = 0.045). Curves: simulation and model results for TEC, DSC, CRT, CSI, and CS2. Simulation parameters: ρ = 0.045, R = 55 m, r = 18 m, p_e = 0, ρ_c = 0.55, M = 50, f = 0, N_{r,DSC} = 1, m = m*.]

[Figure 7: Reliability of data gathering techniques for p_e = 0.01. Curves: simulation and model results for TEC, DSC, and CRT, plus CSI and CS2. Simulation parameters: ρ = 0.1, R = 55 m, r = 18 m, p_e = 0.01, ρ_c = 0.55, M = 50, f = 8, N_{r,DSC} = 3, m = m*/(1 − p_n) + 2.]

Obviously, high reliability is achieved at the cost of a lower ERF but, by comparing Figures 8 and 3, we can state that the impact on the ERF is quite low (both ERF_CRT and ERF_CS decrease by a few percent).

As a consequence, when high reliability is needed even with unreliable networks, CRT and CS should be preferred.

6.3. CS Reliability. In all the previous simulations, we have fixed the number of bits b_CS used for representing quantized CS measures equal to w + 3. A careful reader could observe that this choice is questionable and that by reducing b_CS higher values of ERF_CS can be obtained. This consideration is partially true: effectively, ERF_CS increases for lower values of b_CS, but our simulation results show that with values of b_CS below w + 3 it is not possible to obtain perfect reconstruction (i.e., P_{R,CS} = 1).

To convince the reader, in Figure 9 we report simulation results on the actual reliability P_{R,CS} for different values of b_CS and sparsity levels when w = 12. Simulation parameters are those reported in Figure 3. As can be observed, when b_CS < 15 the reliability quickly decreases. Similar results have been obtained for different values of w.

However, in practice, the actual reliability depends also on the reconstruction algorithm used. For the sake of completeness, we report in Figure 10 the CS reliability when the CoSaMP [27] algorithm is used for reconstruction. As can be observed, also in this case b_CS = 15 is needed for achieving high reliability. Nevertheless, perfect reliability is not obtained with CoSaMP when m = m*. So in some cases CRT could be preferred to CS because its reliability can be better predicted.

6.4. ERF_max and Network Lifetime. The ERF metric is an expression of mean energy consumptions; network lifetime, instead, is more closely related to maximum energy consumptions.

In Figure 11, we report ERF_max instead of ERF for the different data gathering techniques. ERF_max is evaluated on the basis of (22), that is, considering the maximum energy consumptions for nodes belonging to cluster CL_2. Maximum energies have greater variations than mean values, so in order to have higher confidence we increased the number of events to E_V = 3000.
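The distinction between the two metrics can be illustrated with a small helper (a sketch with illustrative names only; the exact definitions remain those of (19) and (22)): the mean per-node energy drives the average ERF, whereas the node with the maximum consumption is the first to deplete its battery and therefore limits network lifetime.

#include <algorithm>
#include <numeric>
#include <vector>

struct EnergyStats { double mean; double max; };

// Mean and maximum per-node energy spent for the observed events (node_energy
// is assumed to be nonempty): the mean relates to the ERF, while the maximum
// relates to ERF_max and hence to network lifetime.
EnergyStats per_node_energy_stats(const std::vector<double>& node_energy)
{
    const double sum = std::accumulate(node_energy.begin(), node_energy.end(), 0.0);
    const double max = *std::max_element(node_energy.begin(), node_energy.end());
    return EnergyStats{sum / static_cast<double>(node_energy.size()), max};
}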
[Figure 8: ERF of data gathering techniques for p_e = 0.01. Curves: simulation and model results for TEC, DSC, CRT, CSI, and CS2. Simulation parameters: ρ = 0.1, R = 55 m, r = 18 m, p_e = 0.01, ρ_c = 0.55, M = 50, f = 8, N_{r,DSC} = 3, m = m*/(1 − p_n) + 2.]

[Figure 9: P_{R,CSI} for different values of b_CS and sparsity levels when w = 12. Curves: CSI and CS2 for σ = 8, 16, 32, and 64, with b_CS ∈ {13, 14, 15}.]

[Figure 10: P_{R,CoSaMP} for different values of b_CS and σ. Curves: CSI (b_CS = 15) and CoSaMP with b_CS = 15, 14, and 13.]

[Figure 11: ERF_max of data gathering techniques for p_e = 0.01. Curves: TEC, DSC, CRT, CSI, and CS2. Simulation parameters: ρ = 0.1, R = 55 m, r = 18 m, p_e = 0.01, ρ_c = 0.55, M = 50, f = 8, N_{r,DSC} = 3, m = m*/(1 − p_n) + 2.]

As can be observed, TEC and DSC have a higher ERF_max, and therefore a greater network lifetime can be expected when DSC and TEC are used for data gathering.

Finally, note that CRT and CS have similar performance if the actual sparsity degree of CS is 20% more than the minimum (ideal) value. These observations can be extended to all the previous simulations as well.
7. Conclusions and Future Works

In this paper, we have compared several data gathering techniques used in WSNs by using both simulation results and analytical models. In particular, the effectiveness of the above techniques has been investigated in terms of reliability (packet loss and reconstruction errors) and energy efficiency (i.e., ERF and network lifetime) by systematically sampling the parameter space (i.e., number of nodes, transmission range, and sparsity). Basically, we can summarize our results as follows:

(i) DSC and TEC techniques should be preferred for maximizing network lifetime.
(ii) CS should be preferred when high reliability is needed.
(iii) CRT should be preferred for its inherent low complexity.

As a consequence, we can state that there is no best solution for all possible applications and that only the trade-off between energy consumptions, reliability, and complexity can drive the choice of the data gathering technique to be used for a specific application.

As future work, we plan to refine and improve the model to deal with actual correlated measurements (i.e., not only Gaussian data) and more realistic propagation channels (i.e., by taking into account the actual distance between nodes).

Notations

TEC: Transform and Encoding-Based Compression
CS: Compressive sensing
DSC: Distributed source coding
CRT: Chinese Remainder Theorem
X: Generic data gathering technique; that is, X ∈ {TEC, CS, DSC, CRT}
U = {x_i}: A set of uncompressed values (raw measurements)
w: Number of bits used to represent a raw value (without compression)
M: Number of measurements/words per packet
N_m: Number of source nodes
X: Matrix of N_m × M raw data of w bits each
x_i: ith column of the matrix X (measures taken by the N_m nodes almost at the same time)
F_X: Compression factor of the data gathering technique X ∈ {TEC, CS, DSC, CRT}
b_X: Number of bits used to represent a compressed measure with data gathering technique X
L_X: Number of bits per packet when data gathering technique X is used
B_X: Overall number of bits transmitted for each event when the data gathering technique X is used
ERF_X: Energy reduction factor of the data gathering technique X ∈ {TEC, CS, DSC, CRT}
P_X: Reliability of the data gathering technique X ∈ {TEC, CS, DSC, CRT}
μ: Minimum value within the set {x_i}; that is, μ = min{x_i}
d_i: Difference with the minimum element of {x_i}; that is, d_i = x_i − μ
y_i: Compressed sensing measurement
m: Number of CS measurements
m*: Minimum/sufficient number of CS measurements for reconstruction
k: Number of nonzero components of a k-sparse signal
s: Sparsity level of CS measurements; that is, s = k/n
ρ_c: Correlation coefficient of raw measurements
σ: Standard deviation of raw measurements
R: Maximum transmission distance (coverage radius)
r: Sense radius of a node
N_CRT: Number of CRT relayers
G: Grid size (edge in meters)
ρ: Node density
e_b: Energy spent for each transmitted bit
H: Shannon's entropy
N_{r,DSC}: Number of retransmissions for the cluster/cell head when DSC is used
CL_h: Cluster number h
h: Number of hops between nodes in clusters CL_{h+1} and CL_h
E_V: Number of events used for simulations.

Competing Interests

The authors declare that they have no competing interests.

References

[1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “A survey on sensor networks,” IEEE Communications Magazine, vol. 40, no. 8, pp. 102–105, 2002.
[2] Z. Xiong, A. D. Liveris, and S. Cheng, “Distributed source coding for sensor networks,” IEEE Signal Processing Magazine, vol. 21, no. 5, pp. 80–94, 2004.
[3] D. Zordan, B. Martinez, I. Vilajosana, and M. Rossi, “On the performance of lossy compression schemes for energy constrained sensor networking,” ACM Transactions on Sensor Networks, vol. 11, no. 1, article 15, 2014.
[4] G. Campobello, O. Giordano, A. Segreto, and S. Serrano, “Comparison of local lossless compression algorithms for wireless sensor networks,” Journal of Network and Computer Applications, vol. 47, pp. 23–31, 2015.
[5] J. Chou, D. Petrovic, and K. Ramachandran, “A distributed and adaptive signal processing approach to reducing energy consumption in sensor networks,” in Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications (INFOCOM ’03), vol. 2, pp. 1054–1062, IEEE, San Francisco, Calif, USA, March-April 2003.
[6] X. He, X. Zhou, M. Juntti, and T. Matsumoto, “Data and error rate bounds for binary data gathering wireless sensor networks,” in Proceedings of the IEEE 16th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC ’15), pp. 505–509, Stockholm, Sweden, June 2015.
[7] C. Karakus, A. C. Gurbuz, and B. Tavli, “Analysis of energy efficiency of compressive sensing in wireless sensor networks,” IEEE Sensors Journal, vol. 13, no. 5, pp. 1999–2008, 2013.
[8] H. Zheng, S. Xiao, X. Wang, X. Tian, and M. Guizani, “Capacity and delay analysis for data gathering with compressive sensing in wireless sensor networks,” IEEE Transactions on Wireless Communications, vol. 12, no. 2, pp. 917–927, 2013.
[9] H. Zheng, F. Yang, X. Tian, X. Gan, X. Wang, and S. Xiao, “Data gathering with compressive sensing in wireless sensor networks: a random walk based approach,” IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 35–44, 2015.
[10] L. Zhu, B. Ci, Y. Liu, and Z. D. Chen, “Data gathering in wireless sensor networks based on reshuffling cluster compressed sensing,” International Journal of Distributed Sensor Networks, vol. 2015, 13 pages, 2015.
[11] E. J. Candes and T. Tao, “Near-optimal signal recovery from random projections: universal encoding strategies?” IEEE Transactions on Information Theory, vol. 52, no. 12, pp. 5406–5425, 2006.
[12] D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
[13] T. Srisooksai, K. Keamarungsi, P. Lamsrichan, and K. Araki, “Practical data compression in wireless sensor networks: a survey,” Journal of Network and Computer Applications, vol. 35, no. 1, pp. 37–59, 2012.
[14] J.-H. Chang and L. Tassiulas, “Maximum lifetime routing in wireless sensor networks,” IEEE/ACM Transactions on Networking, vol. 12, no. 4, pp. 609–619, 2004.
[15] Y. Wu, Z. Mao, S. Fahmy, and N. B. Shroff, “Constructing maximum-lifetime data-gathering forests in sensor networks,” IEEE/ACM Transactions on Networking, vol. 18, no. 5, pp. 1571–1584, 2010.
[16] G. Campobello, A. Leonardi, and S. Palazzo, “Improving energy saving and reliability in wireless sensor networks using a simple CRT-based packet-forwarding solution,” IEEE/ACM Transactions on Networking, vol. 20, no. 1, pp. 191–205, 2012.
[17] D. Gong and Y. Yang, “Low-latency SINR-based data gathering in wireless sensor networks,” IEEE Transactions on Wireless Communications, vol. 13, no. 6, pp. 3207–3221, 2014.
[18] C. Hua and T.-S. P. Yum, “Optimal routing and data aggregation for maximizing lifetime of wireless sensor networks,” IEEE/ACM Transactions on Networking, vol. 16, no. 4, pp. 892–903, 2008.
[19] G. Campobello, S. Serrano, L. Galluccio, and S. Palazzo, “Applying the Chinese remainder theorem to data aggregation in wireless sensor networks,” IEEE Communications Letters, vol. 17, no. 5, pp. 1000–1003, 2013.
[20] X. Xing, D. Xie, and G. Wang, “Energy-balanced data gathering and aggregating in WSNs: a compressed sensing scheme,” International Journal of Distributed Sensor Networks, vol. 2015, Article ID 585191, 10 pages, 2015.
[21] C. Lim, “Network coding for severe packet reordering in multihop wireless networks,” International Journal of Distributed Sensor Networks, vol. 2015, Article ID 379108, 9 pages, 2015.
[22] M. Rossi, M. Hooshmand, D. Zordan, and M. Zorzi, “Evaluating the gap between compressive sensing and distributed source coding in WSN,” in Proceedings of the International Conference on Computing, Networking and Communications (ICNC ’15), pp. 911–917, IEEE, Garden Grove, Calif, USA, February 2015.
[23] F. Marcelloni and M. Vecchio, “An efficient lossless compression algorithm for tiny nodes of monitoring wireless sensor networks,” The Computer Journal, vol. 52, no. 8, pp. 969–987, 2009.
[24] Y. Liang and W. Peng, “Minimizing energy consumptions in wireless sensor networks via two-modal transmission,” ACM SIGCOMM Computer Communication Review, vol. 40, no. 1, pp. 12–18, 2010.
[25] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM Review, vol. 43, no. 1, pp. 129–159, 2001.
[26] J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Transactions on Information Theory, vol. 53, no. 12, pp. 4655–4666, 2007.
[27] J. A. Tropp and D. Needell, “CoSaMP: iterative signal recovery from incomplete and inaccurate samples,” Applied and Computational Harmonic Analysis, vol. 26, no. 3, pp. 301–321, 2009.
[28] E. Candes and M. Wakin, “An introduction to compressive sampling,” IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 21–30, 2008.
[29] V. Chandrasekaran, B. Recht, P. A. Parrilo, and A. S. Willsky, “The convex geometry of linear inverse problems,” Foundations of Computational Mathematics, vol. 12, no. 6, pp. 805–849, 2012.
[30] X. Wang, Z. Zhao, Y. Xia, and H. Zhang, “Compressed sensing for efficient random routing in multi-hop wireless sensor networks,” in Proceedings of the IEEE GLOBECOM Workshops (GC Wkshps ’10), pp. 266–271, IEEE, Miami, Fla, USA, December 2010.
[31] A. Razi, K. Yasami, and A. Abedi, “On minimum number of wireless sensors required for reliable binary source estimation,” in Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC ’11), pp. 1852–1857, IEEE, Cancun, Mexico, March 2011.
[32] S. S. Pradhan and K. Ramchandran, “Distributed source coding using syndromes (DISCUS): design and construction,” in Proceedings of the Data Compression Conference (DCC ’99), pp. 158–167, March 1999.
[33] J.-H. Chang and L. Tassiulas, “Routing for maximum system lifetime in wireless ad-hoc networks,” in Proceedings of the 37th Annual Allerton Conference on Communication Control and Computing, pp. 1191–1200, Monticello, Ill, USA, September 1999.
[34] J.-H. Chang and L. Tassiulas, “Energy conserving routing in wireless ad-hoc networks,” in Proceedings of the 19th IEEE Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM ’00), vol. 1, pp. 22–31, March 2000.
[35] S. Guo, Y. Yang, and C. Wang, “DaGCM: a concurrent data uploading framework for mobile data gathering in wireless sensor networks,” IEEE Transactions on Mobile Computing, vol. 15, no. 3, pp. 610–626, 2016.
[36] S. Guo, C. Wang, and Y. Yang, “Joint mobile data gathering and energy provisioning in wireless rechargeable sensor networks,” IEEE Transactions on Mobile Computing, vol. 13, no. 12, pp. 2836–2852, 2014.
[37] M. Zhao, J. Li, and Y. Yang, “A framework of joint mobile energy replenishment and data gathering in wireless rechargeable sensor networks,” IEEE Transactions on Mobile Computing, vol. 13, no. 12, pp. 2689–2705, 2014.
[38] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, “Network information flow,” IEEE Transactions on Information Theory, vol. 46, no. 4, pp. 1204–1216, 2000.
[39] C. Fragouli, J.-Y. Le Boudec, and J. Widmer, “Network coding: an instant primer,” ACM SIGCOMM Computer Communication Review, vol. 36, no. 1, pp. 63–68, 2006.
[40] P. Ostovari, J. Wu, and A. Khreishah, Network Coding Techniques for Wireless and Sensor Networks, Springer, Berlin, Germany, 2013.
[41] S. Chachulski, M. Jennings, S. Katti, and D. Katabi, “Trading structure for randomness in wireless opportunistic routing,” ACM SIGCOMM Computer Communication Review, vol. 37, no. 4, pp. 169–180, 2007.
[42] I.-H. Hou, Y.-E. Tsai, T. F. Abdelzaher, and I. Gupta, “AdapCode: adaptive network coding for code updates in wireless sensor networks,” in Proceedings of the 27th IEEE Communications Society Conference on Computer Communications (INFOCOM ’08), pp. 2189–2197, Phoenix, Ariz, USA, April 2008.
[43] Z. Yang, M. Li, and W. Lou, “R-code: network coding based reliable broadcast in wireless mesh networks with unreliable links,” in Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM ’09), pp. 1–6, IEEE, Honolulu, Hawaii, USA, November 2009.
[44] T. Ho, R. Koetter, M. Medard, D. R. Karger, and M. Effros, “The benefits of coding over routing in a randomized setting,” in Proceedings of the IEEE International Symposium on Information Theory, Yokohama, Japan, July 2003.
[45] T. Ho, M. Medard, R. Koetter et al., “A random linear network coding approach to multicast,” IEEE Transactions on Information Theory, vol. 52, no. 10, pp. 4413–4430, 2006.
[46] T. Voigt, U. Roedig, O. Landsiedel, K. Samarasinghe, and M. B. Prasad, “On the applicability of network coding in wireless sensor networks,” ACM SIGBED Review, vol. 9, no. 3, pp. 46–48, 2012.
[47] G. Campobello, A. Leonardi, and S. Palazzo, “On the use of Chinese remainder theorem for energy saving in wireless sensor networks,” in Proceedings of the IEEE International Conference on Communications (ICC ’08), pp. 2723–2727, Beijing, China, May 2008.
[48] A. Leonardi, G. Campobello, S. Serrano, and S. Palazzo, “Trade-offs between energy saving and reliability in low duty cycle wireless sensor networks using a packet splitting forwarding technique,” EURASIP Journal on Wireless Communications and Networking, vol. 2010, Article ID 932345, 2010.
[49] G. Campobello, A. Leonardi, and S. Palazzo, “A novel reliable and energy-saving forwarding technique for wireless sensor networks,” in Proceedings of the 10th ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc ’09), pp. 269–278, New Orleans, La, USA, May 2009.
[50] T. A. A. Alsbouí, M. Hammoudeh, Z. Bandar, and A. Nisbet, “An overview and classification of approaches to information extraction in wireless sensor networks,” in Proceedings of the 5th International Conference on Sensor Technologies and Applications (SENSORCOMM ’11), pp. 255–260, IARIA, Nice, France, August 2011.
[51] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, New York, NY, USA, 1991.