HET-NETs 2010

ISBN 978-83-926054-4-7
pp. 419-428




Self healing in wireless mesh networks by channel switching

KRZYSZTOF GROCHLA KRZYSZTOF STASIAK


Proximetry Poland Sp. z o.o.
Katowice, Al. Roździeńskiego 91
{kgrochla|kstasiak}@proximetry.pl



Abstract: Wireless mesh networking is a technology for building very reliable and
self-organizing wireless networks. The self-healing algorithms in mesh networks
automatically adapt and repair the network in response to failures or other
transmission problems. In this paper we present a novel algorithm for automatic
channel switching in a wireless mesh network, which helps to react to interference
blocking the data transmission. The link state is constantly monitored using the
received signal strength (RSSI) and the bit error rate. When problems are detected,
one of the nodes starts the channel switching procedure and the other mesh nodes
follow it by monitoring the packets received from non-orthogonal channels. The
method is discussed in detail, together with some measurement results from a
sample implementation on the OpenWRT platform.
Keywords: wireless mesh networks, reliability, self-healing, channel assignment.


1. Introduction

For the last few years we have been experiencing a rapid growth of interest in
mobile ad-hoc networking. Wireless mesh networks, comprised of nodes with
multiple radio interfaces routing the packets, are a promising technology, for
example for broadband residential internet access or for providing connectivity
at temporary events. Wireless mesh networks (WMNs) consist of mesh routers (nodes)
and mesh clients, where mesh routers have minimal mobility and form the backbone
of WMNs [1]. They provide network access for both mesh and conventional clients.
The links in a WMN may use a single or multiple wireless technologies. A single mesh
node may be equipped with one or multiple wireless interfaces. A WMN is
dynamically self-organized and self-configured, with the nodes in the network
automatically establishing and maintaining mesh connectivity among themselves
(creating, in effect, an ad hoc network). This feature brings many advantages to
WMNs such as low up-front cost, easy network maintenance, robustness, and
reliable service coverage.
In order to simplify network deployment, autoconfiguration procedures
providing automatic network start-up with minimum manual configuration of the
nodes are increasingly important [5]. To maximize the utilization of radio resources,
efficient algorithms that select the optimal channel for the current radio
propagation conditions are required [2]. Algorithms that manage quality of service
resource reservation can greatly increase the usability of the network. All these
algorithms are being developed within the EU-MESH project [3], which aims to create
novel configuration procedures, resource management, QoS routing, mobility support
and self-healing algorithms that achieve efficient usage of both the wireless
spectrum and fixed broadband access lines. In this paper we try to extend these
algorithms with self-healing procedures that give the network methods to
automatically react to interference in data transmission.
The self-healing procedures inside the mesh network provide methods for
repairing the network connectivity in response to failures or interferences. They
should continuously monitor the state of the network and reconfigure the wireless
interfaces when misbehaviour is detected.
The self-healing actions may be executed on two levels:
- locally, on the mesh node,
- centrally, on the network management server.
In this paper we concentrate on locally executed actions, which allow for
a very fast reaction to communication problems. The agent working
on the mesh device constantly monitors the required parameters and statistics of
the mesh node. When an anomaly is detected, a local repair action may be triggered.
The actions are executed in a distributed manner on the mesh nodes, without global
coordination and without running a communication protocol. We have developed
the distributed actions for channel switching.
Self-healing actions limit the total cost of ownership and minimize the time
when the network is not operational by automatically reconfiguring the
network in response to failures or lowered network performance. The drop in
performance is often caused by interference, especially in unlicensed bands where
many other devices may transmit on the same channels. Most of the works related
to interference in WMNs take it into account as a part of the routing metric (see
e.g. [6], [7]). Unfortunately, this method does not avoid the interference by
reconfiguring the devices; the traffic is only rerouted over other links.
Another common method is to use a channel assignment algorithm to select
channels which experience low interference [8], [9]. This provides methods for
selecting the optimal channel for the wireless links, but typically the channel
assignment algorithm works periodically, while interference may appear at any
moment of the transmission. In this work we try to combine interference-aware
channel selection with monitoring of the link quality to provide an automatic,
local procedure that changes the channel when interference appears.
In this work we concentrate on wireless mesh networks built from devices
having multiple IEEE 802.11 b/g interfaces. The solution has been implemented
and tested on Mikrotik Routerboard RB532 devices with OpenWRT Linux
software installed. We assume that the mesh nodes and antennas are fixed. In
such a network the radio signal propagation changes mainly due to
interference, e.g. from neighbouring transmissions on overlapping frequencies, or
due to a change in the physical signal propagation conditions, e.g. the appearance
of a new obstacle on the link.

2. Triggering the self-healing action

The self-healing procedures are able to repair the mesh network. However,
before the repair starts, a decision must be made about when the self-healing
action should take place. In a classic, manually managed mesh network both this
decision and the repair action are performed manually by the network operator. In
self-healing mesh networks the devices should trigger the automatic repair action
when the network does not perform as well as it used to or as well as it should.
The signal to trigger the action comes from the observation of the network
performance.
We propose that the self-healing actions are triggered in three cases:
- when a failure of some of the network elements is observed,
- when the observed performance is lower than is typical for this network,
- when the observed performance is lower than a preconfigured threshold.
The statistics representing the current performance of the mesh network need
to be compared not only to the preconfigured values, but also to values
representing the observed average performance. The self-healing action will be
triggered not only when the observed value is worse than the preconfigured
threshold, but also when it is worse than the low-pass filtered value.
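
The three trigger cases can be combined into a single check. The following is a minimal sketch under stated assumptions, not the implementation used in this work: the function name, its arguments and the margin parameter are hypothetical, higher values of the statistic are assumed to mean better performance, and the low-pass filtered value is assumed to be computed elsewhere.

```python
def should_trigger_self_healing(element_failed, observed, filtered_average,
                                configured_threshold, margin=0.2):
    """Return True if any of the three trigger cases applies.

    Assumptions (not taken from the paper): higher `observed` values mean
    better performance, and `margin` is a hypothetical tolerance, i.e. the
    observation must fall more than this fraction below the low-pass
    filtered average before it counts as "lower than typical".
    """
    # Case 1: a failure of a network element was observed.
    if element_failed:
        return True
    # Case 3: performance is below the preconfigured threshold.
    if observed < configured_threshold:
        return True
    # Case 2: performance is below what is typical for this network.
    return observed < filtered_average * (1.0 - margin)
```

The margin is there only to keep small fluctuations around the average from triggering a repair; how large it should be is a tuning question that this sketch does not answer.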

2.1 Detection of the link quality degradation

The goal is to use algorithms that detect a rapid drop in the wireless link
quality. The detection is later used as a trigger for the wireless mesh network
management procedures to perform a self-healing action.
To detect a drop in link quality, the received signal strength must be
monitored constantly. The monitoring module must collect the data
reported by the network interface card driver, analyze the quality of the received
signal and apply some filtering or other methods of statistical analysis. The
output of this module is a binary value: 1 if the quality of the link remains
stable, 0 otherwise.
The Linux wireless network drivers use the Received Signal Strength Indicator
(RSSI) as a value representing the received radio signal strength (energy integral,
not the quality). In an IEEE 802.11 system RSSI is the received signal strength in
a wireless environment, in arbitrary units. RSSI can be used internally in
a wireless networking card to determine when the amount of radio energy in the
channel is below a certain threshold, at which point the network card is clear to
send (CTS). Once the card is clear to send, a packet of information can be sent.
The end-user will likely observe an RSSI value when measuring the signal
strength of a wireless network with a wireless network monitoring
tool like Network Stumbler.
In MadWiFi, the reported RSSI for each packet is actually equivalent to the
Signal-to-Noise Ratio (SNR) and hence we can use the terms interchangeably.
This does not necessarily hold for other drivers, though. The reason is that the RSSI
reported by the MadWiFi HAL is a value in dB that specifies the difference
between the signal level and the noise level for each packet. Hence the driver
calculates a packet's absolute signal level by adding the RSSI to the absolute
noise level. In general, an RSSI of 10 or less represents a weak signal, although
the chips can often decode low bit-rate signals down to -94 dBm. An RSSI of 20
or so is decent. An RSSI of 40 or more is very strong and will easily support both
54 Mbit/s and 108 Mbit/s operation.
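
The relation between the reported RSSI and the absolute signal level can be written down directly. The sketch below is illustrative only: the noise floor of -95 dBm is an assumed value, the function names are hypothetical, and the "marginal" class for values between 10 and 20 is our own label for a range the description above leaves open.

```python
# Assumed noise floor used only for this illustration; a real driver
# reports or calibrates its own value.
ASSUMED_NOISE_FLOOR_DBM = -95.0


def absolute_signal_dbm(rssi, noise_floor_dbm=ASSUMED_NOISE_FLOOR_DBM):
    """Absolute signal level = noise level + RSSI (RSSI taken as SNR in dB)."""
    return noise_floor_dbm + rssi


def classify_rssi(rssi):
    """Rough classes following the rule of thumb in the text."""
    if rssi <= 10:
        return "weak"
    if rssi >= 40:
        return "very strong"
    if rssi >= 20:
        return "decent"
    return "marginal"  # hypothetical label for the unspecified 10..20 range
```

With the assumed noise floor, an RSSI of 20 corresponds to an absolute signal level of -75 dBm.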
The RSSI may fluctuate over time, and very short interference may cause
a rapid change in the RSSI value which does not mean link breakage. To
detect link loss reliably, some kind of low-pass filter must be used to
ignore very short fluctuations of the RSSI and avoid false detections. On the
other hand, whenever the link quality goes down for a period of a few seconds, it is
certainly a situation which should be detected.
We evaluated 3 simple methods to monitor the RSSI values and decide
whether a link quality drop has appeared:
- a threshold value on a simple moving average,
- the result of subtraction between short and long simple moving averages,
- a modified moving average.
The link quality degradation triggers the action, which is run as a procedure
on the mesh devices and reconfigures the node according to some
preconfigured parameters.


2.2.1 Simple moving average

This formula uses an averaging window aw. The window contains the latest RSSI
values from aw periods. The average of these values is compared with the current
value of the RSSI. At time k an anomaly is detected if the relative change of the
current RSSI value with respect to the average exceeds the threshold value h.
$$\mathit{Ma\_alarm}_k = \mathrm{Boolean}\left(\frac{\frac{1}{aw}\sum_{i=k-aw}^{k-1} RSSI_i - RSSI_k}{\frac{1}{aw}\sum_{i=k-aw}^{k-1} RSSI_i} > h\right)$$

where
$aw$ - size of the averaging window,
$RSSI_i$ - value of the RSSI at time i.
The simple moving average filters out the values which have a random character
within the window aw. As a result we get the average level of the RSSI value. By
comparing the calculated moving average with the current RSSI value we get
information about rapid RSSI changes. Unfortunately, when the decrease of the RSSI
value has an impulse character, this algorithm may wrongly report it as an anomaly.
The size of the averaging window is a parameter of the algorithm and can be tuned
to provide a lower or higher cutoff frequency of the filter.
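
A minimal sketch of this detector is given below. It assumes that the alarm is raised when the current RSSI falls below the average of the previous aw samples by more than the fraction h of that average, and that no alarm is raised until the window is full; the class name and the sample values in the comments are illustrative and are not taken from the implementation described in this paper.

```python
from collections import deque


class SmaDropDetector:
    """Compare each new RSSI sample with the average of the previous aw samples."""

    def __init__(self, aw, h):
        if aw < 1:
            raise ValueError("aw must be at least 1")
        self.h = h
        self.window = deque(maxlen=aw)  # holds RSSI_{k-aw} .. RSSI_{k-1}

    def update(self, rssi):
        """Feed RSSI_k and return True if an anomaly is detected at time k."""
        alarm = False
        if len(self.window) == self.window.maxlen:
            average = sum(self.window) / len(self.window)
            if average > 0:
                # Relative drop of the current value against the window average.
                alarm = (average - rssi) / average > self.h
        self.window.append(rssi)
        return alarm


detector = SmaDropDetector(aw=4, h=0.3)
alarms = [detector.update(value) for value in [30, 30, 30, 30, 29, 12]]
# Only the last sample (12 against an average of 29.75) raises the alarm.
```

Because the current sample is compared directly with the average, a single low sample is enough to raise the alarm, which is the impulse sensitivity mentioned above.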

2.2.2 Result of subtraction between short and long simple moving averages

$$ML_k = \frac{1}{wl}\sum_{i=k-wl}^{k-1} RSSI_i$$

$$MS_k = \frac{1}{ws}\sum_{i=k-ws}^{k-1} RSSI_i$$

$$\mathit{SL\_alarm}_k = \mathrm{Boolean}\left(\frac{ML_k - MS_k}{ML_k} > h\right)$$

This formula uses two simple moving averages. ML is a long simple
moving average over wl samples. MS is a short simple moving average, which uses
a much smaller averaging window ws. An anomaly is detected when MS falls below ML
by more than the fraction h of the ML value. This formula reacts to RSSI value
changes more slowly than the simple moving average.
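
The slower reaction can be illustrated with a short sketch. It assumes that the alarm is raised when the short average MS falls below the long average ML by more than the fraction h of ML, and that both averages at time k are computed from the samples up to time k-1. The function name and the sample series are illustrative only.

```python
from collections import deque


def sl_alarms(samples, ws, wl, h):
    """Return one Boolean per sample: short average dropped below long average."""
    if not 1 <= ws <= wl:
        raise ValueError("require 1 <= ws <= wl")
    history = deque(maxlen=wl)  # the last wl samples before the current one
    alarms = []
    for rssi in samples:
        alarm = False
        if len(history) == wl:
            values = list(history)
            ml = sum(values) / wl        # long moving average ML_k
            ms = sum(values[-ws:]) / ws  # short moving average MS_k
            if ml > 0:
                alarm = (ml - ms) / ml > h
        alarms.append(alarm)
        history.append(rssi)
    return alarms


sustained_drop = [30] * 8 + [10] * 4
single_impulse = [30] * 8 + [10] + [30] * 3
drop_alarms = sl_alarms(sustained_drop, ws=2, wl=8, h=0.3)
impulse_alarms = sl_alarms(single_impulse, ws=2, wl=8, h=0.3)
```

With these example values the sustained drop, which starts at index 8, first raises the alarm at index 10, while the single low sample raises no alarm at all. The price of ignoring impulses is therefore a detection delay of about ws samples.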

2.2.3 Modified moving average

We propose to use a modified moving average formula to simplify the computation
of the values. With these formulas there is no need to keep a cyclic buffer of the
latest RSSI values. Only the current values of the short and the long modified
moving average have to be remembered.
$$WL_k = p_1 \, WL_{k-1} + (1 - p_1) \, RSSI_k$$

$$WS_k = p_2 \, WS_{k-1} + (1 - p_2) \, RSSI_k$$

$$p_1, p_2 \in (0, 1)$$

$$\mathit{W\_alarm}_k = \mathrm{Boolean}\left(\frac{WL_k - WS_k}{WL_k} > h\right)$$
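
A minimal sketch of the modified moving average detector follows. It assumes the recursive form in which each average keeps the fraction p of its previous value and takes the fraction (1 - p) of the new sample, that the long average uses the larger coefficient p1, and that the alarm is raised when the short average falls below the long one by more than the fraction h. The class name, the initial value and the parameter values are illustrative assumptions.

```python
class ModifiedAverageDetector:
    """Short and long recursive averages; only two numbers are stored."""

    def __init__(self, p1, p2, h, initial_rssi):
        if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
            raise ValueError("p1 and p2 must lie in (0, 1)")
        self.p1 = p1  # coefficient of the long average (assumed closer to 1)
        self.p2 = p2  # coefficient of the short average
        self.h = h
        # Assumption: both averages start from the first known RSSI value.
        self.wl = float(initial_rssi)
        self.ws = float(initial_rssi)

    def update(self, rssi):
        """Feed RSSI_k, update both averages and return the alarm state."""
        self.wl = self.p1 * self.wl + (1.0 - self.p1) * rssi
        self.ws = self.p2 * self.ws + (1.0 - self.p2) * rssi
        if self.wl <= 0:
            return False
        return (self.wl - self.ws) / self.wl > self.h


detector = ModifiedAverageDetector(p1=0.9, p2=0.5, h=0.2, initial_rssi=30)
steady = [detector.update(30) for _ in range(3)]
after_drop = detector.update(10)
```

Compared with the two simple moving averages, no sample buffer is needed; the state is just the two values wl and ws, which keeps the memory use constant regardless of the effective window length.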
2.3 Implementation of the monitoring module

The proposed monitoring solution was implemented in Linux on the OpenWRT
platform. It consists of a userspace program that periodically reads the RSSI from
the MadWiFi driver through an IOCTL exchange, and an online or offline processing
tool that analyzes the RSSI values and generates the link loss signal. In the case of
online processing the values are passed directly to another process that performs
the monitoring. For offline processing the RSSI values are written to a text file on
the device filesystem. Both programs were implemented in C.

3. The channel switching in response to failures

Self-healing of the link quality by a channel switching algorithm may seem
very simple: if link quality degradation is detected, try to switch to another
channel, as the interference there is probably lower than on the current one.
Unfortunately, the channel switch must be performed simultaneously on all nodes
using the same link to sustain the connectivity. In a wireless mesh network
based on IEEE 802.11 radios a single link may join two or more wireless
interfaces using the same channel and ESSID. There are two possible solutions:
centralized or distributed. In the centralized solution a single point in the network
reconfigures all the nodes using a management protocol. In the distributed algorithm
the channel change may be triggered by any node, and the information about it is
gathered from the network state or transmitted by a protocol to all other nodes.

[Flowchart blocks: START; Get link state; Link quality drop detected? (NO / YES); Switch to different channel]

Fig. 1. Channel switching - general algorithm block diagram

3.1 Repair channel selection

In both the distributed and the centralized solution the first step after detection
of link quality degradation is to select the channel to which the link should be
switched. This algorithm is run on the node that detects the degradation and
triggers the change. To limit the time required to select the channel there is no
additional scanning before the channel selection. The algorithm does not use the
information about channel utilization reported by the interface driver (MadWiFi),
because in the ad-hoc mode usually used in mesh networks this information is not
reported correctly, and in infrastructure mode it requires background
scanning, which can only be performed on the client devices.
For IEEE 802.11 b and g networks there are only 3 orthogonal channels
available. The standard defines 13 channels, each 22 MHz wide but spaced
only 5 MHz apart. It is recommended that two networks in the same area should
not use overlapping channels: the distance between them should be at least 5
channels. However, when the distance between the channels is greater than 3 the
interference is small. The algorithm takes the current channel as a starting point
and selects a channel at a distance of at least 3. The direction in which the
channel is selected depends on the current channel and on the previous channel
switch. When the current channel is lower than 7 a higher channel is
selected; when it is greater than or equal to 7 a lower one is chosen. If a channel
change has already been started within some time in the past, the channel is
selected in the same direction as before.
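
The selection rule can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code run on the devices: the function name and its parameters are hypothetical, the step is applied as an exact distance, and the reversal of direction at the edge of the band is our own addition, since the description above does not say what happens when the preferred direction leaves the channel range.

```python
def select_repair_channel(current, previous_direction=None, step=3,
                          lowest=1, highest=13):
    """Pick the channel to switch to and return (channel, direction).

    direction is +1 (towards higher channels) or -1 (towards lower ones).
    previous_direction should be passed only if a channel change was started
    recently; otherwise the direction follows from the current channel.
    """
    if previous_direction in (1, -1):
        direction = previous_direction
    else:
        direction = 1 if current < 7 else -1
    candidate = current + direction * step
    if not lowest <= candidate <= highest:
        # Assumption: turn around when the band edge is reached.
        direction = -direction
        candidate = current + direction * step
    if not lowest <= candidate <= highest:
        raise ValueError("no channel at the requested distance")
    return candidate, direction


# Starting from channel 2 with a step of 5 gives channel 7, the same
# switch as in the experiment described in Section 4.
channel, direction = select_repair_channel(2, step=5)
```

Returning the direction together with the channel lets the caller store it and pass it back as previous_direction if another switch is needed shortly afterwards.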

4. Performance evaluation of the local channel switching algorithm

The tests of the channel switching algorithm were performed in the
Proximetry laboratory. Two Mikrotik Routerboard RB532 devices were used as
the source and sink of the traffic; a third device was used to generate the
interference. The interference was generated by sending as many packets as
the MAC layer allowed on a channel next to the channel used for transmission.
To flood the channel with transmissions, the ACK messages on the MAC layer
were disabled. The iperf [4] program was used to generate the traffic. The
source was configured to transmit 1 Mbit of data per second. The IEEE 802.11g
frequencies were used. The data transmission started on channel 2 and the
interference was generated on channel 1.
[Diagram labels: devices 1 and 2; interference generation]

Fig. 2. Laboratory installation for evaluation of channel switching algorithm

As a first test, the traffic was measured without channel switching.
The UDP protocol was used during the experiment. The interference
was generated starting from the 55th second of the transmission. After the start of
the interference generation almost no packets could be received correctly.


Fig. 3. Measurements of data rate in presence of interferences (bitrate in time)

The next test was executed in the same conditions, but with the local channel
switching algorithm enabled. The interference was generated starting from
the 52nd second of the transmission. The self-healing mechanism requires a few
seconds to detect the interference and start the channel switching procedure. As
the data rate plot for this test shows, the algorithm was able to restore the full
throughput of the transmission within 6-7 seconds. The data transmission was
switched to channel 7. Unfortunately, during the experiment we also experienced
a false detection: the channel switch procedure was initiated at 62 s of the
transmission, which temporarily brought the transmission back to frequencies that
were subject to interference. This situation was repaired within a short time.

Fig. 4. Data rate with self-healing channel switching algorithm enabled (bitrate in time)

5. Summary

The presented channel switching algorithm provides a quick and easy-to-implement
function that automatically repairs the connectivity between nodes in
reaction to interference. A sample implementation was presented, together
with measurements evaluating the time required to switch the channel.

6. Acknowledgement

This work was supported in part by the European Commission in the 7th
Framework Programme through project EU-MESH (Enhanced, Ubiquitous, and
Dependable Broadband Access using MESH Networks), ICT-215320,
www.eu-mesh.eu.

References

[1] Ian F. Akyildiz, Xudong Wang, Weilin Wang, Wireless mesh networks: a survey,
Computer Networks, Volume 47, Issue 4
[2] Vasilios Siris, Ioannis G. Askoxylakis, Marco Conti and Raffaele Bruno, "Enhanced,
Ubiquitous and Dependable Broadband Access using MESH Networks". ERCIM
Newsletter, Issue 73, pp 50-51, April 2008
[3] The EU-MESH project web site, http://www.eu-mesh.eu
[4] IPERF, http://www.noc.ucf.edu/Tools/Iperf/
[5] K. Grochla, W. Buga, P. Pacyna, J. Dzierga, A. Seman: "Autoconfiguration
procedures for multi-radio wireless mesh networks based on DHCP protocol", IEEE
Proceedings from HotMESH 09 Workshop, Kos, Greece 2009
[6] Prabhu Subramanian, Milind M. Buddhikot, Scott Miller: Interference aware
routing in multi-radio wireless mesh networks, Proc. of IEEE Workshop on
Wireless Mesh Networks 2006
[7] Jian Tang and Guoliang Xue and Weiyi Zhang: Interference-aware topology control
and qos routing in multi-channel wireless mesh networks, Proceedings of ACM
MOBIHOC 2005
[8] A. P. Subramanian, H. Gupta, and S. Das, Minimum-interference channel
assignment in multi-radio wireless mesh networks, in IEEE Communications
Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks,
2007. SECON 07., 2007, pp. 481-490.
[9] K. N. Ramachandran, E. M. Belding, K. Almeroth, and M. Buddhikot,
Interference-aware channel assignment in multi-radio wireless mesh networks, in
IEEE Infocom, 2006, pp. 1-12.
