
Building Programming Abstractions for Wireless Sensor Networks Using Watershed Segmentation

Mohammad Hammoudeh and Tariq A. A. Alsbou'i


Manchester Metropolitan University, Manchester, UK, m.hammoudeh@mmu.ac.uk

Abstract

The availability and quality of information extracted from Wireless Sensor Networks (WSNs) have revolutionised a wide range of application areas. The success of any WSN application is, nonetheless, determined by the ability to retrieve information with the required level of accuracy, within specified time constraints, and with minimum resource utilisation. This paper presents a new approach to localised information extraction that utilises the Watershed segmentation algorithm to dynamically group nodes into segments, which can be used as programming abstractions upon which different query operations can be performed. Watershed results in a set of well-delimited areas, such that the number of operations (communication and computation) necessary to answer a query is minimised. This paper presents a fully asynchronous Watershed implementation, in which nodes can compute their local data in parallel and independently of one another. The preliminary experimental results demonstrate that the proposed approach is able to significantly reduce the query processing cost and time without any loss of efficiency.

1 Introduction
Wireless Sensor Networks (WSNs) are currently being employed in a variety of applications ranging from home to industry, and from health to military. These applications have a number of elements in common: (1) there is a request for information; (2) the answer to this request is usually present in a set of unstructured data streams; (3) WSNs generate large amounts of data that are `imperfect' in nature and contain considerable redundancy. Resource constraints on nodes in the network, coupled with the characteristics of the returned data, mean that applications have to be developed with the primary design goal of minimising resource utilisation. Distributed information extraction has been advocated to solve this kind of problem. Information extraction is the sub-discipline of artificial intelligence that selectively structures, filters, and merges data generated by one or more sensor nodes. It adds meaning to unstructured raw data; the data therefore become structured or quasi-structured, making them more suitable for information processing tasks. This definition is concise and covers exactly the sense in which the term information extraction is used throughout this paper.

This paper presents an in-network information extraction system that finds and links relevant information while ignoring irrelevant and extraneous information. The final output of the extraction process varies based on user queries; however, it commonly involves extracting fragments of information from various nodes within the same segment and linking these fragments into a coherent answer. We propose the utilisation of the Watershed segmentation algorithm [20], which results in a set of well-delimited segments of homogeneous sensed regions based on node locations and their corresponding sensor readings. The Watershed algorithm is suitable for dynamically changing environments because it uses no thresholding; instead, the best option is chosen at each decision stage. We also propose a parallel asynchronous implementation of the Watershed algorithm that complies with sensor node constraints, i.e. real-time computing and low power consumption. This in-network information extraction does not return the entire collected data; instead, it extracts sensed data units from one or more network segments, typically simple or multi-modal information of a spatio-temporal nature. The goal of node segmentation is to provide a high-level programming model for sensor networks that abstracts away the details of individual sensor nodes. In this paper we examine query-based systems that utilise in-network processing for query response. However, the produced abstractions can also be utilised by other information extraction systems, e.g. event-based and time-driven systems. Query-based information extraction is a request-response interaction between the sensor nodes and an end user or application component. The end user issues a query in an appropriate language, and the query is then disseminated to the network to retrieve the desired data from the sensors based on the description in the query.
Most query-based systems provide a high-level interface to the sensor network while hiding the network topology as well as radio communication. The end user does not need to know how the data is collected or processed. User-controlled, query-based information extraction is usually applied in situations where it is known in advance what type of semantic information is to be extracted from the network. For example, it might be necessary to identify what types of events are happening in a certain part of the monitored environment and at what time these events took place. Depending on the information needs, different queries can be constructed to differentiate various types of events at different levels of semantic granularity. In some applications, for instance, it will be adequate to specify that a part of a query is a temporal expression, while in others it might be necessary to differentiate between different temporal classes, for example between expressions indicating past, present and future. In other applications, not only the semantic nature of the target information is predefined, but also the unit and scope of the event to be extracted. The unit of extraction refers to the granularity of the individual information portions that are lifted out of the sensor node. The scope of extraction refers to the granularity of the extraction space for every individual information request. In order to decide which portion of information should contribute to the answer to a query, an information extraction application uses a set of extraction conditions. These conditions state what formal properties a particular portion of information must possess to belong to a particular semantic class.

The Watershed transformation is a popular image segmentation algorithm for grey-scale images [20,12,7,3,2,5,17,14,19,21]. Its basic concept comes from the field of topography, referring to the partitioning of a landscape into a number of basins or water catchment areas. The authors in [12] use the following analogy to explain how the Watershed algorithm works: the USA can be divided into two main segments, one associated with the Atlantic Ocean and another associated with the Pacific Ocean. All the rain falling on the east segment will flow into the Atlantic Ocean, while the rain falling on the west segment will flow into the Pacific Ocean. The water will reach the ocean provided that it is not trapped in a local minimum along the way. Both segments are usually named catchment basins, and each one has an associated minimum (the ocean). The boundary line that separates both basins is called the watershed line, corresponding to the continental divide in the example. Therefore, the image is viewed as a topographic surface where each pixel is a point situated at some altitude as a function of its grey level. The grey levels, between 0 and 255, correspond to the altitudes of the surface.

After the original Watershed algorithm was published, several modifications and variations of this algorithm were published to suit various applications. Watershed algorithms can be classified into two conceptually distinct techniques: immersion and raining.

1. Immersion simulates progressively immersing the entire topographic surface in a water container.
2. Raining simulates rainfall over a topographic surface. Raining can be considered a local method because each droplet follows its own way without considering neighbouring droplets. On flat areas of the surface, the motion of the water droplet is directed towards the nearest brim of a downward slope, and it stops when it reaches a regional minimum.

An important aspect of designing a parallel asynchronous algorithm is the exploitation of data locality to minimise the communication overhead. With this goal in mind, we propose here a reformulation of the raining Watershed segmentation, chosen for its suitability for parallel implementation. The presented implementation is capable of computing the Watershed transform according to local conditions. The approach proposed in this paper generates the same result using only one point of synchronisation, thus decreasing the running time without degrading the effectiveness of the segmentation. Both assertions are demonstrated throughout this paper and supported by experimental results.

The paper is organised as follows: Section 2 describes the Watershed algorithm. Section 3 illustrates the suitability of this algorithm for WSNs. Section 4 describes how the resulting segments serve as abstractions for local interactions. Section 5 presents the parallel asynchronous implementation of the Watershed algorithm. The performance of the proposed implementation is evaluated in Section 6. Section 7 concludes the work.

2 Description of the Watershed Algorithm


This section presents Meyer's formalism [9], which is based on local conditions (rain simulation). The Watershed proposed by Meyer is the basis for our parallel asynchronous implementation. Roerdink et al. [15] summarised approaches for parallel implementations of the Watershed algorithm on powerful (in memory and computation) devices. Before the details of our asynchronous and parallel implementation are given, some preliminary definitions are introduced as described by Meyer [9]. Let $f(p)$ be a function of grey levels, signifying a continuous digital image with the domain $\mathbb{Z}^2$. Every pixel $p$ has a grey level $f(p)$ and a set of neighbouring pixels $p' \in N(p)$ with a distance function $dist(p, p')$ to each neighbour; most published algorithms use 4 or 8 neighbours.

Definition 1 (Regional minimum). A point or a set of connected points with the same grey level, where none of them has a neighbour with a lower grey level [3]. In keeping with the rain simulation analogy, a regional minimum is the area of the surface where the rain water would get trapped without flowing to lower levels.

Definition 2 (Lower slope). If $p'$ exists, the lower slope $LS(p)$ defines the maximum steepness from a pixel to its lower neighbours:
$$LS(p) = \max_{p' \in N(p) \,\mid\, f(p') \le f(p)} \left( \frac{f(p) - f(p')}{dist(p, p')} \right)$$

Definition 3 (Steepest descending path). $\forall p$, $SDP(p)$ is the set of points $p' \in N(p)$ defined as follows:
$$SDP(p) = \left\{ p' \in N(p) \;\middle|\; \frac{f(p) - f(p')}{dist(p, p')} = LS(p),\ f(p') < f(p) \right\}$$
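The lower slope and the set of steepest descending neighbours can be illustrated with a short Python sketch. It assumes the image is a list of rows, that neighbourhoods are 8-connected with Euclidean distances, and that the lower slope is taken as 0 when a pixel has no neighbour at or below its own level; the function names are illustrative and are not taken from the paper.

```python
import math

def neighbours(p, shape):
    """8-connected neighbours of pixel p = (row, col) inside a grid of the given shape."""
    r, c = p
    rows, cols = shape
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0) and 0 <= r + dr < rows and 0 <= c + dc < cols]

def dist(p, q):
    """Euclidean distance between two pixels."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def lower_slope(f, p):
    """LS(p): maximum slope from p to any neighbour that is not higher than p.

    Assumption: returns 0.0 when no such neighbour exists.
    """
    shape = (len(f), len(f[0]))
    fp = f[p[0]][p[1]]
    slopes = [(fp - f[q[0]][q[1]]) / dist(p, q)
              for q in neighbours(p, shape) if f[q[0]][q[1]] <= fp]
    return max(slopes, default=0.0)

def steepest_neighbours(f, p):
    """SDP(p): strictly lower neighbours whose slope from p equals LS(p)."""
    shape = (len(f), len(f[0]))
    fp = f[p[0]][p[1]]
    ls = lower_slope(f, p)
    return [q for q in neighbours(p, shape)
            if f[q[0]][q[1]] < fp
            and math.isclose((fp - f[q[0]][q[1]]) / dist(p, q), ls)]
```

On a surface that descends towards one corner, the diagonal neighbour is selected whenever its drop, divided by the diagonal distance, exceeds the slope towards the axis-aligned neighbours.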

i.e. the SDP is a series of connected points where each point has a grey level strictly lower than the previous one. There may exist multiple descending paths from a given point; the choice between them depends on the implementation. A descending path is said to be a steepest path if each point in the path is connected to the neighbour with the lowest grey level. According to the rain simulation analogy, the SDP is the path a drop of water would follow when travelling down to a regional minimum.

Definition 4 (Cost function based on lower slope). The cost, $cost(p_{i-1}, p_i)$, for walking on the topographic surface from point $p_{i-1}$ to $p_i \in N(p_{i-1})$ is:
$$cost(p_{i-1}, p_i) = \begin{cases} LS(p_{i-1}) \cdot dist(p_{i-1}, p_i) & f(p_{i-1}) > f(p_i) \\ LS(p_i) \cdot dist(p_{i-1}, p_i) & f(p_{i-1}) < f(p_i) \\ \frac{1}{2}\left(LS(p_{i-1}) + LS(p_i)\right) \cdot dist(p_{i-1}, p_i) & f(p_{i-1}) = f(p_i) \end{cases}$$
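The three cases of the cost function translate directly into code. The sketch below is a minimal illustration that takes the grey levels, the lower slopes and the distance as plain numbers; the function and parameter names are assumptions made for this example.

```python
def step_cost(f_prev, f_cur, ls_prev, ls_cur, d):
    """Cost of one step from p_{i-1} to p_i on the topographic surface.

    f_prev, f_cur   : grey levels of p_{i-1} and p_i
    ls_prev, ls_cur : lower slopes LS(p_{i-1}) and LS(p_i)
    d               : dist(p_{i-1}, p_i)
    """
    if f_prev > f_cur:          # walking downhill: use the slope of the start point
        return ls_prev * d
    if f_prev < f_cur:          # walking uphill: use the slope of the end point
        return ls_cur * d
    return 0.5 * (ls_prev + ls_cur) * d   # same level: average of both slopes
```

Summing `step_cost` over consecutive pairs of points along a path gives the topographical distance of that path.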

Definition 5 (Topographic distance). The topographic distance between two points $p$ and $q$ on a surface is the minimal topographical distance among all paths $\pi$ between $p$ and $q$ on the surface:
$$TD_f(p, q) = \inf_{\pi} TD_f^{\pi}(p, q)$$
where $TD_f^{\pi}(p, q) = \sum_{i=2}^{n} cost(p_{i-1}, p_i)$ is the topographical distance of a path $\pi = (p = p_1, p_2, \ldots, p_n = q)$, such that $\forall i$, $p_i \in N(p_{i-1})$ and $p_i$ belongs to the domain of $f$.

Definition 6 (Catchment basin based on topographic distance). The catchment basin $CB_{TD}(m_i)$ of a local minimum $m_i$ is the set of points $p$ that are closer to $m_i$ than to any other regional minimum $m_j$, based on the topographical distance and the grey level of the minima:
$$CB_{TD}(m_i) = \left\{ p \;\middle|\; f(m_i) + TD_f(p, m_i) < f(m_j) + TD_f(p, m_j)\ \ \forall j \neq i \right\}$$

In other words, the CB is formed by a regional minimum and all the points whose steepest descending path ends in that minimum [2]. According to the rain simulation analogy, a CB is an area of a topographic surface such that when a droplet of rain falls on any point in that area, it pours into the minimum of that area by following the steepest descending path from that point.
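The rain simulation view of a catchment basin can be sketched in a few lines of Python. This is a simplified, sequential illustration and not the implementation proposed in this paper: it assumes a 4-connected grid with unit distances (so the steepest neighbour is simply the one with the largest drop), breaks ties by taking the first candidate, and treats every cell without a strictly lower neighbour as its own minimum, which means plateaus are not merged.

```python
def label_basins(f):
    """Label every cell with the minimum reached by following the steepest descent."""
    rows, cols = len(f), len(f[0])

    def steepest_lower(p):
        """Strictly lower 4-neighbour with the largest drop, or None at a minimum."""
        r, c = p
        best, best_drop = None, 0
        for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= q[0] < rows and 0 <= q[1] < cols:
                drop = f[r][c] - f[q[0]][q[1]]
                if drop > best_drop:
                    best, best_drop = q, drop
        return best

    labels = {}
    for r in range(rows):
        for c in range(cols):
            path, p = [], (r, c)
            # Let a droplet run downhill until it meets an already labelled cell
            # or gets trapped in a minimum.
            while p not in labels:
                path.append(p)
                nxt = steepest_lower(p)
                if nxt is None:
                    labels[p] = p      # the minimum labels its own basin
                    break
                p = nxt
            for q in path:             # propagate the label back up the path
                labels[q] = labels[p]
    return labels
```

For the single-row surface `[[1, 2, 4, 3, 1]]` the three leftmost cells drain to the minimum at column 0, while the two rightmost cells drain to the minimum at column 4.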

3 Modified Watershed Segmentation


1- Neighbourhood definition using Shepard method

In most published Watershed algorithm variations, 4 or 8 pixel neighbours are used. Because the distance from a node to its neighbours can vary, we advocate the use of the Shepard method [16] to select nearby nodes. Shepard defined two criteria: (a) Arbitrary distance criterion: all data points within radius $r$ of the point $p$ are included in the computation. (b) Arbitrary number criterion: only the closest $n$ data points are considered in the computation of any interpolated value. Shepard chose a mix of the two criteria that combines their advantages. An initial radius $r$ is defined depending on the overall density of data points such that seven data points are included on average in a circle of radius $r$. $r$ is written as follows:
$$\pi r^2 = 7\,\frac{A}{N}$$
where $A$ is the area of the largest polygon enclosed by the data points and $N$ is the total number of data points. The suitability of this method for WSNs is presented in [6].
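As a rough illustration of the distance criterion, the following Python sketch computes the initial radius from the enclosing area and the number of nodes, and then selects the nodes that fall inside it. Only the radius criterion is shown; Shepard's combination with the number criterion is omitted, and the function names are assumptions made for this example.

```python
import math

def shepard_radius(area, n_points, per_circle=7):
    """Radius r such that a circle of area pi * r^2 holds `per_circle` points on average."""
    return math.sqrt(per_circle * area / (math.pi * n_points))

def select_neighbours(p, nodes, radius):
    """All nodes other than p that lie within `radius` of p (arbitrary distance criterion)."""
    return [q for q in nodes if q != p and math.dist(p, q) <= radius]
```

For example, 700 nodes spread over an area of $100\pi$ square units give $r = 1$, so each node considers the neighbours within one unit of distance.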

2- Inclusion of the number of hops in the calculation of the steepest path

In Section 2, the SDP (Definition 3) was defined as a series of connected points where each point has a grey level strictly lower than the previous one. There may exist multiple descending paths from a given point; the choice between them depends on the Watershed algorithm implementation. For WSN applications, the most energy-efficient descending path should be chosen. According to [13], communication is the most power-hungry operation. Communication cost can be computed from transmission distance, hop count, delay, link quality, and other factors. Hop count is a widely used factor for measuring the energy requirement of a routing task and for grouping nodes into energy-efficient clusters. The power consumed in data transmission is directly proportional to the square of the transmission distance between the sending and receiving nodes. The power consumption depends not only on the transmission distance, but also on the scale of the network [18]. Therefore, the lower distances must be computed as a function of the number of hops as well as the inter-sensor Euclidean distances. The new distance function is the sum of the individual inter-node Euclidean distances multiplied by the hop count:
$$Dist(p, p') = \left( \sum dist(p_i, p_j) \right) \times hc$$
where $i, j < hc - 1$ and $hc$ is the hop count.
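The hop-weighted distance can be sketched as follows, assuming a route is given as an ordered list of node coordinates so that the hop count is the number of links on the route. The function name and the route representation are illustrative assumptions.

```python
import math

def hop_weighted_distance(path):
    """Sum of the Euclidean lengths of the hops along `path`, multiplied by the hop count."""
    hops = len(path) - 1
    total = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    return total * hops
```

A two-hop route with hop lengths 5 and 4 yields (5 + 4) x 2 = 18, so routes with many hops are penalised even when their total Euclidean length is similar to that of a shorter route.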

4 Abstractions for Local Interactions


In this section, we give a brief description of how the Watershed segmentation algorithm can be used for defining abstractions for local interactions, and place our work in the context of other existing work in the area. To demonstrate the usefulness of the proposed abstractions, we consider events that are caused by multiple-element targets, e.g. a herd of animals in a habitat monitoring application or toxic gas diffusion. Elements triggering these events often exhibit idiosyncratic behaviour, such as splitting into several groups or merging into a single group. This type of event requires several neighbouring sensor nodes to collaborate to acquire aggregate features of the target such as shape, location, coverage, moving direction, speed, etc. Targets of this kind are referred to in [8] as `region-like' targets. In this paper, Watershed segmentation is proposed to build a cooperative node structure over every network segment covered by the events to collect and aggregate related information. Watershed groups nodes sharing some common group state into segments. A segment is described as a collection of spatially distributed nodes, an example being the set of nodes in a geographic area with sensor readings in a specific range. Therefore, Watershed segmentation combines the advantages of data-centric (e.g. [4]) and topologically (e.g. [22]) defined group abstractions. This makes the generated segments capable of expressing a number of local behaviours powerfully, which allows network programmers to write programs that express higher-level behaviour beyond that of the usual query-based methods. In fact, the programmer is concerned not only with the application logic, but also with identifying the network portions to be involved in extracting a particular piece of information and how to reach them. Dealing with this new requirement necessitates new programming abstractions to localise complexity without sacrificing efficiency.
The logical segments of nodes generated by the Watershed algorithm replace the tight physical neighbourhood provided by wireless broadcast with higher-level, application-defined abstractions. Segments are created such that the span of a logical neighbourhood is specified dynamically and declaratively based on the attributes of nodes, along with requirements about communication costs (specifically the diameter of the segment). A network segment formed with a logical notion of proximity determined by applicative information is, therefore, capable of returning specified information with high confidence. Segments generated by the Watershed algorithm are automatically labelled with a marker, which is an integer identifier. All nodes within the segment satisfy the logical constraints encoded in the Watershed algorithm. This logical and intuitive Watershed template serves as the membership function that dynamically determines and updates which nodes belong to the segment. Programmers manipulate segments instead of nodes within communication range. The programmers can still reason in terms of nodes and broadcast messages, but now they can specify declaratively which portions of the network to consider and therefore control the span of communication to save energy.

Macroprogramming [22] has been put forward in the literature as an efficient approach to information extraction that provides a more general-purpose approach to distributed computation. Many macroprogramming approaches aim at programming the network as a whole rather than programming the individual nodes that compose the network. Global behaviour can be specified, programmed and then translated to node-level code, transparently with respect to low-level details like network topology, radio communication or power capacity. A significant class of macroprogramming systems are the application-defined, in-network abstractions that are used in data processing. The Regiment [22] and Hood systems are examples of neighbourhood-based abstractions that handle many nodes collectively and provide a set of operations on the group that enable the programmer to extract information about its state. EnviroTrack [1] is a programming abstraction specifically for target-tracking applications, where a group is defined as the set of sensors that detected the same event. In SPIDEY [10], a node is represented as a logical node that has multiple exported attributes (static and dynamic).
However, utilising the network topology as an abstraction can impose some rigidity on the programming model. It can also be inefficient for systems with mobile nodes due to the cost and complexity of maintaining the mapping between the physical topology and the logical topology. The parallel Watershed-based abstractions differ from these approaches in one important aspect: the segmentation runtime only loosely synchronises state across nodes, attaining greater robustness and higher efficiency.
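To make the idea of programming against segments rather than individual nodes more concrete, the sketch below shows one possible shape for such an abstraction in Python. The `Segment` class, its methods and the `hottest_spot` query are hypothetical names introduced purely for illustration; the paper does not prescribe this interface.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    """A Watershed segment: an integer marker plus the readings of its member nodes."""
    marker: int
    readings: dict = field(default_factory=dict)   # node id -> latest reading

    def update(self, node_id, value):
        """Record the latest reading reported by a member node."""
        self.readings[node_id] = value

    def hottest(self):
        """Return (node id, reading) for the highest reading in the segment."""
        return max(self.readings.items(), key=lambda item: item[1])


def hottest_spot(segments, threshold):
    """Answer a 'hottest spot' query using only segments that exceed the threshold."""
    candidates = [s.hottest() for s in segments
                  if s.readings and max(s.readings.values()) > threshold]
    return max(candidates, key=lambda item: item[1], default=None)
```

A query written this way names the segments of interest declaratively; segments whose readings stay below the threshold never take part in answering it.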

5 Parallel Asynchronous Watershed Implementation


Each node, (x, y, z), is depicted as a pixel on the image, and the colour of the pixel located at (x, y) is given by the node's sensed modality (z). The resolution of the image is the total number of nodes in the network per unit area. In parallel operation, the rain-falling simulation Watershed uses less communication between nodes than immersion, which has a highly global nature. The new segmentation algorithm can be viewed as an asynchronous relaxation of the `Hill Climbing' algorithm [9]. All processing is local to each node, which runs a simple finite state machine associated with a single sensed modality. Non-blocking communications allow each node to run independently. Since synchronisation is limited to N(u) and non-blocking communications are used, each node operates independently and the algorithm needs no global scheduling.

The role of the Watershed process is to label each non-minimum node by walking downward on a steepest slope path towards the minimum. Initially, all nodes in the network are considered as non-minima and will be flooded from different sub-domains. At this stage of the segmentation process, nodes are assigned temporary labels because the segmentation results depend on the readings of neighbouring nodes. When a node detects a steepest neighbour, it changes its status to a non-minimum node and gets labelled from the steepest neighbour or from its predecessor that has the shortest distance among its neighbouring nodes. When the minimum is reached, its label is assigned to all the nodes upward along the path. If no non-minimum is detected, the steepest distances must be calculated based on the entire set of lower borders. Process termination is locally detected on each node. This reduces the amount of communication and the idle time in nodes. The algorithm builds several paths with different origins and destinations from data communications between nodes. This does not introduce additional cost due to the broadcast nature of wireless communications. Typically, any change in the readings of one node may affect all nodes on the steepest slope line between that node and the minimum. Therefore, nodes must keep monitoring and updating their membership. A flag, called reset, is maintained by each node to record whether changes have occurred since the last communication. One advantage of this implementation is that frequent relabelling and synchronisation between nodes are not needed.
One disadvantage of this parallel asynchronous implementation is that the middle nodes on plateaus cannot locally decide whether they are on a minimum or a non-minimum plateau (NP), and would require global synchronisation points to identify and label the minimum plateaus. To avoid global synchronisation, all middle nodes are labelled as minimum or plateau (MP) to allow them to make local decisions. After that, the propagation of data over a non-minimum plateau allows the middle nodes of that plateau that are flooded by a neighbour to switch to non-minimum. Algorithm 1 presents the pseudocode of the segmentation process in each node.
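The per-node state transitions described above can be sketched as two small Python functions, one for the initial state and one for the MP state. This is an interpretation under stated assumptions rather than the authors' code: neighbours are assumed to be equidistant, so the steepest neighbour is simply the one with the lowest reading; the node's own label takes part in the minimum so that a node without equal neighbours keeps its label; and labels are merged only between nodes with equal readings. All names are illustrative.

```python
INITIAL, MP, NP = "initial", "MP", "NP"

def initial_step(d_u, l_u, neighbours):
    """First transition of node u.

    d_u, l_u   : reading and label of u
    neighbours : dict mapping neighbour id -> (reading, label)
    Returns (new state, new label, id of the lowest neighbour or None).
    """
    lower = {v: d for v, (d, _) in neighbours.items() if d < d_u}
    if lower:
        # A lower neighbour exists, so u is a non-minimum node.
        return NP, l_u, min(lower, key=lower.get)
    # No lower neighbour: u is a minimum or lies on a plateau.
    equal_labels = [l for d, l in neighbours.values() if d == d_u]
    return MP, min([l_u] + equal_labels), None

def mp_step(d_u, l_u, v, d_v, l_v):
    """Transition of an MP node u on receiving reading d_v and label l_v from neighbour v."""
    if d_v < d_u:
        return NP, l_u, v              # flooded by a lower neighbour
    if d_v == d_u:
        return MP, min(l_u, l_v), None  # same plateau: adopt the smaller label
    return MP, l_u, None               # higher neighbour: nothing changes
```

Each call uses only data received from direct neighbours, which reflects the point made above that no global scheduling is required.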

6 Evaluation
Suppose that a WSN has been deployed to monitor the temperature of environmentally sensitive areas. An event of interest is predefined as occurring when a sufficient number of temperature readings go above a certain threshold in a specific geographic area. In our simulation, an event is triggered at random times in random locations, followed by the issuing of a query to locate the hottest spots. Query resolution is implemented using three methods: Watershed-based, where nodes are logically grouped into segments that are used to assist query processing; in-network processing via aggregation of messages up a spanning tree of the network; and centralised, where all data is sent directly to the sink for analysis. All simulations were carried out using Dingo [11], a scalable Python-based package that allows rapid prototyping of WSN algorithms.

Algorithm 1 The segmentation process of nodes in each state.
Input: current state $S(u)$ of node $u$; $f(v)$, states of neighbours
Output: segment membership

case ($S(u)$ = initial)
    The node $u$ broadcasts its data $d(u)$ and label $l(u)$ to its neighbours $N(u)$, waits for data from each neighbour $v \in N(u)$, and computes the following:
        $N^{=}(u)$: all neighbouring readings equal to that of $u$
        $N^{>}(u)$: all neighbouring readings greater than that of $u$
        $Ln_{N(u)} = \{v_i\}$: a singleton set such that $LS_{max}(v_i)$ and $v_i \in N(u)$
    if $Ln_{N(u)} = \emptyset$
        $l(u) \leftarrow \min_{v \in N^{=}(u)} l(v)$; $S(u) \leftarrow MP$
    else
        $S(u) \leftarrow NP$
case ($S(u)$ = MP)
    The node $u$ waits to receive new data, $d(v)$, from any neighbour
    if $d(v) < d(u)$
        $Ln_{N(u)} \leftarrow v$; $S(u) \leftarrow NP$
    else
        $l(u) \leftarrow \min(l(u), l(v))$; $S(u) \leftarrow MP$
case ($S(u)$ = NP)
    The node $u$ waits for data from $Ln_{N(u)}$; $S(u) \leftarrow NP$
In all states, the node $u$ sends its data $f(u)$ to all nodes in $N(u)$ whenever a new reading is recorded.

Figure 1. (a) Original thermal map (b) Watershed segmentation results (c) Extended segmentation

Figure 2. (a) Communication overhead and (b) the number of nodes involved in resolving the query

In all experiments, we make use of a thermal map, Figure 1(a), adopted from goinfrared.com. Figure 1(b) shows the result of segmentation, in which the nodes in each segment collaborate to solve a query. It is easy to extend the segmentation to the whole monitored terrain by using a generalised Voronoi diagram, i.e., each location where there is no sensor node is assigned to its nearest segment, Figure 1(c). The cost of query resolution was measured in terms of the number of messages exchanged to answer the query. Multiple runs with different topologies and different numbers of nodes were carried out for the three query resolution methods. These results are presented in Figure 2(a). The energy cost, response time, and accuracy are affected by which and how many nodes are involved in answering a query. Excluding nodes irrelevant to a certain query not only improves answer accuracy but also saves energy and reduces the time required for data analysis. Figure 2(b) shows the number of nodes involved in resolving the same query at different network densities. The results obtained in the above experiments indicate that segments can be used to support in-network query resolution. The results show that the communication overhead associated with segment-based query processing is almost two-fold lower than that of in-network processing and ten-fold lower than that of centralised query resolution. These results are explained by the analysis presented in Figure 2.
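The extension of the segmentation to locations without a sensor node, as described in this section, amounts to a nearest-node lookup. The Python sketch below assumes that node positions are 2-D coordinates and that each node already carries its segment marker; the function name and data layout are illustrative assumptions.

```python
import math

def extend_segmentation(locations, labelled_nodes):
    """Assign every location to the segment of its nearest labelled node.

    locations      : iterable of (x, y) points without a sensor node
    labelled_nodes : dict mapping node position (x, y) -> segment marker
    """
    return {loc: labelled_nodes[min(labelled_nodes,
                                    key=lambda node: math.dist(loc, node))]
            for loc in locations}
```

With two nodes at (0, 0) and (10, 0) in segments 1 and 2, the point (4, 0) is assigned to segment 1 and the point (9, 0) to segment 2.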

7 Conclusion
The preliminary work in this paper indicates that segment-based in-network query processing produces considerable energy savings over aggregation and centralised approaches. Watershed logically organises nodes into energy-efficient segments that reduce unnecessary data transmissions and improve response accuracy. Compared to standard Watershed segmentation algorithms, the major improvement of our algorithm is that labelling and climbing along the steepest paths are executed concurrently and locally, based on the node state, during the entire segmentation process. There are a number of limitations in the work so far that need to be addressed in the future, for example the cost of segmentation.

References
1. T. Abdelzaher, B. Blum, Q. Cao, Y. Chen, D. Evans, J. George, S. George, L. Gu, T. He, S. Krishnamurthy, L. Luo, S. Son, J. Stankovic, R. Stoleru, and A. Wood. Envirotrack: Towards an environmental computing paradigm for distributed sensor networks. In Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS'04), pages 582-589, 2004.
2. A. Bieniek and A. Moga. An efficient watershed algorithm based on connected components. Pattern Recognition, 33(6):907-916, 2000.
3. André Bleau and L. Joshua Leon. Watershed-based segmentation and region merging. Comput. Vis. Image Underst., 77:317-370, January 2000.
4. Maurice Chu and Juan Julia Liu. State-centric programming for sensor and actuator network systems. IEEE Pervasive Computing, 2003.
5. V. Grau, A. U. J. Mewes, M. Alcaniz, R. Kikinis, and S. K. Warfield. Improved watershed transform for medical image segmentation using prior information. 23(4):447-458, 2004.
6. M. Hammoudeh, R. Newman, C. Dennett, and S. Mount. Interpolation techniques for building a continuous map from discrete wireless sensor network data. Wireless Communications and Mobile Computing, January 2011.
7. C.J. Kuo, S.F. Odeh, and M.C. Huang. Image segmentation with improved watershed algorithm and its FPGA implementation. IEEE ISCAS 2001, 2:753-756, 2001.
8. Chun-Han Lin, Chung-Ta King, and Hung-Chang Hsiao. Region abstraction for event tracking in wireless sensor networks. In 8th International Symposium on Parallel Architectures, Algorithms and Networks, pages 274-281, 2005.
9. Fernand Meyer. Topographic distance and watershed lines. Signal Process., 38:113-125, July 1994.
10. Luca Mottola and Gian Pietro Picco. Using logical neighborhoods to enable scoping in wireless sensor networks. In Proceedings of the 3rd International Middleware Doctoral Symposium, pages 6, 2006.
11. Sarah Mount. Dingo wireless sensor networks simulator. http://code.google.com/p/dingo-wsn/, 2011. [Online; accessed 26-March-2011].
12. Víctor Osma-Ruiz, Juan I. Godino-Llorente, Nicolás Sáenz-Lechón, and Pedro Gómez-Vilda. An improved watershed algorithm based on efficient computation of shortest paths. Pattern Recogn., 40:1078-1090, March 2007.
13. G. J. Pottie and W. J. Kaiser. Wireless integrated network sensors. Commun. ACM, 43(5):51-58, 2000.
14. C. Rambabu, T.S. Rathore, and I. Chakrabarti. A new watershed algorithm based on hillclimbing technique for image segmentation. 4:1404-1408, 2003.
15. Roerdink and Meijster. The watershed transform: Definitions, algorithms and parallelization strategies. FUNDINF: Fundamenta Informatica, 41, 2000.
16. Donald Shepard. A two-dimensional interpolation function for irregularly-spaced data. In Proceedings of the 1968 23rd ACM National Conference, pages 517-524, 1968.
17. Han Sun, Jingyu Yang, and Mingwu Ren. A fast watershed algorithm based on chain code and its application in image segmentation. Pattern Recogn. Lett., 26:1266-1274, July 2005.
18. Peng Sun, Winston K.G. Seah, and Pius W.Q. Lee. Efficient data delivery with packet cloning for underwater sensor networks. In Symposium on Underwater Technology and Workshop on Scientific Use of Submarine Cables and Related Technologies, pages 34-41, April 2007.
19. Michał Świercz and Marcin Iwanowski. Fast, parallel watershed algorithm based on path tracing. In Proceedings of the 2010 International Conference on Computer Vision and Graphics: Part II, ICCVG'10, pages 317-324, 2010.
20. L. Vincent and P. Soille. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13:583-598, 1991.
21. Björn Wagner, Andreas Dinges, Paul Müller, and Gundolf Haase. Parallel volume image segmentation with watershed transformation. In Proceedings of the 16th Scandinavian Conference on Image Analysis, SCIA '09, pages 420-429, 2009.
22. Matt Welsh and Geoff Mainland. Programming sensor networks using abstract regions. In Proceedings of the 1st Conference on Symposium on Networked Systems Design and Implementation - Volume 1, pages 33, 2004.
