Chapter 1
INTRODUCTION
In a Mobile Ad Hoc Network (MANET), mobile devices (nodes) may be spread over a large area where access to external data is achieved through one or more access points (APs). However, not all nodes have a direct link to these APs. Instead, they rely on other nodes that act as routers to reach them. In certain situations, the APs may be located at the extremities of the MANET, where reaching them could be costly in terms of delay, power consumption, and bandwidth utilization. Additionally, an access point may connect to a costly resource (e.g., a satellite link) or to an external network that is susceptible to intrusion. For these reasons, and for others that concern data availability and response time, MANET applications should check whether the desired data exist inside the network before attempting to connect to the external data source. One example is a node searching for data that were requested earlier by other nodes and are now cached and available to the rest of the network. Another is a group of nodes that hold data which may be of interest to other nodes and are willing to share them. These scenarios suggest that efficient data search techniques should be developed so that mobile nodes can find the desired data, if they exist in the MANET, quickly and with minimum power consumption.
Given how ad hoc wireless networks work, search performance relies on the efficiency of the routing strategy employed. In fact, one of the biggest challenges in MANETs lies in the creation of efficient routing techniques [5]. Routing protocols are responsible for finding an efficient path between any two nodes in the network that wish to communicate, and for routing data messages along this path. The path must be chosen so that network throughput is maximized and message delay and other undesirable events are minimized. Two main types of routing protocols exist: source routing and destination routing.
Destination routing itself is classified into two types: distance-vector routing, used in the RIP Internet protocol [11], and link-state routing, used in the OSPF Internet protocol [12]. Relevant to our work are the Destination-Sequenced Distance Vector (DSDV) and the Ad hoc On-demand Distance Vector (AODV) protocols, which are distance-vector routing protocols designed for MANET environments. With such protocols, a node maintains a routing table and a distance vector. The table contains the neighbor along the shortest path to each destination in the network, while the vector holds the distance (number of hops) of this path. In high-mobility scenarios, the paths from sources to destinations become nonoptimal (i.e., not the shortest paths) until the routing tables are updated. With DSDV, each node periodically updates its shortest paths by sending its distance vector to its neighbors to inform them about possible distance changes to destinations in the network, while with AODV, a node computes or updates the shortest path to a destination only when it needs to communicate with it (i.e., on demand).
Our proposed Minimum Distance Packet Forwarding (MDPF) algorithm is based on the same basic concept employed by distance-vector routing protocols, in that it forwards the search message to the nearest node that potentially stores the desired data item. In fact, MDPF may be regarded as a high-level routing protocol operating on top of a distance-vector routing protocol; together they form a two-layer protocol that works to minimize the response time of a search application by following consecutive shortest paths. The given analysis focuses on providing confidence intervals for the mean distance to reach the node with the desired data and for the distance to traverse all the search nodes. Moreover, it will be demonstrated that MDPF distributes the average load caused by search traffic nearly uniformly among the visited nodes, in spite of their possibly nonuniform caching capacities.
The rest of this paper is organized as follows: Section 2 describes the proposed approach and illustrates it using an example application. Section 3 derives expressions for the system parameters and key performance measures, and presents the analysis results. Section 4 presents the simulations done using the ns2 software. Section 5 provides a short survey of related work, and Section 6 ends the paper with concluding remarks and ideas for future work.
MDPF For Search Applications In Mobile Adhoc Networks
Dept Of C.S.E, A.P.S.C.E 2010-11
Chapter 2
FRAMEWORK OVERVIEW
The idea behind MDPF is to use routing table information for visiting nodes in the order of shortest distance (hop counts). As implied, this requires valid routing information, which could be handled through a proactive routing protocol such as the DSDV protocol or an on-demand reactive routing protocol, like AODV. We make the assumption that the set of nodes that hold the search information is known to all nodes in the wireless network, and we refer to these nodes as the search nodes, or SNs. We emphasize though that a requesting node which is interested in a particular data item does not usually know which specific SN holds the location of the data item, and therefore, it must search for it in the SNs.
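The selection step described above can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the routing table layout (destination mapped to a next hop and a hop count) and the function name `nearest_unchecked_sn` are assumptions made here for the example.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical routing table layout: destination -> (next_hop, hop_count),
# mirroring the next-hop table and distance vector of a distance-vector protocol.
RoutingTable = Dict[str, Tuple[str, int]]


def nearest_unchecked_sn(table: RoutingTable, unchecked: Set[str]) -> Optional[str]:
    """Return the unchecked SN with the fewest hops, or None if no route is known."""
    reachable = [sn for sn in unchecked if sn in table]
    if not reachable:
        return None
    # Ties are broken by name only to keep this sketch deterministic.
    return min(reachable, key=lambda sn: (table[sn][1], sn))


# Example: SN2 has already been checked, so SN3 (2 hops) is chosen over SN1 (3 hops).
example_table = {"SN1": ("B", 3), "SN2": ("C", 1), "SN3": ("B", 2)}
next_sn = nearest_unchecked_sn(example_table, {"SN1", "SN3"})
```

A node that receives the search packet would repeat the same selection with its own routing table and the list of SNs not yet visited.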
Fig. 2.1 Two scenarios for request forwarding: Scenario 1 corresponds to a hit and Scenario 2 describes a miss.
When a proactive routing protocol is employed, MDPF can readily use the routing information to choose, from the set of unchecked SNs, the SN that requires the minimum number of hops. However, when an on-demand routing protocol, such as AODV or Dynamic Source Routing (DSR), is in place, the routing information to the nearest unvisited SN must be discovered on demand if necessary (i.e., if it is not cached, or if it is cached but not fresh) and kept in the routing table for a certain period of time before it expires.
More specifically, when a reactive routing protocol is employed, MDPF works as follows. Each node examines its routing table to determine whether the routing information to the unchecked SNs is present and valid. If so, the node acts as in the proactive case and chooses the SN that takes the minimum number of hops to reach. If the node finds that its routing table does not contain the routing information for one or more unchecked SNs, it broadcasts an SN Discovery Packet (SNDP) containing the list of unchecked SNs and a sequence number to all its neighbors. When a neighbor receives an SNDP for the first time, it checks its routing table for the presence of one or more unchecked SNs. If it knows of such SNs, the neighbor sends the routing information of these SNs to the requesting node. Otherwise, the neighbor broadcasts (forwards) the SNDP to its own neighbors.
To avoid flooding the network with packets, the SNDP contains a hop limit k that denotes the maximum number of hops away from the source that the SNDP can travel. The value of k depends on the network size, the total number of nodes, the transmission range, and the current number of unchecked SNs. For example, for a 1,000 × 1,000 m² network containing 100 nodes, and when the number of unchecked SNs is 7, the network diameter is approximately 14 hops and k could be set to 14/7 = 2 hops (assuming that the SNs are uniformly spread throughout the network). As the number of unchecked SNs decreases, k increases, and vice versa. When this number is 1, k is equal to the network diameter. Finally, the SNDP source node waits for a fixed time (e.g., 0.1 sec), examines the routing information it received for the SNs, chooses the SN that takes the minimum number of hops to reach, and forwards the search packet to it. It also adds the routing information to its routing table for future use.
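The hop-limit rule in the example above (network diameter divided by the number of unchecked SNs) can be written as a small helper. This is a sketch under stated assumptions: the function name `sndp_hop_limit` is hypothetical, and rounding up when the division is not exact is a choice made here, since the text only gives cases that divide evenly.

```python
import math


def sndp_hop_limit(network_diameter_hops: int, unchecked_sns: int) -> int:
    """Hop limit k for an SNDP: diameter divided by the number of unchecked SNs.

    Rounding up is an assumption of this sketch; it keeps k at least 1 and
    makes k equal to the diameter when a single SN remains unchecked.
    """
    if network_diameter_hops < 1 or unchecked_sns < 1:
        raise ValueError("diameter and number of unchecked SNs must be positive")
    return math.ceil(network_diameter_hops / unchecked_sns)


# Values from the example in the text: a 14-hop diameter with 7 unchecked SNs.
k_example = sndp_hop_limit(14, 7)
```

With these inputs the limit grows as SNs are checked off, reaching the full diameter when only one unchecked SN is left.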
Indeed, since the second SN was the closest one to the first, the second SN lies on the boundary of a disk centered at the first SN which is empty of any other SN. As one reaches the third, fourth, and nth SN, the empty area becomes an ever more complicated union of disks, making it difficult to obtain a provably accurate analysis.
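Because the geometry resists exact analysis, path lengths can instead be sampled by simulation. The sketch below is an illustration only: it assumes SNs placed uniformly at random in a unit square and measures Euclidean distance rather than hop count, and the function names are hypothetical.

```python
import math
import random


def nearest_neighbor_path_length(points, start):
    """Length of the greedy path that always moves to the closest unvisited point."""
    remaining = list(points)
    current = start
    total = 0.0
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        total += math.dist(current, nxt)
        remaining.remove(nxt)
        current = nxt
    return total


def sample_path_lengths(num_sns, trials, seed=0):
    """Draw `trials` random layouts in the unit square and record each path length."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    lengths = []
    for _ in range(trials):
        start = (rng.random(), rng.random())
        sns = [(rng.random(), rng.random()) for _ in range(num_sns)]
        lengths.append(nearest_neighbor_path_length(sns, start))
    return lengths
```

Each sampled length is one observation of the random variable whose mean the analysis tries to bound.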
Chapter 3
where NN(i) is the length of the Traveling Salesperson Tour obtained using the Nearest Neighbor heuristic on a problem instance i, Opt(i) is the length of the optimal tour of the problem instance, and n is the number of nodes. Here, the term optimal tour was borrowed from [9] and refers to the simple cycle of the shortest length containing all the nodes. Next, [16, Theorem 2] specifies that if n points are in a unit square, the optimal path length is at most:
So, we can deduce a worst-case bound on the length of the nearest neighbor path in a unit square:
However, it is often the average case which is the most relevant in practice. So, we are now faced with a standard statistical problem: estimating the mean of an unknown probability distribution using sampling. In our case, the probability distribution of the total path length is not known to belong to a well-known family (e.g., Binomial, Poisson, Geometric, Gaussian, etc.). It might be argued that since we will be looking at the distribution of the sample mean, we could infer that it would tend to be a Gaussian law using the central limit theorem. But if we simply rely on this, we would disregard the conditions of validity of the theorem, which would be a rather risky thing to do: there are probability distributions which do not even have a mean, such as defined on And, even for the distributions that do have a mean,
it is difficult to determine the sample size that would guarantee a sample mean distribution acceptably close to the normal. The only tool we have in this regard is the Berry-Esseen theorem [19], which requires knowing at least some bounds on the third moment and on the standard deviation to
be applied. We do not know either of these two values in our case. It would be possible to derive some bounds on them, but if these bounds are too imprecise, the required sample size would become enormous. That is why we finally opted for more direct methods that allow us to derive confidence intervals without making any further assumption. We start by describing a naive approach: we run, say, 1,000,000 experiments and record the smallest and largest values of the obtained path length. We choose two bounds which are, respectively, much below the minimum and much above the maximum. We then run 1,000,000 experiments a few more times. It is quite probable that all the values we obtain will be within our bounds. So, we could obtain a result such as "99.9 percent of all path length values fall within bounds x and y" at the 0.0001 confidence level or even better (one chance in 10,000 of being wrong). We can then obtain a bound on the mean by using the fact that the remaining paths have a bounded length (by Theorems 1 and 2 mentioned above) and that there are only a few of them.
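The confidence claim of the naive approach can be checked with a short calculation. If at least a fraction p of the path lengths fell outside the chosen bounds, the chance that n independent samples all land inside would be at most (1 - p)^n. The sketch below assumes independent samples drawn after the bounds are fixed; the function names are hypothetical.

```python
import math


def prob_all_inside(p_outside: float, n_samples: int) -> float:
    """Upper bound on the chance that all n samples fall inside the bounds
    when at least a fraction p_outside of the distribution lies outside."""
    return (1.0 - p_outside) ** n_samples


def samples_needed(p_outside: float, alpha: float) -> int:
    """Smallest n for which (1 - p_outside) ** n does not exceed alpha."""
    return math.ceil(math.log(alpha) / math.log(1.0 - p_outside))


# For the 99.9 percent statement (p_outside = 0.001) at the 0.0001 level:
n_required = samples_needed(0.001, 1e-4)
```

Under these assumptions, fewer than ten thousand in-bounds samples already support the 0.0001 level, so a run of 1,000,000 samples leaves a very large margin.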
The main problem with this approach is that, in spite of the very high confidence level it guarantees, the bounds obtained are far too imprecise, since, in fact, all that we will have established is a weaker form of the following statement: "We're pretty sure that the mean must be between the shortest path and the longest path we ever obtained." This is why it was necessary to come up with methods that trade off the confidence level for precision. In our work, we obtained our confidence intervals using two methods, which can be combined or used independently.
Since we can make Mi - mi as small as we want, and since we can make the gap between li and ui arbitrarily small provided that we can make the sample size arbitrarily large, we now have a method which, in principle, can yield results as precise as we want. In practice, however, the sample size would become prohibitively large if the precision we require is too great. A problem remains: how are we going to determine the confidence level of our estimation? We recall that a given procedure yields a level of confidence 1 - α if the actual parameter falls within the confidence interval with a probability of 1 - α. We now let A1 and A2 be two events with respective probabilities of 1 - α1 and 1 - α2.
Therefore, the confidence level for the entire set of intervals is the sum of the confidence levels for each interval (here, as above, the "confidence level" is the chance α of being wrong). As expected, our confidence decreases as the number of intervals increases. Thus, it appears that this is one possible method to trade confidence for precision.
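The combination step can be written out with the union bound, a standard result. The notation is assumed here for illustration: A_1 and A_2 are the events that the first and second intervals contain their parameters, and \alpha_1 and \alpha_2 are the probabilities that they fail to do so.

```latex
\begin{align*}
P(A_1 \cap A_2) &= 1 - P\left(\overline{A_1} \cup \overline{A_2}\right) \\
                &\ge 1 - P\left(\overline{A_1}\right) - P\left(\overline{A_2}\right) \\
                &= 1 - (\alpha_1 + \alpha_2), \\
\text{and, for } m \text{ intervals,}\quad
P\Bigl(\bigcap_{i=1}^{m} A_i\Bigr) &\ge 1 - \sum_{i=1}^{m} \alpha_i .
\end{align*}
```

No independence between the intervals is needed for this bound, which is why the chances of error simply add up.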
compared them to Bettstetter and Eberspächer's [2] by reproducing their experiments using our own setup. We obtained a very close agreement (in fact, the values we obtained for the number of hops between two random nodes match theirs to two significant digits, which is the precision with which they chose to present their results).
Fig. 3.1 Confidence interval for the mean number of hops in four cases.
distribution) and Nd is the total number of data items. In this analysis, we let i correspond to the order of SN traversal. That is, we let the nearest SN to the client (i.e., the first contacted SN) be the most probable SN to have the desired data, followed by the next-visited SN, and so on. The Zipf probability density function for Nd = 20 is illustrated in the left part of Fig. 3.2 for different values of the Zipf parameter, where it is seen that as the parameter increases, the probability of finding the data in the nearest SNs becomes increasingly higher. The effect of applying the Zipf distribution to the localization of data on the expected number of hops to get to the SN that holds the data is shown in the right part of Fig. 3.2.
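The behavior described for the Zipf distribution can be reproduced numerically. The sketch below uses the usual Zipf-like form, in which the probability of rank i is proportional to 1 / i^theta; the parameter name `theta` and the function name `zipf_pmf` are assumptions of this example.

```python
from typing import List


def zipf_pmf(num_items: int, theta: float) -> List[float]:
    """Zipf-like probabilities for ranks 1..num_items: P(i) proportional to 1 / i**theta."""
    weights = [1.0 / (rank ** theta) for rank in range(1, num_items + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]


# Probability mass on the first rank for Nd = 20 and a few parameter values.
first_rank_mass = {theta: zipf_pmf(20, theta)[0] for theta in (0.0, 0.5, 1.0)}
```

With theta = 0 the distribution is uniform, and the mass on the first ranks grows as theta increases, which matches the trend described for the nearest SNs.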
Fig. 3.2 Property of the Zipf pdf and its effect on the mean number of hops to desired data.
Chapter 4
since the list of SNs may be accessed in any order. For this purpose, we define a function that gives the probability that SNi will be accessed (or have a request forwarded to it) given that it is in position n. As explained earlier, this probability depends on the cache size of all nodes that follow SNi. However, since the next nodes are considered to be random, an expected total cache size must be determined. Since there is no a priori knowledge of the positions of the other nodes in the sequence, their size is estimated using the expected cache size of the other nodes. This is determined as follows (N stands for NSN):
We then multiply this value by the number of nodes that follow SNi and add Ci to get the total expected cache size of node SNi together with all the nodes that follow it. Dividing the resultant value by the total cache size of the system gives the probability defined above, as follows:
Finally, since the position of SNi is assumed to be uniformly random, the probability of it being accessed is given by averaging this function over all values of n:
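The three steps described in words above can be sketched numerically. This is a reconstruction under stated assumptions, not the paper's own formulas: the expected cache size of the other nodes is taken as their total size divided by N - 1, positions n run from 1 to N, and the function names are hypothetical.

```python
from typing import Sequence


def access_probability_at(cache_sizes: Sequence[float], i: int, n: int) -> float:
    """Probability that SN i is accessed given that it sits at position n (1-based)."""
    total = sum(cache_sizes)
    num = len(cache_sizes)
    # Assumed estimate: average cache size of the N - 1 other SNs.
    expected_other = (total - cache_sizes[i]) / (num - 1)
    followers = num - n  # number of SNs visited after SN i
    return (cache_sizes[i] + followers * expected_other) / total


def access_probability(cache_sizes: Sequence[float], i: int) -> float:
    """Average of access_probability_at over all equally likely positions n."""
    num = len(cache_sizes)
    return sum(access_probability_at(cache_sizes, i, n) for n in range(1, num + 1)) / num


# One SN with double capacity among four SNs: loads of about 0.7 and 0.6.
sizes = [2.0, 1.0, 1.0, 1.0]
loads = [access_probability(sizes, i) for i in range(len(sizes))]
```

Under these assumptions the average simplifies to 0.5 + Ci / (2 × total cache size), so the load of every SN approaches 0.5 as the number of SNs grows.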
Fig. 4.1 Fraction of load per SN for two different storage capacity cases.
The expression in (11) is plotted in Fig. 4.1, where one SN has twice the cache size of the others, which, in turn, all have the same size. The curves illustrate the load trends for the SN with double the capacity and for any of the other SNs as the number of SNs increases. As shown, the load starts high, especially for the double-capacity SN, and then decreases toward a lower bound. The curves illustrate that beyond a certain number of SNs, the benefit in terms of lessening the load becomes insignificant, and also show that the lower limit of the load per SN is 0.5 when there is a large number of SNs.
Chapter 5
EXPERIMENTAL EVALUATION
To experimentally evaluate MDPF, we implemented it and two other techniques to which we compare it using the ns2 network simulation software. The other techniques are the Random Packet Forwarding (RPF) and Minimal Spanning Tree Forwarding (MSTF) which we describe in the section after next. This section presents the results and illustrates their significance.
Fig. 5.1 A sample MST connecting the SNs and a request traversing the SNs.
In the example of Fig. 5.1, a request is sent to SN1, then to SN2, SN4, SN5, and SN6 along the edges of the MST. At SN6, the request needs to be forwarded to the next unvisited SN along the MST. However, no such SN exists. Hence, SN6 will forward the request to one of the remaining unvisited SNs (SN3 and SN7) along routing paths available from the routing protocol. If such paths do not exist, SN6 will send the request along the reverse path it came from (i.e., to SN5, SN4, then SN2) until the request reaches an SN that has a path to one of the remaining unvisited SNs. Note that the reverse path can be determined from the order of visited SNs in the request packet. In MSTF, even though an MST links all its nodes with the least total number of hops hMST, the total number of hops traversed by the packet could be greater than hMST due to the aforementioned condition, as illustrated in the example of Fig. 5.1. However, we will illustrate later in this section that MSTF might produce better average search times when the time to search for the data item at an SN is significantly high.
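For reference, the tree that links the SNs with the least total number of hops (hMST) can be computed with Prim's algorithm, a standard technique. The sketch below is only an illustration and is not taken from the evaluated implementation: the hop-count map and the function name `minimum_spanning_tree` are hypothetical, and pairwise hop counts between SNs are assumed to be known and symmetric.

```python
import heapq


def minimum_spanning_tree(hops):
    """Prim's algorithm over a symmetric map hops[a][b] = hop count between SNs a and b.

    Returns the list of tree edges and the total hop count of the tree.
    """
    nodes = sorted(hops)
    start = nodes[0]
    visited = {start}
    frontier = [(h, start, b) for b, h in hops[start].items()]
    heapq.heapify(frontier)
    edges, total = [], 0
    while frontier and len(visited) < len(nodes):
        h, a, b = heapq.heappop(frontier)
        if b in visited:
            continue  # this edge would close a cycle
        visited.add(b)
        edges.append((a, b))
        total += h
        for c, hc in hops[b].items():
            if c not in visited:
                heapq.heappush(frontier, (hc, b, c))
    return edges, total


# Hypothetical hop counts between three SNs.
example_hops = {
    "SN1": {"SN2": 1, "SN3": 4},
    "SN2": {"SN1": 1, "SN3": 2},
    "SN3": {"SN1": 4, "SN2": 2},
}
tree_edges, h_mst = minimum_spanning_tree(example_hops)
```

A request that backtracks along the tree, as in the example above, travels more hops than this total, which is the effect the text points out.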
5.3 Results
To be consistent with the results presented in Section 3.5, we computed the lower and upper bounds of the mean hop count for each scenario, in addition to the average taken over all sample values. As was described above, each experiment comprised at least 27,000 points. To compute the bounds, we used a procedure that is similar in principle to Method 2 (see Section 3.1.2) by dividing the sample space into 54 groups, each consisting of about 500 samples. For each group, the average value was computed, and then, the lower and upper bounds were taken as the lowest and highest means, respectively, across all groups. In addition to the bounds, the overall average was taken over all 27,000 points.
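The bounding procedure described above can be sketched as follows. The function name `group_mean_bounds` is hypothetical, and samples are assumed to be grouped in the order in which they were collected.

```python
from statistics import mean


def group_mean_bounds(samples, num_groups):
    """Split samples into equal consecutive groups and return
    (lowest group mean, overall mean, highest group mean)."""
    size = len(samples) // num_groups
    if size == 0:
        raise ValueError("not enough samples for the requested number of groups")
    group_means = [
        mean(samples[g * size:(g + 1) * size]) for g in range(num_groups)
    ]
    return min(group_means), mean(samples), max(group_means)


# Tiny illustration: two groups, [1, 1] and [3, 3].
lower, average, upper = group_mean_bounds([1, 1, 3, 3], 2)
```

For the setup in the text, the call would use 54 groups over 27,000 hop-count samples, giving groups of 500.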
Fig. 5.2 Mean number of hops and total search times when DSDV routing is used.
requested data item. On the other hand, when considering the local search time (Tls) at each SN, MSTF will take less total time when Tls is greater than 10 milliseconds, as illustrated in the bottom graph of Fig. 5.2. This is because the savings in the cumulative forwarding time become outweighed by the much larger total search time. Note that the forwarding time in the three systems was set to 5 milliseconds, which is the average communication time between two nodes.
Finally, Fig. 5.3 shows the total number of messages generated during the simulation time by the three systems for different numbers of SNs. MDPF generates the fewest messages (requests, replies, and control packets), because the nearest SN is always chosen. Furthermore, the number of messages successfully reaching their destinations is higher in MDPF than in RPF, which is why the number of received messages for MDPF is higher than that of RPF. Finally, MSTF generates a large number of control messages (MST and routing table messages) that are sent periodically, which is why the number of originated, received, and forwarded messages is much greater in MSTF.
Fig. 5.3 Total number of originated, received, and forwarded messages when DSDV is used.
Fig. 5.4 Total number of originated, received, and forwarded messages when AODV is used
while nodes in different zones have other offset values. For instance, if a node in zone i generated a request for data item id following the original Zipf-like access pattern, then the new id would be set to where nq is the database size. This access pattern ensures that nodes in neighboring grids have similar, although not necessarily identical, access patterns. The effect of varying the value of the parameter on the number of hops to get to the SN that knows where the data reside is illustrated in Fig. 5.5.
Fig. 5.5 Effect of varying the request popularity and locality of space on average hop count.
Two sets of experiments were run, one with locality of space enabled and another without it. Clearly, both graphs illustrate a saving in hop count that increases with the number of SNs. Intuitively, locality of space helps in shortening the path to the SN that holds a reference to the requested data item, a fact that is confirmed when comparing the left graph to the right one.
Chapter 6
RELATED WORK
Several works have tackled the problem of traversing a certain set of nodes in a network according to some given criteria. We begin with Espes and Mammeri, who propose in [7] an adaptive expanding search method in which nodes in the network determine their locations using a Global Positioning System (GPS). The route request packet is sent only to nodes within a certain triangle whose vertex is S (the source node), whose height is SD plus an offset, and whose angle is given (where SD is the distance between the source and destination nodes, while the offset and the angle are parameters). The source first sends a route request within the search triangle, and then sends another route request with new (greater) values after a certain timeout (hence expanding the search triangle), and so on. The reported results show that the proposed system always returns a valid route after a given number of attempts.
An approach to using Distributed Hash Tables (DHTs) in distributed applications within MANETs was proposed in [14]. The approach includes two methods. The first uses DHTs on top of a MANET routing protocol, and hence requires provisions for the exchange of communication and control messages between the DHT algorithm and the routing protocol. The second method integrates the DHT into the routing protocol itself, such that the next destination for each message is obtained from the DHT (for maximum efficiency), while the routing path of the message is obtained from the routing protocol. The authors do not describe in detail how a DHT can generally be integrated into a particular routing protocol (for example, how different DHTs can be incorporated into different types of routing protocols, such as proactive, reactive, and geographic ones).
Instead, they concentrate on Ekta, a protocol they implemented that integrates the Pastry DHT into the DSR protocol. The authors also present an application which uses Ekta for discovering resources in a MANET, such as a specific application on a node or a given type of node. The main difference between such an approach and MDPF is that MDPF is a general algorithm that is weakly coupled to the underlying routing protocol, in that it only uses its services to find the path to the next unvisited search node. In contrast, there is a strong coupling between the DHT approach and the routing algorithm. For instance, the approach requires defining how a specific DHT is integrated into a specific routing protocol.
The work in [8] seeks to maintain consistency of service provider information that is cached at different points within the wireless network. Toward this goal, the approach calls for integrating service discovery functionality with on-demand routing. This is basically achieved by including the information about the required service in a header that is attached to the routing packet (for example, the RREQ message in AODV). If a route to the destination Service Provider (SP) is not known, the packet is broadcast to the network until an intermediate node knows the required route or knows the route to an alternate service provider that provides the same service type, or until it reaches the SP itself. The reply packet follows the reverse path taken by the request packet (using a technique similar to the gratuitous flag in AODV). Given that the sought service bindings may be cached at multiple intermediate nodes, the proposed approach experimentally evaluates the trade-off between forwarding a packet to the nearest SP (measured in hops) and sending it to the SP with the most up-to-date (i.e., the freshest) information.
network topology, which is typically multi-hop, may change randomly and rapidly at unpredictable times.
2. Bandwidth constrained links: Wireless links have significantly lower capacity than their hardwired counterparts. They are also less reliable due to the nature of signal propagation.
3. Energy constrained operation: Devices in a mobile network may rely on batteries or other exhaustible means as their power source. For these nodes, the conservation and efficient use of energy may be the most important system design criteria.
The MANET characteristics described above imply different assumptions for routing algorithms, as the routing protocol must be able to adapt to rapid changes in the network topology. They also present different optimization parameters such as bandwidth overhead and energy usage.
Chapter 7
CONCLUSION
This paper described a data search algorithm for use in mobile ad hoc networks. The technique, which we called MDPF, minimizes the total distance (hop count) taken by the search packet to traverse the set of mobile search nodes while using local routing information found on the nodes. This was demonstrated through reliably obtained performance results that were compared to those of two other search techniques, namely, RPF and MSTF. The proposed algorithm, whose performance the paper analyzes and evaluates, may be regarded as being specific to MANETs since it accounts for their different dynamic aspects. This does not change the fact that the analysis carried out is valid for other types of networks. Although the search method itself is not 100 percent original, the approach is justified by the need to have provably reliable estimates. The value of this approach is that the only assumption used to derive the confidence intervals is that the employed pseudorandom generator is a good one, while other statistical approaches assume that the sample size used by the simulation is sufficient to make the difference between the sample mean distribution and the normal distribution negligible, with no evidence to back up this assumption.
REFERENCES
[1] T. Andel and A. Yasinsac, "On the Credibility of Manet Simulations," Computer, vol. 39, no. 7, pp. 48-54, July 2006.
[2] C. Bettstetter and J. Eberspächer, "Hop Distances in Homogeneous Ad Hoc Networks," Proc. IEEE Vehicular Technology Conf., vol. 4, pp. 2286-2290, 2003.
[3] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, "Web Caching and Zipf-Like Distributions: Evidence and Implications," Proc. IEEE INFOCOM, pp. 126-134, 1999.
[5] J. Broch, D. Maltz, D. Johnson, Y. Hu, and J. Jetcheva, "A Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols," Proc. Fourth Ann. ACM/IEEE Int'l Conf. Mobile Computing Networking, pp. 85-97, 1998.
[4] C. Clopper and E. Pearson, "The Use of Confidence or Fiducial Limits Illustrated in the Case of the Binomial," Biometrika, vol. 26, pp. 404-413, 1934.
[5] D. Espes and Z. Mammeri, "Adaptive Expanding Search Methods to Improve AODV Protocol," Proc. 16th IST Mobile Wireless Comm. Summit, pp. 1-5, July 2007.
[6] M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman Publisher, 1979.
[7] H. Pucha, S.M. Das, and Y.C. Hu, "Ekta: An Efficient DHT Substrate for Distributed Applications in Mobile Ad Hoc Networks," Proc. Sixth IEEE Workshop Mobile Computing Systems Applications (WMCSA 04), Dec. 2004.
[8] S. Vural and E. Ekici, "Analysis of Hop-Distance Relationship in Spatially Random Sensor Networks," Proc. Sixth ACM Int'l Symp. Mobile Ad Hoc Networking Computing, pp. 320-331, 2005.
[9] Wikipedia, http://en.wikipedia.org/wiki/Order_statistic, 2009.
[10] G. Zipf, Human Behavior and the Principle of Least Effort. Addison-Wesley, 1949.