Comparison of User Mobility Pattern Prediction Algorithms to increase Handover Trigger Accuracy

Stefan Michaelis, Christian Wietfeld
stefan.michaelis@uni-dortmund.de, christian.wietfeld@uni-dortmund.de
Communication Networks Institute, University of Dortmund, Germany
Abstract- Estimating correct handover triggers in cellular networks is critical for reliable network operation. Seamless handover becomes even more important as the number of available radio access technologies grows and reliable vertical handover is required. To aid the handover process, mobility prediction techniques are gaining interest, because they make it possible to prepare for a handover in advance. The approaches presented here use prediction of macro-mobility as an additional measure to aid handover decisions. The prediction is based on statistical data gained from observing movement across multiple cells. A main feature is the generic, user-centric calculation of the most likely next hop during user movement, in contrast to network-specific technological methods.

0-7803-9392-9/06/$20.00 (c) 2006 IEEE

Mobility of users with seamless accessibility, without the need to care about the underlying topology, is the most popular feature of wireless networks. To guarantee a seamless service, however, the correct estimation of when to trigger a handover is critical.

Different research approaches try to use movement predictions as an addition to classical handover triggers. They vary from statistical analysis ([9], [3]) up to complex pattern detection algorithms ([7]). In the majority of cases these methods target specific networks and input parameters, and the underlying methodology cannot be generalized to heterogeneous networks. In this paper we compare different generic pattern detection algorithms regarding their performance and propose methods to further enhance their quality, enabling predictions to be a reasonable addition to handover triggering.

To evaluate the prediction approaches presented in this paper, we explore the overall success rate of the predictions on simulation-generated trace data. This analysis is accompanied by a detailed per base station analysis of prediction accuracy and an analytical investigation of the underlying simulation setup. The question to be examined is how accurate the predictions of a user's future base stations can be and how strong the impact of the input data is. For prediction to be an addition to classical handover measures, the algorithms have to deliver stable results under varying conditions.

I. GENERATION OF MOBILITY PATTERNS

Besides capturing short-term (micro) movements, longer and repeating trends (macro inter-cell movement) in user movement enable management of the networks on a regular basis. Generating a sufficient amount of movement traces is the key driver for successful training of mobility detection and position prediction algorithms. Historical and simulation data about user movements provide the input for prediction algorithms trying to detect regularities. In the case of simulation, the scenario parameters are of major importance, because they define the degree of regularity inside the generated traces, as the users are bound to streets and areas.

A. Movement Models

One of the most widespread and general mobility models is the so-called Random Walk model. In this model the users follow no particular rules; their movement is completely independent of position, other users or movement history. This model is easy to implement and easy to parameterise, but obviously contains no detectable patterns inside the generated movement traces. Nevertheless, the model is still useful for producing noise to test the robustness of the applied pattern detection methods.

The Gravity Model (i.a. in [5]) assigns values indicating a given level of attractiveness to certain areas. The higher the attractiveness of an area, the higher the probability that a user will try to reach this area. This model provides a balanced mixture of deterministic and random parts. The first few users of a large group of users, all heading to an attractive area, can generate detectable movement patterns, which are then used to predict location updates for the later users. In contrast, the stochastic nature of area attractiveness means that only a subset of all users residing inside the simulation system travels to these areas. The remaining users can be considered as noise to the pattern detection algorithms and serve as a stability test of the prognosis accuracy.

Finally, the last mobility model outlined in this context is the Path Following Model ([5]). This model gives a sequence of areas to reach or cross during mobile movement. Depending on the selection of the waypoints, certain degrees of determinism can be achieved. Additionally, this model complements the gravity model: the target area is assigned to a user by the gravity model, and the path following model then leads the user to this area. When the area is reached, the gravity model can select a new target again.

B. Available Data for Prediction

During tracking of the user movements, different granularities of available information are expected: connectivity to base stations, estimated exact location (GPS, triangulation, timing advance etc.) or signal quality measurements. The main problem here arises for heterogeneous networks, where vertical handovers may occur and not all types of data may be available. This leads to the precondition that a stable prediction algorithm is either insensitive to missing data in each trace or capable of delivering good results even on minimal available data such as pure entered/exited base station events. For the research carried out here we concentrate on the smallest common denominator: entered/exit traces of base stations. This keeps the examinations independent of a specific serving mobile network (comparing differences of GSM to WLAN) and of capabilities needed in the end user equipment (GPS hardware).

Input data for location prediction algorithms is generated by tracking the sequences of traversed base stations while the users move. The length of the sequences depends on the usage of applications, e.g. if the user shuts down the mobile equipment before completing the whole path to his target, this will result in shorter sequences and less unique path patterns. In contrast to most other approaches targeting mobility prediction, these sequences are kept independently of the generating user. This generally complicates the prediction process, as users cannot be treated individually, but greatly reduces the amount of stored data and allows topology-driven pattern detection. The next section presents how the mobility models used for pattern generation are used in a specific scenario.

II. EVALUATION ENVIRONMENT AND REFERENCE SCENARIO

This section presents and explains a reference scenario, which is used as the evaluation environment for testing pattern detection algorithms.

A. Network and Geographical Topology Setup

Figure 1 illustrates a simple example topology scenario for user mobility. Six user areas are connected with streets, generating a topology of a shape similar to typical exhibition centers. Asymmetry is achieved by assigning different levels of attractiveness (weights) to the pavilions. While moving, the users are bound to the streets connecting the areas (pavilions). The areas and streets forming the geographical model are covered by a series of base stations. We normally assume circles as approximations for the range of the stations, in contrast to the widespread hexagons, because actual mobile networks favour soft handovers or macrodiversity over hard handovers without overlapping regions. The base stations are labeled in the notation XY, e.g. TL=T(op)L(eft), for later reference. This scenario has its main challenge for prediction at the crossing sections in the middle areas.

Fig. 1. Asymmetric double T-shaped scenario. Start areas: A (random), D (path), F (gravity).

B. Topology population

For the specific simulation setup used in the following section to analyze the prediction algorithms and results, three different user models populate the scenario. The mobile starting at the bottom left area D uses a semi-deterministic path following model, simulating typical commuter behavior. The target area for this mobile is the upper right area. This user would for example generate traces like BL, BL2, BM, M2, ... After reaching the target area, the mobile stays in this area and is allowed to move randomly inside its bounds. This adds a slight amount of noise to the purely deterministic behaviour, because the mobile repeatedly enters and leaves the coverage of the last base stations. After the maximum duration for staying in this area has elapsed, the mobile user moves back to the starting area.

The second user starts at the lower right area F, following a gravity based model. The upper left and right areas have associated weights to set an attractiveness value for these areas. The current target is chosen randomly and uniformly distributed between these two areas. Again, this model leads the user back to the starting area after the maximum stay time has expired.

While these two users are rather easy to track because of their high level of determinism, the third user, starting in area A, introduces noise by following a random movement model. While this model is of no use for any prediction algorithm, it is an admirable choice for testing the stability and performance of the algorithms.

III. ANALYSIS OF MOBILITY PATTERN PREDICTIONS

Pattern recognition algorithms from the field of computational intelligence offer one possibility to detect repeating behavior in user movement. In this section we present a selection of different pattern detection methods and discuss their benefits for prediction of the user's next base station.

A. Pattern detection algorithms

Several algorithms for pattern recognition are compared: decision trees (DT), instance based (IB) nearest-neighbour algorithms (similar to the dictionary approach used in [2]) and support vector machines (SVM) ([11], comparable to neural networks in [7]).
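The entered/exit traces of base stations described in Section I-B have to be turned into fixed-length input for a pattern detector. A minimal sketch of one way to do this is given below, assuming a sliding window over each trace; the function name `training_pairs`, the parameter `history_len` and the example trace (beyond its first four stations) are illustrative assumptions, not part of the paper's setup.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[Tuple[str, ...], str]


def training_pairs(trace: Sequence[str], history_len: int) -> List[Pair]:
    """Slide a window over one trace of entered base stations.

    Each pair holds the last `history_len` stations (the history)
    and the station that was entered next (the prediction target).
    """
    if history_len < 1:
        raise ValueError("history_len must be at least 1")
    pairs: List[Pair] = []
    for end in range(history_len, len(trace)):
        history = tuple(trace[end - history_len:end])
        pairs.append((history, trace[end]))
    return pairs


# Hypothetical trace of the path-following user; the station order after
# M2 is an assumption made only for this illustration.
example_trace = ["BL", "BL2", "BM", "M2", "M", "TM", "TR2", "TR"]

for history, target in training_pairs(example_trace, history_len=3):
    print(history, "->", target)
```

Because the pairs carry no user identifier, traces from all users can simply be pooled, which matches the user-independent storage of sequences. A trace shorter than `history_len + 1` stations yields no pairs, which reflects the remark that early shutdown of the equipment leads to fewer usable patterns.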

Decision trees (e.g. C4.5 or ID3, [8]) are generated using the so-called training set, which is a set of movement sequences as described in the former sections. The trees define a set of rules: based on the values at the different positions inside the movement sequence, a path down to the last node (leaf) in the tree is followed. The associated value is the predicted class, in this context the next target base station of the user.

IBk, a k-nearest neighbour algorithm, belongs to the class of the so-called lazy algorithms. Lazy algorithms usually keep the input sequences unprocessed and try to find the best matching sequence and resulting target base station by calculating the distance to up to k stored sequences. This approach can be found in a modified form in [2] for paging of mobile equipment. While this algorithm is incredibly fast for data collection, the amount of time needed to find the best matching sequences grows with the amount of collected data.

The third algorithm used for prediction, SMO, is a variant of the class of support vector machines (SVM). Support vector machines try to classify the data by finding hyperplanes separating the trace data into subsets for target prediction. The choice of so-called kernel functions allows the transformation of the original data into another space, enabling data classification through many freeform surfaces instead of only hyperplanes. This class of algorithms is additionally interesting because the choice of an appropriate kernel function provides the SVM with the same capability as the popular neural networks (compare [7]). The main disadvantage of the SVM is the computationally very intensive generation of the prediction classifier.

B. Identification of Prediction Accuracy per Base Station

Table I presents the so-called Confusion Matrix as the result of running a standard ten-fold stratified cross validation test for the base station predictions, showing the prediction accuracy as the ratio of wrong versus correct predictions. Predictions have been performed using a decision tree algorithm ([11]). The amount of random walk users for pattern generation has been set to 33%; the rest of the users used either the gravity or the path following model. For the specific confusion matrix in table I a historical sequence length of 3 has been used. The results in the matrix are based on historical data for which the next base station is known and which has not been used for decision tree generation and hence is new to the predictor.

TABLE I
EXAMPLE CONFUSION MATRIX RUNNING DECISION TREE PREDICTION ALGORITHM
(rows: real class; columns: predicted class; classes: BR, TR, M2, TM, BM, BL, M, TL, TR2, TL2, BR2, BL2)

Each row is associated with the real class (base station) the user moved into from the current base station area, each column with the predicted class. The absolute number of predictions has been normalized along each column to [0,1]. The main diagonal of the table therefore contains the percentage of correct predictions, leading to an overall accuracy of about 85%.

The confusion matrix enables a detailed view of problematic spots in the prediction scenario, where an algorithm acts unstable and standard handover triggers should be applied. The most interesting regions occur where the predictions performed either very well (e.g. BL) or very poorly (e.g. TR2). The worst predictions appear where the number of traces generated by random users outweighs the others. Manual investigation of the raw trace data reveals the reasons for this: because of the asymmetry chosen for the scenario parameters (33% randomness, different user speeds), different amounts of traces have been generated for each part of the topology.

C. Analytical Benchmarking of Predictions

The transition rate $\lambda$ for handovers to cell $x$ can be estimated following a Markovian model on the basis of the last known cell $n$. Focusing as an example on the base station TM, which allows handover to three neighbouring cells (TL2, TR2, M), as described in section II, leads to three values per user: for the random user of course $\lambda_{TM}(x) = \frac{1}{3}$ to each neighbouring cell, as well as for the gravity model user; for the path following user, heading for a certain user area and back, $\lambda_{TM}(x) = \frac{1}{2}$ for the lower and right cell and 0 for the left cell. Remembering that we wanted to keep any prognosis independent of the specific user, the aggregated probabilities transform to $\lambda_{TM}(x) = \frac{4}{18} = 0.22$ for the left and $\lambda_{TM}(x) = \frac{7}{18} = 0.39$ each for the right and the lower cell. This approximation makes it possible to calculate transition and cell residence probabilities for TM and each neighbouring cell.
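The aggregation of the per-user transition rates at TM, and the residence probabilities that follow from it, can be checked with a few lines of exact arithmetic. The sketch below is a hedged illustration: it assumes that the three user models are weighted equally (which reproduces the values 4/18 and 7/18) and that each neighbouring cell hands users back to TM with rate 0.5, the factor used in the balance equations of this section. The names `PER_MODEL`, `aggregate_rates` and `residence_probabilities` are introduced only for this example.

```python
from fractions import Fraction
from typing import Dict

NEIGHBOURS = ("TL2", "TR2", "M")  # left, right and lower neighbour of TM

# Transition rates out of TM per user model, as given in the text.
PER_MODEL: Dict[str, Dict[str, Fraction]] = {
    "random": {cell: Fraction(1, 3) for cell in NEIGHBOURS},
    "gravity": {cell: Fraction(1, 3) for cell in NEIGHBOURS},
    "path": {"TL2": Fraction(0), "TR2": Fraction(1, 2), "M": Fraction(1, 2)},
}


def aggregate_rates(per_model: Dict[str, Dict[str, Fraction]]) -> Dict[str, Fraction]:
    """Average the rates over all user models (equal weighting is an assumption)."""
    count = len(per_model)
    return {
        cell: sum(rates[cell] for rates in per_model.values()) / count
        for cell in NEIGHBOURS
    }


def residence_probabilities(
    rates_out: Dict[str, Fraction],
    return_rate: Fraction = Fraction(1, 2),  # assumed rate from a neighbour back to TM
) -> Dict[str, Fraction]:
    """Solve P_i * return_rate = P_TM * rate_i together with sum(P) = 1."""
    weights = {cell: rate / return_rate for cell, rate in rates_out.items()}
    p_tm = 1 / (1 + sum(weights.values()))
    probs = {"TM": p_tm}
    for cell, weight in weights.items():
        probs[cell] = weight * p_tm
    return probs


rates = aggregate_rates(PER_MODEL)
probs = residence_probabilities(rates)
for cell, rate in rates.items():
    print(f"rate TM -> {cell}: {rate} ({float(rate):.2f})")
for cell, prob in probs.items():
    print(f"P({cell}) = {float(prob):.3f}")
```

Under these assumptions the aggregated rates come out as 2/9 (about 0.22) towards TL2 and 7/18 (about 0.39) towards TR2 and M, and TM receives the largest residence probability of the four cells considered, 1/3. Only this partial model around the T-crossing is covered; the rest of the topology is ignored.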

The probabilities can be easily calculated by solving the equations for statistical balance:

\[
\begin{aligned}
P_{TL2} \cdot 0.5 &= P_{TM} \cdot 0.22 \\
P_{TR2} \cdot 0.5 &= P_{TM} \cdot 0.39 \\
P_{M} \cdot 0.5 &= P_{TM} \cdot 0.39 \\
(P_{TL2} + P_{TR2} + P_{M}) \cdot 0.5 &= P_{TM} \cdot (0.22 + 2 \cdot 0.39) \\
P_{TL2} + P_{TM} + P_{TR2} + P_{M} &= 1
\end{aligned}
\]

Please notice that, for the sake of simplicity, this is only a subset of the equations needed to describe the whole scenario. Only the direct neighborhood of the T-crossing at base station TM is observed.

Fig. 2. Partial Markovian Model of base station transitions

The calculated state probabilities show, as expected, that the center of the T-crossing TM has the highest probability of active users being served by this cell. As the predictions of user mobility could be used for reservation techniques, the most interesting cells are those with high loads of users. The higher the state probability for a certain base station, the more important is the accuracy of the predictions made for this cell. Therefore, the poor prediction accuracy of 60% for TR2 is less relevant than the result for TM (83%). For successful handovers it is much more important to gain good prediction results for users moving into high load cells than for users leaving these cells.

The analytical transition rates calculated using knowledge about the movement models allow comparison with the success rate of the pattern detection algorithms as discussed through the Confusion Matrix. Complementing this, calculating state probabilities allows identification of the most important parts of the network for prediction.

D. Summarized prediction results

As expected, the overall quality rises the longer the historical sequences are (see figure 3). This trend in accuracy can be divided into three phases. For a sequence length of 1, i.e. only the current base station as input for the algorithms, all algorithms perform poorly, with a rate just marginally above 50%. A boost in accuracy can be seen for sequence lengths of two or greater, allowing the algorithms to at least detect the direction the users are heading for and raising performance for all three algorithms to about 80%. For sequence lengths up to six base stations the accuracy stays roughly the same, with the exception of falling below 80% for the SVM at a length of four. The third interesting phase in the figure can be seen for lengths of seven and eight base stations traversed, which allow the algorithms to differentiate the random from the recurring movements and raise the accuracy for all algorithms above 90%.

Fig. 3. Prediction accuracy dependent on length of historical data (abscissa: sequence length [# transitions])

It is important that the overall effect showing these three phases in pattern detection quality can be observed for all three of the completely different algorithms. This effect could lead to the conclusion that there is no relevant difference for the selection of the algorithm to prefer. But as figure 3 only illustrates the general accuracy for all predictions made, outliers for certain predictions are not visible; they do exist for some base stations, as can be observed in the confusion matrix. Remembering the poor results for base station TR2 in table I, investigating this cell in detail for longer sequences also showed a positive effect, resulting in 71% accuracy for lengths of 5 and 79% for lengths of 8.

E. Selective Prediction Validations and Enhancements

Post-prediction filters using a-priori topology knowledge can further enhance overall prediction quality. For this scenario, post-filtering can be explained by looking at the matrix column where BM was predicted. Evaluation on the test data showed an accuracy of 82% for BM, while in 11% of the predictions M2 and in 7% M was the correct base station. Comparing this to the topology in figure 1 indicates that in cases where M2 was the correct next target base station, the prediction BM can obviously not be correct, so the prediction can be rejected instantly. This kind of validation of predictions can be used either to fall back to standard triggers for handover or to execute alternative prediction algorithms to get alternative estimations and reduce the overall rate of mispredictions.

To overcome limitations caused by outliers of single algorithms we propose a hybrid approach of multiple algorithms in parallel.
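The two enhancements of this section, topology-based post-filtering and running several predictors in parallel, can be combined in a small sketch. It assumes a two-out-of-three agreement rule for the parallel predictors and an adjacency excerpt around the T-crossing; the adjacency entries for M and M2 are assumptions inferred from the example trace and the TM neighbourhood, and all function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional, Sequence, Set

# Hypothetical adjacency excerpt (assumed, not the full scenario topology).
ADJACENT: Dict[str, Set[str]] = {
    "TM": {"TL2", "TR2", "M"},
    "M": {"TM", "M2"},
    "M2": {"M", "BM"},
}


def topology_filter(current: str, predicted: Optional[str],
                    adjacent: Dict[str, Set[str]] = ADJACENT) -> Optional[str]:
    """Keep a prediction only if it is a known neighbour of the current cell."""
    if predicted is not None and predicted in adjacent.get(current, set()):
        return predicted
    return None


def majority_vote(predictions: Sequence[Optional[str]],
                  min_votes: int = 2) -> Optional[str]:
    """Return the cell predicted by at least `min_votes` predictors, else None."""
    votes = Counter(p for p in predictions if p is not None)
    if not votes:
        return None
    winner, count = votes.most_common(1)[0]
    return winner if count >= min_votes else None


def hybrid_predict(current: str,
                   predictions: Sequence[Optional[str]]) -> Optional[str]:
    """Filter each predictor's output by topology, then take the majority.

    None means no trustworthy prediction; the caller would then rely on
    the standard handover triggers alone.
    """
    filtered = [topology_filter(current, p) for p in predictions]
    return majority_vote(filtered)


# Outputs of three predictors (e.g. DT, IBk, SVM) for a user currently in M.
print(hybrid_predict("M", ["BM", "M2", "M2"]))  # BM is not adjacent to M
print(hybrid_predict("M", ["BM", "BM", "M2"]))  # only one valid vote remains
```

Filtering before voting is a design choice of this sketch: a topologically impossible prediction never counts as a vote, so two predictors agreeing on an unreachable cell cannot win.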

To examine this effect and to present a method to compensate for the degradation of quality, we tested the application of the three pattern detection algorithms in parallel and accepted only predictions where two or more of the algorithms delivered the same result. This means that for each value of the abscissa, the results of all three algorithms have been combined for the final prediction, while before they have been treated independently (fig. 3). Three schemes are evaluated and the results are shown in figure 4. The ordinate shows the gain (i.e. the percentage by which the error rate could be reduced) which could be achieved by using three predictors in parallel for the different sequence lengths.

Scheme 1 shows a simple voting mechanism, where all algorithms used data of the same sequence length, letting each algorithm vote for its prediction. This scheme shows a continuous rise, which means that up to 30% of the errors could be eliminated using all three algorithms together.

Scheme 2 kept data of sequence length 1 fixed as input to the SVM algorithm, which means having one algorithm with a low quality of nearly 50% errors (see fig. 3) inside the voting scheme. For each value along the abscissa the other two algorithms delivered the same results for voting as taken for the single predictions. In spite of the worse predictions by the SVM, figure 4 shows how this effect can be compensated by the other two algorithms.

Scheme 3 kept data of sequence length 2 fixed as input to the SVM, so that the overall voting process benefits from the better results provided by the SVM using data with sequence length 2 compared to the worse accuracy using length 1. Error reduction is highest for this scheme, with an improvement of up to 45% compared to scheme 2. The great impact of the better results provided by the SVM could be seen for sequence length 1, where the poor prediction results of the other two algorithms could be partially compensated.

Fig. 4. Reduction of error rates through majority voting predictions (abscissa: sequence length [# transitions]; curves: Scheme 1, Scheme 2, Scheme 3)

IV. CONCLUSIONS

Prediction approaches to gain knowledge about future user positions can help to aid the triggering of cell handovers. The approach for mobility prediction presented here is reduced to the most common denominator available for a multitude of heterogeneous networks, i.e. cell residence. Using this trace data, it has been shown that prediction of user movements is possible. Three different algorithms have been investigated and it has been demonstrated that each of them showed similar behavior depending on the maximum length of available path sequences.

Additionally it should be considered that the overall accuracy of predictions is only one measure of algorithm performance. Complementing this, a detailed analysis per base station makes it possible to investigate problematic locations for the prediction. For the success of handovers the most important predictions occur when moving into highly populated cells.

Finally it could be demonstrated that a combination of more than one algorithm reduces the overall rate of erroneous predictions. A decision process, where the fittest (i.e. currently best performing) algorithm is weighted over others to achieve optimal results even with a high degree of randomness, helps to reduce the number of wrong predictions made by a single algorithm.

The task of finding the correct parameterization for the pattern detectors is non-trivial; even training on data with low degrees of random walk users may result in poor prediction results. In [6] we show how sensitively the overall prediction accuracy depends on the configuration of the pattern detection algorithms. Future work is going to integrate these algorithms with the classical measures for handover triggering.

REFERENCES

[1] Akyildiz, I.F., Wang, W.: The Predictive User Mobility Profile Framework for Wireless Multimedia Networks. IEEE/ACM Transactions on Networking, Vol. 12, No. 6, pp. 1021-1035, 2004
[2] Bhattacharya, A., Das, S.K.: LeZi-Update: An Information-Theoretic Approach to Track Mobile Users in PCS Networks. Proceedings of ACM/IEEE International Conference on Mobile Computing and Networking, MobiCom '99, Seattle 1999
[3] Cheng, C., Jain, R., v.d. Berg, E.: Location Prediction for Mobile Wireless Systems. In: (Eds.): Wireless Internet Handbook, pp. 245-264, CRC Press, Boca Raton 2003
[4] Liang, B., Haas, Zygmunt J.: Predictive Distance-Based Mobility Management for Multidimensional PCS Networks. IEEE/ACM Transactions on Networking, Vol. 11, No. 5, 2003
[5] Markoulidakis, J.G., et al.: Mobility modeling in third-generation mobile telecommunication systems. IEEE Personal Communications 1997
[6] Michaelis, S., Wietfeld, C.: Evaluation and comparison of prediction stability for user movement pattern detection algorithms. European Wireless, Athens 2006
[7] Poon, W.T., Chan, E.: Traffic Management in Wireless ATM Network Using a Hierarchical Neural-Network Based Prediction Algorithm. Proceedings of the International Conference on Computers and their Applications, ICSA 2000
[8] Quinlan, J.R.: C4.5: Programs for machine learning. Morgan Kaufmann, San Mateo 1993
[9] Roy, Amiya, Das, Sajal K., Misra, A.: Exploiting Information Theory for Adaptive Mobility and Resource Management in Future Cellular Networks. IEEE Wireless Communications, 2004
[10] Song, L., Kotz, D., et al.: Evaluating location predictors with extensive Wi-Fi mobility data. Proceedings of IEEE InfoCom, 2004
[11] Witten, Ian H., Frank, Eibe: Data Mining: Practical machine learning tools with Java implementations. Morgan Kaufmann, San Francisco 2000