(IJCSIS) International Journal of Computer Science and Information Security, Vol. 9, No. 5, May 2011

A New Dynamic Data Allocation Algorithm for Distributed Database
Fardin Esmaili Sangari
Sama Technical and vocational training college Islamic Azad university Urmia branch, Urmia, Iran Fardin_e_s@yahoo.com

Seyed Mostafa Mansourfar
Sama Technical and vocational training college Islamic Azad university Sahand branch, Sahand, Iran Mostafa.mansourfar@gmail.com

Abstract: Data and fragment allocation is an important issue in distributed database systems. Allocation is carried out according to static or dynamic data access patterns. This paper proposes a new data allocation strategy, named the Relative Threshold Algorithm (RTA), for non-replicated distributed databases. The proposed algorithm reallocates data fragments as the access pattern to those fragments changes: a fragment migrates to the site that accesses it most. Simulation results show that RTA performs better than the existing threshold algorithm in terms of hit ratio. It also reduces the required storage space. We believe this reduction in storage overhead makes RTA more attractive for distributed database systems.
Keywords: Dynamic Data Allocation; Distributed Database

I. Introduction
Database and network technologies have been the most important issues in building distributed database systems over the past decade. A distributed database system consists of a collection of sites connected by a communication network, in which each site is a database system in its own right, but the sites have agreed to work together, so that a user at any site can access data anywhere in the network exactly as if the data were all stored at the user's own site [1]. Distributed database systems use data allocation to achieve two aims. The first is to minimize the total data transmission cost of processing, subject to constraints such as the maximum number of fragments that can be allocated to a site; the second is a unified implementation strategy. The major concern in a distributed database system is the design of the fragmentation and allocation of the underlying data. The unit of fragmentation can be a file, in which case the allocation issue becomes the file allocation problem [2]. However, data allocation is an NP-complete problem [3], so fast allocation requires efficient solutions. Moreover, the optimal allocation of the database depends heavily on the query strategy employed by the distributed database system.

Several papers have addressed the data allocation problem. Chu considered this problem in [4]. Replicated and non-replicated models were studied in [5][6], and [7][8] address the issue of dynamic file allocation. Various solutions for data allocation in distributed systems have been presented in [6][7][8] and [9]. These papers perform data allocation based on static data access patterns or query access patterns. In a static environment, the probability with which each node accesses each data fragment is stable. In dynamic environments these probabilities change, and using static methods frequently reduces database performance.

A dynamic algorithm for data allocation in non-replicated database systems, called the threshold algorithm, has been presented in [7]. The threshold algorithm transfers data fragments among sites as the data access pattern changes. It focuses on load balance. However, this algorithm provides data allocation with a low hit ratio. In other words, the probability that a site actually needs the fragment it holds is low, because when a fragment is transferred the algorithm does not take into account how often the other sites have accessed it and considers only the last site that accessed the data. We focus on these disadvantages and attempt to eliminate them.

The rest of the paper is organized as follows. Section 2 reviews the threshold algorithm. The proposed algorithm is presented in Section 3. Section 4 shows simulation results for the proposed algorithm. Finally, Section 5 is the conclusion.

II. Threshold algorithm
The threshold algorithm is one of the dynamic allocation algorithms that transfer data fragments among sites according to changing access patterns [7][10][11]. The threshold algorithm stores only one counter for each fragment. Figure 1 shows fragment i with its associated counter.

Figure 1. Any fragment i in the threshold algorithm

In the threshold algorithm, the initial value of the counter is zero. The counter value is increased by one for each remote access to the fragment. It is reset to zero for a local access. Whenever the counter exceeds a predetermined threshold value, the ownership of the fragment is transferred to another node. At this point, the critical question is: which node will be the new owner of the fragment? The algorithm gives very little information about the past accesses to the fragment. In fact, throughout the entire access history only the last node which accessed the fragment is known. There are therefore two strategies for selecting the new owner: either the new owner is chosen randomly, or the last accessing node is chosen. Under the first strategy, the randomly chosen node could be one that has never accessed the fragment before. Therefore, the latter strategy is heuristically better.

http://sites.google.com/site/ijcsis/ ISSN 1947-5500

Initially, all fragments are distributed to the nodes randomly. A threshold value δ is set. Every node j executes the threshold algorithm for every fragment i that it stores. With a threshold value greater than one (δ>1), the algorithm reduces the traffic between two nodes. One of the important problems in the threshold algorithm is the choice of the threshold value, because this value directly affects the mobility of the fragments. As the threshold value increases, a fragment tends to remain longer at its current node; as the threshold value decreases, a fragment tends to visit more sites. In the threshold algorithm, if n fragments reside at a site, then n distinct counters are required.

If site B consecutively accesses a fragment at site A, the counter increases by one with each access and approaches the threshold value. If site A then accesses the fragment, interrupting the consecutive accesses of site B, the counter returns to zero. If site B consecutively accesses a fragment at site A and site C then accesses this fragment for the first time, and with this access the counter reaches the threshold value, the fragment is transferred to site C because site C performed the last access. Such events increase response time.

III. Proposed Algorithm
Our proposed algorithm uses two fields for every site. The number of fields does not depend on the number of fragments that reside at the site. One field counts the number of accesses, and the other records the last fragment that the site accessed.
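The counter rule of the threshold algorithm reviewed in Section II, and the site B / site C situation described there, can be illustrated with a minimal Python sketch. The names `threshold_access`, `counters`, and `owner` are illustrative and do not come from [7]. The sketch assumes the "last accessing node" strategy, a migration test of counter > δ, and a counter that restarts at zero after a move.

```python
from typing import Dict


def threshold_access(counters: Dict[str, int], owner: Dict[str, str],
                     site: str, fragment: str, delta: int) -> bool:
    """Process one access under the threshold algorithm (hypothetical sketch).

    counters maps each fragment to its single counter; owner maps each
    fragment to the site that stores it. Returns True on a local access.
    """
    if owner[fragment] == site:
        counters[fragment] = 0          # a local access resets the counter
        return True
    counters[fragment] += 1             # a remote access increments it
    if counters[fragment] > delta:
        owner[fragment] = site          # the last accessing node becomes the owner
        counters[fragment] = 0          # assumption: counter restarts after a move
    return False


# One fragment "f" stored at site A, threshold delta = 3.
counters, owner = {"f": 0}, {"f": "A"}
for _ in range(3):
    threshold_access(counters, owner, "B", "f", delta=3)   # B: three remote accesses
threshold_access(counters, owner, "C", "f", delta=3)       # C: its first access
print(owner["f"])  # C
```

Although site B issued three of the four remote accesses, the fragment moves to site C, so the next access by B is remote again. Note also that one counter is kept per fragment, so a site holding n fragments stores n counters, whereas the proposed algorithm keeps two fields per site.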
The fragment tends to stay at the node with the higher access probability. As the access probability of the node increases, the tendency to remain at this node also increases. It is also shown that as the threshold value increases, the fragment tends to stay longer at the node with the higher access probability. At every access, the name of the requested fragment is compared with the identifier field; if they match, the counter is increased by one. When a site accesses a fragment for the first time, the counter is set to zero and the name of the fragment is recorded in the identifier field. Our algorithm counts all accesses, whether they are local or remote. What matters is the number of consecutive accesses. This algorithm increases the probability that a fragment resides at the site that uses it. Response time therefore decreases, because no data has to be fetched from a remote site.

The threshold algorithm is a centralized algorithm: if the site fails, all of its information is lost. Our proposed algorithm is distributed, which eliminates the single point of failure: if a site crashes, the other sites still have access to their information, and only the information of the crashed site is lost. Our proposed algorithm raises the hit ratio and, owing to locality, reduces data movement. This is shown in what follows.

We make the following assumptions:
• Initially, fragments are randomly distributed among the sites.
• Initially, the counter value is zero.
• An incremental counter is used.
• If the name of the accessed fragment is the same as the name in the identifier field, the counter value increases by one for each access to the fragment.

[Figure content: the two fields kept at each site, Fragment and Counter]

Figure 2. The fragment field and counter in each site in the proposed algorithm

Relative threshold algorithm:
Step 1. Set the initial counter value to zero at all sites (at each site, counter=0) and distribute the fragments randomly among the sites.
Step 2. Process the access request for a stored fragment.
Step 3. For each request (local or remote), if the access is repetitive, that is, the requested fragment is the one named in the fragment field, increase the counter value by one and continue with Step 5.
Step 4. If the name of the requested fragment is not the same as the fragment field, set the counter to zero and replace the identifier field with the new fragment name.
Step 5. If the counter value exceeds the threshold value (counter>δ), then: if the fragment is already at the accessing site, reset the counter to zero; otherwise, transfer the fragment to the accessing site and reset the counter to zero.
Step 6. Return to Step 2.

Suppose the sites have the topology shown in Figure 3. Site 2 wants to access fragment A, which is stored at site 1, so it increases its counter by one and its fragment field becomes A. Each subsequent access increases the counter value, as long as site 2 keeps requesting data in A. If this value becomes higher than the threshold value, the data moves to site 2. If site 2 accesses data other than A, the counter value is reset to zero and the fragment field is replaced by the new fragment name.
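The steps above can be sketched in Python under a few stated assumptions. The names `SiteState`, `rta_access`, `sites`, and `owner` are hypothetical and are not taken from the paper. The sketch follows Step 4 literally, so the first access to a new fragment leaves the counter at zero (the worked example instead seems to count that first access). It also assumes the Step 5 check is applied after every request, and that requests are processed one at a time.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SiteState:
    """The two per-site fields: an access counter and a fragment identifier."""
    counter: int = 0
    fragment: Optional[str] = None


def rta_access(sites: Dict[int, SiteState], owner: Dict[str, int],
               site_id: int, requested: str, delta: int) -> bool:
    """Process one request (Steps 2-5); returns True if the fragment was local."""
    state = sites[site_id]
    was_local = owner[requested] == site_id
    if state.fragment == requested:
        state.counter += 1              # Step 3: repeated access, local or remote
    else:
        state.counter = 0               # Step 4: different fragment, restart count
        state.fragment = requested
    if state.counter > delta:           # Step 5
        if not was_local:
            owner[requested] = site_id  # migrate the fragment to the accessing site
        state.counter = 0
    return was_local


# Example from the text: fragment A is stored at site 1 and requested by site 2.
sites = {1: SiteState(), 2: SiteState()}    # Step 1: all counters start at zero
owner = {"A": 1, "B": 1}
hits = [rta_access(sites, owner, 2, "A", delta=2) for _ in range(5)]
print(owner["A"], hits)
```

With δ = 2 in this sketch, the fourth consecutive request from site 2 pushes its counter past the threshold and moves A to site 2, so the fifth request is served locally. Only two fields are stored per site, regardless of how many fragments the site holds.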


The experiment was repeated with 5 sites and a threshold of 10, and almost the same results were obtained. The more intense the environment, the higher the hit ratio that is achieved.
Figure 3. Example of sites topology

[Plot panel: site = 9, value = 10; y-axis: Hit Rate; series: Threshold, RTA]

IV. Simulation Results
In this section, we evaluate the proposed algorithm, compare it with the threshold algorithm, and show that our algorithm performs better. In this simulation, the number of fragments is between 100 and 9000. Initially, these fragments are randomly distributed among the sites. Experiments were carried out in different environments. In the first scenario, we vary the number of sites and keep the threshold value fixed (figure 2).
[Plot residue: panel title "site = 5, value = 5"; x-axis: Number of access]

Figure 6. Comparison of hit rate in RTA and the threshold algorithm for different numbers of accesses, sites, and threshold values

V. Conclusion
In this article we introduced a new method for allocating data fragments in a distributed database system. RTA is based on the threshold algorithm but uses a different strategy for data transfer. In our experiments, we considered the hit ratio. The simulation is configurable for testing different network topologies and different data request and/or allocation conditions. The experimental results show that the hit rate of RTA is better than that of the threshold algorithm. We considered a non-replicated distributed database; in future work, RTA can be considered for replicated distributed databases.

References
[1] R. Basseda, S. Tasharofi, M. Rahgozar, "Near Neighborhood Allocation: A Novel Dynamic Data Allocation Algorithm in DDB", CSICC 2006.
[2] S. B. Navathe, S. Ceri, G. Wiederhold and J. Dou, "Vertical Partitioning Algorithms for Database Design", ACM Transactions on Database Systems, 1984.
[3] Y. F. Huang and J. H. Chen, "Fragment Allocation in Distributed Database Design", Journal of Information Science and Engineering 17, 491-506, 2001.
[4] I. Ahmad, K. Karlapalem, Y. K. Kwok and S. K., "Evolutionary Algorithms for Allocating Data in Distributed Database Systems", International Journal of Distributed and Parallel Databases, 11: 5-32, The Netherlands, 2002.
[5] A. Brunstrom, S. T. Leutenegger and R. Simha, "Experimental Evaluation of Dynamic Data Allocation Strategies in a Distributed Database with Changing Workloads", ACM Transactions on Database Systems, 1995.
[6] A. G. Chin, "Incremental Data Allocation and ReAllocation in Distributed Database Systems", Journal of Database Management, 12 (1), Jan-Mar 2001, pg. 35.
[7] T. Ulus and M. Uysal, "Heuristic Approach to Dynamic Data Allocation in Distributed Database Systems", Pakistan Journal of Information and Technology 2 (3): 231-239, 2003.
[8] S. Voulgaris, M. V. Steen, A. Baggio, and G. Ballintijn, "Transparent Data Relocation in Highly Available Distributed Systems", Studia Informatica Universalis, 2002.

[Plot: y-axis: Hit Rate; x-axis: Number of Access; series: Threshold, RTA]

Figure 4. Comparison of hit rate in RTA and the threshold algorithm for different numbers of accesses

In this experiment, the hit ratio factor of the data fragment length is 2500, the threshold value is 5, and the number of sites is 5. The simulation results in Figure 4 show that the proposed algorithm increases the fragment hit ratio, where a hit means that the requested fragment exists at the current site.
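The hit measure used in these experiments can be made concrete with a small sketch. The paper does not describe its workload generator or simulator, so everything here is an assumption for illustration: the names `random_placement`, `static_policy`, and `count_hits` are hypothetical, and `static_policy` is a no-migration baseline that an allocation algorithm could replace by passing a different `policy` function.

```python
import random
from typing import Callable, Dict, Iterable, Tuple

Request = Tuple[int, int]   # (requesting site, requested fragment)


def random_placement(num_fragments: int, num_sites: int,
                     rng: random.Random) -> Dict[int, int]:
    """Assign every fragment to a randomly chosen site (the initial allocation)."""
    return {f: rng.randrange(num_sites) for f in range(num_fragments)}


def static_policy(owner: Dict[int, int], site: int, fragment: int) -> bool:
    """Hypothetical baseline with no migration: reports whether the access is a hit."""
    return owner[fragment] == site


def count_hits(requests: Iterable[Request], owner: Dict[int, int],
               policy: Callable[[Dict[int, int], int, int], bool] = static_policy) -> int:
    """Count the requests whose fragment was already at the requesting site."""
    return sum(1 for site, fragment in requests if policy(owner, site, fragment))


owner = random_placement(num_fragments=100, num_sites=5, rng=random.Random(0))
# Ten local requests followed by one request from a site that does not hold fragment 0.
requests = [(owner[f], f) for f in range(10)] + [((owner[0] + 1) % 5, 0)]
print(count_hits(requests, owner))  # 10
```

Dividing the returned count by the number of requests gives a hit ratio; the count itself corresponds to a cumulative number of hits over a request sequence.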
[Plot: panel title "site = 5, value = 10"; y-axis: Hit Rate; x-axis: Number of Access; series: Threshold, RTA]

Figure 5. Comparison of hit rate in RTA and the threshold algorithm for different numbers of accesses

[9] L. C. John, "A Generic Algorithm for Fragment Allocation in Distributed Database Systems", ACM, 1994.
[10] R. Basseda, "Fragment Allocation in Distributed Database Systems", Database Research Group, 2006.
[11] R. Basseda, "Data Allocation In Distributed Database Systems", Technical Report No. DBRG.RB-ST.A50715, 2005.
