(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 5, August 2010
EFRRA: An Efficient Fault-Resilient-Replica-Algorithm for Content Distribution Networks
 
Amutharaj. J
Assistant Professor, CSE
Arulmigu Kalasalingam College of Engineering
Krishnankoil, Virudhunagar, INDIA
amutharajj@yahoo.com

Radhakrishnan. S.
Senior Professor, CSE
Arulmigu Kalasalingam College of Engineering
Krishnankoil, Virudhunagar, INDIA
srk@akce.ac.in
Abstract — Nowadays, content distribution is an important peer-to-peer application on the Internet that has received considerable research attention. Content distribution applications typically allow personal computers to function in a coordinated manner as a distributed storage medium by contributing, searching, and obtaining digital content. The primary task in a CDN is to replicate the contents over several mirrored web servers (i.e., surrogate servers) strategically placed at various locations in order to deal with flash crowds. Geographically distributing the web servers' facilities is a method commonly used by service providers to improve performance and scalability. Hence, contents in a CDN are replicated in many surrogate servers according to content distribution strategies dictated by the application environment. Devising an efficient and resilient content replication policy is crucial, since content distribution can be limited by several factors in the network. Hence, we propose a novel Efficient Fault Resilient Replica Algorithm (EFRRA) to replicate content from the origin server to a set of surrogate servers in an efficient and reliable manner. The contributions of this paper are twofold. First, we introduce the EFRRA distribution policy and theoretically compare its performance with traditional content replication algorithms. Then, by means of a simulation-based performance evaluation, we assess the efficiency and resiliency of the proposed EFRRA and compare its performance with traditional content replication algorithms reported in the literature. We demonstrate experimentally that EFRRA significantly reduces the file replication time while maintaining the delivery ratio, as compared with traditional strategies such as sequential unicast, multiple unicast, Fast Replica (FR), Resilient Fast Replica (R-FR), and Tornado codes (TC). This paper also analyzes the performance of sequential unicast, multiple unicast, Fast Replica (FR), Resilient Fast Replica (R-FR), Tornado codes, and EFRRA in terms of average replication time and maximum replication time.
Keywords — CDN, Fast Replica, Resilient Fast Replica, Efficient Fault Resilient Replica Algorithm, Tornado Codes.
I. INTRODUCTION
Content Delivery Networks (CDNs) [1][2][3] provide services that improve network performance by maximizing bandwidth and accessibility and by maintaining correctness through content replication. A CDN offers fast and reliable applications and services by distributing content to cache or edge servers located close to users [1]. A CDN has some combination of content-delivery, request-routing, distribution, and accounting infrastructure. The content-delivery infrastructure consists of a set of edge servers (also called surrogates) that deliver copies of content to end users. The request-routing infrastructure is responsible for directing client requests to appropriate edge servers; it also interacts with the distribution infrastructure to keep an up-to-date view of the content stored in the CDN caches. A minimal sketch of this request-routing decision is given at the end of this section.

In particular, CDNs optimize content delivery by putting the content closer to the consumer and shortening the delivery path via global networks of strategically placed servers [4]. The CDN's edge servers are caching servers, and if the requested content is not yet in the cache, the document is pulled from the origin server. For large documents, software packages, and media files, a push operational mode is preferred: it is desirable to replicate these files at edge servers in advance [6].

While transferring a large file with individual point-to-point connections from an origin server can be a viable solution in the case of a limited number of mirror servers, this method does not scale when the content needs to be replicated across a CDN with thousands of geographically distributed edge replica nodes [5].

This paper is organized as follows. Section II describes the related work. Next, we present the working mechanisms of different content distribution algorithms and our proposed EFRRA content distribution algorithm. Section IV presents a discussion of the analytical study and experimental results and analyzes the performance of the different content distribution algorithms. Finally, the conclusion and future work are summarized.
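As referenced above, the following is a minimal, purely illustrative sketch of the request-routing decision (it is not the routing logic of any CDN discussed in this paper; the server names, distances, and cache contents are hypothetical):

```python
# Hypothetical sketch: route a client request to the "closest" edge server
# that already caches the requested object, falling back to the origin
# server on a cache miss (which would then trigger a pull from the origin).

EDGE_SERVERS = {
    # surrogate id -> (assumed network distance to the client in ms, cached objects)
    "edge-1": (12, {"/videos/a.mp4", "/pkg/tool.zip"}),
    "edge-2": (45, {"/videos/a.mp4"}),
    "edge-3": (80, set()),
}
ORIGIN = "origin-server"

def route_request(obj: str) -> str:
    """Return the server that should serve `obj` for this client."""
    candidates = [(rtt, name)
                  for name, (rtt, cache) in EDGE_SERVERS.items()
                  if obj in cache]
    if candidates:
        return min(candidates)[1]   # nearest surrogate holding a copy
    return ORIGIN                   # cache miss: pull from the origin

print(route_request("/videos/a.mp4"))  # -> edge-1
print(route_request("/docs/new.pdf"))  # -> origin-server
```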
 
II. RELATED WORKS
Ludmila et al. [5] proposed a novel algorithm, called Fast Replica, for efficient and reliable replication of large files in the Internet environment. Instead of downloading the entire file from one server, a user downloads different parts of the same file from different servers in parallel. Once all the parts of the file are received, the user reconstructs the original file by reassembling the different parts.
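As an illustration of this parallel-retrieval-and-reassembly idea, here is a minimal sketch; it is not the Fast Replica algorithm itself (Fast Replica proper is a push-based scheme in which the origin sends distinct chunks to replica nodes, which then exchange them), and the byte-range splitting, thread pool, and mirror URLs are assumptions made only for illustration:

```python
# Hypothetical sketch: fetch one file as equal byte-range chunks from
# several mirrors in parallel, then reassemble the chunks in order.
import concurrent.futures
import urllib.request

def fetch_range(url: str, start: int, end: int) -> bytes:
    """Download bytes [start, end] of `url` using an HTTP Range request."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def parallel_download(mirrors: list[str], file_size: int) -> bytes:
    """Split the file across the mirrors, one chunk per mirror, and reassemble."""
    chunk = -(-file_size // len(mirrors))          # ceiling division
    ranges = [(i * chunk, min((i + 1) * chunk, file_size) - 1)
              for i in range(len(mirrors))]
    with concurrent.futures.ThreadPoolExecutor() as pool:
        parts = list(pool.map(lambda args: fetch_range(*args),
                              [(m, s, e) for m, (s, e) in zip(mirrors, ranges)]))
    return b"".join(parts)                         # original file, in order

# Usage (hypothetical mirror URLs):
# data = parallel_download(["http://mirror1/f.iso", "http://mirror2/f.iso"], 10_000_000)
```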
There are several advantages of using such dynamic parallel access. First, as the block size is small, a dynamic parallel access can easily adapt to changing network/server conditions. Second, as the client uses several connections to different servers, a parallel access is more resilient to congestion and failure in the network/server. Third, the server selection process is eliminated, since clients connect to all available servers to retrieve the parts of the file. Fourth, the throughput seen by the client will increase. There is, however, an overhead incurred when opening multiple connections, and extra traffic is generated to perform block requests.

ZhiHui Lu et al. [6] proposed Tree Round Robin Replica (TRRR) to improve on the work of Fast Replica [5]. They proposed an efficient and reliable replication algorithm for delivering large files in the content delivery network environment. As part of this study, they carried out experiments to verify the TRRR algorithm at small scale. They demonstrated experimentally that TRRR significantly reduces the file distribution/replication time compared with traditional policies such as multiple unicast in content delivery networks.

Several content networks attempt to address the performance problem using different mechanisms to improve the Quality of Service (QoS). One approach is to modify the traditional web architecture by improving the web server hardware, adding a high-speed processor, more memory, and disk space, or perhaps using a multi-processor system. This approach is not flexible [7].

In order to offload popular servers and improve the end-user experience, copies of popular content are often stored in different locations. With mirror site replication, files from the origin server are proactively replicated at surrogate servers with the objective of improving the user-perceived Quality of Service (QoS). When a copy of the same file is replicated at multiple surrogate servers, choosing the server that provides the best response time is not trivial, and the resulting performance can vary dramatically depending on the server selected [8, 9].

Laurent Massoulie [10] proposed an algorithm called the localizer, which reduces the network load, helping to evenly balance the number of neighbors of each node in the overlay, sharing the load, and improving the resilience to random node failures or disconnections. The localizer refines the overlay in a way that reflects geographic locality so as to reduce network load.

In [11], Rodriguez and Biersack studied a dynamic parallel-access scheme to access multiple mirror servers. In their study, a client downloads files from mirror servers residing in a wide area network. They showed that their dynamic parallel downloading scheme achieves significant downloading speedup with respect to a single-server scheme. However, they studied only the scenario where one client uses parallel downloading; the authors fail to address the effects and consequences when all clients choose to adopt the same scheme.

J. Kangasharju, J. Roberts, and K. W. Ross [12] studied the problem of optimally replicating objects in CDN servers. In their model, each Internet Autonomous System (AS) is a node with finite storage capacity for replicating objects. The optimization problem is to replicate objects so that when clients fetch objects from the nearest CDN server with the requested object, the average number of ASs traversed is minimized. They showed that this optimization problem is NP-complete. They developed four natural heuristics and compared them numerically using real Internet topology data.

Al-Mukaddim Khan Pathan and Rajkumar Buyya [13] presented a comprehensive taxonomy with a broad coverage of CDNs in terms of organizational structure, content distribution mechanisms, request redirection techniques, and performance measurement methodologies. They studied the existing CDNs in terms of their infrastructure, request-routing mechanisms, content replication techniques, load balancing, and cache management. They provided an in-depth analysis and state-of-the-art survey of CDNs. Finally, they applied the taxonomy to map various CDNs. The mapping of the taxonomy to the CDNs helps in “gap” analysis in the content networking domain.

James Broberg and Rajkumar Buyya [14] proposed MetaCDN, a system that exploits ‘Storage Cloud’ resources, creating an integrated overlay network that provides a low-cost, high-performance CDN for content creators. MetaCDN removes the complexity of dealing with multiple storage providers by intelligently matching and placing users’ content onto one or many storage providers based on their quality of service, coverage, and budget preferences. MetaCDN makes it trivial for content creators and consumers to harness the performance and coverage of numerous ‘Storage Clouds’ by providing a single unified namespace that makes it easy to integrate into origin websites and is transparent for end users.

M. O. Rabin [15] and Byers et al. [16] proposed an efficient dispersal of information for secure and fault-tolerant data dissemination based on the digital fountain approach. The main idea underlying their technique is to take an initial file consisting of k packets and generate an n-packet encoding of the file with the property that the initial file can be reconstituted from any k-packet subset of the encoding.
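As a concrete illustration of this "any k of n" property, the sketch below implements a toy (n, k) erasure code over the prime field GF(257) using Reed-Solomon-style polynomial evaluation. It is not Tornado coding and is not part of the cited work; the data and parameters are made up purely to show the recovery property.

```python
# Hypothetical toy (n, k) erasure code over GF(257): the k data bytes are the
# coefficients of a degree-(k-1) polynomial, the n packets are its evaluations
# at n distinct points, and ANY k packets recover the data by interpolation.

P = 257  # small prime; each data symbol is one byte (0..255)

def encode(data: list[int], n: int) -> list[tuple[int, int]]:
    """Produce n (point, value) packets from the k data symbols."""
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(data)) % P)
            for x in range(1, n + 1)]

def decode(packets: list[tuple[int, int]], k: int) -> list[int]:
    """Recover the k original symbols from any k packets via Lagrange interpolation."""
    pts = packets[:k]
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(pts):
        basis = [1]   # coefficients of the Lagrange basis polynomial L_j(x)
        denom = 1
        for m, (xm, _) in enumerate(pts):
            if m == j:
                continue
            # multiply basis polynomial by (x - xm)
            basis = [((basis[i] if i < len(basis) else 0) * (-xm)
                      + (basis[i - 1] if i > 0 else 0)) % P
                     for i in range(len(basis) + 1)]
            denom = denom * (xj - xm) % P
        scale = yj * pow(denom, P - 2, P) % P   # division via Fermat inverse
        for i in range(k):
            coeffs[i] = (coeffs[i] + scale * basis[i]) % P
    return coeffs

original = [72, 101, 108, 108, 111]            # k = 5 data bytes ("Hello")
packets = encode(original, n=9)                # 9 encoded packets
survivors = [packets[1], packets[3], packets[4], packets[6], packets[8]]
assert decode(survivors, k=5) == original      # any 5 of the 9 suffice
```

Tornado codes relax this exact "any k" guarantee slightly (a receiver needs marginally more than k packets) in exchange for much faster, essentially linear-time encoding and decoding, which is the property emphasized below.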
For the application of reliable multicast, the source transmits packets from this encoding, and the encoding property ensures that different receivers can recover from different sets of lost packets, provided they receive a sufficiently large subset of the transmission. To enable parallel access to multiple mirror sites, the sources each transmit packets from the same encoding, and the encoding property ensures that a receiver can recover the data once it receives a sufficiently large subset of the transmitted packets, regardless of which server the packets came from. In fact, the benefits and costs of using erasure codes for parallel access to multiple mirror sites are analogous to the benefits and costs of using erasure codes for reliable multicast. For both applications, simple schemes that do not use encoding have substantial drawbacks in terms of complexity, scalability, and their ability to handle heterogeneity among both senders and receivers.

Ideally, using erasure codes, a receiver could gather an encoded file in parallel from multiple sources, and as soon as any k packets arrive from any combination of the sources, the original file could be reconstituted efficiently. In practice, however, designing a system with this ideal property and with
very fast encoding and decoding times appears difficult. Hence, although other erasure codes could be used in this setting, it is suggested that a newly developed class of erasure codes called Tornado codes is best suited to this application, as they have extremely fast encoding and decoding algorithms [12]. Indeed, these codes have previously been shown to be more effective than standard erasure codes in the setting of reliable multicast transmission of large files [16].

Byers et al. [17] proposed a parallel-access scheme based on Tornado codes in which a client is allowed to access a file from multiple mirror sites in parallel to speed up the download. They eliminated complex client-server negotiations and implemented a straightforward approach for developing a feedback-free protocol based on erasure codes. They demonstrated that a protocol using fast Tornado codes can deliver dramatic speedups at the expense of transmitting a moderate number of additional packets into the network. Their scalable solution can be extended to allow multiple clients to access data from multiple mirror sites simultaneously.

Danny Bickson and Dahlia Malkhi [18] proposed a new content distribution network named “Julia” which reduces the overall communication cost, which in turn improves network load balance and reduces the usage of long-haul links. Compared with the state-of-the-art BitTorrent content distribution network, the authors found that while Julia achieves slightly slower average finishing times relative to BitTorrent, Julia nevertheless reduces the total communication cost in the network by approximately 33%. Furthermore, the Julia protocol achieves a better load balancing of the network resources, especially over trans-Atlantic links. They evaluated the Julia protocol using a real WAN deployment and by extensive simulation. The WAN experimentation was carried out over the PlanetLab wide area testbed using over 250 machines. Simulations were performed using the GT-ITM topology generator with 1200 nodes.

Amutharaj. J and Radhakrishnan. S [19, 20] constructed an overlay network based on dominating set theory to optimize the number of nodes for large data transfer. They investigated the use of the Fast Replica algorithm to reduce the content transfer time for replicating the content within the semantic network. A dynamic parallel access scheme is introduced to download a file from different peers in parallel from the Semantic Overlay Network (SON), where the end users can access the members of the SON at the same time, fetching different portions of that file from different peers and reassembling them locally. That is, the load is dynamically shared among all the peers. To eliminate the need for retransmission requests from the end users, an enhanced digital fountain with Tornado codes is applied, and a decoding algorithm at the receiver reconstructs the original content. The authors also found that no feedback mechanisms are needed to ensure reliable delivery. The authors investigated the performance of sequential unicast, multiple unicast, and Fast Replica with Tornado content distribution strategies in terms of content replication time and delivery ratio. They also analyzed the impact of dominating set theory on the construction of semantic overlay networks.
Srinivas Shakkottai and Ramesh Johari [22] evaluated the benefits of a hybrid system that combines peer-to-peer and a centralized client-server approach against each method acting alone. The key element of their approach is to explicitly model the temporal evolution of demand. They also investigated the relative performance of peer-to-peer and centralized client-server schemes, as well as a hybrid of the two, both from the point of view of consumers and of the content distributor. They showed how awareness of demand could be used to attain a given average delay target with the lowest possible utilization of the central server by using the hybrid scheme.

Zhijia Chen et al. [23] addressed the issues related to distributing media content such as audio, video, and software packages to an increasing number of end consumers at higher speed. They integrated the Peer-to-Peer (P2P) paradigm into the Internet content distribution infrastructure, which provides a disruptive market opportunity to scale the Internet for high-quality data delivery. They carried out an experimental and analytical performance study over BitTorrent-like P2P networks for accelerating large-scale content distribution over the booming Internet. They explored the unique strength of P2P in high-speed networks, identified the performance bottlenecks, and investigated and quantified the special requirements in the new scenario, i.e., file piece length and seed capacity. They proposed a Piece-On-Demand (POD) scheme to modify BitTorrent in integration with File system in Userspace (FUSE), with an objective to decrease file distribution time and increase service availability.

Ye Xia et al. [24] considered a two-tier content distribution system for distributing massive content and proposed popularity-based file replication techniques within the CDN using multiple hash functions. Their strategy is to set aside a large number of hash functions; when the demand for a file exceeds the overall capacity of the current servers, previously unused hash functions are used to obtain a new node ID where the file will be replicated. They developed a set of distributed, robust algorithms to implement the above solutions and evaluated the performance of the proposed algorithms.
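A minimal sketch of this multiple-hash placement idea follows; it is not the authors' algorithm, and the hash family, node ID space, and per-replica capacity are assumptions made only for illustration:

```python
# Hypothetical sketch: each file starts with one replica located via hash
# function h_0; when measured demand exceeds the capacity of its current
# replicas, the next unused hash function h_i picks an additional node ID.
import hashlib

NUM_NODES = 64               # hypothetical node ID space
PER_REPLICA_CAPACITY = 100   # assumed requests/sec one replica can absorb

def node_id(filename: str, i: int) -> int:
    """The i-th hash function of the family, mapping a file name to a node ID."""
    digest = hashlib.sha256(f"{i}:{filename}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_NODES

def replica_nodes(filename: str, demand: float) -> list[int]:
    """Use just enough hash functions to cover the current demand."""
    replicas_needed = max(1, -(-int(demand) // PER_REPLICA_CAPACITY))
    nodes, i = [], 0
    while len(nodes) < replicas_needed:
        nid = node_id(filename, i)
        if nid not in nodes:     # skip collisions onto the same node
            nodes.append(nid)
        i += 1
    return nodes

print(replica_nodes("popular.iso", demand=50))    # one replica
print(replica_nodes("popular.iso", demand=450))   # demand grew: five replicas
```

Because the hash functions are applied in a fixed order, existing replica locations are preserved as demand grows and only new locations are added.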
Yaozhou Ma and Abbas Jamalipour [25] presented a cooperative cache-based content dissemination framework (CCCDF) to carry out cooperative soliciting. They investigated two cooperative strategies: CCCDF (Optimal), with an objective to maximize the overall content delivery performance, and CCCDF (Max-Min), with an aim to share the limited network resources among the contents in a Max-Min fairness manner. They demonstrated in simulation results that the proposed CCCDF offers enhanced delivery performance over existing content dissemination schemes.

Oznur Ozkasap, Mine Caglar, and Ali Alagoz [26] proposed and designed a peer-to-peer system, SeCond, addressing the distribution of large-sized content to a large number of end systems in an efficient manner. It employs a self-organizing epidemic dissemination scheme for state propagation of available blocks and initiation of block transmissions. Their performance study included scalability analysis for different arrival/departure patterns, a flash-crowd scenario, overhead analysis, and fairness ratio. The authors studied various performance metrics such as the average file download time,