As one of the emerging services in the cloud paradigm, cloud storage enables users to remotely store their data in the cloud so as to enjoy on-demand, high-quality applications and services from a shared pool of configurable computing resources. While cloud storage relieves users of the burden of local storage management and maintenance, it also relinquishes users' ultimate control over the fate of their data, which may put the correctness of outsourced data at risk. In order to regain the assurances of cloud data integrity and availability and enforce the quality of cloud storage service for users, we propose a highly efficient and flexible distributed storage verification scheme with two salient features, in contrast to its predecessors. By utilizing the homomorphic token with distributed erasure-coded data, our scheme achieves the integration of storage correctness insurance and data error localization, i.e., the identification of misbehaving server(s). Unlike most prior works, the new scheme further supports secure and efficient dynamic operations on outsourced data, including block modification, deletion and append. Extensive security and performance analysis shows the proposed scheme is highly efficient and resilient against Byzantine failure, malicious data modification attack, and even server colluding attacks.
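The token-based check with error localization described in the abstract above can be illustrated with a toy sketch. It is not the paper's construction: the prime modulus `P`, the function names, and the use of a seeded pseudo-random generator for the challenge are all assumptions made for illustration, and the erasure-coding layer is left out. The user precomputes one short token per server as a random linear combination of sampled blocks; a later challenge recomputes the same combination on each server, and any mismatch points at the misbehaving server.

```python
import random

P = 2**31 - 1  # prime modulus, chosen only for this sketch


def make_token(blocks, indices, coeffs):
    """Random linear combination of the sampled blocks, modulo P."""
    return sum(c * blocks[i] for i, c in zip(indices, coeffs)) % P


def precompute_tokens(servers, seed, sample_size):
    """Run by the user before outsourcing: one token per server.

    `servers` is a list of block vectors (one list of ints per server).
    """
    rng = random.Random(seed)
    indices = rng.sample(range(len(servers[0])), sample_size)
    coeffs = [rng.randrange(1, P) for _ in indices]  # non-zero coefficients
    tokens = [make_token(blocks, indices, coeffs) for blocks in servers]
    return indices, coeffs, tokens


def locate_misbehaving(servers, indices, coeffs, tokens):
    """Challenge every server and return the ids whose response mismatches."""
    return [
        j for j, blocks in enumerate(servers)
        if make_token(blocks, indices, coeffs) != tokens[j]
    ]
```

Because every coefficient is non-zero modulo a prime, changing any single sampled block always changes that server's response, which is what lets the check name the faulty server rather than merely report that something is wrong.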

NoSQL Cloud data stores provide scalability and high availability properties for web applications, but at the same time they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager which guarantees full ACID properties for multi-item transactions issued by Web applications, even in the presence of server failures and network partitions. We implement this approach on top of the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluation on top of HBase (an open-source version of Bigtable) in our local cluster and Amazon SimpleDB in the Amazon cloud shows that our system scales linearly at least up to 40 nodes in our local cluster and 40 nodes in the Amazon cloud.
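To make the idea of an atomic multi-item transaction concrete, here is a minimal in-memory sketch. It assumes a two-phase commit style protocol between a coordinator and the nodes holding the items; the abstract above does not spell out the protocol, so the `Participant` class, `run_transaction`, and the locking rule are hypothetical and only illustrate the all-or-nothing behaviour.

```python
class Participant:
    """One node holding a subset of the data items (hypothetical)."""

    def __init__(self):
        self.data = {}       # committed key -> value
        self.locked = set()  # keys reserved by prepared transactions
        self.staged = {}     # txid -> pending writes

    def prepare(self, txid, writes):
        # Vote "no" if any key is already reserved by another transaction.
        if any(key in self.locked for key in writes):
            return False
        self.locked.update(writes)
        self.staged[txid] = dict(writes)
        return True

    def commit(self, txid):
        writes = self.staged.pop(txid)
        self.data.update(writes)
        self.locked.difference_update(writes)

    def abort(self, txid):
        writes = self.staged.pop(txid, {})
        self.locked.difference_update(writes)


def run_transaction(txid, writes_by_participant):
    """Apply all writes or none. Takes a list of (participant, writes) pairs."""
    prepared = []
    for participant, writes in writes_by_participant:
        if participant.prepare(txid, writes):
            prepared.append(participant)
        else:
            # One "no" vote rolls back everyone who already prepared.
            for other in prepared:
                other.abort(txid)
            return False
    for participant in prepared:
        participant.commit(txid)
    return True
```

The sketch ignores durability, timeouts, and recovery from failures or partitions, which is where a real transaction manager does most of its work.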

HEURISTICS BASED QUERY PROCESSING FOR LARGE RDF GRAPHS USING CLOUD COMPUTING

Semantic web is an emerging area to augment human reasoning. Various technologies are being developed in this arena which have been standardized by the World Wide Web Consortium (W3C). One such standard is the Resource Description Framework (RDF). Semantic web technologies can be utilized to build efficient and scalable systems for Cloud Computing. With the explosion of semantic web technologies, large RDF graphs are commonplace. This poses significant challenges for the storage and retrieval of RDF graphs. Current frameworks do not scale for large RDF graphs and as a result do not address these challenges. In this paper, we describe a framework that we built using Hadoop to store and retrieve large numbers of RDF triples by exploiting the cloud computing paradigm. We describe a scheme to store RDF data in Hadoop Distributed File System. More than one Hadoop job (the smallest unit of execution in Hadoop) may be needed to answer a query because a single triple pattern in a query cannot simultaneously take part in more than one join in a single Hadoop job. To determine the jobs, we present an algorithm to generate query plan, whose worst case cost is bounded, based on a greedy approach to answer a SPARQL Protocol and RDF Query Language (SPARQL) query. We use Hadoop's MapReduce framework to answer the queries. Our results show that we can store large RDF graphs in Hadoop clusters built with cheap commodity class hardware. Furthermore, we show that our framework is scalable and efficient and can handle large amounts of RDF data, unlike traditional approaches.

DATA INTEGRITY PROOFS IN CLOUD STORAGE

Cloud computing has been envisioned as the de-facto solution to the rising storage costs of IT Enterprises. With the high costs of data storage devices as well as the rapid rate at which data is being generated it proves costly for enterprises or individual users to frequently update their hardware. Apart from reduction in storage costs data outsourcing to the cloud also helps in reducing the maintenance. Cloud storage moves the user's data to large data centers, which are remotely located, on which user does not have any control. However, this unique feature of the cloud poses many new security challenges which need to be clearly understood and resolved. One of the important concerns that need to be addressed is to assure the customer of the integrity, i.e. correctness, of his data in the cloud. As the data is physically not accessible to the user the cloud should provide a way for the user to check if the integrity of his data is maintained or is compromised. In this paper we provide a scheme which gives a proof of data integrity in the cloud which the customer can employ to check the correctness of his data in the cloud. This proof can be agreed upon by both the cloud and the customer and can be incorporated in the Service level agreement (SLA). This scheme ensures that the storage at the client side is minimal which will be beneficial for thin clients.

EXPLOITING DYNAMIC RESOURCE ALLOCATION FOR EFFICIENT PARALLEL DATA PROCESSING IN THE CLOUD

In recent years ad hoc parallel data processing has emerged to be one of the killer applications for Infrastructure-as-a-Service (IaaS) clouds. Major Cloud computing companies have started to integrate frameworks for parallel data processing in their product portfolio, making it easy for customers to access these services and to deploy their programs. However, the processing frameworks which are currently used have been designed for static, homogeneous cluster setups and disregard the particular nature of a cloud. Consequently, the allocated compute resources may be inadequate for big parts of the submitted job and unnecessarily increase processing time and cost. In this paper, we discuss the opportunities and challenges for efficient parallel data processing in clouds and present our research project Nephele. Nephele is the first data processing framework to explicitly exploit the dynamic resource allocation offered by today's IaaS clouds for both, task scheduling and execution. Particular tasks of a processing job can be assigned to different types of virtual machines which are automatically instantiated and terminated during the job execution. Based on this new framework, we perform extended evaluations of MapReduce-inspired processing jobs on an IaaS cloud system and compare the results to the popular data processing framework Hadoop.
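The cost argument behind dynamic resource allocation in the parallel data processing abstract above can be illustrated with a small back-of-the-envelope model. Everything in it is an assumption for illustration: the `Task` record, the two VM type labels, and the hourly prices are hypothetical and do not come from Nephele or any real cloud tariff. The model compares per-task virtual machines that are started and terminated with their task against a fixed homogeneous cluster kept for the whole job.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    vm_type: str  # hypothetical instance type label
    hours: int    # run time in whole hours


# Illustrative prices per VM-hour, not real tariffs.
PRICE_PER_HOUR = {"small": 1, "large": 4}


def dynamic_cost(stages):
    """Each task gets a VM of its own type that lives only while the task runs."""
    return sum(
        PRICE_PER_HOUR[task.vm_type] * task.hours
        for stage in stages
        for task in stage
    )


def static_cost(stages):
    """A fixed cluster of the most expensive VM type needed, sized for the
    widest stage and kept running for the whole job."""
    width = max(len(stage) for stage in stages)
    priciest = max(
        (task.vm_type for stage in stages for task in stage),
        key=PRICE_PER_HOUR.get,
    )
    duration = sum(max(task.hours for task in stage) for stage in stages)
    return width * PRICE_PER_HOUR[priciest] * duration
```

For a job with four two-hour tasks on small machines followed by a single one-hour task on a large machine, the dynamic model pays only for the machine-hours actually used, while the static cluster pays for four large machines over all three hours.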