Dr. D. Y. Patil Institute of Engineering and Technology, Pune, India 30 Oct - 01 Nov, 2018
Miss. Mane Vidya Maruti
Abstract— Nowadays the use of cloud computing is increasing rapidly, and cloud computing plays an important role in data-sharing applications. Because new data is uploaded to the cloud every day, the amount of duplicate data in the cloud keeps growing. The size of this duplicate data can be reduced using the data deduplication method, whose main aim is to remove duplicate data from the cloud; this also saves storage space and bandwidth. In the proposed method duplicate data is removed, but each user is assigned privileges that govern the duplicate check, and each user has a unique token. Deduplication is achieved using a hybrid cloud architecture. The proposed method is more secure and consumes fewer cloud resources, and it is shown that the proposed scheme has minimal overhead in duplicate removal compared to the normal deduplication technique. In this paper both content-level and file-level deduplication of file data are checked over the cloud.

Keywords—Authorization; data security; privilege; deduplication; credential; hybrid cloud

I. INTRODUCTION

The current era is the era of cloud computing. Cloud computing has a wide scope in data sharing: it provides a large virtual environment that hides the underlying platform and operating system from the user. Users use cloud resources for sharing data and pay according to the resources they consume. Cloud service providers now offer services at very low cost and with high reliability, so a user can upload a large amount of data to the cloud and share it with millions of users. A cloud provider offers different services such as infrastructure as a service, platform as a service, etc., so users do not need to purchase the resources themselves. As data is uploaded by users every day, managing this ever-increasing data on the cloud is a critical task. Deduplication is an effective method for data management in cloud computing: it reduces the size of the stored data and also reduces the amount of data that must be sent over the network, with applications in data management and networking. Instead of keeping redundant copies of the same data, deduplication keeps only the original copy and provides references to that original copy in place of the redundant data [1]. There are two methods of duplication check: file-level duplication check and content-level duplication check. In the file-level check, duplication is detected based on the file name in the storage system, while content-level deduplication checks for duplication of all the content inside a file [1].

Because data deduplication operates on user data, some security mechanism is needed; deduplication raises security and privacy concerns for the user's sensitive data. In the traditional method each user encrypts his own data himself, so there are different ciphertexts of the same file for each new user. To avoid unauthorized data deduplication, convergent data deduplication is proposed in [4] to enforce data confidentiality while still checking for data duplication.

The cloud provides many services, as shown in figure 1, such as platform as a service, infrastructure as a service, and database as a service. In this work we use cloud storage as a service, and we use user credentials to check the authentication of the user. In the hybrid cloud, two types of cloud are present: a private cloud and a public cloud. The private cloud stores the user credentials, and the user data is present in the public cloud.

Fig 1. Cloud architecture and services

The hybrid cloud takes the advantages of both the public cloud and the private cloud; as shown in figure 2, both are present in the hybrid cloud architecture. When a user forwards a request to the public cloud to access data, he needs to submit his information to the private cloud; the private cloud then provides a file token, and the user can get access to the file that resides on the public cloud. This proposed paper uses a hybrid cloud architecture: in file data duplication the file name is checked at the primary level, and data deduplication is checked at the content level. If a user wants to retrieve or download his data file, he needs to download both files from the cloud server; this leads to performing the operation on the same file, which violates the security of the cloud storage.

in all the cases, due to the pigeonhole principle. It will produce a wrong result: if the same identification number is generated for two different blocks of data, one of the blocks is simply removed.
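The two duplicate-check levels described above can be sketched in a small, illustrative index (the class and names are assumptions for illustration, not the paper's implementation): a file-level check by name, and a content-level check by a digest of the bytes.

```python
import hashlib

class DedupIndex:
    """Toy index illustrating the two duplicate-check levels:
    file-level (by file name) and content-level (by a content digest)."""

    def __init__(self):
        self.names = set()    # file-level index: known file names
        self.digests = {}     # content-level index: digest -> original name

    def check_and_add(self, name, data):
        # File-level check: duplicate if this name is already stored.
        if name in self.names:
            return "file-level duplicate"
        # Content-level check: duplicate if the same bytes already exist
        # under another name; only a reference would be kept, not a copy.
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.digests:
            return "content-level duplicate of " + self.digests[digest]
        self.names.add(name)
        self.digests[digest] = name
        return "stored"

idx = DedupIndex()
print(idx.check_and_add("a.txt", b"hello cloud"))  # stored
print(idx.check_and_add("a.txt", b"other data"))   # file-level duplicate
print(idx.check_and_add("b.txt", b"hello cloud"))  # content-level duplicate of a.txt
```

Only the first upload stores the data; later uploads of the same name or the same content are answered with a reference, which is what saves storage space and bandwidth.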
d3 – download/update/upload a file by providing a token/key or any other details.
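The token-based access in the legend above (the private cloud issues a file token from the user's credentials and privilege; the public cloud only verifies it before serving the file) could be sketched as follows. This is a minimal illustration, assuming an HMAC construction and a key shared between the two clouds; none of these names come from the paper.

```python
import hashlib
import hmac
import secrets

# Assumed shared secret between the private and public cloud.
SHARED_KEY = secrets.token_bytes(32)

def issue_token(user, privilege, file_name):
    """Private cloud: bind user, privilege and file name into one token."""
    msg = "{}|{}|{}".format(user, privilege, file_name).encode()
    return hmac.new(SHARED_KEY, msg, hashlib.sha256).hexdigest()

def verify_token(user, privilege, file_name, token):
    """Public cloud: recompute the token and compare in constant time."""
    expected = issue_token(user, privilege, file_name)
    return hmac.compare_digest(expected, token)

t = issue_token("alice", "download", "report.pdf")
print(verify_token("alice", "download", "report.pdf", t))  # True
print(verify_token("alice", "upload", "report.pdf", t))    # False
```

Because the privilege is bound into the token, a token issued for one operation cannot be replayed for another, matching the privilege-based duplicate check the paper describes.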
Fig 4. Confidential Data Encryption.

C. Proof of Data
On file upload and download the user needs to provide a proof of the data [5]: the user must submit the convergent key that was generated at the time of file upload. To generate the hash value of the user data we use the MD5 (Message Digest 5) algorithm; if the data changes in any way, its hash value changes as well.

R = registration of the user, with a specified size, on the cloud.
T = token generated and forwarded to the user through the mail system for activation.
P = user privileges.
H = hash function calculation.
D = matching of the contents of the user-uploaded data against the existing database.
R = {r0, r1}, where

V. GRAPHICAL ANALYSIS

A. File Size
To evaluate the effect of file size on the time breakdown, the worst case is to upload only unique files, so that all the file data must actually be uploaded. The average time of each step over a set of tests with different file sizes is shown in figure 5. The time spent on encryption, downloading and uploading increases linearly with file size, since these operations process the actual file data and perform input/output on the whole file. On the other hand, steps such as token generation and the file duplicate check use only the file metadata, so the time they take remains constant as the file size increases from 10 MB to 120 MB.

To evaluate the effect of the number of files stored in the system, I upload different numbers of unique files and record the breakdown time for every file upload. From Fig 6, every step remains constant along the breakdown time: the checking of the token and of the file data is performed with a hash function and a linear search, which is carried out in case of collision.
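The MD5 proof-of-data and the "hash function plus linear search in case of collision" check can be sketched together. This is an illustrative store (all names are assumptions): blocks are bucketed by digest, and a linear search over the full contents inside each bucket guarantees that two different blocks sharing a digest would not be wrongly merged.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Proof-of-data digest: any change to the data changes its MD5 value."""
    return hashlib.md5(data).hexdigest()

class ContentStore:
    """Digest buckets with a linear search over full contents, so two
    different blocks that happened to share a digest (pigeonhole
    principle) would not be merged by mistake."""

    def __init__(self):
        self.buckets = {}  # digest -> list of stored blocks

    def add(self, data: bytes) -> bool:
        """Return True if the block was new, False if it was a duplicate."""
        bucket = self.buckets.setdefault(md5_hex(data), [])
        for stored in bucket:      # linear search within the bucket
            if stored == data:
                return False       # true duplicate: keep only one copy
        bucket.append(data)        # new digest, or a digest collision
        return True

store = ContentStore()
print(store.add(b"block-1"))  # True  (new block)
print(store.add(b"block-1"))  # False (duplicate, only a reference kept)
print(md5_hex(b"block-1") != md5_hex(b"block-2"))  # True
```

Because the bucket lookup is a dictionary access, the check stays constant-time in the common case, which matches the flat curves reported in the graphical analysis; the linear search only runs on the rare colliding bucket.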
VI. RESULTS
In figure 7 the home screen of this project is displayed, in which different logins are present, such as private cloud, public cloud and user login, for checking deduplication.

Fig.10. File Upload

At the time of the duplicate check the data file is uploaded to the cloud, as shown in figure 10, and the file is then checked for duplication against all the content of that specific file; if no duplicate is found, the file is uploaded to the cloud. Files on the cloud are stored in encrypted format: in this system all data is uploaded to the DriveHQ cloud, where it is stored encrypted, as shown in figure 11.
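Storing files encrypted while still being able to deduplicate them is exactly what the convergent approach of [4] enables: the key is derived from the content itself, so identical files encrypt to identical ciphertexts. The following is a toy sketch of that idea only (the XOR keystream here is for illustration and is not production cryptography; a real system would use an authenticated cipher).

```python
import hashlib

def convergent_key(data: bytes) -> bytes:
    # The key is a digest of the content, so every holder of the same
    # file derives the same key -- the core idea of convergent encryption.
    return hashlib.sha256(data).digest()

def keystream_xor(key: bytes, data: bytes) -> bytes:
    # Toy keystream cipher for illustration only, NOT real crypto:
    # expand the key into a stream and XOR it with the data.
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

data = b"same file contents"
key = convergent_key(data)
ct1 = keystream_xor(key, data)
ct2 = keystream_xor(convergent_key(data), data)
print(ct1 == ct2)                       # identical files -> identical ciphertext
print(keystream_xor(key, ct1) == data)  # XOR with the same stream decrypts
```

Since two users with the same file produce the same ciphertext, the cloud can deduplicate the encrypted copies without ever seeing the plaintext.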
VII. CONCLUSION
REFERENCES
[1]. J. Li, Y. K. Li, X. Chen, P. P. C. Lee, and W. Lou, "A hybrid cloud approach for secure authorized deduplication," IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, 2014.
[3]. P. Anderson and L. Zhang, "Fast and secure laptop backups with encrypted de-duplication," in Proc. of USENIX LISA, 2010.
[4]. J. Li, X. Chen, M. Li, J. Li, P. Lee, and W. Lou, "Secure deduplication with efficient and reliable convergent key management," IEEE Transactions on Parallel and Distributed Systems, 2013.