ON
MASTER OF ENGINEERING
(COMPUTER ENGINEERING)
By
Anita Dumbre
Prof. Monika Rokade
CERTIFICATE
This is to certify that the Dissertation-II report entitled
“Secure Data Deduplication with Role Based Access Control in Cloud Computing Environment”
has been submitted by
Anita Dumbre
————————— —————————
Internal Examiner External Examiner
Date :
Place : Dumbarwadi, Otur
Acknowledgement
It is my privilege to acknowledge, with a deep sense of gratitude, my project guide Prof. Monika
Rokade and the Head of the Computer Department, Prof. Gholap P. S., for their valuable suggestions,
guidance throughout the course of study, and timely help in the progress of my dissertation on “Secure
Data Deduplication with Role Based Access Control in Cloud Computing Environment”.
It is indeed a moment of immense satisfaction to express my profound gratitude to our
P. G. coordinator, Prof. Rokade M. D., whose enthusiasm was a source of inspiration for my work.
My special thanks to Dr. G. U. Kharat, Principal, Sharadchandra Pawar College of Engineering for his
valuable support.
I would also like to thank all the other faculty members of the Computer Engineering Department,
who directly or indirectly kept alive the enthusiasm and momentum required for an effective dissertation
and guided me in their own capacities in all possible ways. Last but not least, I would like to thank my
family and friends for their continuous support and encouragement towards this degree.
Anita Dumbre
Abstract
This project gives an overview of cloud computing, cloud file services, their usability, and storage.
Storage optimization is also considered through an analysis of current data de-duplication techniques,
processes, and implementations, for the benefit of both cloud service providers and cloud users. The
project also proposes an effective method for detecting and eliminating duplicates by computing file
digests with file checksum algorithms, which takes less time than previously implemented methods.
The proposed method deletes duplicate data, but the duplicate check also depends on the privileges
allocated to each user, and each user has its own unique token. Cloud deduplication is accomplished
using the hybrid cloud model. The proposed technique is more reliable and uses fewer cloud resources.
It is also shown that, relative to the standard deduplication technique, the proposed scheme has limited
overhead in duplicate elimination. Both content-level and file-level deduplication of file data through
the cloud are reviewed in this document.
Keywords: Data deduplication, Delta compression, Storage system, Index structure, Performance eval-
uation.
Contents

Acknowledgment
List of Figures
List of Tables
Abstract
1 Introduction
1.1 Overview
1.2 Problem Definition and Objective
1.2.1 Problem Definition
1.2.2 Objectives
1.3 Motivation
2 LITERATURE REVIEW
4 Objectives
4.1 Objectives
5 Dissertation Plan
5.1 Area of Dissertation
5.2 Plan of Dissertation Execution
5.2.1 Purpose of the Document
5.3 Proposed Methodology
5.4 Implementation Details
5.4.1 Java
5.4.2 MySQL
5.4.3 Eclipse
5.5 Algorithm
5.5.1 Algorithm 1: Hash Generation
5.5.2 Algorithm 2: Encryption and Decryption
5.5.3 Algorithm 3: Role Based Access Control
5.6 Feasibility Study
5.7 Technical Feasibility
5.8 Economical Feasibility
5.9 Operational Feasibility
5.10 Time Feasibility
5.11 Risk Management
5.11.1 Project Risks
5.12 Risk Mitigation, Monitoring and Management (RMMM) Plan
5.13 Dissertation Schedule
5.13.1 Dissertation Task
5.13.2 Installation and Configuration Task
7.2.2 DFD Level 1
7.3 UML Diagram
7.3.1 Use-case Diagram
7.3.2 Activity Diagram
7.3.3 Sequence Diagram
7.3.4 Class Diagram
8 TESTING
8.1 Introduction
8.1.1 Principles of Testing
8.2 Testing Scope
8.2.1 Major Functionalities
8.3 Basics of Software Testing
8.3.1 White-box Testing
8.3.2 Black-box Testing
8.3.3 Unit Testing
8.3.4 Integration Testing
8.3.5 Validation Testing
8.4 Test Strategy
8.4.1 Testing Process
8.4.2 Functional and Non-functional Testing
8.5 Test Cases and Results
8.5.1 Test Cases of System
REFERENCES
List of Figures
List of Tables
CHAPTER 1
Introduction
1.1 Overview
Data deduplication technology usually identifies redundant data quickly and correctly by using a file
checksum technique. A checksum can determine whether redundant data is present; however, false
positives can occur. To avoid false positives, a new chunk must be compared with chunks of data that
have already been stored, and to reduce the time needed to exclude false positives, current research uses
extraction of the file data checksum. As a result of this data duplication, managing storage and reducing
its cost is one of the most difficult and important tasks in a mass storage system. Data deduplication is
an efficient data reduction approach that not only reduces storage space by removing duplicate data but
also reduces the transmission of redundant data in low-bandwidth network environments, and it has
become increasingly popular in recent years as a highly efficient data reduction tool. Cloud computing
is an evolving trend in information and communication technology for the modern century. The target
file store keeps multiple attributes such as user ID, filename, size, extension, checksum, and date-time.
Whenever a user uploads a particular file, the system first calculates its checksum, and that checksum is
cross-verified with the checksum data stored in the database. If the file already exists, the system updates
the existing entry; otherwise, it makes a new entry in the database. The system involves three kinds of
entities: data owners (owners), the cloud server (server), and data consumers (users). Cloud computing
is a term which involves virtualization, distributed computing, networking, software, and web services.
A cloud consists of several elements such as clients, datacenters, and distributed servers, and it offers
fault tolerance, high availability, scalability, flexibility, reduced overhead for users, reduced cost of
ownership, on-demand services, etc.
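The upload-time duplicate check described above can be sketched in Java as follows. This is a minimal illustration, not the project's actual code: the in-memory map stands in for the database's checksum table, and the class and method names are hypothetical.

```java
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

public class DuplicateChecker {
    // Stands in for the checksum column of the file-metadata table in the database.
    private final Map<String, String> checksumToOwner = new HashMap<>();

    // Compute the SHA-256 digest of a file's contents as a hex string.
    static String checksum(byte[] data) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(data)) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // On upload: if the checksum already exists, report a duplicate (the system
    // would update the existing entry); otherwise record a new entry for this owner.
    public boolean isDuplicate(byte[] fileData, String ownerId) throws Exception {
        String sum = checksum(fileData);
        if (checksumToOwner.containsKey(sum)) {
            return true; // duplicate: link the upload to the existing copy
        }
        checksumToOwner.put(sum, ownerId); // new entry in the table
        return false;
    }
}
```

Because the checksum is computed over the file contents, the same file uploaded by two different users maps to one stored copy, which is the deduplication effect described above.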
1.2.2 Objectives
The objectives of this research work include the following.
• To increase the storage utilization and reduce network bandwidth for cloud storage providers
• To develop algorithms of Role Based Access Control (RBAC) for secure data access.
1.3 Motivation
• In the existing system, a user can be a Data Owner and a Data Consumer simultaneously.
• Authorities are assumed to have powerful computation abilities, and they are supervised by gov-
ernment offices because some attributes partially contain users’ personally identifiable informa-
tion.
• The whole attribute set is divided into N disjoint sets, each controlled by one authority; therefore
each authority is aware of only part of the attributes.
• A Data Owner is the entity who wishes to outsource encrypted data file to the Cloud Servers.
• The Cloud Server, who is assumed to have adequate storage capacity, does nothing but store them.
• Newly joined Data Consumers request private keys from all of the authorities, and they do not
know which attributes are controlled by which authorities.
LITERATURE REVIEW
Kaiping Xue et al. [1] propose a novel heterogeneous framework to remove the problem of the
single-point performance bottleneck and provide a more efficient access control scheme with an auditing
mechanism. The framework employs multiple attribute authorities to share the load of user legitimacy
verification, while a CA (Central Authority) is introduced to generate secret keys for legitimacy-verified
users. Unlike other multi-authority access control schemes, each of the authorities in this scheme manages
the whole attribute set individually. To enhance security, the authors also propose an auditing mechanism
to detect which AA (Attribute Authority) has incorrectly or maliciously performed the legitimacy
verification procedure.
Kan Yang et al. [2] proposed a revocable multi-authority CP-ABE scheme and apply it as the
underlying technique to design a data access control scheme. Their attribute revocation method can
efficiently achieve both forward security and backward security. The result is an expressive, efficient,
and revocable data access control scheme for multi-authority cloud storage systems, where multiple
authorities co-exist and each authority is able to issue attributes independently.
The system in [3] proposed a secure way of anti-collusion key distribution without any secure third-party
channels, so that users can securely obtain their private keys from the group owner. Second, the method
provides fine-grained access control: any user in the group can use the resources in the cloud, while
revoked users cannot access the cloud again after they are revoked. Third, the system can shield the
scheme from collusion attacks, which means that revoked users cannot obtain the actual data file even
if they collude with the untrusted cloud. In this approach, by exploiting polynomial functions, the
framework achieves a secure user revocation scheme with fine efficiency, which implies that the remaining
users need not update their keys when a user is revoked from the group.
The work in [4] proposes the first Key-Policy Attribute-Based Encryption (KP-ABE) scheme allowing
for non-monotonic access structures (i.e., structures that may contain negated attributes) with constant
ciphertext size. Towards this goal, the authors first show that a certain class of identity-based broadcast
encryption schemes generically yields monotonic KP-ABE systems in the selective-set model. They then
describe a new efficient identity-based revocation mechanism that, when combined with a particular
instantiation of the general monotonic construction, gives rise to the first truly expressive KP-ABE
realization with constant-size ciphertexts.
F. Zhang and K. Kim [5] proposed ID-based blind and ring signature approaches; both constructions
are based on bilinear pairings, and the authors analyze their security and efficiency against different
existing strategies. In this project, the Java Pairing-Based Cryptography library (JPBC) is used for data
encryption and decryption, and user access control policies are designed for end users, which also
enhance the privacy and anonymity of the data owner.
The approach in [6] proposes the first identity-based threshold ring signature scheme that does not rely
on pairings, as well as the first identity-based threshold verifiable ring signature strategy. The authors
also show that the secrecy of the actual signers is maintained even against the private key generator
(PKG) of the identity-based system, and finally show how the scheme relates to other existing schemes.
Owing to the different levels of signer anonymity they support, the schemes proposed in that paper form
a suite of identity-based threshold ring signature methods applicable to many real-world systems with
varied anonymity needs.
In [7], the system first validates the security requirements of the whole architecture and then adds to the
security architecture. The system uses an AES 128-bit encryption approach for end-to-end user
verification and data encryption/decryption.
According to Kan Yang [8], Ciphertext-Policy Attribute-Based Encryption (CP-ABE) is a promising
technique for access control of encrypted data, but it requires a trusted authority that manages all the
attributes and distributes keys in the system. In cloud storage systems, multiple authorities co-exist
and each authority is able to issue attributes independently; however, existing CP-ABE schemes cannot
be directly applied to data access control for multi-authority cloud storage systems, due to the
inefficiency of decryption and revocation. The paper proposes DAC-MACS (Data Access Control for
Multi-Authority Cloud Storage), an effective and secure data access control scheme with efficient
decryption and revocation. Specifically, it constructs a new multi-authority CP-ABE scheme with
efficient decryption and also designs an efficient attribute revocation method that can achieve both
forward security and backward security.
The system in [9] proposed CaCo, an efficient Cauchy coding approach for data storage in the cloud.
First, CaCo uses Cauchy matrix heuristics to produce a matrix set. Second, for each matrix in this set,
CaCo uses XOR scheduling heuristics to generate a series of schedules, and it then selects the shortest
one from all the produced schedules. In this way, CaCo is able to identify an optimal coding scheme,
within the capability of the current state of the art, for an arbitrarily given redundancy configuration.
The authors also implement CaCo in the cloud distributed file system and evaluate its performance by
comparing with "Cloud 2.5", showing that the system enhances security in the distributed file system
with an effective data storage scheme.
Ibrahim Adel [10] defines a new replica placement policy for HDFS. The issue of load balancing is
addressed in this work by evenly distributing replicas to cluster nodes, so there is no longer any need
for a load balancing utility. IDPM can generate replica distributions that are perfectly even and satisfy
all HDFS replica placement rules, as confirmed by the simulation results, and there is exciting future
work for the proposed policy. Under the default HDFS replica placement policy, the replicas of data
blocks cannot be evenly distributed across cluster nodes, so the current HDFS has to rely on a load
balancing utility to balance replica distributions, which consumes more time and resources. These
challenges drive the need for intelligent methods that solve the data placement problem and achieve
high performance without a load balancing utility.
3.2 Scope
• The research work focuses on cloud data storage security, which has always been an important
aspect of quality of service.
• To ensure the correctness of cloud clients' data in the cloud, this work proposes a highly effective
and flexible distributed scheme with two features, as opposed to its predecessors.
CHAPTER 4
Objectives
4.1 Objectives
• To increase the storage utilization and reduce network bandwidth for cloud storage providers
• To develop algorithms of Role Based Access Control (RBAC) for secure data access.
Dissertation Plan
5.4.1 Java
• Object Oriented: In Java, everything is an object. Java can be easily extended since it is based on
the object model.
• Platform Independent: Unlike many other programming languages, including C and C++, Java is
compiled not into platform-specific machine code but into platform-independent byte code. This
byte code is distributed over the web and interpreted by the Java Virtual Machine (JVM) on
whichever platform it is run.
• Simple: Java is designed to be easy to learn. If you understand the basic concepts of OOP, Java is
easy to master.
• Secure: Java's security features enable the development of virus-free, tamper-free systems.
Authentication techniques are based on public-key encryption.
• Architecture-neutral: The Java compiler generates an architecture-neutral object file format, which
makes the compiled code executable on many processors in the presence of the Java runtime
system.
• Portable: Being architecture-neutral and having no implementation-dependent aspects of the
specification makes Java portable. The compiler in Java is written in ANSI C with a clean
portability boundary, which is a POSIX subset.
• Robust: Java makes an effort to eliminate error-prone situations by emphasizing compile-time
error checking and runtime checking.
• Multithreaded: Java's multithreading makes it possible to write programs that perform many tasks
simultaneously, allowing developers to construct interactive applications that run smoothly.
• Interpreted: Java byte code is translated on the fly to native machine instructions and is not stored
anywhere. The development process is more rapid and analytical since linking is an incremental
and lightweight process.
• High Performance: With the use of Just-In-Time compilers, Java enables high performance.
• Distributed: Java is designed for the distributed environment of the internet.
• Dynamic: Java is considered more dynamic than C or C++ since it is designed to adapt to an
evolving environment. Java programs can carry an extensive amount of run-time information that
can be used to verify and resolve accesses to objects at run-time.
5.4.2 MySQL
MySQL is an open-source relational database management system (RDBMS). The MySQL™ software
delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database
server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for em-
bedding into mass-deployed software. MySQL is under two different editions: the open source MySQL
Community Server and the proprietary Enterprise Server. MySQL Enterprise Server is differentiated
by a series of proprietary extensions which install as server plugins, but otherwise shares the version
numbering system and is built from the same code base.
5.4.3 Eclipse
There are many ways to learn how to program in Java. Many developers believe there are advantages
to learning Java using the Eclipse integrated development environment (IDE). Some of these are listed
below:
• Eclipse provides a number of aids that make writing Java code much quicker and easier than using
a text editor. This means that you can spend more time learning Java, and less time typing and
looking up documentation.
• The Eclipse debugger and scrapbook allow you to look inside the execution of the Java code. This
allows you to “see” objects and to understand how Java is working behind the scenes.
• Eclipse provides full support for agile software development practices such as test-driven devel-
opment and refactoring. This allows you to learn these practices as you learn Java.
• If you plan to do software development in Java, you will need to learn Eclipse or some other IDE,
so learning Eclipse from the start will save you time and effort.
• The chief concern with learning Java in an IDE is that learning the IDE itself will be difficult and
will distract you from learning Java. It is hoped that this tutorial makes learning the basics of
Eclipse relatively painless so you can focus on learning Java.
5.5 Algorithm
Role Based Access Control (sharing check):
Step 1: a ← file-key list[1 … n]
Step 2: k ← Email-ID list[1 … n]
Step 3: for each stored sharing record
    if (keyData.equals(a) and userEmailID.equals(k)) then
        show the user's file-share information
    else
        show that the file is not shared with the user
    end if
end for
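The sharing check above can be rendered as a small Java sketch. This is an illustrative version under assumed names (`ShareEntry`, `isSharedWith`): access is granted only when both the file key and the requesting user's email ID match a stored sharing record.

```java
import java.util.List;

public class ShareCheck {
    // One stored sharing record: a file key granted to a user's email ID.
    static class ShareEntry {
        final String fileKey;
        final String email;
        ShareEntry(String fileKey, String email) {
            this.fileKey = fileKey;
            this.email = email;
        }
    }

    // Returns true (file-share information is shown) only when both the
    // file key and the requesting user's email ID match a stored entry;
    // otherwise the file is reported as not shared with this user.
    public static boolean isSharedWith(List<ShareEntry> entries,
                                       String fileKey, String userEmail) {
        for (ShareEntry e : entries) {
            if (e.fileKey.equals(fileKey) && e.email.equals(userEmail)) {
                return true;
            }
        }
        return false;
    }
}
```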
Operational feasibility assesses the extent to which the required web application performs a series of
steps to solve problems and user requirements. This feasibility depends on human resources (the
software development team) and involves visualizing whether the software will operate after it is
developed and be operative once it is installed.
3. Compilation of Assignments
5. Testing of a system.
6.1 Introduction
A Software Requirements Specification document describes the intended purpose, requirements, and
nature of the software to be developed. It also includes the yield and cost of the software. A Software
Requirements Specification (SRS) describes the nature of a software system or application. In simple
words, an SRS is a manual for the software, provided it is prepared before you kick-start the
software/application. A software requirements document is primarily prepared for a software system
or any other kind of application.
• To ensure the correctness of cloud clients' data in the cloud, this work proposes a highly effective
and flexible distributed scheme with two features, as opposed to its predecessors.
• Members can upload files, access them as needed, use the files shared by the data owner, and take
organization backup by means of the admin. The system provides authentication for each
individual file uploaded to the cloud, irrespective of operating system and device, which increases
the security of the backup.
• The proposed system is improved with multi-factor authentication and better combinations for
data encryption; the PBEWithMD5AndDES algorithm and the Secure Hash Algorithm achieve
the required security goals.
The system is designed in such a way that it is easy for the user to interact with it. Documentation will
also be provided so that it is easy for the user to understand the working of the system.
• Reusability
• Performance
• Proper output
• Response Time: Very quick, as an interactive system responds to user input. In terms of the web,
definitions vary; a good response time runs from the moment the user submits the request for a
page to the moment the page begins to render.
• Workload: The workload depends on the server configuration. The system is efficient for bulk
loads as well.
• Scalability: The system can scale as per requirements; the hardware configuration plays a vital
role in this.
• Platform: Java is the platform; the IDS concept is the heart of this system.
• Memory :- 2 GB or above
• It will execute all commands such as DML, DDL, and DCL; security measures against SQL
injection are also required.
• JSP, HTML
Back-End
• MySQL
• Memory :- 2 GB or above
Implementation Procedure
Registration and Authentication: In this phase all entities register; the data owner, multiple admins,
and users can create their own profiles.
Data Uploading: In the first phase, the data owner uploads the file. In this module, data encryption is
done using the PBEWithMD5AndDES and SHA-256 schemes, and at the same time the keys are sent to
the EC2 cloud. The data owner uploads a file for backup and can allow the file to be accessed by a
friend user, who can access the uploaded file by entering the proper credentials sent to his registered
email ID. Files uploaded by a user are scanned by the algorithm, and if the same contents exist in an old
file and the current new file, the previous file is kept. If a common file whose contents are exactly
similar is uploaded by multiple users, it is registered as the first owner's file, and the others can access it
as friend users, which avoids duplication of the file at the server.
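The PBEWithMD5AndDES step above uses the standard Java Cryptography Extension (JCE), which ships that algorithm in every Java platform. The sketch below shows the usual JCE pattern; the salt and iteration count are illustrative values (in the real system they would be stored alongside the ciphertext), and the class name is hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.PBEParameterSpec;

public class PbeCrypto {
    // Illustrative parameters; real deployments store the salt with the ciphertext.
    private static final byte[] SALT = {1, 2, 3, 4, 5, 6, 7, 8};
    private static final int ITERATIONS = 1000;

    // Derive a DES key from the password and configure the cipher for the given mode.
    private static Cipher cipher(char[] password, int mode) throws Exception {
        SecretKey key = SecretKeyFactory.getInstance("PBEWithMD5AndDES")
                .generateSecret(new PBEKeySpec(password));
        Cipher c = Cipher.getInstance("PBEWithMD5AndDES");
        c.init(mode, key, new PBEParameterSpec(SALT, ITERATIONS));
        return c;
    }

    public static byte[] encrypt(char[] password, byte[] plain) throws Exception {
        return cipher(password, Cipher.ENCRYPT_MODE).doFinal(plain);
    }

    public static byte[] decrypt(char[] password, byte[] enc) throws Exception {
        return cipher(password, Cipher.DECRYPT_MODE).doFinal(enc);
    }
}
```

Because the key is derived from the user's password with a salt and iteration count, only a party knowing the password (and the stored parameters) can recover the file contents.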
Data Sharing: In this phase, data sharing is done by the data owner, who can share any file with any
user in the cloud group. A friend user can access the file shared with him by the data owner by
following the login process with the proper credentials sent to him for that particular file.
Access Control and Revocation: Under access control, any user can view or access the files shared
with him. Under revocation, the data owner can revoke a specific user's access to a file. A common file
uploaded by multiple users is not deleted immediately; it is deleted according to the maximum retention
time specified among the multiple users.
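The grant/revoke behaviour described above can be sketched with a simple in-memory grants table. This is an illustrative model only (class and method names are assumed); the real system would back this with the database and the RBAC roles.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class AccessControl {
    // fileId -> set of user email IDs currently allowed to access that file.
    private final Map<String, Set<String>> grants = new HashMap<>();

    // The data owner shares a file with a user.
    public void grant(String fileId, String user) {
        grants.computeIfAbsent(fileId, k -> new HashSet<>()).add(user);
    }

    // Revocation: the data owner removes a specific user's access.
    public void revoke(String fileId, String user) {
        Set<String> users = grants.get(fileId);
        if (users != null) {
            users.remove(user);
        }
    }

    // Access check performed before a view or download request is served.
    public boolean canAccess(String fileId, String user) {
        return grants.getOrDefault(fileId, Set.of()).contains(user);
    }
}
```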
File Request and Download: A user can send a download request to the cloud server, and at the same
time the Data Owner verification is done.
Cloud Storage Service Provider (CSP): The Cloud Storage Service Provider provides the database
and allows the data owner to keep any kind of information. The CSP also allows the user to create a
user-defined database schema; the space for the user's instance is allocated by the CSP according to the
user's requirements.
TESTING
8.1 Introduction
Testing is a very important phase of the software development life cycle. The purpose of this phase is to
check the system over its lifetime, and it is a compulsory phase. The information given in this section
details the testing activities that should be carried out for the proposed scheme. The tester has to
estimate the tests for each component and write test cases according to the user requirements and
system structure.
• User registration
• Data uploading
• Data store in DB
• Access by GUI
• Classification results
• Programming style
• Control method
• Source language
• Database design
This type of test is useful to catch defects at the structural level; it goes below the top or functional
layer to expose defects. Test case design methods:
• Statement coverage
• Decision coverage
• Condition coverage
• Path coverage
• Server connection
• Data upload
• Feature extraction
• Classification
• Results
Requesting and building the test database environment; executing project integration tests by running
test cases from the integration of the application; testing all test cases after deployment on the system;
and signoff, the final stage when everything is complete.
• The expected time of results after each testing module is identified.
• The testing equipment and reference documents required for execution have to be listed.
10.1 Conclusion
We have studied various cryptographic techniques, encryption standards, and the de-duplication process,
and we apply them to develop an organization-specific, independent, cloud-based secured data backup
system with multi-factor authentication. To protect data confidentiality along with secure de-duplication,
the notion of authorized de-duplication is proposed. To carry out the duplicate check, the privileges
assigned to the user are checked first; instead of the data itself, the duplicate check is based on the
differential privileges of users. Here, the problem of privacy preservation in de-duplication in the cloud
environment is considered, and an advanced scheme supporting differential authorization and authorized
duplicate checks is proposed. This project addresses the issues in authorized de-duplication to achieve
better security. We showed that our authorized duplicate check scheme incurs minimal overhead
compared to convergent encryption and network transfer.
REFERENCES
[1] K. Xue, Y. Xue, J. Hong, W. Li, H. Yue, D. S. Wei, and P. Hong, "RAAC: Robust and auditable
access control with multiple attribute authorities for public cloud storage," IEEE Transactions on
Information Forensics and Security, vol. 12, no. 4, pp. 953-967, April 2017.
[2] K. Yang and X. Jia, "Expressive, efficient, and revocable data access control for multi-authority
cloud storage," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 7, July 2014.
[3] Z. Zhu and R. Jiang, "A secure anti-collusion data sharing scheme for dynamic groups in the
cloud," IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 1, January 2016.
[4] N. Attrapadung, B. Libert, and E. de Panafieu, "Expressive key-policy attribute-based encryption
with constant-size ciphertexts," 2011.
[5] F. Zhang and K. Kim, "ID-based blind signature and ring signature from pairings," in ASIACRYPT
2002, vol. 2501 of Lecture Notes in Computer Science, pp. 533-547, Springer, 2002.
[6] J. Han, Q. Xu, and G. Chen, "Efficient ID-based threshold ring signature scheme," in EUC (2),
pp. 437-442, IEEE Computer Society, 2008.
[7] J. Yu, R. Hao, F. Kong, X. Cheng, J. Fan, and Y. Chen, "Forward-secure identity-based signature:
Security notions and construction," Information Sciences, 181(3):648-660, 2011.
[8] K. Yang and X. Jia, "DAC-MACS: Effective data access control for multi-authority cloud storage
systems," in Security for Cloud Storage Systems, pp. 59-83, Springer, New York, NY, 2014.
[9] G. Zhang et al., "CaCo: An efficient Cauchy coding approach for cloud storage systems," IEEE,
February 2016.
[10] I. A. Ibrahim et al., "Intelligent data placement mechanism for replicas distribution in cloud
storage systems," in 2016 IEEE International Conference on Smart Cloud.