
ABSTRACT

In cloud storage services, deduplication technology is commonly used to reduce the space and bandwidth requirements of the service by eliminating redundant data and storing only a single copy of it. Deduplication is most effective when multiple users outsource the same data to the cloud storage, but it raises issues relating to security and ownership. Proof-of-ownership schemes allow any owner of the same data to prove to the cloud storage server that he owns the data in a robust way. However, many users are likely to encrypt their data before outsourcing them to the cloud storage to preserve privacy, but this hampers deduplication because of the randomization property of encryption. Recently, several deduplication schemes have been proposed to solve this problem by allowing each owner to share the same encryption key for the same data. However, most of the schemes suffer from security flaws, since they do not consider the dynamic changes in the ownership of outsourced data that occur frequently in a practical cloud storage service. In this paper, we propose a novel server-side deduplication scheme for encrypted data. It allows the cloud server to control access to outsourced data even when the ownership changes dynamically, by exploiting randomized convergent encryption and secure ownership group key distribution. This prevents data leakage not only to revoked users, even though they previously owned that data, but also to an honest-but-curious cloud storage server. In addition, the proposed scheme guarantees data integrity against any tag inconsistency attack. Thus, security is enhanced in the proposed scheme. The efficiency analysis results demonstrate that the proposed scheme is almost as efficient as the previous schemes, while the additional computational overhead is negligible.
TABLE OF CONTENTS

S. No. Topics Page No.


Certificate i
Acknowledgement ii
Abstract iii
Table of Contents iv
List of Figures vi
List of Tables vii
Abbreviations viii
CHAPTER 1 INTRODUCTION 1-5
1.1. Introduction 1
1.1.1. Problem Statement 4
1.1.2. Contribution 5
1.1.3. Organization of the Project 5
CHAPTER 2 Literature Survey 6-11
CHAPTER 3 System Requirement Specification 12-21
3.1. General Description 12
3.1.1 User Perspective 12
3.2 Non Functional Requirement 12
3.3 System Requirement 13
3.3.1 Hardware Requirement 13
3.3.2 Software Requirement 13
3.4 Feasibility Study 14
3.4.1 Technical Feasibility 14
3.4.2 Economic Feasibility 14
3.4.3 Operational Feasibility 15
3.5 Resource Requirement 15
3.5.1 Java 15
3.5.2 Java Server Page 17
3.5.3 Java Script 18
3.5.4 JDBC 19
CHAPTER 4 System Analysis 22-40
4.1. Introduction to System Analysis 22
4.1.1 System 22
4.1.2 System Analysis 22
4.2 Existing System 22
4.3 Disadvantages 23
4.4 Proposed System 23
4.5. Advantages of Proposed System 23

4.6 Project Module Description 24
4.6.1 Admin Modules 24
4.6.2 User Modules 24
4.7. Technique and Algorithms 25
4.8 System Design 28
4.8.1 The Activities of Design Process 28
4.9 System Architecture 30
4.10 Use Case Diagram 31
4.10.1 Admin Use Case Diagram 31
4.10.2 User Use Case Diagram 32
4.11 Data Flow Diagram 33
4.11.1 DFD Admin Session 33
4.11.2 DFD User Session 34
4.11.3 DFD File Upload Process 35
4.11.4 DFD File Download Process 36
4.12 Sequence Diagram 37
4.13 Context Analysis 38
4.14 Flow Chart 39
CHAPTER 5 Testing 41-60
5.1. Definition 41
5.2. Validation and System Testing 41
5.3. Integration Testing 43
5.4. Unit Testing 44
5.5. Unit Test Cases 45
CHAPTER 6 Result 46-60
CHAPTER 7 Conclusion 61
CHAPTER 8 References 62-65

LIST OF FIGURES AND GRAPHS

Figure No. Name of Figure Page No.


Figure 2.1 Cloud Computing Architecture 10
Figure 3.5.2.1 Architecture of JSP model 1 17
Figure 4.9.1 System Architecture 30
Figure 4.10.1.1 Admin Use Case Diagram 31
Figure 4.10.2.1 User Use Case Diagram 32
Figure 4.11.1.1 DFD Admin Session Diagram 33
Figure 4.11.2.1 DFD User Session Diagram 34
Figure 4.11.3.1 DFD File Upload Process 35
Figure 4.11.4.1 DFD File Download Process 36
Figure 4.12.1 Sequence Diagram for Upload Process 37
Figure 4.14.1 Flow Chart For Upload Process 38
Figure 4.14.2 Flow Chart For Block Signature 39

LIST OF TABLES

Table No. Table Name Page No.


Table 5.4 Unit Test Cases 45
Table 5.5 System Test Cases 45

LIST OF ABBREVIATIONS

JVM Java Virtual Machine


PoW Proof of Ownership
MLE Message Locked Encryption
MH Mobile host
AES Advanced Encryption Standard
DSR Dynamic Source Routing
RREQ Route Request
RREP Route Reply
TORA Temporally Ordered Routing Algorithm
ZRP Zone Routing Protocol
TCP Transmission Control Protocol
QOS Quality of Service
GPF Greedy Packet Forwarding
MFR Most Forwarded within R
NFP Nearest with Forward Progress
DREAM Distance Routing Effect Algorithm for Mobility Protocol
MLBRBCC Multipath Load Balancing and Rate Based Congestion Control
EDAODV Early Detection of Congestion and Control Routing Protocol
CBCC Cluster Based Congestion Control
RBCC Rate Based Congestion Control
SMORT Scalable Multipath On Demand Routing
EECCS Efficient Energy Based Congestion Control Scheme
ACK Acknowledgement

CHAPTER 1
INTRODUCTION
Cloud storage is a digital storage solution that uses multiple servers (typically
spread across multiple locations) to store files, such as site backups, safely. The data is
stored on dedicated servers, which provides access wherever an internet connection is
available, along with improved backup security. Cloud computing provides scalable,
low-cost, and location-independent online services ranging from simple backup services
to cloud storage infrastructures. The rapid growth of data volumes stored in cloud
storage has led to an increased demand for techniques for saving disk space and
network bandwidth.

To reduce resource consumption, many cloud storage services, such as Dropbox,
Wuala, Mozy, and Google Drive, employ a deduplication technique, in which the cloud
server stores only a single copy of redundant data and provides links to that copy,
instead of storing further actual copies of the data, regardless of how many clients
ask to store it.

The savings are significant; reportedly, business applications can achieve disk
and bandwidth savings of more than 90%. However, from a security perspective, the
shared use of users' data raises a new challenge. As users are concerned about their
private data, they may encrypt their data before outsourcing in order to protect data
privacy from unauthorized outside adversaries, as well as from the cloud service
provider.

This is justified by current security trends and numerous industry regulations, for
example, PCI DSS. However, conventional encryption makes deduplication impossible,
for the following reason. Deduplication techniques take advantage of data similarity
to identify identical data and reduce the storage space.

Conversely, encryption algorithms randomize the encrypted files in order to make
the ciphertext indistinguishable from theoretically random data. Encryption of the same
data by different users with different encryption keys results in different ciphertexts,
which makes it difficult for the cloud server to determine whether the underlying plain
data are the same and to deduplicate them.

Say a user Alice encrypts a file M under her secret key skA and stores the corresponding
ciphertext CA. Bob would store CB, which is the encryption of M under his secret key
skB. Two issues then arise: (1) how can the cloud server detect that the underlying
file M is the same, and (2) even if it can detect this, how can it enable both parties
to recover the stored data, given their distinct secret keys? Naive client-side
encryption that is secure against a chosen-plaintext attack, with randomly chosen
encryption keys, prevents deduplication.
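The effect can be seen in a few lines of code. The sketch below (an illustration only; the class and method names are our own, not part of the proposed scheme) encrypts the same plaintext under two independently generated AES keys, mimicking Alice and Bob: the server receives two unrelated ciphertexts and cannot tell that the underlying file is the same.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.util.Arrays;

public class RandomKeyDedupProblem {
    // Each call models one user encrypting under a freshly chosen random key.
    static byte[] encryptUnderFreshKey(byte[] plaintext) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey(); // each user picks an independent random key
        // ECB keeps the example deterministic per key; the mismatch below comes
        // purely from the differing keys.
        Cipher c = Cipher.getInstance("AES/ECB/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, key);
        return c.doFinal(plaintext);
    }

    public static void main(String[] args) throws Exception {
        byte[] m = "the same file M".getBytes("UTF-8");
        byte[] ca = encryptUnderFreshKey(m); // Alice's ciphertext CA
        byte[] cb = encryptUnderFreshKey(m); // Bob's ciphertext CB
        // The server sees two unrelated ciphertexts and cannot deduplicate them.
        System.out.println(Arrays.equals(ca, cb)); // false (with overwhelming probability)
    }
}
```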

One naive solution is to have every client encrypt the data with the public key of
the cloud storage server. The server can then deduplicate the identified data by
decrypting it with the corresponding private key. However, this solution allows the
cloud storage server to obtain the outsourced plain data, which may violate the privacy
of the data if the cloud server cannot be fully trusted [13],[14].

Convergent encryption [15] resolves this issue effectively. A convergent encryption
algorithm encrypts an input file using the hash value of the input file as the
encryption key. The ciphertext is given to the server and the user keeps the encryption
key. Since convergent encryption is deterministic, identical files are always encrypted
into identical ciphertexts, regardless of who encrypts them.

In this way, the cloud storage server can perform deduplication over the ciphertext,
and all owners of the file can download the ciphertext (optionally after the
proof-of-ownership (PoW) process) and decrypt it later, since they hold the same
encryption key for the file.
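A minimal sketch of convergent encryption follows (the class name and the choice of AES-128 in CBC mode with a fixed IV are our own illustrative assumptions, not the exact construction of [15]): the key is the SHA-256 hash of the file itself, so any two owners of the same file derive the same key and produce the same ciphertext.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.MessageDigest;
import java.util.Arrays;

public class ConvergentEncryption {
    // Derive the encryption key deterministically from the file content itself.
    static byte[] deriveKey(byte[] data) throws Exception {
        MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
        return Arrays.copyOf(sha256.digest(data), 16); // truncate to a 128-bit AES key
    }

    // Encrypt with AES-CBC under a fixed (all-zero) IV so the result is deterministic.
    static byte[] encrypt(byte[] data) throws Exception {
        SecretKeySpec key = new SecretKeySpec(deriveKey(data), "AES");
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(new byte[16]));
        return c.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        byte[] file = "identical file content".getBytes("UTF-8");
        // Two independent owners of the same file produce the same ciphertext,
        // so the server can deduplicate without ever seeing the plaintext.
        byte[] alice = encrypt(file);
        byte[] bob = encrypt(file);
        System.out.println(Arrays.equals(alice, bob)); // true
    }
}
```

The fixed IV deliberately makes encryption deterministic; this leaks the equality of files, which is exactly the property deduplication over ciphertext requires, and is also why convergent encryption is weaker than randomized encryption.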

Convergent encryption has long been studied in commercial systems and has various
encryption variants for secure deduplication [8],[16],[17],[18]; it was later
formalized as message-locked encryption in [20]. However, convergent encryption
suffers from security flaws with respect to tag consistency and ownership revocation.

Deduplication is a technique that ensures only one unique instance of data is retained on
storage media, such as disk, flash or tape. Redundant data blocks are replaced with a pointer
to the unique data copy. The technique is used to improve storage utilization and can also be
applied to network data transfers to reduce the number of bytes that must be sent. In the
deduplication process, unique chunks of data, or byte patterns, are identified and stored during
a process of analysis. Data deduplication is a specialized data compression technique for
eliminating duplicate copies of repeating data. Related and somewhat synonymous terms are
intelligent (data) compression and single-instance (data) storage.
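The pointer-replacement idea above can be sketched as a toy in-memory store (all names here are illustrative): each block is fingerprinted with SHA-256, the store keeps one copy per fingerprint, and callers hold only fingerprints as pointers to the unique copy.

```java
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DedupStore {
    private final Map<String, byte[]> blocks = new HashMap<>(); // fingerprint -> single copy

    static String fingerprint(byte[] block) throws Exception {
        byte[] h = MessageDigest.getInstance("SHA-256").digest(block);
        StringBuilder sb = new StringBuilder();
        for (byte b : h) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    // Store the block only if it is new; either way, return the pointer to the one copy.
    public String put(byte[] block) throws Exception {
        String fp = fingerprint(block);
        blocks.putIfAbsent(fp, block);
        return fp;
    }

    public int uniqueBlocks() { return blocks.size(); }

    public static void main(String[] args) throws Exception {
        DedupStore store = new DedupStore();
        List<String> pointers = new ArrayList<>();
        for (String s : new String[] {"alpha", "beta", "alpha"})
            pointers.add(store.put(s.getBytes("UTF-8")));
        // Three blocks written, but the duplicate "alpha" is kept only once.
        System.out.println(store.uniqueBlocks()); // 2
        System.out.println(pointers.get(0).equals(pointers.get(2))); // true
    }
}
```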

When ownership changes, revoked users should be prevented from accessing the data
stored in the cloud storage after their deletion or modification request (forward
secrecy). Conversely, when a user uploads data that already exist in the cloud storage,
the user should be barred from accessing the data that were stored before he acquired
ownership by uploading it (backward secrecy).

These dynamic ownership changes may occur frequently in a practical cloud system, and
thus they should be properly managed in order to avoid degrading the security of the
cloud service. However, previous deduplication schemes could not achieve secure access
control under a dynamically changing ownership environment, despite its importance to
secure deduplication, because the encryption key is derived deterministically and is
rarely refreshed after the initial key derivation.

Thus, as long as revoked users keep the encryption key, they can access the
corresponding data in the cloud storage at any time, regardless of the validity of
their ownership. This is the problem we attempt to solve in this study.

Recently, Li et al. [25] proposed a convergent key management scheme in which users
distribute the convergent key shares across multiple servers by exploiting the Ramp
secret sharing scheme [26]. Li et al. [27] also proposed an authorized deduplication
scheme in which differential privileges of users, as well as the data, are considered
in the deduplication procedure in a hybrid cloud environment. Jin et al. [31] proposed
an anonymous deduplication scheme over encrypted data that exploits a proxy
re-encryption algorithm. Bellare et al. [28] proposed a server-aided MLE scheme that
is secure against brute-force attack, which was recently extended to interactive MLE
[29] to provide privacy for messages that are both correlated with and dependent on
the public system parameters.

However, these schemes do not handle the dynamic ownership management issues required
in secure deduplication for shared outsourced data. Shin et al. [30] proposed a
deduplication scheme over encrypted data that uses predicate encryption. This approach
permits deduplication only of files that belong to the same user, which severely
reduces the effect of deduplication. Therefore, in this paper, we concentrate on
deduplication across different users, such that identical files from different users
are identified and deduplicated securely to provide greater storage savings.

1.1 Problem Statement


Nowadays everybody uses cloud services, storing data and sharing data with others,
and people know that the cloud is a third-party resource, so many concerns arise,
such as data security, privacy, data ownership, storage space, bandwidth, and data
deduplication. The major issues are data security, data deduplication, and dynamic
data ownership, which current schemes do not handle as we expect; further enhancement
is needed in these areas. We analyzed much existing work in the field of data security
and dynamic data ownership. Many existing schemes use deduplication, but they do not
address deduplication at the small-block level.

1.2 Contribution
We propose a data deduplication scheme over encrypted data. The proposed scheme
ensures that only authorized access to the shared data is possible, which is believed
to be the most basic requirement for successful and secure distributed big-data
storage services in an environment where ownership changes dynamically. Compared with
previous deduplication schemes over encrypted data, the proposed scheme has the
following advantages in terms of security, dynamic data ownership, and efficiency.
First, dynamic data ownership management guarantees the backward and forward secrecy
of deduplicated data upon any ownership change.

1.3 Organization of the project report


The technical aspects, system requirements, and organization of the project
report are discussed as follows. The project report consists of eight chapters in
total, a bibliography, and an appendix. Chapter 1 gives the general information
related to the project. Chapter 2 gives information about the referenced papers.
Chapter 3 gives details of the different kinds of requirements needed to complete the
project successfully. Chapter 4 gives insights into several analyses performed to
help decide whether the project is feasible. Chapter 5 covers some fundamental design
concepts, the approach, and some important diagrams to outline the project clearly.
Chapter 6 describes the implementation of the modules and discusses the tools and
technologies used for implementation. Chapter 7 describes the various testing
techniques used for testing the different modules in the project. Chapter 8 presents
screenshots of the project along with explanations. Finally, the report concludes and
summarizes the project, and also highlights some future enhancement scopes of the work.

CHAPTER 2
LITERATURE SURVEY
General
This chapter presents an extensive literature survey related to data deduplication
with dynamic ownership management in cloud storage.

D. T. Meyer and W. J. Bolosky [1] observed that file systems often contain
redundant copies of data: identical files or sub-file regions, possibly stored on a
single host, on a shared storage cluster, or backed up to secondary storage.
Deduplicating storage systems take advantage of this redundancy to decrease the
underlying space needed to contain the file systems (or backup images thereof).

Deduplication can work at either the sub-file or whole-file level. More fine-grained
deduplication creates more opportunities for space savings, but significantly reduces
the sequential layout of some files, which may have significant performance impacts
when hard disks are used for storage (and in some cases requires complicated
techniques to improve performance).
M. Dutch [2] stated, with respect to the understanding of data deduplication
ratios, that data deduplication lowers business risks, increases revenue
opportunities, and reduces storage-level costs, resulting in a perfect storm for
organizations deploying a scalable storage infrastructure. Storage resiliency
technologies, such as RAID or RAIN, protect the deduplicated data to guarantee high
availability for applications accessing the data.

The economics of data deduplication make it more than compelling; it is required for
any business seeking to increase its customer service levels. However, data
deduplication ratios are easy to over-analyze and to credit with benefits that may or
may not exist.

W. K. Ng et al. [3] introduced and formalized a new notion, which they call the
private data deduplication protocol, a deduplication technique for private data
storage. Intuitively, a private data deduplication protocol allows a client who holds
private data to prove to a server who holds a summary string of the data that he/she
is the owner of that data, without revealing additional information to the server.

M. W. Storer, K. Greenan, D. D. E. Long, and E. L. Miller [4] observed that
businesses and consumers are becoming increasingly aware of the value of secure,
archival data storage. In the business field, data preservation is often mandated by
law, and data mining has proved to be a boon in shaping business strategy.

For individuals, archival storage is being called upon to preserve sentimental and
historical artifacts, for example, photographs, movies, and personal documents.
Further, while few would dispute that business data calls for security, privacy is
equally important for individuals; data such as medical records and legal documents
must be kept for long periods of time but must not be publicly accessible.
Paradoxically, the increasing value of archival data is driving the need for
cost-efficient storage; affordable storage permits the preservation of all data that
may eventually prove useful.

N. Baracaldo, E. Androulaki, J. Glider, and A. Sorniotti [5] observed that cloud
computing has emerged as highly useful for organizations that hope to reduce their
costs, deploy new applications quickly, or that do not need to maintain their own
computing infrastructure. However, recent data breaches at well-known cloud storage
providers have made customers increasingly worried about the confidentiality of their
(outsourced) data. There have been cases where client data was exposed to and leaked
by cloud provider employees who had physical access to the storage medium, and also
where cloud clients accessed other customers' data after being allocated physical
storage capacity previously assigned to a different customer, e.g., after that other
client had terminated its cloud storage subscription (in their paper they refer to
this as tenant off-boarding).

C. Wang, Z. Qin, J. Peng, and J. Wang [8] stated that a new data compression
technology, known as data deduplication, came into view in step with the enormous
increase in electronic data. Data deduplication partitions data into fixed- or
variable-size chunks, and the cryptographic hash value of each chunk is used as the
chunk's globally unique ID, such that redundant data may be identified.

These redundancies are used either to reduce storage capacity needs or to reduce
network traffic. Unlike traditional compression methods, deduplication identifies
common sequences of bytes both within and among files, and stores only a single
instance of each chunk regardless of the number of times it occurs.
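The chunk-identification step described above can be sketched as follows (fixed-size chunking only; the chunk size and class name are illustrative assumptions, and real systems typically use kilobyte-scale chunks, often with variable-size, content-defined boundaries): each chunk's SHA-256 digest serves as its globally unique ID, so identical chunks are detected wherever they occur.

```java
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class Chunking {
    static final int CHUNK_SIZE = 8; // tiny for illustration; real systems use 4-128 KB

    // Split the data into fixed-size chunks and collect the distinct chunk IDs.
    static Set<String> uniqueChunkIds(byte[] data) throws Exception {
        MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
        Set<String> ids = new HashSet<>();
        for (int off = 0; off < data.length; off += CHUNK_SIZE) {
            byte[] chunk = Arrays.copyOfRange(data, off, Math.min(off + CHUNK_SIZE, data.length));
            ids.add(java.util.Base64.getEncoder().encodeToString(sha256.digest(chunk)));
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        // 32 bytes = 4 chunks, but only 1 distinct chunk survives deduplication.
        byte[] repetitive = "ABCDEFGHABCDEFGHABCDEFGHABCDEFGH".getBytes("UTF-8");
        System.out.println(uniqueChunkIds(repetitive).size()); // 1
    }
}
```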

J. R. Douceur, A. Adya, W. J. Bolosky, D. Simon, and M. Theimer [11] addressed the
problems of identifying and coalescing identical files in the Farsite distributed
file system, with the goal of reclaiming storage consumed by incidentally redundant
content. Farsite is a secure, scalable, serverless file system that logically
functions as a centralized file server but is physically distributed among a
networked collection of desktop workstations. Since desktop machines are not always
on, are not centrally managed, and are not physically secured, the space reclamation
process must tolerate a high rate of system failure, work without central
coordination, and coexist with cryptographic security.

P. Anderson and L. Zhang [12] stated that a typical set of portable computer users
share a great deal of data in common. This provides the opportunity to significantly
reduce backup times and storage requirements. However, they demonstrated that manual
selection of such shared data, for example, backing up just home directories, is a
poor methodology; it fails to back up some important files while unnecessarily
duplicating others. They presented a prototype backup program that achieves an ideal
level of sharing while preserving confidentiality.

It exploits a novel algorithm to reduce the number of files that must be checked,
and hence reduces backup times. They demonstrated that typical cloud interfaces, for
example, Amazon S3, are not well suited to this kind of use, owing to the time and
cost of typical transactions and the absence of multi-client authentication to shared
data.

They described an implementation using a local server which avoids these problems by
caching and pre-processing data before sending it to the cloud. This appears to
achieve noteworthy cost savings.

Safe backup of cloud systems with guaranteed deletion:

A. Rahumed, H. C. H. Chen, Y. Tang, P. P. C. Lee, and J. C. S. Lui [14] depicted
that cloud computing is an emerging service model that offers computation and storage
resources on the Internet. One appealing utility that cloud computing can provide is
cloud storage. Individuals and enterprises are often required to archive their data
remotely, to avoid any data loss in the event of hardware/software failures or
unforeseen disasters. Instead of purchasing the needed storage media to keep data
backups, individuals and enterprises can simply outsource their data backup services
to cloud service providers, which provide the necessary storage resources to host the
data backups. While cloud storage is appealing, how to provide security guarantees
for outsourced data becomes a rising concern.

J. Xu, E. Chang, and J. Zhou [15] expressed that cloud storage services have been
gaining popularity recently. To reduce resource consumption in both network bandwidth
and storage, many cloud storage services, including Dropbox and Wuala, use client-side
deduplication. That is, at the point when a client tries to upload a file to the
server, the server checks whether this particular file is already in the cloud
(uploaded by some client beforehand), and skips the upload process if it is already in
the cloud storage. In this way, each and every file will have just one copy in the
cloud (i.e., single-instance storage). A SNIA study stated that the deduplication
process can save up to 90% of storage capacity, depending on the application.
Moreover, according to Halevi et al., Dropship, a recent exploitation of client-side
deduplication, works as follows: cloud client Alice tries to upload a file F to the
cloud storage.
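A minimal sketch of this hash-first, client-side deduplication protocol is given below (the in-memory `Server` and all method names are our own illustration): the client transfers the file bytes only when the server does not already hold a copy with the same SHA-256 hash.

```java
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

public class ClientSideDedup {
    static String hashOf(byte[] file) throws Exception {
        StringBuilder sb = new StringBuilder();
        for (byte b : MessageDigest.getInstance("SHA-256").digest(file))
            sb.append(String.format("%02x", b));
        return sb.toString();
    }

    // Stand-in for the cloud storage server: one stored copy per distinct hash.
    static class Server {
        private final Map<String, byte[]> store = new HashMap<>();
        boolean hasFile(String hash) { return store.containsKey(hash); }
        void receive(String hash, byte[] file) { store.put(hash, file); }
        int storedCopies() { return store.size(); }
    }

    // Returns true only when the actual bytes had to be transferred.
    static boolean upload(Server server, byte[] file) throws Exception {
        String h = hashOf(file);
        if (server.hasFile(h)) return false; // dedup hit: skip the transfer
        server.receive(h, file);
        return true;
    }

    public static void main(String[] args) throws Exception {
        Server s = new Server();
        byte[] f = "backup.tar".getBytes("UTF-8");
        System.out.println(upload(s, f)); // true: first upload transfers the bytes
        System.out.println(upload(s, f)); // false: identical file is deduplicated
        System.out.println(s.storedCopies()); // 1
    }
}
```

This also shows why proof-of-ownership matters: in this naive protocol, anyone who learns the short hash of F could claim the whole file, which is the weakness that tools such as Dropship exploit.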

Nowadays, cloud computing is gaining popularity for sharing data through the cloud.
It runs many applications to provide different features to users, and it gives users
a platform to share data through cloud storage. The cloud offers many schemes for
data security and privacy.

Figure 2.1 Cloud Computing Architecture


Cloud computing provides fast, high-performance computing power, which is generally
needed by military and research facilities. Clouds also store huge amounts of data
for commercial organizations and provide data storage for many different online games.

To achieve this, cloud computing uses large groups of networked servers, often built
from low-cost consumer PC technology, to spread out data-processing tasks.
Virtualization techniques are commonly used to increase the power of cloud computing.

Conclusion

This chapter mainly discusses the papers that were referred to while preparing this
thesis report. Each of these papers provides information related to the problem
domain, existing solutions, and the techniques used, along with their advantages and
limitations.

Chapter 3
SOFTWARE REQUIREMENT SPECIFICATION
This chapter describes the requirements. It specifies the hardware and
software requirements that are needed for the application to run properly. The
Software Requirement Specification (SRS) is explained in detail, and includes an
overview of this dissertation as well as the functional and non-functional
requirements of this thesis.

3.1 General Description


The purpose of the system requirements and specification document is to describe
the resources, and the management of those resources, used in the design of the
public-key cryptosystem for data sharing in cloud storage. This system requirements
and specification document will also give details regarding the functional and
non-functional requirements of the project.

3.1.1 User Perspective


The characteristic of this project work is to provide data scalability and security
while sharing data through the cloud. It offers an efficient approach to sharing data
through the cloud.

3.2 Non Functional Requirement


Non-functional requirements are requirements that do not directly belong to
particular functions provided by the system. They give criteria that can be used to
judge the operation of a system, rather than specific behaviours.

They can be used to describe emergent system properties, for instance, reliability,
response time, and storage occupancy. They may also describe constraints on the
system, for instance, the capability of the input/output devices and the data
representations used in system interfaces. Most non-functional requirements relate to
the system as a whole rather than to individual system features, which means they are
often critical compared with individual functional requirements. Non-functional
requirements arise from user needs, budget constraints, organizational policies, and
the need for interoperability with other software and hardware systems.
The following non-functional requirements are worthy of consideration:
 Security: The system should permit secure communication between the data
owner and the recipient.
 Reliability: The system should be trustworthy, should not degrade the
performance of the existing system, and should not lead to system hangs.

3.3 System Requirement

3.3.1 Hardware Requirement


 Processor : Intel/AMD Pentium III
 Keyboard : 104 Keys
 Floppy Drive : 1.44 MB
 RAM : 128 MB
 Hard Disk : 10 GB
 Monitor : 14” VGA Color
 Mouse : Logitech Serial Mouse
 Disk Space : 1 GB

3.3.2 Software Requirements

 Operating System : Win 2000/ XP


 Server : Apache Tomcat
 Technologies used : Java, Servlets, JSP, JDBC
 JDK : Version 1.7
 Database : MySQL 5.0

3.4 Feasibility Study
Feasibility is the determination of whether or not a project is worth
undertaking. The process followed in making this determination is called a
feasibility study; this kind of study determines whether a project can and should be
undertaken.
Three key considerations involved in the feasibility analysis are:
 Technical Feasibility

 Economic Feasibility

 Operational Feasibility

3.4.1 Technical Feasibility


This is concerned with specifying the hardware and software that will satisfy
the user requirements. The technical needs of the system may vary considerably but
may include:
 The facility to produce outputs in a specified time.
 Response time under particular conditions.
 The capacity to handle a particular volume of transactions at a particular speed.

3.4.2 Economic Feasibility


Economic analysis is the most frequently used method for evaluating the
feasibility of a proposed system. This is more commonly known as cost/benefit
analysis. The procedure is to determine the benefits and savings that are expected
from a proposed system and compare them with the costs. If the benefits outweigh the
costs, a decision is made to design and implement the system; this must be done if
the system is to have a chance of being approved. There is a continual effort to
improve accuracy at each phase of the system life cycle.

3.4.3 Operational Feasibility


It is mainly concerned with human, organizational, and political aspects. The
following points are considered:
 What changes will be brought about by the system?
 What organizational structures are affected?
 What new skills will be required?
 Do the existing staff members have these skills?
 If not, can they be trained in the course of time?

3.5 Resource Requirement


3.5.1 Java
Java is a platform-independent programming language. It is designed to be simple and
portable across diverse platforms.
The Java programming language is a high-level language that can be characterized by
most of the following buzzwords:
 Object oriented
 Simple
 Architecture neutral
 Portable
 Robust
 Dynamic
 Secure
The Java API is a large collection of ready-made software components that provide
many useful capabilities, for instance, graphical user interface (GUI) widgets. The
Java API is grouped into libraries of related classes and interfaces; these libraries
are known as packages.

The Java platform provides the following features:

 The essentials: Objects, strings, threads, numbers, input, output, data structures, system properties, date, time, and so on.
 Applets: The set of conventions used by applets.
 Networking: URLs, TCP and UDP sockets, and IP addresses.
 Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.
 Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.
 Software components: Known as JavaBeans, these can plug into existing component architectures.
 Object Serialization: Permits lightweight persistence and communication via Remote Method Invocation (RMI).
 Java Database Connectivity (JDBC): Provides uniform access to a wide range of relational databases.

Advantages of Java technology:

 Get started quickly: Although the Java programming language is a powerful object-oriented language, it is easy to learn, especially for programmers already familiar with C or C++.
 Write less code: Comparisons of program metrics suggest that a program written in the Java language can be four times smaller than the same program in C++.
 Write better code: The Java language encourages good coding practices, and its garbage collection helps avoid memory leaks. Its object orientation, its JavaBeans component architecture, and its wide-ranging, easily extendible API let you reuse other people's tested code and introduce fewer bugs.
 Develop programs more quickly: Development time can be as much as twice as fast as writing the same program in C++.
 Write once, run anywhere: Because 100% pure Java programs are compiled into machine-independent bytecodes, they run consistently on any Java platform.
 Distribute software more easily: Applets can be upgraded easily from a central server. Applets take advantage of the feature that allows new classes to be loaded "on the fly", without recompiling the entire program.

3.5.2 Java Server Page
Java Server Pages technology lets you put fragments of servlet code directly into a text-based document. A JSP page is a text-based document that contains two types of content: static template data, which can be expressed in any text-based format such as XML, WML, or HTML, and JSP elements, which determine how the page constructs dynamic content.

Java Server Page (JSP): An extensible Web technology that uses template data, custom elements, scripting languages, and server-side Java objects to return dynamic content to a client. According to the JSP Model 1 architecture, the application can be developed as:

Figure 3.5.2.1 Architecture of JSP Model 1

Typically the template data consists of XML/HTML elements, and in most cases the client is a Web browser. In this model, the presentation logic is implemented in the JSP page and the business logic is implemented as JavaBeans components; the model thus helps in separating presentation from business logic.

For large-scale projects, instead of Model 1 it is better to use Model 2, the Model-View-Controller (MVC) architecture; the Struts framework is based on Model 2. JSP lets you separate the dynamic parts of your pages from the static HTML. You write the regular HTML in the normal manner, using whatever Web page building tools you normally use, and then enclose the code for the dynamic parts in special tags, most of which start with "<%" and end with "%>". You typically give the file a .jsp extension and install it anywhere you could place a normal Web page.

Although what you write often looks more like a regular HTML file than a servlet, behind the scenes the JSP page is simply converted into a normal servlet, with the static HTML being printed to the output stream associated with the servlet's service method.

3.5.3 JavaScript
JavaScript is an object-oriented, lightweight, script-based programming language created by Netscape Communications Corporation. JavaScript supports the development of both the client and server parts of a Web application. On the client side, it can be used to write programs that are executed by a Web browser within the context of a Web page. On the server side, it can be used to write Web server programs that can process information submitted by a Web browser and then update the browser's display accordingly.

Although JavaScript supports both client and server Web programming, it is preferred for client-side programming since most browsers support it. JavaScript is as easy to learn as HTML, and JavaScript statements can be included in HTML documents by enclosing the statements between a pair of scripting tags.

Here are a few things that can be done with JavaScript:

 Validate the contents of a form and perform calculations.
 Add scrolling or changing messages to the browser's status line.
 Animate images or rotate images that change when the mouse moves over them.
 Detect the browser in use and display different content for different browsers.
 Detect installed plug-ins and notify the user if a plug-in is required.
 A great deal more can be done with JavaScript, including building entire applications.

JavaScript and Java are entirely different languages. A few of the obvious differences are:

 Java applets are generally displayed in a box within the web document; JavaScript can affect every part of the Web document itself.
 JavaScript is best suited to simple applications and adding interactive elements to Web pages, whereas Java can be used for extremely complex applications.

There are many other differences, but the essential point to remember is that JavaScript and Java are separate languages. They are both useful for different purposes; in fact they can be used together to combine their advantages.

3.5.4 JDBC
JDBC was created with the goal of establishing a database-independent standard API for Java database integration. It offers a generic SQL database access mechanism that provides a consistent interface to a variety of RDBMSs. This consistent interface is achieved through the use of "plug-in" database connectivity modules, or drivers. If a database vendor wishes to have JDBC support, the vendor can supply a driver for each platform that the database and Java run on. To gain wider acceptance of JDBC, Sun based JDBC's framework on ODBC. Java database connectivity gives uniform access to a wide range of relational databases. MS Access databases are used for quickly updating the store table.
The design goals for JDBC are as follows:
 SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although it is not the lowest database interface level achievable, it is at a low enough level for higher-level tools and APIs to be created. Conversely, it is at a high enough level for application programmers to use it confidently. Attaining this goal allows future tool vendors to "generate" JDBC code and to hide many of its complexities from the end user.
 SQL Conformance
SQL syntax varies from database vendor to database vendor. In an effort to support a wide variety of vendors, JDBC allows any query statement to be passed through to the underlying database driver. This allows the connectivity module to handle non-standard functionality in a manner that is suitable for its users.
 JDBC must be implementable on top of common database interfaces

The JDBC SQL API must "sit" on top of other common SQL level APIs. This goal allows JDBC to use existing ODBC level drivers through a software interface. This interface would translate JDBC calls to ODBC and vice versa.
 Provide a Java interface that is consistent with the rest of the Java system
Because of Java's acceptance in the user community thus far, the designers felt that they should not stray from the current design of the core Java system.
 Keep it simple
This goal probably appears in all software design goal listings, and JDBC is no exception. Sun felt that the design of JDBC should be very simple, allowing for only one method of completing a task per mechanism. Allowing duplicate functionality only serves to confuse the users of the API.
 Keep the common cases simple
Because, more often than not, the usual SQL calls used by programmers are simple SELECTs, INSERTs, UPDATEs and DELETEs, these queries should be simple to perform with JDBC. However, more complex SQL statements should also be possible.
Conclusion
This chapter gives details of the functional requirements, non-functional requirements, resource requirements, hardware requirements, software requirements, and so on. The non-functional requirements in turn include product requirements, organizational requirements, user requirements, basic operational requirements, and so on.

CHAPTER 4
SYSTEM ANALYSIS

4.1 Introduction to System Analysis


4.1.1 System
A system is an orderly group of interdependent components linked together according to a
plan to achieve a specific objective. Its main characteristics are organization, interaction,
interdependence, integration and a central objective.

4.1.2 System Analysis


System analysis and design is the application of the systems approach to problem solving, generally using computers. To reconstruct a system, the analyst must consider its elements: outputs and inputs, processors, controls, feedback, and environment.

4.2 Existing System


When a user uploads data that already exist in the cloud storage, the user should be deterred from accessing the data that were stored before he obtained the ownership by uploading it (backward secrecy). Existing systems do not handle data deduplication at the small-block level. Dynamic ownership changes may occur very frequently in a practical cloud system, and thus they should be properly managed in order to avoid security degradation of the cloud service. Most of the existing schemes have been proposed in order to perform a proof-of-ownership (POW) process in an efficient and robust manner, since the hash of the file, which is treated as a "proof" for the entire file, is vulnerable to being leaked to outside adversaries because of its relatively small size. When a data owner uploads data that do not already exist in the cloud storage, he is called an initial uploader; if the data already exist, which implies that other owners have uploaded the same data previously, he is called a subsequent uploader.

4.3 Disadvantages

Client-side deduplication cannot generate a new tag when a user updates the file; in this situation, dynamic ownership management would fail. Likewise, existing dynamic ownership schemes cannot be extended to the multi-user setting. Whenever data are transformed, concerns arise about potential loss of information: by definition, data deduplication systems store data differently from how they were written.

Accordingly, users are concerned about the integrity of their data. One method for deduplicating data relies on the use of cryptographic hash functions to identify duplicate segments of data. If two different pieces of data produce the same hash value, this is known as a collision. The probability of a collision depends on the hash function used, and although the probabilities are small, they are always non-zero.
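As a rough illustration (our own sketch, not part of the scheme), the standard birthday bound p ≈ n(n−1)/2^(b+1) can be used to estimate this non-zero collision probability for n blocks hashed with a b-bit function:

```java
public class CollisionBound {
    // Approximate birthday-bound collision probability for n random
    // b-bit hashes: p ≈ n * (n - 1) / 2^(b + 1).
    static double collisionProbability(double n, int bits) {
        return n * (n - 1) / Math.pow(2, bits + 1);
    }

    public static void main(String[] args) {
        // Even with a trillion 256-bit block hashes the bound stays
        // astronomically small, but it is never exactly zero.
        double p = collisionProbability(1e12, 256);
        System.out.println(p > 0 && p < 1e-40);
    }
}
```

For a 256-bit hash such as SHA-256, the bound stays negligible at any practical block count, which is why hash-based duplicate detection is considered safe in practice.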

4.4 Proposed System


 Our proposed authorized duplicate check scheme incurs minimal overhead compared to normal operations.
 We propose a new server-side deduplication scheme for encrypted data. It enables the cloud server to control access to outsourced data even when the ownership changes dynamically, by exploiting randomized convergent encryption and secure ownership group key distribution: a deduplication service over encrypted data.
 The proposed system ensures that only authorized access to the shared data is possible, which is believed to be the most fundamental challenge for efficient and secure distributed big-data storage services in an environment where data ownership changes dynamically. The proposed system ensures security in the setting of dynamic data ownership by introducing a hash-key framework for dynamic ownership groups.

4.5 Advantages of the Proposed System


• Generates data tags before uploading and audits the integrity of data stored in the cloud.
• Decreases storage occupancy and decreases bandwidth.
• Addresses one critical challenge of cloud storage services: the management of the ever-increasing volume of data.
• Supports integrity auditing and secure deduplication directly on encrypted data.

4.6 Project Module Description


4.6.1 Admin Modules
 Login
 User Details
 Add User
 Edit User
 Delete User
 View User Details
 Cloud Details
 View Details
 Hash Tag
 View Hash Tags
 Transaction Details
 Select User
 View log Details
 Sign Out

4.6.2 User Modules


 Login
 Show Profile
 Upload a File
 The user has to select the file from the local system
 The file is broken into blocks
 Auditing Process
 For each block, generate a hash tag
 Check for the presence of the hash tag for each block in Integrity Management
 If present, link to the existing block; else store the block in cloud storage and insert a new record in Integrity Management
 Insert a transaction record
 Show an upload successful message to the user

 Download a File
 View details of all the uploaded files
 The user has to select the file to download and initiate the download process
 Retrieve the block details of the selected file
 Get the block storage details in the cloud
 Download all the blocks

 Integrity Check
 Generate the hash tag for all the blocks
 Compare the hash tags of the blocks with the hash tags in the table
 If the hash tag comparison passes for all the blocks, merge the blocks and download the file to the user's system; else show an integrity check failure message to the user
 Transaction
 Sign out
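The integrity-check-and-merge step above can be sketched in Java; this is our own illustration (not the project's code), assuming SHA-256 hash tags stored as hex strings:

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.util.HexFormat;
import java.util.List;

public class IntegrityCheck {
    // Compute the hash tag of a block as a hex-encoded SHA-256 digest.
    static String hashTag(byte[] block) throws Exception {
        return HexFormat.of().formatHex(
                MessageDigest.getInstance("SHA-256").digest(block));
    }

    // Recompute each downloaded block's tag, compare it with the stored
    // tag, and merge the blocks into one file only if every tag matches.
    static byte[] verifyAndMerge(List<byte[]> blocks, List<String> storedTags)
            throws Exception {
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        for (int i = 0; i < blocks.size(); i++) {
            if (!hashTag(blocks.get(i)).equals(storedTags.get(i))) {
                return null; // integrity check failed for this block
            }
            file.writeBytes(blocks.get(i));
        }
        return file.toByteArray();
    }
}
```

A null result corresponds to the "show an integrity check failure message" branch in the module description.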

4.7 Techniques and Algorithms


 The key generation algorithm.
 The encryption algorithm.
 The decryption algorithm.
 The tag generation algorithm.

Pseudo code for the Tag Generation Algorithm (MapReduce)

map(String key, String value):
  // key: document name
  // value: document contents
  for each word w in value:
    EmitIntermediate(w, "1")

reduce(String key, Iterator values):
  // key: word
  // values: a list of counts
  result = 0
  for each v in values:
    result += ParseInt(v)
  Emit(AsString(result))
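To make the map/reduce steps concrete, here is a minimal sequential Java sketch (our own illustration, not the project's code) that performs the same word count: the loop body plays the role of map, emitting (word, 1), and `merge` plays the role of reduce, summing the counts per word.

```java
import java.util.HashMap;
import java.util.Map;

public class WordCount {
    // Sequential equivalent of the map/reduce pseudocode above.
    static Map<String, Integer> count(String contents) {
        Map<String, Integer> result = new HashMap<>();
        for (String w : contents.trim().split("\\s+")) { // map: one emit per word
            result.merge(w, 1, Integer::sum);            // reduce: sum the 1s
        }
        return result;
    }
}
```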

Secure Auditing Multi-Level Block Signature Verification: Pseudocode

Begin
  Get the Block Signature (BS)
  Split the block signature into three parts:
    First Bit (FB), Second Bit (SB), Third Bit (TB) to Last Bit (LB)
  Get the Index of Code (IC) for FB from the zero-level BS index
  Identify the corresponding next-level BS index using IC
  If (SB is present in the second-level BS index)
    If (TB to LB are present in the third-level index)
      Block exists
    Else
      Block does not exist
  Else
    Block does not exist
End
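The multi-level index can be sketched in Java as nested maps keyed on successive parts of the signature. The level split below (first character, second character, remainder of the hex string) is our own assumption for illustration, not the exact layout used by the system:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SignatureIndex {
    // Level 0 keys on the first character of the hex signature, level 1
    // on the second, and level 2 stores the remaining characters.
    // Signatures are assumed to be hex strings of length >= 3.
    private final Map<Character, Map<Character, Set<String>>> index = new HashMap<>();

    void insert(String signature) {
        index.computeIfAbsent(signature.charAt(0), k -> new HashMap<>())
             .computeIfAbsent(signature.charAt(1), k -> new HashSet<>())
             .add(signature.substring(2));
    }

    boolean exists(String signature) {
        Map<Character, Set<String>> level1 = index.get(signature.charAt(0));
        if (level1 == null) return false;            // first part not indexed
        Set<String> level2 = level1.get(signature.charAt(1));
        return level2 != null && level2.contains(signature.substring(2));
    }
}
```

Narrowing the search level by level means a lookup only compares the full remainder against the small set that shares the signature's prefix.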

Secure Auditing Uploading Process: Pseudocode

Begin
  Get file F for upload
  Split F into N equal-size blocks
  For i = 1 to N
    Get the ith block
    Generate the block signature for the ith block
    If (the multilevel block signature index shows the block exists)
      Map the existing block to the file index
    Else
      Encrypt the block
      Upload the block to the cloud
      Insert a new record for the block
      Map the new block to the file index
  Next i
  Display upload confirmation message
End
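A minimal Java sketch of the upload loop follows. It is our own illustration under stated assumptions: SHA-256 block signatures, an in-memory map standing in for the cloud store plus the integrity-management table, and the encryption step omitted for brevity:

```java
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HexFormat;
import java.util.List;
import java.util.Map;

public class DedupUploader {
    // Hash tag -> stored block; stands in for the cloud drive and the
    // integrity-management records in the pseudocode above.
    final Map<String, byte[]> store = new HashMap<>();

    static String tag(byte[] block) throws Exception {
        return HexFormat.of().formatHex(
                MessageDigest.getInstance("SHA-256").digest(block));
    }

    // Split the file into fixed-size blocks, store only blocks whose
    // tag is not already present, and return the file's block index.
    List<String> upload(byte[] file, int blockSize) throws Exception {
        List<String> fileIndex = new ArrayList<>();
        for (int off = 0; off < file.length; off += blockSize) {
            byte[] block = Arrays.copyOfRange(file, off,
                    Math.min(off + blockSize, file.length));
            String t = tag(block);
            store.putIfAbsent(t, block); // duplicate blocks are not re-uploaded
            fileIndex.add(t);            // map the block to the file index
        }
        return fileIndex;
    }
}
```

A file with repeated blocks thus occupies storage only once per unique block, while the file index still lists every block in order.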

4.8 System Design


4.8.1 The activities of the Design process:
1. Interface design describes the structure and organization of the user interface. It includes a representation of the screen layout, a definition of the modes of interaction, and a description of navigation mechanisms. Interface control mechanisms: to implement navigation options, the designer selects from a number of interaction mechanisms:

a. Navigation menus
b. Graphic icons
c. Graphic images

Interface design work flow: the work flow begins with the identification of user, task, and environmental requirements. Once user tasks have been identified, user scenarios are created and analyzed to define a set of interface objects and actions.
2. Aesthetic design, also called graphic design, describes the "look and feel" of the WebApp. It includes colour schemes, geometric layout, text size, font and placement, the use of graphics, and related aesthetic decisions.
3. Content design defines the layout, structure, and outline for all content that is presented as part of the WebApp, and establishes the relationships between content objects.
4. Navigation design represents the navigational flow between content objects and for all WebApp functions.
5. Architecture design identifies the overall hypermedia structure for the WebApp. Architecture design is tied to the goals established for the WebApp, the content to be presented, the users who will visit, and the navigation philosophy that has been established.

a. Content architecture focuses on the manner in which content objects are structured for presentation and navigation.
b. WebApp architecture addresses the manner in which the application is structured to manage user interaction, handle internal processing tasks, effect navigation, and present content. WebApp architecture is defined within the context of the development environment in which the application is to be implemented.

J2EE uses MVC Architecture

6. Component design develops the detailed processing logic required to implement functional components.

4.9 System Architecture

Figure 4.9.1 System Architecture

4.10 Use Case Diagram

4.10.1 Admin Use Case Diagram

Figure 4.10.1.1 Admin Use Case Diagram

4.10.2 User Use Case Diagram

Figure 4.10.2.1 User Use Case Diagram

4.11 Data Flow Diagram

4.11.1 DFD Admin Session

Figure 4.11.1.1 DFD Admin Session Diagram

4.11.2 DFD User Session

Figure 4.11.2.1 DFD User Session Diagram

4.11.3 DFD File Upload Process

Figure 4.11.3.1 DFD File Upload Process

4.11.4 DFD File Download Process

Figure 4.11.4.1 DFD File Download Process

4.12 Sequence Diagram

Figure 4.12.1 Sequence Diagram for Upload Process

4.13 Context Analysis

Figure 4.13.1 Context Analysis Diagram

4.14 Flow Chart

Fig.4.14.1 Flow Chart for Upload Process

Fig. 4.14.2 Flow Chart for Block

CHAPTER 5
TESTING
5.1 Definition
Unit testing is a development practice in which programmers create tests as they develop software. The tests are simple, short tests that exercise the functionality of a particular unit or module of their code, such as a class or function.
Using open source libraries such as CUnit, CppUnit and JUnit (for C, C++ and Java), these tests can be run automatically and any problems found quickly. As the tests are developed in parallel with the source, each unit test demonstrates the unit's correctness.

5.2 Validation and System Testing


Validation testing is a concern that overlaps with integration testing. Ensuring that the application fulfils its specification is a major criterion for the construction of an integration test. Validation testing also overlaps to a large extent with system testing, where the application is tested with respect to its typical working environment. Consequently, for many processes no clear division between validation and system testing can be made. Specific tests which can be performed in either or both stages include the following.
 Regression Testing: Where this version of the software is tested with the automated test harness used with previous versions, to ensure that the required features of the previous version are still working in the new version.

 Recovery Testing: Where the software is deliberately interrupted in a number of ways, to ensure that the appropriate techniques for restoring any lost data will function.

 Security Testing: Where unauthorized attempts to operate the software, or parts of it, are made. It might also include attempts to obtain access to the data, or to harm the software installation or even the system software. As with all types of security, a sufficiently determined attacker will be able to obtain unauthorized access, and the best that can be achieved is to make this process as difficult as possible.

 Stress Testing: Where abnormal demands are made upon the software by increasing the rate at which it is asked to accept information, or the rate at which it is asked to produce information. More complex tests may attempt to create very large data sets or cause the software to make excessive demands on the operating system.

 Performance Testing: Where the performance requirements, if any, are checked. These may include the size of the software when installed, the amount of main memory and/or secondary storage it requires, and the demands made of the operating system when running within normal limits, or the response time.

 Usability Testing: The process of usability measurement was introduced in the previous chapter. Even if usability prototypes have been tested whilst the application was constructed, a validation test of the finished product will always be required.

 Alpha and Beta Testing: This is where the software is released to the actual end users. An initial release, the alpha release, might be made to selected users who would be expected to report bugs and other detailed observations back to the production team. Once the application changes necessitated by the alpha phase have been made, the beta release can go to a larger, more representative set of users, before the final release is made to all users.

The final process should be a software audit, where the complete software project is checked to ensure that it meets production management requirements. This ensures that all required documentation has been produced, is in the correct format, and is of acceptable quality. The purpose of this review is firstly to assure the quality of the production process and, by implication, of the product, before the next phase commences. A formal hand-over from the development team at the end of the audit will mark the transition between the two phases.

5.3 Integration Testing


Integration testing can proceed in a number of different ways, which can be broadly described as top-down or bottom-up. In top-down integration testing the high-level control routines are tested first, possibly with the middle-level control structures present only as stubs. Subprogram stubs were introduced in Section 2 as incomplete subprograms which are only present to allow the higher-level control routines to be tested.

Top-down testing can proceed in a depth-first or a breadth-first manner. For depth-first integration each module is tested in increasing detail, replacing more and more levels of detail with actual code rather than stubs. Alternatively, breadth-first integration would proceed by refining all the modules at the same level of control throughout the application. In practice a combination of the two techniques would be used. At the initial stages all the modules might be only partly functional, possibly being implemented only to deal with non-erroneous data. These would be tested in a breadth-first manner, but over a period of time each would be replaced with successive refinements which were closer to the full functionality. This allows depth-first testing of a module to be performed simultaneously with breadth-first testing of all the modules.

The other major category of integration testing is bottom-up integration testing, where an individual module is tested from a test harness. Once a set of individual modules have been tested, they are then combined into a collection of modules, known as builds, which are then tested by a second test harness. This process can continue until the build consists of the entire application. In practice a combination of top-down and bottom-up testing would be used: in a large software project being developed by a number of sub-teams, or a smaller project where different modules were built by individuals, the sub-teams or individuals would conduct bottom-up testing of the modules which they were constructing before releasing them to an integration team, which would assemble them together for top-down testing.

5.4 Unit Testing


Unit testing deals with testing a unit as a whole. This would test the interaction of many functions but confine the test to one unit. The exact scope of a unit is left to interpretation. Supporting test code, sometimes called scaffolding, may be necessary to support an individual test. This type of testing is driven by the architecture and implementation teams. This focus is also called black-box testing, because only the details of the interface are visible to the test. Limits that are global to a unit are tested here.

In the construction industry, scaffolding is a temporary, easy-to-assemble and easy-to-dismantle frame placed around a building to facilitate its construction. The construction workers first build the scaffolding and then the building; later the scaffolding is removed, exposing the completed building. Similarly, in software testing, one particular test may need some supporting software. This software establishes an environment around the test so that a correct evaluation of the test can take place. The scaffolding software may establish state and values for data structures and provide dummy external functions for the test. Different scaffolding software may be required from one test to another. Scaffolding software is rarely considered part of the system.

Sometimes the scaffolding software becomes larger than the system software being tested. Usually the scaffolding software is not of the same quality as the system software and frequently is quite fragile; a small change in the test may lead to much larger changes in the scaffolding.

Internal and unit testing can be automated with the help of coverage tools. A coverage tool analyzes the source code and generates a test that will execute every alternative thread of execution. Often, the coverage tool is used in a slightly different way: first the coverage tool is used to augment the source by placing informational prints after each line of code; then the testing suite is executed, producing an audit trail. This audit trail is analyzed to report the percentage of the total system code executed during the test suite. If the coverage is high and the untested source lines are of low impact on the system's overall quality, then no additional tests are required.
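A tiny hand-rolled example of such scaffolding follows (all names are hypothetical, chosen for illustration): a stub lookup function stands in for the real tag database so the unit can be evaluated in isolation.

```java
public class ScaffoldExample {
    // Unit under test: decides whether a block is a duplicate by asking
    // a lookup function for its tag.
    interface TagLookup { boolean exists(String tag); }

    static boolean isDuplicate(String tag, TagLookup lookup) {
        return lookup.exists(tag);
    }

    public static void main(String[] args) {
        // Scaffolding: a stub that knows exactly one tag replaces the
        // real database, establishing a controlled environment for the test.
        TagLookup stub = tag -> tag.equals("known-tag");
        System.out.println(isDuplicate("known-tag", stub)
                && !isDuplicate("new-tag", stub));
    }
}
```

In a JUnit-based suite the stub would be supplied from a test method rather than from main, but the principle is the same.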

5.5 Unit Test Cases

Sl. No | Test Case      | Input                    | Expected Result              | Result Obtained              | Status
1      | Add User       | User details             | Successfully Registered      | Successfully Registered      | Pass
2      | Upload File    | Browse file from system  | Uploaded Successfully        | Uploaded Successfully        | Pass
3      | Download File  | Select file to download  | Downloaded Successfully      | Downloaded Successfully      | Pass
4      | Integrity      | Downloaded file blocks   | Integrity Status (Pass/Fail) | Integrity Status (Pass/Fail) | Pass

5.6 System Test Cases

Sl. No | Test Case      | Input          | Expected Result          | Result Obtained          | Status
1      | Upload File    | File           | Uploaded Successfully    | Uploaded Successfully    | Pass
2      | Download File  | Selected File  | Downloaded Successfully  | Downloaded Successfully  | Pass

Conclusion
This chapter deals with a few types of testing, for instance module testing, which is a technique for testing the correct working of a particular unit of the source code; this is also referred to as unit testing. It also provides a brief overview of different sorts of integration testing, in which individual software units are combined and tested as a group. The chapter likewise focuses on assuring the quality of the product.
CHAPTER 6
RESULTS
The following screenshots outline the results or outputs that we obtain once the implementation of all the modules of the system is complete.
Upload Process

Figure Upload Process Result

While uploading the file, the first step is to break the file into small blocks based on the given block size. After that, a hash code is generated for each block. While generating the hash code, the system checks whether the block is new or a duplicate: if the hash code matches an existing hash code, it is a duplicate block of data; if it does not match, it is new data. All new blocks of data are encrypted using AES encryption and then uploaded to the cloud drive. As the graph shows, a smaller file takes less time to upload and a bigger file takes more time.
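The AES step can be sketched with the standard javax.crypto API. This is an illustrative round-trip only (using the provider's default AES mode and padding); key management is simplified here, and note that for deduplication over encrypted data the actual scheme derives keys from the content (convergent encryption) so that identical blocks encrypt identically, whereas a freshly random key as below would not:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class BlockCrypto {
    // Generate a fresh 128-bit AES key.
    static SecretKey newKey() throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    // Encrypt one block before upload.
    static byte[] encrypt(SecretKey key, byte[] block) throws Exception {
        Cipher c = Cipher.getInstance("AES"); // default AES/ECB/PKCS5Padding
        c.init(Cipher.ENCRYPT_MODE, key);
        return c.doFinal(block);
    }

    // Decrypt one block after download.
    static byte[] decrypt(SecretKey key, byte[] block) throws Exception {
        Cipher c = Cipher.getInstance("AES");
        c.init(Cipher.DECRYPT_MODE, key);
        return c.doFinal(block);
    }
}
```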

Download Process

Figure Download Process Result


While downloading the file, the system first checks how many blocks there are and then starts downloading each block from the cloud drive. While downloading the blocks it decrypts the block content, and after downloading all the blocks it merges them to make a single file. So a smaller file takes less time to download and a bigger file takes more time.

Screenshot Results

Screenshot-1 Admin Login Page
The screenshot-1 shows the Admin Login page.

Screenshot-2 Admin Home Page
The screenshot-2 shows the admin home page.

Screenshot-3 Admin Profile
The screenshot-3 shows the admin profile details.

Screenshot-4 User Details page
The screenshot-4 shows the user details page which has the option to add new user, edit user
details and delete user account.

Screenshot-5 Cloud Details
The screenshot-5 shows the cloud details page which consist cloud drive URL, folder name,
Login id and password.

Screenshot-6 Hash Details
The screenshot-6 shows the Hash details. Hash is the unique code generated by hash
algorithm for every new block of data.

Screenshot-7 Transaction Details
The screenshot-7 shows the Transaction details.

Screenshot-8 User Login Page
The screenshot-8 shows the User Login Page.

Screenshot-9 User Home Page
The screenshot-9 shows the User Home page.

Screenshot-10 User Details
The screenshot-10 shows the User Details.

Screenshot-11 File Upload
The screenshot-11 shows the file upload page. Here we have to browse the file from the system and click on the submit button. After the file is submitted for upload, it is broken into small blocks and a hash code is generated for each block. Each code is first matched against the existing codes in the database; if it is a new code, the hash code is added to the database and the block is encrypted and uploaded to the cloud drive.

Screenshot-12 File Download
The screenshot-12 shows the file download page. Here we select the file we want to download and click the download option. The download process then checks how many blocks the file contains, downloads those blocks from the cloud drive, and merges the small blocks into a single file.

Screenshot-13 User Transaction Page
The screenshot-13 shows the User Transaction page.

CONCLUSION

Dynamic management of data ownership is a basic and challenging issue in secure deduplication over encrypted data in cloud storage. In this work, we proposed a new secure data deduplication scheme to enhance the cloud data management framework. The proposed system improves data confidentiality and security in cloud storage against any user who does not hold ownership of the data. Tag consistency is also guaranteed, while the scheme still takes full advantage of efficient deduplication over encrypted data. In terms of communication cost, the proposed system outperforms the existing system, and its hash-tag comparison takes less time than that of the existing system. The proposed system therefore achieves a more secure and better-performing approach to dynamic ownership management for secure and efficient data deduplication.

Future Enhancement
 The comparison time during deduplication should be reduced, because as the number of
stored data blocks grows, comparing each new block's hash tag against them takes more time.
 For stronger data security, advanced encryption algorithms such as AES should be used
(DES is now considered too weak for new systems).
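Any move to a cipher such as AES must be reconciled with deduplication: if each user encrypts with an independent random key, identical blocks no longer produce identical ciphertexts. The standard fix, convergent encryption, derives the key from the block content itself. A minimal sketch follows; the XOR keystream is only a stand-in for a real cipher such as AES.

```python
import hashlib

def convergent_key(block: bytes) -> bytes:
    """Derive the encryption key from the block content itself (convergent
    encryption), so identical blocks stay deduplicable after encryption."""
    return hashlib.sha256(block).digest()

def encrypt_block(block: bytes, key: bytes) -> bytes:
    # Illustrative XOR keystream only; a deployment would use AES (e.g. AES-GCM).
    stream = b""
    counter = 0
    while len(stream) < len(block):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(block, stream))
```

Because the key depends only on the content, two users holding the same block produce the same ciphertext, which the server can deduplicate without ever seeing the plaintext.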

REFERENCES

[1] D. T. Meyer, and W. J. Bolosky, “A study of practical deduplication,” Proc. USENIX
Conference on File and Storage Technologies, 2011.
[2] M. Dutch, “Understanding data deduplication ratios,” SNIA Data Management Forum,
2008.
[3] W. K. Ng, W. Wen, and H. Zhu, “Private data deduplication protocols in cloud
storage,” Proc. ACM SAC’12, 2012.
[4] M. W. Storer, K. Greenan, D. D. E. Long, and E. L. Miller, “Secure data
deduplication,” Proc. StorageSS’08, 2008.
[5] N. Baracaldo, E. Androulaki, J. Glider, A. Sorniotti, “Reconciling end-to-end
confidentiality and data reduction in cloud storage,” Proc. ACM Workshop on Cloud
Computing Security, pp. 21–32, 2014.
[6] P. S. S. Council, “PCI SSC data security standards overview,” 2013.
[7] D. Harnik, B. Pinkas, and A. Shulman-Peleg, “Side channels in cloud services, the
case of deduplication in cloud storage,” IEEE Security & Privacy, vol. 8, no. 6, pp.
40–47, 2010.
[8] C. Wang, Z. Qin, J. Peng, and J. Wang, “A novel encryption scheme for data
deduplication system,” Proc. International Conference on Communications, Circuits
and Systems (ICCCAS), pp. 265–269, 2010.
[9] Malicious insider attacks to rise, http://news.bbc.co.uk/2/hi/7875904.stm
[10] Data theft linked to ex-employees, http://www.theaustralian.com.au/australian-it/datatheftlinked-to-ex-employees/story-e6frgakx-1226572351953, 2002.
[11] J. R. Douceur, A. Adya, W. J. Bolosky, D. Simon, and M. Theimer, “Reclaiming space
from duplicate files in a serverless distributed file system,” Proc. International
Conference on Distributed Computing Systems (ICDCS), pp. 617–624, 2002.
[12] P. Anderson, L. Zhang, “Fast and secure laptop backups with encrypted de-
duplication,” Proc. USENIX LISA, 2010.
[13] Z. Wilcox-O’Hearn, B. Warner, “Tahoe: the least-authority filesystem,” Proc. ACM
StorageSS, 2008.

[14] A. Rahumed, H. C. H. Chen, Y. Tang, P. P. C. Lee, J. C. S. Lui, “A secure cloud
backup system with assured deletion and version control,” Proc. International
Workshop on Security in Cloud Computing, 2011.
[15] J. Xu, E. Chang, and J. Zhou, “Leakage-resilient client-side deduplication of encrypted
data in cloud storage,” ePrint, IACR, http://eprint.iacr.org/2011/538.
[16] M. Bellare, S. Keelveedhi, and T. Ristenpart, “Message-locked encryption and secure
deduplication,” Proc. Eurocrypt 2013, LNCS 7881, pp. 296–312, 2013. Also: Cryptology
ePrint Archive, Report 2012/631, 2012.
[17] S. Halevi, D, Harnik, B. Pinkas, and A. Shulman-Peleg, “Proofs of ownership in
remote storage systems,” Proc. ACM Conference on Computer and Communications
Security, pp. 491–500, 2011.
[18] M. Mulazzani, S. Schrittwieser, M. Leithner, and M. Huber, “Dark clouds on the
horizon: using cloud storage as attack vector and online slack space,” Proc. USENIX
Conference on Security, 2011.
[19] A. Juels, and B. S. Kaliski, “PORs: Proofs of retrievability for large files,” Proc. ACM
Conference on Computer and Communications Security, pp. 584–597, 2007.
[20] G. Ateniese, R. C. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and D.
Song, “Provable data possession at untrusted stores,” Proc. ACM Conference on
Computer and Communications Security, pp. 598–609, 2007.
[21] J. Li, X. Chen, M. Li, J. Li, P. Lee, and W. Lou, “Secure deduplication with efficient
and reliable convergent key management,” IEEE Transactions on Parallel and
Distributed Systems, Vol. 25, No. 6, 2014.
[22] G.R. Blakley, and C. Meadows, “Security of Ramp schemes,” Proc. CRYPTO 1985,
pp. 242–268, 1985.
[23] J. Li, Y. K. Li, X. Chen, P. Lee, and W. Lou, “A hybrid cloud approach for secure
authorized deduplication,” IEEE Transactions on Parallel and Distributed Systems,
Vol. 26, No. 5, pp. 1206–1216, 2015.
[24] M. Bellare, S. Keelveedhi, T. Ristenpart, “DupLESS: Server-aided encryption for
deduplicated storage,” Proc. USENIX Security Symposium, 2013.
[25] M. Bellare, S. Keelveedhi, “Interactive message-locked encryption and secure
deduplication,” Proc. PKC 2015, pp. 516–538, 2015.

[26] Y. Shin and K. Kim, “Equality predicate encryption for secure data deduplication,”
Proc. Conference on Information Security and Cryptology (CISC-W), pp. 64–70,
2012.
[27] X. Jin, L. Wei, M. Yu, N. Yu and J. Sun, “Anonymous deduplication of encrypted data
with proof of ownership in cloud storage,” Proc. IEEE Conf. Communications in
China (ICCC), pp.224-229, 2013.
[28] D. Naor, M. Naor, and J. Lotspiech, “Revocation and tracing schemes for stateless
receivers,” Proc. CRYPTO 2001, Lecture Notes in Computer Science, vol. 2139, pp.
41–62, 2001.
[29] K. C. Almeroth, and M. H. Ammar, “Multicast group behavior in the Internet’s
multicast backbone (MBone),” IEEE Communication Magazine, vol. 35, pp. 124–129,
1997.
[30] Crypto++ Library 5.6.2, http://www.cryptopp.com/.
[31] S. Rafaeli, D. Hutchison, “A Survey of Key Management for Secure Group
Communication,” ACM Computing Surveys, Vol. 35, No. 3, pp. 309-329, 2003.
[32] L. Mingqiang, C. Qin, P.P.C. Lee, and J. Li, “Convergent Dispersal: Toward Storage-
Efficient Security in a Cloud-of- Clouds,” Proc. USENIX Conference on Hot Topics in
Storage and File Systems, 2014.
[33] L. Mingqiang, C. Qin, P.P.C. Lee, “CDStore: Toward Reliable, Secure, and Cost-
Efficient Cloud Storage via Convergent Dispersal,” Proc. USENIX Annual
Technical Conference, pp. 120, 2015.
[34] J. Li, X. Chen, X. Huang, S. Tang, Y. Xiang, M. Hassan, and A. Alelaiwi, “Secure
Distributed Deduplication Systems with Improved Reliability,” IEEE Transactions on
Computers, Vol. 64, No. 2, pp. 3569–3579, 2015.
