

S7 R
Parity cloud service (PCS)
Modeling the parity cloud service
Feasibility of the PCS
The increasing amount of user data drives
demand for large-capacity storage
devices, such as tera-scale HDDs
Nowadays, tera-scale HDDs are widely used in
While large capacity storage devices afford
users sufficient space for user data
A failure of the device causes a great deal of
recovery cost or the loss of all user data
Users generally equip another storage device
and back up their data
External hard disks and USB drives are
commonly used external storage devices
Using portable storage devices cannot be a safe
way for personal data backup
A much safer and more reliable way for data
protection is using an expensive system such as
RAID or a server as a private backup system
It is reliable and provides strong privacy
They are not practical for personal use because
of both the cost and maintenance burden
Recently, lots of internet service providers
(ISPs) provide storage cloud services to their
Provides high reliability, convenience, and low
There are many users who believe that their
data can be revealed to others
Several works have pointed out that privacy
protection is the major problem in cloud
computing adoption
Encryption-based data protection can solve
the problem
An authorized organization can decrypt the
encrypted user data when necessary
Reliability: The service should be able to
recover user data, when necessary, with
sufficiently high probability

Economical efficiency: The cost for data
recovery service should be sufficiently low to
increase the user accessibility
Convenience: General users should be able to easily
use the data recovery service.

Privacy protection: User data must not be exposed to
anyone other than the owner of the data, either
intentionally or unintentionally
Parity Cloud Service (PCS) satisfies all of the
above criteria
PCS provides parity-based recovery
Extremely simple
Can completely relieve users of their concern
about privacy protection
Easy to use
Requires a reasonable server-side cost
Recovers user data with sufficiently high probability
Generates a virtual disk in the user's system for private
data backup
Makes a parity group across the virtual disks of
multiple users
Stores the parity data of the parity group in the
cloud storage
Each user only needs to back up files to their
own virtual disk for future data recovery
If the original file is unavailable from the file
system, the user can ask the PCS agent
software to recover the file.

A PCS agent software, installed in each user
It creates a virtual disk on the user storage
The Virtual Disk Parity Group (VDPG), whose
parity data are stored in the cloud storage
Parity generation is managed by the PCS server.
1. Virtual Disk Interface (VDI): provides the
user with interfaces to the virtual disk.
Users can back up their data to the virtual disk
via the VDI
The VDI provides users with general file copy and
delete operations
Users can regard the virtual disk as an ordinary
storage volume
Recovery Manager: communicates with the PCS
server in the cloud for data recovery.
Recovery Managers in the VDPG and the PCS
server collaborate both to generate parity
blocks for each data block in the group and to
recover a data block in the group
Storage Manager: creates and manages the virtual disk
It stores data blocks from the VDI to the virtual disk
Reads data blocks for the Recovery Manager
Maintains disk metadata and file metadata
The PCS server generates parity groups for each
virtual disk
It maintains metadata for each VDPG
Metadata, group ID, IP address of each member
machine and accounting information of each
Collaborates with Recovery Managers to
generate parity blocks
Stores the parity blocks in the cloud storage.
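The per-group metadata kept by the PCS server could be modeled roughly as below. This is only a sketch: the class names, field names, and the dictionary used for accounting information are hypothetical, and the IP addresses are documentation-range placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class MemberInfo:
    """One member machine of a group (hypothetical layout)."""
    ip_address: str
    accounting: dict = field(default_factory=dict)  # assumed free-form record


@dataclass
class VDPGMetadata:
    """Metadata the server keeps for one VDPG (hypothetical layout)."""
    group_id: int
    members: list[MemberInfo] = field(default_factory=list)

    def add_member(self, ip_address: str) -> None:
        # Register a member machine by its IP address.
        self.members.append(MemberInfo(ip_address))


group = VDPGMetadata(group_id=1)
for ip in ("192.0.2.10", "192.0.2.11", "192.0.2.12"):
    group.add_member(ip)
```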

This section covers three processes:

1. Initial Parity Generation
2. Parity Block Update
3. Data Block Recovery

The only operation used for parity generation
and data recovery is Exclusive-OR
(XOR), which is a simple mixing function in
XOR operation
$A \oplus A = 0$ and $A \oplus 0 = A$
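The two XOR identities can be checked on byte blocks with a few lines of Python. The helper name `xor_blocks` and the sample block are illustrative only.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Return the bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


block = b"user data block!"
zero = bytes(len(block))  # all-zero block of the same size

assert xor_blocks(block, block) == zero  # A xor A = 0
assert xor_blocks(block, zero) == block  # A xor 0 = A
```

These two identities are what let a block be added to, or removed from, a parity block with the same operation.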

At the time when a virtual disk is created by the
PCS agent, the seed block for the virtual disk is
PCS server sends the initialize message to each
Recovery Manager in the group
The initialize message contains predecessor
and successor
Indicates from/to whom the intermediate parity
block will be received/sent
$((((r \oplus S_1) \oplus S_2) \oplus S_3) \oplus S_4) \oplus r = S_1 \oplus S_2 \oplus S_3 \oplus S_4$
Final parity block $= S_1 \oplus S_2 \oplus S_3 \oplus S_4$
The initialization process occurs only once for
each parity group
The seed block is stored in the metadata region of
each virtual disk
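The initialization chain can be sketched as follows. It assumes that r is a random block chosen by the PCS server, that the server starts the chain and receives the last intermediate block, and that blocks are 16 bytes long; none of these details are stated on the slides, and the function names are hypothetical.

```python
import os
from functools import reduce

BLOCK_SIZE = 16  # illustrative block size


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Return the bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def initial_parity(seeds: list[bytes]) -> bytes:
    """Simulate the chain: server -> node 1 -> ... -> node k -> server."""
    r = os.urandom(BLOCK_SIZE)      # assumed: random block known only to the server
    intermediate = r
    for seed in seeds:              # each node XORs in its seed and forwards the result
        intermediate = xor_blocks(intermediate, seed)
    return xor_blocks(intermediate, r)  # server removes r to get the final parity


seeds = [os.urandom(BLOCK_SIZE) for _ in range(4)]
parity = initial_parity(seeds)
```

Because every intermediate block is masked by r, no node in the chain sees the plain XOR of the other nodes' seed blocks.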
The Storage Manager in PCS agent maintains
parity generation bitmap (PG-bitmap)
The bitmap is initialized (set to 0) after the
initialization process
Zero indicates that parity block is not generated
for data block
When a data block is updated, the
corresponding parity block in the PCS must also be updated
When a block ($B_{old}$) in node i is to be updated
to a new block ($B_{new}$)
Storage Manager refers to the corresponding
value in the PG-bitmap
If it is 0, the Storage Manager generates an
intermediate parity block ($P_t$) as
$P_t = B_{new} \oplus S_i$ and sets the PG-bitmap value to 1
Otherwise, $P_t = B_{new} \oplus B_{old}$
The Recovery Manager sends an update request,
containing the updated block number and $P_t$, to the
PCS server
When an update request message arrives from node i, the
PCS server identifies to which group node i
belongs and refers to the PG-bitmap of the

If it is zero, the PCS server generates the parity
block ($P_{new}$)
$P_{new} = P_t \oplus P$, where $P$ is the parity block currently held for that block
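The update path on both sides can be sketched as below. The `Node` and `Server` classes are hypothetical, and the sketch assumes that the parity for every block position starts as the XOR of all seed blocks, so an unwritten block counts as the node's seed block. The server-side PG-bitmap is left out because the slides do not describe it fully.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Return the bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class Node:
    """Agent-side state for one virtual disk (hypothetical layout)."""

    def __init__(self, seed: bytes, num_blocks: int):
        self.seed = seed
        self.blocks = [None] * num_blocks
        self.pg_bitmap = [0] * num_blocks  # 0: no parity generated for this block yet

    def update(self, n: int, b_new: bytes):
        """Write b_new to block n and return (block number, P_t)."""
        if self.pg_bitmap[n] == 0:
            p_t = xor_blocks(b_new, self.seed)        # first write: replace the seed
            self.pg_bitmap[n] = 1
        else:
            p_t = xor_blocks(b_new, self.blocks[n])   # later writes: replace B_old
        self.blocks[n] = b_new
        return n, p_t


class Server:
    """Server-side parity store for one group (hypothetical layout)."""

    def __init__(self, initial_parity: bytes, num_blocks: int):
        # Assumed: every block position starts from the initial parity block.
        self.parity = [initial_parity] * num_blocks

    def apply_update(self, n: int, p_t: bytes) -> None:
        self.parity[n] = xor_blocks(p_t, self.parity[n])  # P_new = P_t xor P


s1, s2 = bytes([1] * 4), bytes([2] * 4)
node_a, node_b = Node(s1, 2), Node(s2, 2)
server = Server(xor_blocks(s1, s2), 2)

server.apply_update(*node_a.update(0, b"AAAA"))
server.apply_update(*node_a.update(0, b"BBBB"))
server.apply_update(*node_b.update(0, b"CCCC"))
```

After these three updates, the parity for block 0 equals the XOR of the current blocks of both nodes, and the server has only ever received intermediate parity blocks, never a data block itself.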

When a data block is corrupted, the block can
be recovered using the parity block provided by
the PCS server
Assume that the n-th data block in node i, $B_i^n$, has
been corrupted

Upon receiving the message, the PCS server identifies
the VDPG
It generates $P_r = P^n \oplus r$ (if the group size is even)
$P_r = P^n$ (if the group size is odd)

Each other node j generates $E_j = B_j^n \oplus r$

$B_i^n = P_r \oplus E_1 \oplus \dots \oplus E_{i-1} \oplus E_{i+1} \oplus \dots \oplus E_{|VDPG|}$
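The recovery steps can be sketched as below, including the reason for the even/odd case split: the number of other nodes is the group size minus one, so r appears an odd number of times among the E_j when the group size is even and must be cancelled by the copy in P_r. The sketch assumes r is a random block that the PCS server shares with the helping nodes; the function name and the way r is distributed are hypothetical.

```python
import os
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Return the bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_block(parity_n: bytes, other_blocks: list[bytes], group_size: int) -> bytes:
    """Rebuild the lost n-th block from the parity and the other nodes' n-th blocks."""
    r = os.urandom(len(parity_n))  # assumed: random mask chosen by the server
    # Server side: mask the parity only when the group size is even.
    p_r = xor_blocks(parity_n, r) if group_size % 2 == 0 else parity_n
    result = p_r
    for b_j in other_blocks:
        e_j = xor_blocks(b_j, r)   # computed at node j; hides B_j from node i
        result = xor_blocks(result, e_j)
    return result


for size in (3, 4):
    blocks = [os.urandom(8) for _ in range(size)]
    parity_n = reduce(xor_blocks, blocks)
    lost = 1                       # index of the corrupted node
    others = blocks[:lost] + blocks[lost + 1:]
    assert recover_block(parity_n, others, size) == blocks[lost]
```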
A node can belong to multiple VDPGs for higher
recoverability and faster recovery time
The more VDPGs a node belongs to, the higher the
service cost
As the group size increases, the service cost
The recoverability and recovery time become

Namely, there is a tradeoff between the
efficiency (recoverability + recovery time) of the
PCS and its service cost
It is important to decide both the number of
groups for a node and the group size,
For faster recovery time, it is recommended to
select nodes in the same time zone
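A rough way to explore this trade-off is a toy model like the one below. It is not taken from the slides: it counts only cloud storage as the service cost (one parity block per group of data blocks, shared by the members), and it assumes each other member is online independently with the same probability and that a node's groups share no other members. Both function names are hypothetical.

```python
def parity_storage_per_node(data_blocks: int, group_size: int, num_groups: int) -> float:
    """Cloud parity blocks attributable to one node (storage-only cost model)."""
    return num_groups * data_blocks / group_size


def recoverability(p_online: float, group_size: int, num_groups: int) -> float:
    """Probability that at least one group has all other members online."""
    one_group = p_online ** (group_size - 1)  # all other members reachable
    return 1 - (1 - one_group) ** num_groups


for size in (2, 4, 8):
    cost = parity_storage_per_node(1000, size, 1)
    chance = recoverability(0.9, size, 1)
    print(f"group size {size}: {cost:.0f} parity blocks, recoverability {chance:.3f}")
```

Under these assumptions, a larger group lowers the storage share per node but also lowers the chance that every other member is reachable, while joining more groups does the opposite.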

Parity Cloud Service: A Privacy-Protected Personal Data
Recovery Service by Chi-won Song, Sungmin Park,
Dong-wook Kim, Sooyong Kang
2011 International Joint Conference of IEEE TrustCom-11/IEEE ICESS-

Cloud Security Alliance, Security Guidance for Critical
Areas of Focus in Cloud Computing, 2011.