Professional Documents
Culture Documents
I NTRODUCTION
II introduces the system and threat model, and our design goals. Then we provide the detailed description
of our scheme in Section III. Section IV gives the security analysis and performance evaluation, followed by
Section V which overviews the related work. Finally,
Section VI gives the concluding remark of the whole
paper.
i
A
i
g
a
i
a
T
D
a
D
s
e
P ROBLEM S TATEMENT
2)
3)
4)
5)
retrieving a copy of the whole data or introducing additional online burden to the cloud users.
Storage correctness: to ensure that there exists
no cheating cloud server that can pass the TPAs
audit without indeed storing users data intact.
Privacy-preserving: to ensure that the TPA cannot derive users data content from the information collected during the auditing process.
Batch auditing: to enable TPA with secure and
efficient auditing capability to cope with multiple auditing delegations from possibly large
number of different users simultaneously.
Lightweight: to allow TPA to perform auditing
with minimum communication and computation overhead.
T HE P ROPOSED S CHEMES
TPA
1. Retrieve file tag t, verify its
signature, and quit if fail;
2. Generate a random challenge
Cloud Server
{(i,i )}iI
{,,R}
P
3. Compute = iI i mi , and also
Q
= iI ii ;
4. Randomly pick r Zp , and compute
R = e(u, v)r and = h(R);
5. Compute = r + mod p ;
R e( , g) = e((
sc
Y
H(Wi )i ) u , v)
(1)
i=s1
i=s1
sc
Y
(H(Wi )i ui mi ) , g)x
i=s1
sc
Y
H(Wi )i ) u , v)
i=s1
sc
Y
H(Wi )i ) u +r , v)
i=s1
sc
Y
H(Wi )i ) u , v)
= e((
= e((
i=s1
TPA
1. Verify file tag tk for each
user k, and quit if fail;
Cloud Server
For each user k (1 k K):
{(i,i )}iI
5. Compute k = rk + k k mod p ;
sc
X
i=s1
sc
Y
i
k,i
.
i=s1
K
Y
k , g) =
k=1
K
Y
e((
k=1
sc
Y
H(Wk,i )i )k uk k , vk )
i=s1
(2)
The batch protocol is illustrated in Fig. 3. Here the
left-hand side (LHS) of Equation 2 expands as:
LHS =
R1 R2 RK
K
Y
e(k k , g)
k=1
K
Y
k=1
K
Y
k=1
Rk e(k k , g)
3.6
e((
sc
Y
i=s1
H(Wk,i )i )k uk k , vk )
[21]. Hence, supporting data dynamics for privacypreserving public auditing is also of paramount importance. Now we show how to build upon the existing work [10] and adapt our main scheme to support
data dynamics, including block level operations of
modification, deletion and insertion.
In [10], data dynamics support is achieved by
replacing the index information i with mi in the
computation of block authenticators and using the
classic data structure Merkle hash tree (MHT) [22]
for the underlying block sequence enforcement. As
a result, the authenticator for each block is changed
to i = (H(mi ) umi )x . We can adopt this technique in our design to achieve privacy-preserving
public risk auditing with support of data dynamics.
Specifically, in the Setup phase, the user has to
generate and send the tree root T RMHT to TPA as
additional metadata, where the leaf nodes of MHT
are values of H(mi ). In the Audit phase, besides
{, , R}, the servers response should also include
{H(mi )}iI and their corresponding auxiliary authentication information aux in the MHT. Upon receiving
the response, TPA should first use T RMHT and aux
to authenticate {H(mi )}iI computed by the server.
Once {H(mi )}iI are authenticated, TPA can then perform the auditing
i )}iI } via EquaQ on {, , R, {H(m
i
is
now
replaced by
tion
1,
where
H(W
)
i
s1 isc
Q
i
.
Data
privacy
is
still
preserved
due
H(m
)
i
s1 isc
to the random mask. The details of handling dynamic
operations are similar to [10] and thus omitted.
3.7 Learning from
Though our scheme prevents the TPA from directly
deriving from , it does not rule out the possibility of offline guessing attack from the TPA using
valid from the response. Specifically, the TPA can
always guess whether the stored data contains certain
Qc
?
H(Wi )i )
message m,
by checking e(, g) = e(( si=s
1
u , v), where
is constructed from random coefficients chosen by the TPA in the challenge and the
guessed message m.
Thus, our main scheme is not
semantically secure yet. However, we must note that
HLA can be constructed by homomorphic identification protocols. One may apply the random masking technique we used to construct the corresponding zero knowledge proof for different homomorphic
identification protocols. Therefore, it follows that our
privacy-preserving public auditing system for secure
cloud storage can be generalized based on other complexity assumptions, such as factoring [23].
E VALUATION
4.1
Security Analysis
We need to prove that the cloud server cannot generate valid response for the TPA without faithfully
storing the data, as captured by Theorem 1.
Theorem 1: If the cloud server passes the Audit
phase, then it must indeed possess the specified data
intact as it is.
Proof: The proof consists of two steps. First, we
show that there exists an extractor of in the random
oracle model. Once a valid response {, } are obtained, the correctness of this statement follows from
Theorem 4.2 in [13].
Now, the cloud server is treated as an adversary.
The extractor controls the random oracle h() and
answers the hash query issued by the cloud server.
For a challenge = h(R) returned by the extractor, the
cloud server outputs {, , R} such that the following
equation holds.
R e( , g) = e((
sc
Y
H(Wi )i ) u , v).
(3)
i=s1
R e( , g) = e((
sc
Y
H(Wi )i ) u , v).
(4)
i=s1
t multiplications in group G.
t exponentiations
m-M ultExptG ()
t
P airG
1 ,G2
t
m-M ultP airG
1 ,G2
g ai ,
t m-term exponentiations
i=1
g ai .
460
300
460
300
411.00
270.20
407.66
265.87
507.79
476.81
504.25
472.55
160
40
160
40
for g G, |ai | = .
Qm
Qm
i=1
[13]
Sampled blocks c
e(ui , gi ).
e((
H(Wi )i ) , g x )e(u , g x )multi-user setting is very similar to that of Theorem 2,
i=s1
and thus omitted here. For the storage correctness
sc
Y
guarantee, we are going to reduce it to the single-user
(
H(Wi )i )x( ) ux( )
case. We use the forking technique as in the proof
i=s1
of Theorem 1. However, the verification equation
sc
Y
for the batch audits involves K challenges from the
i x( )
x( )
(
H(Wi ) )
u
random oracle. This time we need to ensure that all
i=s1
the other K 1 challenges are determined before the
sc
Y
x i
forking of the concerned random oracle response. This
(
(i /H(Wi ) ) )
can be done using the idea in [24]. As soon as the
i=s1
s
adversary issues the very first random oracle query
c
Y
e( , g) =
e( , g) =
sc
Y
ii )
=
=
i=s1
ux(
ux(
sc
X
i=s1
mi i ) =
sc
Y
R e(( i=s
H(W
)
)
u
,
v)/e(
, g). Finally, backi
1
patch = h(R) since the simulator is controlling the
random oracle h(). We remark that this backpatching
technique in the random oracle model is also used in
the proof of the underlying scheme [13].
4.1.3 Security Guarantee for Batch Auditing
Now we show that our way of extending our result to
a multi-user setting will not affect the aforementioned
4.2
Performance Analysis
We begin by estimating the cost in terms of basic cryptographic operations, as notated in Table 1. Suppose
there are c random blocks specified in the challenge
10
520
510
500
500
individual auditing
batch auditing (c=460)
batch auditing (c=300)
480
460
440
420
490
480
470
460
450
440
individual auditing
batch auditing (c=460)
batch auditing (c=300)
430
420
400
50
100
150
200
message chal during the Audit phase. Under this setting, we quantify the cost introduced of the privacypreserving auditing in terms of server computation,
auditor computation as well as communication overhead.
On the server side, the generated Q
response includes
i
an aggregated authenticator =
G1 , a
iI i
r
random factor R = e(u, v) GT , and P
a blinded linear
combination of sampled blocks = iI i mi + r
Zp , where = h(R) Zp . The corresponding computation cost is c-M ultExp1G1 (|i |), Exp1GT (|p|), and
Hash1Zp + AddcZp + M ultc+1
Zp , respectively. Compared
to the existing HLA-based solution for ensuring remote data integrity [13]1, the extra cost for protecting
the user privacy, resulted from the random mask R,
is only a constant: Exp1GT (|p|) + M ult1Zp + Hash1Zp +
Add1Zp , which has nothing to do with the number of
sampled blocks c. When c is set to be 300 to 460 for
high assurance of auditing, as discussed in Section 3.4,
the extra cost for privacy-preserving guarantee on the
server side would be negligible against the total server
computation for response generation.
Similarly, on the auditor side, upon receiving
the response {, R, }, the corresponding computation cost for response validation is Hash1Zp + cM ultExp1G1 (|i |) + HashcG1 + M ult1G1 + M ult1GT +
2
Exp3G1 (|p|) + P airG
, among which only Hash1Zp +
1 ,G2
1
2
ExpG1 (|p|) + M ultGT account for the additional constant computation cost. For c = 460 or 300, and considering the relatively expensive pairing operations,
this extra cost imposes little overhead on the overall
cost of response validation, and thus can be ignored.
For the sake of completeness, Table 2 gives the experiment result on performance comparison between
our scheme and the state-of-the-art [13]. It can be
1. We refer readers to [13] for a detailed description of their HLAbased solution.
410
10
12
14
16
18
11
R ELATED WORK
Ateniese et al. [8] are the first to consider public auditability in their defined provable data possession
(PDP) model for ensuring possession of data files on
untrusted storages. Their scheme utilizes the RSAbased homomorphic linear authenticators for auditing
outsourced data and suggests randomly sampling a
few blocks of the file. However, the public auditability
in their scheme demands the linear combination of
sampled blocks exposed to external auditor. When
used directly, their protocol is not provably privacy
preserving, and thus may leak user data information
to the auditor. Juels et al. [11] describe a proof
of retrievability (PoR) model, where spot-checking
and error-correcting codes are used to ensure both
possession and retrievability of data files on remote archive service systems. However, the number
of audit challenges a user can perform is fixed a
priori, and public auditability is not supported in their
main scheme. Although they describe a straightforward Merkle-tree construction for public PoRs, this
approach only works with encrypted data. Dodis et
al. [25] give a study on different variants of PoR
with private auditability. Shacham et al. [13] design an
improved PoR scheme built from BLS signatures [17]
with full proofs of security in the security model
defined in [11]. Similar to the construction in [8], they
use publicly verifiable homomorphic linear authenticators that are built from provably secure BLS signatures. Based on the elegant BLS construction, a compact and public verifiable scheme is obtained. Again,
their approach does not support privacy-preserving
auditing for the same reason as [8]. Shah et al. [9], [14]
propose allowing a TPA to keep online storage honest
C ONCLUSION
ACKNOWLEDGMENTS
This work was supported in part by the US National
Science Foundation under grant CNS-0831963, CNS0626601, CNS-0716306, and CNS-0831628.
12
A PPENDIX A
Z ERO K NOWLEDGE P UBLIC AUDITING
Here we present a public auditing scheme with provably zero knowledge leakage. The setup phase is
similar to our main scheme presented in Section 3.4.
The secret parameters are sk = (x, ssk) and the public
parameters are pk = (spk, v, g, u, e(u, v), g1), where
g1 G1 is an additional public group element.
In the audit phase, upon receiving challenge
chal = {(i, i )}iI , the server chooses three random
elements rm , r , Zp , and calculates R = e(g1 , g)r
e(u, v)rm GT and = h(R) Zp . Let de
note
P the linear combination of sampled blocks =
i mi , and denote the aggregated authenticator
iIQ
= iI ii G1 . To make the auditing scheme with
zero knowledge leakage, the server has to blind both
and . Specifically, the server computes: = rm +
mod p, and = g1 . It then sends {, , , R} as
the response proof of storage correctness to the TPA,
where = r + mod p. With the response from
the server, the TPA runs VerifyProof to validate
the response by first computing = h(R) and then
checking the verification equation
?
R e( , g) = e((
sc
Y
i=s1
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
i=s1
R EFERENCES
[1]
[2]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
[25]