
Information Processing and Management 57 (2020) 102355


Blockchain-based privacy-preserving remote data integrity checking scheme for IoT information systems

Quanyu Zhao a, Siyi Chen a, Zheli Liu b, Thar Baker c, Yuan Zhang *,a

a State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
b College of Cyber Science, College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Nankai University, China
c Department of Computer Science, Liverpool John Moores University, United Kingdom

* Corresponding author.
E-mail addresses: quanyu.zhao1105@gmail.com (Q. Zhao), yysfs1y1@gmail.com (S. Chen), liuzheli@nankai.edu.cn (Z. Liu), t.baker@ljmu.ac.uk (T. Baker), zhangyuan@nju.edu.cn (Y. Zhang).

https://doi.org/10.1016/j.ipm.2020.102355
Received 18 March 2020; Received in revised form 3 July 2020; Accepted 4 July 2020; Available online 18 July 2020
0306-4573/ © 2020 Elsevier Ltd. All rights reserved.

ARTICLE INFO

Keywords: Data integrity checking; Privacy; Blockchain; Public verifiability; Security

ABSTRACT

Remote data integrity checking is of great importance to the security of cloud-based information systems. Previous works generally assume a trusted third party to oversee the integrity of the outsourced data, an assumption that may not hold in practice. In this paper, we utilize the blockchain to construct a novel privacy-preserving remote data integrity checking scheme for Internet of Things (IoT) information management systems without involving trusted third parties. Our scheme leverages the Lifted EC-ElGamal cryptosystem, bilinear pairing, and blockchain to support efficient public batch signature verifications and to protect the security and data privacy of IoT systems. Experimental results demonstrate the efficiency of our scheme.

1. Introduction

With the proliferation of Internet of Things (IoT) devices, including computers, cell phones, smart IoT equipment, etc., people are generating tremendous amounts of data every second. Storing and managing these data bring increasing challenges to ordinary Data Owners (DOs). Therefore, more and more DOs of IoT systems turn to cloud storage services provided by big companies such as Amazon and Google, and run their information management systems remotely.
Transferring data to cloud servers is an effective way to reduce storage pressure. However, the DO loses direct control of its data, which might lead to security risks and privacy breaches. For instance, the cloud servers may delete some rarely queried or low-value data to reduce storage costs, or modify the primary data to cater to the requirements of research departments. An individual's data often include sensitive or important information such as medical information, health information, contract information, etc. It would therefore be a disaster for the DO if the data were sold, modified, destroyed or deleted, intentionally or unintentionally. Some recent surveys (Alfandi, Otoum, & Jararweh, 2020; Ateniese, Burns, & Curtmola, 2011; Ateniese & Curtmola, 2007; Blum, Evans, & Gemmell, 1994) show that the integrity of outsourced files is one of the major security concerns in data storage management. Remote checking of data integrity is thus a popular method of guaranteeing both privacy and security.

1.1. Remote data integrity checking

An excellent remote data checking scheme should have the following merits. First, and most important, are individual data security and privacy, which guarantee that nobody can obtain the content of the private data without the permission of the DO.


If the individual data have been modified or destroyed, this can always be detected in the checking phase. Next, the operation rights over the data should be controlled by the DO, and the scheme should support dynamic data updating operations, including modifications, insertions and deletions. In addition, the remote data integrity checking scheme should resist internal and external attacks, such as procrastinating auditors, malicious cloud servers, etc. Last but not least, the efficiency of the integrity checking scheme is also a key factor for practical application in data management.
After uploading huge amounts of individual data to the cloud servers, the DO deletes the local files to relieve storage stress. Given bandwidth limits, it is usually impossible to download the entire data set to check its integrity. Hence, the auditors in existing remote data integrity checking schemes usually employ probabilistic verification, which chooses a proportion of the data to check at a time. This probabilistic checking method guarantees security while satisfying practical application requirements.

1.2. Related work

As outsourced data become more and more valuable, distributed and online storage systems face greater security and privacy challenges. Checking the integrity of the data is one method of protecting both. The first remote data integrity checking scheme was proposed by Blum et al. (1994); its checking process did not reveal any information about the entire data. The cloud servers were assumed to be untrusted in the Provable Data Possession (PDP) schemes (Ateniese et al., 2011; Ateniese & Curtmola, 2007). The DO divides the data into metadata blocks, stores the files and the metadata on the cloud servers, and deletes the local files. A small part of the metadata is chosen randomly to check the integrity of the data without the entire file or a summary of the data. Checking the correctness of the response from the storage server is equivalent to guaranteeing that the files are unaltered. This probabilistic verification method is well suited to the checking operation and reduces computation and communication costs.
Another PDP scheme (Ateniese, Pietro, & Mancini, 2008) supports dynamically updating files after they are stored on the cloud servers; the supported operations include updating, modification, deletion, and appending. It offers an efficient way to check the integrity of the data, with the weaknesses of allowing only a limited number of queries and not supporting insertion. Erway, Kp, and Papamanthou (2015) used the Merkle Hash Tree (MHT) to improve previous PDP schemes. However, Wang, Wang, and Ren (2011) pointed out that the MHT cannot verify the indices of the data blocks and might suffer from replace attacks. A few mechanisms (Barsoum & Hasan, 2015; Erway et al., 2015; Wang, Wang, & Li, 2009; Wang et al., 2011; Yang & Jia, 2013; Yu, Li, & Ni, 2016; Zhu, Ahn, & Hu, 2013) were proposed to check the integrity of the outsourced data while supporting dynamic operations. Traditional cryptographic techniques, such as message authentication codes and digital signatures, cannot be used directly to audit the integrity of remote data. Integrity checking of remote data on cloud servers is thus a big challenge and attracts more and more researchers. Some schemes (Ateniese, Kamara, & Katz, 2009; Boneh, Lynn, & Shacham, 2001; Li, Huang, & Liu, 2019; Li, Zhang, & Liu, 2016; Umair, Asim, & Thar, 2020; Wang, He, & Tang, 2016; Yu, Au, & Ateniese, 2017; Zhou, 2019) employ cryptographic techniques such as public-key encryption, bilinear pairing, homomorphic encryption, and digital signatures to establish such mechanisms.
Batch verification can reduce the computational cost and the communication overhead of transmitting integrity tags. Shen, Liu, and Chen (2019) employ batch verification to establish secure real-time traffic data aggregation for the vehicular cloud. Hu, Yeh, and Liao (2019) propose an autonomous and malware-proof blockchain-based firmware update platform; it is similar to other schemes (Cui, Zhang, & Zhong, 2017; Shen, Zhang, & Shen, 2017) that adopt batch verification to increase efficiency. We also employ this technique to enhance the efficiency of checking in our scheme.
Usually, there are two participants in a data checking scheme: the DO and the cloud servers. The DO checks the integrity of its data by carrying out a "two-party" remote data checking protocol, possibly based on fog computing (Huang, Li, & Liu, 2019; Tariq, Asim, & Obeidat, 2019). However, the audit result produced by either the DO or the cloud servers might be regarded as unreliable by the other participant. Therefore, some protocols (Wang et al., 2011; Zhu et al., 2013) employ a third-party auditor, with access to the data and the capacity to verify its integrity, to achieve public verification. However, the third party increases the privacy and security threats in the auditing phase: it may reveal some information for its own benefit when unobserved, or be curious about others' information. The owners' private or sensitive information may be disclosed either actively or passively, since the auditor may collude with the cloud servers or other attackers. Finding a trusted third party that audits the integrity of the data without revealing privacy is a big challenge in practice. Some previous works (Ateniese & Curtmola, 2007; Erway et al., 2015; Gan, Wang, & Fang, 2018; Ristenpart, Tromer, & Shacham, 2009; Yu, Zhang, & Ni, 2015) have addressed this problem. Two PDP schemes (Ateniese & Curtmola, 2007) based on RSA verify the integrity of the data without a third party, but both have a high computation cost. DPDP (Erway et al., 2015) is an efficient auditing scheme that verifies the integrity of the data while supporting dynamic updating.
A fixed auditor may be corrupted, which is not conducive to guaranteeing the integrity of the data. Public verification is one of the most important characteristics: an external auditor, or any entity with the ability to verify the integrity of the data, can play the role of the auditor in the auditing scheme. Ateniese et al. first proposed two PDP schemes (Ateniese & Curtmola, 2007) achieving public verification, and some works (Li et al., 2016; Wang et al., 2009; Zhu et al., 2013) based on their constructions provide additional features on top of public verification.
The privacy of the outsourced data in the remote data integrity checking scheme of Wang, Wang, and Ren (2010) was defined by the requirement that the verifier cannot obtain the entire blocks from the verification process. However, this definition of privacy is not widely accepted, since each block may contain confidential or sensitive data that must not be captured by others. Wang, Chow, and Wang (2013) presented the concept of zero-knowledge public auditing, which means that the verifier gains nothing beyond the auditing information.


In other words, even if the verifier mounts a guessing attack on the audited data, its probability of success is close to 0. Nevertheless, this work did not provide a formal security model.
Recently, a few schemes (Ateniese et al., 2009; Boneh et al., 2001; Erway et al., 2015; Gan et al., 2018; Li et al., 2016; Liu, Li, & Li, 2019b; Ristenpart et al., 2009; Wang et al., 2016; Wang et al., 2011; Yang & Jia, 2013; Yu et al., 2017; Yu et al., 2016; Yu et al., 2015; Zhu et al., 2013) were proposed to protect the privacy and security of the outsourced data by using a trusted third party. However, introducing a trusted third party into the information system increases the communication and computation complexity of the network. Under most conditions, it is difficult to find an always-online trusted third party in practice, and such a party also brings risks of privacy disclosure if it is malicious. Even when an always-online trusted third party does exist, it may be a single point of failure or suffer attacks from adversaries; this situation often happens in practice. Due to these shortcomings of the trusted third party, some works (Ateniese & Curtmola, 2007; Butt, Iqbal, & Salah, 2019; Erway et al., 2015; Hua, Tang, & Pan, 2017; Liu, Wang, & Wang, 2019a; Tseng, Wong, & Otoum, 2020; Yu et al., 2015) are devoted to removing the trusted third party from these schemes. However, these works mainly rely on complicated secure multiparty computation (SMC) techniques and are not suitable for IoT systems that involve devices with limited computation resources.
Considering blockchains' excellent advantages in preventing data loss and illegal modifications, we design a novel privacy-preserving remote data integrity checking scheme for IoT information systems based on the blockchain. The contributions of this paper are summarized as follows.
• We propose a blockchain-based privacy-preserving remote data integrity checking scheme without a TTP, and we show that our scheme is immune to privacy leakage from the third party and to collusion attacks between the cloud servers and the third party.
• Our scheme establishes a model based on blockchain technology (a public distributed ledger). Once consensus is reached, everyone, including the verifier, can query the proof of the data an unlimited number of times from the blockchain, and nobody can manipulate the data on the blockchain.
• Our scheme provides privacy-preserving public verification as well as batch verification for multi-user data or multiple data items simultaneously. It is secure against malicious servers and immune to delayed auditors. Moreover, it reduces communication and storage costs.
• The security analysis and experimental results demonstrate that this protocol is suitable for secure and practical real-world applications.
The rest of the paper is organized as follows. Some preliminaries are presented in Section 2. We put forward the system model in Section 3. Our construction and its analysis are presented in Section 4. Section 5 shows the performance evaluation of our scheme, and Section 6 concludes our work.

2. Preliminaries

We utilize several techniques, including the Lifted EC-ElGamal cryptosystem, bilinear pairing and aggregated signatures, to establish a blockchain-based privacy-preserving remote data integrity checking scheme without a TTP for IoT. For completeness, these preliminaries are detailed as follows.

2.1. Lifted EC-ElGamal cryptosystem

A few works (Desmedt, 1994; ElGamal, 1985; Liu, Guo, & Fan, 2018) used the elliptic curve group to establish the Lifted EC-ElGamal cryptosystem. The Lifted EC-ElGamal cryptosystem consists of three components: Key Generation, Encryption, and Decryption.

(1) Key Generation. Using an elliptic curve group E(Fq) of order q with generator G, the private and public key pair (X, Y) is generated by computing Y = X·G.
(2) Encryption. For data m ∈ L, L = {0, 1, …, t} (t ≪ q), the DO runs the encryption algorithm to obtain the ciphertext C = (c1, c2) = (r·G, m·G + r·Y), where r is a random number from Zq*.
(3) Decryption. The data m can be recovered using Pollard's lambda algorithm (Grissa, Yavuz, & Hamdaoui, 2017) with time complexity O(√t). The ciphertext is decrypted by solving Eq. (1):

m = log_G(c2 − X·c1)   (1)


Due to the homomorphism of the encryption and decryption algorithms, ciphertexts can be batch encrypted and decrypted using Eqs. (2) and (3), respectively. The DO can thus download some ciphertexts and recover the plaintexts either in aggregated form or individually:

C3 = C1 + C2 = (c11 + c21, c12 + c22) = (r3·G, (m1 + m2)·G + r3·Y)   (2)

m = m1 + m2 = log_G[(c12 + c22) − X·(c11 + c21)]   (3)

where Ci = (ci1, ci2) and r3 = r1 + r2.
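To make the homomorphism concrete, the following minimal Python sketch implements lifted ElGamal in a multiplicative subgroup of Zp* as a stand-in for the elliptic curve group; the algebra of Eqs. (1)-(3) is the same, with point addition replaced by modular multiplication. The brute-force discrete log stands in for Pollard's lambda, and all parameters are toy-sized illustrations, not the paper's implementation.

```python
# Minimal sketch of lifted (exponential) ElGamal and its additive
# homomorphism, mirroring Eqs. (1)-(3). Toy parameters, not secure.
import random

P = 2 ** 127 - 1          # toy prime modulus (assumption, not from the paper)
G = 3                     # fixed group element used as the generator
T = 10 ** 4               # plaintext bound t (t << q)

def keygen():
    x = random.randrange(2, P - 1)          # private key X
    return x, pow(G, x, P)                  # public key Y = G^X

def encrypt(m, y):
    r = random.randrange(2, P - 1)
    # C = (c1, c2) = (G^r, G^m * Y^r), the multiplicative analog of
    # (r*G, m*G + r*Y).
    return pow(G, r, P), (pow(G, m, P) * pow(y, r, P)) % P

def decrypt(c, x):
    c1, c2 = c
    gm = (c2 * pow(c1, P - 1 - x, P)) % P   # c2 / c1^X = G^m
    for m in range(T + 1):                   # brute force stands in for
        if pow(G, m, P) == gm:               # Pollard's lambda, O(sqrt(t))
            return m
    raise ValueError("plaintext out of range")

def add(ca, cb):
    # Homomorphic addition (Eq. (2)): component-wise group operation.
    return (ca[0] * cb[0]) % P, (ca[1] * cb[1]) % P

x, y = keygen()
c3 = add(encrypt(123, y), encrypt(456, y))
assert decrypt(c3, x) == 579                 # Eq. (3): m = m1 + m2
```

As in the scheme, the aggregated ciphertext decrypts to the sum of the plaintexts, so the DO can recover either aggregated or individual values.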


2.2. Bilinear pairing

A bilinear pairing (James, Gayathri, & Reddy, 2018) maps a pair of elements from the Gap Diffie-Hellman groups G1 and G2 to an element of a multiplicative cyclic group G3, where g1 and g2 are generators of G1 and G2, both of order q. A bilinear pairing e has the following three properties:

• Bilinearity: ∀a, b ∈ Zp, e(a·g1, b·g2) = e(g1, g2)^(ab).
• Non-degeneracy: e(g1, g2) ≠ 1.
• Efficient computability: e(v1, v2) can be computed in polynomial time for all v1 ∈ G1, v2 ∈ G2.

2.3. Aggregated signature

An aggregated signature is a digital signature with the ability to converge several signatures into one signature and batch verify them, such as the CL-signature scheme (Camenisch & Lysyanskaya, 2004). The CL-signature scheme includes three algorithms, listed as follows.

(1) Key Generation. Based on an elliptic curve group E(Fq) of order q with generators g1, g2, g3 and a hash function H(·), each user Ui (i = 1, 2, …, n) generates a private and public key pair (xi, yi) satisfying yi = xi·g1.
(2) Signature. User Ui computes the signature σi by Eq. (4), where hi = H(mi):

σi = xi·g2 + xi·hi·g3   (4)

(3) Batch verification. Given the signatures σi and messages mi, the signatures are batch verified by checking Eq. (5):

e(∑_{i=1}^{n} σi, g1) = e(g2, ∑_{i=1}^{n} yi)·e(g3, ∑_{i=1}^{n} H(mi)·yi)   (5)

3. System model

We consider a data outsourcing scenario in which the DO has a large number of files beyond its local storage capacity and therefore rents storage space from the cloud. The cloud servers are equipped with ample storage space and computation resources, but they may destroy the individual data intentionally or unintentionally. The DO, the cloud servers, or external auditors with the expertise and capability to perform the verification work can play the role of the auditor, check the integrity of the data an unlimited number of times, and receive rewards for the verification work. Moreover, we take the privacy and security of the data into account. The architecture of the remote data integrity auditing is illustrated in Fig. 1; it includes four entities: the DO, the cloud servers, the key generation center (KGC) and the auditors. The KGC runs the key generation algorithm for the entities and publishes some public information.

Fig. 1. The model of the data checking.

3.1. Trust model

This work focuses on the outsourced data integrity checking in the cloud environment. In order to make the model more suitable
for practical application in the information management system, some assumptions are put forward as follows.
The DO is semi-honest. The outsourced data are stored on the cloud servers, and the DO pays a fee for the storage. The DO may intend to reduce the storage fee, and may cheat the cloud servers by pretending to be someone else when applying for storage space.
The cloud servers are semi-honest. They are curious about the content of the outsourced data and may sell the data to other parties for profit. Moreover, they may inflate the storage space occupied by the data to charge an additional fee. Last but not least, they tend to conceal destroyed data rather than inform the DO so that it can take measures to reduce the damage, and they try their best to cheat the auditor or the DO: when the data are destroyed, the cloud servers attempt to generate a valid response that passes the verification without being detected.
The auditor is a semi-honest entity who is curious about the content of the stored data and may leak information about the individual data, using all of its resources and computation power to obtain useful information. Using additional resources and computation to audit the outsourced data is another way for it to gain benefits.

3.2. Design goals

To ensure the privacy and security of the outsourced data, we aim to propose a protocol for verifying the integrity of the data that achieves the following goals.
Correctness: The protocol ensures that the cloud servers cannot pass the audit if they modify or erase the data stored by the DO.
Privacy: The auditor gains no information beyond the integrity checking result.
Dynamic updating: The protocol should allow the DO to perform updating operations, including modifications, insertions, deletions and appending, on its own data.
Public verification: Any party, including the DO, the cloud servers or an external auditor, has the ability to check the integrity of the outsourced data.
Security: Operations including data modification, insertion, deletion, appending and destruction can be performed only after being authorized by the DO.

4. Our construction and its analysis

In this section, we first put forward our construction. Then, we show that our scheme achieves correctness, security, privacy and dynamic updating.

4.1. Our construction

Our construction consists of three phases: the setup phase, the storage phase, and the verification phase. The details of each phase are described as follows.
Setup phase
Based on the elliptic curve group E(Fq), the KGC generates the public parameters {E(Fq), q, g1, g2, Zp*, t} and uploads them to the blockchain. The symbol q is the order of the group and g1, g2 are generators of the group. Zp* represents an additive group, and A = {0, 1, 2, …, t} is a set with t ≪ q.
The DO stores a large number of files on the cloud servers. The DO and the KGC execute the key generation algorithm of the Lifted EC-ElGamal cryptosystem to obtain the DO's private key xi and corresponding public key Yi, computed as Yi = xi·g1. The cloud servers obtain their private and public key pair (xc, Yc) in the same way.
Storage phase
In this phase, the DO divides the files into data blocks, encrypts the blocks into ciphertexts, signs them, and sends the ciphertexts and signatures to the cloud servers. The cloud servers verify the signatures and store the signatures and ciphertexts. Thereafter, the cloud servers generate the abstracts of the ciphertexts and upload the abstracts to the blockchain. The details are as follows.
Step 1. Due to its limited storage space, the DO rents external space to store its data. It divides its data M into data blocks mj (1 ≤ j ≤ n) using the RS code (Oh, Ha, & Park, 2016); the blocks are named fj (1 ≤ j ≤ n), and the set of data blocks is denoted L = {m1, m2, …, mn}.
Step 2. For each data block mj, the DO chooses a random number rj ∈ Zp* and encrypts mj into the ciphertext pair (c1j, c2j) by computing Eqs. (6) and (7) over all blocks:

Ci1 = (c11, c12, …, c1n) = (r1·g1, r2·g1, …, rn·g1)   (6)

Ci2 = (c21, c22, …, c2n) = (m1·g1 + r1·Yi, m2·g1 + r2·Yi, …, mn·g1 + rn·Yi)   (7)
Step 3. Once the ciphertext Ci is generated, the signature Sigi = (sigi1, sigi2, …, sigin) over the blocks mj (j = 1, 2, …, n) is generated by Eq. (8):

Sigij = xi·H1(IDi‖Yi‖fnj‖c1j‖c2j)·g2   (8)

where IDi is the identity of the DO, fnj is the file name, and H1(·) is a secure hash function.
Step 4. After obtaining the signatures and ciphertexts, the DO computes and uploads {IDi, Sigij, Ci, fnj}Yc to the cloud servers, and pays for the storage.
Step 5. In this step, the cloud servers verify the validity of the fee. Thereafter, the cloud servers verify the signatures of the files by checking Eq. (9):

e(Sigij, g1) = e(g2, H1(IDi‖Yi‖fnj‖c1j‖c2j)·Yi)   (9)

Due to the homomorphic character of the signature algorithm, the signatures can be batch verified by Eq. (10):

e(∑_{j=1}^{n} Sigij, g1) = e(g2, ∑_{j=1}^{n} H1(IDi‖Yi‖fnj‖c1j‖c2j)·Yi)   (10)
Step 6. The cloud servers compute the tag τj of each ciphertext Cij (1 ≤ j ≤ n) by Eq. (11):

τj = xc·H1(IDi‖Yi‖fnj‖c1j‖c2j‖rc,j)·g2   (11)

where rc,j ∈ Zp* is a random number. The cloud servers then upload {IDi, τj, fnj, rc,j} to the blockchain and store {IDi, τj, Ci, fnj, rc,j} locally.
Step 7. The DO batch verifies the information {IDi, τj, fnj, rc,j} (1 ≤ j ≤ n) by checking Eq. (12), and deletes the local blocks mj (1 ≤ j ≤ n) if the check passes:

e(∑_{j=1}^{n} τj, g1) = e(g2, ∑_{j=1}^{n} H1(IDi‖Yi‖fnj‖c1j‖c2j‖rc,j)·Yc)   (12)
Verification phase
When the data owner wants to download the data or to check the integrity of the individual data, it publishes the checking requirement on the task board. An internal entity (the DO or the cloud servers) or an external entity then acts as the auditor and launches the integrity verification of the files. The auditing process is as follows.
Step 1. The auditor queries the blocks from the cloud servers, which return the ciphertexts Cj (1 ≤ j ≤ n).
Step 2. The auditor queries the blockchain, obtains {IDi, τj, fnj, rc,j} (1 ≤ j ≤ n), and verifies the tags of the blocks by Eq. (13):

e(∑_{j=1}^{n} τj, g1) = e(g2, ∑_{j=1}^{n} H1(IDi‖Yi‖fnj‖c1j‖c2j‖rc,j)·Yc)   (13)

If the equation holds, the cloud servers have not modified the blocks. Thereafter, the auditor uploads the checking transaction, including the checking text, result, time and the auditor's signature, to the blockchain.
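As a sanity check of the tag-and-audit flow around Eqs. (11)-(13), the sketch below replays it in the same discrete-log toy model as before (a·g2 is stored as the integer a, and pairings are emulated so that both sides of Eq. (13) reduce to exponents mod q). H1 is instantiated with SHA-256, and all identifiers, block contents and parameters are hypothetical placeholders rather than the paper's implementation.

```python
# Toy replay of storage-phase tagging (Eq. (11)) and the audit (Eq. (13)).
import hashlib, secrets

q = (1 << 61) - 1                                  # toy prime group order

def H1(*parts: bytes) -> int:
    return int.from_bytes(hashlib.sha256(b"|".join(parts)).digest(), "big") % q

xc = secrets.randbelow(q - 2) + 1                  # cloud private key x_c
Yc = xc % q                                        # Y_c = x_c * g1, as its exponent

ID, Y = b"DO-1", b"Yi"                             # placeholder identity data
blocks = [(b"fn%d" % j, b"c1-%d" % j, b"c2-%d" % j) for j in range(4)]
rcs = [secrets.token_bytes(8) for _ in blocks]     # per-block randomness r_{c,j}

# Eq. (11): tau_j = x_c * H1(ID||Yi||fn_j||c1_j||c2_j||r_{c,j}) * g2
tags = [xc * H1(ID, Y, *blk, rc) % q for blk, rc in zip(blocks, rcs)]

def audit(cipher_blocks):
    # Eq. (13): e(sum(tau_j), g1) =? e(g2, sum(H1(...)) * Yc); in the toy
    # model both sides reduce to x_c * sum(H1(...)) mod q.
    lhs = sum(tags) % q
    rhs = sum(H1(ID, Y, *blk, rc) * Yc for blk, rc in zip(cipher_blocks, rcs)) % q
    return lhs == rhs

assert audit(blocks)                               # intact ciphertexts pass
tampered = list(blocks)
tampered[1] = (b"fn1", b"c1-1", b"c2-CORRUPT")     # server alters one block
assert not audit(tampered)                         # the audit detects it
```

Because the tags live on the blockchain while the ciphertexts live on the cloud servers, any ciphertext the servers alter no longer hashes to the value bound by its tag, and the batch check fails.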

4.2. Analysis

We now formally prove the correctness, security and privacy of our scheme, and discuss its dynamic operations. The details of the analysis are described as follows.
Correctness
The DO rents external space to store its data. The correctness of the data is very important for the DO, since the data may include important or sensitive information such as contract information, health information, financial information and so on.
Lemma 1. The correctness of our scheme means that any illegal operation of the cloud servers passes the checking with only negligible probability.
Proof. The auditor downloads {IDi, τj, fnj, rc,j} from the blockchain and obtains the ciphertexts from the cloud servers. If Eq. (12) or Eq. (13) holds, the auditor trusts the signatures, since the information stored on the blockchain can be modified only with negligible probability.
Suppose the cloud servers modify, destroy, delete or insert data without the DO's permission; they will then try their best to cheat the auditor into passing the checking. To do so, the cloud servers would have to find another random number rj,1 ∈ Zp* and compute the tag τij by Eq. (14):

τij = xc·H1(IDi‖Yi‖fnj‖c1j‖c2j‖rj,1)·g2   (14)

If τij = τj while rj,1 ≠ rc,j, the cloud servers pass the checking. However, it is infeasible for a PPT algorithm to find such an rj,1 satisfying τj = xc·H1(IDi‖Yi‖fnj‖c1j‖c2j‖rj,1)·g2, as this would yield a collision of H1. If a signature is changed, the illegal operation is detected with non-negligible probability. Therefore, the cloud servers' illegal operations pass the checking with only negligible probability. This concludes the proof of Lemma 1. □
In the proposed protocol, the integrity of the data can always be checked. If the data are modified, destroyed, deleted or inserted, the tags cannot pass the checking process. Moreover, the checking transaction is uploaded to the blockchain, a distributed ledger stored on all nodes. If the auditor cheats the DO in the checking process, the cheating transaction will be tracked in the next verification phase.
Security
Before proving the security of the proposed protocol, we review the ECC encryption algorithm and the hash function in detail.
The ECC encryption algorithm (Gen, Enc, Dec) is an asymmetric encryption algorithm that is indistinguishable in the presence of attackers: for all probabilistic polynomial-time adversaries A and all i, there exists a negligible ε such that

Pr[A(1^i, Enc_k(m)) = m] ≤ ε   (15)

The adversary may act as the cloud servers, an auditor or another external entity. The probability is taken over the adversary A, the random choices of the data m and the key k, and any random choices made during encryption.
Besides the ECC encryption algorithm, we employ a hash function and bilinear pairing to establish the proposed scheme. A hash function y = H(x) is a one-way function: y is obtained from x easily, but it is infeasible to recover x from y. A hash function also maps input of any length to output of a fixed length, and it is infeasible to find two different input messages with the same output.
Lemma 2. The proposed protocol is secure against internal and external adversaries equipped with polynomial-time attack algorithms.
Proof. Breaking the ECC encryption algorithm is a hard problem for which adversaries have no polynomial-time attack algorithm. Obtaining the plaintext from the ciphertext in our scheme is equivalent to solving the discrete logarithm problem, since the ECC encryption algorithm is one of the components of our scheme. Hence, the plaintext cannot be obtained by internal or external adversaries, even if they are equipped with polynomial-time algorithms and high-performance hardware.
On the other hand, the ciphertext is generated with the DO's public key by the ECC encryption algorithm. It cannot be decrypted without the DO's private key, short of breaking the ECC encryption algorithm. In other words, nobody except the DO can obtain the plaintext. The DO recovers the aggregated data or a single data item by computing Eq. (16):

∑_{j=1}^{n} mj = log_{g1}(∑_{j=1}^{n} (c2j − xi·c1j)) = log_{g1}(∑_{j=1}^{n} (mj·g1 + rj·Yi − xi·rj·g1))   (16)

This concludes the proof of Lemma 2. □
Privacy
Privacy in this protocol has two interpretations: first, the cloud servers cannot obtain any information in the storage phase; second, no auditor gains any information in the verification phase.
Lemma 3. Under the semi-honest model, no entity gains any information about the DO's data.
Proof. First, the signature Sigj is derived from the DO's identity, the file name, the ciphertext and the private key. The cloud servers verify the legality of the signature from the information {IDi, Sigj, fnj, Cj} using the bilinear pairing. Owing to the properties of the bilinear pairing, the data are stored on the cloud servers in hidden form, and the cloud servers gain no information from the signatures in the verification phase. Thus the privacy of the data is guaranteed on the cloud servers' side.
An auditor, whether an internal or external entity with sufficient computing power and resources, checks the integrity of the data and may be curious about its content. In the verification phase, the auditor downloads {IDi, τj, fnj, rc,j} (1 ≤ j ≤ n) from the blockchain, obtains Cj (1 ≤ j ≤ n) from the cloud servers, and checks the integrity of the data by Eq. (13). In this process, no one can extract information about the plaintext from the auditing information, since doing so requires solving the discrete logarithm problem. The auditor therefore cannot obtain information about the data, and the privacy of the data is protected on the auditor's side. This concludes the proof of Lemma 3. □
Dynamic updating
In practice, the DO wants to query and operate on its data dynamically. In our scheme, the data can be modified, inserted and deleted without downloading the whole original files. The dynamic operations are analyzed as follows.
Modification: In the cloud storage management system, the DO modifies the data frequently, often replacing an original block mj with a new data block mj′. The DO chooses a random number rj′ ∈ Zp* and computes the ciphertext Cij′ and signature sigj′ by Eqs. (17) and (18):

Cij′ = (ci1,j′, ci2,j′) = (rj′·g1, mj′·g1 + rj′·Yi)   (17)

sigj′ = xi·H1(IDi‖Yi‖fnj‖ci1,j′‖ci2,j′)·g2   (18)

The DO computes and sends {IDi, sigj′, Cij′, fnj}Yc to the cloud servers. The cloud servers send {IDi, τj′, fnj, rc,j′, fm} to the blockchain and store {IDi, τj′, Cj′, fnj, rc,j′}, where the symbol fm indicates, among other information, that the original file has been modified. Thereafter, the cloud servers delete the old record {IDi, τj, Cj, fnj, rc,j} from the local database.
Data insertion: The files are stored on the cloud servers in a certain order. The DO may insert a new block mj−1 before the block mj. The DO runs the encryption and signature algorithms to obtain Cj−1 and sigj−1:

sigj−1 = xi·H1(IDi‖Yi‖fnj−1‖ci1,j−1‖ci2,j−1)·g2   (19)

The cloud servers verify the signature sigj−1, upload {IDi, τj−1, fnj−1, rc,j−1} to the blockchain, and insert {IDi, τj−1, Cj−1, fnj−1, rc,j−1} before the file fnj on the cloud servers.
Data deletion: When the DO sends a deletion request for the file mj to the cloud servers, the cloud servers delete the file mj and all related information, and upload a deletion record for mj to the blockchain. The auditor verifies the deletion record at the same time.

5. Evaluation

In this section, we analyze the probability of detecting illegal behavior, compare our scheme with some existing schemes, and conduct an experiment to test our scheme's efficiency.
Due to limited computation power and resources, the auditor cannot download all the blocks to verify the integrity of the files. The auditor therefore randomly chooses some blocks to verify the integrity of the data; this is probabilistic verification. Assume that the files are divided into ω blocks, the blocks are stored on the cloud servers, the cloud servers modify, delete or destroy u blocks through active or passive operations, and the auditor chooses c blocks to verify the integrity. Let η denote the probability that the illegal operation is detected. Then η can be described by Eq. (20):

η = Pr[x ≥ 1] = 1 − Pr[x = 0] = 1 − C(ω−u, c)/C(ω, c)   (20)

When the cloud servers have performed an illegal operation, Pr[x ≥ 1] is the probability of catching it, and Pr[x = 0] is the probability that the illegal operation goes undetected; C(ω, c) and C(ω−u, c) count the ways of choosing c blocks from ω or from ω−u blocks, respectively. Assuming ω = 1000, Table 1 shows the probability of the illegal behavior being detected.
As Table 1 shows, the probability of successful error detection increases with the number of modified blocks. Even if only a small number of blocks are modified, the illegal operation can be detected with high probability by choosing enough blocks for auditing. If the cloud servers modify, delete or destroy 1% of the blocks, the auditor detects the illegal behavior with a probability of more than 99% after checking 368 blocks; if more blocks are tampered with, fewer than 368 checked blocks achieve the same probability. Conversely, when the cloud servers modify only a small part of the file, more blocks must be verified to keep the detection probability non-negligible. In any case, the probability of detecting the illegal behavior is significant under batch verification.
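These detection probabilities follow directly from Eq. (20), and a few lines of Python reproduce both Table 1 and the 368-block figure above (the function name and output layout are ours, not the paper's code):

```python
# Eq. (20): eta = 1 - C(w - u, c) / C(w, c), the chance that sampling c of
# w blocks hits at least one of the u corrupted ones.
from math import comb

def detection_probability(w: int, u: int, c: int) -> float:
    """Probability that a random sample of c blocks contains a bad one."""
    return 1 - comb(w - u, c) / comb(w, c)

w = 1000
for c in (5, 10, 15):                       # the rows of Table 1
    row = [detection_probability(w, u, c) for u in range(10, 100, 10)]
    print(c, " ".join(f"{p:.2%}" for p in row))

# With 1% of the 1000 blocks corrupted, checking 368 blocks detects the
# tampering with probability just over 99%, matching the text above.
print(f"{detection_probability(w, 10, 368):.4%}")
```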
We now compare our scheme with other schemes. The comparison covers the encryption algorithm, the complexity, support for dynamic auditing, and the need for a third-party auditor. The complexity comprises the computation complexity, the communication complexity, and the verifier's storage complexity. The computation complexity covers the encryption and signature costs on the DO side, the tag generation cost on the cloud servers' side, and the verification costs on the cloud servers' and auditor's sides. The storage complexity refers to the storage cost on the auditor's side. Note that n represents the maximum number of encrypted files. Taking the computation, storage and communication costs into consideration, the comparison between our scheme and the other schemes is shown in Table 2.
Table 1
The probability of the illegal behavior being detected (ω = 1000; rows: number of checked blocks c; columns: number of corrupted blocks u).

c\u   10      20      30      40      50      60      70      80      90
5     4.91%   9.63%   14.15%  18.50%  22.66%  26.66%  30.48%  34.15%  37.66%
10    9.60%   18.37%  26.36%  33.64%  40.27%  46.29%  51.77%  56.73%  61.23%
15    14.09%  26.30%  36.38%  46.03%  53.93%  60.74%  66.60%  71.63%  75.95%


Table 2
Comparison with some related works.

Scheme       Cryptography    Server       Auditor      Communication  Verifier  Dynamic   Third-party
                             computation  computation                 storage   auditing  auditor
PDP [3]      Public-key      O(1)         O(1)         O(1)           O(1)      No        No
DPDP [13]    Public-key      O(log n)     O(log n)     O(log n)       O(1)      Yes       No
PADD [9]     Public-key      O(log n)     O(log n)     O(log n)       O(1)      Yes       Yes
RDPC [21]    Symmetric-key   O(log n)     O(log n)     O(log n)       O(1)      No        No
ESAOD [22]   Symmetric-key   O(1)         O(1)         O(1)           O(1)      No        Yes
Our scheme   Asymmetric-key  O(1)         O(1)         O(1)           0         Yes       No


As shown in Table 2, the related schemes keep the storage cost on the verifier's side constant, whereas our scheme transfers this storage to the blockchain. This storage method has three benefits: first, it reduces the storage cost on the verifier's side; second, it reduces the probability of the abstract being destroyed; third, if the auditor cheats the DO in the verification phase, the transaction on the blockchain can be tracked to catch the illegal auditor. A third-party auditor is appointed in the ESAOD and PADD schemes, whereas the PDP, DPDP and RDPC schemes and our scheme need no third-party auditor. The existence of a third-party auditor may attract additional attacks on the third party's side; if the abstract of the data is destroyed there, it becomes impossible to check the integrity of the data.
Since the cloud servers may operate on the DO's private data, the auditor should check many tags within a short time, and it is advantageous to aggregate different DOs' signatures and check them at one time. In our scheme, we aggregate the signatures from one data owner or from different data owners into a single signature, and check the aggregated signature by carrying out the batch verification algorithm. This batch verification method reduces the cost of the verification phase.
To show the efficiency of our scheme, we conducted an integrity checking experiment on the outsourced data. The experiment ran on a personal computer equipped with an Intel(R) Core(TM) i5-6267U CPU @ 2.90GHz and 8 GB RAM, and the programs were written in Python 3.7.1. The ECC encryption in our scheme is realized by the ElGamal algorithm, with the key size set to 32 bits and the data size varying from 1024 bits to 8192 bits. We recorded the average times, over 10,000 runs, of the Setup algorithm, the signature generation and verification, the abstract generation in the storage phase, and the abstract verification in the public verification. The records are listed in Table 3.
Table 3 shows that the average time of the Setup algorithm is correlated with the size of the key but not with the size of the data, since it only includes the generation of the public and private keys; the variation of this time with the key size is shown in Fig. 2. The average times of the encryption and decryption algorithms grow linearly with the size of the data. The average times of encryption, decryption, signature generation, signature verification, abstract generation and abstract verification are plotted in Fig. 3. The figure shows that the times for signature generation, signature verification, abstract generation and abstract verification are not correlated with the size of the data, since those operations work on the ciphertext, whose size depends on both the data size and the key size. The costs of signature generation, signature verification, abstract generation and abstract verification increase as the key size varies from 16 bits to 256 bits. The results of the experiment show that our scheme is efficient.
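For reference, averages of the kind reported in Table 3 can be obtained with a simple timing harness of the following shape; the workload shown is a placeholder hash of a 1024-bit block rather than the paper's ElGamal routines, and the helper name is our own:

```python
# Minimal timing harness: run an operation 10,000 times and report the
# mean in milliseconds, as in Table 3.
import hashlib, os, time

def average_ms(op, runs: int = 10_000) -> float:
    start = time.perf_counter()
    for _ in range(runs):
        op()
    return (time.perf_counter() - start) / runs * 1000

block = os.urandom(1024 // 8)                 # one 1024-bit data block
print(f"avg: {average_ms(lambda: hashlib.sha256(block).digest()):.3f} ms")
```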

Table 3
The performance of our scheme.

Size of the data  Key Gen (ms)  Enc (ms)  Dec (ms)  Sig Gen (ms)  Sig Ver (ms)  Abs Gen (ms)  Abs Ver (ms)
1024 bits         25.054        0.956     1.248     0.322         0.357         0.179         0.314
2048 bits         -             1.784     2.031     0.278         0.439         0.176         0.311
4096 bits         -             3.498     3.986     0.245         0.348         0.181         0.314
8192 bits         -             6.684     7.552     0.245         0.421         0.163         0.320

Fig. 2. The variation of the average time with the size of the key.

Fig. 3. Average times: (a) encryption; (b) decryption; (c) signature generation; (d) signature verification; (e) abstract generation; (f) abstract verification. The x axes give the size of the data in units of 1024 bits.

6. Conclusion and future work

In this paper, we propose a privacy-preserving remote data integrity checking scheme for IoT information systems that is based on the blockchain and requires no TTP. The proposed scheme is well suited to practical applications in data management systems, since it needs no third party in the remote data integrity checking phase and thus resists the leakage of data privacy caused by an untrusted third party. We employ blockchain technology to construct the model: the auditor can query the proof of the data an unlimited number of times and uploads the auditing transaction to the blockchain, so the DO or other auditors are able to track illegal auditing transactions. Our scheme satisfies correctness, privacy, dynamic updating, public verification and security, and the experimental results demonstrate that it is efficient in communication and computation. In future work, we will research automated remote data integrity checking, in which the integrity is checked in real time: if the data are modified, deleted, destroyed or inserted illegally by the cloud servers or attackers, the DO is notified immediately, leaving more time to deal with the crisis.

CRediT authorship contribution statement

Quanyu Zhao: Conceptualization, Methodology, Data curation, Writing - original draft. Siyi Chen: Data curation, Writing -
original draft. Zheli Liu: Conceptualization, Methodology, Writing - review & editing. Thar Baker: Conceptualization, Methodology,
Writing - review & editing. Yuan Zhang: Conceptualization, Methodology, Writing - review & editing.

Acknowledgments

This work was supported in part by National Key R&D Program of China (2018YFB1004301), NSFC-61872179, Fundamental
Research Funds for the Central Universities (020214380052), NSFC-61425024, NSFC-61872176.


References

Alfandi, O., Otoum, S., & Jararweh, Y. (2020). Blockchain solution for IoT-based critical infrastructures: Byzantine fault tolerance. NOMS 2020 – IEEE/IFIP Network Operations and Management Symposium, Budapest, Hungary, 1–4. https://doi.org/10.1109/NOMS47738.2020.9110312.
Ateniese, G., Burns, R., & Curtmola, R. (2011). Remote data checking using provable data possession. ACM Transactions on Information and System Security, 14(1), 12.
https://doi.org/10.1145/1952982.1952994.
Ateniese, G., & Curtmola, R. (2007). Provable data possession at untrusted stores. Proceedings of the 14th ACM Conference on Computer and Communications Security. ACM, 598–609. https://doi.org/10.1145/1315245.1315318.
Ateniese, G., Kamara, S., & Katz, J. (2009). Proofs of storage from homomorphic identification protocols. International Conference on the Theory and Application of Cryptology and Information Security. Berlin, Heidelberg: Springer, 319–333. https://doi.org/10.1007/978-3-642-10366-7_19.
Ateniese, G., Pietro, R., & Mancini, L. (2008). Scalable and efficient provable data possession. Proceedings of the 4th International Conference on Security and Privacy in Communication Networks. ACM, 9. https://doi.org/10.1145/1460877.1460889.
Barsoum, A., & Hasan, M. (2015). Provable multicopy dynamic data possession in cloud computing systems. IEEE Transactions on Information Forensics and Security, 10(3), 485–497. https://doi.org/10.1109/TIFS.2014.2384391.
Blum, M., Evans, W., & Gemmell, P. (1994). Checking the correctness of memories. Algorithmica, 12(2–3), 225–244. https://doi.org/10.1007/BF01185212.
Boneh, D., Lynn, B., & Shacham, H. (2001). Short signatures from the Weil pairing. International Conference on the Theory and Application of Cryptology and Information Security. Berlin, Heidelberg: Springer, 514–532. https://doi.org/10.1007/3-540-45682-1_30.
Butt, T., Iqbal, A., & Salah, R. (2019). Privacy management in social internet of vehicles: Review, challenges and blockchain based solutions. IEEE access : practical
innovations, open solutions, 7, 79694–79713. https://doi.org/10.1109/ACCESS.2019.2922236.
Camenisch, J., & Lysyanskaya, A. (2004). Signature schemes and anonymous credentials from bilinear maps. Annual International Cryptology Conference. Berlin, Heidelberg: Springer, 56–72. https://doi.org/10.1007/978-3-540-28628-8_4.
Cui, J., Zhang, J., & Zhong, S. (2017). SPACF: A secure privacy-preserving authentication scheme for VANET with cuckoo filter. IEEE Transactions on Vehicular
Technology, 66(11), 10283–10295. https://doi.org/10.1109/TVT.2017.2718101.
Desmedt, Y. (1994). Threshold cryptography. European Transactions on Telecommunications, 5(4), 449–458. https://doi.org/10.1002/ett.4460050407.
ElGamal, T. (1985). A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Transactions on Information Theory, 31(4), 469–472. https://
doi.org/10.1109/TIT.1985.1057074.
Erway, C., Kp, A., & Papamanthou, C. (2015). Dynamic provable data possession. ACM Transactions on Information and System Security, 17(4), 15. https://doi.org/10.
1145/2699909.
Gan, Q., Wang, X., & Fang, X. (2018). Efficient and secure auditing scheme for outsourced big data with dynamicity in cloud. Science China Information Sciences, 61(12), 122104. https://doi.org/10.1007/s11432-017-9410-9.
Grissa, M., Yavuz, A., & Hamdaoui, B. (2017). Preserving the location privacy of secondary users in cooperative spectrum sensing. IEEE Transactions on Information
Forensics and Security, 12(2), 418–431. https://doi.org/10.1109/TIFS.2016.2622000.
Hu, J., Yeh, L., & Liao, S. (2019). Autonomous and malware-proof blockchain-based firmware update platform with efficient batch verification for internet of things
devices. Computers and Security, 86, 238–252. https://doi.org/10.1016/j.cose.2019.06.008.
Hua, J., Tang, A., & Pan, Q. (2017). Practical k-anonymization for collaborative data publishing without trusted third party. Security and Communication Networks, 2017, 1–10. https://doi.org/10.1155/2017/9532163.
Huang, Y., Li, B., & Liu, Z. (2019). Towards practical oblivious data access in fog computing environment. IEEE Transactions on Services Computing. https://doi.org/10.1109/TSC.2019.2962110.
James, S., Gayathri, N., & Reddy, P. (2018). Pairing free identity-based blind signature scheme with message recovery. Cryptography, 2(4), 29. https://doi.org/10.
3390/cryptography2040029.
Li, J., Huang, Y., & Liu, Z. (2019). Searchable symmetric encryption with forward search privacy. IEEE Transactions on Dependable and Secure Computing. https://doi.
org/10.1109/TDSC.2019.2894411.
Li, J., Zhang, L., & Liu, J. (2016). Privacy-preserving public auditing protocol for low-performance end devices in cloud. IEEE Transactions on Information Forensics and
Security, 11(11), 2572–2583. https://doi.org/10.1109/TIFS.2016.2587242.
Liu, Y., Guo, W., & Fan, C. (2018). A practical privacy-preserving data aggregation (3PDA) scheme for smart grid. IEEE Transactions on Industrial Informatics, 15(3),
1767–1774. https://doi.org/10.1109/TII.2018.2809672.
Liu, Y., Wang, Y., & Wang, X. (2019a). Privacy-preserving raw data collection without a trusted authority for IoT. Computer Networks, 148, 340–348. https://doi.org/10.1016/j.comnet.2018.11.028.
Liu, Z., Li, B., & Li, J. (2019b). NewMCOS: Towards a practical multi-cloud oblivious storage scheme. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2019.2891581.
Oh, J., Ha, J., & Park, H. (2016). RS-LDPC concatenated coding for the modern tape storage channel. IEEE Transactions on Communications, 64(1), 59–69. https://doi.
org/10.1109/TCOMM.2015.2504362.
Ristenpart, T., Tromer, E., & Shacham, H. (2009). Hey, you, get off of my cloud: Exploring information leakage in third-party compute clouds. Proceedings of the 16th ACM Conference on Computer and Communications Security. ACM, 199–212. https://doi.org/10.1145/1653662.1653687.
Shen, H., Zhang, M., & Shen, J. (2017). Efficient privacy-preserving cube-data aggregation scheme for smart grids. IEEE Transactions on Information Forensics and
Security, 12(6), 1369–1381. https://doi.org/10.1109/TIFS.2017.2656475.
Shen, J., Liu, D., & Chen, X. (2019). Secure real-time traffic data aggregation with batch verification for vehicular cloud in VANETs. IEEE Transactions on Vehicular
Technology, 69, 807–817. https://doi.org/10.1109/TVT.2019.2946935.
Tariq, N., Asim, M., & Obeidat, F. (2019). The security of big data in fog-enabled IoT applications including blockchain: A survey. Sensors, 19(8), 1788. https://doi.org/10.3390/s19081788.
Tseng, L., Wong, L., & Otoum, S. (2020). Blockchain for managing heterogeneous internet of things: A perspective architecture. IEEE Network, 34(1), 16–23. https://
doi.org/10.1109/MNET.001.1900103.
Umair, K., Asim, M., & Thar, B. (2020). A decentralized lightweight blockchain-based authentication mechanism for IoT systems. Cluster Computing, 23, 2067–2087. https://doi.org/10.1007/s10586-020-03058-6.
Wang, C., Chow, S., & Wang, Q. (2013). Privacy-preserving public auditing for secure cloud storage. IEEE Transactions on Computers, 62(2), 362–375. https://doi.org/
10.1109/TC.2011.245.
Wang, C., Wang, Q., & Ren, K. (2010). Privacy-preserving public auditing for data storage security in cloud computing. Infocom, 2010, 1–9. https://doi.org/10.1109/
INFCOM.2010.5462173.
Wang, H., He, D., & Tang, S. (2016). Identity-based proxy-oriented data uploading and remote data integrity checking in public cloud. IEEE Transactions on Information
Forensics and Security, 11(6), 1165–1176. https://doi.org/10.1109/TIFS.2016.2520886.
Wang, Q., Wang, C., & Li, J. (2009). Enabling public verifiability and data dynamics for storage security in cloud computing. European Symposium on Research in Computer Security. Berlin, Heidelberg: Springer, 355–370.
Wang, Q., Wang, C., & Ren, K. (2011). Enabling public auditability and data dynamics for storage security in cloud computing. IEEE Transactions on Parallel and
Distributed Systems, 22(5), 847–859. https://doi.org/10.1109/TPDS.2010.183.
Yang, K., & Jia, X. (2013). An efficient and secure dynamic auditing protocol for data storage in cloud computing. IEEE Transactions on Parallel and Distributed Systems,
24(9), 1717–1726. https://doi.org/10.1109/TPDS.2012.278.
Yu, Y., Au, M., & Ateniese, G. (2017). Identity-based remote data integrity checking with perfect data privacy preserving for cloud storage. IEEE Transactions on
Information Forensics and Security, 12(4), 767–778. https://doi.org/10.1109/TIFS.2016.2615853.
Yu, Y., Li, Y., & Ni, J. (2016). Comments on "Public integrity auditing for dynamic data sharing with multiuser modification". IEEE Transactions on Information Forensics and Security, 11, 658–659. https://doi.org/10.1109/TIFS.2015.2501728.
Yu, Y., Zhang, Y., & Ni, J. (2015). Remote data possession checking with enhanced security for cloud storage. Future Generation Computer Systems, 52, 77–85. https://doi.org/10.1016/j.future.2014.10.006.
Zhou, Z. (2019). Abductive learning: Towards bridging machine learning and logical reasoning. Science China Information Sciences, 62(7), 220–222. https://doi.org/10.1007/s11432-018-9801-4.
Zhu, Y., Ahn, G., & Hu, H. (2013). Dynamic audit services for outsourced storages in clouds. IEEE Transactions on Services Computing, 6(2), 227–238. https://doi.org/10.1109/TSC.2011.51.
