
Received May 14, 2020, accepted June 1, 2020, date of publication June 9, 2020, date of current version June 22, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.3001007

Secure Secret Sharing With Adaptive Bandwidth in Distributed Cloud Storage Systems
HAILIANG XIONG, (Member, IEEE), CHANGWU HU, (Student Member, IEEE),
YUJUN LI, (Member, IEEE), GUANGYUAN WANG, (Student Member, IEEE),
AND HONGCHAO ZHOU, (Member, IEEE)
School of Information Science and Engineering, Shandong University, Qingdao 266237, China
Corresponding author: Yujun Li (liyujun@sdu.edu.cn)
This work was supported in part by the Key Research and Development Program of Shandong Province under Grant 2017GGX201003,
in part by the National Key Research and Development Program of China under Grant 2018YFC0831006-02, in part by the Natural
Science Foundation of Shandong Province under Grant ZR2019MF038, and in part by the National Natural Science Foundation of China
under Grant 61401253.

ABSTRACT The development of cloud storage technology has brought great convenience: we can
store data on remote servers and access it from any connected device. However, the frequent leakage of
private data has drawn increasing attention to its protection. To address this problem, distributed
storage schemes have been proposed. For security and fault tolerance, many of these schemes adopt
threshold secret sharing techniques, which are widely used in distributed storage systems with disaster
tolerance. Nevertheless, in practical situations, the bandwidth between users and different servers may
be unbalanced or even variable, which leads to low communication efficiency when the original secret
sharing schemes are adopted. To obtain higher communication efficiency under different communication
loads, we propose a novel adaptive bandwidth secret sharing scheme for distributed cloud storage
systems. In addition, to improve the applicability of the adaptive bandwidth scheme, we consider
constructions based on Shamir’s secret sharing scheme and on Staircase codes, respectively. We present a
comparative performance analysis to show the advantages and disadvantages of these two constructions.
In general, compared with the non-adaptive schemes, the proposed adaptive bandwidth scheme can make
full use of unbalanced bandwidth and achieve a higher average communication rate when the upper bound
of bandwidth is large enough.

INDEX TERMS Adaptive bandwidth, communication efficiency, distributed storage, secret sharing, privacy
and security.

The associate editor coordinating the review of this manuscript and approving it for publication was Yunlong Cai.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
108148 VOLUME 8, 2020
H. Xiong et al.: Secure Secret Sharing With Adaptive Bandwidth in Distributed Cloud Storage Systems

I. INTRODUCTION
In recent years, with the rise of cloud computing and software-as-a-service, cloud storage has become a research hotspot in the field of information storage [1]–[3]. Compared with traditional storage devices, cloud storage is more than just a piece of hardware; it is a system consisting of multiple parts such as network devices, storage devices, servers, application software, public access interfaces, access networks, and client programs. Users who need storage services no longer need to set up their own data centers; they simply apply to the supply-side platform for storage services, thus avoiding redundant construction of the storage platform and saving expensive hardware and software infrastructure investment.

However, considering security and usability, many companies and individuals are reluctant to entrust their sensitive data to third-party service providers. Although the providers can guarantee data durability, they still cannot fully guarantee data confidentiality when faced with malicious employees. Recent reports indicate that most cloud services still lack key capabilities to ensure compliance: they lack transparency in government oversight and security mechanisms to protect data [4].

In general, if data is stored in one cloud server, the only way to ensure confidentiality is to encrypt the data on the client, upload it to the cloud, and decrypt it as it is downloaded. Unfortunately, this method requires a large number of keys to be generated and maintained, and the computational overhead is very high [5]–[7]. In addition, this sort of key-based encryption is only computationally secure. It can stop adversaries with current mainstream computing power, but it cannot stop future adversaries with faster computers or even quantum computers. Another, more serious shortcoming of a single-server cloud storage solution is that in the event of a disaster or other disruption [4], the cloud storage service will be completely ineffective and the data may not be recoverable.

To solve the above problems of single-server cloud storage, a new cloud storage solution called distributed cloud [10] has been proposed and has become increasingly popular. Distributed cloud stores data redundantly across multiple independent servers, which allows users to recover data by accessing the working servers when a limited number of servers are unavailable. In addition, research on distributed computing [11], [12] has made the application of distributed cloud more and more extensive. The main data protection technology of this new type of cloud storage scheme is generally secret sharing, a special encoding method that combines the user’s original data with redundant random data to ensure that the original data can only be obtained from a certain number of encoded fragments.

Secret sharing was proposed independently by Shamir [8] and Blakley [9] in 1979, based on Lagrange interpolation and on the properties of points in multi-dimensional space, respectively. In this problem, a secret is shared among n participants such that any k or more participants can jointly reconstruct the secret, while fewer than k participants cannot obtain any information about it. Secret sharing is a cryptographic technique for dividing and storing secrets, as well as an important means of ensuring information security and data privacy. It prevents excessive concentration of secrets, so as to spread risk and tolerate intrusion. The biggest advantage of secret sharing is that it is information-theoretically secure: even an attacker with unlimited computing power cannot obtain any information about the stored data. Compared with traditional encryption, secret sharing does not require any encryption keys, but it incurs a storage overhead several times the size of the original data, additional encoding and decoding complexity, and the need to generate a large amount of random data (these data are used only once after they are generated, and need not be stored and reused like keys). Therefore, it is currently used mainly to remotely store small amounts of data, such as encryption keys [10], [13]. Since this problem was put forward, secret sharing and erasure codes in distributed storage have developed continuously [14]–[27]. Secret sharing has now become a fundamental cryptographic method and is used as a building block in numerous secure protocols, especially in threshold cryptography and secure multi-party computation.

After Shamir used the idea of polynomial interpolation to construct an elegant and efficient perfect threshold scheme, Shamir’s scheme was proved to be closely related to Reed-Solomon codes [27] and was later generalized to MDS (Maximum Distance Separable) array codes [15], [23]. In [15], an explicit construction of the first known systematic (n, k) MDS array code is described, where n − k is equal to some constant, such that the amount of information needed to reconstruct an erased column is equal to 1/(n − k), matching the information-theoretic lower bound. Recently, there has been considerable interest in incorporating secrecy into erasure codes for distributed storage, such as [33]–[36]. These codes can also be considered threshold secret sharing schemes. Huang et al. [28] proposed communication efficient secret sharing based on MDS codes in 2015, and proved that the communication cost of this scheme reaches the lower bound. The key idea of this scheme for achieving optimal communication bandwidth is to let the user receive information from more than the necessary number of parties. On this basis, Bitar and Rouayheb [29] gave a complete construction of a secret sharing scheme named Staircase codes. They also described how Staircase codes can be used to construct threshold changeable secret sharing codes and applied them to secure multi-party computation [30]. Although considerable theoretical effort has focused on reducing the computational complexity of Shamir’s secret sharing scheme while keeping it information-theoretically secure [27], [31], [37], [38], most of these solutions assume balanced bandwidth conditions, and there is little research on unbalanced bandwidth load. Fortunately, recent technological advances have brought new ideas. A new secret sharing algorithm, secure RAID (Redundant Arrays of Independent Drives), was proposed. This algorithm can effectively decode partial data, and its computational overhead is comparable to that of standard erasure coding [31]. More importantly, the addition of RAID technology allows distributed storage to match unbalanced bandwidth during communication [32].

In this paper, we assume that we need to store some important data on untrusted third-party servers. We cannot simply encrypt the data and store it on one server, because data stored on a single server is easily cracked and will probably not be recoverable if something goes wrong. Therefore, we use a distributed storage scheme based on threshold secret sharing to store data on multiple servers that cannot easily collude. Compared with the common multi-copy schemes, secret sharing can significantly save storage costs and provide higher fault tolerance. However, when restoring the secret, the bandwidth available to the user may differ from server to server. If we do not consider these differences, a lot of bandwidth may be wasted and communication efficiency may be low. To improve this situation, we propose an adaptive bandwidth secret sharing scheme based on Shamir’s scheme and Staircase codes respectively. We evaluate and analyze the bandwidth utilization and communication efficiency of the adaptive bandwidth scheme and compare it with the non-adaptive one. The main conclusions can be summarized as follows. (1) No matter which of the two secret sharing schemes is applied, the proposed scheme can achieve the bandwidth self-adaption between the user
and the servers in a distributed storage system and greatly increase the communication rate when the upper bound of bandwidth is large enough. (2) The scheme constructed from Staircase codes can recover the secret more efficiently with lower bandwidth but higher computational overhead. (3) The proposed adaptive scheme has strong scalability, so that we can build it with other secret sharing algorithms to obtain additional features.

The remainder of the paper is organized as follows. Section II introduces Shamir’s scheme and Staircase codes, which are used as components of the adaptive scheme. Section III illustrates the basic idea of the adaptive bandwidth scheme and the generation of the bandwidth matrix, which is the kernel of self-adaption. We describe the construction process of the adaptive bandwidth scheme and compare the non-adaptive scheme with the adaptive one through an example in Section IV. Numerical analysis results are presented in Section V. Finally, we provide the concluding remarks in Section VI.

II. SHAMIR’s SCHEME AND STAIRCASE CODES
A. SHAMIR’s SCHEME
Shamir’s Secret Sharing is an algorithm in cryptography based on Lagrange interpolation. The essential idea of Shamir’s threshold scheme is that 2 points are sufficient to define a line, 3 points are sufficient to define a parabola, 4 points are sufficient to define a cubic curve, and so forth. That is, it takes k points to define a polynomial of degree k − 1. Suppose we want to use an (n, k) threshold scheme to share our secret S, without loss of generality assumed to be an element of a finite field F of size P, where 0 < k ≤ n < P, S < P, and P is a prime number. Choose k − 1 positive integers $a_1, \cdots, a_{k-1}$ at random with $a_i < P$, and let $a_0 = S$. Build the polynomial

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_{k-1} x^{k-1}. \tag{1}$$

Let us construct any n points out of it, for example set i = 1, . . . , n to retrieve (i, f(i)). Every participant is given a point (a non-zero integer input to the polynomial, and the corresponding integer output) along with the prime which defines the finite field to use. Given any subset of k of these pairs, we can use interpolation to find the coefficients of the polynomial. The secret is the constant term $a_0$. However, when given more than k pairs, we still use the data of only k pairs; the extra pairs are wasted.

B. STAIRCASE CODES
Staircase codes are a communication efficient secret sharing scheme constructed by Bitar and Rouayheb [29], building on the work of Huang et al. [28]. The goal of this scheme is to minimize the communication overhead for the user interested in decoding the secret. Compared with Shamir’s scheme, this scheme allows the user to download less data from the servers when more than k servers are available, at the cost of higher computational overhead.

An (n, k, z) Staircase code allows the user to encode the data S into n shares and distribute them to n servers, such that shares from any set of k servers can obtain S and shares from any set of z, z ≤ k ≤ n, servers obtain no information about S. The user will reconstruct S with the data in any d, k ≤ d ≤ n servers, i.e., d denotes the number of servers that are actually used. In this paper, we set z = k − 1, which is the most common case in practice. The construction of Staircase codes proceeds as follows:

Preparation: An (n, k, k − 1) Staircase code requires dividing the data into a matrix S of dimension $h \times b/h$, where $h = n - k + 1$ and $b = \mathrm{LCM}\{2, \cdots, h\}$ (LCM denotes the least common multiple). Let $d_1 = n, d_2 = n - 1, \cdots, d_h = k$ denote the number of servers contacted by the user. Let $b_i \triangleq d_i - k + 1$ for i = 1, . . . , h. The construction uses a random matrix R of dimension $(k - 1) \times 2b/b_{n-k}$ whose elements are drawn independently and uniformly at random from GF(q) to ensure secrecy. Then the random matrix R is partitioned into h matrices $R_i$, i = 1, . . . , h, each of dimension $(k - 1) \times b/(b_i b_{i-1})$ with $b_0 = 1$. The matrix M is the concatenation of h matrices $M_i$, i = 1, . . . , h, shown in Fig. 1. The elements appearing in each matrix $D_i$ are the elements of the (n − i + 1)th row of $[M_1, \ldots, M_i]$ rearranged to the dimension of $b_i \times b/(b_i b_{i+1})$ for i = 1, . . . , h − 1. The 0’s are the all-zero matrices used to complete the $M_i$’s to n rows.

FIGURE 1. The construction of matrix M with dimension $n \times b$. Note that the dimension of S is $h \times b/h$, the dimension of $R_i$ is $(k - 1) \times b/(b_i b_{i-1})$, i = 1, . . . , h, and the dimension of $D_i$ is $b_i \times b/(b_i b_{i+1})$. In addition, the role of the zero matrix in M is to fill M into n rows.

Encoding: Let V be an n × n Vandermonde matrix defined over GF(q). The encoding of Staircase codes consists of multiplying V by M to obtain the matrix C = VM. The n rows of C form the n different shares.

Decoding: The user contacts any $d_i$ servers, i = 1, 2, . . . , h, and downloads the first $b/b_i$ symbols from each server, corresponding to $V[M_1, M_2, \ldots, M_i]$. The user is guaranteed [29] to decode the secret.

Example 1 illustrates the specific process of encoding and decoding for Staircase codes.
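As a quick check on the parameters defined above, the following Python sketch tabulates h, b, $d_i$ and $b_i$ for an (n, k, k − 1) Staircase code, together with the resulting download sizes. It is a minimal illustration of the bookkeeping only, not an implementation of the code itself; the names `lcm_of` and `staircase_parameters` and the dictionary keys are hypothetical choices made here.

```python
from functools import reduce
from math import gcd


def lcm_of(values):
    """Least common multiple of an iterable of positive integers (1 if empty)."""
    return reduce(lambda acc, v: acc * v // gcd(acc, v), values, 1)


def staircase_parameters(n, k):
    """Tabulate the parameters of an (n, k, k-1) Staircase code.

    Returns (h, b, options), where options[i-1] describes decoding from
    d_i = n - i + 1 servers.
    """
    h = n - k + 1
    b = lcm_of(range(2, h + 1))          # b = LCM{2, ..., h}
    options = []
    for i in range(1, h + 1):
        d_i = n - i + 1                  # number of servers contacted
        b_i = d_i - k + 1                # b_i = d_i - k + 1
        per_server = b // b_i            # first b / b_i symbols of each share
        options.append({
            "i": i,
            "d": d_i,
            "b_i": b_i,
            "per_server": per_server,
            "total": d_i * per_server,   # total symbols downloaded
        })
    return h, b, options


if __name__ == "__main__":
    for n, k in [(3, 2), (4, 2)]:
        h, b, options = staircase_parameters(n, k)
        print(f"(n, k) = ({n}, {k}): h = {h}, b = {b}")
        for opt in options:
            print("  ", opt)
```

For n = 3, k = 2 this gives h = 2 and b = 2: contacting all three servers requires one symbol from each, while contacting only two servers requires two symbols from each. The secret has b symbols, so the fewer servers are contacted, the larger the total download.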

Example 1: Consider the secret sharing problem with n = 3, k = 2. We assume now that the secret S is formed of 2 symbols $S_1, S_2$ uniformly distributed over GF(5), and we use two keys $R_1, R_2$ drawn independently and uniformly at random from GF(5). To construct the Staircase code, the secret symbols and keys are arranged in a matrix M as shown in (2). The matrix M is multiplied by a 3 × 3 Vandermonde matrix V to obtain the matrix C = VM. The 3 rows of C form the 3 different shares and give the Staircase code shown in Table 1.

$$C = VM = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 4 \end{bmatrix} \begin{bmatrix} S_1 & R_1 \\ S_2 & R_2 \\ R_1 & 0 \end{bmatrix} \tag{2}$$

TABLE 1. The shares stored by three servers using Staircase codes for n = 3, k = 2. In this example, each share is divided into two sub-shares, giving the user more decoding options. The user can decode $S_1$ and $S_2$ by receiving either the first sub-share of each server (in blue) or two sub-shares from any two servers (in blue and black). In the first case, the communication overhead and read overhead are both 1 symbol. In the second case, the communication overhead and read overhead are both 2 symbols. The operations shown above are in GF(5).

III. ADAPTIVE BANDWIDTH SCHEME
We need to store secret data on n untrusted servers, among which fewer than k servers can collude (or an adversary can compromise fewer than k servers at the same time); hence we adopt secret sharing. Assuming that the communication bandwidth between the user and these n servers may be unbalanced or even changeable, we need to consider how to construct a secret sharing scheme that adapts to the change of bandwidth as much as possible.

A. BASIC IDEA
Our basic idea is to store multiple pieces of data and recover them by using the secret sharing scheme in different ways, which is inspired by the study of RDP (Row Diagonal Parity) code storage systems [32]. In simple terms, each symbol of the secret S is stored on multiple servers using a secret sharing scheme, and the symbols of the secret are recovered in the same order. However, the sub-secrets selected for recovering each symbol are not necessarily the same, but should be adjusted according to the bandwidth condition to achieve bandwidth self-adaption.

Table 2 shows how a secret consisting of 3 symbols is recovered. If the bandwidths from the user to the four servers are 3, 3, 0, 0 symbol/s respectively, we recover the secret as in the first scenario to achieve bandwidth self-adaption; if the bandwidths are 3, 1, 1, 1 symbol/s respectively, we use the second scenario. That is to say, different bandwidth loads can be served by choosing different combinations of sub-secrets. Conversely, when we know the actual bandwidth, we can adapt to it by allocating the combination of sub-secrets. From this instance, we can see that when a secret is divided into multiple chunks, a reasonable analysis of the bandwidth gives the secret sharing scheme the capability of adapting to bandwidth. Note that one chunk in Shamir’s scheme contains 1 symbol of data, while one chunk in Staircase codes contains $b = \mathrm{LCM}\{2, \ldots, h\}$ symbols.

TABLE 2. Three Shamir’s secret sharing projects for n = 4, k = 2. Each project stores a secret of one symbol in 4 servers. When we want to recover one secret, we need to download one symbol of data from any 2 of the 4 servers respectively. If we want to recover three secrets, in the first scenario, 6 symbols are downloaded from servers A and B, while no data is downloaded from servers C and D, that is, we download the data underlined in the table. In the second scenario, 3 symbols are downloaded from server A, while the remaining 3 symbols are downloaded respectively from servers B, C, and D, that is, we download the data indicated in blue font in the table.

B. GENERATION OF BANDWIDTH MATRIX
After getting the bandwidth from all servers to the user, we originally intended to calculate and save all possible collocation schemes and results in a way similar to a fingerprint database, and then match the bandwidth situation against the database during the adaptive bandwidth process and calculate the recovery scheme for each sub-secret. However, this operation is not really necessary. We only need to decompose the bandwidth into several vectors, each of which denotes the bandwidth used for recovering one chunk, and use the bandwidth matrix to represent these vectors. Each column of the bandwidth matrix corresponds to a server, and each row represents the amount of data downloaded from each server to recover a secret chunk. We introduce the generation of the bandwidth matrix as follows:

1) BANDWIDTH MATRIX OF SHAMIR’s SCHEME
In order to obtain the combination of sub-secrets, we let the matrix $M_B$ of dimension $m \times n$, constructed from 0’s and 1’s, denote the matrix of bandwidth, where each row contains k 1’s, and the sum of each column equals the bandwidth from the corresponding server to the user. For
(n, k) secret sharing scheme, we assume the bandwidth from the servers to the user is $B_0 = [X_{01}, X_{02}, \ldots, X_{0n}]$, where $0 \le X_{0i} \le x$ and x denotes the upper bound of bandwidth.

For (n, k) Shamir’s scheme, we generate the bandwidth matrix as follows:

$$\begin{aligned}
[X_{01}, X_{02}, \ldots, X_{0n}] - M_{B_1} &= B_1 \\
[X_{11}, X_{12}, \ldots, X_{1n}] - M_{B_2} &= B_2 \\
&\;\;\vdots \\
[X_{(m-1)1}, X_{(m-1)2}, \ldots, X_{(m-1)n}] - M_{B_m} &= B_m.
\end{aligned} \tag{3}$$

This process continues until the number of 0’s in $B_m = [X_{m1}, X_{m2}, \ldots, X_{mn}]$ is more than n − k. Here $M_{B_j}$, $1 \le j \le m$, is the jth row of $M_B$, and its 1’s mark the positions of the k largest entries of $[X_{(j-1)1}, X_{(j-1)2}, \ldots, X_{(j-1)n}]$.

The following example illustrates the generation of the bandwidth matrix $M_B$ for Shamir’s scheme.

Example 2: For a (4, 2) Shamir’s scheme, we assume that the bandwidths from the servers to the user are 2, 3, 4, 1 symbol/s, i.e., $B_0 = [X_{01}, X_{02}, \ldots, X_{0n}] = [2, 3, 4, 1]$. Then we can order the entries from large to small and obtain the positions of the k largest entries of $B_0$, i.e., $M_{B_1} = [0, 1, 1, 0]$. We obtain $M_{B_1}, M_{B_2}, \ldots, M_{B_m}$ in the same way, as shown in Eq. (4).

$$\begin{aligned}
[2, 3, 4, 1] - [0, 1, 1, 0] &= [2, 2, 3, 1] \\
[2, 2, 3, 1] - [0, 1, 1, 0] &= [2, 1, 2, 1] \\
[2, 1, 2, 1] - [1, 0, 1, 0] &= [1, 1, 1, 1] \\
[1, 1, 1, 1] - [1, 1, 0, 0] &= [0, 0, 1, 1] \\
[0, 0, 1, 1] - [0, 0, 1, 1] &= [0, 0, 0, 0]
\end{aligned} \tag{4}$$

Finally, we can combine $M_{B_1}, M_{B_2}, \ldots, M_{B_m}$ to get

$$M_B = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \tag{5}$$

i.e., the matrix of bandwidth.

2) BANDWIDTH MATRIX OF STAIRCASE CODES
Let $M_B$ of dimension $m \times n$ denote the matrix of bandwidth, where each row is one of the vectors $V_i = [b/b_i, b/b_i, \cdots, b/b_i, 0, \cdots, 0]$, i = 1, · · · , n − k + 1, in which $b/b_i$ denotes the number of symbols that need to be downloaded from each server for recovering one secret chunk, and i denotes the priority of $V_i$: the smaller i is, the higher the priority. The number of entries equal to $b/b_i$ is $b_i + k - 1$, while the number of 0’s is $n - (b_i + k - 1)$. The order of the $b/b_i$ and 0 entries in $V_i$ is arbitrary; for instance, $[0, b/b_i, 0, b/b_i, \cdots]$ is also feasible.

For an (n, k) Staircase code, we generate the bandwidth matrix by Eq. (6) until there is no suitable vector to split for $B_m$. Note that the $M_{B_j}$’s are replaced by $V_i$’s here.

$$\begin{aligned}
[X_{01}, X_{02}, \ldots, X_{0n}] - V_1 &= B_1 \\
[X_{11}, X_{12}, \ldots, X_{1n}] - V_2 &= B_2 \\
&\;\;\vdots \\
[X_{(m-1)1}, X_{(m-1)2}, \ldots, X_{(m-1)n}] - V_m &= B_m.
\end{aligned} \tag{6}$$

Similarly, we show an example of the generation of the bandwidth matrix for Staircase codes.

Example 3: For a (4, 2) Staircase code, assuming that the bandwidths from the servers to the user are 11, 5, 11, 2 symbol/s, i.e., $B_0 = [11, 5, 11, 2]$, we can decompose $B_0$ into $M_{B_1}, M_{B_2}, \ldots, M_{B_m}$ through the following steps:

$$\begin{aligned}
[11, 5, 11, 2] - [2, 2, 2, 2] &= [9, 3, 9, 0] \\
[9, 3, 9, 0] - [3, 3, 3, 0] &= [6, 0, 6, 0] \\
[6, 0, 6, 0] - [6, 0, 6, 0] &= [0, 0, 0, 0].
\end{aligned} \tag{7}$$

Similarly, we can combine $M_{B_1}, M_{B_2}, \ldots, M_{B_m}$ into

$$M_B = \begin{bmatrix} 2 & 2 & 2 & 2 \\ 3 & 3 & 3 & 0 \\ 6 & 0 & 6 & 0 \end{bmatrix}. \tag{8}$$

In this process, the order of decomposition depends on the priority of $V_i$: the $V_i$ with higher priority is used first.

It should be pointed out that since the generation process of the bandwidth matrix only involves one loop, its time complexity is O(n). Compared with the time complexity of recovering the secret in the non-adaptive schemes, which is $O(k^2)$ for non-adaptive Shamir’s scheme (Lagrange interpolation) and $O(n^3)$ for non-adaptive Staircase codes (dominated by matrix inversion), the time complexity of constructing the bandwidth matrix is trivial.

IV. CONSTRUCTION OF ADAPTIVE SCHEME
In this section, we present the proposed adaptive bandwidth secret sharing scheme via Shamir’s scheme and Staircase codes respectively. The construction of the proposed adaptive scheme includes three steps:

A. SECRET DISTRIBUTION
The secret is divided into multiple chunks (each chunk S consists of one symbol for Shamir’s scheme, while each chunk consists of $h \times b/h = b$ symbols for Staircase codes), and each chunk is encoded into n shares via the secret sharing algorithm and distributed to n servers. All secret chunks are stored on these n servers in this way, in sequence. We call the n shares of a certain chunk on the servers the same "row", so the number of "rows" of server data represents the number of stored secret chunks, which means that one can recover the secret S of a chunk by acquiring any k shares in one certain "row" on the servers.
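For Shamir’s scheme, the distribution step can be sketched as follows. This minimal Python sketch is an illustration under stated assumptions and not the authors’ implementation: it encodes a one-symbol chunk into one "row" of n shares with the polynomial of Eq. (1) and recovers the chunk from any k shares of that row by Lagrange interpolation at x = 0. The prime `P = 257` and the function names `make_row` and `recover_chunk` are hypothetical.

```python
import secrets

P = 257  # hypothetical prime; must exceed both n and every secret symbol


def make_row(secret, n, k, p=P):
    """Encode one secret symbol into a 'row' of n Shamir shares (i, f(i))."""
    # a_0 is the secret; a_1..a_{k-1} are random positive integers below p.
    coeffs = [secret % p] + [secrets.randbelow(p - 1) + 1 for _ in range(k - 1)]

    def f(x):
        acc = 0
        for a in reversed(coeffs):  # Horner evaluation modulo p
            acc = (acc * x + a) % p
        return acc

    return [(i, f(i)) for i in range(1, n + 1)]


def recover_chunk(shares, p=P):
    """Recover f(0) from k shares with distinct x by Lagrange interpolation."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * (-xm)) % p
                den = (den * (xj - xm)) % p
        # Modular inverse via Fermat's little theorem (p is prime).
        secret = (secret + yj * num * pow(den, p - 2, p)) % p
    return secret


if __name__ == "__main__":
    chunks = [5, 17, 200]                       # a three-symbol secret
    rows = [make_row(s, n=4, k=2) for s in chunks]
    # Each chunk may be recovered from a different pair of servers.
    print([recover_chunk([row[0], row[3]]) for row in rows])
```

Because any k shares of a row suffice, the recovery step is free to pick a different subset of servers for each row, which is the freedom the bandwidth matrix exploits.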

B. BANDWIDTH ANALYSIS
Measure the communication bandwidth from each server to the user when restoring the secret, and then decompose the bandwidth into a bandwidth matrix by using the method shown in Eq. (3) or Eq. (6).

C. SECRET RECOVERY
Each row of the bandwidth matrix denotes the distribution over the servers of the shares required to recover the secret of one chunk. Retrieve the data from the servers according to the bandwidth matrix and transmit it to the user, then restore all secret symbols by restoring one chunk per row. Eventually we obtain all the secrets that can be recovered in one second.

To better illustrate the process of the adaptive bandwidth secret sharing schemes, we present an example. In addition, we compare the adaptive bandwidth scheme with the non-adaptive scheme in the example to demonstrate the advantages of the adaptive scheme. Two important performance metrics, bandwidth utilization and communication rate, are involved in this example. Communication rate refers to the number of symbols that the user recovers from the servers per second during the communication phase, which intuitively reflects the performance of secret sharing schemes.

Example 4: Consider a (3, 2) Shamir’s secret sharing scheme and assume that the bandwidths from the 3 servers to the user are 2, 3, 5 symbol/s respectively.

FIGURE 2. The process demonstration based on Shamir’s scheme.

The two panels of Fig. 2 demonstrate the non-adaptive and adaptive bandwidth schemes based on Shamir’s scheme for n = 3, k = 2. If we adopt the original non-adaptive bandwidth scheme, we use the data on the k = 2 servers with the fastest bandwidth, i.e., the blue portion of server B and server C in Fig. 2(a). Thus, the bandwidth utilization of the recovering process is 6/10 = 60%, while the communication rate is 3 symbol/s (data from the same row of servers can recover one symbol). If we adopt the adaptive bandwidth scheme, we can analyze the bandwidth and decompose it into

$$M_B = \begin{bmatrix} 0 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} \tag{9}$$

in which the sums of the columns equal 2, 3, 5 respectively and the sum of each row is k = 2. Then we take the data from the servers in the form of matrix (9) and transfer it to the user (the blue portion in Fig. 2(b)). Finally, the user fills the received data into matrix (9) and recovers the secret by Shamir’s secret sharing method. Thus, the bandwidth utilization of the recovering process is 100%, while the communication rate is 5 symbol/s (the number of rows of the matrix). This comparison shows that the adaptive bandwidth scheme achieves a higher communication rate than the original scheme for n = 3, k = 2 when the bandwidths are 2, 3, 5 symbol/s.

Example 5: Consider a (4, 2) Staircase code and assume that the bandwidths from the 4 servers to the user are 13, 7, 13, 4 symbol/s respectively.

If we adopt the original non-adaptive Staircase codes, we can only use the first $b/b_1 = 2$ symbols on all n = 4 servers twice, because this is the bandwidth limit of the fourth server. Then the bandwidth utilization of the recovering process is 16/37 = 43.2%, while the communication rate is $b \times 2 = 12$ symbol/s. If we adopt the adaptive bandwidth Staircase codes, we can obtain the bandwidth matrix

$$M_B = \begin{bmatrix} 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 \\ 3 & 3 & 3 & 0 \\ 6 & 0 & 6 & 0 \end{bmatrix} \tag{10}$$

as in Example 3. Similarly, the bandwidth utilization of the recovering process is 100%, while the communication rate is $b \times 4 = 24$ symbol/s.

V. NUMERICAL ANALYSIS
In this section, we analyze the bandwidth utilization and communication rate of non-adaptive Shamir’s scheme (SS), non-adaptive Staircase codes (SC), adaptive Shamir’s scheme and adaptive Staircase codes to show the performance differences in different situations.

A. CONSTRUCTION FROM SHAMIR’s SCHEME
To demonstrate the effectiveness of the adaptive secret sharing scheme, we calculate the average bandwidth utilization and communication rates of the original non-adaptive Shamir’s schemes and the adaptive Shamir’s schemes for (8, k) secret sharing. We assume that the bandwidth from each server to the user takes an arbitrary value from 0 to 10 symbol/s. The average bandwidth utilization $U_b$ can be expressed as

$$U_b = \frac{\sum_{i=1}^{(x+1)^n} \dfrac{B_{\mathrm{use}_i}}{B_i}}{(x+1)^n}, \tag{11}$$
where $B$ represents the physical bandwidth of the communication line, $B_{\mathrm{use}}$ represents the bandwidth which is actually used, and x represents the upper bound of the bandwidth of a single server. The average communication rate $v$ can be expressed as

$$v = \frac{\sum_{i=1}^{(x+1)^n} \mathrm{rows}(M_{B_i})}{(x+1)^n}, \tag{12}$$

where $\mathrm{rows}(M_B)$ represents the number of rows of the bandwidth matrix.

FIGURE 3. The average bandwidth utilization and average communication rates of adaptive and non-adaptive Shamir’s schemes for n = 8.

Fig. 3(a) compares the average bandwidth utilization of the original non-adaptive scheme with that of the adaptive scheme for (8, k) Shamir’s scheme. In this scheme one needs to store secret data on 8 different servers redundantly. Similarly, Fig. 3(b) compares the average communication rates under the same conditions. The bandwidth utilization and communication rate of the adaptive Shamir’s scheme are both significantly higher than those of the non-adaptive one. Moreover, when the number of servers n is fixed, the smaller k is, the greater the difference in communication rates.

B. CONSTRUCTION FROM STAIRCASE CODES
Staircase codes can obtain optimal communication and read costs; hence we try to replace Shamir’s scheme with Staircase codes in our construction. In order to highlight the difference, we compare the non-adaptive SS, the non-adaptive SC, the adaptive SS and the adaptive SC together. In the comparison, we find that the performance of adaptive Staircase codes is not good when the upper bound of bandwidth is low. Therefore, we compare the bandwidth utilization and communication rates under a higher bandwidth bound, which is more consistent with the practical situation.

FIGURE 4. The bandwidth utilization of non-adaptive scheme and adaptive scheme using Shamir’s scheme or Staircase codes for n = 8, k = 4. The x here represents the upper bound of bandwidth of every server to the user. For example, when x = 10, each bandwidth is limited to 0, 1, . . . , 10 symbol/s.

Fig. 4(a) compares the average bandwidth utilization of the non-adaptive and adaptive schemes constructed from Shamir’s scheme and Staircase codes for (8, 4) secret sharing, while Fig. 4(b) compares the average communication rates in the same case. We can observe that under the same bandwidth, after applying Staircase codes instead of Shamir’s scheme, both non-adaptive scheme and adaptive

scheme will lead to a decrease in bandwidth utilization. However, when the upper bound of the bandwidth is large enough, the communication rate of the adaptive scheme using SC increases instead, which is an encouraging result. It shows that, compared with SS, SC benefits more from the adaptive scheme: the adaptive SC improves the average communication rate by up to 216.61% when x = 100. This means that switching to Staircase codes in the proposed adaptive scheme can achieve a higher communication rate with less bandwidth cost.

According to the trend of the line graphs in Fig. 4, we can foresee that the proposed adaptive scheme can achieve better results when the upper bound of the bandwidth is larger. To verify this conjecture, we compared the bandwidth utilization and communication rates of the above four schemes under different values of n and k when x = 1000 and k = 0.5n.

FIGURE 5. The average bandwidth utilization and average communication rates of the adaptive and non-adaptive schemes constructed from Shamir’s scheme or Staircase codes for x = 1000.

Fig. 5 shows that when the upper bound of the bandwidth is large enough, the problem of poor self-adaptability of the SC scheme at low bandwidth no longer exists, and the communication rate improvement of the SC scheme after adopting the adaptive algorithm is much higher than that of the SS scheme, which means that the proposed adaptive scheme improves the SC scheme more markedly. In general, the adaptive bandwidth secret sharing schemes constructed from Shamir’s scheme and Staircase codes have their own features: Shamir’s scheme can adapt to the bandwidth with a small calculation cost, while Staircase codes can recover data more efficiently with less bandwidth usage but at the expense of higher computation complexity.

VI. CONCLUSION
In this paper, we investigate the problem of adaptive bandwidth of secret sharing in distributed cloud storage systems. We propose a novel adaptive bandwidth scheme which can be constructed from Shamir’s secret sharing scheme or Staircase codes. No matter which secret sharing scheme is adopted, the proposed scheme can achieve bandwidth self-adaption between the user and the servers in a distributed storage system and greatly increase the communication rate when the upper bound of the bandwidth is large enough. In addition, the simulation results show that the adaptive Staircase codes can obtain a higher communication rate and a lower bandwidth cost than the adaptive Shamir’s scheme at the same time. In other words, if the bandwidth condition is the same and the upper bound is large enough, the scheme constructed from Staircase codes can recover the secret more efficiently with lower bandwidth but higher computational overhead. We note that the adaptive bandwidth scheme requires calculating the bandwidth matrix and changing the combination of the downloaded data when the bandwidth changes, which brings a small extra computational overhead. This overhead is insignificant compared with the improvement in communication efficiency, since the time complexity of generating the bandwidth matrix is approximately O(n). Furthermore, we speculate that the features of the adaptive bandwidth scheme are influenced by the secret sharing algorithm from which the scheme is constructed. Thus, we can construct the adaptive bandwidth scheme from other secret sharing algorithms and might obtain other excellent features.
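As a rough illustration of how the averages in (11) and (12) can be evaluated, the Python sketch below enumerates all (x + 1)^n bandwidth vectors for small n and x. It is not the authors' simulation code. The two rate models (`non_adaptive_rate`, `adaptive_rate`), the helper `averages`, and the bandwidths (4, 3, 3) are assumptions made for illustration. In particular, the adaptive rate stands in for rows(M_B) under the assumption that every recovered symbol uses one share from each of k distinct servers, and that a server with bandwidth b can contribute to at most b rows per second.

```python
from itertools import product


def non_adaptive_rate(bandwidths, k):
    """Assumed baseline: every symbol needs one share from each of the same
    k servers, so the rate is capped by the k-th largest bandwidth."""
    return sorted(bandwidths, reverse=True)[k - 1]


def adaptive_rate(bandwidths, k):
    """Assumed adaptive model: each row uses k distinct servers and server j
    appears in at most bandwidths[j] rows. r rows are feasible exactly when
    sum(min(b, r)) >= k * r, so return the largest such r."""
    r = 0
    while sum(min(b, r + 1) for b in bandwidths) >= k * (r + 1):
        r += 1
    return r


def averages(n, k, x, rate_fn):
    """Average utilization (Eq. 11) and average rate (Eq. 12) over all
    (x + 1)^n bandwidth vectors with entries in 0..x."""
    util_sum = 0.0
    rate_sum = 0
    count = 0
    for bw in product(range(x + 1), repeat=n):
        rate = rate_fn(bw, k)
        total = sum(bw)
        # Assumption: the all-zero vector contributes zero utilization.
        if total > 0:
            util_sum += k * rate / total  # k shares are used per symbol
        rate_sum += rate
        count += 1
    return util_sum / count, rate_sum / count


if __name__ == "__main__":
    bw = (4, 3, 3)  # hypothetical bandwidths in symbol/s
    print(non_adaptive_rate(bw, 2), adaptive_rate(bw, 2))
    print(averages(4, 2, 3, non_adaptive_rate))
    print(averages(4, 2, 3, adaptive_rate))
```

With the hypothetical bandwidths (4, 3, 3) and k = 2, the baseline model gives 3 symbol/s and uses 6 of the 10 available symbol/s, which matches the 6/10 = 60% figure quoted for the non-adaptive example, while the assumed adaptive model reaches 5 symbol/s. Exhaustive enumeration is only practical for small n and x, because the number of bandwidth vectors grows as (x + 1)^n.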


[8] A. Shamir, ‘‘How to share a secret,’’ Commun. ACM, vol. 22, no. 11, [33] S. Pawar, S. El Rouayheb, and K. Ramchandran, ‘‘Securing dynamic
pp. 612–613, Nov. 1979. distributed storage systems against eavesdropping and adversarial attacks,’’
[9] G. R. Blakley, ‘‘Safeguarding cryptographic keys,’’ in Proc. Int. Workshop IEEE Trans. Inf. Theory, vol. 57, no. 10, pp. 6734–6753, Oct. 2011.
Manag. Requirements Knowl. (MARK), Jun. 1979, pp. 313–318. [34] A. S. Rawat, O. O. Koyluoglu, N. Silberstein, and S. Vishwanath, ‘‘Optimal
[10] A. Bessani, M. Correia, B. Quaresma, F. André, and P. Sousa, ‘‘DepSky: locally repairable and secure codes for distributed storage systems,’’ IEEE
Dependable and secure storage in a cloud-of-clouds,’’ ACM Trans. Storage, Trans. Inf. Theory, vol. 60, no. 1, pp. 212–236, Jan. 2014.
vol. 9, no. 4, pp. 31–46, Nov. 2013. [35] C. Huang, H. Simitci, Y. Xu, A. Ogus, B. Calder, P. Gopalan, J. Li, and
[11] Y. Hua, F. Chen, S. Deng, S. Duan, and L. Wang, ‘‘Secure distributed esti- S. Yekhanin, ‘‘Erasure coding in windows azure storage,’’ in Proc. USENIX
mation against false data injection attack,’’ Inf. Sci., vol. 515, pp. 248–262, Annu. Tech. Conf., Jun. 2012, pp. 15–26.
Apr. 2020. [36] D. Burihabwa, P. Felber, H. Mercier, and V. Schiavoni, ‘‘A performance
[12] F. Chen, T. Shi, S. Duan, L. Wang, and J. Wu, ‘‘Diffusion least logarithmic evaluation of erasure coding libraries for cloud-based data stores,’’ in
absolute difference algorithm for distributed estimation,’’ Signal Process., Proc. IFIP Int. Conf. Distrib. Appl. Interoper. Syst., Berlin, Heidelberg,
vol. 142, pp. 423–430, Jan. 2018. Jun. 2016, pp. 160–173.
[13] A. Bessani, R. Mendes, T. Oliveira, N. Neves, M. Correia, M. Pasin, [37] C. Lv, X. Jia, L. Tian, J. Jing, and M. Sun, ‘‘Efficient ideal threshold secret
and P. Verissimo, ‘‘SCFS: A shared cloud-backed file system,’’ in Proc. sharing schemes based on EXCLUSIVE-OR operations,’’ in Proc. 4th Int.
USENIX Annu. Tech. Conf., Jun. 2014, pp. 169–180. Conf. Netw. Syst. Secur., Sep. 2010, pp. 136–143.
[14] A. Beimel, ‘‘Secret-sharing schemes: A survey,’’ in Proc. 3rd Int. Workshop [38] Y. Wang, ‘‘Privacy-preserving data storage in cloud using array BP-XOR
Coding Cryptol. (IWCC), Qingdao, China, May 2011, pp. 11–46. codes,’’ IEEE Trans. Cloud Comput., vol. 3, no. 4, pp. 425–435, Oct. 2015.
[15] I. Tamo, Z. Wang, and J. Bruck, ‘‘Zigzag codes: MDS array codes
with optimal rebuilding,’’ IEEE Trans. Inf. Theory, vol. 59, no. 3,
pp. 1597–1616, Mar. 2013.
[16] A. G. Dimakis, K. Ramchandran, Y. Wu, and C. Suh, ‘‘A survey on network
codes for distributed storage,’’ Proc. IEEE, vol. 99, no. 3, pp. 476–489,
Mar. 2011.
HAILIANG XIONG (Member, IEEE) received the B.Sc. and Ph.D. degrees in communication and information systems from Xidian University, Xi’an, China, in 2005 and 2011, respectively. From 2009 to 2011, he was a Visiting Scholar with The University of Sheffield, U.K., and the University of Bedfordshire, U.K. From 2014 to 2015, he held a postdoctoral position with The University of British Columbia, Canada. He is currently an Associate Professor with the School of Information Science and Engineering, Shandong University, Qingdao, China. His research interests include privacy and security, artificial intelligence, navigation and positioning, cognitive radio networks, interference suppression, electronic warfare, and satellite communications.

CHANGWU HU (Student Member, IEEE) received the B.Sc. degree from the School of Information Science and Engineering, Shandong University, Jinan, China, in 2017. He is currently pursuing the M.S. degree with the School of Information Science and Engineering, Shandong University, Qingdao, China. His research interests include the Internet of Things, wireless sensor networks, distributed storage, information theory, cryptography, and information security.

YUJUN LI (Member, IEEE) received the Ph.D. degree from the Harbin Institute of Technology, Harbin, China, in 2001. Since 2005, he has been a Deputy Director with the State Key Laboratory of Digital Multimedia Technology and the Research and Development Center, Hisense Group Company Ltd. He is currently a Full Professor with the Department of Information Science and Engineering, Shandong University, Qingdao, China. His research interests include deep learning, natural language processing, big data, artificial intelligence algorithm, and sentiment analysis.


GUANGYUAN WANG (Student Member, IEEE) received the B.Sc. degree from the School of Electrical Engineering and Information, Dalian Jiaotong University, Dalian, China, in 2017. He is currently pursuing the M.S. degree with the School of Information Science and Engineering, Shandong University, Qingdao, China. His research interests include the Internet of Things, wireless sensor networks, physical layer security, cooperative communication, distributed cloud storage, electronic warfare, and information theory.

HONGCHAO ZHOU (Member, IEEE) received the B.Sc. degree in physics and mathematics, and the M.Sc. degree in control science and engineering from Tsinghua University, Beijing, China, in 2006 and 2008, respectively, and the M.Sc. and Ph.D. degrees in electrical engineering from the California Institute of Technology, Pasadena, CA, USA, in 2009 and 2012, respectively. He is currently a Postdoctoral Researcher with the Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, USA. He is also a Full Professor with the Department of Information Science and Engineering, Shandong University, Qingdao, China. His current research interests include information theory and randomness, data storage, information security, and stochastic biological networks. He was a recipient of the 2013 Charles Wilts Prize for the Best Doctoral Thesis in electrical engineering from the California Institute of Technology.
