You are on page 1of 12

This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
1

P-MOD: Secure Privilege-Based Multilevel


Organizational Data-Sharing in Cloud Computing
Ehab Zaghloul, Kai Zhou and Jian Ren

Abstract—Cloud computing has changed the way enterprises data. Data owners are becoming more interested in selectively
store, access and share data. Big data sets are constantly sharing information with data users based on different levels
being uploaded to the cloud and shared within a hierarchy of granted privileges. The desire to grant level-based access
of many different individuals with different access privileges.
With more data storage needs turning over to the cloud, finding results in higher computational complexity and complicates
a secure and efficient data access structure has become a the methods in which data is shared on the cloud. Research
major research issue. In this paper, a Privilege-based Multilevel in this field focuses on finding enhanced schemes that can
Organizational Data-sharing scheme (P-MOD) is proposed that securely, efficiently and intelligently share data on the cloud
incorporates a privilege-based access structure into an attribute- among users according to granted access levels.
based encryption mechanism to handle the management and
sharing of big data sets. Our proposed privilege-based access Based on a study conducted by the National Institute of
structure helps reduce the complexity of defining hierarchies as Standards and Technology (NIST), Role-Based Access Con-
the number of users grows, which makes managing healthcare trol (RBAC) models are the most widely used to share data
records using mobile healthcare devices feasible. It can also facil- in hierarchical enterprises of 500 or more individuals [4].
itate organizations in applying big data analytics to understand RBAC models aim to restrict system access to authorized
populations in a holistic way. Security analysis shows that P-MOD
is secure against adaptively chosen plaintext attack assuming users as they provide access control mechanisms. The access
the DBDH assumption holds. The comprehensive performance control mechanisms are based on predefined and fixed roles
and simulation analyses using the real U.S. Census Income data making the models identity-centric. Each individual within the
set demonstrate that P-MOD is more efficient in computational organization is assigned to a role that defines the privileges of
complexity and storage space than the existing schemes. the user. However, the limitations of this model are evident
Index Terms—Cloud computing, big data, hierarchy, privilege- when presented with a large complex matrix of data users in
based access, sensitive data, attribute-based encryption, mobile an organization. The foundation of RBAC is based on abstract
healthcare. choices for roles. This would require a continuously increasing
number of RBAC roles to properly encapsulate the privileges
I. I NTRODUCTION assigned to each user of the system. Managing a substantial
number of rules can become a resource-intensive task, referred
It was estimated that data breaches cost the United to as role explosion [5].
States’ healthcare industry approximately $6.2 billion in 2016 To better comprehend the importance of this study, con-
alone [1]. To mitigate financial loss and implications on the sider a scenario where patients share their Public Health
reputation associated with data breaches, large multilevel orga- Records (PHR) on the cloud to be accessed by health providers
nizations, such as healthcare networks, government agencies, and administrators of a hospital. In most cases, the patient
banking institutions, commercial enterprises and etc., began wishes to grant the physician access to most parts of the PHR
allocating resources into data security research to develop and (including its most sensitive parts, e.g. medical history) while
improve accessibility and storage of highly sensitive data. granting an administrator access to limited parts that are less
One major way that large enterprises are adapting to in- sensitive (e.g. date of birth). In order to achieve that, the patient
creased sensitive data management is the utilization of the needs to define a hierarchy of data access privileges ranking
cloud environment. It was reported that more than half of all various types of hospital employees. Next, the patient needs
U.S. businesses have turned over to the cloud for their business to clarify the privileges at each level to define the content
data management needs [2]. The on-demand cloud access and that can be accessed by each data user. It is important to
data sharing can greatly reduce data management cost, storage realize that patients have different conservative beliefs when it
flexibility, and capacity [3]. However, data owners have deep comes to their PHRs. For example, some would prefer granting
concerns when sharing data on the cloud due to security issues. access to the most sensitive parts of their PHRs to only specific
Once uploaded and shared, the data owner inevitably loses physicians while denying others.
control over the data, opening the door to unauthorized data In this paper, a Privilege-based Multilevel Organizational
access. Data-sharing scheme (P-MOD) is proposed. It builds on
A critical issue for data owners is how to efficiently and concepts presented in [6] to solve the problems of sharing
securely grant privilege level-based access rights to a set of data within organizations with complex hierarchies. The main
contributions presented in this paper can be summarized as
The authors are with the Department of Electrical and Computer Engineer-
ing, Michigan State University, East Lansing, MI 48824-1226, Email: {ebz, follows:
zhoukai, renjian}@msu.edu • We present multiple data file partitioning techniques and

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
2

propose a privilege-based access structure that facilitate set of descriptive attributes, while each private key is integrated
data sharing in hierarchical settings. with an access policy. For authorized data users to decrypt the
• We formally prove the security of P-MOD and show that ciphertext, they must first obtain a private key from the key-
it is secure against adaptively chosen plaintext attacks issuer to use in decryption. The key-issuer integrates the access
under the Decisional Bilinear Diffie-Hellman (DBDH) policy into the keys generated. Data users can successfully
assumption. decrypt a ciphertext if the set of descriptive attributes associ-
• We present present a performance analysis for P-MOD ated with the ciphertext satisfies the access policy integrated
and compare it to three existing schemes [7]–[9] that aim within their private keys. KP-ABE can achieve fine-grained
to achieve similar hierarchical goals. access control and is more flexible than Fuzzy IBE. However,
• We implement P-MOD and conduct comprehensive simu- the data owner must trust the key-issuer to only issue private
lations under various scenarios using the real U.S. Census keys to data users granted the privilege of access. This is a
Income data set [10]. We also compare our results to limitation since the data owner ultimately forfeits control over
simulations we have conducted for two other schemes [7], which data users are granted access.
[9] under the same conditions. On the other hand, CP-ABE is considered to be conceptually
The rest of this paper is organized as follows. In Section II, similar to Role-Based Access Control (RBAC) [14]. It gives
the related work is reviewed. In Section III, preliminaries are the data owner control over which data user is able to
introduced that summarize key concepts used in this research. decrypt certain ciphertexts. This is due to the access structure
Next, in Section IV, the problem formulation is described being integrated by the data owner into the ciphertext during
outlining the design goals and system model. In Section V, encryption. It allows the private key generated by the key-
the proposed P-MOD scheme is presented in detail. Following issuer to only contain the set of attributes possessed by the data
that, in Section VI we formally prove the security of P- user. Some CP-ABE schemes [15]–[19] were later introduced
MOD based on the hardness of the Decisional Bilinear Diffie- that can provide higher flexibility and better efficiency.
Hellman (DBDH) problem. In Section VII, a performance Most attribute-based encryption schemes such as Fuzzy
analysis of P-MOD is conducted. Finally, in Section IX, a IBE, KP-ABE, and CP-ABE serve as a better solution when
conclusion is drawn to summarize the work done in this data users are not ranked into a hierarchy and each is inde-
research. pendent of one another (i.e. no relationships). However, they
share a common limitation of high computational complexity
in the case of large multilevel organizations. These schemes
II. R ELATED W ORK
require a single data file to be encrypted with a large number
Fuzzy Identity-Base Encryption (Fuzzy IBE) was introduced of attributes (from different levels) to grant them access to it.
in [11] to handle data sharing on the cloud in a flexible Hierarchical Attribute-Based Encryption (HABE) that com-
approach using encryption. The ciphertext is shared on the bines the Hierarchical Identity-Based Encryption (HIBE) [20]
cloud to restrict access to authorized users. In order for an scheme and CP-ABE [7] was later introduced in [8], [21].
authorized individual to obtain the data, the user must request HABE is able to achieve fine-grained access control in a
a private key from a key-issuer to decrypt the encrypted data. hierarchical organization. It consists of a root master that
Fuzzy IBE is a specific type of function encryption [12] in generates and distributes parameters and keys, multiple domain
which both the private key of the data user and ciphertext are masters that delegate keys to domain masters at the following
affiliated with attributes. Attributes are descriptive pieces of levels, and numerous users. In this scheme, keys are generated
information that can be assigned to any user or object. Since in the same hierarchical key generation approach as the HIBE
attributes can be any variable, they provide more flexibility scheme. To express an access policy, HABE uses a disjunctive
when granting data access. The scheme enables a set of normal form where all attributes are administered from the
descriptive attributes to be associated with a private key and same domain authority into one conjunctive clause. This
the ciphertext shared on the cloud. If the private key of the scheme becomes unsuitable for practical implementation when
data user incorporates the minimum threshold requirement of replicas of the same attributes are administered by other do-
attributes that match those integrated within the ciphertext, the main authorities. Synchronizing attribute administration might
data user can decrypt it. Although this scheme allows complex become a challenging issue with complex organizations that
systems to be easily defined using attributes, it becomes less have multiple domain authorities. Examples of other hierar-
efficient when used to express large systems or when the chical schemes were later introduced in [22]–[25].
number of attributes increases. File Hierarchy Ciphertext Policy Attribute-Based Encryp-
Attribute-Based Encryption (ABE) schemes later emerged tion (FH-CP-ABE) [9] is one of the most recent hierarchical
to provide more versatility when sharing data. These schemes solutions available today. It proposes a leveled access structure
integrate two types of constructs: attributes and access policies. to manage a hierarchical organization that shares data of
Access policies are statements that join attributes to express various sensitivity. A single access structure was proposed that
which users of the system are granted access and which users represents both the hierarchy and the access policies of an
are denied. ABE schemes were introduced via two differ- organization. This access structure consists of a root node,
ent approaches: Key-Policy Attribute-Based Encryption (KP- transport nodes, and leaf nodes. The root node and transport
ABE) [13] and Ciphertext Policy Attribute-Based Encryption nodes are in the form of gates (i.e. AND or OR). The leaf
(CP-ABE) [7]. In KP-ABE, each ciphertext is labeled with a nodes represent attributes that are possessed by data users.

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
3

Based on the possession of certain attributes, each data user is • Computability: e(g, g) is an efficiently computable algo-
mapped into specific transport nodes (certain levels within the rithm.
hierarchy) based on the access structure that the user satisfies.
If the data user satisfies a full branch of the access structure,
then the data user is ranked at the root node (highest level C. Decisional Bilinear Diffie-Hellman (DBDH) Assumption
within the hierarchy). Data users ranked at the highest level
The DBDH assumption [27] is a computational hardness
(root node) can decrypt a ciphertext of highest sensitivity and
assumption and is defined as follows:
any other ciphertext with less sensitivity in the lower levels of
the hierarchy. The nodes ranked in the lower levels (transport Let G0 be a group of prime order p, g be a generator, and
nodes) can not decrypt any ciphertexts in the levels above. a, b, c ∈ Zp be chosen at random.
The main advantage of this scheme is that it provides leveled It is infeasible for the adversary to distinguish between any
access structures which are integrated into a single access given (g, g a , g b , g c , e(g, g)abc ) and (g, g a , g b , g c , R), where
structure. As a result, storage space is saved as only one copy R ∈r G1 is a random element and ∈r denotes a random
of the ciphertext is needed to be shared on the cloud for all selection. An algorithm A that outputs a guess z ∈ {0, 1},
data users. However, since this scheme uses a single access has advantage ε in solving the DBDH problem in G0 if:
P r[A(g, g a , g b , g c , T = e(g, g)abc ) = 0]

structure to represent the full hierarchy, the higher levels are
(1)
forced to accommodate attributes of all the levels below. As −P r[A(g, g a , g b , g c , T = R) = 0] ≥ ε.

the number of levels increases in the hierarchy, the number of The DBDH assumption holds if no polynomial algorithm
attributes grows exponentially making this scheme infeasible has a non-negligible advantage in solving the DBDH problem.
on a large scale. A simplified and reduced access structure is
proposed to reduce the computational complexity by removing
all branches of the single access structure while keeping one D. Access Structure
full branch. The full branch consists of the root node, a set
of transport nodes (one for each level), and the leaf nodes An access structure represents access policies for a set of
(attributes). However, in real-life applications, relationships individuals interested in gaining individual access to a secret.
within an organization are often built in a cross-functional The access structure defines sets of attributes that can be
matrix, making this a complicated solution when assigning possessed by a single individual to allow access to the secret.
privileges. It is defined as follows:
Let {P1 , . . . , Pn } be a set of parties. A set of parties that can
III. P RELIMINARIES reconstruct the secret is defined as a collection. The collection
is monotone meaning that, if A ⊆ 2{P1 ,...,Pn } then ∀B ∈ A
A. Cryptographic Hash Function
and B ⊆ C implies C ∈ A . An access structure is a monotone
A cryptographic hash function h is a mathematical algo- collection A of non-empty subsets {P1 , . . . , Pn }, i.e., A ⊆
rithm that maps data of arbitrary size to a bit string of fixed 2{P1 ,...,Pn } \ ∅. The sets in A are called the authorized sets
size. It is cryptographically secure if it satisfies the following and the sets not in A are called the unauthorized sets.
requirements: In this paper, we use parties to represent the attributes.
• Preimage-Resistance: It should be computationally in- This means that an access structure A may consist of both
feasible to find any input for any pre-specified output authorized and unauthorized sets of attributes.
which hashes to that output, i.e. for any given y, it
should be computationally infeasible to find an x such
that h(x) = y. E. Leveled Access Tree Ti
• Week Collision Resistance: For any given x, it should
be computationally infeasible to find x0 6= x such that An access tree Ti at level Li represents an access structure
h(x0 ) = h(x) [26]. that determines whether a data user can decrypt the ciphertext
• Strong Collision-Resistance: It should be computationally
eski at that level or not. A Ti may consist of multiple nodes.
infeasible to find any two distinct inputs x and x0 , such We use xil ∈ Ti to represent the lth node of Ti . The non-
that h(x) = h(x0 ). leaf nodes of Ti are in the form of threshold gates, while the
leaf nodes represent possible attribute values possessed by data
users.
B. Bilinear Maps For every node xil ∈ Ti , a threshold value kxil is assigned. A
Let G0 and G1 be two multiplicative cyclic groups of the node in the form of an AND gate is associated with a threshold
same prime order p. The generator of G0 is denoted as g. A value kxil = numxil , where numxil represents the number of
bilinear map from G0 × G0 to G1 is a function e : G0 × G0 → children of node xil . A node in the form of an OR gate or
G1 that satisfies the following properties: any leaf node representing an attribute is associated with a
a b
• Bilinearity: e(g , g ) = e(g, g)
ab
for any a, b ∈ Zp . threshold value kxil = 1.
a b ab
• Symmetry: e(g , g ) = e(g, g) = e(g b , g a ) for any The root node xi1 of each Ti carries a secret ski . The data
a, b ∈ Zp . user that possesses the correct set of attributes can satisfy Ti
• Non-degeneracy: e(g, g) 6= 1. and obtain ski .

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
4

TABLE I: Notations summary


Most Sensitive
Symbol Definition 1.Encrypt
Data Less Sensitive
DO A data owner
Data Owner Least Sensitive
H The hierarchical layout of the O Key Issuer
Cloud Server 3.Data 2.Granting
Li ith level within H where 1 ≤ i ≤ k Fetching Keys
Data Users in a Hierarchy
Ti ith access tree at Li where 1 ≤ i ≤ k
Fi ith data file part where 1 ≤ i ≤ k Fig. 1: General scheme of privilege-based data sharing. (1)
ski ith symmetric key used to encrypt Fi where 1 ≤ i ≤ k Data owner encrypts the data under P-MOD. (2) Data users
EFi ith sym. encrypted Fi under ski where 1 ≤ i ≤ k
eski ith cp-abe encrypted ski under Ti at Li where 1 ≤ i ≤ k
are granted keys based on their level. (3) Data users fetch the
DUj j th data user in set DU where 1 ≤ j ≤ m data from the cloud and decrypt it with their keys.
SKj j th data user’s private key where 1 ≤ j ≤ m
uth attribute within the j th data user’s attribute set Aj where
Aj,u
1 ≤ u ≤ n and 1 ≤ j ≤ m Fine-grained access control: The data owner has the capability
xil lth node of Ti where 1 ≤ l ≤ ∞ and 1 ≤ i ≤ k
to encrypt any part of F using any set of descriptive attributes
kxi Threshold value of xil where 1 ≤ l ≤ ∞ and 1 ≤ i ≤ k
l he/she wishes, limiting access to only authorized data users.
The set of descriptive attributes is defined by the data owner
at the time of encryption and can be selected from an infinite
IV. P ROBLEM F ORMULATION
pool.
Consider a data owner that possesses a data file F and Collusion resistant: Two or more data users at the
wishes to selectively share different segments of it on the cloud same/different level can not combine their private keys to gain
among a set of data users based on certain access privileges. access to any part of F they are not authorized to access
We assume that the data users can be ranked into a hierarchy independently.
that defines their access privileges.
Selectively sharing data files on the cloud becomes a burden B. System Model
on the data owner as the hierarchy grows (the access privileges
The general model of privilege-based data sharing among
increase in number) and/or as the access restrictions become
hierarchically-ranked data users is illustrated in Fig. 1. The
more complex due to an increase in the sensitivity of the
system consists of four main entities:
file segments. A trivial solution involves the data owner to
use public key encryption. This solution would require the Data owner (DO): An individual that owns a data file and
data owner to encrypt the same part of the data file once for wishes to selectively share it with multiple data users based
each data user being granted access then upload the resulting on certain desired privacy preferences.
ciphertexts to the cloud. The data users would then fetch their Data users (DU ): A set of hierarchically-ranked individuals
uniquely encrypted parts of the file from the cloud and utilize {DUj ∈ DU | 1 ≤ j ≤ ∞} interested in obtaining different
their private keys to decrypt them. This method ensures that no segments of a shared data file. Data users fall into different
unprivileged data user will gain access to any part of the data levels within the hierarchy based on specific sets of attributes
file even if that user is able to download the ciphertexts from Aj = {Aj,1 , Aj,2 . . . Aj,n } they possess.
the cloud. However, on a large scale, public key encryption Key-issuer: A fully trusted entity that generates private keys
becomes an inefficient solution due to the increase in the for the data users that possess a correct set of attributes.
number of encryptions and large storage spaces required. Cloud server: A non-trusted entity used to store the encrypted
Therefore, the challenge is to provide the data owners with an segments of the data file.
efficient, secure and privilege-based method that allows them As shown by Fig. 1, the data users ranked at the higher
to selectively share their data files among multiple data users levels are granted access to more sensitive segments of file
while minimizing the required cloud storage space needed to F than those ranked at lower levels. In our proposed scheme,
store the encrypted data segments. the hierarchy is not fixed nor predefined. It is defined by the
data owners as they encrypt their files to be shared with a
set of data users. A hierarchy can consist of multiple levels
A. Design Goals
{L1 , L2 , . . . , Lk }, where 1 ≤ k ≤ ∞. Level L1 represents
Based on the problem described above, we have the follow- the highest rank while level Lk represents the lowest rank.
ing design goals: At each level Li of the hierarchy, the data owner defines the
Privilege-Based Access: Data is shared in a hierarchical man- desired leveled access tree Ti , where 1 ≤ i ≤ k. The access
ner based on user privileges. Data users with more privileges tree identifies the policies required in order for a user to gain
(ranked at the higher levels of the hierarchy) are granted access access to the data file at a certain level.
to more sensitive parts of F than those with fewer privileges
(ranked at the lower levels of the hierarchy). V. T HE P ROPOSED P-MOD S CHEME
Data Confidentiality: All parts of F are completely protected This section presents the construction of P-MOD. We as-
from unprivileged data users (including the storage space). sume that file F is partitioned into k parts based on data
Data users are entitled to access the parts of F corresponding sensitivity defined by its owner. Each part of F is then inde-
to the levels they fall in and/or any other parts corresponding pendently encrypted and shared among the data users of the
to the levels below with respect to their own. system under our proposed privilege-based access structure.

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
5

F1 F2 Fk F1 F2 Fk
L1
T 1 at andom()
...
F1 L2 e t To R
Highest
T 2 at sk 1) sk 1 =s G
F2 Sensitivity h(
...
… sk 2 =
Lk G
T k at k k

...
1)

...
Lowest G G
Fk = h(s …
Sensitivity sk k
Case (1) Case (2) Case (3) G G G A A A A

Fig. 2: File partitioning A A A A


G G

A A A A

A. Data File Partitioning and Encryption Fig. 3: Privilege-based multilevel access structure
The DO partitions file F into a set of k data sections, that
is F = {F1 , F2 , . . . , Fk }. Each Fi ∈ F is treated as a new file
that is associated with a sensitivity value used to assign access corresponding level Li . Each Ti is associated with the ap-
rights to the data users based on their privileges. The process propriate leaf nodes (attributes) that define the privileges of
of partitioning F is performed based on the structure of F. the level. Data users that possess the correct sets of attributes
We assume that F consists of at least one record, resulting in which can satisfy Ti at Li are granted access to symmetric
multiple ways to partition it as shown in Fig. 2. key ski , hence, can derive {ski+1 , . . . , skk } and are able to
If F consists of a single record, then each Fi ∈ F represents decrypt parts {EFi , EFi+1 , . . . , EFk }.
one or more record attribute(s) associated with the record, As shown in Fig. 3, an access tree Ti may consist of non-
as shown in case (1) in Fig. 2. However, if F consists of leaf nodes and leaf nodes. The non-leaf nodes are threshold
multiple records, then the DO has flexibility in choosing how gates, represented as ‘G’, while the leaf nodes are attributes,
to partition it. One approach is to handle each record as a represented as ‘A’. The DO may construct access trees from
whole, where records are clustered into groups of similar any number and layout of nodes that satisfy the privacy
sensitivity. In this case, each Fi ∈ F represents one or preferences desired.
more record(s), as shown case (2) in Fig. 2. Alternatively, One of the advantages of our proposed privilege-based
partitioning can be performed over specific record attributes, access structure is the ability to reduce attribute replication
versus the whole record. In this case, Fi ∈ F represents one when defining the hierarchy. Data users that possess attributes
or more record attribute(s) of the records, as shown case (3) which can satisfy access tree Ti are granted access to ski ,
in Fig. 2. however, they do not need to possess attributes that can also
Regardless of how partitioning is performed, each Fi ∈ F satisfy the access trees {Ti+1 , . . . , Tk } in order to obtain
is then treated as a new data file. Suppose F1 contains the {ski+1 , . . . , skk }. This helps simplify the process of defining
most sensitive information of F that can only be accessed by a hierarchy as the number of users and/or access constraints
a data user at the highest level L1 and Fk contains the least grow. With reduced-size access trees, we can greatly reduce
sensitive information of F that can be accessed by all data the computational complexity when encrypting file partitions,
users at any level of the hierarchy. Before the DO uploads generating private keys for privileged users and decrypting
{F1 , F2 , . . . , Fk } to the cloud, each Fi ∈ F is encrypted ciphertexts.
separately using a symmetric encryption algorithm such as
the Advanced Encryption Algorithm with a secret key ski to
produce an encrypted file C. The P-MOD Construction
EFi = Encski (Fi ). (2) The scheme is based on the construction presented in [7]
For key selection, the DO randomly selects sk1 . The remaining and formally divides the process into four main functions:
symmetric keys {sk2 , . . . , skk } are then derived from sk1 Setup(1κ ): This is a probabilistic function carried out by the
using a one-way cryptographic hash function h, that is key-issuer. The Setup function takes a security parameter κ
ski+1 = h(ski ). (3) and randomly chooses values α, β ∈ Zp . The outputs of this
Next, the symmetric keys are encrypted as discussed in function are public key P K and master key M K defined as:
Section V-C, to be accessed only by the data users that have P K = {G0 , g, B = g β , e(g, g)α }, (4)
been granted the privilege of access. The privileged data users M K = {β, g }. α
(5)
that are successful in obtaining ski corresponding to level
Li can derive {ski+1 , . . . , skk } using equation (3). However, KeyGen(M K, Aj ): This is a probabilistic function carried
given the properties of hash function h, ski cannot be used to out by the key-issuer. The inputs to this function are M K
derive any of the symmetric keys {sk1 , . . . , ski−1 }. generated by the Setup function, and the attribute set Aj of
j th data user, where Aj,u ∈ Aj represents the uth attribute
within the set. The KeyGen function outputs a unique private
B. The P-MOD Privilege-Based Access Structure key SKj for the data user. In order to guarantee a unique
Fig. 3 illustrates the general privilege-based access structure SKj , it generates a random value rj ∈ Zp and incorporates
of P-MOD. Data users are ranked into k levels of privileges, it within the private key. Based on the number of attributes in
{L1 , L2 , . . . , Lk }. The DO defines an access tree Ti at each the input set Aj , the KeyGen function also generates a random

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
6

value rj,u ∈r Zp for each attribute within the set. The SKj is is defined as:
defined as: e(Dj,u , Cxil )
DecryptNode(eski , SKj , xil ) =
SKj = Dj = g (α+rj )/β , 0 , C0 )
e(Dj,u xi l
0
{Dj,u = g rj · h(u)rj,u , Dj,u = g rj,u | ∀Aj,u ∈ Aj } .

qxi (0)
(6) e(g rj · h(u)rj,u , g l )
= q i (0)
e(g rj,u , h(xil ) xl )
The purpose of the randomly selected rj is to ensure that rj ·qxi (0)
each SKj is unique and the attribute components within the = e(g, g) l . (8)
SKj are associated. It should be infeasible for data users to If xil
is a non-leaf node, DecryptNode(eski , SKj , xil ) op-
i
collude by combining components of their private keys (Dj,u erates recursively. For each node zl,c that is a child of xil ,
0 i
and Dj,u ) to decrypt data beyond their individual access rights. DecryptNode(eski , SKj , zl,c ) is computed and the output is
This means, the attribute components from different private stored in Fzl,c
i .

keys cannot be combined to access unauthorized data. This recursive function is based on Lagrange interpolation.
The Lagrange coefficient ∆a,Aj for a ∈ Zp and the set of
Encrypt(P K,ski ,Ti ): This is a probabilistic function carried attributes Aj is defined as:
out by the DO to encrypt the symmetric keys that are to be Y x−k
shared with the privileged data users. The inputs to this func- ∆a,Aj (x) = . (9)
a−k
tion are P K, the public key generated by the Setup function, k∈Aj ,k6=a

the symmetric key ski derived in equation (3) representing Let Aj,xil be an arbitrary kxil -sized set of child nodes
i
the data that will be encrypted, and the access tree Ti that zl,c such that Fzl,c
i 6= ∅. If no such set exists then the
defines the authorized set of attributes at Li . The output of function returns Fzl,c
i = ∅. Otherwise, Fzl,c
i is computed using
this function is the encrypted symmetric key eski . Lagrange interpolation as follows:
∆a,A0 (0)
j,xi
Y
For {sk1 , . . . , skk }, the Encrypt function will run k times, Fxil = Fzi l

once for each ski . At each run, the Encrypt function chooses i ∈A
zl,c
l,c
j,xi
a polynomial qxil with degree dxil = kxil − 1 for each node l
∆a,A0
 (0)
xil ∈ Ti . The process of assigning polynomials to each xil rj ·qzi (0) j,xi
Y
= e(g,g) l,c l

occurs in a top-bottom approach starting from the root node i ∈A


zl,c j,xi
in Ti . The Encrypt function chooses a secret si ∈ Zp and l
∆a,A0 (0)
sets the value of qxi1 (0) = si . Next, it randomly chooses the

Y rj ·qparent(zi )
(index(zil,c )) j,xi
remaining points of the polynomial to completely define it. For = e(g,g) l,c l

i ∈A
any other node xil ∈ Ti , the Encrypt function sets the value zl,c j,xi
l

qxil (0) = qparent (index(xil )), where qparent is the parent node Y rj ·qxi (i)·∆a,A0 (0)
l j,xi
polynomial of xil . The remaining points of those polynomials = e(g,g) l

i ∈A
zl,c
are then randomly chosen. j,xi
l
rj ·qxi (0)
Let Xi be the set of leaf nodes in Ti . The encrypted =e(g,g) l ,
(10)
symmetric key eski at Li is then constructed as:
where a = index(zil,c ) and A0j,xi = {index(zl,c i i
) : zl,c ∈
eski = Ti , C̃i = ski · e(g, g)α·si , Ci = g β·si , l

(7) Aj,xil }.
q i (0) q i (0)
{Cxil = g xl , Cx0 i = h(xil ) xl | ∀xil ∈ Xi } .

If the attributes in Aj satisfy Ti , then the following is
l
computed at the root node xi1 .
Decrypt(eski , SKj ): This is a deterministic function carried Ri = DecryptNode(eski , SKj , xi1 )
out by the data user. The inputs to this function are an rj ·q i (0)
= e(g, g) x1 (11)
encrypted symmetric key eski corresponding to Li , and the rj ·si
= e(g, g) .
private key SKj of the j th data user.
To obtain ski from the result derived in equation (11), we
The Decrypt function operates in a recursive manner prop- compute the following:
agating through the nodes in Ti by calling a recursive function C̃i ski · e(g, g)α·si
= = ski . (12)
defined as DecryptNode(eski , SKj , xil ). The inputs to this e(Ci ,Dj ) e(g β·si ,g (α+rj )/β )
Ri
function are eski , SKj and a node xil within Ti . If the e(g,g)rj ·si
attributes incorporated within SKj satisfy the rules within Ti , At this point, the data user can simply decrypt EFi using
the data user can decrypt eski and obtain ski . the ski derived from equation (12) to obtain the plaintext Fi
as follows:
DecryptNode(eski , SKj , xil ) performs differently depend- Fi = Decski (EFi ). (13)
ing on whether xil is a leaf or non-leaf node. If xil is a leaf node
and att(xil ) ∈
/ Aj , DecryptNode(eski , SKj , xil ) returns ∅, oth- If the data user is interested in attaining data files
erwise, att(xil ) = Aj,u ∈ Aj and DecryptNode(eski , SKj , xil ) {Fi+1 . . . Fk } belonging to levels {Li+1 . . . Lk } respectively,

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
7

the user can compute the symmetric keys {ski+1 , . . . , skk } of Phase 2: Repeat phase 1 with the restriction that none of the
the lower level using the derived ski as previously discussed newly generated private keys (SKq1 +1 , . . . , SKq ) correspond-
in equation (3). Finally, any {EFi+1 . . . EFk } at the lower ing to the different sets of attributes (Aq1 +1 , . . . , Aq ) contain
levels can be decrypted as in equation (13). correct sets of attributes that satisfy T ∗ .
Guess: The adversary outputs a guess µ0 of µ. The adversary
VI. S ECURITY A NALYSIS wins the security game if µ0 = µ and loses otherwise.
In this section, a formal proof of security for P-MOD is
Definition 1 (Secure against adaptively chosen plaintext at-
presented. It is assumed that a symmetric encryption technique
tack.). P-MOD is said to be secure against an adaptively
such as AES is used to secure each individual data file Fi ∈ F.
chosen plaintext attack if any polynomial-time adversary has
It is also assumed that the process of attribute authentication
only a negligible advantage in the security game, where the
between a data user and the key-issuer, in order for the data
advantage is defined as Adv = P r[µ0 = µ] − 21 .
user to obtain a private key, is secure and efficient.
The security of P-MOD is reduced to the hardness of the
Theorem 1. P-MOD is secure against unprivileged accesses
DBDH problem. Based on Theorem 1 and by proving that a
assuming the hash function is collision resistant.
single eski at any Li is secure, the whole system is proved
Proof. Let Aj = {Aj,1 , Aj,2 . . . Aj,n } be a set of attributes to be secure since all ciphertexts at any level follow the same
possessed by user DUj and Ti an access tree at level Li of a rules.
hierarchy. If Aj ∈ Ti , the user can obtain ski as discussed in
Theorem 2. P-MOD is secure against adaptively chosen
equations (8-12). Given hash function h, the user cannot derive
plaintext attack if the DBDH assumption holds.
{sk1 . . . ski−1 }. Therefore, P-MOD is immune to unprivileged
accesses. Proof. Assume there is an adversary that has non-negligible
advantage ε = AdvA . We construct a simulator that can distin-
Following the work presented in [27], we also provide a
guish a DBDH element from a random element with advantage
security proof based on ciphertext indistinguishability which
ε. Let e : G0 × G0 → G1 be an efficiently computable bilinear
proves that the adversary is not able to distinguish pairs of
map and G0 is of prime order p with generator g. The
ciphertexts. A cryptosystem is considered to be secure under
DBDH challenger begins by selecting the random parameters:
this property if the probability of an adversary to identify a
a, b, c ∈r Zp . Let g ∈ G0 be a generator and T is defined
data file that has been randomly selected from a two-element
as T = e(g, g)abc if µ = 0, and T = R otherwise, where
data file chosen by the adversary and encrypted does not
µ ∈r {0, 1} and R ∈r G1 . The simulator acts as the challenger
significantly exceed 12 . We first present an Indistinguishability
in the following game:
under Chosen-Plaintext Attack (IND-CPA) security game.
Next, based on the IND-CPA security game, a formal proof Initialization The simulator accepts the DBDH challenge re-
of security is provided for P-MOD. quested by the adversary who selects the T ∗ .
IND-CPA is a game used to test for security of asymmetric Setup: The simulator runs the Setup function. It chooses a
key encryption algorithms. In this game, the adversary is random α∗ ∈r Zp and computes the value α = α∗ + ab. Next,
∗ ∗
modeled as a probabilistic polynomial-time algorithm. The it simulates e(g, g)α ← e(g, g)α +ab = e(g, g)α e(g, g)ab and
algorithms in the game must be completed and the results B = g β ← g b , where b represents a simulation of the value
returned within a polynomial number of time steps. The β. Finally, it sends all components of P K = {G0 , g, B =
adversary will choose to be challenged on an encryption under g β , e(g, g)α } to the adversary.
a leveled access tree T ∗ . The adversary can impersonate any Phase 1: In this phase, the adversary requests multiple private
data user and request many private keys SKj . However, the keys (SK1 , . . . , SKq1 ) corresponding to q1 different sets of
game rules require that any attribute set Aj that the adversary attributes (A1 , . . . , Aq1 ). After receiving an SKj query for
claims to possess does not satisfy T ∗ . The security game is a given set Aj where Aj ∈ / T ∗ (i.e. ∀Aj,u ∈ Aj does not
divided into the following steps: satisfy T ), the simulator chooses a random rj0 ∈r Zp and

Initialization: The adversary selects an access tree T ∗ to be defines rj = rj0 − b. Next, it simulates Dj = g (α+rj )/β ←
∗ 0
challenged against and commits to it. g α/β g rj /β = g (α +ab)/b g (rj −b)/b . Then, ∀Aj,u ∈ Aj , it selects
Setup: The challenger runs the Setup function and sends the a random rj,u ∈r Zp and simulates Dj,u = g rj · h(u)rj,u ←
0
public key P K to the adversary. g rj −b h(u)rj,u and Dj,u 0
← g rj,u . Finally, the simulated values
0
Phase 1: The adversary requests multiple private keys of SKj = (Dj , {Dj,u , Dj,u | rj,u ∈r Zp , ∀Aj,u ∈ Aj }) are
(SK1 , . . . , SKq1 ) corresponding to q1 different sets of at- sent to the adversary.
tributes (A1 , . . . , Aq1 ). Challenge: The adversary sends two plaintext data files sk0
Challenge: The adversary submits two equal length data files and sk1 to the simulator who randomly chooses a µ ∈r {0, 1}
F0 and F1 to the challenger. The adversary also sends T ∗ by flipping a coin to select one of the files. The simulator
such that none of (SK1 , . . . , SKq1 ) generated from Phase 1 then runs the Encrypt function and derives a ciphertext CT ∗ .

contain correct sets of attributes that satisfy it. The challenger It simulates C̃ = skµ · e(g, g)α·skµ ← skµ · e(g, g)(α +ab)c =

flips a coin µ randomly and encrypts Fµ under T ∗ . Finally, skµ · T e(g, g)α c , where c represents a simulation of the value
the challenger sends the ciphertext CT ∗ generated according skµ and T = e(g, g)abc . Next, it simulates C = g β·skµ ← g bc .
to equation (7) to the adversary. Finally, for each attribute x ∈ X ∗ (set of leaf nodes in T ∗ )

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
8

T 1 at
L1 levels L2 . . . Lk consist of an OR gate to accommodate the
L2 G
T 2 at ... policies of the levels above. When generating keys for the data
OR
Lk
………
... A1 A2 users, the key issuer needs to incorporate only a subset of the
T k at ………
OR G G attributes rather than the entire set belonging to the access tree.
Case (1) As a result, the size of each private key is therefore optimized
A1 A2 A3 A4
G G ….. G
T 1 at
L1 and the key management process becomes less resource-
A1 A2 A3 A4 An Am L2
T 2 at ... G intensive. However, since levels are independent, attributes
G must be repeatedly incorporated into each access tree. As a
Lk
………
... A1 … An
T k at ………
Case (2)
B1 … Bm
result, the sizes of the access trees at lower levels will increase,
G
as shown in case (1) of Fig. 4. For complex organizations with
C1 … Cp
a large number of levels, the access trees will become even
Fig. 4: CP-ABE utilized in a hierarchical organization larger. This results in an increase in encryption and decryption
complexities.
On the other hand, case (2) favors minimizing encryption
it computes Cx = g qx (0) and Cx0 = h(x)qx (0) . The simulated and decryption complexities over key management. Each data
values of CT ∗ = {T ∗ , C̃, C, ∀x ∈ X ∗ : Cx , Cx0 } are then sent file is encrypted under an access policy with a set of unique
to the adversary. attributes at each level without considering privileges and
Phase 2: Repeat Phase 1 with the restriction that the requested relationships. This results in simpler access trees at each level
private keys are associated with attribute sets such that, ∀Aj | of the hierarchy and therefore lower encryption and decryption
q1 + 1 ≤ j ≤ q and Aj ∈ / T ∗. complexities. However, to grant access to an individual at a
Guess: The adversary tries to guess the value µ. If the specific level, the key-issuer must generate a private key for
adversary guesses the correct value, the simulator outputs 0 that individual that incorporates the attributes at that level and
to indicate that T = e(g, g)abc , or 1 to indicate that T = R, a all the levels below. Complex hierarchies that include a large
random group element in G1 . number of attributes, could result in complicated key manage-
Given a simulator A, if T = e(g, g)abc , then CT ∗ is a valid ment. As a result, private keys will require incorporating a
ciphertext, Adv = ε and large number of attributes.
 1
P r A g, g a , g b , g c , T = e(g, g)abc = 0 = + ε. (14)
 
2 B. Computational Cost
If T = R then C̃ is nothing more than a random value to
We formulate the encryption and decryption costs based on
the adversary. Therefore,
 1 the number of group operations fG0 , fG1 for groups G0 , G1
P r A g, g a , g b , g c , T = R = 0 = .
 
(15) respectively and the number of bilinear mapping operations e
2 involved in the Encrypt and the Decrypt functions for each
From equation (14) and equation (15), we can conclude that
P r A g, g a , g b , g c , T = e(g, g)abc = 0
   scheme. Table II summarizes the number of operations for
(16) each scheme.
−P r A g, g a , g b , g c , T = R = 0 = ε.
  
1) Encryption Cost: Encryption cost is measured as the
Therefore, the simulator plays the DBDH game with a non- number of basic operations involved in generating the cipher-
negligible advantage and the proof is complete. text from the plaintext. It is formulated based on the Encrypt
function which involves group operations fG0 and fG1 .
VII. P ERFORMANCE A NALYSIS The number of operations involved in sharing an indepen-
In this section, we present a performance analysis for dent piece of data using CP-ABE is (2|X| + 1) and 2 for
P-MOD and compare it with three existing schemes, CP- fG0 and fG1 respectively, where |X| denotes the number of
ABE [7], HABE [21] and FH-CP-ABE [9]. leaf nodes (attributes) of the access tree T . For a hierarchical
organization, we present the encryption complexity of case (1)
in Fig. 4 as it involves a relationship between all levels. In a
A. Traditional CP-ABE in a Hierarchical Setting real life application, case (2) would not satisfy a hierarchical
CP-ABE [7] handles sharing of independent pieces of data organization as it requires attributes to be shared by all
based on independent access policies. It was not designed to users regardless which level they belong to. As shown in
support a privilege-based access structure (i.e. hierarchical or- Table II, the number of operations involved in the encryption
ganization). Therefore, to adapt CP-ABE to a privilege-based process is formulated as (2(|X1 | + · · · + |Xk |) + k) and 2k
access structure, the Encrypt function runs once for each level. for fG0 and fG1 respectively, where |X1 |, |X2 |, · · · , |Xk |
However, if it were to be used in a hierarchical organization, are the number of leaf nodes (attributes) associated with
there would be a trade off between the key management and access trees T1 , T2 , · · · , Tk respectively. However, the reuse
the complexity of the encryption and decryption processes. of attributes needed at each level in this scheme increases
Fig. 4 shows the two general cases in which CP-ABE is the computational complexity, making it an overall inefficient
utilized to share data with users in a hierarchical organization. solution for hierarchical organizational structures.
In case (1), key management is favored over encryption and Similarly, P-MOD generates a ciphertext for each level of
decryption complexities. The root node of each access tree at the hierarchy. The number of operations involved is (2(|Y1 | +

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
9

· · · + |Yk |) + k) and 2k for fG0 and fG1 respectively, where of operations required are |Aj − 1| and 2|S| + 2 respectively.
|Y1 |, |Y2 |, · · · , |Yk | are the number of leaf nodes (attributes) Therefore, for both schemes, encryption complexity would
associated with access trees T1 , T2 , · · · , Tk respectively. How- greatly depend on the number of attributes possessed by the
ever, P-MOD leverages a privilege-based access structure as user and the access tree generated during encryption.
discussed in Section V-B. This results in smaller sized and FH-CP-ABE [9] aims to satisfy a certain transport node
level-specific access trees which minimize the number of (level) of the single access tree T allowing the data user to
attributes where, |Yi | < |Xi |, ∀i ∈ k. decrypt certain encrypted files up to that level. The number
When comparing P-MOD with hierarchical schemes such of operations involved in this process are (2|Aj | + 1) and
as HABE and FH-CP-ABE, P-MOD minimizes the overall (2|S| + v|AT | + 2k) for operations e and fG1 respectively.
number of operations. This is because the encryption process Again, since FH-CP-ABE uses a single access tree T , the
for schemes such as HABE and FH-CP-ABE involve more number of transport nodes |AT | can be large, resulting in
complex hierarchies and access trees that contain all the higher decryption complexity. The least number of interior
access policies for all the levels. On the contrary, P-MOD nodes |S| that satisfy T can also be large if the construction
involves smaller access trees, each one limited to level-specific of the access tree T is not optimized. Constructing a single
attributes and policies. access tree that accommodates a large number of attributes
In HABE, a single hierarchy represents the root master, do- is resource-intensive and could become complicated as the
main masters, users and attributes. This may result in complex access rules become more sophisticated. When comparing the
hierarchies as the number of users increase. Therefore, the decryption costs of P-MOD and FH-CP-ABE, the decryption
size of |X| may end up being large making the encryption cost of P-MOD depends on |Aj | and |S| while the decryption
process expensive. Similary, FH-CP-ABE [9] uses a single complexity of FH-CP-ABE depends on |S|, v, |AT | and k. The
access tree T to encrypt all data files to be shared. The number size of the sets in FH-CP-ABE will always be greater than the
of operations involved are (2|X| + k) and (2v|AT | + 2k) for size of the sets in P-MOD due to the different constructions
operations fG0 and fG1 respectively, where |AT | is the number of access trees in each scheme.
of transport nodes (levels), and v is the number of children
nodes associated with a transport node. This may also result in
C. Storage Cost
large sets |X|, |AT | and v and lead to an expensive encryption
process. To evaluate the storage efficiency, we formulate the bit-
2) Decryption Cost: Decryption cost is measured as the length of the private keys and ciphertexts generated by each
number of basic operations involved in decrypting the cipher- scheme. Table III represents a comparison of the storage
text into plaintext. It is formulated based on the Decrypt func- costs in bit-length for all schemes. The bit-length of a single
tion which involves bilinear operations e and group operations element in G0 , G1 and Zp are denoted as LG0 , LG1 and LZp
fG1 . respectively.
For a single Decrypt run, CP-ABE [7] involves (2|Aj |) As shown in table, the bit-length of the private keys for all
and (2|S| + 2) number of operations e and fG1 respectively, schemes are similar. In terms of space complexity, this can be
where |Aj | is the number of attributes possessed by the j th reduced to O(Aj ). On the other hand, the size of ciphertexts
data user and |S| is the least number of interior nodes that differ. The size of a ciphertext is based on the output of the
satisfy T . However, to adapt CP-ABE to a privilege-based Encrypt function for each scheme.
access structure, the Decrypt function is run as many times as For CP-ABE, the total size of all generated ciphertexts from
the number of ciphertexts a data user wishes to decrypt. The all levels consists of (2(|X1 | + · · · + |Xk |) + k) elements
number of operations involved are k(2|Aj | + 1) and (2[|S1 | + from G0 and k elements from G1 . Using this scheme, the
· · · + |Sk |] + 2k) for operations e and fG1 respectively, where lower levels must accommodate attributes of the higher levels.
|S1 |, |S2 |, · · · , |Sk | are the least number of interior nodes that Ciphertext size can potentially end up large in size due to
satisfy the access trees T1 , T2 , · · · , Tk respectively. attribute replication at each level.
The number of operations involved in the decryption process For HABE, the ciphertext consists of |X| elements from
for P-MOD is similar to a single CP-ABE decryption run. G0 and 1 element from G1 . Similar to the discussion in the
P-MOD needs to run the Decrypt function only one time, encryption cost of HABE, the size of X can be large due
even if the data user needs to obtain more than one data to the complexity of how the hierarchy is defined. Likewise,
file. The Decrypt function requires (2|Aj |) and (2|S| + 2) the single ciphertext generated by FH-CP-ABE [9] consists of
for operations e and fG1 respectively. Once the data user (2|X| + k) elements from G0 and (v|AT | + k) elements from
successfully decrypts the ciphertext (obtains the symmetric G1 . In this scheme, the ciphertext size depends on |X|, |AT |,
key at his/her level), the user can derive the remaining lower v and k. As the size of these sets grow, the ciphertext size can
level keys as described in equation (3). The complexity of grow exponentially based on how the tree T is constructed.
the operations involved in deriving the symmetric keys and P-MOD generates ciphertexts in a similar approach to those
decrypting ciphertexts at the lower levels are negligible in generated by CP-ABE. The total size of all generated cipher-
comparison with the group and bilinear operations involved in texts consists of (2[|Y1 | + · · · + |Yk |] + k) elements from G0
running the Decrypt function, and therefore could be ignored. and k elements from G1 . However, the size of the ciphertext
When comparing P-MOD to HABE, the e number of oper- generated by P-MOD is shown to be smaller in size than CP-
ations involved are slightly similar. However, the fG1 number ABE in all instances. This is based on the composition of our

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
10

TABLE II: Comparison of number of operations


Function Operation CP-ABE [7] HABE [8], [21] FH-CP-ABE [9] P-MOD
fG0 2(|X1 | + · · · + |Xk |) + k |X| 2|X| + k 2(|Y1 | + · · · + |Yk |) + k
Encrypt
fG 1 2k 1 2v|AT | + 2k 2
e k(2|Aj | + 1) 3|Aj | 2|Aj | + 1 2|Aj |
Decrypt
fG 1 2(|S1 | + · · · + |Sk |) + 2k |Aj | − 1 2|S| + v|AT | + 2k 2|S| + 2

TABLE III: Comparison of private key and ciphertext sizes


Component Length CP-ABE [7] HABE [8], [21] FH-CP-ABE [9] P-MOD
Private Key LG0 2|Aj | + 1 |Aj | 2|Aj | + 1 2|Aj | + 1
LG0 2(|X1 | + · · · + |Xk |) + k |X| 2|X| + k 2(|Y1 | + · · · + |Yk |) + k
Ciphertext
LG1 k 1 v|AT | + k k

proposed access structure which does not duplicate attributes, F is partitioned into k sections, where ∀Fi ∈ F and 1 ≤
therefore generates smaller ciphertexts. i ≤ k, each Fi represents one or more full columns of F.
Table IV represents the 9 record attributes distribution of F
into each partition Fi , based on the given value for k in each
VIII. S IMULATIONS scenario.
In this section, the results of various simulations are pre-
sented to support the performance analysis discussed in Sec- TABLE IV: Attribute distribution in each partition Fi
tion VII. P-MOD is implemented and simulated in Java using k F1 F2 F3 F4 F5 F6 F7 F8 F9
3 2 3 4 - - - - - -
the CP-ABE toolkit [28] and the Java Pairing-Based Cryptog- 6 1 1 1 2 2 2 - - -
raphy library (JPBC) [29]. For comparison, simulations are 9 1 1 1 1 1 1 1 1 1
also conducted for CP-ABE [7] and FH-CP-ABE [9] under
the same conditions as P-MOD. All simulations are conducted All simulations are performed in consideration of users at
on an Intel(R) Core(TM) i5-4200M at 2.50 GHz and 4.00 GB the most sensitive (highest) level within the hierarchy. This
RAM machine running the Windows 10 OS. approach considers the most complicated applications, where
The data set used in our simulations is the real U.S. a user needs to gain access to the entire file. Fig. 5 summarizes
Census Income data set [10]. It consists of 30,163 records, the time expended by the three schemes to generate a private
and each record is composed of 9 different record attributes. key for the user, encrypt all partitions of F and decrypt all
The records are stored in a Microsoft Excel file F, with a partitions respectively, in all 6 scenarios. By analyzing the
total size of 2.42MB. For simulation purposes, it is assumed figures, some observations can be derived. The results are
that data sensitivity is defined over record attributes (vertical summarized in the following subsections.
columns), corresponding to the third approach described in
Section V-A. That means that F can be partitioned up to
9 different segments where each segment contains the entire A. Key Generation Time-Cost
vertical column of the file of different sensitivity. To measure the time to generate a private key for a user,
In the simulations, the number of levels k of the hierarchy H the same attribute and level conditions are applied to all three
is equivalent to the number of file partitions being shared. We schemes. P-MOD outperforms CP-ABE and FH-CP-ABE in
also define the total number of different user attributes applied all experimental evaluations. As illustrated in Fig. 5(a), the
to users across all levels as N . While alternating the values time taken to generate a private key for a user at the highest
of k and N , we compare the key generation, encryption, and level in CP-ABE and FH-CP-ABE, is independent of the value
decryption time-costs of all schemes. The experiments include of k and remains nearly constant when N is kept constant, for
applying the values for k = {3, 6, 9}, to each value of N = both values of N = {10, 100} tested.
{10, 100}, resulting in 6 experimental cases. Within each case, In comparison, P-MOD reacts differently. When the total
it is possible to distribute the user attributes among the levels in number of user attributes is normally distributed among the
numerous ways. For example, when testing the schemes with levels, the time taken to generate a private key by P-MOD
a hierarchy H for k = 3 and N = 10, there are 36 different is approximately the reciprocal of the value of k multiplied
ways to distribute the N = 10 user attributes among the k = 3 by the equivalent time taken by CP-ABE or FH-CP-ABE to
levels, excluding the cases of having zero user attributes at any perform the same function. For example, the time taken by
level. A possible way of distributing them could be AL1 = 3, CP-ABE and FH-CP-ABE in the case where k = 6 and N =
AL2 = 3 and AL3 = 4, where AL1 represents the set of user 100 is approximately 2.2 × 104 seconds. The time taken by
attributes possessed by users in the highest level within H and P-MOD in the same conditions is approximately 0.35 × 104
AL3 the set of user attributes possessed by users in the lowest seconds, which is nearly 16 times the time-cost of the other
level. As the values of k and N increase, the possible ways of schemes. Therefore, the time taken to generate a private key
user attribute distribution among all levels increases. Without by P-MOD is inversely proportional to the value of k in the
loss of generality, we assume that the number of attributes for hierarchy. This is true for a constant value N with normally
each level follows the uniform distribution in our simulations. distributed attributes.

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
11

#10 4 Key Generation #10 4 Encryption Decryption


2.5 12 8000
CP-ABE 10A
FH-CP-ABE 10A 7000
10 P-MOD 10A
2
CP-ABE 100A
FH-CP-ABE 100A
6000
P-MOD 100A
8
CP-ABE 10A 5000 CP-ABE 10A
1.5
Time in ms

Time in ms
Time in ms
FH-CP-ABE 10A FH-CP-ABE 10A
P-MOD 10A P-MOD 10A
6 4000
CP-ABE 100A CP-ABE 100A
FH-CP-ABE 100A FH-CP-ABE 100A
1
P-MOD 100A 3000 P-MOD 100A
4

2000
0.5
2
1000

0 0 0
3 4 5 6 7 8 9 3 4 5 6 7 8 9 3 4 5 6 7 8 9
Number of levels (k) Number of levels (k) Number of levels (k)

(a) (b) (c)


Fig. 5: Performance comparison using the real U.S. Census Income Data [10]: (a) Key generation time, (b) Encryption time,
and (c) Decryption time

Another observation that can be made from Fig. 5(a) is For example, when k = 9, δP-MOD is approximately 21.6% of
the effect of the values N on the time expended. The time- δCP-ABE and approximately 82% of δFH-CP-ABE .
cost is directly related to the value N and the degree of this
proportional increase is different for each scheme as the value C. Decryption Time-Cost
N increases. As previously discussed, the experiments are performed in
We define variable δ as the difference in time-cost of two the perspective of a user that appears at the highest level
experimental evaluations of the same scheme at N = 10 and of the hierarchy. Taking this into account, the decryption
N = 100, while keeping k fixed. In Fig. 5(a), consider the time-cost is defined as the time for the user to successfully
simulated values for each scheme at k = 9. Both CP-ABE and decrypt all ciphertexts EFi corresponding to all partitions
FH-CP-ABE result in large time-cost changes, approximately Fi , if the user possesses the correct set of user attributes.
δCP-ABE = 19039ms and δFH-CP-ABE = 18821ms, while P- Fig. 5(c) illustrates the time to perform the decryption function
MOD results in a smaller value, δP-MOD = 2204ms. Based by each scheme. The decryption function of both CP-ABE
on these values, δP-MOD is approximately 11.7% of δCP-ABE and FH-CP-ABE both involve e and fG1 operations that are
and δFH-CP-ABE , resulting in an approximately 88.3% im- dependent on the values k and N . As these values increase,
provement. The significance of this observation can be seen in the decryption time-cost increases linearly for both schemes,
applications where N is a large value. Efficiency can be gained proving the correctness of the decryption complexity analysis
in time expenditure by utilizing P-MOD, versus CP-ABE and in Section VII-B2.
FH-CP-ABE. When measuring the decryption time-cost, P-MOD outper-
forms both CP-ABE and FH-CP-ABE. This is due to the time-
cost being inversely proportional to the value of k. The time-
B. Encryption Time-Cost cost decreases as the value of k increases while N is kept
The encryption time-cost is the time it takes each scheme constant.
to perform the encryption function over all partitions of F. The decryption time-cost of P-MOD does not severely
Fig. 5(b) represents the time expenditure of each scheme increase while the value N changes from 10 to 100. In
under all six experimental scenarios. P-MOD surpasses both contrast to this, CP-ABE and FH-CP-ABE are greatly affected,
CP-ABE and FH-CP-ABE in every experimental case. For as seen in Fig. 5(c). For example, when k = 9, δP-MOD is
example, compare the time duration of the three schemes approximately 13.4% of δCP-ABE and approximately 15.8% of
at k = 9 and N = 100. The time duration for CP-ABE is δFH-CP-ABE . The decryption time-cost of P-MOD is expected
approximately 4.3 times of P-MOD to perform encryption. to drop when k increases while keeping the file size constant.
Similarly, FH-CP-ABE is approximately 1.2 times of P-MOD In summary, for a hierarchical organization with many
to perform encryption. levels, the simulation results show that P-MOD is significantly
All schemes follow a direct proportional pattern as the more efficient at generating keys, encryption, and decryption
values k and N increase. These results prove the correctness than that of both CP-ABE and FH-CP-ABE schemes.
of the encryption computational complexity analysis presented
in Section VII-B1. The encryption function in all schemes in- IX. C ONCLUSION
volves a number of fG0 and fG1 operations that are dependent The numerous benefits provided by the cloud have driven
on both the values k and N . However, based on the proposed many large multilevel organizations to store and share their
hierarchical access structure, P-MOD is able to outperform data on it. This paper begins by pointing out major security
both CP-ABE and FH-CP-ABE. In addition to this, the effect concerns data owners have when sharing their data on the
of changing the value N is also illustrated clearly in Fig. 5(b). cloud. Next, the most widely implemented and researched

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2019.2907133, IEEE
Transactions on Big Data
12

data sharing schemes are briefly discussed revealing points [20] C. Gentry and A. Silverberg, “Hierarchical id-based cryptography,” in
of weakness in each. To address the concerns, this paper International Conference on the Theory and Application of Cryptology
and Information Security, pp. 548–566, Springer, 2002.
proposes a Privilege-based Multilevel Organizational Data- [21] G. Wang, Q. Liu, and J. Wu, “Hierarchical attribute-based encryption for
sharing scheme (P-MOD) that allows data to be shared ef- fine-grained access control in cloud storage services,” in Proceedings of
ficiently and securely on the cloud. P-MOD partitions a data the 17th ACM Conference on Computer and Communications Security,
pp. 735–737, ACM, 2010.
file into multiple segments based on user privileges and data [22] R. Bobba, H. Khurana, and M. Prabhakaran, “Attribute-sets: A practi-
sensitivity. Each segment of the data file is then shared depend- cally motivated enhancement to attribute-based encryption,” in European
ing on data user privileges. We formally prove that P-MOD is Symposium on Research in Computer Security, pp. 587–604, Springer,
2009.
secure against adaptively chosen plaintext attack assuming that [23] Z. Wan, J. Liu, and R. H. Deng, “Hasbe: a hierarchical attribute-based
the DBDH assumption holds. Our comprehensive performance solution for flexible and scalable access control in cloud computing,”
and simulation comparisons with the three most representative IEEE Transactions on Information Forensics and Security, vol. 7, no. 2,
pp. 743–754, 2012.
schemes show that P-MOD can significantly reduce the com- [24] H. Deng, Q. Wu, B. Qin, J. Domingo-Ferrer, L. Zhang, J. Liu, and
putational complexity while minimizing the storage space. Our W. Shi, “Ciphertext-policy hierarchical attribute-based encryption with
proposed scheme lays a foundation for future attribute-based, short ciphertexts,” Information Sciences, vol. 275, pp. 370–384, 2014.
[25] J. Li, Q. Wang, C. Wang, and K. Ren, “Enhancing attribute-based
secure data management and smart contract development. encryption with attribute hierarchy,” Mobile Networks and Applications,
vol. 16, no. 5, pp. 553–561, 2011.
[26] M. Naor and M. Yung, “Universal one-way hash functions and their
R EFERENCES cryptographic applications,” in Proceedings of the twenty-first annual
ACM Symposium on Theory of Computing, pp. 33–43, ACM, 1989.
[1] P. Institute, “Sixth annual benchmark study on privacy and security of [27] B. Waters, “Ciphertext-policy attribute-based encryption: An expressive,
healthcare data,” tech. rep., Ponemon Institute LLC, 2016. efficient, and provably secure realization,” in International Workshop on
[2] R. Cohen, “The cloud hits the mainstream: More than half of U.S. Public Key Cryptography, pp. 53–70, Springer, 2011.
businesses now use cloud computing.” http://www.forbes.com, April [28] J. Wang, “ciphertext-policy attribute based encryption java toolkit.”
2013. Online; posted 10-January-2017. https://github.com/junwei-wang/cpabe, 2015.
[3] P. Mell and T. Grance, “The NIST definition of cloud computing,” 2011. [29] A. De Caro and V. Iovino, “JPBC: Java pairing based cryptography,”
[4] A. C. OConnor and R. J. Loomis, “2010 economic analysis of role-based in IEEE Symposium on Computers and Communications (ISCC), 2011,
access control,” NIST, Gaithersburg, MD, vol. 20899, 2010. pp. 850–855, IEEE, 2011.
[5] A. Elliott and S. Knight, “Role explosion: Acknowledging the problem.,”
in Software Engineering Research and Practice, pp. 349–355, 2010.
[6] E. Zaghloul, T. Li, and J. Ren, “An attribute-based distributed data
sharing scheme,” in IEEE Globeocm 2019, (Abu Dhabi, UAE.), 9-13
December 2018.
[7] J. Bethencourt, A. Sahai, and B. Waters, “Ciphertext-policy attribute- Ehab Zaghloul received his B.S. and M.Sc de-
based encryption,” in 2007 IEEE Symposium on Security and Privacy grees in Computer Engineering from Arab Academy
(SP’07), pp. 321–334, IEEE, 2007. for Science and Technology (AAST), Alexandria,
[8] G. Wang, Q. Liu, J. Wu, and M. Guo, “Hierarchical attribute-based Egypt in 2012 and 2015 respectively. He is currently
encryption and scalable user revocation for sharing data in cloud pursing his Ph.D degree in Electrical and Computer
servers,” Computers & Security, vol. 30, no. 5, pp. 320–331, 2011. Engineering at Michigan State University (MSU),
[9] S. Wang, J. Zhou, J. K. Liu, J. Yu, J. Chen, and W. Xie, “An efficient East Lansing, USA. His research interests include
file hierarchy attribute-based encryption scheme in cloud computing,” applied cryptography, secure and private data shar-
IEEE Transactions on Information Forensics and Security, vol. 11, no. 6, ing, distributed systems, and blockchain.
pp. 1265–1277, 2016.
[10] M. Lichman, “UCI machine learning repository,” 2013.
[11] A. Sahai and B. Waters, “Fuzzy identity-based encryption,” in Annual Kai Zhou (S’16) received his Ph.D. degree in
International Conference on the Theory and Applications of Crypto- Electrical and Computer Engineering from Michigan
graphic Techniques, pp. 457–473, Springer, 2005. State University in 2018. Prior to that, he obtained a
[12] D. Boneh, A. Sahai, and B. Waters, “Functional encryption: Definitions B.S. degree in Electrical Engineering from Shanghai
and challenges,” in Theory of Cryptography Conference, pp. 253–273, Jiao Tong University, China, in 2013. His research
Springer, 2011. interests include applied cryptography, data security
[13] V. Goyal, O. Pandey, A. Sahai, and B. Waters, “Attribute-based encryp- and privacy, adversarial social network analysis, and
tion for fine-grained access control of encrypted data,” in Proceedings of game theory. He is currently a postdoctoral research
the 13th ACM Conference on Computer and Communications Security, associate in the department of computer science and
pp. 89–98, Acm, 2006. engineering, Washington University in St. Louis.
[14] D. F. Ferraiolo and D. R. Kuhn, “Role-based access controls,” arXiv
preprint arXiv:0903.2171, 2009.
[15] L. Cheung and C. Newport, “Provably secure ciphertext policy ABE,” Jian Ren (SM’09) received his Ph.D. degree in
in Proceedings of the 14th ACM Conference on Computer and Commu- in Electrical Engineering from Xidian University,
nications Security, pp. 456–465, ACM, 2007. China. Currently, he is an Associate Professor in the
[16] S. Roy and M. Chuah, “Secure data retrieval based on ciphertext policy Department of Electrical and Computer Engineering
attribute-based encryption (CP-ABE) system for the dtns,” tech. rep., at Michigan State University. Prior to joining MSU,
Citeseer, 2009. Dr. Ren was the Leading Secure Architect at Avaya
[17] J. Lai, R. H. Deng, and Y. Li, “Fully secure cipertext-policy hiding CP- Lab (2000-2002), Bell Lab (1998-2000) and Racal
ABE,” in International Conference on Information Security Practice and Datacom (1997-1998). Dr. Ren’s research interests
Experience, pp. 24–39, Springer, 2011. include network security, distributed storage, cloud
[18] J. Lai, R. H. Deng, and Y. Li, “Expressive CP-ABE with partially computing, big data security and privacy, and wire-
hidden access structures,” in Proceedings of the 7th ACM Symposium on less communications. Dr. Ren received the National
Information, Computer and Communications Security, pp. 18–19, ACM, Science Foundation (NSF) CAREER award in 2009. He served as the TPC
2012. Chair of ICNC’17 and General Chair of ICNC’18. He serves as the Executive
[19] F. Guo, Y. Mu, W. Susilo, D. S. Wong, and V. Varadharajan, “CP-ABE Chair of ICNC’19. Dr. Ren serves as an Associate Editor of ACM Transactions
with constant-size keys for lightweight devices,” IEEE Transactions on on Sensor Networks. Dr. Ren is a senior member of the IEEE.
Information Forensics and Security, vol. 9, no. 5, pp. 763–771, 2014.

2332-7790 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

You might also like